Office 4603
James Clerk Maxwell Building
School of Mathematics
University of Edinburgh, EH9 3FD
I was born in Scotland and brought up in France, and the general opinion is that I speak both English and French with a foreign accent. I obtained my PhD in Statistics at Imperial College London in 2008, supervised by Prof. Andrew Walden. I was awarded a Heilbronn research fellowship in Data Science in 2012, which I first held at the University of Bristol and then at the University of Oxford. I became assistant professor of Statistics at the University of Bristol in July 2017, and was promoted to full professor in July 2022. As of Jan 2024, I am chair of Statistical Learning at the University of Edinburgh. My research interests include data exploration, embedding, machine learning and AI.
data exploration; statistical testing; clustering; anomaly detection; embedding; graph analytics; behaviour analytics; manifold learning; topological data analysis; non-parametric statistics; high-dimensional statistics; representation learning; unsupervised learning; machine learning.
In recent years, there has been a significant opportunity for innovation in Statistics. When I was doing my PhD, it was not "easy" to find data. Now, you can pretty much pick any subject, and find a relevant data source, with data processing pipelines and documentation. It's a lot easier to combine mathematical thinking with real-world data to make something useful.
I think a key opportunity now for statisticians is making better sense of embeddings, which have been hugely impactful in powering LLMs and can be seen to be at the heart of many deep learning algorithms. I believe a better understanding of embeddings could significantly expand the horizons of science and is a requirement for safe AI.
Embeddings are continuous vector representations of entities, such as words or nodes, and are the point of contact between ML algorithms and the real world. There is growing evidence that embeddings can behave as quite pure mathematical objects, specifically as point clouds concentrated around explainable manifolds. Concepts like similarity or trend have a 'shape'; abstract notions such as political opinion, the health of a patient, or the function of a cell can be made geometric and measurable. It's exciting to see the scope of rigorous statistical inference extend to areas not even traditionally called 'science'.
More generally, my research is about discovering structure (for example, correlations, clusters, hierarchy, trends, or manifold structure) in complex data such as large relational databases, dynamic networks, or high-dimensional data (e.g. tables with many columns, text, images).
The applications of my research are quite wide-ranging, and I have won funding (over £7M from government and industry) for applications in biosciences, healthcare, (cyber-)security, societal resilience, environmental protection and more. For example, Microsoft uses unfolded spectral embedding[3] for anti-corruption[4].
Multiple PhD and postdoctoral research positions are to be opened for NeST. Please contact me if you want to discuss these or other NeST research/industrial collaboration opportunities.
More generally, I'm always happy to hear from students interested in doing research, e.g. a PhD. Fundamentally, you'll need to enjoy doing maths — everything else you can learn.