The course organiser is David Siska, School of Mathematics, Room 4611.
This is a Semester 2 course. The course timetable is here: timetable.
The course details are here: course details.
The course Learn page is here: course Learn page (you may need to be logged in to Learn already for this to work).
Speaker: Yanzhao Yang (Oxford).
Title: Adaptive Partitioning and Learning for Stochastic Control of Diffusion Processes
Abstract: We study reinforcement learning for controlled diffusion processes with unbounded continuous state spaces, bounded continuous actions, and polynomially growing rewards: settings that arise naturally in finance, economics, and operations research. To overcome the challenges of continuous and high-dimensional domains, we introduce a model-based algorithm that adaptively partitions the joint state-action space. The algorithm maintains estimators of drift, volatility, and rewards within each partition, refining the discretization whenever estimation bias exceeds statistical confidence. This adaptive scheme balances exploration and approximation, enabling efficient learning in unbounded domains. Our analysis establishes regret bounds that depend on the problem horizon, state dimension, reward growth order, and a newly defined notion of zooming dimension tailored to unbounded diffusion processes. The bounds recover existing results for bounded settings as a special case, while extending theoretical guarantees to a broader class of diffusion-type problems. Finally, we validate the effectiveness of our approach through numerical experiments, including applications to high-dimensional problems such as multi-asset mean-variance portfolio selection.
The paper: https://arxiv.org/abs/2512.14991
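The core idea in the abstract -- refine a partition cell whenever the approximation bias of its local estimator exceeds the statistical confidence of that estimator -- can be illustrated with a toy one-dimensional sketch. This is NOT the paper's algorithm: the Lipschitz constant, confidence width, and uniform cell sampling below are all simplifying assumptions made for illustration only.

```python
import math
import random

# Toy illustration (not the paper's algorithm): adaptively partition [0, 1],
# splitting a cell whenever the bias bound for its local mean-reward estimate
# (Lipschitz constant times cell width) exceeds the Hoeffding-style
# statistical confidence width of that estimate.

L = 2.0  # assumed Lipschitz constant of the unknown reward (hypothetical)

def reward(x, rng):
    """Noisy observation of an unknown smooth reward function."""
    return math.sin(4 * x) + rng.gauss(0.0, 0.1)

def adaptive_partition(n_samples=2000, seed=0):
    rng = random.Random(seed)
    cells = [(0.0, 1.0, [])]                 # each cell: (left, right, samples)
    for _ in range(n_samples):
        i = rng.randrange(len(cells))        # crude uniform "exploration"
        lo, hi, obs = cells[i]
        obs.append(reward(rng.uniform(lo, hi), rng))
        width, n = hi - lo, len(obs)
        conf = math.sqrt(2.0 * math.log(n + 1) / n)   # confidence width
        if n >= 2 and L * width > conf:      # bias dominates noise: refine
            mid = (lo + hi) / 2              # (samples discarded for simplicity)
            cells[i] = (lo, mid, [])
            cells.append((mid, hi, []))
    return cells
```

As more samples accrue, the confidence width shrinks and cells are split until discretisation bias and statistical error are balanced, which is the trade-off the regret analysis in the abstract quantifies.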
Speaker: Galen Cao (Edinburgh).
Title: Learning in Market Making
Abstract: Market making plays a central role in modern algorithmic trading, particularly with the rise of high-frequency trading and increasingly fast electronic markets. While the basic principle of market making -- buy low and sell high -- is simple, achieving consistent profitability is challenging due to market microstructure effects, inventory risk, and other factors.
This talk focuses on how to design efficient learning algorithms for market making with theoretical performance guarantees. We primarily focus on a model-based method, in which the market response to quotes posted by the market maker is parameterised using the classical Avellaneda–Stoikov model. Within this setting, we propose an online learning algorithm based on the regularised maximum-likelihood estimator and show that it achieves a regret upper bound of order $\mathcal{O}(\log^2 T)$ in expectation. We then briefly outline ongoing work on model-free reinforcement learning methods, which aim to relax modelling assumptions, adapt to more complex and non-stationary market environments, and come with their own performance guarantees.
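For readers unfamiliar with the Avellaneda–Stoikov model the abstract refers to, a minimal sketch of its quoting rule and fill-rate parameterisation is below. The closed-form quotes (reservation price plus half-spread) and the exponential fill intensity are the standard ones from the model; all numerical parameter values are illustrative, not taken from the talk.

```python
import math

# Sketch of the classical Avellaneda-Stoikov quoting rule. A quote at
# distance delta from the mid-price is filled at Poisson rate
# lambda(delta) = A * exp(-k * delta); gamma is risk aversion and sigma
# the mid-price volatility. Parameter values below are illustrative.

def as_quotes(s, q, t, T, gamma=0.1, sigma=2.0, k=1.5):
    """Bid/ask quotes for mid-price s and inventory q at time t of horizon T."""
    tau = T - t
    r = s - q * gamma * sigma ** 2 * tau          # reservation price
    spread = gamma * sigma ** 2 * tau + (2 / gamma) * math.log(1 + gamma / k)
    return r - spread / 2, r + spread / 2          # (bid, ask)

def fill_intensity(delta, A=140.0, k=1.5):
    """Arrival rate of fills for a quote at distance delta from the mid."""
    return A * math.exp(-k * delta)
```

Note how positive inventory shifts both quotes downward, skewing towards sells; the learning problem in the talk is that the fill-rate parameters (here A and k) are unknown and must be estimated online from observed fills.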
Speaker: Konark Jain (UCL).
Title: From Limit Order Book Data to Hawkes Processes: A Hands-on Introduction to Self-Exciting Price Dynamics
Abstract: In this project, we will build a complete empirical pipeline for modelling high-frequency price dynamics using self-exciting point processes. Starting from raw limit order book data from the LOBSTER database, we go through data ingestion, exploratory analysis, and statistical modelling of event-driven price changes.
We first construct an event-level dataset from LOBSTER message files and extract price-changing events. Using these data, we reproduce key stylized facts of high-frequency markets, including heavy-tailed inter-arrival times and short-range dependence in signed price moves. Event counts are then aggregated into fixed time bins and analysed using vector autoregressive (VAR) models to provide evidence of temporal dependence and memory effects in order flow.
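The VAR step above can be sketched in a few lines: a VAR(1) on two-dimensional binned counts fitted by ordinary least squares. Synthetic counts stand in for the LOBSTER data here, and the lag order and the two-series split are illustrative choices, not requirements of the project.

```python
import numpy as np

# Sketch of the VAR step: fit a VAR(1), X_t = c + A X_{t-1} + noise, to
# two series of binned event counts by ordinary least squares. Synthetic
# data with known cross-lag dependence stands in for LOBSTER counts.

rng = np.random.default_rng(0)

A_true = np.array([[0.4, 0.2],
                   [0.1, 0.3]])     # lag matrix: off-diagonals = cross-memory
c_true = np.array([5.0, 5.0])
X = np.zeros((5000, 2))
for t in range(1, 5000):
    X[t] = c_true + A_true @ X[t - 1] + rng.normal(0.0, 1.0, size=2)

# OLS: regress X_t on [1, X_{t-1}] to recover the intercept and lag matrix.
Y = X[1:]
Z = np.hstack([np.ones((len(Y), 1)), X[:-1]])
B, *_ = np.linalg.lstsq(Z, Y, rcond=None)   # B is (3, 2); row 0 = intercept
A_hat = B[1:].T
```

Significant off-diagonal entries of the estimated lag matrix are exactly the kind of cross-series memory effect in order flow that the project looks for; in practice a package such as statsmodels would also supply standard errors and lag-order selection.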
Motivated by these findings, we fit a one-dimensional Hawkes process with an exponential kernel to price change events. Model parameters are estimated via maximum likelihood estimation (MLE), and the fitted model is interpreted in terms of baseline activity, excitation strength, and branching ratio. The project concludes with diagnostic checks and simple extensions.
The course is 100% assessed through coursework.
You will work in a group of 3 students. You will be expected to produce a short report, give a presentation, and answer questions about your report and presentation. You will receive a single overall mark for this.
The report submission deadline will be 10am on Monday of Week 10, Semester 2.
The presentations will take place in Week 10 and on Monday and Tuesday of Week 11.
This year we'll focus on Reinforcement Learning (RL) applications in Financial Mathematics.
That's all this page has to say as of 3rd February 2026.