MSc Thesis “The Performance of Artificial Neural Networks on Rough Heston Model”
By Chun Kiat Ong, University of Birmingham U.K. 2020
Supervisor: Dr. Daniel J. Duffy, Datasim Education BV Amsterdam
September 24 2020
This is one of the first MSc theses to address the full software lifecycle, from analysis (maths) through design (Structured Analysis/top-down decomposition) to implementation (C++, Python, ANN, Keras, TensorFlow), for computing option prices and implied volatility under the rough Heston model. This new model resolves a number of issues surrounding the original Heston model.
We compare the ANN-based solutions with more traditional computational solutions. Based on our level-playing-field analysis (that is, we compare “apples with apples”), the ANN solution for this problem is 7 times slower for option pricing and 17 times slower for implied volatility modelling than traditional methods. Of course, this is only one example, but it is hard evidence nonetheless.
There are few articles that discuss the application of ANNs to computational finance and the ones that have been published claim outlandish performance improvements (10,000 times faster) or claim that they can solve 100-factor partial differential equations (PDEs) with Deep Learning techniques.
In recent years, Machine and Statistical Learning techniques (ML for short) have become popular in pattern and image recognition, classification, social media services and online fraud detection, to name a few. More recently, there has been a flurry of activity and interest in applying ML to financial applications, in particular option pricing, calibration and volatility modelling. The relatively few published articles devoted to these topics are testament to the fact that much needs to be done in order to advance the current mathematical and software knowledge from ad hoc solutions, trial-and-error experimentation and folklore to defined processes and standardised design patterns.
The main goal of this thesis is to discuss the applicability of ML (in particular, Artificial Neural Networks (ANN)) to option pricing and implied volatility using the rough Heston model. The rough Heston model is a generalisation and improvement of the popular Heston model that addresses the latter’s difficulty in matching observed vanilla option prices. We shall deem ML to be successful (or not) by comparing it with more traditional methods such as analytical solutions and numerical methods. The main requirements and metrics are that the new methods be accurate (in some sense) and have good run-time performance. In any case, we wish to unambiguously quantify these metrics.
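As a rough illustration of the kind of traditional numerical method that an ANN is measured against, the sketch below inverts an option price for implied volatility by bisection. It is not taken from the thesis: it assumes the plain Black-Scholes formula for a European call rather than the rough Heston model, and the function names, bracket [1e-6, 5.0] and tolerance are hypothetical choices made for this example.

```python
import math


def norm_cdf(x: float) -> float:
    """Standard normal cumulative distribution function via math.erf."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def bs_call_price(spot: float, strike: float, rate: float,
                  maturity: float, vol: float) -> float:
    """Black-Scholes price of a European call (no dividends)."""
    sqrt_t = math.sqrt(maturity)
    d1 = (math.log(spot / strike)
          + (rate + 0.5 * vol * vol) * maturity) / (vol * sqrt_t)
    d2 = d1 - vol * sqrt_t
    return spot * norm_cdf(d1) - strike * math.exp(-rate * maturity) * norm_cdf(d2)


def implied_vol_bisection(price: float, spot: float, strike: float,
                          rate: float, maturity: float,
                          lo: float = 1e-6, hi: float = 5.0,
                          tol: float = 1e-10, max_iter: int = 200) -> float:
    """Find the volatility whose Black-Scholes call price matches `price`.

    The call price is increasing in volatility, so bisection on the
    (assumed) bracket [lo, hi] converges whenever the target price
    lies between the prices at the two end points.
    """
    price_lo = bs_call_price(spot, strike, rate, maturity, lo)
    price_hi = bs_call_price(spot, strike, rate, maturity, hi)
    if not (price_lo <= price <= price_hi):
        raise ValueError("price is outside the range attainable on the bracket")
    for _ in range(max_iter):
        mid = 0.5 * (lo + hi)
        if bs_call_price(spot, strike, rate, maturity, mid) < price:
            lo = mid  # model price too low: volatility must be higher
        else:
            hi = mid  # model price too high (or equal): volatility is lower
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    # Round trip: price an option at 20% volatility, then recover the volatility.
    target = bs_call_price(100.0, 100.0, 0.01, 1.0, 0.2)
    print(implied_vol_bisection(target, 100.0, 100.0, 0.01, 1.0))
```

Accuracy can be checked with a round trip like the one above, and run-time performance by timing repeated calls; an ANN trained to map prices to volatilities would be compared against such a baseline on both counts.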
The approach taken in this thesis is state-of-the-art and original in a number of ways:
- Rough Heston and the related (numerical) mathematics (fractional Riccati equation).
- The optimal combination (speed, usability) of C++ and Python.
- The software design is based on Duffy’s Domain Architectures to partition a software system into loosely-coupled and autonomous subsystems. We have a defined process to effect this decomposition, thus allowing the student to “hit the ground running” and increase productivity.
- Taking the mystique out of ML applications by viewing them as standard software systems with “embedded AI components”, which are basically algorithms that have been written in C++ and Fortran and wrapped into libraries that can be called from Python.
- Few MSc theses reach this level. There are a number of reasons for this achieved level of expertise.
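For readers unfamiliar with the fractional Riccati equation mentioned in the list above, the block below states the rough Heston model in the form usually attributed to El Euch and Rosenbaum. This parametrisation is an assumption made here for illustration; the notation and scaling used in the thesis itself may differ. Here α = H + 1/2 with Hurst parameter H in (0, 1/2), and D^α and I^(1-α) denote the Riemann-Liouville fractional derivative and integral.

```latex
% Rough Heston dynamics (El Euch-Rosenbaum parametrisation, assumed here)
\begin{align*}
dS_t &= S_t \sqrt{V_t}\, dW_t, \qquad d\langle W, B\rangle_t = \rho\, dt, \\
V_t  &= V_0
      + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \lambda(\theta - V_s)\, ds
      + \frac{1}{\Gamma(\alpha)} \int_0^t (t-s)^{\alpha-1} \lambda\nu \sqrt{V_s}\, dB_s .
\end{align*}

% Characteristic function of the log-price and the fractional Riccati equation
\begin{align*}
\mathbb{E}\left[ e^{\, i a \log(S_t / S_0)} \right]
  &= \exp\left( \theta\lambda\, I^{1} h(a,t) + V_0\, I^{1-\alpha} h(a,t) \right), \\
D^{\alpha} h(a,t)
  &= \tfrac{1}{2}\left(-a^2 - i a\right)
   + \lambda\left(i a \rho\nu - 1\right) h(a,t)
   + \tfrac{(\lambda\nu)^2}{2}\, h(a,t)^2,
  \qquad I^{1-\alpha} h(a,0) = 0 .
\end{align*}
```

For α = 1 the fractional derivative becomes an ordinary derivative and the equation reduces to the Riccati equation of the classical Heston model. For α < 1 no closed-form solution is available in general, which is why the equation has to be solved numerically.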
Prediction is difficult, especially predicting the Future
The area of Machine Learning and its realisation in software reminds me of the heyday of the Object-Oriented (OO) Paradigm, characterised by ad hoc solutions and by solving problems in any way that developers could manage. After some time developers were able to distinguish, among these ad hoc solutions, things that usually work from things that usually do not (a classic example is that class inheritance is a mixed blessing). The ones that worked entered the folklore, and people told each other about them informally. As the folklore became more and more systematic, it was codified as written heuristics and rules of procedure. Eventually this codification became crisp enough to support models and theories, together with the associated mathematics (Duffy 2004, Shaw and Garlan 1996).
In the case of Machine Learning we see some opportunities for improvement by trying to replace ad hoc solutions by more robust ones:
- Broadening the mathematical scope of current practice … much of the theory seems to be based on linear algebra, discrete mathematics and finite dimensional problems. These methods will face a brick wall at some stage. It is worth mentioning that many of the ML algorithms can be replaced by other algorithms based on advanced mathematics such as Functional Analysis and Hilbert Space (for example, RKHS) methods.
- A more disciplined approach to software design and avoiding “balls of mud” code. Avoiding the scenario of next-generation armies of Python maintenance programmers. In particular, data scientists and non-programmers need to develop their software skills.
- The optimal combination of C++ and Python for ML applications.
If you have any queries, please do not hesitate to contact me at firstname.lastname@example.org
Duffy, Daniel J., Domain Architectures, Wiley 2004.
Duffy, Daniel J., Financial Instrument Pricing using C++, second edition, Wiley 2018.
Mandara, Dalvir, Artificial Neural Networks for Black-Scholes Option Pricing and Prediction of Implied Volatility for the SABR Stochastic Volatility Model, MSc Mathematical Finance 2018/19, University of Birmingham.
Shaw, M. and Garlan, D., Software Architecture, Prentice-Hall 1996.