Friday, May 15, 2009

Cholesky & Jakarta Commons Math

In Finance, Cholesky is a useful way to decompose Matrix. It is not so simple to find a BSD licensed code using cholesky (most of them are GPL like this one). There is one in Apache Commons Maths library, which is a very interesting library. However for performance, it is still not very practical for some things like Cholesky.

Looking at the source one can easily understand why. I did a small (many people will say not representative 1 million loop test) and finds out:
cholesky GPL= 5.4ms
cholesky BSD=37.1ms

So BSD code is 7 times slower! Of course it can do a bit more and has many checks of validity, but still. It shows it is not easy to do Math libraries, because some people will care a lot about this performance difference, and some other people won't but will like the other "features".

Cholesky & Jakarta Commons Math

In Finance, Cholesky is a useful way to decompose Matrix. It is not so simple to find a BSD licensed code using cholesky (most of them are GPL like this one). There is one in Apache Commons Maths library, which is a very interesting library. However for performance, it is still not very practical for some things like Cholesky.

Looking at the source one can easily understand why. I did a small (many people will say not representative 1 million loop test) and finds out:
cholesky GPL= 5.4ms
cholesky BSD=37.1ms

So BSD code is 7 times slower! Of course it can do a bit more and has many checks of validity, but still. It shows it is not easy to do Math libraries, because some people will care a lot about this performance difference, and some other people won't but will like the other "features".

Hull American Option Price Fallacies


Hull says American put is best exercised immediately and american call is optimal at expiry like a european. Is this really true?

At first it seems really clever and model show clearly this. But if we change the market assumptions only a tiny bit, everything falls down.

I could not detail everything in a blog post so I created a static web page about it. Everything was produced in Java using algorithm found in popular books and graphs through JFreeChart.

Hull American Option Price Fallacies


Hull says American put is best exercised immediately and american call is optimal at expiry like a european. Is this really true?

At first it seems really clever and model show clearly this. But if we change the market assumptions only a tiny bit, everything falls down.

I could not detail everything in a blog post so I created a static web page about it. Everything was produced in Java using algorithm found in popular books and graphs through JFreeChart.

Tuesday, May 05, 2009

On Quasi Random Numbers - MersenneTwister vs Sobol precision in Asian Option Pricing

While starting a side project that does Monte Carlo pricing in Java (http://code.google.com/p/javamc/ - nothing yet there I am waiting for Mercurial repository support), I wondered what was the importance of quasi random numbers versus more regular pseudo random numbers in Monte Carlo simulations.

This brought me to read more carefully several books about Monte Carlo and Finance (Haug Option Pricing, Sobol Primer on Monte Carlo, and Glasserman Monte Carlo Methods in Finance Engineering). I had quite a hard time to understand why the dimension of the quasi random generator was so important to price an asian option. Intuitively I thought the averaging points of an asian option were all on the same path, so they should be using the same random generator. This is very wrong as one does not care about the path in the first place but just in simulating each point in the average (using the regular black and scholes hypothesis). Finding the estimation for the average on the given points forces to use independent random generators for each point, because we want to approximate the estimation by the sum over those random points for each point.

There is another simple argument to explain why independence of the random generators is so important. If we use the same generator for each point, then each point will move exactly the same way at each simulation. The average of those point will therefore behave exactly the same way as if there was only 1 point using the same generator. And we don't price an asian anymore but just a regular vanilla option.

Using a pseudo random generator, one does not see the problem of dimension, because we can create N independent dimensions by just taking numbers N by N on a pseudo random generator. So effectively having 1 or N dimensions is the same on a pseudo random generator.

Still I wrote a small test to see if a 1D quasi random generator was so bad when simulating N dimensions (taking values N by N on the quasi random generator). Here are the results:

MersenneTwister vs MersenneTwister on 10D asian:
14:43:51,111 INFO MonteCarloSimulationTest:114 - 867970000 -- expPrice=0.978958644504466
14:43:51,428 INFO MonteCarloSimulationTest:120 - 314619000 -- expPrice=0.9733220318545934
14:43:51,430 INFO MonteCarloSimulationTest:122 - relative difference=-0.005757763804951897
can be as high as 2%

Sobol vs MersenneTwister on 10D asian:
14:48:46,909 INFO MonteCarloSimulationTest:115 - 980209000 -- expPrice=0.9895032774079221
14:48:47,345 INFO MonteCarloSimulationTest:121 - 433685000 -- expPrice=0.9790264042895171
14:48:47,348 INFO MonteCarloSimulationTest:123 - relative difference=-0.010588012548932534
about 1% it is actually bounded by MersenneTwister precision.

Sobol vs Sobol1D on 10D asian:
14:47:08,614 INFO MonteCarloSimulationTest:115 - 717444000 -- expPrice=0.8810736428068913
14:47:08,925 INFO MonteCarloSimulationTest:121 - 308499000 -- expPrice=0.9791449305055208
14:47:08,927 INFO MonteCarloSimulationTest:123 - relative difference=0.11130884290920073
about 10% and stays that way even when increasing number of simulations.


Using an asian rate with 10 points, we see that Sobol1D will always give a very bad estimate, no matter the number of simulations. While Sobol used properly will give (much) better precision for less iterations. So even though there is the word random in quasi random, the numbers are very far from being random or even behaving like random numbers. It helped me to read about Van der Corput and Halton numbers to really understand quasi random numbers.

On Quasi Random Numbers - MersenneTwister vs Sobol precision in Asian Option Pricing

While starting a side project that does Monte Carlo pricing in Java (http://code.google.com/p/javamc/ - nothing yet there I am waiting for Mercurial repository support), I wondered what was the importance of quasi random numbers versus more regular pseudo random numbers in Monte Carlo simulations.

This brought me to read more carefully several books about Monte Carlo and Finance (Haug Option Pricing, Sobol Primer on Monte Carlo, and Glasserman Monte Carlo Methods in Finance Engineering). I had quite a hard time to understand why the dimension of the quasi random generator was so important to price an asian option. Intuitively I thought the averaging points of an asian option were all on the same path, so they should be using the same random generator. This is very wrong as one does not care about the path in the first place but just in simulating each point in the average (using the regular black and scholes hypothesis). Finding the estimation for the average on the given points forces to use independent random generators for each point, because we want to approximate the estimation by the sum over those random points for each point.

There is another simple argument to explain why independence of the random generators is so important. If we use the same generator for each point, then each point will move exactly the same way at each simulation. The average of those point will therefore behave exactly the same way as if there was only 1 point using the same generator. And we don't price an asian anymore but just a regular vanilla option.

Using a pseudo random generator, one does not see the problem of dimension, because we can create N independent dimensions by just taking numbers N by N on a pseudo random generator. So effectively having 1 or N dimensions is the same on a pseudo random generator.

Still I wrote a small test to see if a 1D quasi random generator was so bad when simulating N dimensions (taking values N by N on the quasi random generator). Here are the results:

MersenneTwister vs MersenneTwister on 10D asian:
14:43:51,111 INFO MonteCarloSimulationTest:114 - 867970000 -- expPrice=0.978958644504466
14:43:51,428 INFO MonteCarloSimulationTest:120 - 314619000 -- expPrice=0.9733220318545934
14:43:51,430 INFO MonteCarloSimulationTest:122 - relative difference=-0.005757763804951897
can be as high as 2%

Sobol vs MersenneTwister on 10D asian:
14:48:46,909 INFO MonteCarloSimulationTest:115 - 980209000 -- expPrice=0.9895032774079221
14:48:47,345 INFO MonteCarloSimulationTest:121 - 433685000 -- expPrice=0.9790264042895171
14:48:47,348 INFO MonteCarloSimulationTest:123 - relative difference=-0.010588012548932534
about 1% it is actually bounded by MersenneTwister precision.

Sobol vs Sobol1D on 10D asian:
14:47:08,614 INFO MonteCarloSimulationTest:115 - 717444000 -- expPrice=0.8810736428068913
14:47:08,925 INFO MonteCarloSimulationTest:121 - 308499000 -- expPrice=0.9791449305055208
14:47:08,927 INFO MonteCarloSimulationTest:123 - relative difference=0.11130884290920073
about 10% and stays that way even when increasing number of simulations.


Using an asian rate with 10 points, we see that Sobol1D will always give a very bad estimate, no matter the number of simulations. While Sobol used properly will give (much) better precision for less iterations. So even though there is the word random in quasi random, the numbers are very far from being random or even behaving like random numbers. It helped me to read about Van der Corput and Halton numbers to really understand quasi random numbers.