Saturday, April 18, 2015

Modern Programming Language for Monte-Carlo

A few recent programming languages sparked my interest:
  • Julia: because of the wide coverage of mathematical functions, and great attention to quality of the implementations. It has also some interesting web interface.
  • Dart: because it's a language focused purely on building apps for the web, and has a supposedly good VM.
  • Rust: it's the latest fad. It has interesting concepts around concurrency and a focus on being low level all the while being simpler than C.
I decided to see how well suited they would be on a simple Monte-Carlo simulation of a forward start option under the Black model. I am no expert at all in any of the languages, so this is a beginner's test. I compared the runtime for executing 16K simulations times a multiplier.

Multipl. Scala  Julia  JuliaA  Dart  Python  Rust
1        0.03   0.08    0.09   0.03     0.4  0.004
10       0.07   0.02    0.06   0.11     3.9  0.04
100      0.51   0.21    0.40   0.88          0.23
1000     4.11   2.07    4.17   8.04          2.01


About performance

I am quite impressed at Dart performance versus Scala (or vs. Java, as it has the same performance as Scala) given that it is much less strict about types and its focus is not at all on this kind of stuff.

Julia performance is great, that is if one is careful about types. Julia is very finicky about casting and optimizations, fortunately @time helps spotting the issues (often an inefficient cast will lead to copy and thus high allocation). JuliaA is my first attempt, with an implicit badly performing conversion of MersenneTwister to AbstractRNG. It is slower first, as the JIT costs is reflected on the first run, very much like in Java (although it appears to be even worse).

Rust is the most impressive. I had to add the --release flag to the cargo build tool to produce a properly optimized binary, otherwise the performance is up to 7x worse.

About the languages

My Python code is not vectorized, just like any of the other implementations. While the code looks relatively clean, I made the most errors compared to Julia or Scala. Python numpy isn't always great: norm.ppf is very slow, slower than my hand coded python implementation of AS241.

Dart does not have fixed arrays: everything is a list. It also does not have strict 64 bit int: they can be arbitrarily large. The dev environment is ok but not great.

Julia is a bit funny, not very OO (no object method) but more functional, although many OO concepts are there (type inheritence, type constructors). It was relatively straightforward, although I do not find intuitive the type conversion issues (eventual copy on conversion).

Rust took me the most time to write, as it has quite new concepts around mutable variables, and "pointers" scope. I relied on an existing MersenneTwister64 that worked with latest Rust. It was a bit disappointing to see that some dSFMT git project did not compile with the latest Rust, likely because Rust is still a bit too young. This does not sound so positive, but I found it to be the language the most interesting to learn.

I was familiar with Scala before this exercise. I used a non functional approach, with while loops in order to make sure I had maximum performance. This is something I find a bit annoying in Scala, I always wonder if for performance I need to do a while instead of a for, when the classic for makes much more sense (that and the fact that the classic for leads to some annoying delegation in runtime errors/on debug).

I relied on the default RNG for Dart but MersenneTwister for Scala, Julia, Python, Rust. All implementations use a hand coded AS241 for the inverse cumulative normal.

Update 

Using FastMath.exp instead of Math.exp leads a slightly better performance for Scala:

1 0.06
10 0.05
100 0.39
1000 2.66

I did not expect that this would still be true in 2015 with Java 8 Oracle JVM.

Monday, March 16, 2015

Volatility Swap vs Variance Swap Replication - Truncation

I have looked at jump effects on volatility vs. variance swaps. There is a similar behavior on tail events, that is, on truncating the replication.

One main problem with discrete replication of variance swaps is the implicit domain truncation, mainly because the variance swap equivalent log payoff is far from being linear in the wings.

The equivalent payoff with Carr-Lee for a volatility swap is much more linear in the wings (not so far of a straddle). So we could expect the replication to be less sensitive to the wings truncation.

I have done a simple test on flat 40% volatility:

As expected, the vol swap is much less sensitive, and interestingly, very much like for the jumps, it moves in the opposite direction: the truncated price is higher than the non truncated price.

Wednesday, March 11, 2015

Arbitrage free SABR with negative rates - alternative to shifted SABR

Antonov et al. present an interesting view on SABR with negative rates: instead of relying on a shifted SABR to allow negative rates up to a somewhat arbitrary shift, they modify slightly the SABR model to allow negative rates directly:
$$ dF_t = |F_t|^\beta v_t dW_F $$
with \( v_t \) being the standard lognormal volatility process of SABR.

Furthermore they derive a clever semi-analytical approximation for this model, based on low correlation, quite close to the Monte-Carlo prices in their tests. It's however not clear if it is arbitrage-free.

It turns out that it is easy to tweak Hagan SABR PDE approach to this "absolute SABR" model: one just needs to push the boundary F_min far away, and to use the absolute value in C(F).

It then reproduces the same behavior as in Antonov et al. paper:

"Absolute SABR" arbitrage free PDE

Antonov et al. graph
 I obtain a higher spike, it would look much more like Antonov graph had I used a lower resolution to compute the density: the spike would be smoothed out.

Interestingly, the arbitrage free PDE will also work for high beta (larger than 0.5):
beta = 0.75
It turns out to be then nearly the same as the absorbing SABR, even if prices can cross a little the 0. This is how the bpvols look like with beta = 0.75:
red = absolute SABR, blue = absorbing SABR with beta=0.75
They overlap when the strike is positive.

If we go back to Antonov et al. first example, the bpvols look a bit funny (very symmetric) with beta=0.1:

For beta=0.25 we also reproduce Antonov bpvol graph, but with a lower slope for the left wing:
bpvols with beta = 0.25
It's interesting to see that in this case, the positive strikes bp vols are closer to the normal Hagan analytic approximation (which is not arbitrage free) than to the absorbing PDE solution.

For longer maturities, the results start to be a bit different from Antonov, as Hagan PDE relies on a order 2 approximation only:

absolute SABR PDE with 10y maturity
The right wing is quite similar, except when it goes towards 0, it's not as flat, the left wing is much lower.

Another important aspect is to reproduce Hagan's knee, the atm vols should produce a knee like curve, as different studies show (see for example this recent Hull & White study or this other recent analysis by DeGuillaume). Using the same parameters as Hagan (beta=0, rho=0) leads to a nearly flat bpvol: no knee for the absolute SABR, curiously there is a bump at zero, possibly due to numerical difficulty with the spike in the density:
The problem is still there with beta = 0.1:



Overall, the idea of extending SABR to the full real line with the absolute value looks particularly simple, but it's not clear that it makes real financial sense.