Monday, July 6, 2015

Machine Learning For Stock Price Forecasting

During my postgrad studies this semester I undertook an analysis of a few machine learning algorithms in an attempt to predict photometric redshifts (see the video here: https://www.youtube.com/watch?v=BARCby6X0uk ). I ended up focussing on Classification And Regression Trees (CART) as well as their ensembles. First I considered standard regression trees, then moved on to random forests and boosting, and found that random forests coupled with boosting produced the lowest RMS error. I have begun to think about using the same approach to forecast stock prices.

CART is a form of supervised learning that looks to find a map $$f: \mathbb{R}^n \rightarrow \mathbb{R}$$ by dividing the measurement space $\chi$, consisting of all the measurement vectors $\vec{x}_i = (x_{i, 1}, x_{i, 2}, ..., x_{i, n})$, using binary trees, such that every measurement vector is mapped to a $j^{th}$ class. For continuous output, a regression is performed within each class.

After some research (and stumbling onto Axel Sunden's thesis) I made a crude start: I defined the measurement vector $\vec{x} =$ (Opening Price, Closing Price, Today's Low, Volume, Fast Stochastic, Slow Stochastic), looking to predict tomorrow's High Price. I already had the code in place, so it was just a matter of changing the input data and training the algorithm. To get the daily historical data, I used Yahoo! Finance's Python API and began with General Electric to train my algo. I used 8977 days to train and tested it on an independent sample of 4489 days. To quantify the output, I classified the predictions as:
  • 1. 'long' - tomorrow's high is forecast to be more than 3% higher than today's.
  • 2. 'flat' - tomorrow's high is forecast to be within [-3, 3]% of today's.
  • 3. 'short' - tomorrow's high is forecast to be more than 3% lower than today's.
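One way to write the three classes down, as a sketch only: the symbols $H_t$ (today's high), $\hat{H}_{t+1}$ (the forecast of tomorrow's high) and $\hat{r}_{t+1}$ are my own notation, and I am assuming the move is measured as a relative change with the boundary cases counted as 'flat'.

$$\hat{r}_{t+1} = \frac{\hat{H}_{t+1} - H_t}{H_t}, \qquad \text{class} = \begin{cases} \text{long} & \hat{r}_{t+1} > 0.03 \\ \text{flat} & -0.03 \le \hat{r}_{t+1} \le 0.03 \\ \text{short} & \hat{r}_{t+1} < -0.03 \end{cases}$$

The actual class for each test day follows from the same rule with the realised high $H_{t+1}$ in place of the forecast.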
Comparing the predicted classes with the actual results for the test sample, the algorithm classified 93% of the days correctly. More importantly, the algo never predicted a 'long' when the actual sample reflected a 'short' move. There were, however, 2 predictions where the actual market reflected a 'long' and the algo called a 'short' move. The figure below shows the results for the last 100 days.


While this is a very crude model, I believe it can be made more successful with improvements, perhaps by using weekly sampled data to give a more definitive long or short market position, and by adding more technical analysis indicators, which I will implement soon.

Thursday, July 2, 2015

Monte-Carlo Option Pricing Via Encapsulation

In the spirit of Object-Oriented Programming, we revisit our previous post using 'C++ Design Patterns and Derivative Pricing' by M Joshi to extend vanilla option pricing from calls alone to puts as well. We wish to encapsulate the code using separate compilation and header files. This allows programmers to use the class without needing to know what is going on behind the scenes.

We begin by defining a class PayOff in the header PayOff1.h. It has an enumeration declaration for the option types, call or put, as well as the constructor for the PayOff, which takes in the strike of the option and the type of the option pay-off. Lastly, in the public section we have the main method for this class, 'double PayOff::operator()(double spot) const', which is given a value of spot and returns the value of the pay-off. Overloading operator() lets a PayOff object be called like a function, i.e. used as a 'functor'. We store the variables 'strike' and 'TheOptionsType' in the private section.

We then define the implementation file 'PayOff1.cpp', which includes the above header file and defines the constructor as well as the method operator(). By separating the declarations from the implementation and keeping the data private, we hide the inner workings of the class from its users. PayOff1.cpp can be found below. Simply, 'PayOff::operator()(double spot) const' looks at the option type, call or put, and defines the pay-off accordingly. Since puts are bets on prices falling, a put pays Strike - Spot when Strike price > Spot price, or else the pay-off is 0.

We then define a header file called 'SimpleMC.h' storing the function declaration for our simulation calculator, whose implementation is stored in 'SimpleMC.cpp'.
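The original PayOff listings are not reproduced here. As a minimal sketch of the class described above, and not necessarily the exact code from the book or this post, the header and implementation might look roughly like this. The two files are shown in one block for brevity; the constructor parameter names and the non-negative strike check are my own assumptions.

```cpp
#include <algorithm>  // std::max
#include <cassert>

// ---- PayOff1.h (declaration only) ----
class PayOff
{
public:
    enum OptionType { call, put };

    PayOff(double strike_, OptionType TheOptionsType_);

    // Functor interface: evaluate the pay-off at a given spot price.
    double operator()(double spot) const;

private:
    double strike;
    OptionType TheOptionsType;
};

// ---- PayOff1.cpp (implementation; in a real project this file
//      would begin with #include "PayOff1.h") ----
PayOff::PayOff(double strike_, OptionType TheOptionsType_)
    : strike(strike_), TheOptionsType(TheOptionsType_)
{
    assert(strike_ >= 0.0);  // assumption of this sketch: no negative strikes
}

double PayOff::operator()(double spot) const
{
    switch (TheOptionsType)
    {
    case call:
        return std::max(spot - strike, 0.0);  // pays only when spot > strike
    case put:
        return std::max(strike - spot, 0.0);  // pays only when strike > spot
    }
    return 0.0;  // not reached for valid option types
}
```

A caller only needs the header: constructing 'PayOff thePut(100.0, PayOff::put)' and evaluating 'thePut(80.0)' gives the put pay-off at a spot of 80 without exposing how it is computed. With a pay-off object of this kind in hand, the remaining piece is the simulation function declared in 'SimpleMC.h'.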
This simulation function is exactly the same as the "call1" function from the previous post http://youth-economics.blogspot.com/2015/06/monte-carlo-methods-for-valuing-call.html except that the function declaration now takes a reference to a PayOff object, thePayOff. Also, since we are now calculating calls and puts, lines 22 and 23 of the previous calculator are replaced with code that evaluates the pay-off through thePayOff.

The main programme is also the same as in the previous post, except when calling the final price of the put or call. The implementation can be found below. Here we use the method operator() to calculate the pay-off for a call and a put, and these values are then plugged into the Monte Carlo engine to calculate call and put prices.

Calculating option prices with Spot = 80, Strike = 100, time = 1 year, volatility = 50% and a risk-free rate of 4%, we use 100 000 paths to calculate Call: 10.3117 Put: 26.6019, which are very close to the theoretical prices given by http://www.option-price.com/index.php
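The SimpleMC listing is not reproduced here either. The following is a rough sketch of an engine of the kind described above, under a few assumptions of mine: the function name SimpleMonteCarlo and its parameter order are hypothetical, the Gaussian draws come from the standard <random> library rather than whatever generator the original used, and a seed parameter is added so runs are repeatable. A compact copy of PayOff is included only so the block compiles on its own.

```cpp
#include <algorithm>  // std::max
#include <cassert>
#include <cmath>      // std::exp, std::sqrt
#include <random>     // std::mt19937_64, std::normal_distribution

// Compact copy of the PayOff functor so this sketch is self-contained.
class PayOff
{
public:
    enum OptionType { call, put };
    PayOff(double strike_, OptionType TheOptionsType_)
        : strike(strike_), TheOptionsType(TheOptionsType_) {}
    double operator()(double spot) const
    {
        return TheOptionsType == call ? std::max(spot - strike, 0.0)
                                      : std::max(strike - spot, 0.0);
    }
private:
    double strike;
    OptionType TheOptionsType;
};

// Hypothetical engine: prices a European option under geometric Brownian
// motion by averaging discounted pay-offs over simulated terminal spots.
double SimpleMonteCarlo(const PayOff& thePayOff, double expiry, double spot,
                        double vol, double r, unsigned long numberOfPaths,
                        unsigned long seed = 42)
{
    assert(numberOfPaths > 0);

    const double variance = vol * vol * expiry;
    const double rootVariance = std::sqrt(variance);
    const double itoCorrection = -0.5 * variance;
    // Deterministic part of the terminal spot under the risk-neutral measure.
    const double movedSpot = spot * std::exp(r * expiry + itoCorrection);

    std::mt19937_64 rng(seed);
    std::normal_distribution<double> gaussian(0.0, 1.0);

    double runningSum = 0.0;
    for (unsigned long i = 0; i < numberOfPaths; ++i)
    {
        const double thisSpot = movedSpot * std::exp(rootVariance * gaussian(rng));
        runningSum += thePayOff(thisSpot);  // the functor decides call vs put
    }

    // Discount the average pay-off back to today.
    return std::exp(-r * expiry) * runningSum / numberOfPaths;
}
```

Calling this with 'PayOff(100.0, PayOff::call)' and then 'PayOff(100.0, PayOff::put)', using expiry 1.0, spot 80.0, vol 0.5, r 0.04 and 100 000 paths, should give prices in the neighbourhood of those quoted above, up to Monte Carlo noise. The engine itself never needs to know which option type it is pricing, which is the point of passing the pay-off in as an object.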