Abstract
This letter presents the ideas and methods of the winning solution* for the Kaggle Algorithmic Trading Challenge. This analysis challenge took place between 11th November 2011 and 8th January 2012, and 264 competitors submitted solutions. The objective of this competition was to develop empirical predictive models to explain stock market prices following a liquidity shock. The winning system builds upon the optimal composition of several models and a feature extraction and selection strategy. We used Random Forest as a modeling technique to train all sub-models as a function of an optimal feature set. The modeling approach can cope with highly complex data having low Maximal Information Coefficients between the dependent variable and the feature set and provides a feature ranking metric which we used in our feature selection algorithm.