We present a predictive analysis model for the 2019 men’s Cricket World Cup. We believe this predictive strategy would be useful for viewers, sponsors, and team strategists, and would give cricket analysts and commentators insight into the features that play a crucial role in the statistical analysis. The model is built on historical data collected for the 10 participating teams (Afghanistan, Australia, Bangladesh, England, India, New Zealand, Pakistan, South Africa, Sri Lanka, and West Indies). In addition, we test the model on 2015 World Cup data and measure the accuracy of its predictions. The model is currently based on the players who were part of their respective squads in the five most recently concluded tournaments; we plan to extend it as the tournament approaches and once the final squads are announced.
To train our model, we use data collected from every men’s Cricket World Cup. From 1975 to the present, 11 World Cups have been played (1975, 1979, 1983, 1987, 1992, 1996, 1999, 2003, 2007, 2011 and 2015). Note that up to and including the 1983 World Cup each team batted for 60 overs, whereas from 1987 onwards each team has batted for 50 overs. Run scoring has also increased markedly over the last few years, and our features take this into account as well.
All of these features, except the number of ICC trophies won in the last 12 years, are based solely on One-Day International (ODI) records. Each individual feature is converted to a team statistic by taking the overall mean. Features labelled "Recent" cover the period from the 2015 World Cup to the present. Some features were also selected based on the location of the upcoming World Cup.
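The conversion from individual features to a team statistic can be illustrated with a minimal sketch. The feature names (`batting_average`, `strike_rate`) and all numbers below are hypothetical placeholders, not values from the study, and the helper `team_features` is introduced here only for illustration.

```python
from statistics import mean

# Hypothetical per-player ODI features; names and values are illustrative only.
squad = [
    {"batting_average": 40.0, "strike_rate": 90.0},
    {"batting_average": 30.0, "strike_rate": 80.0},
    {"batting_average": 20.0, "strike_rate": 100.0},
]

def team_features(players):
    """Collapse per-player features into one team vector by taking the mean."""
    keys = players[0].keys()
    return {key: mean(p[key] for p in players) for key in keys}

team = team_features(squad)
print(team)
```

Each team then contributes a single feature vector to the training data, regardless of squad size.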
In this research, we present two approaches to predictive analysis: first a classifier approach, then a neural network approach with hidden layers. The classifier approach helps us identify patterns in the data, whereas the neural network helps us identify the weight assigned to each feature after training.
A. Ensemble Classification Approach
An ensemble classifier system combines numerous basic classifiers. Doing so reduces the variance caused by relying on a single training set and yields a more expressive classification concept than a single classifier can. We use 8 basic classifiers in this study; this number was selected by leave-one-out cross-validation on the training data. Ensemble classifiers have proven effective for predictive analysis, hence we adopted one for this research.
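As a rough illustration of how an ensemble size might be picked by leave-one-out validation, consider the sketch below. It rests on several assumptions: the basic classifiers are stand-in one-feature threshold rules (the study does not specify its 8 basic classifiers here), votes are combined by simple majority, and the toy data and every function name are hypothetical.

```python
from statistics import mean

def fit_stump(X, y, j):
    """Fit a one-feature rule: threshold at the midpoint of the two class means."""
    m1 = mean(x[j] for x, t in zip(X, y) if t == 1)
    m0 = mean(x[j] for x, t in zip(X, y) if t == 0)
    return j, (m0 + m1) / 2, 1 if m1 >= m0 else -1

def stump_predict(stump, x):
    j, threshold, sign = stump
    return 1 if sign * (x[j] - threshold) > 0 else 0

def fit_ensemble(X, y, k):
    """Train k basic classifiers, one per feature (a stand-in for real models)."""
    return [fit_stump(X, y, j) for j in range(k)]

def ensemble_predict(stumps, x):
    """Majority vote over the basic classifiers."""
    votes = sum(stump_predict(s, x) for s in stumps)
    return 1 if 2 * votes > len(stumps) else 0

def loo_accuracy(X, y, k):
    """Leave-one-out: hold out each sample once, train on the rest, then test."""
    hits = 0
    for i in range(len(X)):
        model = fit_ensemble(X[:i] + X[i + 1:], y[:i] + y[i + 1:], k)
        hits += ensemble_predict(model, X[i]) == y[i]
    return hits / len(X)

# Toy data (illustrative only): three features per sample, binary label.
X = [[0.9, 0.8, 0.2], [0.8, 0.7, 0.9], [0.7, 0.9, 0.5],
     [0.2, 0.3, 0.4], [0.3, 0.1, 0.8], [0.1, 0.2, 0.3]]
y = [1, 1, 1, 0, 0, 0]

# Odd ensemble sizes avoid tied votes; keep the size with the best accuracy.
best_k = max([1, 3], key=lambda k: loo_accuracy(X, y, k))
print(best_k, loo_accuracy(X, y, best_k))
```

With so few samples, leave-one-out is attractive because every fold trains on all but one observation.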
B. Neural Network Approach
In the neural network approach, we use 12 hidden layers. The number of hidden layers was chosen by leave-one-out cross-validation on the training data. The network is trained with gradient descent using backpropagation.
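Gradient descent with backpropagation can be sketched in a few lines of plain Python. This is only an illustration under assumptions not stated in the study: sigmoid activations, a cross-entropy loss, per-example updates, and a toy network with a single hidden layer rather than 12. The layer sizes, learning rate, and data are all hypothetical.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def init_network(sizes, rng):
    """One (weights, biases) pair per layer; sizes = [inputs, hidden..., 1]."""
    return [
        ([[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)],
         [0.0] * n_out)
        for n_in, n_out in zip(sizes, sizes[1:])
    ]

def forward(network, x):
    """Return the activations of every layer, input included."""
    activations = [x]
    for weights, biases in network:
        prev = activations[-1]
        activations.append([
            sigmoid(sum(w * a for w, a in zip(row, prev)) + b)
            for row, b in zip(weights, biases)
        ])
    return activations

def train_step(network, x, target, lr):
    """One gradient-descent update from a single example via backpropagation."""
    activations = forward(network, x)
    # Sigmoid output with cross-entropy loss: dLoss/dz = output - target.
    delta = [activations[-1][0] - target]
    for layer in range(len(network) - 1, -1, -1):
        weights, biases = network[layer]
        prev = activations[layer]
        prev_delta = None
        if layer > 0:
            # Push the error back one layer before the weights are changed.
            prev_delta = [
                sum(weights[k][j] * delta[k] for k in range(len(delta)))
                * prev[j] * (1.0 - prev[j])
                for j in range(len(prev))
            ]
        for k, d in enumerate(delta):
            for j, a in enumerate(prev):
                weights[k][j] -= lr * d * a
            biases[k] -= lr * d
        delta = prev_delta

def loss(network, X, y):
    """Mean cross-entropy, with probabilities clamped away from 0 and 1."""
    total = 0.0
    for x, t in zip(X, y):
        p = min(max(forward(network, x)[-1][0], 1e-12), 1.0 - 1e-12)
        total -= t * math.log(p) + (1 - t) * math.log(1.0 - p)
    return total / len(X)

# Toy data (illustrative only): two features per sample, binary label.
X = [[0.9, 0.8], [0.8, 0.9], [0.2, 0.1], [0.1, 0.3]]
y = [1, 1, 0, 0]
network = init_network([2, 3, 1], random.Random(0))
before = loss(network, X, y)
for _ in range(500):
    for x, t in zip(X, y):
        train_step(network, x, t, lr=0.5)
after = loss(network, X, y)
print(before, after)
```

Passing a longer `sizes` list to `init_network` adds hidden layers, although very deep sigmoid networks tend to train slowly with plain gradient descent.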
2015 Cricket World Cup
We first validate our approach by estimating, for each of these 10 teams, the probability of winning the 2015 World Cup and comparing the estimates with the actual 2015 results. The probabilities are estimated from data collected from the 1975-2011 World Cups. Although the 2015 World Cup was played among 14 countries, we focus on the results of these 10 teams. Table 2 lists the 2015 probabilities from both the classifier and neural network approaches, along with the actual result.
2019 Men’s Cricket World Cup
We now predict the 2019 World Cup results using data collected from the 1975-2015 World Cups. Table 3 presents the probabilities from the classification approaches, based on data collected until 18th July 2018.
Defending champions Australia receive a relatively low probability because of their recent poor performance in England. According to the classifier, the top two contenders for the World Cup are England and India.
Who will win Cricket World Cup 2019?
The classifier approach predicts that England has the highest probability of winning the 2019 Cricket World Cup. This could be because recent World Cups were won by host nations, and because of the team's excellent record over the last few years at home, where the World Cup will be played.
Meanwhile, the neural network approach predicts India and Pakistan as the top two contenders. This could be due to their excellent performance in the 2017 ICC Champions Trophy.
India also won the 2013 Champions Trophy and reached the final in 2017, both played in England; hence India gets a relatively high probability from both the neural network and classifier approaches. However, India's middle-order performance is considerably weaker than England's, so the classifier predicts England as the winner, whereas the neural network still gives India the higher chance.
Some other interesting notes from this research: if India plays without Virat Kohli and Rohit Sharma in the upcoming World Cup, India's chances of winning reduce to 2%, whereas if South Africa plays with AB de Villiers, their chances go up to 18%.