Predicting the 2018-2019 Playoffs using a Pseudo-Stochastic Neural Network

Khai LAI
Mar 11, 2019

Hello everyone,

So... roughly 3 months have passed since my last playoff prediction. During that time, a lot has changed in the NBA: the 76ers added All-Star-caliber Tobias Harris to their already-stacked roster, the Lakers are missing out on the playoffs due to major injury problems, and the Warriors are now back at the number 1 spot.

So, using the latest data as of the time of writing, I want to make another prediction, but this time by applying some basic Machine Learning techniques :)

Using data from the past 10 seasons, I constructed a simple 3-layer Neural Network with 3 input features, one hidden layer of 50 units (for computation), and a single output unit to predict the champion.
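A minimal NumPy sketch of this architecture (3 inputs, 50 hidden units, 1 output) could look like the following. The sigmoid activations, the initialization scale, and the names `init_params` and `forward` are assumptions made for illustration, since the post does not spell them out.

```python
import numpy as np

def sigmoid(z):
    """Logistic function, squashing values into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def init_params(n_in=3, n_hidden=50, n_out=1, seed=0):
    """Small random weights and zero biases (initialization scheme is an assumption)."""
    rng = np.random.default_rng(seed)
    return {
        "W1": rng.normal(0.0, 0.1, size=(n_in, n_hidden)),
        "b1": np.zeros(n_hidden),
        "W2": rng.normal(0.0, 0.1, size=(n_hidden, n_out)),
        "b2": np.zeros(n_out),
    }

def forward(X, params):
    """Map a (teams x 3) matrix to one championship score per team."""
    hidden = sigmoid(X @ params["W1"] + params["b1"])     # shape (teams, 50)
    return sigmoid(hidden @ params["W2"] + params["b2"])  # shape (teams, 1)

# Random stand-in for one season: 30 teams x 3 features (not real NBA data).
X_demo = np.random.default_rng(1).normal(size=(30, 3))
scores = forward(X_demo, init_params())
print(scores.shape)  # (30, 1)
```

Each of the 30 rows gets its own score between 0 and 1, which can be read as how "champion-like" that team's statistics look to the network.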

My input data-set is fairly simple. I have 10 different .csv files containing all 30 teams' Margin of Victory, Offensive Rating, and Defensive Rating across 10 different seasons (from 2008 to the latest data available for the current season). The teams are ordered alphabetically. This way, each season gives an unlabeled X matrix of dimension 30 x 3 to feed into the Neural Network, and I can still tell which row of data corresponds to which team.
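To make the layout concrete, here is a small sketch of how one season's file could be turned into an alphabetically ordered feature matrix. The column names (`Team`, `MOV`, `ORtg`, `DRtg`), the function name `load_season`, and the three-team demo values are all hypothetical, since the actual files are not shown in this post.

```python
import csv
import io

FEATURES = ("MOV", "ORtg", "DRtg")  # hypothetical column names

def load_season(csv_file):
    """Return (teams, X): teams sorted alphabetically, X with one 3-feature row per team."""
    rows = sorted(csv.DictReader(csv_file), key=lambda r: r["Team"])
    teams = [r["Team"] for r in rows]
    X = [[float(r[name]) for name in FEATURES] for r in rows]
    return teams, X

# Made-up three-team file standing in for a full 30-team season.
demo_csv = io.StringIO(
    "Team,MOV,ORtg,DRtg\n"
    "Team C,-2.0,108.0,110.0\n"
    "Team A,5.5,112.0,106.5\n"
    "Team B,0.5,110.0,109.5\n"
)
teams, X = load_season(demo_csv)
print(teams)  # ['Team A', 'Team B', 'Team C']
print(X[0])   # [5.5, 112.0, 106.5]
```

Because the rows are always sorted the same way, the team names never need to be fed into the network; the row index alone identifies the team.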

I also have 9 other .csv files, one for each of the 9 seasons from 2008 to 2017, labeling that year's NBA champion in the form of a binary-valued vector consisting of 29 0's and a single 1. The position of the 1 in the vector corresponds to the winning team. For example, the Golden State Warriors are 10th in alphabetical order, so in any year that they won the championship, the Y-matrix would be a 30 x 1 vector consisting of 29 0's and a single 1 in the 10th position.
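The label vector can be sketched in a few lines. The helper name `champion_vector` is hypothetical, and the list uses current franchise names as an assumption (the names used in the actual files are not shown here).

```python
# Current franchise names in alphabetical order (an assumption for illustration).
TEAMS = [
    "Atlanta Hawks", "Boston Celtics", "Brooklyn Nets", "Charlotte Hornets",
    "Chicago Bulls", "Cleveland Cavaliers", "Dallas Mavericks", "Denver Nuggets",
    "Detroit Pistons", "Golden State Warriors", "Houston Rockets", "Indiana Pacers",
    "Los Angeles Clippers", "Los Angeles Lakers", "Memphis Grizzlies", "Miami Heat",
    "Milwaukee Bucks", "Minnesota Timberwolves", "New Orleans Pelicans",
    "New York Knicks", "Oklahoma City Thunder", "Orlando Magic",
    "Philadelphia 76ers", "Phoenix Suns", "Portland Trail Blazers",
    "Sacramento Kings", "San Antonio Spurs", "Toronto Raptors", "Utah Jazz",
    "Washington Wizards",
]

def champion_vector(teams, champion):
    """Return a list of 0s with a single 1 at the champion's position."""
    if champion not in teams:
        raise ValueError(f"unknown team: {champion}")
    return [1 if team == champion else 0 for team in teams]

y = champion_vector(TEAMS, "Golden State Warriors")
print(y.index(1) + 1)  # 10 (the 10th position, i.e. index 9 when counting from 0)
print(sum(y), len(y))  # 1 30
```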

To train the neural network, I feed in the data for one season at a time and let the network learn its parameters on that data-set over 1000 iterations. Keeping the previously learned parameters, I then feed in the corresponding data-set for the next season. This allows the Neural Network to continuously adjust its weights and adapt the learning process to any changing statistical trend in the new season. The data-feeding/training continues in this manner until the network has fitted its parameters over all 9 labeled seasons. Since the data-set is fed into the network one season at a time instead of all seasons in one single batch, this training method somewhat resembles stochastic training of a neural network. Thus, I consider this method to be 'pseudo-stochastic'.
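The season-by-season procedure can be sketched roughly as below. This is only an illustration under several assumptions that are not stated in the post: sigmoid activations, a cross-entropy loss, plain batch gradient descent with a learning rate of 0.1, and random synthetic data in place of the real seasons. The name `train_one_season` is hypothetical.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_one_season(X, y, params, iterations=1000, lr=0.1):
    """Run batch gradient descent on one season, starting from the given params."""
    W1, b1, W2, b2 = params
    m = X.shape[0]
    for _ in range(iterations):
        hidden = sigmoid(X @ W1 + b1)    # (m, 50)
        out = sigmoid(hidden @ W2 + b2)  # (m, 1)
        # Gradient of the mean cross-entropy w.r.t. the output pre-activation.
        d_out = (out - y) / m
        # Backpropagate through the hidden sigmoid layer (uses W2 before its update).
        d_hidden = (d_out @ W2.T) * hidden * (1.0 - hidden)
        W2 = W2 - lr * (hidden.T @ d_out)
        b2 = b2 - lr * d_out.sum(axis=0)
        W1 = W1 - lr * (X.T @ d_hidden)
        b1 = b1 - lr * d_hidden.sum(axis=0)
    return W1, b1, W2, b2

rng = np.random.default_rng(0)
params = (rng.normal(0, 0.1, (3, 50)), np.zeros(50),
          rng.normal(0, 0.1, (50, 1)), np.zeros(1))

# Synthetic stand-in for the 9 labeled seasons (random features, random champion).
seasons = []
for _ in range(9):
    X = rng.normal(size=(30, 3))
    y = np.zeros((30, 1))
    y[rng.integers(30)] = 1.0
    seasons.append((X, y))

for X, y in seasons:
    # The weights learned on one season are the starting point for the next.
    params = train_one_season(X, y, params)
```

The key point is the last loop: `params` is never reset, so every season nudges the weights left behind by the previous one.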

Using the fitted parameters learned by our simple Neural Network, I can now predict the 2018-2019 season NBA champion!
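One natural way to read a champion off the network's output, and the one assumed in this sketch, is to pick the team whose row receives the highest score. The function name `predict_champion` and the demo scores are made up for illustration and are not real model output.

```python
def predict_champion(teams, scores):
    """Return the team with the highest network output."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return teams[best_index]

# Made-up scores for three hypothetical teams (not real model output).
demo_teams = ["Team A", "Team B", "Team C"]
demo_scores = [0.12, 0.71, 0.33]
print(predict_champion(demo_teams, demo_scores))  # Team B
```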

To check that our Neural Network generalizes well to potential new data, I compare several versions of it with different values of regularization. For an unregularized network (meaning that the network may overfit our training data and perhaps not generalize well to a new data-set), we arrive at the following results:

I then compare this with the results generated by regularized versions of the same Neural Network. If the predicted team turns out not to be the actual winner, I include a brief description of where in the playoffs that team was eliminated.

Key:

WCF = Western Conference Finals

ECF = Eastern Conference Finals

WCSF = Western Conference Semi-Finals

ECSF = Eastern Conference Semi-Finals

DMP = Didn't Make Playoffs

Finals = NBA Finals

Champion = NBA Champion that season

Note: the higher the regularization parameter, the less closely the network fits its training data-set and thus the better it generalizes to unseen examples (up to a certain degree). If the regularization parameter is too high, the network severely underfits, resulting in overly simple and inaccurate predictions.
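To illustrate what the regularization parameter does, here is a sketch of a cost function with a penalty on the weights. The post does not say which kind of regularization was used, so the L2 (weight decay) form, the cross-entropy loss, and the name `regularized_cost` are assumptions.

```python
import numpy as np

def regularized_cost(y_true, y_pred, weights, lam):
    """Cross-entropy plus an L2 penalty of strength `lam` on the weights (not biases)."""
    m = y_true.shape[0]
    eps = 1e-12  # guards against log(0)
    cross_entropy = -np.mean(
        y_true * np.log(y_pred + eps) + (1 - y_true) * np.log(1 - y_pred + eps)
    )
    l2_penalty = (lam / (2 * m)) * sum(np.sum(W ** 2) for W in weights)
    return cross_entropy + l2_penalty

# Made-up labels, predictions, and weights, just to show the effect of `lam`.
y_true = np.array([[0.0], [1.0], [0.0]])
y_pred = np.array([[0.1], [0.8], [0.2]])
weights = [np.ones((3, 50)), np.ones((50, 1))]
for lam in (0.0, 1.0, 10.0):
    print(lam, float(regularized_cost(y_true, y_pred, weights, lam)))
```

With the same predictions, a larger `lam` makes large weights more expensive, so the optimizer is pushed toward smaller weights and a simpler fit.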

Overall, even though the regularized versions have a low absolute accuracy (30% across all tested regularization parameters), one can clearly see that they all do a fairly decent job of filtering out top teams and predicting them as the Champion. Looking at the actual regular-season ranking of the predicted winners, 6 to 7 out of 10 were ranked 1st in the league in their corresponding season.

However, there were also occasions where the Charlotte Hornets were predicted to win, despite being at the very bottom of the league. This sheds light on how our Neural Network functions. I think it learned to single out teams with extreme values in terms of the 3 chosen statistics and pick from that selection.

Considering that neither the teams' names nor their actual rankings are provided in the unlabeled input data-set, our Neural Network does a very good job of predicting the top-ranked teams. Furthermore, across all regularization parameters, the predicted teams generally go very far into the playoffs. If they don't win the actual championship, they usually get eliminated in the Finals or their respective Conference Finals.

Unsurprisingly, across all versions of the Neural Network, the Milwaukee Bucks are still favored to become the 2018-2019 season NBA champions! This matches my previous prediction from December, even though the two were generated by completely different methods.

Similar to my last model 3 months ago, this Neural Network still has some noteworthy weaknesses:

It only tells you who the new Champion is, not how the Playoffs would actually unfold.
It still faces the potential problem that team statistics in the Eastern Conference might be inflated compared to those in the Western Conference, since Eastern teams are generally weaker.
No random elements are considered. The Neural Network assumes all teams carry their current offensive and defensive performance into the playoffs, which is rarely the case in reality.
It does not factor in sudden injuries that may happen.

But for now, this prediction will do! :)

- Khai Lai
