Predicting the 2018-2019 NBA Playoff. ALREADY??

Khai LAI
Dec 8, 2018
6 min read

Hello guys, it’s me again!!

After waiting for roughly 25 games to make sure we have enough game data to work with (and also partly because of some procrastination), I finally have a large enough sample size to predict how the 2018-2019 NBA Playoffs will unfold!

As of the time of writing, ESPN’s NBA standings through the first week of December 2018 (yay Christmas!) look something like this. Note that I added a bar to mark off the top 8 teams in each conference.

As the year progresses, the standings will obviously change before the end of the season. Thus, to predict the seeding at the end of the regular season, I included each team’s current PPG (Average Points per Game) and Opp PPG (Average Opponent Points per Game). These two metrics give us a brief indication of how teams are doing both offensively and defensively on the court.

Working under the assumption that teams maintain this level of performance throughout the year, we can use the Pythagorean Winning Percentage equation to predict how many regular-season games each team is expected to win. As defined by NBA Stuffer, the Pythagorean Winning Percentage is a “method that gives an expected winning percentage using the ratio of a team’s wins and losses related to the number of points it scored and allowed [in those past games]”. Multiplying this Expected Winning Percentage by the 82 games in a season gives the expected number of wins each team should have by the end.
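For anyone who wants to follow along, here is a minimal Python sketch of that calculation. The post does not say which exponent was used, so the 13.91 below (a value commonly cited for the NBA) is an assumption, and the function names and sample numbers are made up purely for illustration.

```python
def pythagorean_win_pct(ppg: float, opp_ppg: float, exponent: float = 13.91) -> float:
    """Expected winning percentage from points scored and allowed per game.

    The exponent is an assumption; the post does not state which one it used.
    """
    scored = ppg ** exponent
    allowed = opp_ppg ** exponent
    return scored / (scored + allowed)


def expected_wins(ppg: float, opp_ppg: float, games: int = 82,
                  exponent: float = 13.91) -> float:
    """Expected wins over a season of `games` games."""
    return games * pythagorean_win_pct(ppg, opp_ppg, exponent)


# Hypothetical team that scores 110.0 and allows 105.0 points per game.
print(round(expected_wins(110.0, 105.0), 1))
```

A team that scores exactly as much as it allows comes out at a 50% winning percentage, or 41 expected wins, which is a quick sanity check on the formula.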

Taking only the top 8 teams in each conference by predicted wins, we arrive at the following expected seeding.

Note an interesting phenomenon: the New Orleans Pelicans (NOP), currently the 11th seed, displace the Portland Trail Blazers (POR), currently the 7th seed, in the final expected seeding.

This is because the Pythagorean Winning Percentage takes into account a team’s true offensive and defensive capabilities when calculating the expected win rate. We are only 25-27 games into the season, and with such a small sample size, teams might get lucky (or unlucky) here and there and end up in seeding positions that don’t really reflect their true offensive and defensive performance. As the season progresses and the number of games increases, and provided teams stay consistent performance-wise, each team’s win count should converge to, or at least get very close to, its theoretical number of expected wins.

Doing so, we arrive at the following predicted Playoffs picture for the 2018-2019 Season.

When I was deciding how to predict what would happen during the playoffs, I considered many models. These included the Pythagorean Winning Percentage model I used to predict the playoffs last year (check out that post here). However, the Expected Winning Percentage (EW%) approach relies heavily on existing match-up data between two teams (namely, a team’s specific PPG and Opp PPG against a given opponent), and since teams have played only 25-27 games so far, no team has had the chance to play all 29 others yet. Thus, for some of the predicted first-round match-ups above, we have literally ZERO data to extrapolate from.

I also considered using a Neural Network to predict the NBA Playoffs. However, I am still new to Machine Learning (in fact, I literally just learned how to implement basic back-propagation a few days ago), so I am not yet confident in my ability to accurately implement such a complex learning algorithm for this particular problem!

After doing some further research, I came upon The Basketball Distribution, who posted a blog about a highly interesting way to predict the final score of an NCAA game. Even though the original post was meant to predict college basketball results, many of the statistical metrics translate to NBA-level basketball! For this post, I will tweak the original approach a little by getting rid of some steps that add a lot of complexity to the model without improving the results by any significant amount. I will simplify The Basketball Distribution’s approach into the following steps:

Step 1. Find each team’s Offensive and Defensive Efficiency (aka their Points Scored/Allowed Per 100 possessions)

Using the 2018-2019 Hollinger Team Statistics for the Western match-up between the predicted 1st-seeded Denver Nuggets and 8th-seeded Dallas Mavericks, we find that the teams have the following Offensive Efficiency (OFF EFF) and Defensive Efficiency (DEF EFF):

Step 2. Find each team’s PACE (possessions per game), and the League Average PACE.

The PACE information is also provided on the 2018-2019 Hollinger Team Statistics dataset.

The League Average PACE is found by simply averaging the PACE of all 30 teams (as of the first week of December 2018).

In this case, the League Average PACE is 102.96 possessions per game.

By multiplying the two teams’ PACEs together and dividing by the League Average PACE, you arrive at the expected PACE the teams would play at during that particular match-up (denoted as VS PACE).
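In code, the VS PACE calculation could look like the sketch below. The league average of 102.96 comes from this post; the function name and the two team PACE values are hypothetical placeholders, not the actual Hollinger figures.

```python
LEAGUE_AVG_PACE = 102.96  # league average PACE quoted in this post


def vs_pace(pace_a: float, pace_b: float,
            league_avg_pace: float = LEAGUE_AVG_PACE) -> float:
    """Expected possessions per game when team A plays team B."""
    return pace_a * pace_b / league_avg_pace


# Hypothetical PACE values, used only to show the call.
print(round(vs_pace(104.0, 101.0), 2))
```

Note that two faster-than-average teams end up with a VS PACE higher than either team's own PACE, and two slow teams end up even slower, which is the intended effect of dividing by the league average.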

Thus,

Step 3. For each team, multiply their OFF EFF by the opponent’s DEF EFF and divide the product by the League Average OFF EFF.

The League Average OFF EFF is found the same way, by averaging the OFF EFFs of all 30 teams, which at this point in time turns out to be 106.51.

By multiplying each team’s OFF EFF by their opponent’s DEF EFF and dividing the product by the League Average OFF EFF, we arrive at what I call the Exaggerated Points Per Possession (denoted by EPPP). By itself, EPPP offers very little intuition. Just think of it as a necessary intermediate product that will be used to compute final numbers that will make sense to us later.
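A rough Python sketch of the EPPP step follows. The league average of 106.51 is the figure from this post, while the function name and the efficiency values in the example are hypothetical. Since both efficiencies are expressed per 100 possessions, the result is on that same per-100 scale.

```python
LEAGUE_AVG_OFF_EFF = 106.51  # league average OFF EFF quoted in this post


def eppp(off_eff: float, opp_def_eff: float,
         league_avg_off_eff: float = LEAGUE_AVG_OFF_EFF) -> float:
    """Exaggerated Points Per Possession for one team in a match-up.

    off_eff:     the team's points scored per 100 possessions
    opp_def_eff: the opponent's points allowed per 100 possessions
    """
    return off_eff * opp_def_eff / league_avg_off_eff


# Hypothetical efficiencies, used only to show the call.
print(round(eppp(110.0, 104.0), 2))
```

One way to build intuition: against an opponent whose DEF EFF is exactly the league average, a team's EPPP is just its own OFF EFF. A better (lower) opposing DEF EFF pulls it down, and a worse one pushes it up.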

Step 4. We can now calculate the predicted game score between the two teams by multiplying each team’s EPPP by the predicted game PACE (VS PACE) and dividing each product by 100 to get the final predicted score for each team.
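Putting Steps 1 through 4 together, a minimal sketch might look like this. The `Team` class, the `predict_scores` function, and the two sample teams are all hypothetical and only there to show the flow; the two league averages are the values quoted in this post.

```python
from dataclasses import dataclass


@dataclass
class Team:
    name: str
    off_eff: float  # points scored per 100 possessions
    def_eff: float  # points allowed per 100 possessions
    pace: float     # possessions per game


def predict_scores(a: Team, b: Team,
                   league_pace: float = 102.96,
                   league_off_eff: float = 106.51) -> tuple[float, float]:
    """Return (predicted score for a, predicted score for b)."""
    game_pace = a.pace * b.pace / league_pace          # Step 2: VS PACE
    eppp_a = a.off_eff * b.def_eff / league_off_eff    # Step 3: EPPP for a
    eppp_b = b.off_eff * a.def_eff / league_off_eff    # Step 3: EPPP for b
    # Step 4: scale the per-100-possession figures to the expected game pace.
    return eppp_a * game_pace / 100, eppp_b * game_pace / 100


# Hypothetical teams with made-up numbers, not real Hollinger data.
team_a = Team("Team A", off_eff=110.0, def_eff=104.0, pace=101.0)
team_b = Team("Team B", off_eff=108.0, def_eff=107.0, pace=104.0)
score_a, score_b = predict_scores(team_a, team_b)
print(f"{team_a.name} {score_a:.1f} - {team_b.name} {score_b:.1f}")
```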

Thus,

Step 5. Treating the predicted scores as a team’s PPG and Opp PPG, we can then plug them into the Pythagorean Winning Percentage formula to calculate a team’s chance of winning the series.

Using the same procedure, we can now predict the rest of the First Round, and basically the whole Playoffs, with ease.
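As a sketch of this last step, the function below feeds two predicted scores into the Pythagorean formula and returns each side's chance. The exponent of 13.91 is an assumption (the post does not state one), and the function name and example scores are hypothetical.

```python
def matchup_win_chances(score_a: float, score_b: float,
                        exponent: float = 13.91) -> tuple[float, float]:
    """Pythagorean win chances for both teams from their predicted scores.

    score_a plays the role of team A's PPG and score_b of its Opp PPG.
    The exponent is an assumption, not a value taken from the post.
    """
    a = score_a ** exponent
    b = score_b ** exponent
    chance_a = a / (a + b)
    return chance_a, 1.0 - chance_a


# Hypothetical predicted scores, used only to show the call.
chance_a, chance_b = matchup_win_chances(112.0, 108.0)
print(f"Team A: {chance_a:.1%}, Team B: {chance_b:.1%}")
```

Because of the large exponent, even a gap of a few predicted points turns into a fairly lopsided win chance.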

FIRST ROUND

CONFERENCE SEMIFINALS

CONFERENCE FINALS

2018-2019 NBA FINALS

Alternative Route

Seeing how close the match-up between the Toronto Raptors (TOR) and Milwaukee Bucks (MIL) is in the Conference Finals, I also considered the case where Toronto wins and advances to play the Warriors.

Surprisingly, even when we see a Finals match-up between Golden State and Toronto, the model still predicts the Golden State Warriors to lose! I guess Kawhi Leonard is indeed the Warriors’ worst nightmare! Too bad Zaza is gone 😊.

Weaknesses in the Model

1) The model in this post works solely on the assumption that teams will maintain their current offensive and defensive performance consistently throughout the regular season and postseason, which is often not the case.

o Factors such as injuries, trades and coaching changes (which happen a lot in the NBA) could dramatically affect a team’s offensive and defensive performance, and PACE.

o Teams tend to play a lot more aggressively, and better, during the playoffs. The model uses regular-season statistical metrics that might not accurately reflect the level of play a team is capable of during the postseason. But again, this is statistical prediction, not magic 😊

2) The Eastern Conference is significantly weaker than the West. Thus, since teams play more games within their own conference (so the NBA can minimize travel costs), teams in the East might boast inflated statistical figures in terms of performance.

3) Conversely, since teams in the West are much stronger talent-wise and mostly play each other, it is possible that their statistical figures are deflated.

4) No Home/Away game element is considered. Typically, a team would play better at Home than Away.

5) No random element is considered; again, the model assumes teams function at the same level every game. Players might have random fluctuations in terms of both offensive and defensive performance. Sometimes, when a player is just feeling it, he may score way more points than expected. Of course, the opposite is true as well. To improve the model, I might consider implementing a Monte Carlo simulation next time around.
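To give a flavor of what a Monte Carlo approach could look like, here is a very rough sketch that simulates many best-of-seven series. This is not something the model in this post does: it assumes the win probability is a per-game chance that stays the same every game, and the function name, the fixed seed, and the 0.6 example value are all hypothetical.

```python
import random


def simulate_series_win_prob(p_game: float, n_sims: int = 10_000,
                             wins_needed: int = 4, seed: int = 0) -> float:
    """Estimate the chance of winning a series by simulating it n_sims times.

    p_game is assumed to be a constant per-game win probability.
    """
    rng = random.Random(seed)  # seeded so the estimate is reproducible
    series_won = 0
    for _ in range(n_sims):
        wins = losses = 0
        # Play games until one side reaches the required number of wins.
        while wins < wins_needed and losses < wins_needed:
            if rng.random() < p_game:
                wins += 1
            else:
                losses += 1
        if wins == wins_needed:
            series_won += 1
    return series_won / n_sims


# Hypothetical 60% per-game favorite.
print(simulate_series_win_prob(0.6))
```

A fuller version could draw each game's score from a distribution around the predicted score rather than flipping a weighted coin, which is closer to the player-level fluctuation described in point 5.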

Well, there you have it! That was my attempt at predicting the 2018-2019 NBA Playoffs. I hope you enjoyed reading this post, and have a good day (or night) 😊

Cheers,

Khai Lai

Summary Playoffs Picture
