Seven Overtimes NCAA Basketball Analytics Ratings

For as long as I've run this site (since 2011), my ranking and game prediction methodology has been essentially unchanged. It has six steps, each building on the output of the one before:
   -1) Creating win probabilities for any margin / time / team strength combination using historical data
   -2) Creating a weighted average win probability for each game (using the win probabilities)
   -3) Creating team rankings based on average win probabilities (using the weighted averages)
   -4) Creating a strength of schedule (SOS) for each team based on its opponents' rankings (using the team rankings)
   -5) Creating updated team rankings with a further iteration that incorporates SOS (using the SOS)
   -6) Creating game predictions based on the team rankings (using the team rankings)
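To make steps 3 through 5 concrete, here is a minimal Python sketch of the idea. It is not the site's actual code (that lives in Excel, SQL and R), and everything in it is an assumption for illustration: the toy `games` list, the function names, and especially the way SOS is folded back into the rating in `adjusted_ratings`, since the exact adjustment formula is not spelled out here.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical toy input: (team, opponent, team's weighted-average win
# probability for that game). Each game appears once per side.
games = [
    ("A", "B", 0.70), ("B", "A", 0.30),
    ("A", "C", 0.60), ("C", "A", 0.40),
    ("B", "C", 0.55), ("C", "B", 0.45),
]

def base_ratings(games):
    """Step 3: a team's rating is its mean per-game win probability."""
    by_team = defaultdict(list)
    for team, _opponent, win_prob in games:
        by_team[team].append(win_prob)
    return {team: mean(values) for team, values in by_team.items()}

def strength_of_schedule(games, ratings):
    """Step 4: SOS is the mean rating of the opponents a team has faced."""
    by_team = defaultdict(list)
    for team, opponent, _win_prob in games:
        by_team[team].append(ratings[opponent])
    return {team: mean(values) for team, values in by_team.items()}

def adjusted_ratings(ratings, sos, weight=0.5):
    """Step 5 (ASSUMED formula): shift each rating by how far the team's
    SOS sits above or below an average schedule (0.5)."""
    return {team: ratings[team] + weight * (sos[team] - 0.5) for team in ratings}

ratings = base_ratings(games)
sos = strength_of_schedule(games, ratings)
updated = adjusted_ratings(ratings, sos)
for team in sorted(updated, key=updated.get, reverse=True):
    print(team, round(ratings[team], 4), round(sos[team], 4), round(updated[team], 4))
```

In this toy data, team A has the best raw rating but the weakest schedule, so its adjusted rating drops slightly while B and C move up.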

While I believe the key difference between my site and any other ranking system out there is my reliance on cumulative win probabilities, I've always felt that I haven't done enough in some of the other steps to truly maximize the usefulness and accuracy of my system. In an effort to improve it, I am working on a new methodology based on an updated linear regression and a process that leverages R. The old methodology used a similar process but was built with a combination of Excel and SQL. You can find both my game history data and the R regression code in the links below.

Game History Data (Excel)   R Prediction Code

This is the output of the new model I plan to use to predict scoring margin this year. It's a simple linear model based on four factors: the rankings of the two teams (home and visitor) and two flags indicating a neutral site and conference play, respectively.

Example Game #1
Tennessee @ Vanderbilt (Home Conference Game for Vanderbilt)
Tennessee Ranking: .537
Vanderbilt Ranking: .597
Vanderbilt, the home team, would be favored by: 1.963 + 123.5(.597) - 119.21(.537) - 3.57(0) - 0.77(1) = 10.96 points.

Example Game #2
Kansas vs. Vanderbilt (Non-Conference Neutral Site Game)
Kansas Ranking: .647
Vanderbilt Ranking: .597
Kansas, the higher-ranked team, would be favored by: 1.963 + 123.5(.647) - 119.21(.597) - 3.57(1) - 0.77(0) = 7.18 points.
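The two examples can be reproduced with a few lines of Python. This is only a sketch: the function and constant names are mine, and the coefficients are copied exactly as printed in the formulas above. With those printed values the arithmetic comes out to roughly 10.91 and 7.13 points, about 0.05 below the figures quoted, presumably because the printed coefficients are rounded. Example #2 uses .647 for Kansas, the value that appears in the formula.

```python
# Coefficients as printed in the worked examples (likely rounded).
INTERCEPT = 1.963
HOME_COEF = 123.5
VISITOR_COEF = -119.21
NEUTRAL_COEF = -3.57
CONFERENCE_COEF = -0.77

def predicted_margin(home_rating, visitor_rating, neutral_site, conference_game):
    """Predicted scoring margin for the team in the 'home' slot of the model."""
    return (INTERCEPT
            + HOME_COEF * home_rating
            + VISITOR_COEF * visitor_rating
            + NEUTRAL_COEF * int(neutral_site)
            + CONFERENCE_COEF * int(conference_game))

# Example #1: Tennessee @ Vanderbilt, conference game on Vanderbilt's floor.
game1 = predicted_margin(0.597, 0.537, neutral_site=False, conference_game=True)

# Example #2: Kansas vs. Vanderbilt, neutral site, non-conference.
# Kansas is placed in the 'home' slot, as in the formula above.
game2 = predicted_margin(0.647, 0.597, neutral_site=True, conference_game=False)

print(f"Example 1: {game1:.2f} points")
print(f"Example 2: {game2:.2f} points")
```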

While the model to predict the scoring margin is a linear model, the updated model to predict the winner of the game is a logistic regression, fit using the glm function in R with the logit link. This measures the effect each of the same variables (home team ranking, visitor ranking, neutral site, conference game) has on the probability of the home team winning.
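For readers unfamiliar with the logit link, the sketch below shows how a fitted logistic model turns those four inputs into a win probability. It is in Python rather than R, the function name is mine, and the coefficients are placeholders I made up purely to show the mechanics; they are not the fitted values from the model.

```python
import math

def win_probability(home_rating, visitor_rating, neutral_site, conference_game, coef):
    """Home-team win probability from a logit model.

    The linear predictor z is built the same way as in the margin model;
    the inverse logit 1 / (1 + exp(-z)) maps it onto the (0, 1) interval.
    """
    z = (coef["intercept"]
         + coef["home"] * home_rating
         + coef["visitor"] * visitor_rating
         + coef["neutral"] * int(neutral_site)
         + coef["conference"] * int(conference_game))
    return 1.0 / (1.0 + math.exp(-z))

# PLACEHOLDER coefficients for illustration only, NOT the fitted model.
PLACEHOLDER_COEF = {
    "intercept": 0.2,
    "home": 15.0,
    "visitor": -15.0,
    "neutral": -0.4,
    "conference": -0.1,
}

p_home = win_probability(0.597, 0.537, False, True, PLACEHOLDER_COEF)
print(f"Home win probability (placeholder coefficients): {p_home:.3f}")
```

Whatever the real coefficients are, the shape is the same: a linear predictor of zero means a 50/50 game, and the probability flattens out toward 0 or 1 as the predictor moves away from zero.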

As you can see in this chart, the relationship between home team margin and win probability is a typical logistic curve. There is less data on the low end because there are very few games where the home team is a huge underdog.

This table shows the difference between my old method and the new one. One problem in testing a new method after the season is that if I compare it to predictions made during the season, the "after-the-fact" rankings are almost guaranteed to be better because they have all the information for every game, while the "day of game" predictions only have the data from games played up until that day. The old model correctly predicted 71.42% of games in real time, and improved to 74.06% when back-testing the final rankings across the whole season. On that same back-test, the new system is about 169 basis points (1.69 percentage points) more accurate, increasing the correct rate from 74.06% to 75.75%.
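As a quick check on that arithmetic, here is a small Python sketch of how a correct-pick rate and a basis-point gain can be computed. The helper names are mine, and the four-game `toy_rate` input is invented just to exercise the function; only the 74.06% and 75.75% figures come from the comparison above.

```python
def pick_accuracy(predicted_margins, actual_margins):
    """Share of games where the predicted winner matched the actual winner.

    Margins are from the home team's point of view, so a positive value
    means the home team is picked (or won).
    """
    hits = sum((p > 0) == (a > 0) for p, a in zip(predicted_margins, actual_margins))
    return hits / len(actual_margins)

def basis_point_gain(old_pct, new_pct):
    """Difference between two percentages, in basis points (1 bp = 0.01 point)."""
    return round((new_pct - old_pct) * 100)

# Figures quoted above: back-tested old model vs. back-tested new model.
gain = basis_point_gain(74.06, 75.75)
print(gain, "basis points")

# Invented four-game example: two of the four picks are correct.
toy_rate = pick_accuracy([5.0, -2.0, 8.0, 1.5], [3, 4, 10, -1])
print(f"{toy_rate:.0%} correct")
```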

In addition to implementing these new rankings, I also plan to use similar methods to update the historical win probability rates (Step 1 above) that I've used to calculate the win probability metrics for each game. The chart below shows a sample of what this new data will look like. This part is still a work in progress, and I'll have a separate post about it when it is complete.

Team	Rank	Record	Conf Record	Luck Rating (Rank)	Non Conf Luck Rating (Rank)	Conf Luck Rating (Rank)
IPFW	153	(14-5)	(5-1)	4.451 (1)	3.13 (1)	1.29 (17)
Michigan St	47	(16-4)	(3-4)	1.169 (70)	2.41 (4)	-1.24 (331)
Vanderbilt	35	(11-7)	(3-3)	-2.40 (345)	-1.01 (293)	-1.39 (337)
Washington	88	(13-5)	(5-1)	2.230 (20)	-0.10 (186)	2.33 (1)

Team One	Team Two	Team Three	Team Four
Ranking: 39	Ranking: 74	Ranking: 63	Ranking: 46
SOS: 61	SOS: 90	SOS: 95	SOS: 82
Record vs. Top 50: (1-6)	Record vs. Top 50: (1-4)	Record vs. Top 50: (2-5)	Record vs. Top 50: (3-5)
Temple (RPI)	Temple (BPI)	Temple (KenPom)	Temple (SevenOT)

NCAA Basketball Analytics - Updated - 10/26/2025

Updating Game Predictions for the 2017 Season - 11/6/2016

Which Players Have Had the Biggest Regression Offensively This Year? - 2/10/2016

Projecting the SEC - Big XII Challenge

Comparing Conf Luck vs. Non Conf Luck - 1/23/2016

Putting Computer Rankings in Context - 3/1/2015