2008 Major League Baseball

April 3rd, 2008

MLB is running again. The new season snuck up on me. It’s interesting to see how the number of teams in your division and league affects your starting odds for making the postseason.
 
Formula 1 is also up.
 
I’m working on adding American soccer (MLS and USL).

Games above .500? In Hockey?

March 22nd, 2008

Thanks go to Lexicon Devil for pointing out that the games above .500 graph makes sense in baseball but not in the NHL because of the point you get for a loss in overtime.
I changed the graph so that each day:
Win and the line goes up 2, lose in overtime it goes up 1, don’t play and the line stays flat, lose in regulation and the line goes DOWN 2.
The graph is now, still confusingly, called “Points above .500”.
[Edit 3/23: and it still has other problems. A team’s line is good, you can see their ups and downs. But when you compare lines it can falsely imply that one team has more points than another. I feel like a cartographer. :)]
[Edit 1/30/09: changed again. Now you get 1 point for a win, -1 for a loss, and 0 for an overtime loss. Thanks James. It still has the “relative position of teams” issue.]
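Here is a minimal Python sketch of the two versions of the line described above. It is only an illustration, not the code the site runs; the result codes ("W", "OTL", "L", "-") and the function name are made up for the example.

```python
from itertools import accumulate

# Per-day change in the line under each rule described above.
# Result codes are hypothetical labels: "W" win, "OTL" overtime loss,
# "L" regulation loss, "-" no game that day.
RULE_2008 = {"W": 2, "OTL": 1, "L": -2, "-": 0}
RULE_2009 = {"W": 1, "OTL": 0, "L": -1, "-": 0}


def points_line(results, rule):
    """Return the running height of the graph line after each day."""
    return list(accumulate(rule[r] for r in results))


week = ["W", "OTL", "-", "L", "W"]
print(points_line(week, RULE_2008))  # [2, 3, 3, 1, 3]
print(points_line(week, RULE_2009))  # [1, 1, 1, 0, 1]
```

Either way the line only describes one team's ups and downs, so two teams' lines still can't be read as a comparison of their point totals.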
 
And thanks go to Squealy for suggesting the alternating row colors to make it easier for your eyes to follow across the tables.
 
I saw a good question asked by Replacement (I’m paraphrasing): The Predators are a point ahead of Edmonton. How come if both teams win out the Oilers have a higher chance of making the playoffs?
Here is my thought process for pondering an answer:
Do the Oilers have a game in hand? No, they both have 7 games left.
Well, clearly if they both win out the Predators will have more points. Do they play each other? No, they don’t.
Who do they play? The Oilers play many more of the teams just ahead of them in the top 8, one of which they need to supplant to make the playoffs. The Predators play more teams already locked in or locked out. So in a universe where the Oilers win out, their direct competition “by definition” takes at least some losses.
Ahh, that makes sense, the Oilers have “more control” of their destiny. But still, I’m confused. If they both win out then the Predators should be ahead? True, but both of them winning out at the same time happens so rarely that it has little, if any, effect on the numbers. It’s far outweighed by who the opponents are.
 
Replacement also makes the good point that I’m not enumerating all possible scenarios. I take a big 100 million season sample, but even this late in the season that is dwarfed by the number of win/loss/ot loss (and by how many goals, because one of the tiebreakers is best goals delta) combinations there are in the games remaining.
And, I’ll add, the numbers might look “official”, all clean and crisp on your screen. But the program that makes them could have errors. :)
It is humbling to see people use this site as a tool for making interesting observations. Thanks.
 
Finally, Michael has a page that does pretty much what Sports Club Stats does. Check it out if you want to see numbers behind who your likely first round opponent will be.

Time to show all the “What If” numbers

March 20th, 2008

I changed the NHL “what if” sections to show a row for every record the simulation encounters. Before I grouped records together that had the same number of points (and before that I grouped by 2 points), to keep the list from being too long.
 
Here, for example, are all the times (out of 100 million) the Canucks finished with 93 points:
 
In doesn’t always mean in
 
Notice how their chances decrease each time they trade a win and a loss for a pair of OT losses, until that last row, where suddenly 9 straight overtime losses puts them in the playoffs. Don’t believe it. The sample size for that record is too small. The program only saw it 1 time, when they happened to come in 8th. The chart is saying “100% of the times I saw them finish out 0-0-9 they made the playoffs, so I’m going to write ‘In’ next to that record.” In does not necessarily mean mathematically in, and likewise for Out, especially when the number in the count column is small.
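To make the sample size point concrete, here is a rough Python sketch of how rows like these could be tallied. It is a guess at the general shape, not the site's program; the `min_count` cutoff and the "small sample" flag are assumptions added for the example.

```python
from collections import defaultdict


def what_if_rows(simulated_seasons, min_count=1000):
    """Summarize (record, made_playoffs) pairs into "what if" rows.

    min_count is an arbitrary, hypothetical cutoff for flagging rows
    whose sample is too small to trust.
    """
    seen = defaultdict(int)
    made = defaultdict(int)
    for record, made_playoffs in simulated_seasons:
        seen[record] += 1
        made[record] += int(made_playoffs)

    rows = []
    for record, count in seen.items():
        if made[record] == count:
            label = "In"       # in every season seen, not mathematically in
        elif made[record] == 0:
            label = "Out"      # out in every season seen
        else:
            label = f"{100.0 * made[record] / count:.1f}%"
        rows.append((record, count, label, count < min_count))
    return rows


# Made-up sample: one lone 0-0-9 season, plus 2000 seasons at 4-4-1.
seasons = [("0-0-9", True)] + [("4-4-1", True)] * 600 + [("4-4-1", False)] * 1400
rows = what_if_rows(seasons)
for row in rows:
    print(row)
```

The lone 0-0-9 season comes out labeled "In" with a count of 1, which is exactly the kind of row not to believe.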

Behind the NHL scenes

March 2nd, 2008

Here are some questions and answers from feedback you have sent:
 
GV asks about “What If” outcome skew:
I really appreciate this website. I was wondering why in the NHL section, every team I looked at from the top of the division to the bottom was much more likely to go 16-1 than 1-16. I would think that they would be nearly identical. Any explanation for this?
 
My reply (over 3 days):
Great question. I have to look into it to be sure, but here is some background info that might give you an idea of what’s going on:
 
I don’t show a line for each outcome I see because there are so many that it becomes harder to read, so I group the records into “buckets” that are 2 points apart. And for the name of the “bucket” (what I show on the line) I use the record inside it with the most wins. That record typically has just 0 or 1 overtime loss. Now, I also rig it that 0-17 and 17-0 are not buckets, but show only those exact records, because they are kind of interesting special cases.
So, for example, a 10-7-0 “bucket” shows their odds when they finish
out 10-7-0 or 9-6-2 or 8-5-4, etc… (20 points)
and 9-7-1 or 8-6-3 or 7-5-5, etc… (19 points)
It’s always the results that have the number of points in the bucket “label” (what I show) and the results with 1 less point than the bucket label. I go one less instead of one more because it helps counteract the rosy skew I think the chart has because of the 50/50 way I play future games.
So, I show a row for exactly 34 points and a row for exactly 0 points, and in between I start from the top and work down: 33 or 32, 31 or 30, …, 5 or 4, 3 or 2, 1 (not 1 or 0, because 0 gets its own row). The 1 has no point buddy to add to its total. That is why they are not nearly identical. I think.
 
I’ll change it for tonight’s run to make the buckets 1 point instead of 2. It might be close enough to the end to make the extra lines useful to see.
 
On a separate note, this is how I “play” a future NHL game:
 
HomeScore = pick random number 0-3
AwayScore = pick random number 0-3
If HomeScore = AwayScore then they went to overtime, so I flip a coin to see who wins and add 1 or 2 goals to the winner’s score.
That means in my simulation 25% of the games go to overtime (which I hope is close to the real world).
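The steps above can be sketched in Python roughly like this. It assumes each score from 0 to 3 is equally likely and that the 1 or 2 overtime goals are also picked evenly; the original steps don't spell either of those out, and the function name is made up.

```python
import random
from itertools import product


def play_game(rng=random):
    """Play one future NHL game the way the steps above describe.

    Returns (home_score, away_score, went_to_overtime).
    """
    home = rng.randint(0, 3)  # randint includes both endpoints: 0, 1, 2 or 3
    away = rng.randint(0, 3)
    overtime = home == away
    if overtime:
        bonus = rng.randint(1, 2)   # assumption: 1 or 2 goals, equally likely
        if rng.random() < 0.5:      # coin flip for the overtime winner
            home += bonus
        else:
            away += bonus
    return home, away, overtime


# Exact share of games that reach overtime: 4 tied pairs out of 16.
pairs = list(product(range(4), repeat=2))
overtime_share = sum(h == a for h, a in pairs) / len(pairs)
print(overtime_share)  # 0.25
```

Counting the 16 equally likely score pairs is where the 25% comes from: 0-0, 1-1, 2-2 and 3-3 are the 4 ties.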
 
2 days later I realized another cause of skew that is specific to the NHL:
The NHL “what if” section is now broken down into 1 point buckets instead of 2. The remaining positive skew is due to that point you get for an overtime loss.
For example, to earn 0 points total in all remaining games you have to lose every game in regulation. But to earn the max points you can win all remaining games in regulation OR overtime. So the simulation sees the max points outcome happen more often.
 
Because 25 percent of my simulation’s games go to overtime, the expected value for points scored in a game is 1.125. Half the time you get 2 points, 3 out of 8 times you lose in regulation and get 0 points, and 1 out of 8 times you lose in OT and get 1 point. So the “normal curve” you see in the count column should be centered around “games left” * 1.125 points, not around .500 percent like every other sport.
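The 1.125 figure can be double-checked with exact fractions. This is just the arithmetic from the paragraph above written out in Python; the 17 games left is an example value taken from GV's question.

```python
from fractions import Fraction

p_overtime = Fraction(1, 4)         # 25% of simulated games reach overtime
p_win = Fraction(1, 2)              # 50/50 games: a win in regulation or OT
p_ot_loss = p_overtime / 2          # lose the overtime coin flip: 1/8
p_reg_loss = 1 - p_win - p_ot_loss  # what is left: 3/8

expected_points = 2 * p_win + 1 * p_ot_loss + 0 * p_reg_loss
print(expected_points, float(expected_points))  # 9/8 1.125

games_left = 17                     # example value only
print(float(games_left * expected_points))      # 19.125
```

So with 17 games left the count column should peak around 19 points, a bit above the 17 points that a plain .500 finish would give.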
 
 
Russ asked about a more accurate algorithm:
I don’t know if this is feasible, but is there any way (or are you planning) to include external factors to the simulations? For example, if #1 seed played the #30 seed, I would imagine the probabilities of each team winning should not be identical.
 
My reply:
That is the next big thing I want to add. People have done the hard part coming up with formulas that better match historic results, at least as far as using the records of the 2 opponents to predict a game’s outcome. I just need to hook them up and make it easy to toggle between the “better” formula and the 50/50 formula. I want to have both because I’m still fond of the 50/50’s “concreteness” (it has a huge flaw but at least I understand the flaw). I don’t plan on getting it in this season, although since hockey is bringing the most visitors now maybe I should hop to it.
 
 
Russ asked about the “Big Game” section:
Notice under today’s games. Why is it the teams impacted by a Penguins or Senators Win/Loss does not include the Canadiens? I can’t see how they would not be impacted +/-.
 
My reply:
Currently if the range between the best case outcome and the worst case outcome is < .3 percentage points I don’t show the team on that page, because I don’t want to show “false positives”. I don’t have the stats chops to compute what the margin of error is in my simulation. The .3 is a bit arbitrary and may be too restrictive. So if the team is not listed it does not necessarily mean there is mathematically no impact.
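As a sketch, the filter described above amounts to something like the following Python. The team names and odds are invented for the example, and the function is a hypothetical stand-in rather than the site's code.

```python
MIN_RANGE = 0.3  # percentage points; the cutoff named above


def teams_to_show(outcomes, min_range=MIN_RANGE):
    """Keep teams whose best-to-worst spread reaches the cutoff.

    `outcomes` maps team -> list of playoff odds (in percent), one entry
    per possible result of the game.
    """
    return {
        team: odds
        for team, odds in outcomes.items()
        if max(odds) - min(odds) >= min_range
    }


# Made-up odds for two hypothetical teams under each result of one game.
sample = {
    "Team A": [52.0, 48.5],   # spread of 3.5 points: shown
    "Team B": [94.8, 94.7],   # spread of about 0.1 points: filtered out
}
print(teams_to_show(sample))  # {'Team A': [52.0, 48.5]}
```

A team that is filtered out may still be affected by the game; the spread was just too small to separate from simulation noise.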
That said, I’m not sure who the Canadiens would want to win from a “just get into playoffs” perspective. I have noticed in the past that sometimes when it seems obvious that a team would want a certain outcome it turns out that it doesn’t really matter in the big scheme of things because that particular outcome never (or almost never) turns out to be a deciding factor in whether the team makes it or not. This kind of unexpected stuff fascinates me.
 
 
Many folks asked about the day’s total effect not matching the sum of the individual game effects on a team’s page:
The total is the correct value for the day’s change, although it is often different from the sum of the individual game changes above it. When a team is on the bubble lots of games have big impacts, but you can’t just add them up because the first few might move the team off the bubble. For example, imagine the last day of the season, and 2 game results can each knock you out of the playoffs, say take you from a 25% chance to a 0% chance. But you can’t be knocked out twice; if both games go bad you don’t end up with a -25% chance of making the playoffs.
This also affects the best and worst case totals because they are just the sum of the best or worst case for each game. They also can be too big, especially when you are on the bubble.
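The last-day example can be worked through in a few lines of Python. The setup is a made-up toy: two independent 50/50 games, and the team gets in only if both go its way, which gives the 25% starting chance.

```python
from itertools import product


# Hypothetical last-day rule: the team is in only if BOTH games go its way.
def in_playoffs(game_a, game_b):
    return game_a == "good" and game_b == "good"


def odds(fixed=None):
    """Percent chance of making the playoffs, optionally with results fixed.

    `fixed` maps a game index (0 or 1) to a forced result.
    """
    fixed = fixed or {}
    worlds = [
        w for w in product(["good", "bad"], repeat=2)
        if all(w[i] == r for i, r in fixed.items())
    ]
    return 100 * sum(in_playoffs(*w) for w in worlds) / len(worlds)


base = odds()                                 # 25.0
effect_a = odds({0: "bad"}) - base            # -25.0
effect_b = odds({1: "bad"}) - base            # -25.0
both_bad = odds({0: "bad", 1: "bad"}) - base  # -25.0, not -50.0
print(effect_a + effect_b, both_bad)          # -50.0 -25.0
```

Adding the two individual effects gives -50, but the worst the day can really do is -25, because 0% is the floor.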
 
 
On a personal note, I started this site back in April of 2006. On most days one person visited: me. In May of 2007 I started using Google Analytics to count how many people have visited. It has a map that shows what countries people come from. Sometimes I check the map and check the grand total of people that have come since May. This morning that total crossed into 6 digits. Thank you.
 
A small world:
Small World
 
Come on Nuuk, represent.

NASCAR is back

February 17th, 2008

Sprint (was Nextel), Nationwide (was Busch), and Craftsman Truck started up this weekend at Daytona:
 
NASCAR Week 1
 
Left: odds of making the Sprint chase
Middle: odds of winning Nationwide
Right: odds of winning truck

Death to erroneous NHL big games

February 14th, 2008

So the other day Mogen_david writes me and explains that something is rotten in the state of big games:
 
This is wrong

A game between 2 Western Conference teams should have 0 impact on all those Eastern Conference teams.
 
Right he is. The bug is fixed. It has affected the hockey big games ever since I added the overtime wins columns.
 
What happened?
 
I run the simulation 10 million times to get the current odds. Worked.
I take each of tomorrow’s games, one at a time (say the first game is Capitals vs Thrashers), and:
Pretend the Capitals win 1-0 and run the simulation 4 million times to find the effect that win would have on every other team. Worked.
Pretend the Thrashers win 1-0 and do the same. Worked.
Pretend the Capitals win in OT 1-0 and do the same. Worked.
Pretend the Thrashers win in OT 1-0 and do the same. Worked, but I neglected to clear the overtime flag on the game. Crap.

So the next game is Blue Jackets vs Blackhawks. I go through the 4 outcomes again, and lo and behold, the 2 teams in the game I messed up always come out ahead. In those simulations they never lose that first game outright; they always get at least 1 point, because I left the overtime flag set on the game and it never gets cleared until I run the program again.
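In code terms the bug looks something like the sketch below. This is a reconstruction for illustration, not the actual program: the `Game` class, its fields, and which team is listed as home are all made up. The point is that whatever forces a pretend result has to undo all of it, the overtime flag included, before moving to the next game.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Game:
    home: str
    away: str
    winner: Optional[str] = None   # None means "not played yet"
    overtime: bool = False


def pretend(game, winner, overtime):
    """Force a result on a game before running the simulation."""
    game.winner = winner
    game.overtime = overtime


def reset(game):
    """Put a game back to unplayed. Clearing BOTH fields is the fix."""
    game.winner = None
    game.overtime = False


game = Game("Capitals", "Thrashers")
for winner, overtime in [("Capitals", False), ("Thrashers", False),
                         ("Capitals", True), ("Thrashers", True)]:
    pretend(game, winner, overtime)
    # ... run the simulation here with this result forced ...
    reset(game)

# After the loop the game is unplayed again and the overtime flag is off,
# so nothing leaks into the runs for the next game on the schedule.
print(game)
```

With a reset that only cleared the winner, the flag from the last OT scenario would stay set, which matches the always-at-least-1-point symptom described above.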
 
The error did not show up on, say, the Capitals page because on the team pages I filter the big games a different way: if the difference between the best case outcome of a game and the worst case outcome is small I don’t show it. No matter who wins the Blue Jackets vs Blackhawks game I was showing the Capitals getting that same 1.9 percentage point bump, so the game did not show.
 
Thanks Dave for helping me fix this craziness.

NCAA Basketball, and what else is new

February 13th, 2008

Men’s and women’s college basketball should be working. They show what conference tournament seed a team is headed for. I’m working on adding RPI.
2 things still need fixing:
1. The “Big Games” sections are blank because there are too many games most days for me to calculate in 24 hours. Time to get a faster PC. :)
2. All of the conferences use the ACC’s tiebreaking rules.
 
The soccer match results should be more accurate, thanks to much help from Grimsby UK native Colin Pollard.
 
Thanks for all the hockey message board postings.
The NHL update is running late today, it’s about 2 hours out.
 
The new NASCAR season will be up sometime before the Daytona 500.

Titans and Redskins beat the odds and make the playoffs

December 31st, 2007

The Tennessee Titans and the Washington Redskins are going to the NFL playoffs.
Thanks everyone for following along with Sports Club Stats.
 

Games Above .500

Chance Will Make Playoffs

 

The Panthers are in the playoffs if…

December 19th, 2007

All of the following happens:

1. Panthers beat Cowboys and Bucs.
2. Vikings lose to Redskins and Broncos.
3. Redskins lose to Cowboys
4. Saints don’t beat both the Eagles and the Bears, or don’t beat one and tie the other. (It’s OK if they tie both.)
 
Nothing to it. Pack your bags.
 
The first 3 you can gather [Ed. not now, but if you looked at the time] from the Panthers’ page by looking at the “Outs” in the big game section. I had to run a little side simulation to figure out number 4.
 
And, as explained by Woz:

RESULT: All four teams are 8-8

Okay, first things first, we need to break the tie between New Orleans and Carolina since they are in the same division. Head-to-head was 1-1, division record is 3-3 and 3-3, common games is 4-4 and 4-4, but conference record would be 8-4 Carolina to New Orleans’ 7-5.

At that point, we have three teams from three divisions, so we would go to the 3 team or more wild card tiebreaker. Head-to-head doesn’t apply since no team beat the other two. So, we then look at conference records: Carolina 8-4, Washington 6-6, Minnesota 6-6. Carolina advances.

Why do the Redskins want the Bears to beat the Packers next week?

December 17th, 2007

The teams are all fighting for 2 wildcard spots. You would think they all want the Packers to knock down the Bears. What tiebreaking scenario is behind this?
Why do the Redskins want the Bears to win?
 
Here are more results that the site is filtering out because I’m not sure they are greater than the “margin for error”:
Cardinals   0.1   0.0
Panthers   5.6   5.5
Giants     94.8   94.8
Lions       0.1   0.1
 
Including 2 teams from the AFC:
Browns   89.2   89.1
Titans     19.1   19.2