Tuesday, April 26, 2011

Stewart Robson on Defense

During the latest "The Game" podcast, Stewart Robson, an ex-Arsenal player himself, commented on the current Arsenal squad's defensive failures:
"You don't deserve to win anything if you don't work hard enough at your set pieces and your defensive gameplan."

Cheers, Stewart!

Thursday, April 21, 2011

EPL Week 34 Fantasy Team

Now that I've moved to a fully automated system for picking a fantasy team, I need to rethink what information would prove most valuable to other people. I can no longer report my own reasoning behind choosing the team since I'm not doing any reasoning.

Essentially, choosing a team has two parts. First, you need to guess how many points each player will earn. Second, you need to figure out how to fit the most expected points into one team.

To help with the first part, here are my top 15 players by expected number of points:

  • 5.845 — J. Terry (CHE)
  • 5.624 — K. Richardson (SUN)
  • 5.525 — W. Rooney (MAU)
  • 5.456 — R. van Persie (ARS)
  • 5.226 — Nani (MAU)
  • 4.863 — R. Meireles (LIV)
  • 4.862 — B. Ivanovic (CHE)
  • 4.689 — P. Bardsley (SUN)
  • 4.668 — F. Lampard (CHE)
  • 4.452 — S. Kyrgiakos (LIV)
  • 4.412 — M. Skrtel (LIV)
  • 4.247 — F. Malouda (CHE)
  • 4.234 — R. van der Vaart (TOT)
  • 4.206 — R. Giggs (MAU)
  • 4.175 — D. Campbell (BPL)

This list is not perfect, of course, and you should apply your own judgement as well. As one example, Kyrgiakos is unlikely to play, even though he is neither injured nor suspended. Dalglish just hasn't been picking him recently. As another example, these estimates are based on results from weeks 20 through 32. If you think a player is doing much better now than they were on average over that period, you may want to rate them higher.

As for finding the best way to fit players into your team, I have no useful advice. I let my computer figure out the team with the maximum number of expected points. However, for your reference, here is the team it chose:

  • Forwards: W. Rooney (MAU), R. van Persie (ARS), D. Campbell (BPL)
  • Midfielders: Nani (MAU), R. Meireles (LIV), R. van der Vaart (TOT)
  • Defenders: J. Terry (CHE, captain), B. Ivanovic (CHE), K. Richardson (SUN), P. Bardsley (SUN)
  • Goalkeeper: P. Reina (LIV)

Friday, April 15, 2011

EPL Week 33 Fantasy Team

This week, I finally got around to automating the rest of my fantasy team selection process.

Previously, I had been using my goal predictions to determine which teams were most likely to score or have clean sheets. Then I would try to find the best way of packing the best players from those teams into a lineup.

This approach has one particular problem. When deciding whether to start Tevez or van Persie, for example, I would know only the expected number of goals for each team and the average number of points per game for each player. It's not obvious how to turn those two numbers into an expected number of points for each player, because I don't know whether this week's expected number of goals is more or less than usual for that player's team. And if it is different this week, I also don't know how many more (or fewer) points that should translate into.

However, I now have all the information necessary to do this the right way. What I really want to know is the expected number of points for a given player. This is just 2 plus [probability of a clean sheet] times [clean sheet points for that player] plus [expected number of goals] times [points per team goal for that player]. (All other sources of points amount to less than half a point per match, on average, so they are fairly safe to ignore.) I computed that last part, which tells me how many points the player should get for each goal scored by the team, from the official statistics of each player over the last 12 weeks.
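
As a minimal sketch of that formula in Python: the function name, parameter names, and the example numbers below are made up for illustration and are not taken from my actual model. The constant 2 is the one from the formula above.

```python
def expected_points(p_clean_sheet, clean_sheet_points,
                    expected_team_goals, points_per_team_goal,
                    base_points=2.0):
    """Expected fantasy points for one player in one match.

    base_points is the constant 2 from the formula in the post.
    All other arguments are estimates supplied by the caller.
    """
    return (base_points
            + p_clean_sheet * clean_sheet_points
            + expected_team_goals * points_per_team_goal)


# Hypothetical defender: 40% chance of a clean sheet worth 4 points,
# team expected to score 1.5 goals, 0.5 points per team goal.
example_points = expected_points(0.40, 4.0, 1.5, 0.5)
print(round(example_points, 2))
```

Note how a player with a large points-per-team-goal value can come out ahead even when the team's expected goals are modest.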

Putting that all together, I get a list of all the players and their expected number of fantasy points this week.

All that remains is to figure out the best way to put them into a lineup. This is a simple exercise in search. For those interested, I used a branch-and-bound approach where the upper bound comes from ignoring the team constraint (at most 2 players per team) and simply taking the best players at each position.

Surprisingly, the resulting algorithm takes very little time to run, just a few seconds. I computed the best lineups in each of the allowed formations.
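
For the curious, here is a rough Python sketch of a branch-and-bound search of this kind. It is not my actual code: the `Player` tuple, the `best_lineup` function, and the tiny data set are hypothetical, and the only constraints modelled are the formation and the per-team limit.

```python
from collections import Counter, namedtuple

Player = namedtuple("Player", "name team position points")


def best_lineup(players, formation, max_per_team=2):
    """Return (score, lineup) maximising total expected points.

    formation maps a position label to the number of players needed.
    """
    positions = list(formation)
    # Best players first, so a prefix of each pool is its optimistic pick.
    pools = {pos: sorted((p for p in players if p.position == pos),
                         key=lambda p: -p.points)
             for pos in positions}
    # tail_bound[i]: best score for positions i onward, ignoring the team limit.
    tail_bound = [0.0] * (len(positions) + 1)
    for i in range(len(positions) - 1, -1, -1):
        top = pools[positions[i]][:formation[positions[i]]]
        tail_bound[i] = tail_bound[i + 1] + sum(p.points for p in top)

    best = {"score": float("-inf"), "team": None}
    team_counts = Counter()
    chosen = []

    def search(i, start, need, score):
        if i == len(positions):          # every position filled
            if score > best["score"]:
                best["score"], best["team"] = score, list(chosen)
            return
        if need == 0:                    # move on to the next position
            nxt = i + 1
            nxt_need = formation[positions[nxt]] if nxt < len(positions) else 0
            search(nxt, 0, nxt_need, score)
            return
        pool = pools[positions[i]]
        if start + need > len(pool):     # not enough players left
            return
        # Upper bound: best remaining players, team constraint ignored.
        bound = (score + sum(p.points for p in pool[start:start + need])
                 + tail_bound[i + 1])
        if bound <= best["score"]:
            return                       # cannot beat the best lineup so far
        for j in range(start, len(pool) - need + 1):
            p = pool[j]
            if team_counts[p.team] >= max_per_team:
                continue
            team_counts[p.team] += 1
            chosen.append(p)
            search(i, j + 1, need - 1, score + p.points)
            chosen.pop()
            team_counts[p.team] -= 1

    first_need = formation[positions[0]] if positions else 0
    search(0, 0, first_need, 0.0)
    return best["score"], best["team"]


# Made-up data: three players from club X cannot all be picked.
demo_players = [
    Player("A", "X", "F", 6.0), Player("B", "X", "F", 5.5),
    Player("C", "Y", "F", 5.0), Player("D", "X", "M", 5.0),
    Player("E", "Z", "M", 3.0),
]
demo_score, demo_team = best_lineup(demo_players, {"F": 2, "M": 1})
print(demo_score, [p.name for p in demo_team])
```

Running the same function once per allowed formation and keeping the highest score gives the overall best lineup.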

This week, I get the following 4-3-3:

  • Forwards: R. van Persie (ARS), W. Rooney (MAU), N. Zigic (BIR)
  • Midfielders: F. Lampard (CHE), Nani (MAU), K. Nolan (NEW)
  • Defenders: J. Terry (CHE), L. Baines (EVE), P. Bardsley (SUN), K. Richardson (SUN)
  • Keeper: B. Foster (BIR)

Not too many surprises there, I suppose, except for the players from Birmingham. Still, I expect that this more analytical approach will pay off in the coming weeks. At least, it will save me some time.

Update: It turns out that Kevin Nolan is suspended, so I needed to do this again.

I added more data from some teams less likely to score goals and was surprised to see some were picked. In particular, Charlie Adam is a fantasy team gem. Even though Blackpool aren't as likely to score as some other teams, Adam's share of every goal is so high that it cancels that out.

The new team put Adam in for Nolan and DJ Campbell in for Zigic. It also picked Seamus Coleman. However, I remain worried that he will not be available for tomorrow's match. (He is currently recovering from injury.)

Once I listed him as still injured, a different formation gave the best results. In particular, I get the following 3-5-2:

  • Forwards: R. van Persie (ARS), W. Rooney (MAU)
  • Midfielders: F. Lampard (CHE), Nani (MAU), C. Adam (BPL), A. Young (AST), T. Cleverley (WIG)
  • Defenders: J. Terry (CHE), L. Baines (EVE), K. Richardson (SUN)
  • Keeper: S. Mignolet (SUN)

Tuesday, April 5, 2011

Hodgson, Dalglish, and Statistics

Of all the poor statistical arguments you hear people make, the one that bothers me the most (at the moment, anyway) is when they try to extrapolate recent performance back over the whole season.

If Arsenal beats Wigan 3-0, they say, "if Arsenal had played like this all season, they'd be top of the league," but perhaps they should really say, "if Arsenal had played against Wigan for every match this season, they'd be top of the league." It's not sensible to extrapolate recent performance over the whole season without taking into account who the opponents have been, at the very least.

This error was being made repeatedly a few weeks ago amongst Liverpool fans. They said, "if we'd had Dalglish all season, we'd be top of the league". They would point out that Dalglish's team had earned X points in Y matches, so they suggested that a whole season with Dalglish would have earned (X/Y)*31 points. Argh!

Despite these complaints, it is possible to do a proper analysis of this sort of situation. In fact, I had already reported on what such an analysis shows in this case: no statistical difference with Dalglish versus Hodgson.

Liverpool fans didn't want to hear it, but events since have borne out that analysis. Liverpool have had a series of poor results. Now fans are admitting that the team just isn't good enough and that they need 4 or 5 new players this summer. This is especially amusing given that Hodgson said exactly this a few months ago and the fans chastised him for it. A better coach, they said, could win the league with this team. Apparently, they've changed their minds about that.

This week, the same statistical error is being made again, but this time by West Brom fans. After West Brom's 2-1 win over Liverpool, pundits are saying that a West Brom with Hodgson would be in the top 5 of the league!

I felt duty bound to do the proper statistical analysis. Here are the results. Unlike with Dalglish, there is currently a nontrivial difference. West Brom are scoring about 0.18 more goals on average (at home only, not away) and conceding 0.26 more goals on average. Unfortunately, these results are more sensitive to how regularization is performed, but the qualitative results are not: Hodgson has made the team stronger in attack and weaker in defense.

Given that, it is hard to imagine that West Brom would be amongst the top 5 in the league. In fact, it's not hard to guess where West Brom would be because there is another team whose characteristics are almost identical to West Brom under Hodgson: Blackpool. The two teams have roughly identical attacking and defending scores as well as the same home versus away performance. So it seems safe to say that, if West Brom had hired Hodgson at the start of the season, they would still be embroiled in a relegation battle now.

Monday, April 4, 2011

Tactical Changes From Mancini

Mancini plays a 4-4-2 for the first time and his team wins 5-0. That was the headline, but what did it mean on the field?

In principle, the change was to move up one of the midfielders from the 4-2-3-1 (i.e., the "2-3" part) as an extra striker. In particular, instead of Tevez playing up top and Balotelli playing wide or deep (or vice versa), they started with both strikers up front.

However, it did not take long before Tevez dropped deep, as he likes to do, and the team returned to a familiar 4-2-3-1 shape. Indeed, it took no more than 6 minutes according to the clock:


They continued to use this shape for most of the match.




Even though they stayed broadly in this 4-2-3-1 shape, the movement of the attacking players was much better. Sometimes Balotelli was up top, sometimes Tevez was. Silva moved from side to side or played centrally. And Adam Johnson continued to show how good he is at making runs not picked up by defenders. He created the first goal almost singlehandedly, and continued to create good chances while he was on the field.

Perhaps more important than the positioning of the strikers was the movement of the two defensive midfielders. While Yaya Toure and Nigel de Jong (the "2" part of the formation) spent most of the match side by side, as can be seen in the photos, Yaya was frequently seen storming forward. He combined with Adam Johnson in a neat 1-2 for the initial goal, and was frequently involved in link-up play. (He also scored, but that was after he had moved to a forward role, replacing Johnson.)

This is more like the way that Arsenal play in their 4-2-3-1. Previously, the two defensive midfielders had been staying deep (perhaps against Mancini's wishes), which left it up to the fullbacks to push forward. Fortunately, with Micah Richards and Aleks Kolarov on the pitch, that is bound to happen. In this match, though, only one fullback moved forward at any time, but Yaya Toure was also moving forward to meet the usual total of 6 attacking players.

Later on, Johnson came off and was replaced by Patrick Vieira. Vieira moved into Yaya Toure's position and Yaya moved forward into an attacking role, which he has played in some recent matches. Fortunately, Vieira also managed to get forward, breaking away from de Jong's side, to provide the extra attacker that Toure had been providing.


As an aside, numerous pundits and fans, I think, are confused by the use of Yaya Toure in an attacking role. As we can see in the picture above, he is one of the attacking "3" rather than the defensive "2". Others assumed that, since three capable defensive midfielders were named in the side, they were all playing in defensive roles, but this was not the case.

The typical way to play three defensive midfielders is in a 4-5-1: the three DMs play in a line with two wide attacking midfielders just above. However, City have never played this formation as far as I have seen. We always play with two defensive midfielders, in either a 4-2-3-1 or a 4-4-2 / 4-2-2-2 shape — the latter used for the first time today, at the start of the match.


In summary, the two striker formation did not last long, and it was not clear that it had any effect on the match. City were back in their usual shape before the first goal, and that goal did not even involve either striker. What seemed more important, to me anyway, was the ability of one of the two defensive midfielders (Toure) to get forward more often, the improved movement of the attacking players, and the penetrating runs of Adam Johnson. Indeed, Johnson gets my vote for Man of the Match.

Friday, April 1, 2011

EPL Season Predictions

A couple of days ago, I entered a contest for predicting the final results of the EPL season. I used my models to predict each of the remaining matches, and entered the most likely outcome into this page at the BBC's web site. Here are the results:



Even though these results are the most likely, the odds that I predicted each match correctly are absurdly small, i.e., a "one out of the number of particles in the universe" sort of small.

What I would really like is to estimate the probability that the top 5 positions will be what I have predicted.

Unfortunately, we can't compute this probability exactly. The number of possible outcomes to consider is 3^82, which is another "number of particles in the universe" sort of number.

Fortunately, we can compute this probability approximately, via random sampling. And of course, I couldn't stop myself from wasting half an hour doing just that. So I will now describe the results.

The five most likely outcomes are the following:

  • 15% — Man United, Arsenal, Chelsea, Man City, Tottenham
  • 13% — Arsenal, Man United, Chelsea, Man City, Tottenham
  • 8% — Man United, Arsenal, Chelsea, Tottenham, Man City
  • 6% — Man United, Chelsea, Arsenal, Man City, Tottenham
  • 6% — Arsenal, Man United, Chelsea, Tottenham, Man City

After this, we get a slowly decaying tail with many other possible outcomes. In particular, the outcome I chose above — the one that includes the most likely outcome for every match — has a 3.8% chance of occurring. Hence, it may very well be the case that I'd have done better in the contest to choose one of the less likely outcomes that results in the most likely final ordering for the top 5 teams.

The likely winner of the title this season is:

  • 54% — Man United
  • 39% — Arsenal
  • 7% — Chelsea

Finally, the odds of taking a Champions League spot are:

  • 100% — Man United
  • 100% — Arsenal
  • 98% — Chelsea
  • 69% — Man City
  • 29% — Tottenham
  • 5% — Liverpool
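
For reference, the random sampling described above can be sketched in a few lines of Python. This is an illustration rather than my actual code: the function name, the fixture format, and the toy numbers are all made up, and ties on points are broken alphabetically purely as a simplifying assumption (the real table uses goal difference, which is not modelled here).

```python
import random
from collections import Counter


def simulate_top_k(points, fixtures, k=5, n_samples=10000, seed=0):
    """Estimate the probability of each possible top-k ordering.

    points   -- dict of current league points per team
    fixtures -- list of (home, away, p_home_win, p_draw, p_away_win)
    """
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(n_samples):
        table = dict(points)
        for home, away, p_home, p_draw, p_away in fixtures:
            outcome = rng.choices(("H", "D", "A"),
                                  weights=(p_home, p_draw, p_away))[0]
            if outcome == "H":
                table[home] += 3
            elif outcome == "A":
                table[away] += 3
            else:
                table[home] += 1
                table[away] += 1
        # Assumption: ties on points are broken alphabetically.
        ranking = sorted(table, key=lambda t: (-table[t], t))
        counts[tuple(ranking[:k])] += 1
    return {order: c / n_samples for order, c in counts.items()}


# Toy example with made-up points and probabilities.
toy_points = {"A": 10, "B": 9, "C": 3}
toy_fixtures = [("B", "A", 0.5, 0.3, 0.2), ("C", "A", 0.2, 0.3, 0.5)]
toy_result = simulate_top_k(toy_points, toy_fixtures, k=2, n_samples=2000)
for order, prob in sorted(toy_result.items(), key=lambda kv: -kv[1]):
    print(f"{prob:.0%}", ", ".join(order))
```

Title odds and Champions League odds come from the same samples: count how often a team finishes first, or in the top four, instead of counting whole orderings.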