One of the biggest problems when trying to use, track and analyse expected goals (xG) models is the lack of context, caused mainly by a lack of access to the data. Even the totals for individual matches can be tricky to use, because the baseline figures can be misleading: a team that creates a few high-quality chances is more likely to win a match than one that attempts far more shots of lower quality with the same total xG value. It can be a minefield at times.
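To put a rough number on that point about chance quality, here is a minimal sketch in Python. It assumes every shot is an independent event that is scored with probability equal to its xG value, which is a simplification rather than how any particular model works. The shot values (two chances at 0.5 xG against ten at 0.1 xG) and the function names are made up purely for illustration.

```python
def goal_distribution(shot_xgs):
    """Return a list where index k holds P(team scores exactly k goals).

    Assumes each shot is an independent yes/no event with probability
    equal to its xG value (an illustrative simplification).
    """
    dist = [1.0]  # before any shots: 0 goals with certainty
    for p in shot_xgs:
        new = [0.0] * (len(dist) + 1)
        for goals, prob in enumerate(dist):
            new[goals] += prob * (1 - p)   # shot missed
            new[goals + 1] += prob * p     # shot scored
        dist = new
    return dist


def match_outcomes(xgs_a, xgs_b):
    """Return (P(A wins), P(draw), P(B wins)) for two lists of shot xG values."""
    dist_a = goal_distribution(xgs_a)
    dist_b = goal_distribution(xgs_b)
    win_a = draw = win_b = 0.0
    for goals_a, prob_a in enumerate(dist_a):
        for goals_b, prob_b in enumerate(dist_b):
            joint = prob_a * prob_b
            if goals_a > goals_b:
                win_a += joint
            elif goals_a == goals_b:
                draw += joint
            else:
                win_b += joint
    return win_a, draw, win_b


# Hypothetical match: both teams total 1.0 xG.
few_good = [0.5, 0.5]      # two high-quality chances
many_poor = [0.1] * 10     # ten low-quality shots

win_few, draw, win_many = match_outcomes(few_good, many_poor)
print(f"Few good chances win: {win_few:.3f}")
print(f"Draw:                 {draw:.3f}")
print(f"Many poor shots win:  {win_many:.3f}")
```

Under these assumptions the side with the two big chances wins a little more often than the side with ten speculative efforts, even though the xG totals are identical. The gap depends entirely on the shot values chosen, so treat it as an illustration of the effect rather than a measurement of it.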
Fortunately, last year the website Understat popped up out of nowhere, and suddenly we had team and player data from six leagues across Europe. That was a real breakthrough, and we stats nerds will be eternally thankful for it. (While it’s still around, that is, because it’s definitely a Russian site, and I personally thought it might close after the World Cup, but as it stands it’s still with us!)
Another site to raise its stats game recently is Wyscout, which has introduced all sorts of filters, and also an xG model. Crucially, this model is applied to all the games they have video footage for, which as it stands is 211,738 full matches across 599 different competitions, with 43,000 teams involved. That has to be the biggest model in the world, surely?
However, before you get too excited, I’ve only managed to collect the relevant data for 21 different leagues and around 350 different teams, split into home and away performances, across all competitions (domestic and continental) in the last 12 months. One problem this creates is that the figures won’t be directly comparable across domestic leagues. Although I’ve split the teams by league, there is some muddying of the waters, because obviously some teams went all the way to the European Cup Final, some got knocked out by Sevilla and some are called Everton. The volume of games is slightly different, as is the quality of opponent. Plus, comparing the National League in England to La Liga is like comparing Liverpool with, say, Everton. Apples and oranges.
That is why this is not an in-depth statistical analysis, but more of an overview of the world of xG, and of how Liverpool compare on the surface. To draw any concrete conclusions (which I will be doing once I sort and collect the data with appropriate filters), the data needs splitting up by competition at least. But in this light-hearted review of the data we have, there are still some interesting findings … such as Man Utd being on a par with Kilmarnock at home, and Accrington Stanley away. Like I said: light-hearted, honestly!