The Data Debate on Transfer Targets

May 16, 2013 Chris Rowland

By Lee Mooney.

I was recently involved in an interesting debate on Twitter about the usefulness of data to clubs looking to buy and sell players.

I thought I’d produce a response that outlines what I see as the strengths, weaknesses and opportunities inherent in using data to assist player recruitment.

Criticisms that arose included:

  1. Using data to identify potential transfer targets is a waste of time
  2. Exceeding the capabilities of the Football Manager database is hard
  3. Traditionally scouted ‘qualitative’ data is the key enabler/differentiator
  4. Insight based on ‘career biography’ data will only ever highlight ‘known’ players
  5. The aim should be to identify talented players who haven’t yet made a biographic impact
  6. Data can’t easily help you identify players that fit a certain style or system
  7. Much of what counts in football is difficult to count effectively in practice

I’d like to take each of these points one at a time and provide my perspective. In many ways, I agree with the points being raised at the theoretical/conceptual/logical level. However, I think it’s important to empathise with the practical realities of player recruitment within football clubs and accept that there are limitations and market imperfections that can be exploited (at least in the short-term).

1 – Using data to identify potential transfer targets is a waste of time

This statement really makes no sense to me. Practically, how can a club possibly calibrate the ‘ability’ and ‘value’ of a player if it only ever considers the data that it can collect manually via its scouts? How many players can a club actively track via scouts with any kind of quality? Perhaps as many as 100? The potential talent pool is massive – with my estimates putting it at over 250,000 professional players (not including young prospects playing in academies and other youth levels). Data offers a practical and affordable way to scan and monitor a huge talent pool. Even if data is limited on the finer points of player selection, which it may well be, discounting this potential benefit would be folly in my opinion.

If we take this out of the ‘football’ domain, and think about something ‘normal’ like buying a car, hopefully my logic will make more sense [having just had a quick debate in the office about this…perhaps this won’t be as easy as I thought]. Let’s assume you buy your car in cash and your budget is 2 months of your net salary which you’ve managed to save up. Estimating very roughly, based on a salary of £25,000 per year, your budget would be £3,500. Now, you need this car in a hurry because you’re about to start a new job – so you need to spend this money quickly (but also carefully and well because it’s a big chunk of income). You also have a full-time job so you can only really go and visit showrooms on the weekend. How do you approach getting the best overall deal given your circumstances? I’m willing to bet you wouldn’t try and visit every showroom in the country or even in the region for that matter. Even if you know exactly what type of car you want – who’s to say that the showrooms close enough to where you are actually have one available at a price you can afford?

In reality you’d jump on Auto Trader, or something similar, plug in some basic data and get a high-level ‘feel’ for the pool of cars available nationally and where they are in relation to you. You might even fine-tune your criteria to create a short-list of suitable candidates and then invest the precious time you have scrutinising these in detail to decide which one you’ll ultimately buy. That might involve taking a ‘mechanic’ with you to check the things you don’t really understand etc.

This is a simplistic example but hopefully it makes my point. The role of data in player recruitment, in my mind, is to enable this ‘Auto Trader’ facility. It doesn’t automatically pick the player – it simply helps you get a quick market context and practically navigate the ‘sea’ of potential options given the time and resources you have available.
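To make the ‘Auto Trader’ idea concrete, here is a minimal sketch in Python of what such a facility might do: filter a pool on a few basic criteria, then rank what’s left so that scouting time goes on a short-list rather than the whole market. Everything in it is an assumption made for illustration: the `Listing` fields, the `shortlist` function, the `rating` score and the sample players and prices are all invented, not taken from any real database or provider.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Listing:
    """One entry in a hypothetical talent pool (all fields are illustrative)."""
    name: str
    position: str
    age: int
    asking_price: float  # hypothetical asking price, in pounds
    rating: float        # hypothetical 0-100 score from whatever data is available


def shortlist(pool, position, max_age, budget, size=3):
    """Filter the pool on basic criteria, then keep the top `size` by rating."""
    candidates = [
        p for p in pool
        if p.position == position
        and p.age <= max_age
        and p.asking_price <= budget
    ]
    return sorted(candidates, key=lambda p: p.rating, reverse=True)[:size]


# Invented sample data, purely to exercise the filter.
pool = [
    Listing("Player A", "DM", 19, 900_000, 71.0),
    Listing("Player B", "DM", 24, 4_500_000, 78.0),  # over budget
    Listing("Player C", "ST", 21, 1_200_000, 74.0),  # wrong position
    Listing("Player D", "DM", 22, 1_500_000, 69.0),
    Listing("Player E", "DM", 31, 600_000, 73.0),    # too old
]

picks = shortlist(pool, position="DM", max_age=25, budget=2_000_000)
for p in picks:
    print(p.name, p.asking_price, p.rating)
```

The point of the sketch is the division of labour: the filter only narrows the field, and the final decision on anyone in `picks` still rests with the people who go and watch them (the ‘mechanic’ in the car example).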

A logical extension here might be to think of football recruitment like Auto Trader, but where the prices of the cars are entirely set at random. In that scenario, how would you then go about assessing how much you would/wouldn’t be willing to pay for a car – even though the ‘pool’ of vehicles is effectively ‘known’ to you?

You need to keep in mind though, and this is very important, fans and bloggers think clubs are far better equipped than they actually are. Once all the post-match and pre-match analysis is done, spot requests from the manager are dealt with and scouting reports are collated – there really isn’t much time left to explore new possibilities. Headcounts and salaries, outside of the playing and management staff, are actually very strictly controlled (and low to non-existent in many cases).

2 – Exceeding the capabilities of the Football Manager database is hard

I agree with this statement entirely. The Football Manager database, especially in the hands of someone who knows how to manipulate and integrate data, is a hugely powerful resource. I’m willing to bet there are more people involved in the development and testing of this game data than there are people working in ‘first team’ player recruitment in every club across Europe…combined!

If clubs aren’t using this data, they really should be. The full game database is massive and can be bought for £15. It’s a ‘no-brainer’, as they say, and should absolutely be part of the overall ‘problem armoury’. If I was hired by a club tomorrow, despite all the work I’m doing, buying every available publication of this game – along with a powerful laptop – would be the first thing I’d do. What I’d do with it is another topic entirely. Would it be the sole basis for a decision? Not at all, but it would be a useful component.

Despite popular belief, I’m not sure clubs are actually using this data. If they are, it’ll be through the game interface (which is very limiting from an analysis perspective). The only exclusive agreement I’m aware of is between the makers of the game, ‘Sports Interactive’, and one of the main UK football data providers. So, in owning copies of the game you effectively have the same resource that all the clubs have but aren’t using (other than those buying the data via a license agreement with the data provider in question). Sadly, any insights generated from this source are generally discarded by the ‘industry’ (along with many more ‘intuitively credible’ sources of insight as it happens – bias extends to data as well as players it seems).

3 – Traditionally scouted ‘qualitative’ data is the key enabler/differentiator

I guess I agree with this too. You can identify short-lists of players via a range of different quantitative mechanisms. However, when it comes down to it, the personality of the individual is very important too. You’re ultimately recruiting a ‘human’ with a set of skills. A friend once said to me that when you recruit anybody you’re effectively looking to answer three questions:

Can they do the job?

Will they fit in to, and add to, the dynamic of the team?

Will they love/enjoy what they’re doing in this environment?

I guess football is no different really. Now, I’m prepared to accept that ‘ability’ is difficult to judge. Just because a player is a 21-year-old defender, has played 100 times for Real Madrid and scored 10 goals along with making 30 full international appearances for Spain (the World Champions) with 5 goals – doesn’t mean he’s any good. He could be useless and lucky. Similarly, a player that has never made a single senior appearance anywhere by the age of 21 could be the next Lionel Messi – his talent has just not been recognised through bad luck.

But, when you’re faced with the practical reality of a time-boxed decision, where would you put your own money? Surely talent is self-fulfilling? If you’re good enough you’ll get games at your level (creating a biographic signature that can then be valued objectively). Yes, the very top players playing regularly for top clubs and international sides will stand out. But when you filter them out and focus on your club’s addressable market – like unchecking ‘Bugatti’ and ‘Veyron’ on Auto Trader – there are still so many players out there that the majority of clubs just won’t ever have heard of.

Ability aside, judging the ‘human’ elements requires a ‘human’ interaction. Is the player a ‘good person’ that will fit into the team? Is he a skilled manipulator of managers and national football associations – gaining recognition simply by virtue of his influencing skills and good looks? Possibly. Equally, is he a psychotic individual who might just violently assault the wives and girlfriends of fellow players given half a chance…after biting each and every one of their pets? Again, there’s a chance. Ruling out these extreme possibilities is where human judgement really plays its part in my view (weighing the potential costs of a person’s ‘negatives’ against their ‘positives’ and the value attached to them).

4 – Insight based on ‘career biography’ data will only ever highlight ‘known’ players

I guess this depends on the definition of ‘known’. Few players escape the coverage of the Football Manager database for example, so I guess on that basis this is right. That said, that database does not have anything like an accurate record of a player’s achievements in the game to date. It is simply the opinions of enthusiasts synthesised to create a realistic and immersive game experience by encoding certain attributes using an arbitrary scale (which could be perfect for answering certain types of questions).

For the kind of work I’m focussing on – even with total unrestricted access to all the data – you couldn’t create what I’m successfully working towards. A minor differentiator then would be coverage, which I think I’ll be able to exceed (but only slightly as the game is undoubtedly strong in this regard).

Now, pairing the two datasets together – the data I’m building along with the encoded qualitative data in the Football Manager database – that would be powerful (and might actually become an option through partnerships and other arrangements in the future as things develop – so watch this space).

5 – The aim should be to identify talented players who haven’t yet made a biographic impact

I’m not sure about this. No biographic impact at all? I guess theoretically, a player that has made no biographic impact – so someone like me – could be good enough to play for a Premier League club (if Brendan Rodgers is reading this, I’m not picky about squad numbers – any will do). In reality, even if that impact is small – like a youth academy or international appearance for example – that’s still an impact that can be objectively recorded and assessed. Now aged 30, my career biography doesn’t say “Lee Mooney can never be good enough to be a first-team regular for Liverpool FC” but it does say that “the probability that Lee Mooney will ever be good enough to be a first-team regular for Liverpool FC is significantly lower than his probability of visiting outer space”.

More realistically, I guess the aim here is to find players that are capable of making a bigger impact than they have because of a lack of suitable opportunities (cue the club willing to give such a player such an opportunity). That’s different, and an objective that I would entirely support. A good ‘current’ example might be Victor Wanyama. Before joining Celtic for £900,000 from ‘Beerschot AC’ in Belgium he’d achieved 50 full senior appearances and a number of international appearances for Kenya – before his 20th birthday. Would I have had him on my ‘data radar’ back then? Absolutely! I guess you’d need to ask the clubs to see how many of them could honestly say the same.
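As a rough illustration of what a ‘data radar’ could look like, here is a minimal Python sketch that flags players whose career biography already shows an unusual amount of senior football for their age. The thresholds loosely mirror the profile described above (50 senior appearances and at least one international cap before a 20th birthday), but they are my assumptions for the example, as are the `CareerRecord` fields, the `on_radar` function and the invented ‘Prospect’ records. None of it reflects a real dataset.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CareerRecord:
    """A hypothetical, heavily simplified career biography."""
    name: str
    age: int
    senior_appearances: int
    international_caps: int


def on_radar(rec, max_age=20, min_apps=50, min_caps=1):
    """True if the player is under `max_age` and meets both thresholds.

    The default thresholds are illustrative assumptions, not a tested model.
    """
    return (
        rec.age < max_age
        and rec.senior_appearances >= min_apps
        and rec.international_caps >= min_caps
    )


# Invented records, purely to exercise the rule.
records = [
    CareerRecord("Prospect W", 18, 64, 0),    # plenty of games, no caps yet
    CareerRecord("Prospect X", 19, 50, 5),    # meets every threshold
    CareerRecord("Prospect Y", 19, 12, 0),    # too few senior appearances
    CareerRecord("Prospect Z", 23, 120, 10),  # established, but over the age limit
]

radar = [r.name for r in records if on_radar(r)]
print(radar)
```

A rule this blunt will obviously miss people and flag a few it shouldn’t. Its only job is to make sure a player with that kind of early record gets looked at by someone, rather than going unnoticed.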

6 – Data can’t easily help you identify players that fit a certain style or system

I agree with this too actually. The data that might help with this is either expensive to acquire or it just doesn’t exist (or can’t be made available in a timely fashion to support recruitment decisions). Again, this is where human judgement based on thousands of hours spent watching football can really make the difference between an ‘acceptable’ impact and a ‘historic’ impact. I love the examples of Jamie Carragher being ‘converted’ by Ronnie Moran and then later by Rafael Benitez. Similarly, the switching of Ray Kennedy by Bob Paisley from a central striker to a ‘wide-forward’ was inspired. Arsene Wenger converting Thierry Henry to a central striker from a wide midfield player took a talent and made him something else entirely. Football ‘geniuses’ can do, and have done, these things – but examples are quite rare in relation to the thousands of transfer transactions that take place all over the world every season.

7 – Much of what counts in football is difficult to count effectively in practice

I couldn’t agree more on this. I’m not sure what should be counted to be honest. I have some ideas but again, that’s probably for another time.

In truth, I’m not really in the business of trying to re-define what the established data providers are doing. These organisations have invested heavily in a range of technologies and processes and need to realise profitability from those investments – even if they were the wrong investments. It’s the responsibility of those buying this content – clubs and the like – to make sure they’re getting value for money and, if they’re not, to drive the evolution of this data towards ever-increasing value by voting with their feet.

As bloggers and ‘fanalysts’ we basically ‘get what we’re given’ so we’re not really in a position to influence things, certainly not at any great speed. We can talk and comment and hypothesise. But, like so many things in life, those with the money have the power.