How to win your fantasy football league, by our data scientists


It’s been a while, but football is back. And with the imminent restart of the Premier League comes a return to action not just for players and fans, but for 7 million fantasy football managers too. With only a handful of weeks left in the season, there’s not much time to climb the league table.

So what’s the best way to pick up points fast? Frontier’s data scientists have crunched the numbers to provide the answers: which strategy to follow to catch your league leader, and which players to pick to do it.

The busiest managers can jump straight to the tips, but to any budding data scientist, please do feel free to interrogate our data analysis in the Data Deep Dive further below – and let us know what you think!

The best strategy? Gamble on mavericks

Let’s start by stating the obvious: to overtake the manager at the top of your league, you’ll need to pick players they don’t have. These ‘differential’ players, as they’re known in the fantasy football community, are key to success. But what type of players should you pick?

Those with a high average points tally might seem an easy starting point. But with a battery of statistics on player performance available online, it’s unlikely there’ll be a pool of untapped, high-scoring players that have not already been snapped up by other managers.

So, with only a few weeks left, it’s time to roll the dice: if you want to move up the table, our experts suggest you pick maverick players with a high volatility of points. From a statistical point of view, gambling on mercurial players that have the capability of bagging super-high points totals is the most attractive strategy for the chasing pack at this stage of the season.

To explain why, here’s an example:

'Fatter tails': How mavericks can help you catch up

Picture the scene: it’s crunch time in your mini-league. To have any chance of catching the leader before the season ends, you need one of your differential players to score at least 12 points in the next Gameweek.

You scan your squad for players that the leader doesn’t have. Who might bring in the points you need? You spot David Silva, the safe bet. Silva has been a solid pick, with consistent returns of 4 to 8 points most weeks. Sure, this gradual accumulation has helped you stay within touching distance of the top. But time is running out, and gradual accumulation won’t help you now – will Silva score the 12 points you need this week?

Your attention turns to Paul Pogba, the maverick. He only bagged 2 points at home to Crystal Palace, but then scored twice away at Burnley and notched up a heavenly 13 points. He’s inconsistent, but wouldn’t Pogba be more likely than Silva to score those crucial 12 points?

The chart below compares the two players by showing their points distributions: the probability that they’ll score a given number of points. Silva is more likely to score a decent total – 5, 6 or 7 points – but is unlikely to bag a super-high score. Pogba’s distribution has a flatter peak, but ‘fatter tails’ – he’s more likely than Silva to collect a low score, but also more likely to score a super-high total.
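For readers who like to see the arithmetic, here is a minimal sketch of the 'fatter tails' idea. It assumes, purely for illustration, that weekly points follow a bell-shaped (normal) distribution; the averages and spreads are made-up numbers, not real figures for Silva or Pogba.

```python
from statistics import NormalDist

# Hypothetical weekly points distributions (illustrative numbers only).
# Both players have the same average; only the spread differs.
safe_bet = NormalDist(mu=6.0, sigma=1.5)  # consistent scorer
maverick = NormalDist(mu=6.0, sigma=4.5)  # volatile scorer

TARGET = 12    # points needed from the differential this Gameweek
LOW_SCORE = 2  # a disappointing week


def chance_of_at_least(dist: NormalDist, target: float) -> float:
    """Probability that a draw from `dist` is at least `target`."""
    return 1.0 - dist.cdf(target)


p_safe = chance_of_at_least(safe_bet, TARGET)
p_maverick = chance_of_at_least(maverick, TARGET)

# The other tail: probability of scoring LOW_SCORE or fewer.
low_safe = safe_bet.cdf(LOW_SCORE)
low_maverick = maverick.cdf(LOW_SCORE)

print(f"Chance of {TARGET}+ points: safe bet {p_safe:.2%}, maverick {p_maverick:.2%}")
print(f"Chance of {LOW_SCORE} or fewer: safe bet {low_safe:.2%}, maverick {low_maverick:.2%}")
```

Under these assumptions the maverick is more likely to land in both tails, which is the trade-off a chasing manager is accepting.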

Planning your picks

So, picking mavericks is the best strategy if you want to climb the table quickly. But how many should you pick? And how do you find them? Our data scientists have analysed the numbers using statistical methods, and worked out the key steps.

Structuring your team

Identifying mavericks 

Who are the best maverick players?

Now that you’re armed with the strategy, it’s time to start making your picks. To help you, our team at Frontier used a combination of statistical and econometric techniques – Monte Carlo simulations, machine learning and logit regression analysis – to reveal the players most likely to catapult your team up the table when the season kicks off again on 17 June.

Focusing on goal-scoring midfielders and upcoming fixtures, here is the list of players that we think hold the key to your hopes of clinching the title:

One quick caveat to bear in mind: this list was generated based on analysis of historical Fantasy Premier League data from previous seasons. As with all data analysis, managers should consider how the present Premier League situation is different to that of previous seasons, and whether these insights extend to the current situation.

Enjoy the ride

So there you have it – those are the mercurial points-scorers who we think could help you stage a late comeback in your mini-league. As all fantasy managers know, it will be important to have a strategy for the coming weeks, and the data suggest that gamblers may be rewarded.

Hopefully picking maverick players will be an effective strategy for you, but remember to remain flexible. Whatever happens in the final few weeks, one thing’s for certain: it’s the rollercoaster ride of fantasy football that makes it fun, and throwing in a few maverick picks each week can only add to the entertainment.

The data deep dive

Our fantasy football tips are based on comprehensive analysis by our data scientists. Here, in our Data Deep Dive, we’ve highlighted some of the statistical and econometric methods they used.

We see these methods as complementary: Monte Carlo analysis explores the impact of a greater volatility of points on a manager’s prospects of catch-up by simulating what would happen in thousands of possible ends-of-season. Unsupervised machine learning helps identify who the maverick players are by sorting all players into different groups, based on the volatility of their points scores. Logit regression analysis isolates the additional impact of form and easy fixtures on the probability of an outcome (in this case 10 points scored) occurring. These methods answer different questions in different ways, but together they provide a range of tips to increase your chances of catching the leader.

Before jumping in, it’s also worth reiterating that these are not normal times for the Premier League. We’ve run checks to make sure our results hold in other scenarios, but as the season plays out it’ll be interesting to see whether insights based on historical data extend to the current situation.

Mavericks vs safe bets – Monte Carlo simulation

We suggested above that maverick players increase your chances of climbing the table when compared with safe bet players. So how did we calculate the odds?

We used a technique called Monte Carlo simulation, playing out thousands of possible ends-of-season for pairs of maverick and safe bet players. For each player, we recorded a stat: the proportion of ends-of-season where they overcome a certain points deficit – for example, 20 points – against a base player. Since each pair of representative players has the same average points (see the examples of Marko Arnautovic and Jamie Vardy below), any difference in the percentage of catch-ups across players is due to differing points volatility.
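The approach described above can be sketched in a few lines. This is not our production code: it assumes weekly points are independent normal draws, and the averages and spreads are hypothetical stand-ins rather than figures for Arnautovic, Vardy or any real player. The 20-point deficit and seven remaining Gameweeks follow the example used in this article.

```python
import random

WEEKS = 7        # Gameweeks left (as in the article's example)
DEFICIT = 20     # points to make up (as in the article's example)
N_SIMS = 20_000  # simulated ends-of-season (an assumed number)


def catch_up_rate(mean, sd, base_mean, base_sd, rng):
    """Share of simulated ends-of-season in which the player out-scores
    the base player by at least DEFICIT points over WEEKS Gameweeks."""
    successes = 0
    for _ in range(N_SIMS):
        player_total = sum(rng.gauss(mean, sd) for _ in range(WEEKS))
        base_total = sum(rng.gauss(base_mean, base_sd) for _ in range(WEEKS))
        if player_total - base_total >= DEFICIT:
            successes += 1
    return successes / N_SIMS


rng = random.Random(42)  # fixed seed so the run is repeatable

# Same average points for everyone; only the volatility differs (hypothetical values).
safe_rate = catch_up_rate(mean=5.0, sd=2.0, base_mean=5.0, base_sd=2.0, rng=rng)
maverick_rate = catch_up_rate(mean=5.0, sd=5.0, base_mean=5.0, base_sd=2.0, rng=rng)

print(f"Safe bet catches up in {safe_rate:.1%} of simulations")
print(f"Maverick catches up in {maverick_rate:.1%} of simulations")
```

Because both players share the same average, any gap between the two rates comes from volatility alone, which mirrors the logic of the comparison above.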

Using Monte Carlo analysis to calculate how often Arnautovic and Vardy catch up

Key insights

How many mavericks?

Our advice was to pick between two and four maverick players – again, this was based on our Monte Carlo simulations.

A second maverick is almost always helpful: repeating our simulations for two mavericks catching up the same 20-point deficit across seven weeks, the mavericks succeeded 32% of the time. Two safe bet players caught up in 22% of simulations. This means that adding a second maverick increased chances of catching up by 22 percentage points, compared to an additional safe bet increasing chances by 18 percentage points.

For managers a modest distance behind the leader, say between 10 and 30 points, it’s best not to get too greedy. Our analysis showed that adding more than two mavericks will have a diminishing effect on your chances. Each maverick you add has an offsetting effect: a low score from the second maverick may coincide with a high score from the first maverick, cancelling out the gain from the first player’s points volatility. This offsetting effect isn’t too large when moving from one to two mavericks, but it becomes greater the more you add.
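The offsetting effect can be illustrated with a short calculation. The sketch below swaps the simulation for a closed-form shortcut, which is only valid under strong assumptions: every player's weekly points are independent and normally distributed, all players have the same average, and the leader holds a safe bet in each slot where you hold a maverick. The spreads are hypothetical. Under these assumptions the spread of the combined points gap grows with the square root of the number of mavericks, so each extra maverick adds less than the one before.

```python
from math import sqrt
from statistics import NormalDist

WEEKS = 7          # Gameweeks left (as in the article's example)
DEFICIT = 20       # points to make up (as in the article's example)
MAVERICK_SD = 5.0  # hypothetical weekly spread of a maverick
SAFE_SD = 2.0      # hypothetical weekly spread of a safe bet


def catch_up_probability(n_mavericks: int) -> float:
    """Chance of closing DEFICIT when n mavericks face n safe bets.

    Averages cancel out, so the points gap is centred on zero and its
    variance is the sum of every weekly variance on both sides.
    """
    variance = WEEKS * n_mavericks * (MAVERICK_SD ** 2 + SAFE_SD ** 2)
    gap = NormalDist(mu=0.0, sigma=sqrt(variance))
    return 1.0 - gap.cdf(DEFICIT)


probabilities = [catch_up_probability(n) for n in range(1, 5)]
gains = [later - earlier for earlier, later in zip(probabilities, probabilities[1:])]

for n, p in enumerate(probabilities, start=1):
    print(f"{n} maverick(s): {p:.1%} chance of catching up")
for n, gain in enumerate(gains, start=2):
    print(f"Adding maverick number {n}: {gain:+.1%} points")
```

The exact percentages depend entirely on the assumed spreads, so they will not match the simulation results quoted above; the pattern of shrinking gains is the point.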

Finding the mavericks – machine learning

To find out what makes a maverick player, we used unsupervised machine learning to form groups of similar players. This was to make our lives easier – there are lots of different players to choose from, so grouping similar players made it simpler for us to identify the mavericks.  

Here is a short description of how the machine learning technique works. For a plot of data for all player-and-season combinations (we call this the “data space”), the algorithm draws boundaries around data points, thereby identifying a set of groups. This is done for a specified number of groups – we chose four. The boundaries are drawn so that each player-and-season observation is as close as possible to the centre of the group it belongs to. Once the boundaries are set, players are allocated to whichever group they sit within – or more precisely, to the group whose centre they are closest to.
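The description above matches the general shape of k-means clustering, so here is a bare-bones sketch of that algorithm. Treat it as an illustration under assumptions: the article does not name the exact algorithm, the two features (average points and points volatility) are a simplification, and the observations are invented. In practice features would usually be rescaled first so that no single one dominates the distance.

```python
import random
from math import dist


def k_means(points, k, n_iter=50, seed=0):
    """Group `points` into `k` clusters by repeatedly assigning each point
    to its nearest centre and moving each centre to the mean of its group."""
    rng = random.Random(seed)
    centres = rng.sample(points, k)  # start from k randomly chosen observations
    for _ in range(n_iter):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: dist(p, centres[i]))
            groups[nearest].append(p)
        new_centres = []
        for i, group in enumerate(groups):
            if group:
                new_centres.append(tuple(sum(c) / len(group) for c in zip(*group)))
            else:
                new_centres.append(centres[i])  # keep an empty group's centre in place
        if new_centres == centres:
            break  # assignments have stopped changing
        centres = new_centres
    labels = [min(range(k), key=lambda i: dist(p, centres[i])) for p in points]
    return centres, labels


# Hypothetical player-and-season observations: (average points, points volatility).
observations = [
    (2.0, 1.0), (2.2, 1.2),  # low scoring, steady
    (5.0, 1.5), (5.2, 1.4),  # decent scoring, steady
    (5.1, 4.5), (4.9, 4.8),  # decent scoring, volatile
    (8.0, 3.0), (8.3, 3.2),  # high scoring
]

centres, labels = k_means(observations, k=4)
for point, label in zip(observations, labels):
    print(f"{point} -> group {label}")
```

Results can depend on the random starting centres, which is one reason real analyses typically run the algorithm several times.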

We observed that players in two of the four groups had similar average cost and points, but one of these groups contained the players with the greatest points volatility in the dataset – this, therefore, is the maverick group. We compared the characteristics of players (e.g. position) across the maverick group and the comparison group (which we also refer to as the safe bet group) to determine what is different about mavericks. Below is a summary of the players in these groups.

Key insights

Analysing form and fixtures – logit regression

To calculate the impact of form and ‘easy’ fixtures (those against weaker teams), we employed logit regression analysis. This technique is used to estimate the probability of a particular outcome occurring and to identify whether different factors affect the likelihood of an outcome. It also estimates the size of the individual impact that each factor has on the probability of the outcome.

Our data scientists used logit regression analysis to estimate the likelihood that different players score 10 points in a certain Gameweek. This included estimating the uptick in probability for in-form players and players with easier fixtures scoring 10 points. We also analysed whether playing against a team that’s ‘on the beach’ – one with nothing left to play for – increases the likelihood of a player scoring 10 points. Our definition for being in ‘good form’ was scoring 8 points or more in the previous week; we categorised easier fixtures as those with a Fantasy Premier League difficulty rating of 2 or 3 (as opposed to 4 or 5); and we manually identified teams that could have been ‘on the beach’.
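To show the mechanics, here is a small sketch of a logit regression fitted by plain gradient ascent. Everything in it is an assumption made for illustration: the data are simulated rather than real Fantasy Premier League records, the model has only two yes/no factors (form and fixture), and the 'true' uplifts of 0.7 are invented. Only the 2.6% baseline is borrowed from our results.

```python
import math
import random


def sigmoid(z):
    """Turn a logit score into a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Intercept set so the baseline chance is 2.6%; the two uplifts are hypothetical.
TRUE_COEFS = (math.log(0.026 / 0.974), 0.7, 0.7)

# Simulate player-weeks and tally them by (in_form, easy_fixture).
rng = random.Random(1)
cells = {}  # (in_form, easy_fixture) -> [player-weeks, weeks with 10 points]
for _ in range(40_000):
    in_form, easy = rng.randint(0, 1), rng.randint(0, 1)
    p = sigmoid(TRUE_COEFS[0] + TRUE_COEFS[1] * in_form + TRUE_COEFS[2] * easy)
    cell = cells.setdefault((in_form, easy), [0, 0])
    cell[0] += 1
    cell[1] += rng.random() < p


def fit_logit(cells, learning_rate=1.0, n_iter=5000):
    """Maximise the average log-likelihood by gradient ascent."""
    total = sum(n for n, _ in cells.values())
    coefs = [0.0, 0.0, 0.0]  # intercept, form effect, fixture effect
    for _ in range(n_iter):
        grad = [0.0, 0.0, 0.0]
        for (in_form, easy), (n, hits) in cells.items():
            x = (1.0, in_form, easy)
            p = sigmoid(sum(c * xi for c, xi in zip(coefs, x)))
            for j in range(3):
                grad[j] += (hits - n * p) * x[j] / total
        coefs = [c + learning_rate * g for c, g in zip(coefs, grad)]
    return coefs


coefs = fit_logit(cells)


def predict(in_form, easy):
    """Fitted probability of scoring 10 points for a given form and fixture."""
    return sigmoid(coefs[0] + coefs[1] * in_form + coefs[2] * easy)


print(f"Out of form, hard fixture: {predict(0, 0):.1%}")
print(f"In form, easy fixture:     {predict(1, 1):.1%}")
```

A statistics package would normally handle the fitting and report standard errors too; the hand-rolled loop is here only to make the estimation step visible.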

The chart below presents our results. The left-hand bar is the overall probability that any player will score 10 points, irrespective of form or fixtures – so roughly 5% of players score 10 points in any week. The right-hand bar starts with the conditional probability that a player who is not in form and has a hard fixture scores 10 points (2.6%), and then adds on the impact of a player being in form and having easier fixtures. Our results suggest that form and fixtures should not be ignored – players that are both in form and have an easy fixture could be twice as likely as the average player to score 10 points. Playing against ‘on the beach’ teams, however, has a limited impact.

Key insights

Using data to win in fantasy football