Atanu Biswas

Professor, Indian Statistical Institute, Kolkata

During the FIFA World Cup in Qatar, South Korea’s Hwang Hee-chan took his jersey off after scoring the match-winner against Portugal. He was spotted wearing a GPS tracker vest, which records each player’s position and movement data. Indeed, the sporting world is no stranger to an obsession with data.

Michael Lewis’ 2003 book Moneyball was a game changer. It detailed how Billy Beane, general manager of Major League Baseball’s Oakland Athletics, recruited players in 2002 on the basis of statistical analysis of their performance and built a successful baseball team on a shoestring budget. The Moneyball culture soon infiltrated every aspect of our lives, from elections and healthcare to business and national planning, and Silicon Valley was quick to supplement it with the magic dust of data science.

During the 2005 National Basketball Association draft, the Portland Trail Blazers engaged a company called Protrade Sports, which used historical college data to estimate the probability that a player would go on to perform. As technology advanced, however, the data became increasingly complex. For example, since 2006, Sportvision has used motion-capture technology to track the trajectory and speed of every pitch in Major League Baseball. The Tampa Bay Rays recruited Josh Kalk, a physics and mathematics professor, to analyse how pitchers’ release points change for different pitches. Nate Silver, the author of the 2012 book The Signal and the Noise, also played a leading role in popularising data science in both sports and business.

Soon, we were left with more data than we could handle, and we keep collecting more. Data science has started building momentum in football too, as it should. But football is far more complex to analyse than baseball or cricket. Baseball is a natural stop-start game with a discrete set of actions. Football, in contrast, is continuous: an intrinsically fluid and relatively low-scoring ‘invasion game’ (a game in which players ‘invade’ an opponent’s territory to score a goal/point). From a statistical and analytical standpoint, football has an additional layer of abstraction over baseball. Hence, the data revolution in football has been comparatively slow.

Historically, however, data-based strategies in football can be traced back to the 1950s, predating personal computers, when Charles Reep, a former military accountant, watched matches in England and recorded basic observations such as pitch positions and passing sequences. Reep analysed the data to suggest strategy and tactics at Wolverhampton Wanderers Football Club; he helped introduce a direct and incisive playing style that frowned upon sideways passes, and the club won three league championships in five years!

In the post-Moneyball era, Simon Kuper and Stefan Szymanski’s 2009 book Soccernomics, described as “the most-intelligent book ever written about soccer” by the San Francisco Chronicle, can be treated as Moneyball’s soccer equivalent. It discussed why England loses, why Germany and Brazil win, and why the US, Japan, Australia, Turkey and even Iraq are destined to become the kings of football. The authors argued that goalkeepers are undervalued in the transfer market and players from Brazil are overvalued.

Oxford graduate Matthew Benham bought Brentford Football Club (FC) in 2012. He invested in analytics, spending nearly $10 million to test his theory the Moneyball way, as he did with another club he owned, Danish side FC Midtjylland. Both clubs eventually achieved reasonable success.

In today’s multibillion-dollar soccer market, data analysts are becoming high-profile signings. When former astrophysicist and Treasury policy adviser Laurie Shaw joined Manchester City in early 2021, the football world was abuzz. The research department at Liverpool FC was led by a Cambridge-trained polymer physicist, and Arsenal FC hired a former Facebook software engineer as a data scientist.

But what exactly can data scientists do in football? Applications of data science in football cover recruitment, training, team selection, substitutions, passing, throw-ins, corners, free kicks, man-marking and more. Big data analytics has altered team philosophy and behaviour, as well as talent development and scouting, ushering in a new era of football.

Algorithms applied to the GPS data collected during training and matches can even estimate the probability of injuries. Advanced technology generates statistics such as the number of passes, shots and interceptions and their locations, as well as more advanced metrics such as expected goals and expected threat. The FIFA Football Data Ecosystem also includes data for every action during a match, such as passes, shots, substitutions, decisions of match officials and many more.
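To give a flavour of what a metric like expected goals involves, here is a minimal sketch in Python. It assumes a simple logistic model in which the chance of scoring depends only on the shot's distance from goal and the angle the goalposts subtend. The Shot class, the function names and the three coefficients are all hypothetical and chosen purely for illustration; a club's actual model would be fitted to many thousands of recorded shots and would use far more information.

```python
import math
from dataclasses import dataclass

GOAL_WIDTH_M = 7.32  # width of a standard goal in metres

# Hypothetical coefficients, for illustration only. A real model would
# estimate them from a large database of recorded shots.
INTERCEPT = -1.0
COEF_DISTANCE = -0.12  # effect of each extra metre from the goal centre
COEF_ANGLE = 1.5       # effect of each extra radian of visible goal mouth


@dataclass
class Shot:
    x: float  # metres from the goal line
    y: float  # metres to one side of the goal centre


def shot_distance(shot: Shot) -> float:
    """Straight-line distance from the shot to the centre of the goal."""
    return math.hypot(shot.x, shot.y)


def shot_angle(shot: Shot) -> float:
    """Angle in radians between the two goalposts as seen from the shot."""
    half = GOAL_WIDTH_M / 2
    return math.atan2(shot.y + half, shot.x) - math.atan2(shot.y - half, shot.x)


def shot_xg(shot: Shot) -> float:
    """Estimated scoring probability of one shot under the toy model."""
    z = (INTERCEPT
         + COEF_DISTANCE * shot_distance(shot)
         + COEF_ANGLE * shot_angle(shot))
    return 1 / (1 + math.exp(-z))


def team_xg(shots: list[Shot]) -> float:
    """Expected goals for a team: the sum of its shots' probabilities."""
    return sum(shot_xg(s) for s in shots)


if __name__ == "__main__":
    shots = [Shot(6.0, 0.0), Shot(11.0, 0.0), Shot(30.0, 10.0)]
    for s in shots:
        print(f"shot at ({s.x}, {s.y}): xG = {shot_xg(s):.3f}")
    print(f"team xG = {team_xg(shots):.3f}")
```

Even this toy version shows the idea behind the metric: a close, central shot is worth far more than a speculative effort from distance, and adding up those probabilities says more about the quality of a team's chances than a simple count of shots.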

In a news feature published in Nature in November against the backdrop of the 2022 World Cup, science journalist David Adam investigated how big data is transforming football. “Data analysis now helps steer everything from player transfers and the intensity of training to targeting opponents and recommending the best direction to kick the ball at any point on the pitch,” wrote Adam.

However, football is a bit different; it relies on instinct, passion, character, and sometimes magical skill. So, it’s not easy to inscribe the data revolution into the inherent character of the game. In his new book, Net Gains: Inside the Beautiful Game’s Analytics Revolution, soccer writer Ryan O’Hanlon carries out an in-depth examination of the rise of analytics in soccer. He illustrates how analytics has brought revolutionary tactics and underexplored metrics that are breaking the beautiful game wide open. “Once you think you’ve figured out the answer, someone else will find a better way to ask the question,” writes O’Hanlon.

Remember that while the Oakland Athletics made four consecutive playoff appearances under Beane, they also had a poor record after that. And for all the data analytics, the football World Cup didn’t get a new winner in Qatar! You may achieve success to some extent by analysing the data and then outlining and executing data-driven plans. But, at the end of the day, wizards such as Lionel Messi and Kylian Mbappé make any kind of big data analytics seem less important. Let data scientists accept such brilliance.