Making a case for levying tax on data collection

With so much data at our disposal, it’s like looking for a needle in an ever-growing haystack. We lack the computational tools and algorithms necessary to process billions of data points, as well as the statistical know-how to manage thousands of variables. As the number of variables grows, so does the number of pairs of variables that show strong ‘spurious’ or ‘nonsense’ correlations. The ocean of big data can produce both nectar and poison when stirred.

SHOULD we bank on data to make life’s important decisions? ‘Yes’ seems an obvious answer in today’s big-data era and data-obsessed world. The title of a 2022 book by economist and former Google data scientist Seth Stephens-Davidowitz, Don’t Trust Your Gut: Using Data to Get What You Really Want in Life, says a lot. But can we actually accomplish that? Do we know how to use the vast amount of data we are currently gathering?

Michael Lewis’s book Moneyball (2003) and the 2011 film based on it described how the general manager of the Oakland Athletics built a successful baseball team by using data and computer analytics to recruit new players. Together, they brought about a remarkable shift in people’s perceptions of the utility of data. The Moneyball ethos eventually permeated every aspect of our way of life. Data scientists, who, according to the Harvard Business Review, have the most attractive job of the 21st century, have joined Silicon Valley in this endeavour.

The world has developed a data addiction, and people strive to use data to create effective plans for every part of their lives and lifestyles. The phrase ‘Data is the new oil’, coined in 2006 by British mathematician and data science entrepreneur Clive Humby, has since gained enormous popularity. It is commonly believed that data will be the driving force behind the Fourth Industrial Revolution, which will be defined by a fusion of technologies that blurs the lines between the physical, digital and biological spheres. Meanwhile, the volume of data being generated is growing at a tremendous pace as the ways of producing it keep multiplying.

The digital universe is expected to double in size every two years, owing to the spread of computers and the Internet and the fact that almost everything is now connected through the Internet of Things. It is estimated that by 2025, the world will produce 463 exabytes (one exabyte is equal to a billion gigabytes) of data every day. At 4.7 GB per disc, that equates to about 9,850 crore (98.5 billion) DVDs; the often-quoted figure of 21,27,65,957 DVDs is what a single exabyte would fill. That is a pre-pandemic estimate. We may have to re-evaluate it due to the enormous increase in digital dependence during the pandemic.
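
The DVD comparison can be checked with simple arithmetic. The sketch below assumes decimal units (one exabyte is 10^18 bytes) and a single-layer DVD holding 4.7 GB, neither of which the estimate itself specifies; the function name is purely illustrative. Under those assumptions, 21,27,65,957 is the number of DVDs that one exabyte fills, and 463 exabytes fill several hundred times as many.

```python
# Back-of-the-envelope check of the exabytes-to-DVDs comparison.
# Assumptions (not stated in the article): decimal units and a
# single-layer DVD with a capacity of 4.7 GB.
EXABYTE_BYTES = 10**18
DVD_BYTES = 47 * 10**8  # 4.7 GB, kept as an integer to avoid rounding


def dvds_filled(exabytes):
    """Return the number of whole DVDs filled by the given exabytes."""
    return (exabytes * EXABYTE_BYTES) // DVD_BYTES


print("DVDs per exabyte:", dvds_filled(1))
print("DVDs per 463 exabytes:", dvds_filled(463))
```

With a different disc capacity (a dual-layer DVD, say), both counts shrink in the same proportion, so the gap between one exabyte and 463 exabytes stays the same.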

Organisations big and small are motivated to gather as much data as they can. The costs of gathering, storing and preserving data are high, yet organisations are frequently unable to use even a small portion of what they hold. Data collection continues regardless. Should society as a whole discourage this behaviour? The idea of a ‘data tax’ has surfaced from several quarters. In a 2017 article in The New York Times, Rockefeller Foundation’s Saadia Madsbjerg wrote: “Consider your data as something real and physical, like a car… (that) moves around a real, physical infrastructure… owned and operated by the Internet providers.” In the American docudrama The Social Dilemma (2020), a former Google employee says, “We could tax data collection and processing the same way that you, for example, pay your water bill by monitoring the amount of water that you use.” A ‘data tax’ would give companies a fiscal reason not to acquire every piece of data on the planet, he argues.

With so much data at our disposal, it’s like looking for a needle in an ever-growing haystack. We lack the computational tools and algorithms necessary to process billions of data points, as well as the statistical know-how to manage thousands of variables. As the number of variables grows, so does the number of pairs of variables that show strong ‘spurious’ or ‘nonsense’ correlations. The ocean of big data can, therefore, produce both nectar and poison when stirred, and it is difficult to separate the two. In this respect, statistics is still in its infancy and is not yet prepared to deal with these kinds of issues.
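
The point about spurious correlations can be illustrated with a small simulation. The sketch below is only an illustration under stated assumptions: every variable is independent random noise, each has 20 observations, and a correlation of 0.6 or more in absolute value is treated as ‘strong’. The function names, sample size and threshold are all hypothetical choices, not drawn from any real dataset.

```python
import itertools
import math
import random


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def count_spurious(n_vars, n_obs=20, threshold=0.6, seed=0):
    """Count variable pairs that look strongly correlated by pure chance.

    All variables are independent Gaussian noise, so any 'strong'
    correlation found here is spurious by construction.
    """
    rng = random.Random(seed)
    data = [[rng.gauss(0, 1) for _ in range(n_obs)] for _ in range(n_vars)]
    pairs = strong = 0
    for x, y in itertools.combinations(data, 2):
        pairs += 1
        if abs(pearson(x, y)) >= threshold:
            strong += 1
    return pairs, strong


for n_vars in (10, 50, 200):
    pairs, strong = count_spurious(n_vars)
    print(f"{n_vars} variables: {pairs} pairs, {strong} look strongly correlated")
```

The number of pairs grows roughly with the square of the number of variables (45 pairs for 10 variables, 19,900 for 200), so even a tiny chance of a coincidental match per pair adds up to many ‘nonsense’ correlations.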

A senior colleague once advised me to preserve only the documents I was certain I would need in the future, rather than keeping every piece of paper I received in the course of my daily work. He said that even if we stored every piece of paper on our shelves, it would be difficult to locate a specific document when we needed it. That’s the issue with big data.

The projections made from various quarters during the Covid-19 pandemic demonstrated the limits of data analytics. Here are two prime examples of big-data follies. Fiction first: In Steven Spielberg’s 2002 film Minority Report, set in 2054, the Washington Precrime Police Department uses data mining and predictive analysis to foretell homicides before they occur. After being accused of one of these future crimes, a systems officer sets out to establish his innocence.

The real-life example is the much-publicised Google Flu Trends project, which Google launched in 2008 with the aim of making precise forecasts about influenza outbreaks by compiling Google Search queries. The project, however, was a failure, since people frequently search for symptoms that resemble those of the flu but are caused by other illnesses. Big data might not be the holy grail of analytics.

The Star Trek character ‘Data’ had an ‘emotion chip’ inserted into him, so he tended to be human-like. The starship USS Enterprise’s captain, Jean-Luc Picard, observed of ‘Data’, “He (Data) evolved, he embraced change because he always wanted to be better than he was.” What about the data-driven world? Well, there is little doubt that Google will continue making mistakes while attempting to predict the flu. Also, pre-empting future murders will continue to be hopelessly dystopian. The future of data is still quite hazy.
