The Promise and Peril of Big Data

‘Big data’ is a term describing the vast amounts of information being collected and processed as the world’s technological prowess grows at an astonishing rate. What differentiates big data from regular old data is the three V’s: volume, velocity, and variety. Velocity describes the speed at which information is received, transferred, and processed into a usable form for analysis. Volume refers to the sheer quantity of data collected, while variety captures its diversity, from structured records to free text, images, and sensor readings. You can see for yourself the growth of big data, both as a term and as a tool, by checking the Google Trends display of ‘big data’ searches. Analytical programs such as Google Trends are made possible through the collection of big data, and will become more advanced and accurate as our understanding of this new phenomenon grows.

The future of big data is promising, but potentially frightening. As it grows, big data will have an increasing impact on the world economy and on society in general. Whilst its increasingly pervasive presence in our daily lives may be met with hesitance, the possibilities for its use will likely overcome any backlash. Positive uses such as fighting crime and managing healthcare offer the potential to raise living standards by improving public safety, health, and longevity. Valid concerns about data privacy will need to be addressed, but they are unlikely to stop continued big data collection and application.

Privacy is an increasingly important issue in today’s world. CCTV cameras track your every move, and tech firms monitor your every click online. This data collection and monitoring has the potential to make nations more secure and websites more useful. However, it could also be used to track your movements, influence your thoughts, and manipulate your decisions. Is it a fair trade-off? How much privacy and personal freedom are we willing to trade for a little more national security and shopping convenience? This is not an easy question to answer, but it’s a conversation that needs to be had.

In order to make an informed judgment about these privacy trade-offs, we must consider some of the practical applications of big data. One example of a beneficial use of big data and surveillance that has proven successful at cutting crime is a technology deployed by Ross McNutt’s firm, Persistent Surveillance Systems. With a small plane and a 192-megapixel camera, his team takes aerial photos of a targeted city at regular intervals. Now, ‘photos of a city’ may sound like a small advancement for crime fighting, if an advancement at all, but consider a scenario. A drive-by shooting occurs in a poor part of town. Police resources are limited, the car speeds into the distance, perhaps the criminals change vehicles, and then they disperse. No leads. In McNutt’s world, with this ‘Eye in the Sky’ technology, there are leads. The car can be tracked frame by frame, with the ability to watch individuals and vehicles move pixel by pixel on a near real-time map. This is not just theoretical: within weeks of the technology being deployed in Juarez, Mexico, an otherwise unsolvable drug cartel hit on a Mexican police officer was solved.

Pair this technology with big data analytics and we have a crime-stopping masterpiece. With advances in machine learning, a computer could be fed these images and trained to spot irregularities, notifying police when unusual circumstances occur. This has real potential to aid the fight against drug cartels, stop gang violence, and decrease overall rates of crime.
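To make the idea concrete, here is a minimal sketch of what ‘spotting irregularities’ could mean in practice. This is not McNutt’s actual system; it assumes a hypothetical pipeline has already reduced the aerial frames to per-vehicle movement tracks, and simply flags tracks whose speed is a statistical outlier (using the robust median-absolute-deviation test rather than a trained model).

```python
import statistics

def flag_unusual_tracks(track_speeds, threshold=3.5):
    """Flag vehicle tracks whose speed is a statistical outlier.

    track_speeds: dict mapping a (hypothetical) track id to that
    vehicle's mean frame-to-frame displacement in pixels.
    Uses a robust z-score based on the median absolute deviation,
    so one extreme track cannot mask itself by inflating the mean.
    """
    speeds = list(track_speeds.values())
    med = statistics.median(speeds)
    mad = statistics.median(abs(s - med) for s in speeds)
    if mad == 0:
        # All tracks move almost identically; nothing stands out.
        return []
    return [tid for tid, s in track_speeds.items()
            if abs(s - med) / mad > threshold]

tracks = {"car_1": 4.1, "car_2": 3.8, "car_3": 4.0, "car_4": 19.5}
print(flag_unusual_tracks(tracks))  # -> ['car_4']
```

A real deployment would replace this single hand-picked feature with a model trained on many features per track, but the shape of the system, measure behaviour, score deviation, alert a human, is the same.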

However, would you want your every move stored on a server, your position constantly known to law enforcement and the government? Living in a Western democracy, it is easy to take basic safeguards for granted: free elections, freedom of the press, separation of government powers, and the rule of law. However, these protections do not exist everywhere. Big data analytics in the hands of Kim Jong-un might well be used to further centralise power, micro-manage the economy, and suppress political opponents; in other words, to build a techno-dystopian society.

The future will bring more big data analytics. Every aspect of your life will be targeted: your shopping preferences, holiday plans, employment potential; the possibilities are endless. With the vast swathes of information that can be collected from our devices, many of our actions will be increasingly pinpointed and analysed.

A current battle that will have a large impact on the future of big data is the fight for net neutrality. Net neutrality is the principle that governments and internet service providers (ISPs) must not discriminate against or give preference to any aspect of the internet, from the type of customer to the websites that they browse. The fight for net neutrality in the United States looks likely to be lost as the FCC (Federal Communications Commission) moves to repeal the rules protecting the open internet, giving ISPs the power to regulate the web as they please. Losing net neutrality would be a large step backwards for personal rights, as free access to the internet plays such an important role in modern life. Big data would be affected by these changes: companies given preferential treatment by ISPs would gain a significant boost in web traffic, and with it the ability to collect more data, while companies not favoured in a post-net-neutrality world would see their data stores decline as their traffic is throttled. As it stands, our data is going to be tracked one way or another, so it should at least be our right to determine which firms, preferably socially conscious ones, are the beneficiaries of our traffic.

Balancing the benefits of advanced data technology against the importance of data privacy is a tricky dilemma. The potential advancements in the health sector are likely to be a great benefit to society, but we need to remain cautious about what data is collected, stored, and used, and why and how. Overall, big data will emerge as one of the revolutions of the early 21st century, but at a cost to our privacy. It is a hard trade to stomach, but the future is exciting nonetheless.

Dean Franklet is a third-year economics and finance student at the University of Canterbury, where he is President of the largest commerce society on campus. Having spent his life in Texas and then New Zealand, with a few other stops along the way, he brings a unique global viewpoint to his writing.

Image: Pexels

Big Data: When Can You Act on Correlation?

The key is to know when correlation is enough, and what to do when it is not

David Ritter, director at BCG, has explored the world of big data and when companies should take action based on observed correlations in the data.

His ideas have big implications for business because, if correlation is enough, then instead of having to know what causes customers to act, it may be enough just to know what things tend to happen together.

For example, many large supermarkets already understand that women who buy certain kinds of food may tend to be pregnant. Digital Life reported that in 2012:

… news broke of how data analytics by Target in the US enabled it to identify which customers were pregnant – and even what trimester they were in. It famously sent coupons for baby products to a teenage girl whose father, unaware she was expecting, angrily confronted a Minneapolis store manager.

Ritter notes that the key question when looking at correlation in the data is “Can I take action on the basis of a correlation finding?”

And his answer:

The answer to that question is “It depends”—primarily on two factors:

  1. Confidence That the Correlation Will Reliably Recur in the Future. The higher that confidence level, the more reasonable it is to take action in response.
  2. The Tradeoff Between the Risk and Reward of Acting. If the risk of acting and being wrong is extremely high, [then] … acting on even a strong correlation may be a mistake.

The first factor—the confidence that the correlation will recur—is in turn a function of two things: the frequency with which the correlation has historically occurred (the more often events occur together in real life, the more likely it is that they are connected) and the understanding around what is causing that statistical finding. This second element—what we call “clarity of causality”—stems from the fact that the fewer possible explanations there are for a correlation, the higher the likelihood that the two events are in fact linked. Considering frequency and clarity together yields a more reliable gauge of the overall confidence in the finding than evaluating only one or the other in isolation.
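Ritter’s framework can be caricatured in a few lines of code. The sketch below is purely illustrative, not anything from the BCG article: it assumes we can put rough 0-to-1 numbers on historical frequency and clarity of causality, combines them into a confidence score, and then weighs the expected payoff of acting against the expected cost of being wrong.

```python
def should_act(frequency, clarity, reward, risk, confidence_floor=0.6):
    """Toy version of Ritter's two-factor test (illustrative only).

    frequency: how often the correlation has held historically (0-1).
    clarity:   clarity of causality; fewer rival explanations
               means a higher value (0-1).
    reward:    payoff of acting when the correlation holds.
    risk:      cost of acting when it does not (same arbitrary units).
    """
    confidence = frequency * clarity  # both factors must be high
    if confidence < confidence_floor:
        return False  # correlation alone is not enough; investigate cause
    # Act only if the expected gain outweighs the expected loss.
    return confidence * reward > (1 - confidence) * risk
```

For example, a correlation that recurs often with few rival explanations justifies a low-stakes action (`should_act(0.9, 0.8, reward=10, risk=5)` is true), while the same correlation does not justify a bet where being wrong is extremely costly (`should_act(0.9, 0.8, reward=1, risk=100)` is false), which mirrors Ritter’s second factor.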

When working with big data, sometimes correlation is enough. But other times, understanding the cause is vital. The key is to know when correlation is enough—and what to do when it is not.

To read the full article by David Ritter, visit the BCG website.