In statistics and computer science, the term big data – literally, “large amounts of data” – refers to collections of information so large in volume, velocity and variety that they require special techniques and analytical methods to extract value or knowledge from them. The literature explains what big data is and what it is for in terms that can sound overly technical to the uninitiated. It is in fact one of the most profound and pervasive developments in the digital world, destined to endure over time and to have a major impact on our daily lives and on the productive activities of businesses.
This influence can be felt every day and has radically changed many of the basic activities of our lives, just as it has changed the world around us. This is why, especially in the last twenty years, we have been hearing more and more about big data in the print and online press, and above all in pages devoted to marketing and IT. In this guide, we will look at its value, what it is used for and where big data comes from.
Big data: what it is and what it is used for
Big data is a trend that is not only powerful but, as already mentioned, also destined to last, and it is constantly improving from an application point of view. The term refers to the ability to analyse, extrapolate and relate very large amounts of heterogeneous data, both structured and unstructured, a discipline that belongs to computer science. All of this relies on sophisticated statistical and computational methods aimed at discovering links and correlations between different phenomena and thus predicting future ones.
To give a few examples, from a business point of view, large amounts of data can serve various purposes, including measuring the performance of an organisation or a business process. To grasp what big data is in practice, however, think of our interactions on social networks, our browsing on websites and on modern smartphones that are virtually always connected, not to mention the credit cards we use for shopping, television, the storage demanded by computing applications, smart-city infrastructure and the sensors mounted on buildings and on public and private transport.
In all these cases, we are faced with an impressive amount of generated data, far larger than a few decades ago, and today big data can be analysed in real time. Over time, people themselves have become sources of data, just as a significant amount of data is now created along the value chain in every industry. Teradata stated in 2011 that “a big data system exceeds the hardware and software systems commonly used to collect, manage and process data within a reasonable time frame for a community of users, even a large one”.
Another characterisation of big data comes from the McKinsey Global Institute: “A big data system refers to data sets whose volume is so large that it exceeds the capacity of relational database systems to collect, store, manage and analyse”. The definition alone, however, is not enough to give a complete picture of such a significant phenomenon: the process of collecting and managing data has also changed, and the technologies supporting the data lifecycle and the exploitation of data have evolved with it.
The great revolution we refer to when we talk about big data is therefore, above all, the ability to use all this information to process, analyse and find objective evidence on a wide range of issues. It is about what can be done with these volumes of data: algorithms that can handle a great many variables in a short time, even with few computing resources – perhaps just a simple laptop connected to the analytics platform. Big data, to put it more simply, requires new and more refined ways of linking information, providing a genuinely visual view of the data and suggesting patterns and models of interpretation that were previously unthinkable.
Big data is thus generally defined by three Vs. The first is volume, the amount of data (structured or unstructured) generated every second from heterogeneous sources – sensors, logs, email, GPS, social media and traditional databases, to name a few. The second is variety, which refers to the different types of data generated, accumulated and used, followed by velocity, since big data is produced in real time. Over time a fourth V, veracity, was introduced, and then a fifth, value.
The analysis of big data makes it possible to generate new knowledge that supports more informed decisions, and not only in business. Now that we know what big data is and what it is used for, it is equally important to understand how it is used across different sectors. All of this is possible, and increasingly affordable, thanks to technologies that can handle unstructured data and process large volumes in real time, as well as to the spread of more sophisticated algorithms and highly innovative analytical methods.
These tools can, and should, independently extract the information hidden in data, and their applications in the modern world are countless. Big data finds its most widespread use first and foremost in marketing, where it underpins so-called recommendation systems, such as those used by entertainment and e-commerce giants – Netflix and Amazon, to name a few – to suggest purchases by comparing a specific customer’s interests with those of millions of other customers. Fraud detection and prevention is another example of how big data creates productive value day by day and improves the experience of users of a service or platform. Leading credit card companies such as Visa and American Express, not surprisingly, analyse billions of transactions from around the world every day to identify unusual movements and patterns, significantly reducing fraud in real time.
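The idea behind such recommendation systems – comparing one customer’s interests with those of many others – can be sketched in a few lines. The following is a minimal, illustrative user-based approach using cosine similarity; the tiny ratings matrix and all names are invented for demonstration and bear no relation to how Netflix or Amazon actually implement their systems.

```python
import math

# Invented example data: each user's ratings of a few items.
ratings = {
    "alice": {"film_a": 5, "film_b": 3, "film_c": 4},
    "bob":   {"film_a": 4, "film_b": 3, "film_d": 5},
    "carol": {"film_b": 2, "film_c": 5, "film_d": 4},
}

def cosine(u, v):
    """Cosine similarity between two sparse rating vectors (dicts)."""
    shared = set(u) & set(v)
    if not shared:
        return 0.0
    dot = sum(u[i] * v[i] for i in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v)

def recommend(target, ratings):
    """Rank items the target has not yet rated by similarity-weighted score."""
    scores, weights = {}, {}
    for other, their in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], their)
        if sim <= 0:
            continue
        for item, r in their.items():
            if item not in ratings[target]:
                scores[item] = scores.get(item, 0.0) + sim * r
                weights[item] = weights.get(item, 0.0) + sim
    return sorted(
        ((item, scores[item] / weights[item]) for item in scores),
        key=lambda pair: -pair[1],
    )

print(recommend("alice", ratings))
```

At the scale named in the article – millions of customers and billions of transactions – the same idea is applied with distributed computation and approximate similarity search rather than this exhaustive pairwise loop.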
Predictive maintenance is another important application. The term refers to companies using the data collected about their operations to analyse equipment performance and predict failures before they occur. Analysts have found that companies leading in big data generate, on average, 12 percent higher profits than companies that do not harness the value of their data.
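A very simple form of predictive maintenance is flagging sensor readings that drift away from a healthy baseline before an outright failure. The sketch below is one minimal way to do this with Python’s standard library; the signal values, window size and threshold are invented assumptions, and real systems use far richer models.

```python
import statistics

def drift_alerts(readings, baseline_window=10, k=3.0):
    """Return indices of readings more than k standard deviations
    from the mean of an initial 'healthy' baseline window."""
    baseline = readings[:baseline_window]
    mean = statistics.mean(baseline)
    stdev = statistics.stdev(baseline)
    return [
        i
        for i, x in enumerate(readings[baseline_window:], start=baseline_window)
        if abs(x - mean) > k * stdev
    ]

# Invented vibration-like signal: stable around 1.0, then a worsening fault.
signal = [1.0, 1.02, 0.98, 1.01, 0.99, 1.0, 1.03, 0.97, 1.0, 1.01,
          1.02, 1.15, 1.4, 1.9]
print(drift_alerts(signal))
```

Here the last three readings exceed the threshold, so maintenance could be scheduled while the drift is still mild rather than after the machine stops.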
In the public sphere, big data has many other applications. In recent years, police forces have used large amounts of real-time data to predict where and how many crimes are most likely to occur; research bodies have carried out more precise studies of the link between health and the quality of the air we breathe; genomic analysis is being used to improve the drought resistance of rice crops; and in the life sciences, models built on data from living organisms support medical research, both diagnostic and pharmacological.
Of course, in all these areas it is imperative that the use of big data be regulated, precisely because of its incredible value. Illegal or overly intrusive use of data can, in less serious cases, undermine customer confidence in companies; in more serious cases, it can harm citizens – whether patients, voters or consumers – who are the weakest link in the value chain. To ensure this protection, the control and sanctioning powers of the relevant public authorities need to be strengthened and equipped with more advanced legal and financial tools.
Do you have a big data project? Contact Enkronos team today.