Big data is an evolving term that describes any voluminous amount of structured, semi-structured, and unstructured data that has the potential to be mined for information. It is applied to data sets whose size or type is beyond the ability of traditional relational databases to capture, manage, and process with low latency.
Big data is used to better understand customers, their behavior patterns, and their preferences. It comes from sensors, devices, video and audio, networks, log files, transactional applications, the web, and social media, much of it generated in real time and at very large scale.
Big data is characterized by the ‘3Vs’: the extreme volume of data, the wide variety of data types, and the velocity at which the data must be processed.
Volume means more than simply having more data; it is the granular nature of the data that is unique. Big data requires processing high volumes of low-density, unstructured data, that is, data of unknown value, such as click streams on a web page or mobile app, network traffic, and continuous readings from sensor-enabled equipment. Such data is often held in distributed frameworks such as Hadoop, and the task of big data processing is to convert it into valuable information.
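As a small illustration of turning low-density click-stream records into something useful, the sketch below counts page views per page. The event layout (dictionaries with `user`, `action`, and `page` keys) and the function name `top_pages` are assumptions made for this example, not a format taken from any particular system.

```python
from collections import Counter
from typing import Dict, Iterable, List, Tuple


def top_pages(events: Iterable[Dict[str, str]], n: int = 3) -> List[Tuple[str, int]]:
    """Return the n most viewed pages as (page, view_count) pairs."""
    # Only "view" events count as page views; other actions are ignored.
    counts = Counter(e["page"] for e in events if e.get("action") == "view")
    return counts.most_common(n)


# Hypothetical click-stream records, invented for illustration.
events = [
    {"user": "a", "action": "view", "page": "/home"},
    {"user": "a", "action": "view", "page": "/cart"},
    {"user": "b", "action": "view", "page": "/home"},
    {"user": "b", "action": "click", "page": "/home"},
    {"user": "c", "action": "view", "page": "/home"},
    {"user": "c", "action": "view", "page": "/cart"},
    {"user": "c", "action": "view", "page": "/help"},
]

print(top_pages(events, 2))
```

Each individual event says very little; the value appears only after aggregation, which is the sense in which such data is described as low-density.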
Velocity is the rate at which data arrives and must be acted on. The highest-velocity data normally streams directly into memory rather than being written to disk. Some Internet of Things (IoT) applications have health and safety ramifications that require real-time evaluation and action. Other internet-enabled smart products operate in real time or near real time.
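One way to picture in-memory evaluation of a fast stream is a fixed-size sliding window that is updated as each reading arrives, without storing the full history. The following is a minimal sketch under assumed names (`rolling_alerts`, `window`, `limit`) and made-up sensor values; a production streaming system would be considerably more involved.

```python
from collections import deque
from typing import Deque, Iterable, Iterator, Tuple


def rolling_alerts(
    readings: Iterable[float], window: int, limit: float
) -> Iterator[Tuple[int, float]]:
    """Yield (index, rolling_mean) whenever the mean of the last
    `window` readings exceeds `limit`."""
    recent: Deque[float] = deque(maxlen=window)  # holds only the latest values
    total = 0.0
    for i, value in enumerate(readings):
        if len(recent) == window:
            total -= recent[0]  # this value is evicted by the append below
        recent.append(value)
        total += value
        if len(recent) == window and total / window > limit:
            yield i, total / window


# Hypothetical temperature readings from a single sensor.
readings = [70, 71, 90, 95, 99, 72]
for index, mean in rolling_alerts(readings, window=3, limit=85):
    print(f"alert at reading {index}: rolling mean {mean:.1f}")
```

Because the function is a generator, alerts are produced while the stream is still being consumed, and memory use stays bounded by the window size.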
Variety refers to the range of data types involved. Unstructured and semi-structured data types, such as text, audio, and video, require additional processing to derive both meaning and the supporting metadata. Unstructured data has many of the same requirements as structured data, such as summarization, lineage, auditability, and privacy.
There is a range of quantitative and investigative techniques for deriving value from data, from discovering a consumer preference or sentiment, to making a relevant offer by location, to identifying a piece of equipment that is about to fail. The technological breakthrough is that the cost of data storage and compute has fallen dramatically, providing an abundance of data and making it feasible to run statistical analysis on the entire data set rather than only on a sample, as was previously the case.
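The difference between analyzing a sample and analyzing the entire data set can be sketched in a few lines. The function name `full_vs_sample_mean` and the synthetic values are assumptions for illustration only; the point is that the sample estimate varies with the draw, while the full-data figure does not.

```python
import random
import statistics
from typing import Sequence, Tuple


def full_vs_sample_mean(
    values: Sequence[float], sample_size: int, seed: int = 0
) -> Tuple[float, float]:
    """Return (mean of all values, mean of a random sample of them)."""
    rng = random.Random(seed)  # seeded so the draw is repeatable
    sample = rng.sample(list(values), sample_size)
    return statistics.fmean(values), statistics.fmean(sample)


# Synthetic data set: the integers 1 through 1000.
values = list(range(1, 1001))
full_mean, sample_mean = full_vs_sample_mean(values, sample_size=50)
print(f"full data set mean: {full_mean}")
print(f"sample mean:        {sample_mean}")
```

With a sample of 50 the estimate will typically land near, but not exactly on, the full-data mean of 500.5; changing `seed` changes the estimate, which is the uncertainty that full-data analysis avoids.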
Big data is used to better understand customers and their behaviors and preferences. Businesses are keen to expand their traditional data sets with social media data, browser logs, text analytics, and sensor data to get a more complete picture of their customers.
Big data is increasingly used to optimize business processes. Retailers are able to optimize their stock based on predictions generated from social media data, web search trends and weather forecasts.
Big data is not just for companies and governments but also for all of us individually. We can now benefit from the data generated from wearable devices such as smart watches or smart bracelets.
The computing power behind big data analytics makes it possible to decode entire DNA sequences in minutes, which may help researchers find new cures and better understand and predict disease patterns.
Science and research are currently being transformed by the new possibilities big data brings.
Big data analytics helps machines and devices become smarter and more autonomous. Big data tools are also used to optimize the performance of computers and data warehouses.