
Definition of Big Data

Industry analyst Doug Laney summed up big data with three Vs: volume, velocity and variety.

Volume. Many factors contribute to the increase in data volume: transaction-based data stored through the years, unstructured data streaming in from social media, increasing amounts of sensor and machine-to-machine data being collected, or a combination of all of these. Excessive data volume used to be seen as a storage issue. As storage costs have fallen, other and more complex problems caused by the sheer amount of data have emerged: the challenge now is to determine what is relevant within it and to use analytics to create value from the relevant data.

Velocity. Data is streaming in at unprecedented speed and must be dealt with in a timely manner. RFID tags, sensors and smart metering are driving the need to deal with torrents of data in near-real time. Reacting quickly enough to deal with data velocity is a challenge for most organizations.

Variety. Data today comes in all types of formats: structured, numeric data in traditional databases; information created from line-of-business applications; and unstructured text documents, email, video, audio, stock ticker data and financial transactions. Managing, merging and governing different varieties of data is something many organizations still grapple with.

Variability. In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent, with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage, even more so when unstructured data is involved.

Complexity. Today's data comes from multiple sources, and it is still an undertaking to link, match, cleanse and transform data across systems. However, it is necessary to connect and correlate relationships, hierarchies and multiple data linkages, or your data can quickly spiral out of control.
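
As a rough illustration of what linking, matching, cleansing and transforming data across systems can involve, here is a minimal Python sketch. It assumes two hypothetical sources, a CRM export and a billing export, that share an email field. The field names, the sample records and the helper functions `normalize_email` and `link_records` are invented for this example and do not come from any particular product.

```python
def normalize_email(raw):
    """Cleanse: trim whitespace and lowercase so the same address compares equal."""
    return raw.strip().lower()


def link_records(crm_rows, billing_rows):
    """Match rows from both sources on the cleansed email and merge each pair."""
    # Index the billing rows by cleansed email for direct lookup.
    billing_by_email = {normalize_email(r["email"]): r for r in billing_rows}

    linked = []
    for row in crm_rows:
        key = normalize_email(row["email"])
        match = billing_by_email.get(key)
        if match is not None:
            # Transform: emit one combined record per matched pair.
            linked.append(
                {"email": key, "name": row["name"], "balance": match["balance"]}
            )
    return linked


# Hypothetical sample data: the same customer is spelled differently in each system.
crm = [
    {"name": "Ada", "email": " Ada@Example.com "},
    {"name": "Bob", "email": "bob@example.com"},
]
billing = [{"email": "ada@example.com", "balance": 120.0}]

print(link_records(crm, billing))
```

Only the record for Ada is linked, because Bob has no counterpart in the billing data. Real sources rarely share such a clean key, which is part of why this work remains an undertaking.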