What is Big Data? Introduction to its Types and Characteristics:
The term “big data” was introduced in 2008 in a special issue dedicated to the explosive growth of the world's information volumes, although large data sets themselves had of course existed before then. According to experts, the big data category covers most data streams above 100 GB per day.
Data is an unorganized collection of facts and figures used to represent conditions, ideas or objects.
“Big data” is a combination of tools, methodologies and procedures for processing both structured and unstructured data so that it can be used for specific tasks and purposes.
In the modern world, big data is a socio-economic phenomenon, driven by the appearance of new technological opportunities for analyzing huge amounts of data. The amount of data created and stored worldwide continues to grow every day. About 2.5 exabytes are created daily (1 exabyte = 1 billion gigabytes), and as a result 90% of all existing data was created in the last two years. Companies that use this data can significantly accelerate their development. The problem is that only a small part of it is ever analyzed.
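To make the scale of these figures concrete, here is a minimal Python sketch of the unit arithmetic. It assumes decimal (SI) units, matching the "1 exabyte = 1 billion gigabytes" conversion above. The names `GB_PER_EB` and `exabytes_to_gigabytes` are made up for this illustration, and the yearly figure is only a rough extrapolation of the daily one.

```python
# Decimal (SI) units, as in the text: 1 exabyte = one billion gigabytes.
GB_PER_EB = 10**9


def exabytes_to_gigabytes(exabytes: float) -> float:
    """Convert exabytes to gigabytes using the decimal definition."""
    return exabytes * GB_PER_EB


daily_eb = 2.5  # daily volume quoted in the text
daily_gb = exabytes_to_gigabytes(daily_eb)

# Rough yearly volume, assuming the daily rate stays constant (it actually grows).
yearly_eb = daily_eb * 365

print(f"{daily_gb:,.0f} GB per day")    # 2,500,000,000 GB per day
print(f"{yearly_eb:,.1f} EB per year")  # 912.5 EB per year
```

In other words, 2.5 exabytes per day is 2.5 billion gigabytes per day, or roughly 900 exabytes per year at a constant rate.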
Big data is usually defined in terms of data management problems that cannot be solved with traditional databases because of the volume of the data, its variety and the speed requirements. There are different definitions of big data, but most of them are based on the concept of the “three Vs”:
Volume: the amount of data is measured in terabytes and petabytes.
Variety: data comes from a wide range of sources in various formats (network logs, social networks, online commerce and online transactions, financial transactions, etc.).
Velocity (speed): companies place increasingly strict requirements on how quickly data must be turned into analytical results on which users can base decisions. It is therefore necessary to collect, store, process and analyze data in a relatively short time, ranging from one day down to real time.
Big data is classified into three main types:
Structured data: data that is stored in a database in an ordered manner.
Unstructured data: data that has no clear format in storage.
Semi-structured data: data that does not follow a conventional database design the way structured data does, yet has some hierarchical properties that make it easier to process.
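The difference between the three types is easier to see on small samples. The sketch below is only an illustration using Python's standard library, and all sample values (orders, user record, review text) are invented for the example. A CSV table stands in for structured data, a JSON document for semi-structured data, and a free-text review for unstructured data.

```python
import csv
import io
import json

# Structured: fixed columns, every row has the same fields (like a database table).
structured_source = "order_id,customer,amount\n1001,Alice,25.50\n1002,Bob,13.00\n"
rows = list(csv.DictReader(io.StringIO(structured_source)))
total = sum(float(row["amount"]) for row in rows)  # columns can be queried directly

# Semi-structured: no fixed table schema, but nested keys give it a hierarchy.
semi_structured_source = (
    '{"user": "alice", "tags": ["sale", "mobile"], "device": {"os": "Android"}}'
)
record = json.loads(semi_structured_source)
device_os = record["device"]["os"]  # reachable by following the hierarchy

# Unstructured: free text with no predefined fields; only generic operations apply
# until further processing (for example text analysis) extracts some structure.
unstructured_source = "Great phone, but the battery drains too quickly."
word_count = len(unstructured_source.split())

print(total)       # 38.5
print(device_os)   # Android
print(word_count)  # 8
```

The structured sample can be summed by column name, the semi-structured sample can be navigated by key, and the unstructured sample offers nothing more specific than counting words without additional analysis.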