Big data is an IT buzzword nowadays, but what does it really mean? When does data become big? At a recent Big Data and High Performance Computing Summit in Boston hosted by Amazon Web Services (AWS), data scientist John Rauser mentioned a simple definition: any amount of data that's too big to be handled by one computer. Some say that's too simplistic. Others say it's spot on.

CLOUD TRENDS: New bare metal cloud offerings emerging

HADOOP: Hadoop wins over enterprise IT, spurs talent crunch

"Big data has to be one of the most hyped technologies since, well, the last most hyped technology, and when that happens, definitions become muddled," says Jeffrey Breen of Atmosphere Research Group.

The lack of a standard definition points to the immaturity of the market, says Dan Vesset, program vice president of the business analytics division at research firm IDC. But he isn't quite buying the definition floated by AWS. "I'd like to see something that actually talks about data instead of the infrastructure needed to process it," he says.

Others agree with the AWS definition. "It may not be all-inclusive, but I think for the most part that's right," says Jeff Kelly, a big data analytics analyst at the Wikibon project. Part of the idea of big data is that it's so big that analyzing it needs to be spread across multiple workloads, hence AWS's definition. "When you're hitting the limits of your technology, that's when data gets big," Kelly says.

One of the most common definitions of big data uses three terms, all of which happen to start with the letter V: volume, velocity and variety. Many analyst firms, such as IDC, and companies, such as IBM, seem to coalesce around this definition. Volume means the massive amount of data generated and collected by organizations; velocity refers to the speed at which the data must be analyzed; and variety means the vast array of different types of data collected, from text to audio, video, web logs and more.
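The "spread across multiple workloads" idea Kelly describes can be pictured with a toy sketch: split the data into shards, have each worker analyze its own shard (the "map" step), then merge the partial results (the "reduce" step). This is an illustrative sketch only, not AWS's or Wikibon's method; the shard list and word-count task are invented for the example.

```python
# Toy map-reduce sketch: a word count split across several workers.
# The chunks list is a hypothetical stand-in for data shards that would,
# in a real cluster, live on separate machines.
from concurrent.futures import ThreadPoolExecutor
from collections import Counter

def count_words(chunk):
    # "Map" step: each worker counts words in its own slice of the data.
    return Counter(chunk.split())

chunks = ["big data big", "data gets big", "big big data"]  # stand-in shards

with ThreadPoolExecutor(max_workers=3) as pool:
    partials = list(pool.map(count_words, chunks))

# "Reduce" step: merge the per-worker partial counts into one result.
total = sum(partials, Counter())
print(total["big"])  # 5
```

The point of the pattern is that no single worker ever needs to hold or scan the whole dataset, which is exactly the threshold Rauser's definition hinges on.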
But some are skeptical of that definition, too. Breen has a fourth "v" to add to the definition: vendor.

Companies such as AWS and IBM tailor definitions to support their products, Breen says. AWS, for example, offers a variety of big data analytics tools, such as Elastic MapReduce, which is a cloud-based big data processing feature.

"The cloud provides instant scalability and elasticity and lets you focus on analytics instead of infrastructure," Amazon spokesperson Tera Randall wrote in an e-mail. "It enhances your ability and capability to ask interesting questions about your data and get rapid, meaningful answers." Randall says Rauser's big data definition is not an official AWS definition of the term, but was being used to describe the challenges facing business management of big data.

Big data analytics in the cloud is still an emerging market, though, Kelly says. Google, for example, recently released BigQuery, the company's cloud-based data analytics tool. IBM, for its part, says information is "becoming the petroleum of the 21st century," fueling business decisions across a variety of industries moving forward.

Big data, IDC says, is a big market. According to IBM, IDC estimates enterprises will invest more than $120 billion by 2015 to capture the business impact of analytics, across hardware, software and services. The big data market is growing seven times faster than the overall IT and communications business, IDC says.

But Vesset, the IDC researcher, says big data is not about how it is defined, but rather about what is done with the data. The biggest challenge organizations have today is understanding which technologies are best for which data and use cases. With the rise of Hadoop, an open source big data analytics tool, some question whether it spells the end of traditional relational databases, supplanted by unstructured data services like Hadoop. "Both have a role to play," he says, and most large organizations will likely use each.
Relational databases take a structured approach to the data and will be used by organizations with large amounts of data subject to compliance or security requirements, for example. Large-scale collection of data on an ad hoc basis is more unstructured and would take advantage of Hadoop computing clusters, he believes.

How big data is defined, though, is slightly more intangible, at least so far. Kelly perhaps has the best definition: "You know it when you see it."

Network World staff writer Brandon Butler covers cloud computing and social collaboration. He can be reached on Twitter at @BButlerNWW.

Read more about data center in Network World's Data Center section.