Why Hadoop for Big Data

Marina Astapchik
IT Trends Research Specialist
August 31, 2012

Nowadays, digital data is growing exponentially. The growth in its volume, variety and velocity is known as the Big Data phenomenon. Big Data includes large, diverse and complex datasets, whether structured, semi-structured or unstructured, generated by software systems, collaboration tools, social networks, video streams, payment transactions, sensors and other digital sources.

Companies and organizations that are best able to leverage Big Data for well-timed and well-aimed business decisions are expected to prosper, both in the near term and in the long run. The challenge lies in finding the most efficient ways to manage, analyze, visualize and create new value from terabytes and even petabytes of information in real time.

Working with Big Data using traditional relational databases and thousands of enormously expensive servers is no longer the best option. This is where Apache Hadoop comes in: a reliable open source technology that enables high-performance processing of huge amounts of data. It is an excellent solution for running large, scalable, distributed computations.

Hadoop’s distributed file system (HDFS) differs from other distributed file systems in two respects: it is highly fault-tolerant, and it is designed to be deployed on low-cost hardware. HDFS creates multiple replicas of data blocks and distributes them across compute nodes throughout a cluster, which keeps the data available when a node fails and enables reliable, extremely rapid computations.
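To make the idea of block replication more concrete, here is a minimal sketch in plain Python. It is not the HDFS API and does not reproduce the real HDFS placement policy, which also takes racks into account. The function names, the tiny block size and the round-robin placement are all assumptions made for illustration. The replication factor of 3 matches the HDFS default.

```python
from typing import Dict, List, Set


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's contents into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str],
                   replication: int = 3) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes.

    Hypothetical round-robin placement, chosen only for simplicity.
    """
    if replication > len(nodes):
        raise ValueError("replication factor exceeds the number of nodes")
    placement = {}
    for block_id in range(num_blocks):
        # Consecutive offsets modulo len(nodes) are distinct
        # because replication <= len(nodes).
        placement[block_id] = [
            nodes[(block_id + r) % len(nodes)] for r in range(replication)
        ]
    return placement


def surviving_blocks(placement: Dict[int, List[str]],
                     failed_nodes: Set[str]) -> Set[int]:
    """Return the blocks that still have at least one replica on a live node."""
    return {
        block_id
        for block_id, holders in placement.items()
        if any(node not in failed_nodes for node in holders)
    }


# Toy example: a 1000-byte "file", 256-byte blocks, five nodes.
blocks = split_into_blocks(b"x" * 1000, block_size=256)
nodes = ["node1", "node2", "node3", "node4", "node5"]
placement = place_replicas(len(blocks), nodes)
alive = surviving_blocks(placement, failed_nodes={"node1", "node2"})
print(len(blocks), "blocks;", len(alive), "still readable after two node failures")
```

With three replicas per block, losing any two nodes still leaves at least one copy of every block, which is the property that lets a cluster of low-cost machines tolerate hardware failures.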

Hadoop's performance also owes much to the Hadoop MapReduce software framework, which makes it easy to write applications that process vast amounts of data (multi-terabyte datasets) in parallel on large clusters of commodity hardware.
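The MapReduce programming model can be illustrated with the classic word count example. The sketch below is plain Python running in a single process, not Hadoop code. The function names are hypothetical, and the "shuffle" step stands in for the grouping that the framework performs between the map and reduce phases. On a real cluster, the map and reduce calls would run in parallel on different nodes.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(line: str) -> Iterator[Tuple[str, int]]:
    """Map: emit a (word, 1) pair for every word in one input line."""
    for word in line.lower().split():
        yield word, 1


def shuffle(pairs: Iterable[Tuple[str, int]]) -> Dict[str, List[int]]:
    """Shuffle: group all emitted values by key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key: str, values: List[int]) -> Tuple[str, int]:
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run the three phases in sequence over the input lines."""
    mapped = (pair for line in lines for pair in map_phase(line))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = word_count(["big data big clusters", "data nodes"])
print(counts)
```

Because each map call looks at only one line and each reduce call at only one key, the work can be split across many machines without the application author having to manage the distribution.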

A wide variety of enterprises and organizations use Hadoop for both research and production. According to IDC, revenues for the worldwide Hadoop-MapReduce ecosystem software market are expected to grow to $812.8 million in 2016 from $77 million in 2011.

Recognizing the tremendous value of Hadoop for Big Data processing and analytics, many companies invest time and resources in this technology, seeing it as a necessary step towards the sustainable development of their businesses. Notable Hadoop users include Amazon.com, American Airlines, Apple, eBay, Facebook, Hewlett-Packard, IBM, LinkedIn, Microsoft, Twitter and Yahoo!

Industries and Technology Areas:

Industries: software development

Technology Areas: Hadoop, HDFS, MapReduce, Big Data

