Monday, 31 March 2014 13:00

Apache Hadoop and the future of the corporate data warehouse

Big data experts Hortonworks have created a new white paper explaining how Apache Hadoop can help businesses avoid the strictures of Enterprise Data Warehouses and alleviate the problems caused by the tsunami of corporate data.

The explosion of new types of data, such as social, graphical and video data from the web and connected devices, or simply the sheer volume of records, has put tremendous pressure on the Enterprise Data Warehouse (EDW).

A recent IDC report estimated that businesses would be dealing with 2.8 zettabytes of data in 2012 (a zettabyte is one billion terabytes, or a one followed by 21 zeroes when counted in bytes), and that figure is expected to grow to 40 zettabytes by 2020. The majority (85%) of this growth is expected to come from new types of data, with machine-generated data projected to increase 15x by 2020.
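
The unit conversion and the growth implied by these figures can be checked with a few lines of Python. This is only a sanity check on the numbers quoted above, assuming decimal (SI) units; the variable names are illustrative.

```python
# Decimal (SI) units, expressed in bytes.
TERABYTE = 10**12
ZETTABYTE = 10**21

# One zettabyte expressed in terabytes: one billion.
tb_per_zb = ZETTABYTE // TERABYTE

# Growth implied by the quoted figures: 2.8 ZB (2012) to 40 ZB (2020).
growth_factor = 40 / 2.8

print(f"1 ZB = {tb_per_zb:,} TB")
print(f"2012 to 2020 growth: about {growth_factor:.1f}x")
```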

In response to this disruption, an increasing number of organisations have turned to new methods of dealing with this tsunami of data, methods that help manage the enormous increase in data while maintaining the coherence of the EDW.

The Journey to a Data Lake

This paper discusses Apache Hadoop as a solution to the EDW problem and looks at its capabilities as a data platform. It also examines how the core of Hadoop and its surrounding ecosystem of solution vendors meet the enterprise requirements for integrating alongside the EDW and other enterprise data systems as part of a modern data architecture, and as a step on the journey toward an enterprise ‘Data Lake’.

With an enterprise “data lake”, businesses receive the following core benefits:

  • Data architecture efficiencies - through a significantly lower cost of storage, and through optimisation of data processing workloads such as data transformation and integration.
  • New opportunities through flexible ‘schema-on-read’ access to all enterprise data, and through multi-use and multi-workload data processing on the same sets of data: from batch to real-time.
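
To make ‘schema-on-read’ concrete, here is a minimal sketch in plain Python. It does not use Hadoop or any of its APIs; the sample records, the `read_with_schema` helper and both schemas are hypothetical and exist only for illustration. The idea it shows is that raw data is stored as-is, and each consumer applies its own schema when reading.

```python
import json

# Raw events are stored exactly as they arrive; no schema is enforced on write.
# (Hypothetical sample data for illustration only.)
RAW_EVENTS = [
    '{"user": "alice", "action": "click", "ts": "2014-03-31T13:00:00"}',
    '{"user": "bob", "action": "view", "duration_ms": "1200"}',
    '{"sensor": "s-17", "temp_c": "21.5"}',
]


def read_with_schema(raw_lines, schema):
    """Apply a schema at read time: select fields, cast them, default to None."""
    rows = []
    for line in raw_lines:
        record = json.loads(line)
        rows.append({
            field: cast(record[field]) if field in record else None
            for field, cast in schema.items()
        })
    return rows


# Two consumers read the same raw data through different (hypothetical) schemas.
clickstream_schema = {"user": str, "action": str}
sensor_schema = {"sensor": str, "temp_c": float}

clicks = read_with_schema(RAW_EVENTS, clickstream_schema)
readings = read_with_schema(RAW_EVENTS, sensor_schema)
```

Because nothing is discarded or reshaped on write, a new consumer can define a third schema later without reloading the data, which is the flexibility the second benefit describes.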

To read more on data lakes, and how they can improve your business and reduce your dependence on the EDW, click on the white paper below.
