Migrating to the Hadoop Ecosystem by Eleni Stroulia

Outline

  • Background
    – Why?
  • PaaS with “the Hadoop Ecosystem”: HDFS, Hadoop, and HBase
    – What?
  • The TAPoR Migration
    – How?
  • Closing Remarks

Introduction

Big Data… Cheap Hardware…

  • Data is growing at an unprecedented rate
    • More people use the web and publish data
      • Internet usage worldwide: 360 million in 2000; 2 billion in 2011 (about 1/3 of the world's population)
      • Facebook, in 2009: 60 TB of images uploaded every week
    • Things are on the Internet
      • A jet engine produces 10 TB of data for every 30 minutes of flight
  • Commodity hardware is cheap
  • Owning and maintaining hardware is expensive

View the full presentation below:

Is Your Data Cloud-Ready? by Dale Oldford


The Web evolution is enabling technologies and tools to facilitate information sharing and collaboration in virtual communities, where members are active participants as “prosumers” of content, instead of passive consumers of data.

Agenda

• Finding meaningful data in the “Cloud”
• Example 1: Prime Time Blogs
• Open Data
• Example 2: “Unlocking” Your Data
• The “ABC” Formula
• Some of the Challenges
• Shaping your data for “Cloud” readiness
• Conclusions
• Q&A

 

Conclusions

  • Web 2.0 technologies and analytics provide feasible options to cope with challenging characteristics of Big Data (i.e., volume, velocity, and variety).
  • Analytics enables enterprise search solutions to perform complex tasks requiring machine learning and automation.
  • Cloud readiness goes beyond data and available technology.

See the full presentation below:

Finding Relevant Data in the Cloud for Actionable Decisions by Andres Dorado

“An Information Need* is the topic about which the user desires to know more, and is differentiated from a query, which is what the user conveys to the computer in an attempt to communicate the information need.”

Agenda

• Information Retrieval
• The “ABC” Formula
• Some of the Challenges
• Example 1: The Right Profile
• Example 2: Like it 
• Example 3: Promote it
• Conclusions
• Q&A

Conclusions

  • Analytics adds capabilities to information retrieval systems that facilitate finding relevant data in the “cloud”.
  • Analytics enables information retrieval systems to deal with large-scale data sets and is therefore well suited to working with Big Data.
  • Analytics provides advanced techniques for more effective browsing and filtering of Big Data.

See the full presentation below: