20. “The big issue is not that everyone will
suddenly operate at petabyte scale; a lot of
folks do not have that much data.
The more important topics are the specifics
of the storage and processing infrastructure
and what approaches best suit each
problem.”
- Bradford Cross, Flightcaster/Woven
24. “build Amazon's product search indices”
“build the recommender system for behavioral targeting”
“ETL style processing and statistics generation”
“information extraction & search”
“searching and analysis of millions of rental bookings”
“we use Hadoop to summarize user's tracking data”
“we use Hadoop to store ad serving logs”
“the freedom to query the data in an ad-hoc manner”
“generating web graphs on 100 nodes”
“we use Hadoop for batch-processing large RDF datasets”
“facial similarity and recognition across large datasets”
“We are using Hadoop and Nutch to crawl Blog posts”
“Used for ETL & data analysis on terascale datasets”
Source: http://wiki.apache.org/hadoop/PoweredBy