Now in its fifth year, Apache Hadoop has firmly established itself as the platform of choice for organizations that need to efficiently store, organize, analyze, and harvest valuable insight from the flood of data that they interact with. Since its inception as an early, promising technology that inspired curiosity, Hadoop has evolved into a widely embraced, proven solution used in production to solve a growing number of business problems that were previously impossible to address. In his opening keynote, Mike will reflect on the growth of the Hadoop platform, driven by the innovative work of a vibrant developer community, and on the rapid adoption of the platform among large enterprises. He will show how enterprises have transformed themselves into data-driven organizations, highlighting compelling use cases across vertical markets. He will also discuss Cloudera’s plans to stay at the forefront of Hadoop innovation and its role as the trusted solution provider for Hadoop in the enterprise. Finally, he will share Cloudera’s view of the road ahead for Hadoop and Big Data and discuss the vital roles of the key constituents across the Hadoop community, ecosystem and enterprises.
21. Building Applications
Develop personalized applications on Hadoop and HBase
Get it at: http://fonedoktor.com
Battery Analysis: Available Today…
Mapping Features: Coming Soon!
Learn more: Today, 3:30PM, Architecture Track, Aaron Kimball and Garrett Wu
www.wibidata.com, @wibidata
22. Data Analysis and Visualization INSTANT INTELLIGENCE
Demand for Online App Analytics
• Real-time, interactive & visual analytics
• Auto-discover data trends
• User behavior analytics with data clustering
• Investigative and root cause analytics
• Simplify data modeling & custom functions for Hadoop data
Empower business users and data scientists with out-of-the-box analytics
www.cetas.net, @CetasAnalytics
23. Powerful Statistical Tools
Why Hadoop and R?
• Need to do more than simple statistics
• Analyze all of the data
Integration
• Make it easy to write MapReduce programs in R
• Keep the statisticians focused on the analysis
Usage
• Fraud and Risk Analysis
• Portfolio Optimization
• Anything you can model in R!
www.revolutionanalytics.com, @RevolutionR
24. Complex Data Exploration
Automatic extraction of facts, connections, associations, etc.
[Diagram: Synthesys Knowledge Base of entities and aliases, with relationships, associations, and connections, organized by Who, What, Where (location), and When (time)]
Unstructured data: connection discovered from AIG to Metlife Equity in Wikipedia: “AIG sells Allco to Metlife Equity for $6.8B”
Synthesys automatically surfaces critical facts in unstructured data
www.digitalreasoning.com, @dreasoning
25. Business Analytics
• Metrics Management and Reporting
• Strategic, Financial, and Operational Planning, Budgeting, and Forecasting
• Profitability Modeling
USABLE
UNIFIED
ACTIONABLE
Enterprise Performance Management
for the Cloud
www.tidemark.net, @TidemarkEPM
28. Big Data Fund
• $100MM dedicated to fund entrepreneurs globally in building disruptive, Big
Data companies
• Funding innovation across every layer of the “Big Data Stack”:
Infrastructure: Automation, Data Management, Identity & Access, Security, Storage, …
Applications: Business Intelligence, Collaboration, Data Analysis/Visualization, Mobile, Vertical Applications, …
• Partnering with thought leaders to foster community and drive innovation:
Doug Cutting (Hadoop), Gil Elbaz (Factual), Jeff Hammerbacher (Cloudera), Jeff Heer (Stanford), Hilary Mason (Bit.ly), Jay Parikh (Facebook), Kenny Van Zant (Solarwinds)
Accel Partners 28
29. Who We Are
Three decades of technology investing with over $6B of capital in US, Europe,
China and India
• Partner with category-defining entrepreneurs
• Invest at every stage of technology lifecycle – seed, venture and growth capital
• Focus deeply on technology innovations in software, infrastructure and internet
Big Data consistently drives innovation across our portfolio companies today
Data Generators Data Solutions
30. Time is Now!
The Big Data Wave
[Chart: data growth, 1980 to 2010, from Traditional Data to Big Data]
Data is exploding
“New” data types are breaking legacy data platforms
Big Data platforms such as Hadoop are becoming mainstream
“Native” Big Data applications and services will quickly emerge
Big Data continues to revolutionize data centers across all industries, opening
up a massive market for entrepreneurial activity.
31. Funding the Big Data Ecosystem
Big Data will drive the next generation of multi-billion dollar software companies
[Diagram: technology stack by era, 1980 - 2010 vs. 2010 and beyond]
Applications: Analytics, Security, Business Intelligence, Collaboration, Mobile, CRM, Vertical Apps (Fin Tech, Healthcare)
Data: Traditional Data Platforms (Relational Database Management Systems), then Big Data Platforms
Infrastructure: Traditional Infrastructure Platforms (Mainframe, Client-Server, Web), then Private & Public Cloud Platform and Services
32. Big Data Fund Contact Info
Contact Us
accel.com/bigdata
bigdatafund@accel.com
@bigdatafund
Big Data Conference - Spring 2012
Want to attend or speak?
BigData2012@accel.com
Stay on top of the latest big data news from Accel Partners by finding us on
facebook.com/Accel
@Accel_Partners
We ran a survey. 1400 people, 580 companies, 27 countries and 40 of the United States. More than 3/4 are first-timers at Hadoop World – Welcome! Nearly 3/4 are using Hadoop today. 2/3 technical, 1/3 business. And the new profession of data science is here in force!
One third each: less than one year, 1-2 years, more than two years. The average user here is more experienced than the average user at Hadoop World 2010 – 9 months.
Average cluster size has doubled in a year. More than half of you have pretty big clusters: more than 100 nodes. 202 PB represented on our survey. One company was 10% of that. More of you (12%) are above a petabyte than I would have guessed. But important: about 3/4 of you have less than 100TB in Hadoop.
Hadoop needed more: Load and share data. Query tools and ways to schedule and manage jobs. Fast record storage and retrieval. All of that is available from the Apache ecosystem.
In 2006 and 2007, all the work was on core Hadoop. In 2008, the ecosystem began to diversify. Today, nearly 70% of all new contributions are to surrounding projects, only 31% to Hadoop itself. That is what you would expect as the platform has matured.
Hadoop in production is just one part of your data center. You need to monitor and manage it like other critical platforms.
What’s happening right now? Who’s doing what?
How are the services I depend on doing?
I need a high-level service view. Take storage. How is it performing? Latency? Throughput? What’s happening?
Who’s consuming storage? Am I close to capacity? How do I make sure users get what they need? How do I track their use?
Infrastructure is long-lived. I need to add, remove, retire hardware. I can’t shut down the system.
Move between high-level view and detail. HDFS is a service, but it runs on lots of servers. I need to see both.
That’s just storage. Lots of other services: query tools, analytics and more. Complex, multi-tenant, mission-critical infrastructure. Integrate with data center operations.
Hadoop is not an island. It is part of your enterprise IT platform. We were right.
Pick your graph: Big data is a big deal. The platform is here today. The next 12 months will be about use cases. About tooling and apps. Let me show you some cool ones. These companies are all here today.
WibiData is Odiago’s core product: a platform for developing personalized applications with Hadoop and HBase.
WibiData provides both programmatic APIs for application development and an ODBC interface for easy integration with existing BI / reporting / analysis technology, plus libraries that make personalization quick and easy.
FoneDoktor is one such application, powered by WibiData.
FoneDoktor is free for consumers:
• Learn from your data
• Share with the community -> get more value from your data
• Available at fonedoktor.com
FoneDoktor is available to partners (carriers and OEMs):
• Lower device return rates
• Lower support volume
• Measure device / network performance
WibiData + FoneDoktor deep dive in Aaron and Garrett’s talk. Check it out!
Need self-service tools for behavioral analytics. Interactive, visual tools for business users to explore data themselves. Cetas provides real-time, interactive analytics. Automatically discover and highlight clusters and trends in data. Mask complexity, deliver big data analysis to business users.
R is a statistical language for developing advanced analytics. With Hadoop, R can explore all the data: no sampling, no subsetting. The R language runs under MapReduce. The statistician focuses on analysis, not Hadoop. Fraud and risk analysis. Portfolio optimization. Anything you can model in R.
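To make the "no sampling, no subsetting" point concrete, here is a minimal sketch of the MapReduce pattern for a statistic computed over all of the data. It is written in plain Python rather than R or Hadoop, and the function names and the toy partitions are illustrative assumptions, not part of any product shown here. Each partition is summarized independently (the map step), and the partial summaries are merged (the reduce step), so no single machine needs to hold the full data set.

```python
from functools import reduce


def map_partition(values):
    """Map step: summarize one partition as (count, sum, sum of squares)."""
    return (len(values), sum(values), sum(v * v for v in values))


def combine(a, b):
    """Reduce step: merge two partial summaries element-wise."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2])


def mean_and_variance(partitions):
    """Compute the mean and population variance across all partitions."""
    n, total, total_sq = reduce(combine, map(map_partition, partitions), (0, 0.0, 0.0))
    mean = total / n
    variance = total_sq / n - mean * mean
    return mean, variance


# Hypothetical toy data: three partitions standing in for blocks of a large file.
partitions = [[1.0, 2.0], [3.0], [4.0, 5.0, 6.0]]
mean, variance = mean_and_variance(partitions)
print(mean, variance)
```

Because the partial summaries are merged by simple addition, the result does not depend on how the data is split across partitions, which is what lets the analysis cover every record instead of a sample.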
Validated by customers in the US Army and intelligence space. Operates on key enterprise information (financial intelligence, risk, and patents). Combines enterprise data with public sources. Structured, semi-structured and complex. Discovers and shows connections, relationships among entities.
Enterprise Performance Management. Key metrics, trends, analyses: plan, budget, forecast. Hadoop for trending, diverse data sources, external and internal. With drill-down. Aimed at busy execs who need clear insight and overview. iPad, iPhone applications.
It’s getting crowded in here! Companies contributing to Hadoop, integrating with it or building on top. Sign of a big, robust market. But these aren’t the only people who have spotted the opportunity in big data. I’d like to bring up Ping Li from Accel Partners with an exciting announcement.
Hadoop as the hub. Catch, process, summarize the firehose. Integrate with new and existing platforms for special-purpose workloads. Already happening.
Three years talking speeds and feeds. The story for the future is value: business problems and solutions built on big data.