The document discusses Databricks, a hosted platform for processing big data using Apache Spark. It notes that over the past year, more than 5,000 people have been trained on introductory Spark courses. Databricks aims to alleviate challenges around data scientist scarcity by making big data processing simpler. The platform provides a managed Spark cluster, notebooks, dashboards, and integration with third-party tools to simplify tasks from data ingestion to production. Since its initial unveiling in June 2014, over 150 organizations have adopted Databricks to help improve products, speed time to market, and increase access to data.
Spark Summit 2015 keynote: Making Big Data Simple with Spark
1. Making Big Data Simple with Spark
Ion Stoica and Ali Ghodsi
June 15, 2015
2. More than 5,000 people trained over past year
Alleviating Data Scientist Scarcity Challenge
“Intro to Big Data with Apache Spark”
• Anthony Joseph, UC Berkeley
• Started June 1st
“Scalable Machine Learning”
• Ameet Talwalkar, UCLA
• To start July 5th
3. More than 5,000 people trained over past year
Alleviating Data Scientist Scarcity Challenge
“Intro to Big Data with Apache Spark”
• Anthony Joseph, UC Berkeley
• Started June 1st, over 64K registered students
“Scalable Machine Learning”
• Ameet Talwalkar, UCLA
• To start July 5th, over 26K registered students
4. Spark Significantly Simplifies Big Data Processing
Fast • Expressive • General
• Spark Core: Python, Java, Scala, R
• Spark Streaming: real-time
• Spark SQL: interactive
• MLlib: machine learning
• GraphX: graph
• …
5. But Big Data Processing Remains Complex...
• Still need to set up and manage your own Spark cluster
• Still more complex to operate than existing single-node tools (R, Python)
6. Databricks Truly Makes Big Data Simple
A hosted end-to-end platform from ingest to production
Cluster Manager
Notebooks • Dashboards • Jobs • Third-Party Apps
7. Databricks: The Journey Thus Far
June 2014: Unveiling
• Over 3,500 sign-ups
November 2014: Limited Availability
Today
• Over 150 organizations using Databricks
8. What can Databricks and Spark do for organizations?
Better products
• Update customers’ databases weekly instead of monthly
Faster time to market
• Create new products in 3 weeks rather than 2 months
Democratize data access within enterprises
• Increase number of data analysts by 4x and number of data projects by 6x
10. Key Areas of Focus
1. Ease of use
• Increase user productivity
2. Integration with existing (small and big) data tools
• Make non-Spark experts instantly productive
3. Security
• Enable mission-critical applications
11. Ease of Use
• Cluster manager with multiple Spark versions
• From notebooks to dashboards and jobs with just a few clicks
• Launch and monitor jobs, including streaming
Notebooks • Dashboards • Jobs