SlideShare una empresa de Scribd logo
1 de 20
SUBMITTED TO: SUBMITTED BY:
Mrs. Suman singh Nikita Vijay
(HOD of CSE Dept.) B. Tech –VIII sem(CSE)
A SEMINAR PRESENTATION ON
“Introduction To Apache Spark”
● Need of new generation distributed system
● Hardware/software evolution in last decade
● Apache Spark
• Components of Apache Spark
● Why Spark?
● Who are using Spark?
Agenda
● Lot has been changed from 2000
● Both hardware and software gone through changes
● Big data has become necessity now
● Let’s look at what changed over decade
Why we need new generation?
● Disk was cheap so disk was primary source of data
● Network was costly so data locality
● RAM was very costly
● Single core machines were dominant
RAM is the king
• RAM is primary source of data and we use disk for
fallback
● Network is speedier
● Multi core machines are commonplace
State of hardware in 2000
Now
● Object orientation was the king
● Software optimized for single core
● No open frameworks for creating
○ Distributed storage
○ Distributed processing
● SQL was the only dominant way for data analysis
Now
•Functional programming is on rise
● Software needs to exploit multiple cores on single node
There are good frameworks to create distributed systems
○ HDFS for storage
● NoSQL is real alternative now
Software in 2000
● Very few companies had big data issue
● Batch processing system ruled the world
● Volume was big concern compare to velocity
● Mostly used for
○ Search
○ Log analysis
● All companies use big data
● Velocity is as much concern as volume
Needs of real time are as much important as batch
processing
Big Data processing needs in 2000
NOW
• A fast and general engine for large scale data
processing
• Created by AMPLab
• Written in Scala
•Licensed under Apache
Apache Spark
Spark streaming
graphX
MLlib
Apache sql
Benefits of a Unified Platform
• No copying of data between systems
•Combine processing types in one program
• Code reuse
• One system to learn
• One system to maintain
Mesos, a distributed system framework as class project
in UC Berkeley in 2009.
● Spark to test how mesos works
● Focused on
○ Iterative programs (ML)
○ Unifying real time and batch processing
● Open sourced in 2010
History of Apache Spark
● You can spark on top any distributed system
● It can run on
○ Yarn
○ Apache Mesos
○ It’s own cluster
Runs everywhere
● Apache Spark is highly modular
The original version contained only 1600 lines of scala
code
● Apache Spark API is extremely simple compared Java
API of M/R
● API is concise and consistent
Small and Simple
Source : http://spark-summit.org/wp-content/uploads/2013/10/Zaharia-spark-summit-2013-
• In Spark, you can cache hdfs data in main memory of
worker nodes
• Spark analysis can be executed directly on in memory
data
● Shuffling also can be done from in memory
● Fault tolerant
In-memory aka Speed
● No separate storage layer
● Integrates well with HDFS
● Can run on Hadoop 1.0 and Hadoop 2.0 YARN
● Excellent integration with ecosystem projects like
Apache Hive, HBase etc
Integration with Hadoop
● Written in Scala but API is not limited to it
● Offers API in
○ Scala
○ Java
○ Python
● You can also do SQL using SparkSQL
Multi language API
Who are using Spark
seminar presentation on apache-spark
seminar presentation on apache-spark

Más contenido relacionado

La actualidad más candente

Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekVenkata Naga Ravi
 
Introduction to ML with Apache Spark MLlib
Introduction to ML with Apache Spark MLlibIntroduction to ML with Apache Spark MLlib
Introduction to ML with Apache Spark MLlibTaras Matyashovsky
 
Optimizing Apache Spark UDFs
Optimizing Apache Spark UDFsOptimizing Apache Spark UDFs
Optimizing Apache Spark UDFsDatabricks
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache SparkRahul Jain
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to sparkHome
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark FundamentalsZahra Eskandari
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop DataWorks Summit/Hadoop Summit
 
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...Edureka!
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technologydeepakmarndi
 
GraphFrames: Graph Queries In Spark SQL
GraphFrames: Graph Queries In Spark SQLGraphFrames: Graph Queries In Spark SQL
GraphFrames: Graph Queries In Spark SQLSpark Summit
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Sparkdatamantra
 
Spark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesSpark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesDatabricks
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonBenjamin Bengfort
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and visionStephan Ewen
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingCloudera, Inc.
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overviewDataArt
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning Dr. Swaminathan Kathirvel
 

La actualidad más candente (20)

Spark
SparkSpark
Spark
 
Processing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeekProcessing Large Data with Apache Spark -- HasGeek
Processing Large Data with Apache Spark -- HasGeek
 
Introduction to ML with Apache Spark MLlib
Introduction to ML with Apache Spark MLlibIntroduction to ML with Apache Spark MLlib
Introduction to ML with Apache Spark MLlib
 
Optimizing Apache Spark UDFs
Optimizing Apache Spark UDFsOptimizing Apache Spark UDFs
Optimizing Apache Spark UDFs
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Introduction to spark
Introduction to sparkIntroduction to spark
Introduction to spark
 
Apache Spark Fundamentals
Apache Spark FundamentalsApache Spark Fundamentals
Apache Spark Fundamentals
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop Real-time Hadoop: The Ideal Messaging System for Hadoop
Real-time Hadoop: The Ideal Messaging System for Hadoop
 
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
Spark SQL Tutorial | Spark Tutorial for Beginners | Apache Spark Training | E...
 
Hyper threading technology
Hyper threading technologyHyper threading technology
Hyper threading technology
 
GraphFrames: Graph Queries In Spark SQL
GraphFrames: Graph Queries In Spark SQLGraphFrames: Graph Queries In Spark SQL
GraphFrames: Graph Queries In Spark SQL
 
Introduction to Apache Spark
Introduction to Apache SparkIntroduction to Apache Spark
Introduction to Apache Spark
 
Apache Spark Core
Apache Spark CoreApache Spark Core
Apache Spark Core
 
Spark DataFrames and ML Pipelines
Spark DataFrames and ML PipelinesSpark DataFrames and ML Pipelines
Spark DataFrames and ML Pipelines
 
Fast Data Analytics with Spark and Python
Fast Data Analytics with Spark and PythonFast Data Analytics with Spark and Python
Fast Data Analytics with Spark and Python
 
Flink history, roadmap and vision
Flink history, roadmap and visionFlink history, roadmap and vision
Flink history, roadmap and vision
 
Introduction to Apache Spark Developer Training
Introduction to Apache Spark Developer TrainingIntroduction to Apache Spark Developer Training
Introduction to Apache Spark Developer Training
 
Apache Spark overview
Apache Spark overviewApache Spark overview
Apache Spark overview
 
FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning FPGA Hardware Accelerator for Machine Learning
FPGA Hardware Accelerator for Machine Learning
 

Similar a seminar presentation on apache-spark

Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architectureSohil Jain
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architectureSohil Jain
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache FlinkAKASH SIHAG
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupPhaneendra Chiruvella
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark ScalaKnoldus Inc.
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaItai Yaffe
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark exampleShidrokhGoudarzi1
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaDatabricks
 
Apache Spark for Beginners
Apache Spark for BeginnersApache Spark for Beginners
Apache Spark for BeginnersAnirudh
 
BlackRay - The open Source Data Engine
BlackRay - The open Source Data EngineBlackRay - The open Source Data Engine
BlackRay - The open Source Data Enginefschupp
 
Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014mahchiev
 
Introduction to Apache Beam
Introduction to Apache BeamIntroduction to Apache Beam
Introduction to Apache BeamKnoldus Inc.
 
.NET per la Data Science e oltre
.NET per la Data Science e oltre.NET per la Data Science e oltre
.NET per la Data Science e oltreMarco Parenzan
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaDataWorks Summit
 
The structured streaming upgrade to Apache Spark and how enterprises can bene...
The structured streaming upgrade to Apache Spark and how enterprises can bene...The structured streaming upgrade to Apache Spark and how enterprises can bene...
The structured streaming upgrade to Apache Spark and how enterprises can bene...Impetus Technologies
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009fschupp
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache SparkDona Mary Philip
 

Similar a seminar presentation on apache-spark (20)

Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
Spark introduction and architecture
Spark introduction and architectureSpark introduction and architecture
Spark introduction and architecture
 
Apache Spark vs Apache Flink
Apache Spark vs Apache FlinkApache Spark vs Apache Flink
Apache Spark vs Apache Flink
 
Apache Spark in Industry
Apache Spark in IndustryApache Spark in Industry
Apache Spark in Industry
 
Spark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark GroupSpark Streaming and MLlib - Hyderabad Spark Group
Spark Streaming and MLlib - Hyderabad Spark Group
 
Getting Started with Spark Scala
Getting Started with Spark ScalaGetting Started with Spark Scala
Getting Started with Spark Scala
 
Stream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and KafkaStream, stream, stream: Different streaming methods with Spark and Kafka
Stream, stream, stream: Different streaming methods with Spark and Kafka
 
spark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark examplespark example spark example spark examplespark examplespark examplespark example
spark example spark example spark examplespark examplespark examplespark example
 
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Apache Spark and Kafka
 
Apache Spark for Beginners
Apache Spark for BeginnersApache Spark for Beginners
Apache Spark for Beginners
 
BlackRay - The open Source Data Engine
BlackRay - The open Source Data EngineBlackRay - The open Source Data Engine
BlackRay - The open Source Data Engine
 
Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014Big Data Processing with Apache Spark 2014
Big Data Processing with Apache Spark 2014
 
Introduction to Apache Beam
Introduction to Apache BeamIntroduction to Apache Beam
Introduction to Apache Beam
 
Spark Workshop
Spark WorkshopSpark Workshop
Spark Workshop
 
.NET per la Data Science e oltre
.NET per la Data Science e oltre.NET per la Data Science e oltre
.NET per la Data Science e oltre
 
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and KafkaStream, Stream, Stream: Different Streaming Methods with Spark and Kafka
Stream, Stream, Stream: Different Streaming Methods with Spark and Kafka
 
The structured streaming upgrade to Apache Spark and how enterprises can bene...
The structured streaming upgrade to Apache Spark and how enterprises can bene...The structured streaming upgrade to Apache Spark and how enterprises can bene...
The structured streaming upgrade to Apache Spark and how enterprises can bene...
 
Spark Uber Development Kit
Spark Uber Development KitSpark Uber Development Kit
Spark Uber Development Kit
 
Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009Blackray @ SAPO CodeBits 2009
Blackray @ SAPO CodeBits 2009
 
An Introduction to Apache Spark
An Introduction to Apache SparkAn Introduction to Apache Spark
An Introduction to Apache Spark
 

Más de Jawhar Ali

seminar report on What is ransomware
seminar report on What is ransomwareseminar report on What is ransomware
seminar report on What is ransomwareJawhar Ali
 
seminar report on Sql injection
seminar report on Sql injectionseminar report on Sql injection
seminar report on Sql injectionJawhar Ali
 
seminar report on kingapp application
seminar report on kingapp applicationseminar report on kingapp application
seminar report on kingapp applicationJawhar Ali
 
seminar report on school management system
seminar report on school management systemseminar report on school management system
seminar report on school management systemJawhar Ali
 
seminar presentation on Face ricognition technology
seminar presentation on Face ricognition technologyseminar presentation on Face ricognition technology
seminar presentation on Face ricognition technologyJawhar Ali
 
seminar presentation on Digital Jwellery
seminar presentation on Digital Jwelleryseminar presentation on Digital Jwellery
seminar presentation on Digital JwelleryJawhar Ali
 
powerpoint presentation on sixth sense Technology
powerpoint presentation  on sixth sense Technologypowerpoint presentation  on sixth sense Technology
powerpoint presentation on sixth sense TechnologyJawhar Ali
 
Powerpoint presentation on 5G wireless technology
Powerpoint presentation on 5G wireless technologyPowerpoint presentation on 5G wireless technology
Powerpoint presentation on 5G wireless technologyJawhar Ali
 
powerpoint presentation on Google glass
powerpoint presentation on Google glasspowerpoint presentation on Google glass
powerpoint presentation on Google glassJawhar Ali
 
Table Of Contents Google Glass
Table Of Contents Google GlassTable Of Contents Google Glass
Table Of Contents Google GlassJawhar Ali
 
introduction and abstract on Google Glass Major report
introduction and abstract on  Google Glass Major reportintroduction and abstract on  Google Glass Major report
introduction and abstract on Google Glass Major reportJawhar Ali
 
Candidate declaration on Google Glass
Candidate declaration on Google GlassCandidate declaration on Google Glass
Candidate declaration on Google GlassJawhar Ali
 
front Page on Google Glass
 front Page on Google Glass front Page on Google Glass
front Page on Google GlassJawhar Ali
 
Table of contents on blood bank management system
Table of contents on blood bank management systemTable of contents on blood bank management system
Table of contents on blood bank management systemJawhar Ali
 
List of figures in Blood bank management system
List of figures in Blood bank management systemList of figures in Blood bank management system
List of figures in Blood bank management systemJawhar Ali
 
Full report on blood bank management system
Full report on  blood bank management systemFull report on  blood bank management system
Full report on blood bank management systemJawhar Ali
 
Cand declaration
Cand declaration Cand declaration
Cand declaration Jawhar Ali
 
Training report on web developing
Training report on web developingTraining report on web developing
Training report on web developingJawhar Ali
 
seminar report on wireless Sensor network
seminar report on wireless Sensor networkseminar report on wireless Sensor network
seminar report on wireless Sensor networkJawhar Ali
 
Cloud computing ppt
Cloud computing pptCloud computing ppt
Cloud computing pptJawhar Ali
 

Más de Jawhar Ali (20)

seminar report on What is ransomware
seminar report on What is ransomwareseminar report on What is ransomware
seminar report on What is ransomware
 
seminar report on Sql injection
seminar report on Sql injectionseminar report on Sql injection
seminar report on Sql injection
 
seminar report on kingapp application
seminar report on kingapp applicationseminar report on kingapp application
seminar report on kingapp application
 
seminar report on school management system
seminar report on school management systemseminar report on school management system
seminar report on school management system
 
seminar presentation on Face ricognition technology
seminar presentation on Face ricognition technologyseminar presentation on Face ricognition technology
seminar presentation on Face ricognition technology
 
seminar presentation on Digital Jwellery
seminar presentation on Digital Jwelleryseminar presentation on Digital Jwellery
seminar presentation on Digital Jwellery
 
powerpoint presentation on sixth sense Technology
powerpoint presentation  on sixth sense Technologypowerpoint presentation  on sixth sense Technology
powerpoint presentation on sixth sense Technology
 
Powerpoint presentation on 5G wireless technology
Powerpoint presentation on 5G wireless technologyPowerpoint presentation on 5G wireless technology
Powerpoint presentation on 5G wireless technology
 
powerpoint presentation on Google glass
powerpoint presentation on Google glasspowerpoint presentation on Google glass
powerpoint presentation on Google glass
 
Table Of Contents Google Glass
Table Of Contents Google GlassTable Of Contents Google Glass
Table Of Contents Google Glass
 
introduction and abstract on Google Glass Major report
introduction and abstract on  Google Glass Major reportintroduction and abstract on  Google Glass Major report
introduction and abstract on Google Glass Major report
 
Candidate declaration on Google Glass
Candidate declaration on Google GlassCandidate declaration on Google Glass
Candidate declaration on Google Glass
 
front Page on Google Glass
 front Page on Google Glass front Page on Google Glass
front Page on Google Glass
 
Table of contents on blood bank management system
Table of contents on blood bank management systemTable of contents on blood bank management system
Table of contents on blood bank management system
 
List of figures in Blood bank management system
List of figures in Blood bank management systemList of figures in Blood bank management system
List of figures in Blood bank management system
 
Full report on blood bank management system
Full report on  blood bank management systemFull report on  blood bank management system
Full report on blood bank management system
 
Cand declaration
Cand declaration Cand declaration
Cand declaration
 
Training report on web developing
Training report on web developingTraining report on web developing
Training report on web developing
 
seminar report on wireless Sensor network
seminar report on wireless Sensor networkseminar report on wireless Sensor network
seminar report on wireless Sensor network
 
Cloud computing ppt
Cloud computing pptCloud computing ppt
Cloud computing ppt
 

Último

A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformChameera Dedduwage
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...RKavithamani
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptxVS Mahajan Coaching Centre
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxheathfieldcps1
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpinRaunakKeshri1
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityGeoBlogs
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdfQucHHunhnh
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)eniolaolutunde
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfciinovamais
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdfSoniaTolstoy
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfJayanti Pande
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfchloefrazer622
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesFatimaKhan178732
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxmanuelaromero2013
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 

Último (20)

A Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy ReformA Critique of the Proposed National Education Policy Reform
A Critique of the Proposed National Education Policy Reform
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
Privatization and Disinvestment - Meaning, Objectives, Advantages and Disadva...
 
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions  for the students and aspirants of Chemistry12th.pptxOrganic Name Reactions  for the students and aspirants of Chemistry12th.pptx
Organic Name Reactions for the students and aspirants of Chemistry12th.pptx
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
The basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptxThe basics of sentences session 2pptx copy.pptx
The basics of sentences session 2pptx copy.pptx
 
Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1Código Creativo y Arte de Software | Unidad 1
Código Creativo y Arte de Software | Unidad 1
 
Student login on Anyboli platform.helpin
Student login on Anyboli platform.helpinStudent login on Anyboli platform.helpin
Student login on Anyboli platform.helpin
 
Paris 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activityParis 2024 Olympic Geographies - an activity
Paris 2024 Olympic Geographies - an activity
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
1029 - Danh muc Sach Giao Khoa 10 . pdf
1029 -  Danh muc Sach Giao Khoa 10 . pdf1029 -  Danh muc Sach Giao Khoa 10 . pdf
1029 - Danh muc Sach Giao Khoa 10 . pdf
 
Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)Software Engineering Methodologies (overview)
Software Engineering Methodologies (overview)
 
Activity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdfActivity 01 - Artificial Culture (1).pdf
Activity 01 - Artificial Culture (1).pdf
 
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdfBASLIQ CURRENT LOOKBOOK  LOOKBOOK(1) (1).pdf
BASLIQ CURRENT LOOKBOOK LOOKBOOK(1) (1).pdf
 
Web & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdfWeb & Social Media Analytics Previous Year Question Paper.pdf
Web & Social Media Analytics Previous Year Question Paper.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
Arihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdfArihant handbook biology for class 11 .pdf
Arihant handbook biology for class 11 .pdf
 
Separation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and ActinidesSeparation of Lanthanides/ Lanthanides and Actinides
Separation of Lanthanides/ Lanthanides and Actinides
 
How to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptxHow to Make a Pirate ship Primary Education.pptx
How to Make a Pirate ship Primary Education.pptx
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 

seminar presentation on apache-spark

  • 1. SUBMITTED TO: SUBMITTED BY: Mrs. Suman singh Nikita Vijay (HOD of CSE Dept.) B. Tech –VIII sem(CSE) A SEMINAR PRESENTATION ON “Introduction To Apache Spark”
  • 2. ● Need of new generation distributed system ● Hardware/software evolution in last decade ● Apache Spark • Components of Apache Spark ● Why Spark? ● Who are using Spark? Agenda
  • 3. ● Lot has been changed from 2000 ● Both hardware and software gone through changes ● Big data has become necessity now ● Let’s look at what changed over decade Why we need new generation?
  • 4. ● Disk was cheap so disk was primary source of data ● Network was costly so data locality ● RAM was very costly ● Single core machines were dominant RAM is the king • RAM is primary source of data and we use disk for fallback ● Network is speedier ● Multi core machines are commonplace State of hardware in 2000 Now
  • 5. ● Object orientation was the king ● Software optimized for single core ● No open frameworks for creating ○ Distributed storage ○ Distributed processing ● SQL was the only dominant way for data analysis Now •Functional programming is on rise ● Software needs to exploit multiple cores on single node There are good frameworks to create distributed systems ○ HDFS for storage ● NoSQL is real alternative now Software in 2000
  • 6. ● Very few companies had big data issue ● Batch processing system ruled the world ● Volume was big concern compare to velocity ● Mostly used for ○ Search ○ Log analysis ● All companies use big data ● Velocity is as much concern as volume Needs of real time are as much important as batch processing Big Data processing needs in 2000 NOW
  • 7. • A fast and general engine for large scale data processing • Created by AMPLab • Written in Scala •Licensed under Apache Apache Spark
  • 9.
  • 10. Benefits of a Unified Platform • No copying of data between systems •Combine processing types in one program • Code reuse • One system to learn • One system to maintain
  • 11. Mesos, a distributed system framework as class project in UC Berkeley in 2009. ● Spark to test how mesos works ● Focused on ○ Iterative programs (ML) ○ Unifying real time and batch processing ● Open sourced in 2010 History of Apache Spark
  • 12. ● You can spark on top any distributed system ● It can run on ○ Yarn ○ Apache Mesos ○ It’s own cluster Runs everywhere
  • 13. ● Apache Spark is highly modular The original version contained only 1600 lines of scala code ● Apache Spark API is extremely simple compared Java API of M/R ● API is concise and consistent Small and Simple
  • 15. • In Spark, you can cache hdfs data in main memory of worker nodes • Spark analysis can be executed directly on in memory data ● Shuffling also can be done from in memory ● Fault tolerant In-memory aka Speed
  • 16. ● No separate storage layer ● Integrates well with HDFS ● Can run on Hadoop 1.0 and Hadoop 2.0 YARN ● Excellent integration with ecosystem projects like Apache Hive, HBase etc Integration with Hadoop
  • 17. ● Written in Scala but API is not limited to it ● Offers API in ○ Scala ○ Java ○ Python ● You can also do SQL using SparkSQL Multi language API
  • 18. Who are using Spark