Spark graphx
Carol McDonald
Introduction to Apache Spark GraphX
1.
® © 2016 MapR Technologies
Spark GraphX (title slide, © 2017 MapR Technologies)
2.
Learning Goals
• Describe GraphX
• Define Regular, Directed, and Property Graphs
• Create a Property Graph
• Perform Operations on Graphs
4.
What is a Graph?
Graph: vertices connected by edges
[Figure: vertices joined by an edge]
5.
What is a Graph?
A graph is a set of vertices connected by edges.
[Figure: vertices DFW and ATL joined by an edge; relationship: distance]
6.
Graphs are Essential to Data Mining and Machine Learning
• Identify influential entities (people, information…)
• Find communities
• Understand people's shared interests
• Model complex data dependencies
7.
Real World Graphs
• Web pages
Reference: Spark GraphX in Action
10.
Real World Graphs
Reference: Spark GraphX in Action
15.
Real World Graphs
• Recommendations
[Figure: users connected to items by ratings]
16.
Real World Graphs
• Credit card application fraud
Reference: Spark Summit
17.
Real World Graphs
• Credit card fraud
18.
Finding Communities
Count triangles passing through each vertex:
• Measures the "cohesiveness" of the local community
• More triangles: stronger community
• Fewer triangles: weaker community
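The triangle-counting idea on this slide can be tried with GraphX's built-in `triangleCount` operator. The sketch below is illustrative only: the four-vertex graph, the application name, and the local master setting are assumptions, not part of the original deck.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph, PartitionStrategy}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("triangle-sketch").setMaster("local[*]"))

// Hypothetical toy graph: vertices 1, 2, 3 form a triangle; vertex 4 hangs off vertex 3.
val edges = sc.parallelize(Seq(
  Edge(1L, 2L, 1), Edge(2L, 3L, 1), Edge(1L, 3L, 1), Edge(3L, 4L, 1)))

// Older GraphX releases expect srcId < dstId and a partitioned graph
// before triangleCount is called, so the sketch satisfies both.
val graph = Graph.fromEdges(edges, defaultValue = 0)
  .partitionBy(PartitionStrategy.RandomVertexCut)

// Number of triangles passing through each vertex.
val trianglesPerVertex: Map[Long, Int] =
  graph.triangleCount().vertices.collect().toMap

trianglesPerVertex.toSeq.sortBy(_._1).foreach(println)
```

Vertices 1, 2, and 3 each sit on one triangle, while vertex 4 sits on none, which matches the slide's reading of vertex 4 as belonging to a weaker community.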
19.
Real World Graphs
• Healthcare
20.
Predicting User Behavior
[Figure: graph of posts and users; some users are labeled Liberal or Conservative, the rest are unknown (?)]
Conditional Random Field, Belief Propagation
21.
Enable Joining Tables and Graphs
[Figure labels: User Data, Product Ratings, ETL, Friend Graph, Product Rec. Graph, Join, Inf., Prod. Rec.; grouped into Tables and Graphs]
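One way to join a table onto a graph in GraphX is `outerJoinVertices`, which merges an RDD keyed by vertex ID into the vertex properties. The following is a minimal sketch under stated assumptions: the user names, the friendship edges, and the `ratingCounts` table are hypothetical and do not come from the deck.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("table-graph-join-sketch").setMaster("local[*]"))

// Hypothetical friend graph: vertex property is a user name, edge property a label.
val users = sc.parallelize(Seq((1L, "Carol"), (2L, "Bob"), (3L, "Oprah")))
val relations = sc.parallelize(Seq(Edge(1L, 2L, "friend"), Edge(1L, 3L, "follows")))
val friendGraph = Graph(users, relations, "unknown")

// Hypothetical table keyed by user ID: number of product ratings per user.
val ratingCounts = sc.parallelize(Seq((1L, 12), (2L, 3)))

// Join the table onto the graph; users missing from the table get 0 ratings.
val enriched = friendGraph.outerJoinVertices(ratingCounts) {
  (_, name, countOpt) => (name, countOpt.getOrElse(0))
}

val result: Map[Long, (String, Int)] = enriched.vertices.collect().toMap
result.toSeq.sortBy(_._1).foreach(println)
```

The result is still a graph, so graph operators can run on the enriched vertices without copying the data into a separate system.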
22.
Table and Graph Analytics
23.
What is GraphX?
Spark SQL
• Structured data
• Querying with SQL/HQL
• DataFrames
Spark Streaming
• Processing of live streams
• Micro-batching
MLlib
• Machine learning
• Multiple types of ML algorithms
GraphX
• Graph processing
• Graph-parallel computations
Spark Core
• RDD transformations and actions
• Task scheduling
• Memory management
• Fault recovery
• Interacting with storage systems
24.
Apache Spark GraphX
• Spark component for graphs and graph-parallel computations
• Combines data-parallel and graph-parallel processing in a single API
• View data as graphs and as collections (RDD), with no duplication or movement of data
• Operations for graph computation, including an optimized version of Pregel
• Provides graph algorithms and builders
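The "graph view and collection view" point can be sketched as follows. The airport and route values are the ones used in the flight example of this deck; the application name and local master setting are assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("graph-views-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400)))
val graph = Graph(airports, routes, "nowhere")

// Collection view: graph.edges is an RDD, so ordinary RDD operations apply.
val totalMiles: Int = graph.edges.map(_.attr).reduce(_ + _)

// Graph view: subgraph filters edges while keeping the graph structure.
val longHaul = graph.subgraph(epred = triplet => triplet.attr > 1000)
val longHaulRoutes: Long = longHaul.numEdges

println(s"total miles: $totalMiles, routes over 1000 miles: $longHaulRoutes")
```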
26.
Regular Graphs vs Directed Graphs
[Figure: vertices Carol and Bob joined by an edge; relationship: Friends]
• Regular graph: each vertex has the same number of edges
• Example: Facebook friends
  – Bob is a friend of Carol
  – Carol is a friend of Bob
27.
Regular Graphs vs Directed Graphs
[Figure: vertices including Carol and Oprah joined by directed edges; relationship: follows]
• Directed graph: edges have a direction
• Example: Twitter followers
  – Carol follows Oprah
  – Oprah does not follow Carol
28.
Property Graph
[Figure: vertices LAX and SJC connected by edges Flight 123 and Flight 1002; vertices and edges carry properties]
29.
Flight Example with GraphX
[Figure: vertices SFO, ORD, DFW; edges of 1800, 800, and 1400 miles]
Originating Airport    Destination Airport    Distance
SFO                    ORD                    1800 miles
ORD                    DFW                    800 miles
DFW                    SFO                    1400 miles
30.
Flight Example with GraphX
Vertex Table
Id    Property
1     SFO
2     ORD
3     DFW
Edge Table
SrcId    DestId    Property
1        2         1800
2        3         800
3        1         1400
31.
Spark Property Graph class
class Graph[VD, ED] {
  val vertices: VertexRDD[VD]
  val edges: EdgeRDD[ED]
}
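The type parameters `VD` and `ED` are the vertex and edge property types. As a rough illustration, the sketch below builds a `Graph[Airport, Route]`; the `Airport` and `Route` case classes, the application name, and the local master setting are hypothetical names introduced here, not part of the deck.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}
import org.apache.spark.rdd.RDD

// Hypothetical property types: VD = Airport, ED = Route.
case class Airport(code: String)
case class Route(distanceMiles: Int)

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("property-types-sketch").setMaster("local[*]"))

val airports: RDD[(Long, Airport)] = sc.parallelize(Seq(
  (1L, Airport("SFO")), (2L, Airport("ORD")), (3L, Airport("DFW"))))
val routes: RDD[Edge[Route]] = sc.parallelize(Seq(
  Edge(1L, 2L, Route(1800)), Edge(2L, 3L, Route(800)), Edge(3L, 1L, Route(1400))))

// The third argument is the default property for vertices that appear
// only in the edge list.
val graph: Graph[Airport, Route] = Graph(airports, routes, Airport("nowhere"))

val codes: Seq[String] =
  graph.vertices.map { case (_, airport) => airport.code }.collect().sorted.toSeq
val distances: Seq[Int] =
  graph.edges.map(_.attr.distanceMiles).collect().sorted.toSeq

println(codes.mkString(", "))
println(distances.mkString(", "))
```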
33.
Create a Property Graph
Step 1: Import required classes
Step 2: Create vertex RDD
Step 3: Create edge RDD
Step 4: Create graph
34.
Create a Property Graph
Step 1: Import required classes
import org.apache.spark._
import org.apache.spark.graphx._
import org.apache.spark.rdd.RDD
35.
Create a Property Graph: Data Set
Vertices: Airports
Vertex ID    Property (V)
Id (Long)    Name (String)
Edges: Routes
Source ID    Dest ID    Property (E)
Id           Id         Distance (Integer)
36.
Create a Property Graph
Step 2: Create vertex RDD
// create vertices RDD with ID and Name
val vertices = Array((1L, "SFO"), (2L, "ORD"), (3L, "DFW"))
val vRDD = sc.parallelize(vertices)
vRDD.take(1)
// Array((1,SFO))
Id    Property
1     SFO
2     ORD
3     DFW
37.
Create a Property Graph
Step 3: Create edge RDD
// create routes RDD with srcid, destid, distance
val edges = Array(Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400))
val eRDD = sc.parallelize(edges)
eRDD.take(2)
// Array(Edge(1,2,1800), Edge(2,3,800))
SrcId    DestId    Property
1        2         1800
2        3         800
3        1         1400
38.
Create a Property Graph
Step 4: Create graph
// define default vertex nowhere
val nowhere = "nowhere"
// build initial graph from the vertex RDD and edge RDD
val graph = Graph(vRDD, eRDD, nowhere)
graph.vertices.take(3).foreach(print)
// (2,ORD)(1,SFO)(3,DFW)
graph.edges.take(3).foreach(print)
// Edge(1,2,1800)Edge(2,3,800)Edge(3,1,1400)
40.
Graph Operators
To answer questions such as:
• How many airports are there?
• How many flight routes are there?
• What are the longest distance routes?
• Which airport has the most incoming flights?
• What are the top 10 flights?
41.
Graph Class
42.
Graph Operators
To find information about the graph
Operator       Description
numEdges       Number of edges (Long)
numVertices    Number of vertices (Long)
inDegrees      The in-degree of each vertex (VertexRDD[Int])
outDegrees     The out-degree of each vertex (VertexRDD[Int])
degrees        The degree of each vertex (VertexRDD[Int])
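A short sketch of the operators in this table, using the deck's three airports. The fourth route (ORD to SFO), the application name, and the local master setting are assumptions added so that the degrees differ between vertices.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("degree-operators-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400),
  Edge(2L, 1L, 1800))) // hypothetical extra route ORD -> SFO
val graph = Graph(airports, routes, "nowhere")

val numAirports: Long = graph.numVertices
val numRoutes: Long = graph.numEdges

// Each degree operator returns a VertexRDD[Int] keyed by vertex ID.
val inDeg: Map[Long, Int] = graph.inDegrees.collect().toMap
val outDeg: Map[Long, Int] = graph.outDegrees.collect().toMap
val deg: Map[Long, Int] = graph.degrees.collect().toMap

println(s"airports=$numAirports routes=$numRoutes")
println(s"in=$inDeg out=$outDeg total=$deg")
```

Note that vertices with no matching edges are omitted from the corresponding degree RDD rather than reported with a degree of 0.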
43.
Graph Operators
// How many airports?
val numairports = graph.numVertices
// Long = 3
// How many routes?
val numroutes = graph.numEdges
// Long = 3
// routes > 1000 miles distance?
graph.edges.filter { case Edge(org_id, dest_id, distance) => distance > 1000 }.take(3)
// Array(Edge(1,2,1800), Edge(3,1,1400))
44.
Triplets
// Triplets add source and destination properties to Edges
graph.triplets.take(3).foreach(println)
((1,SFO),(2,ORD),1800)
((2,ORD),(3,DFW),800)
((3,DFW),(1,SFO),1400)
45.
Triplets
What are the longest routes?
((1,SFO),(2,ORD),1800)
((2,ORD),(3,DFW),800)
((3,DFW),(1,SFO),1400)
// print out longest routes
graph.triplets.sortBy(_.attr, ascending = false)
  .map(triplet => "Distance " + triplet.attr.toString + " from " + triplet.srcAttr + " to " + triplet.dstAttr)
  .collect.foreach(println)
Distance 1800 from SFO to ORD
Distance 1400 from DFW to SFO
Distance 800 from ORD to DFW
46.
Graph Operators
Which airport has the most incoming flights? (real dataset)
// Define a function to compute the highest degree vertex
def max(a: (VertexId, Int), b: (VertexId, Int)): (VertexId, Int) = {
  if (a._2 > b._2) a else b
}
// Which airport has the most incoming flights?
val maxInDegree: (VertexId, Int) = graph.inDegrees.reduce(max)
// (10397,152) ATL
47.
Graph Operators
Which 3 airports have the most incoming flights? (real dataset)
// get top 3
// airportMap is not defined on the slide; it is assumed to map a vertex ID to its airport code
val maxIncoming = graph.inDegrees.collect
  .sortWith(_._2 > _._2)
  .map(x => (airportMap(x._1), x._2)).take(3)
maxIncoming.foreach(println)
(ATL,152)
(ORD,145)
(DFW,143)
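As an alternative to collecting every in-degree to the driver and looking names up in a separate map, the in-degrees can be joined with the graph's own vertex properties and ranked on the cluster. This is a sketch under assumptions, not the deck's method: the toy graph, the extra ORD to SFO route, the application name, and the local master setting are illustrative.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("top-in-degree-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400),
  Edge(2L, 1L, 1800))) // hypothetical extra route ORD -> SFO
val graph = Graph(airports, routes, "nowhere")

// Join each in-degree with the airport code stored on the vertex,
// then take the single highest in-degree.
val busiest: Array[(String, Int)] = graph.inDegrees
  .innerJoin(graph.vertices) { (_, inDegree, code) => (code, inDegree) }
  .map { case (_, codeAndDegree) => codeAndDegree }
  .top(1)(Ordering.by((pair: (String, Int)) => pair._2))

busiest.foreach(println)
```

Changing `top(1)` to `top(3)` would give the three busiest airports; only the requested rows are returned to the driver.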
48.
Graph Operators
Caching Graphs
Operator                       Description
cache()                        Caches the vertices and edges; default level is MEMORY_ONLY
persist(newLevel)              Caches the vertices and edges at the specified storage level; returns a reference to this graph
unpersist(blocking)            Uncaches both vertices and edges of this graph
unpersistVertices(blocking)    Uncaches only the vertices, leaving edges alone
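A minimal sketch of how the caching operators might be used around repeated queries. The toy graph, the application name, and the local master setting are assumptions.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.graphx.{Edge, Graph}

// Assumption: a local SparkContext; in spark-shell the existing context is reused.
val sc = SparkContext.getOrCreate(
  new SparkConf().setAppName("graph-caching-sketch").setMaster("local[*]"))

val airports = sc.parallelize(Seq((1L, "SFO"), (2L, "ORD"), (3L, "DFW")))
val routes = sc.parallelize(Seq(
  Edge(1L, 2L, 1800), Edge(2L, 3L, 800), Edge(3L, 1L, 1400)))
val graph = Graph(airports, routes, "nowhere")

// Cache before running several actions so the graph is not rebuilt each time.
// persist(newLevel) would be used instead to pick a different storage level.
graph.cache()

val numAirports: Long = graph.numVertices // first action materializes the cache
val numRoutes: Long = graph.numEdges      // later actions reuse cached data

// Release the cached vertices and edges once they are no longer needed.
graph.unpersist(blocking = true)

println(s"airports=$numAirports routes=$numRoutes")
```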
49.
Graph Class
50.
Class Discussion

1. How many airports are there?
• In our graph, what represents airports?
• Which operator could you use to find the number of airports?
2. How many routes are there?
• In our graph, what represents routes?
• Which operator could you use to find the number of routes?
51.
How Many Airports Are There?

• In our graph, what represents airports? Vertices
• Which operator could you use to find the number of airports? graph.numVertices
52.
Pregel API

• GraphX exposes a variant of the Pregel API
• Iterative graph processing
– Iterations of message passing between vertices
53.
The Graph-Parallel Abstraction

A user-defined vertex program runs on each graph vertex
• Vertex programs communicate using messages (e.g. Pregel)
• Parallelism: multiple vertex programs run simultaneously
54.
Pregel Operator

[Diagram: three columns of supersteps]
• Superstep 1: initial message received at each vertex; message computed at each vertex
• Superstep 2: sum of messages received at each vertex; message computed at each vertex
• Superstep n: sum of messages received at each vertex; message computed at each vertex

Loop until no messages are left OR the maximum number of iterations is reached
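The superstep loop above can be sketched in plain Scala on in-memory collections. This is a simplified, single-machine illustration under stated assumptions, not GraphX's implementation: `PregelSketch`, its `Edge` and `Triplet` case classes, and the rule that only edges touching a vertex that just received a message are revisited are all choices made for this sketch. The toy data is the three-airport graph with distances as edge weights.

```scala
object PregelSketch {
  // Hypothetical stand-ins for illustration; these are not the GraphX types.
  final case class Edge(src: Long, dst: Long, attr: Double)
  final case class Triplet(src: Long, srcAttr: Double, dst: Long, dstAttr: Double, attr: Double)

  def pregel(
      vertices: Map[Long, Double],
      edges: Seq[Edge],
      initialMsg: Double,
      maxIterations: Int = Int.MaxValue)(
      vprog: (Long, Double, Double) => Double,
      sendMsg: Triplet => Iterator[(Long, Double)],
      mergeMsg: (Double, Double) => Double): Map[Long, Double] = {

    // Compute the messages for the next superstep and merge those aimed at the same vertex.
    def collect(current: Map[Long, Double], active: Set[Long]): Map[Long, Double] =
      edges.iterator
        .filter(e => active(e.src) || active(e.dst))
        .flatMap(e => sendMsg(Triplet(e.src, current(e.src), e.dst, current(e.dst), e.attr)))
        .toList
        .groupBy(_._1)
        .map { case (id, msgs) => id -> msgs.map(_._2).reduce(mergeMsg) }

    // First superstep: every vertex receives the initial message.
    var state = vertices.map { case (id, attr) => id -> vprog(id, attr, initialMsg) }
    var messages = collect(state, state.keySet)
    var i = 0

    // Loop until no messages are left or the iteration limit is reached.
    while (messages.nonEmpty && i < maxIterations) {
      state = state.map { case (id, attr) =>
        id -> messages.get(id).map(m => vprog(id, attr, m)).getOrElse(attr)
      }
      messages = collect(state, messages.keySet)
      i += 1
    }
    state
  }

  // Shortest distance from vertex 1 (SFO) on the toy graph.
  val shortestFromSfo: Map[Long, Double] =
    pregel(
      Map(1L -> 0.0, 2L -> Double.PositiveInfinity, 3L -> Double.PositiveInfinity),
      Seq(Edge(1L, 2L, 1800.0), Edge(2L, 3L, 800.0), Edge(3L, 1L, 1400.0)),
      Double.PositiveInfinity)(
      (id, dist, msg) => math.min(dist, msg),
      t => if (t.srcAttr + t.attr < t.dstAttr) Iterator((t.dst, t.srcAttr + t.attr)) else Iterator.empty,
      (a, b) => math.min(a, b))

  def main(args: Array[String]): Unit =
    shortestFromSfo.toSeq.sortBy(_._1).foreach(println)
}
```

The three function arguments play the same roles as in the GraphX operator: a vertex program, a send-message function over triplets, and a merge function for messages arriving at the same vertex.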
55.
Pregel Operator: Example

Use Pregel to find the cheapest airfare:

// starting vertex
val sourceId: VertexId = 13024

// a graph with edges containing the airfare cost calculation
val gg = graph.mapEdges(e => 50.toDouble + e.attr.toDouble / 20)

// initialize graph; all vertices except the source have distance infinity
val initialGraph = gg.mapVertices((id, _) =>
  if (id == sourceId) 0.0 else Double.PositiveInfinity)
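As a quick check of the cost model used in `mapEdges`, the following plain-Scala sketch evaluates the same formula (a flat 50 plus distance divided by 20) for the three distances in the toy graph. The object and function names are hypothetical, and treating the distance as miles follows the earlier "routes > 1000 miles" slide.

```scala
object AirfareSketch {
  // Cost model from the slide: a flat 50 plus distance / 20.
  def airfare(distance: Double): Double = 50.0 + distance / 20.0

  def main(args: Array[String]): Unit = {
    // Distances of the three routes in the toy graph.
    Seq(1800, 800, 1400).foreach(d => println(s"$d miles -> ${airfare(d)}"))
  }
}
```

For example, an 1800-mile route costs 50 + 1800 / 20 = 140.0 under this model.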
56.
Graph Class: Pregel
57.
Pregel Operator: Example

Use Pregel to find the cheapest airfare:

// call pregel on graph
val sssp = initialGraph.pregel(Double.PositiveInfinity)(
  // Vertex Program
  (id, distCost, newDistCost) => math.min(distCost, newDistCost),
  // Send Message
  triplet => {
    if (triplet.srcAttr + triplet.attr < triplet.dstAttr) {
      Iterator((triplet.dstId, triplet.srcAttr + triplet.attr))
    } else {
      Iterator.empty
    }
  },
  // Merge Message
  (a, b) => math.min(a, b)
)
58.
Pregel Operator: Example

Use Pregel to find the cheapest airfare:

// routes, lowest flight cost
println(sssp.edges.take(4).mkString("\n"))
Edge(10135,10397,84.6)
Edge(10135,13930,82.7)
Edge(10140,10397,113.45)
Edge(10140,10821,133.5)
59.
PageRank

• Measures the importance of vertices in a graph
• Incoming links are votes
• Incoming links from important vertices count for more
• Returns a graph with the ranks as vertex attributes

graph.pageRank(tolerance).vertices
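The "links are votes" idea can be sketched in plain Scala as a simple iteration that stops once no rank changes by more than the tolerance. This is a simplified illustration, not GraphX's implementation: `PageRankSketch`, the initial rank of 1.0, and the example graph are assumptions made for the sketch, and the reset probability of 0.15 is a parameter of the sketch.

```scala
object PageRankSketch {
  def pageRank(
      vertices: Set[Long],
      edges: Seq[(Long, Long)],
      tolerance: Double,
      resetProb: Double = 0.15): Map[Long, Double] = {
    require(vertices.nonEmpty, "the sketch assumes at least one vertex")

    // Number of outgoing edges per source vertex.
    val outDegree: Map[Long, Int] = edges.groupBy(_._1).map { case (src, es) => src -> es.size }

    var ranks: Map[Long, Double] = vertices.map(v => v -> 1.0).toMap
    var delta = Double.PositiveInfinity

    while (delta > tolerance) {
      // Each vertex splits its current rank evenly across its outgoing edges (its "votes").
      val contributions: Map[Long, Double] = edges
        .map { case (src, dst) => dst -> ranks(src) / outDegree(src) }
        .groupBy(_._1)
        .map { case (dst, cs) => dst -> cs.map(_._2).sum }

      // New rank: a base amount plus the damped sum of incoming votes.
      val next: Map[Long, Double] = vertices.map { v =>
        v -> (resetProb + (1.0 - resetProb) * contributions.getOrElse(v, 0.0))
      }.toMap

      // Largest change of any single rank in this iteration.
      delta = vertices.map(v => math.abs(next(v) - ranks(v))).max
      ranks = next
    }
    ranks
  }

  // Hypothetical example: vertices 1 and 2 both link to 3, and 3 links back to 1.
  val example: Map[Long, Double] =
    pageRank(Set(1L, 2L, 3L), Seq((1L, 3L), (2L, 3L), (3L, 1L)), tolerance = 0.001)

  def main(args: Array[String]): Unit =
    example.toSeq.sortBy(-_._2).foreach(println)
}
```

In the example, vertex 3 ends up ranked highest because it receives two votes, vertex 1 comes next because its single vote comes from the important vertex 3, and vertex 2 has no incoming links and stays at the base rank.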
60.
PageRank: Example

Use PageRank:

// use pageRank
val ranks = graph.pageRank(0.1).vertices

// join the ranks with the map of airport id to name
val temp = ranks.join(airports)
temp.take(1)
// Array((15370,(0.5365013694244737,TUL)))

// sort by ranking
val temp2 = temp.sortBy(_._2._1, false)
temp2.take(2)
// Array((10397,(5.431032677813346,ATL)), (13930,(5.4148119418905765,ORD)))

// get just the airport names
val impAirports = temp2.map(_._2._2)
impAirports.take(4)
// res6: Array[String] = Array(ATL, ORD, DFW, DEN)
61.
Use Case

• Monitor air traffic at airports
• Monitor delays
• Analyze airports and routes overall
• Analyze airports and routes by airline
62.
Learn More

• https://www.mapr.com/blog/how-get-started-using-apache-spark-graphx-scala
• GraphX Programming Guide http://spark.apache.org/docs/latest/graphx-programming-guide.html
• MapR announces Free Complete Apache Spark Training and Developer Certification https://www.mapr.com/company/press-releases/mapr-unveils-free-complete-apache-spark-training-and-developer-certification
• Free Spark On Demand Training http://learn.mapr.com/?q=spark#-l
• Get Certified on Spark with MapR Spark Certification http://learn.mapr.com/?q=spark#certification-1,-l
63.
MapR Converged Data Platform

[Diagram labels: Open Source Engines & Tools; Commercial Engines & Applications; Custom Apps; Cloud and Managed Services; Search and Others; Data Processing; Enterprise-Grade Platform Services; Web-Scale Storage; Database; Event Streaming; MapR-FS; MapR-DB; MapR Streams; HDFS API; POSIX, NFS; HBase API; OJAI API; Kafka API; Real Time; Unified Security; Multi-tenancy; Disaster Recovery; Global Namespace; High Availability; Unified Management and Monitoring]