Unless otherwise indicated, these slides are © 2016 Pivotal Software, Inc. and licensed under a Creative Commons Attribution-NonCommercial license: http://creativecommons.org/licenses/by-nc/3.0/
Using Docker, Neo4j, and Spring Cloud for
Developing Microservices
Kenny Bastani, Spring Developer Advocate, Pivotal
@kennybastani
https://github.com/kbastani/spring-boot-graph-processing-example
Speaker Intro - Kenny Bastani
Ranking Twitter Profiles
Using PageRank
PageRank algorithm
Tools we’ll be using
• Spring Boot
• Neo4j
• Apache Spark
• Docker
• RabbitMQ
Containerize all the things!
Connecting Neo4j and Apache Spark
…to submit PageRank jobs
Request new Apache Spark job
Algorithm Type
Relationship Type
Export Neo4j graph to HDFS
New Job Request


HDFS Path: /../../graph.csv
Job Type: pagerank
Encoding a graph as an edge list
edge list


G B
H B
I B
K B
E B
F B
J B
D B
G E
H E
I E
K E

…
Import edge list to Apache Spark
Process Job Request


HDFS Path: /../../graph.csv
Job Type: pagerank
graph.csv


0 1
1 2
2 3
0 3
2 1

…
Apply results back to Neo4j
Completed Job


HDFS Path: /../../results.csv
Job Type: pagerank
results.csv


0 .56
1 .42
2 .14
3 .25

…
Graph processing platform
Algorithm Type
Relationship Type
docker-compose.yml
• Demo
Building Microservices
Creating Spring Data Neo4j Repositories
What our application needs
• Repositories
  • User repository (to manage and import users)
  • Follows repository (to manage and import following relationships)
  • Custom Cypher queries mapped to repository methods
• Domain model
  • User (our node entity)
  • Follows (our relationship entity)
• REST API
  • Expose the resources of the domain as a REST API
Exposing repository APIs using Spring Data REST
Connecting to the Twitter API
We can override these properties as environment variables at runtime
Scheduling new PageRank jobs
Scheduling PageRank jobs from Neo4j
Ranking Dashboard
Adding static web content
Add seed profiles
Choose 3 seed profiles
Discover new users and update rankings
Learn More. Stay Connected.
• Twitter: @kennybastani
• Spring: spring.io/team/kbastani
• Blog: kennybastani.com
• GitHub: github.com/kbastani
Twitter: twitter.com/springcentral
YouTube: spring.io/video
LinkedIn: spring.io/linkedin
Google Plus: spring.io/gplus

Editor's notes

  1. Hi, my name is Kenny Bastani. I am a Spring Developer Advocate at Pivotal. Today I will introduce you to a reference architecture for creating a PageRank analytics platform using Spring Boot microservices.
  2. A little bit about me. Again, I’m a Developer Advocate on the Spring team at Pivotal. I’m currently writing a book about Cloud Native JVM applications with Josh Long. The early release is available today from O’Reilly Media.
  3. Let’s do an overview of the problem we will solve as a part of our reference application. The problem we’re going to solve is how to discover influencers on Twitter using a set of seed profiles as inputs. To solve this problem without a background in machine learning or social network analytics might be a bit of a stretch, but we’re going to take a stab at it using a little bit of computer science history.
  4. The PageRank algorithm, developed by Google co-founders Larry Page and Sergey Brin, was first used by Google to rank web documents by analyzing the graph of backlinks between sites. I dug up the original research paper on PageRank from Stanford for some inspiration. In the paper, the authors talk about the notion of approximating the "importance" of an academic publication by weighting the value of its citations. In the paper I found a quote which best summarized how the PageRank algorithm works. The quote reads: …
  5. PageRank works by calculating the importance of a single vertex by analyzing only the link structure of the graph. Each incoming relationship is considered as a vote. PageRank is calculated in steps, where each step is an approximation of votes. At the end of each step a weight is assigned to each vertex. Each subsequent step applies a weight to outgoing relationships using the approximate value that was calculated for the vertex from the previous step. Here we have an example graph, consisting of nodes and relationships. Each node has a degree of connections, inbound and outbound. The more inbound links a node has, the higher the weight that is applied to its outbound links to other nodes. We see that the red node, node B, is the most important node in this graph. We also see that node C, the second most important node, only has a single inbound relationship, which is coming from node B. This is the basis for how PageRank calculates the importance of nodes in a graph using only the relationships between nodes.
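The stepwise calculation described in note 5 can be sketched in plain Java. This is a minimal textbook-style power iteration, not the code used by the reference application or by Apache Spark. The class name `PageRankSketch`, the damping factor of 0.85, and the fixed iteration count are assumptions made for illustration. The edges are the sample rows of graph.csv shown on the slides.

```java
import java.util.Arrays;

class PageRankSketch {

    /**
     * Iterative PageRank over a directed edge list.
     * edges[i] = {source, target}; nodes are numbered 0..nodeCount-1.
     * The damping factor and iteration count are illustrative assumptions.
     */
    static double[] pageRank(int nodeCount, int[][] edges, double damping, int iterations) {
        // Count outgoing relationships per node.
        int[] outDegree = new int[nodeCount];
        for (int[] edge : edges) {
            outDegree[edge[0]]++;
        }

        // Start with the same weight on every node.
        double[] rank = new double[nodeCount];
        Arrays.fill(rank, 1.0 / nodeCount);

        for (int step = 0; step < iterations; step++) {
            // Weight held by nodes with no outgoing links is spread over all nodes.
            double dangling = 0.0;
            for (int node = 0; node < nodeCount; node++) {
                if (outDegree[node] == 0) {
                    dangling += rank[node];
                }
            }

            double[] next = new double[nodeCount];
            Arrays.fill(next, (1.0 - damping) / nodeCount + damping * dangling / nodeCount);

            // Each relationship is a vote: the source passes an equal share
            // of its weight from the previous step to the target.
            for (int[] edge : edges) {
                next[edge[1]] += damping * rank[edge[0]] / outDegree[edge[0]];
            }
            rank = next;
        }
        return rank;
    }

    public static void main(String[] args) {
        // Sample rows of graph.csv from the slides: "0 1", "1 2", "2 3", "0 3", "2 1".
        int[][] edges = {{0, 1}, {1, 2}, {2, 3}, {0, 3}, {2, 1}};
        double[] ranks = pageRank(4, edges, 0.85, 20);
        for (int node = 0; node < ranks.length; node++) {
            System.out.printf("%d %.4f%n", node, ranks[node]);
        }
    }
}
```

In this formulation the weights always sum to 1, so the numbers it prints are not expected to match the results.csv values on the slides, which may use a different normalization.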
  6. As a part of today’s webinar, I will be introducing you to a reference application that combines multiple microservices with a graph processing platform to rank users on Twitter. We’re going to use a collection of popular tools as a part of this reference architecture. The tools we’ll use, in order of importance, are Spring Boot, Neo4j, Apache Spark, Docker, and RabbitMQ.
  7. This diagram shows each component and microservice that we will create as a part of this sample application. Notice how we’re connecting the Spring Boot applications to the graph processing platform we looked at earlier. Also notice the connections between the services: they define the communication points between services and the protocol each one uses. The three applications that are colored in blue are stateless services. Stateless services will not attach a persistent backing service or need to worry about managing state locally. The application that is colored in green is the Twitter Crawler service. Components that are colored in green will typically have an attached backing service. These backing services are responsible for managing state locally, and will persist state either to disk or in memory.
  8. Let’s do an overview of the problem we will solve as a part of our reference application. The problem we’re going to solve is how to discover influencers on Twitter using a set of seed profiles as inputs. To solve this problem without a background in machine learning or social network analytics might be a bit of a stretch, but we’re going to take a stab at it using a little bit of computer science history.
  9. A graph processing platform is an application architecture that provides a general-purpose job scheduling interface for analyzing graphs. The reference application uses a graph processing platform to analyze and rank communities of users on Twitter. For this we’ll use Neo4j Mazerunner, an open source project that I started that connects Neo4j’s database server to Apache Spark. This diagram illustrates a graph processing platform similar to Neo4j Mazerunner. We can see from the diagram that new job requests are sent from Neo4j to RabbitMQ. Before Neo4j sends a message to RabbitMQ requesting a new job, it will export a graph replica to HDFS. The analysis service, shown as the purple hexagon, has an embedded standalone instance of Apache Spark and listens for messages from RabbitMQ containing new job requests. Each message received by the analysis service says where the exported graph replica is stored on HDFS and which graph algorithm to execute. After the analysis service has completed a job, it sends a message to RabbitMQ that will be received by a listener on Neo4j. The message will contain the HDFS path of the results that were saved by the analysis service. Neo4j will then import the results from HDFS back into the database without interrupting or impacting transactions that are being made by other database clients. The PageRank results from GraphX, Apache Spark’s graph processing library, will be automatically applied back to Neo4j without any additional work to manually handle data loading. For our purposes the workflow is simple: from a backend service we only need to make an HTTP request to Neo4j to begin a PageRank job.
  10. When we talk about microservices we are talking about developing software in the context of continuous delivery. Microservices are not just smaller services that scale horizontally. When we talk about microservices, we are talking about being able to create applications that are the product of many teams delivering continuously in independent release cycles. In this reference application, I’ve built four microservices, each as a Spring Boot application. If we were to build this architecture as microservices in an authentic scenario, each microservice would be owned and managed by a different team. This is an important distinction in this new practice, as there is much confusion around what a microservice is and what it is not. A microservice is not just a distributed system of small services. The practice of building microservices should never be without the discipline of continuous delivery. I hope this reference application gives you an idea of how to compose complex applications as distributed systems that resemble a microservice architecture in a team environment.
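The export and import steps in note 9 exchange plain text files: an edge list going out and a list of ranks coming back. The following is a minimal sketch of that round trip using only the Java standard library. It is not how Neo4j Mazerunner does it. The class `EdgeListSketch`, its method names, and the first-seen id assignment are hypothetical, and the HDFS and RabbitMQ parts are left out entirely.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class EdgeListSketch {

    /** Assigns each distinct node name a numeric id in first-seen order (an assumption). */
    static Map<String, Integer> assignIds(List<String[]> relationships) {
        Map<String, Integer> ids = new LinkedHashMap<>();
        for (String[] rel : relationships) {
            ids.putIfAbsent(rel[0], ids.size());
            ids.putIfAbsent(rel[1], ids.size());
        }
        return ids;
    }

    /** Encodes relationships as "source target" lines, one per relationship. */
    static List<String> toEdgeList(List<String[]> relationships, Map<String, Integer> ids) {
        List<String> lines = new ArrayList<>();
        for (String[] rel : relationships) {
            lines.add(ids.get(rel[0]) + " " + ids.get(rel[1]));
        }
        return lines;
    }

    /** Reads "id rank" result lines and maps each rank back to its node name. */
    static Map<String, Double> applyResults(List<String> resultLines, Map<String, Integer> ids) {
        Map<Integer, String> names = new HashMap<>();
        for (Map.Entry<String, Integer> entry : ids.entrySet()) {
            names.put(entry.getValue(), entry.getKey());
        }
        Map<String, Double> ranks = new LinkedHashMap<>();
        for (String line : resultLines) {
            String[] parts = line.trim().split("\\s+");
            ranks.put(names.get(Integer.parseInt(parts[0])), Double.parseDouble(parts[1]));
        }
        return ranks;
    }

    public static void main(String[] args) {
        // A few of the relationships from the edge list slide.
        List<String[]> relationships = new ArrayList<>();
        relationships.add(new String[] {"G", "B"});
        relationships.add(new String[] {"H", "B"});
        relationships.add(new String[] {"G", "E"});

        Map<String, Integer> ids = assignIds(relationships);
        toEdgeList(relationships, ids).forEach(System.out::println);
    }
}
```

Keeping the name-to-id map around is what lets the results file, which only knows numeric ids, be applied back to the right nodes.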
  11. This diagram shows each component and microservice that we will create as a part of this sample application. Notice how we’re connecting the Spring Boot applications to the graph processing platform we looked at earlier. Also notice the connections between the services: they define the communication points between services and the protocol each one uses. The three applications that are colored in blue are stateless services. Stateless services will not attach a persistent backing service or need to worry about managing state locally. The application that is colored in green is the Twitter Crawler service. Components that are colored in green will typically have an attached backing service. These backing services are responsible for managing state locally, and will persist state either to disk or in memory.
  12. We’ll start building the Twitter Crawler service by creating a set of domain classes and repositories to manage data with the Spring Data Neo4j project. The Twitter Crawler service will act as a REST API to interact with the users we’ve imported from the Twitter API. We can see that this service has an HTTP connection to the Neo4j server. We’ll also use the Twitter Crawler service to schedule PageRank jobs with Neo4j, which will be dispatched for calculation on Apache Spark.
  13. Spring Data Neo4j is a project in the Spring Data ecosystem that implements the Spring Data repository abstraction using an OGM (Object Graph Mapping) library for Neo4j. Spring Data Neo4j allows you to manage data on a Neo4j server using annotated POJOs as entity references in a Spring Data application.
  14. Before we can start managing data in Neo4j, we’ll need to design and construct a graph data model for our application’s domain data. The domain model for this application is rather simple, and we’ll construct it using domain objects described in Twitter’s API documentation. We’ll only have one domain concept, which is a User profile, and we’ll source this resource from profiles that are imported from the Twitter API. We’ll then have a relationship entity with the type named FOLLOWS. The FOLLOWS relationship will connect User profiles together in Neo4j after importing follower data from the Twitter API. The graph data model that we will end up with looks like this diagram. We see here that we have three nodes, having the label User, and the relationships between them having the relationship type FOLLOWS. We can see that each of the User nodes have a set of properties. First we have an internal node ID that is assigned by Neo4j. Next we have a profileId, which is a unique identifier that is assigned to the profile by Twitter. We also see that we have a property for PageRank. This PageRank property will be updated as jobs are processed by Apache Spark. The results of the job are applied back to User nodes in the Neo4j database. To represent this domain model as Java classes, we’ll need to create two domain classes, one for the User node, and one for the Follows relationship. We’ll then need to create Spring Data repositories for each of these domain classes. Let’s start by taking a look at how to create a class that represents the User node, which is a profile imported from Twitter.
  15. We see here that we have a basic POJO that represents a domain class for the User node in Neo4j. We can also describe fields for incoming and outgoing follower connections. These fields will be created with the type Set<User>, and give us a way to load profiles that are connected to a User node with a FOLLOWS relationship. At the top of this class we see that we have a @NodeEntity annotation. This annotation is used to mark this class as a Neo4j entity reference, one that is specific to nodes in the Neo4j database. Each domain class automatically will apply a label to Neo4j nodes in the database, using the name of the class, in this case ‘User’. The next annotation we’ll need to add is the @GraphId annotation. This annotation is used to mark a private field as a ID property, one that is automatically generated by Neo4j as a unique internally issued identifier. The next annotation we’ll use is the @Index annotation. We’ll use this annotation on the profileId field, to indicate that it is a unique ID that we will use to make sure that duplicate User nodes are not imported from the Twitter API. The next annotation we will use is the @Relationship annotation. This annotation is used to indicate that a field references a set of nodes that are connected via a relationship type and direction. In this case, we can load a user’s incoming and outgoing relationships to other users in the graph. Finally, we’ll add in other fields that we will import from the Twitter API, as well as a PageRank field that will be added as a result of our Apache Spark jobs that calculate a user’s PageRank.
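The slide for note 15 is an image, so its code is not in this transcript. As a rough sketch of the shape it describes, the class below follows the fields and annotations named in the note. To keep the example compilable with only the standard library, the four annotations are declared locally as stand-ins. In a real project they would come from the Neo4j OGM library instead. The field names other than profileId, the `follow` helper, and the annotation attributes are assumptions.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.util.HashSet;
import java.util.Set;

// Local stand-ins (assumption) for the library annotations named in the note.
@Retention(RetentionPolicy.RUNTIME) @interface NodeEntity {}
@Retention(RetentionPolicy.RUNTIME) @interface GraphId {}
@Retention(RetentionPolicy.RUNTIME) @interface Index { boolean unique() default false; }
@Retention(RetentionPolicy.RUNTIME) @interface Relationship {
    String type();
    String direction() default "OUTGOING";
}

@NodeEntity // nodes for this class get the label "User"
class User {

    @GraphId // internal node ID, assigned by Neo4j
    private Long id;

    @Index(unique = true) // Twitter's identifier, used to avoid duplicate imports
    private Long profileId;

    @Relationship(type = "FOLLOWS", direction = "OUTGOING")
    private Set<User> follows = new HashSet<>();

    @Relationship(type = "FOLLOWS", direction = "INCOMING")
    private Set<User> followers = new HashSet<>();

    private String screenName; // example of a field imported from the Twitter API
    private Float pagerank;    // filled in when a PageRank job completes

    User() {
        // no-argument constructor for the object graph mapper
    }

    User(Long profileId, String screenName) {
        this.profileId = profileId;
        this.screenName = screenName;
    }

    /** Hypothetical helper that keeps both sides of a FOLLOWS relationship in sync. */
    void follow(User other) {
        follows.add(other);
        other.followers.add(this);
    }

    Long getId() { return id; }
    Long getProfileId() { return profileId; }
    String getScreenName() { return screenName; }
    Float getPagerank() { return pagerank; }
    Set<User> getFollows() { return follows; }
    Set<User> getFollowers() { return followers; }
}
```

With the real library on the classpath, the stand-in declarations would be deleted and replaced by imports. The class body would stay close to this, subject to the attribute names of the library version in use.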
  16. Here we have a basic POJO that represents an entity reference for our relationships in Neo4j with the relationship type FOLLOWS. The basic function of this relationship entity is to act as an index record that connects two records together, from UserA to UserB. The annotations in this POJO differ slightly from the User entity, in that the first annotation is @RelationshipEntity instead of @NodeEntity. This annotation includes a relationship type, which we’ve supplied as a parameter with the value FOLLOWS. The next annotation is @GraphId, and just like in the User POJO, the annotated field holds an internally assigned unique ID from the Neo4j database. Next we have our two users, UserA and UserB. UserA is annotated with @StartNode, meaning that this user is following the second user. The second user is UserB, and is annotated with @EndNode.
  17. Next, we’ll need to create a repository to manage our data that will be mapped to the User domain class. The Spring Data project makes repository-based management of database entities a snap. We’re going to use the GraphRepository<T> interface to create a repository bean that will be created at runtime in our Spring Boot application. Here we create a basic interface that extends GraphRepository<User>. This repository interface will be initialized as a bean at runtime, and provides us with a client to manage transactions on entities for User nodes in Neo4j. We’ll also add in custom repository methods that take advantage of Neo4j’s declarative query language, Cypher. Cypher is a query language that is designed to be familiar to users of SQL, but specific to the traits of a graph database. One example I’ve added for the purpose of this demonstration is the method getUserIdByProfileId. This query will match a user node using a profileId parameter. When we download records from the Twitter API, it will often be the case that we can only see a profileId before downloading additional information, such as a user’s screen name or profile image. In the case that we want to get back the internally assigned unique ID that Neo4j has assigned to a node, we can use this Cypher query to perform a simple lookup that translates a profileId to an internal node ID. The reference application defines many custom Cypher query methods to extend the repository interface’s basic operations. Each one of these methods plays a role in the crawling algorithm that allows the application to discover new users. Now that we can manage User nodes and FOLLOWS relationships, we need to think about how performant it will be to save potentially thousands of relationships per second when importing user profiles from the Twitter API. We’ll need to be able to batch transactions so that Neo4j can handle the throughput of ingesting writes at a rapid pace. To do this, we need to create another GraphRepository bean for managing the creation of many FOLLOWS relationships between a set of User profiles.
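As a rough illustration of the getUserIdByProfileId lookup, the sketch below pairs a Cypher string with an in-memory stand-in that has the same contract: a profileId goes in, and an internal node ID comes out if the user is known. The exact Cypher text is an assumption, not copied from the reference application, and it uses the older {profileId} parameter syntax; newer Neo4j versions write the parameter as $profileId. The class name and the index method are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

class UserIdLookupSketch {
    // Assumed Cypher for the custom repository method: match a User node by
    // its Twitter profileId and return Neo4j's internal node ID.
    static final String GET_USER_ID_BY_PROFILE_ID =
            "MATCH (user:User { profileId: {profileId} }) RETURN id(user)";

    // In-memory stand-in for the database: profileId -> internal node ID.
    private final Map<Long, Long> nodeIdByProfileId = new HashMap<>();

    // Hypothetical helper that plays the role of saving a User node.
    void index(long profileId, long nodeId) {
        nodeIdByProfileId.put(profileId, nodeId);
    }

    // Same contract as the repository method: translate a profileId into an
    // internal node ID, or return empty when the profile has not been imported.
    Optional<Long> getUserIdByProfileId(long profileId) {
        return Optional.ofNullable(nodeIdByProfileId.get(profileId));
    }
}
```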
  18. The repository interface is similar to the UserRepository interface from the last slide. Here we have defined a Cypher query method that will allow us to save batches of thousands of relationships in a single transaction. The result of this is that User nodes will either be created or connected to other User nodes in our Neo4j database. The Cypher query is fairly simple. For each instance of the Follows class that is provided as a parameter to this method, the Cypher query will either get or create the users that are referenced by the userA and userB fields. Finally, a relationship is created between these two users, if one does not already exist. Now we are able to save potentially thousands of users and relationships per transaction, which will be essential to the data import process from Twitter.
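The get-or-create behavior of that batch query can be sketched in plain Java. This is an in-memory model of the semantics only, not the Cypher from the reference application: each user is created if missing, each FOLLOWS edge is added if missing, and re-applying the same batch changes nothing. The class names InMemoryGraph and Follows, and the method saveFollows, are assumptions for illustration.

```java
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// One FOLLOWS record: userA (start node) follows userB (end node).
class Follows {
    final long userA;
    final long userB;

    Follows(long userA, long userB) {
        this.userA = userA;
        this.userB = userB;
    }
}

class InMemoryGraph {
    // profileId -> set of profileIds that this user follows
    private final Map<Long, Set<Long>> outgoing = new LinkedHashMap<>();

    // Get the user if it exists, otherwise create it.
    private Set<Long> getOrCreateUser(long profileId) {
        return outgoing.computeIfAbsent(profileId, key -> new HashSet<>());
    }

    // Apply a whole batch in one call, mirroring one transaction per batch.
    void saveFollows(List<Follows> batch) {
        for (Follows follow : batch) {
            Set<Long> followedByA = getOrCreateUser(follow.userA);
            getOrCreateUser(follow.userB);
            followedByA.add(follow.userB); // no effect if the edge already exists
        }
    }

    int userCount() {
        return outgoing.size();
    }

    int followsCount() {
        int total = 0;
        for (Set<Long> targets : outgoing.values()) {
            total += targets.size();
        }
        return total;
    }
}
```

Because every step is get-or-create, the importer can resend overlapping batches from the Twitter API without producing duplicate users or relationships.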
  19. Now we have our domain classes and repositories that we can use to manage entities in our Neo4j database. We can now import data from the Twitter API so that we can perform scheduled PageRank operations on Apache Spark.
  20. Now that we have created our Spring Data Neo4j repositories for managing our Twitter follower graph, we’ll need to expose a REST API interface that allows remote services to manage our domain data over HTTP. Thankfully this is a simple task when using the Spring Data REST project.
  21. All that we need to do to enable this as a feature on our Spring Data Neo4j repositories is add the spring-boot-starter-data-rest artifact as a dependency to the project’s pom.xml file.
  22. The database resources will now be returned as a JSON representation that lists hypermedia resources as embedded links. We can use these embedded links to manage resources of our graph repositories over HTTP.
  23. Now that we have everything we need to manage our Twitter profile data in Neo4j, we can import profiles from the Twitter API. To do this, we can use the Spring Social Twitter project. Spring Social Twitter is a Spring ecosystem project that allows you to manage Twitter API resources using a TwitterTemplate client. The TwitterTemplate will take care of all of our concerns, such as API authorization, and provide client bindings that will allow us to interact with Twitter’s API resources from our JVM application.
  24. Going back to the architecture diagram from earlier, we can see that we have a dependency on the Twitter API from the Twitter Crawler service. We’ll call the Twitter API to import user profiles and their connections to other Twitter users. The results will be saved in Neo4j using the repositories that we created earlier.
  25. Before we can start using this client, we’ll need to add the spring-social-twitter artifact as one of our project dependencies in the pom.xml, which is shown here.
  26. In our Spring Boot application we will map configuration properties as key values on the classpath. To do this, we will map keys in our application’s properties file to values that will be loaded from the environment. These environment-specific values represent the keys and access tokens that we will need to authenticate with the Twitter API. We’ll source these values to configure the TwitterTemplate bean for our Spring Boot application.
  27. The next step will be to configure the Twitter client that is provided by spring-social-twitter. In order to access operations for importing profiles from Twitter’s API, we will need to provide API tokens and keys that are generated for an application by Twitter. You’ll need to register with Twitter and create a developer app in order to get these keys. Getting API access is a simple process. A step-by-step guide is available from Spring’s website that will show you how to generate Twitter API keys for an application. In the code example I’ve shown here, we’ll use the @Value annotation to automatically load configuration values from the application’s .properties file. These are the keys that we saw on the last slide. We’ll then define a bean for the TwitterTemplate, which requires the Twitter API access credentials as parameters. Now we will be able to automatically inject instances of the TwitterTemplate that were initialized from our configuration properties at runtime.
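To show the idea of sourcing the four credentials from the environment without pulling in Spring, here is a small plain-Java sketch. The TwitterTemplate constructor takes a consumer key, consumer secret, access token, and access token secret, so the sketch collects exactly those four values. The environment variable names and the TwitterCredentials class are assumptions for illustration; the reference application may use different property keys.

```java
import java.util.Map;

// Holds the four values needed to construct a TwitterTemplate.
class TwitterCredentials {
    final String consumerKey;
    final String consumerSecret;
    final String accessToken;
    final String accessTokenSecret;

    TwitterCredentials(String consumerKey, String consumerSecret,
                       String accessToken, String accessTokenSecret) {
        this.consumerKey = consumerKey;
        this.consumerSecret = consumerSecret;
        this.accessToken = accessToken;
        this.accessTokenSecret = accessTokenSecret;
    }

    // Reads the credentials from an environment map (for example System.getenv()).
    // The variable names below are hypothetical.
    static TwitterCredentials fromEnvironment(Map<String, String> env) {
        return new TwitterCredentials(
                required(env, "TWITTER_CONSUMER_KEY"),
                required(env, "TWITTER_CONSUMER_SECRET"),
                required(env, "TWITTER_ACCESS_TOKEN"),
                required(env, "TWITTER_ACCESS_TOKEN_SECRET"));
    }

    // Fail fast with a clear message when a credential is missing.
    private static String required(Map<String, String> env, String name) {
        String value = env.get(name);
        if (value == null || value.isEmpty()) {
            throw new IllegalStateException("Missing environment variable: " + name);
        }
        return value;
    }
}
```

Calling TwitterCredentials.fromEnvironment(System.getenv()) at startup would surface a missing key immediately, instead of on the first call to the Twitter API.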
  28. Now our Spring Boot application for the Twitter Crawler service will be able to use a TwitterTemplate object as a client to interact with the Twitter API. The code snippet I have here is a simplified example of how we will access a TwitterTemplate bean using a Spring framework technique called constructor-based dependency injection. This code snippet is only meant to illustrate how we’ll be using the TwitterTemplate client throughout the application. We can see that in this class we’re getting a reference to the TwitterTemplate through the constructor, which will be called by Spring when initializing the bean at runtime.
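Constructor-based dependency injection itself needs nothing from Spring, so it can be sketched in plain Java. In the sketch below, TwitterClient is a hypothetical one-method stand-in for TwitterTemplate, and ProfileImporter is an assumed class name; in the real application Spring would call the constructor and pass in the TwitterTemplate bean.

```java
import java.util.Objects;

// Hypothetical stand-in for the TwitterTemplate client.
interface TwitterClient {
    String screenNameFor(long profileId);
}

class ProfileImporter {
    // The dependency is final: it is supplied once, through the constructor.
    private final TwitterClient twitter;

    // The container (or a test) passes the client in; the class never
    // constructs or looks up its own dependency.
    ProfileImporter(TwitterClient twitter) {
        this.twitter = Objects.requireNonNull(twitter, "twitter");
    }

    String describe(long profileId) {
        return "@" + twitter.screenNameFor(profileId);
    }
}
```

One benefit of this style is that a test can hand the class a fake client, for example new ProfileImporter(id -> "user" + id), without any network access.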
  29. The last concern we need to address on our Twitter Crawler service is to integrate with the graph processing platform. The graph processing platform is an attached backing service on Neo4j, which makes it easy for us to issue requests for new graph processing jobs. Neo4j exposes an endpoint to an unmanaged extension for Mazerunner on the classpath of the Neo4j database server. This unmanaged extension exposes a REST API for interacting with the graph processing platform’s analysis service that embeds an instance of Apache Spark.
  30. Going back to our graph processing platform example from earlier, we can understand how Neo4j has been extended to be able to take incoming requests for new PageRank jobs. The only thing we need to worry about for scheduling new jobs from our consumers is to issue an HTTP GET request to the Neo4j server using a route that contains the analysis type and the relationship type we would like to analyze.
  31. It’s easy enough to make a GET request to the job scheduling interface on Neo4j, but we will still need to create a trigger that will be called on a scheduled time interval from the Twitter Crawler service. To do this, we can use the @Scheduled annotation on a method of an object in our Spring Boot application. We’ll then provide a fixed value for the rate parameter of the annotation, which is measured in milliseconds. I’ve decided that the PageRank job should be started about every 5 minutes, so we’ll set the fixedRate value to 5 minutes in milliseconds, which is 300,000. This snippet of code is an example of how we will register a method using Spring’s @Scheduled annotation. The method will issue a GET request to a REST API endpoint on Neo4j. This endpoint is described by the relative path variable. We’ll load in our environment-specific hostname of the Neo4j server to generate the full URL to the job scheduling interface for Mazerunner. Finally, we’ll make a GET request to the analysis endpoint.
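The same trigger can be sketched with only the JDK, using a ScheduledExecutorService as a stand-in for Spring’s @Scheduled annotation. The sketch shows the 5-minute rate expressed in milliseconds and how the relative path and the Neo4j hostname combine into the full URL. The route shape /service/mazerunner/analysis/{analysis type}/{relationship type} is an assumption about the Mazerunner extension, and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.net.HttpURLConnection;
import java.net.URI;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class PageRankJobTrigger {
    // 5 minutes expressed in milliseconds: 5 * 60 * 1000 = 300000.
    static final long FIXED_RATE_MS = TimeUnit.MINUTES.toMillis(5);

    // Assumed route shape: analysis type first, relationship type second.
    static final String RELATIVE_PATH = "/service/mazerunner/analysis/%s/%s";

    // Combine the environment-specific Neo4j host with the relative path.
    static URI jobUri(String neo4jHost, String analysis, String relationshipType) {
        return URI.create("http://" + neo4jHost
                + String.format(RELATIVE_PATH, analysis, relationshipType));
    }

    // Issue the GET request and return the HTTP status code.
    static int requestJob(URI uri) throws IOException {
        HttpURLConnection connection = (HttpURLConnection) uri.toURL().openConnection();
        connection.setRequestMethod("GET");
        try {
            return connection.getResponseCode();
        } finally {
            connection.disconnect();
        }
    }

    // Plain-JDK stand-in for @Scheduled(fixedRate = ...): run the job now and
    // then every FIXED_RATE_MS. The caller must shut the scheduler down.
    static ScheduledExecutorService schedule(Runnable job) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(job, 0, FIXED_RATE_MS, TimeUnit.MILLISECONDS);
        return scheduler;
    }
}
```

A caller would wrap requestJob(jobUri(host, "pagerank", "FOLLOWS")) in a Runnable that handles the IOException and pass it to schedule.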
  32. Now that we’ve gone over the major functionality of the Twitter Crawler service, let’s go over how to create a user interface that will display our ranked users. We’ll call this service the ranking dashboard.
  33. We’ve finished creating the backend components of the microservice architecture and can now write a simple client-side web application to interface with the Twitter Crawler service. We’ll need to be able to access the Twitter Crawler service’s REST API as a part of the dashboard’s host. For this we’ll need to create a reverse proxy to the Twitter Crawler’s REST API. Since we are using Spring Cloud, we are able to take advantage of the Eureka discovery service and the @EnableZuulProxy annotation to automatically inject routes from the Twitter Crawler service into our new Ranking Dashboard service. What this means is that the new Ranking Dashboard service will be able to expose the full REST API of the Twitter Crawler service on its own host, without writing a single line of code. Well, maybe just one line of code.
  34. The Spring Boot application we’ll create is as simple as it gets. The only application code we’ll create is the Spring Boot application class. Here we’ve annotated the application class with the annotation: @SpringCloudApplication. This enables the basic Spring Cloud features for connecting to a discovery service. We’ve also added the annotation: @EnableZuulProxy, which will enable an embedded Zuul proxy in this service. To get this to work, we do need to worry about some configuration properties in our application.yml file.
  35. The properties here are configured for the production Spring profile, and will provide the necessary settings to connect and register with the discovery service. Since the Twitter Crawler service we created earlier uses the same discovery service connection settings, the Ranking Dashboard will automatically create a proxy to the Twitter Crawler’s REST API. Now when the Ranking Dashboard service is started, it will contact the Eureka discovery service at the serviceUrl’s default zone property. By doing this we are able to embed the request mappings that are exposed by the Twitter Crawler service. The ID that the Twitter Crawler service will use when registering with Eureka will be twitter-rank. This ID will be used as the request path to access the routes of the Twitter Crawler service from the Ranking Dashboard service. All requests to the /twitter-rank route on the Ranking Dashboard service will be forwarded to the Twitter Crawler service. This is a pretty nifty feature. We can create local request mappings to any remote service that is registered with the Eureka discovery service. This comes in handy if you’re working with microservice architectures. The next step will be to add static web content to the Ranking Dashboard service that connects to the REST API of the Twitter Crawler service through our newly embedded Zuul proxy.
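The forwarding rule can be pictured as a small function: the first path segment selects a registered service ID, and the rest of the path is sent on to that service. The plain-Java sketch below models only that mapping and is not how Zuul is implemented. The registry map stands in for Eureka, the crawler address is made up, and stripping the service ID prefix before forwarding is an assumption about the default route behavior.

```java
import java.util.Map;
import java.util.Optional;

class ProxyRouteSketch {
    // Resolve a path received by the dashboard to a URL on a registered
    // service, e.g. "/twitter-rank/users" -> "<crawler base URL>/users".
    static Optional<String> resolve(String path, Map<String, String> registry) {
        if (!path.startsWith("/")) {
            return Optional.empty();
        }
        // The first path segment is treated as the service ID.
        int next = path.indexOf('/', 1);
        String serviceId = next < 0 ? path.substring(1) : path.substring(1, next);
        // Everything after the service ID is forwarded unchanged.
        String remainder = next < 0 ? "/" : path.substring(next);

        String baseUrl = registry.get(serviceId);
        return baseUrl == null ? Optional.empty() : Optional.of(baseUrl + remainder);
    }
}
```

With a registry that maps twitter-rank to a crawler instance, a dashboard request for /twitter-rank/users resolves to /users on that instance, while a path whose first segment is not a registered service ID resolves to nothing.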
  36. I’ve created a simple client-side web application that uses jQuery to make AJAX requests to the Twitter Crawler REST API. Spring Boot makes it easy to map static content by placing it in the resource directory of our application.
  37. The example here shows a directory of the Ranking Dashboard application. The static content for the single page application has been placed in the resource directory under the folder ‘static’. Now when I run the Spring Boot application, the index.html file will be mapped to the service’s root.
  38. Now we can start up all the applications in our architecture one by one. I’ve already gone ahead and done this for the demo so that we can explore the working application that has already imported records from the Twitter API. First, let’s take a look at what I did to train the dataset that we’ll see in the demo.
  39. The dashboard is a single page web application which consumes two REST API methods on the Twitter Crawler service. Let’s first review how the dashboard will be used. The first time the dashboard is loaded, there won’t be any data to display from the Twitter Crawler service.
  40. Before the Twitter Crawler service will begin to automatically discover new profiles, the user must provide a minimum of three screen names of Twitter users as seeds. The goal is to add three seed profiles of users who are members of a community on Twitter. It’s important to make sure that these users follow each other, which will make it likely that there are other mutual profile connections between these users.
  41. The seed profiles I’ve chosen for this demonstration are @kennybastani, @starbuxman, and @bridgetkromhout, a few of my teammates at Pivotal. When adding new seed profiles through the UI, an AJAX call will be made as a GET request to a proxied REST API path. This path will route requests to the Twitter Crawler service and respond as if the method were inside our web application. After adding each of these profiles manually through the UI, we’ll end up with the view shown here. We can see that we’ve imported the profile attributes of each of the seed profiles. These are shown in the ranking dashboard, but the PageRank has yet to be calculated. If you remember from earlier, this job is scheduled every 5 minutes, so we’ll have to wait.
  42. What’s happened at this point was illustrated earlier. Here again is our example graph data model that we used to create our Neo4j domain classes and repositories. In the model we see the three seed users I’ve selected. The relationships for these users have now been imported from Twitter. They also have relationships to many other users, which have also been imported. This will be the training data set that the discovery algorithm will use to crawl new users, using PageRank as the metric of relevancy to prioritize which users to crawl next from the current data set.
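The prioritization step can be sketched as a simple selection: among the profiles that have not been crawled yet, pick the one with the highest PageRank. This is a minimal plain-Java illustration of that idea, not the crawling algorithm from the reference application; the RankedProfile and CrawlPlanner names, and the crawled flag, are assumptions.

```java
import java.util.Comparator;
import java.util.List;
import java.util.Optional;

// A profile as the crawler sees it: a screen name, its current PageRank,
// and whether its connections have already been imported.
class RankedProfile {
    final String screenName;
    final double pagerank;
    final boolean crawled;

    RankedProfile(String screenName, double pagerank, boolean crawled) {
        this.screenName = screenName;
        this.pagerank = pagerank;
        this.crawled = crawled;
    }
}

class CrawlPlanner {
    // Choose the uncrawled profile with the highest PageRank, if any remain.
    static Optional<RankedProfile> nextToCrawl(List<RankedProfile> profiles) {
        return profiles.stream()
                .filter(profile -> !profile.crawled)
                .max(Comparator.comparingDouble(profile -> profile.pagerank));
    }
}
```

Because the seed users follow each other, profiles that many of them follow tend to rank highest and are crawled first, which keeps discovery inside the same community.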
  43. Now that I’ve added the three seed profiles, each of these users’ connections will be imported to Neo4j by the Twitter Crawler service. After about 5 minutes, the PageRank job will have been scheduled and will have completed its analysis of the initial users. After a PageRank value has been assigned to the initial users, you will see new users that have been automatically discovered. The screenshot here shows users that were discovered automatically by the Twitter Crawler service. Newly discovered users are indicated by a green plus icon in the Rank column. These results are highly relevant to the seed selection. We can even infer that the original three users have distributed systems development and DevOps as a common interest. Now let’s go ahead and explore the working application.
  44. If you have questions for me after this webinar, please connect. You can reach out to me on Twitter, my screen name is here. Also, be sure to stay connected with the Spring team on these sites.