3. What is a graph database?
● A graph consists of nodes connected by relationships.
● Graph databases store data as graph structures (nodes and
relationships)
4. Property Graph Model Components
● Nodes
○ Represent the objects in the graph
○ Can be labeled
[Diagram: one node labeled CAR and two nodes labeled PERSON]
5. Property Graph Model Components
● Nodes
○ Represent the objects in the graph
○ Can be labeled
● Relationships
○ Relate nodes by type and direction
[Diagram: one CAR node and two PERSON nodes, connected by the relationships DRIVES, OWNS, LOVES (twice) and LIVES WITH]
6. Property Graph Model Components
● Nodes
○ Represent the objects in the graph
○ Can be labeled
● Relationships
○ Relate nodes by type and direction
● Properties
○ Name-value pairs that can go on
nodes and relationships
[Diagram: the same graph with properties added]
PERSON: name: "Dan", born: May 29, 1970, twitter: "@dan"
PERSON: name: "Ann", born: Dec 5, 1975
CAR: brand: "Volvo", model: "V70"
Relationships: DRIVES, OWNS, LOVES (twice), LIVES WITH; one relationship carries since: Jan 10, 2011
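The three components (labeled nodes, typed and directed relationships, properties on both) can be sketched in plain Python. This is only an illustration of the model, not how Neo4j stores data. The class names `Node` and `Relationship` and the helper `outgoing` are hypothetical, and because the diagram did not survive extraction, the direction of each relationship and the placement of `since` on DRIVES are assumptions.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    labels: set                      # e.g. {"Person"}
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    type: str                        # e.g. "LOVES"
    start: Node                      # direction: start -> end
    end: Node
    properties: dict = field(default_factory=dict)


# Nodes with labels and name-value properties taken from the slide.
dan = Node({"Person"}, {"name": "Dan", "born": "1970-05-29", "twitter": "@dan"})
ann = Node({"Person"}, {"name": "Ann", "born": "1975-12-05"})
car = Node({"Car"}, {"brand": "Volvo", "model": "V70"})

relationships = [
    Relationship("LOVES", dan, ann),
    Relationship("LOVES", ann, dan),
    Relationship("LIVES_WITH", dan, ann),
    # Assumption: Ann drives, Dan owns, and `since` belongs to DRIVES.
    Relationship("DRIVES", ann, car, {"since": "2011-01-10"}),
    Relationship("OWNS", dan, car),
]


def outgoing(node, rel_type):
    """Return the end nodes of all rel_type relationships starting at node."""
    return [r.end for r in relationships if r.start is node and r.type == rel_type]


print([n.properties["name"] for n in outgoing(dan, "LOVES")])
```

Relationships are first-class objects here, each with its own type, direction and properties, which is the point of the property graph model.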
7. Cypher: Graph Query Language
MATCH (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
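[Diagram: the pattern annotated. Each parenthesised part is a NODE with a LABEL (Person) and a PROPERTY (name); the arrow is the LOVES relationship from Dan to Ann]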
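To make the pattern concrete, here is a minimal Python sketch of what it asks for: a start node with a given label and property, a relationship of a given type and direction, and an end node with a given label and property. It is not how Neo4j evaluates Cypher. The in-memory data, the node ids, the OWNS relationship and the functions `node_matches` and `match` are all hypothetical and exist only for this illustration.

```python
# Nodes: id -> (labels, properties). Relationships: (start_id, type, end_id).
nodes = {
    1: ({"Person"}, {"name": "Dan"}),
    2: ({"Person"}, {"name": "Ann"}),
    3: ({"Car"}, {"brand": "Volvo"}),
}
relationships = [(1, "LOVES", 2), (2, "LOVES", 1), (1, "OWNS", 3)]


def node_matches(node_id, label, props):
    """True if the node has the label and every requested property value."""
    labels, properties = nodes[node_id]
    return label in labels and all(properties.get(k) == v for k, v in props.items())


def match(start_label, start_props, rel_type, end_label, end_props):
    """Return (start_id, end_id) pairs matching (start)-[:rel_type]->(end)."""
    return [
        (s, e)
        for s, t, e in relationships
        if t == rel_type
        and node_matches(s, start_label, start_props)
        and node_matches(e, end_label, end_props)
    ]


# Same shape as: (:Person {name: "Dan"})-[:LOVES]->(:Person {name: "Ann"})
result = match("Person", {"name": "Dan"}, "LOVES", "Person", {"name": "Ann"})
print(result)
```

The direction matters: the reverse relationship, from Ann to Dan, is stored separately and does not match this pattern.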
8. Neo4j Use Cases
● Social network analysis
● Recommendation engine
● Search & Discovery
● Network & Data Center
● Master Data Management
● Identity & Access Management
● GEO
● Fraud Detection
● …
Blog series: Top 10 Neo4j use cases
9. Neo4j
● The world’s leading graph database
● June 2021: raised $325M!
● On-premises, in the cloud or hybrid (Aura)
● Graph ML algorithms
● High performance
● Up to trillions of nodes and relationships
17. Apache Hop : background
● Community-led initiative
● New scalable GUI
● New metadata back-end
● Simplified toolset
● Code refactored, renamed, trimmed down, ...
● Extra plugins: Projects, Testing, Apache Beam, Debugging, ...
● …
18. Apache Hop Incubator
● Forked PDI/Kettle 8.2 + WebSpoon + patches + plugins + …
● → Represents 20 years of development!
● Part of the ASF Incubator, close to TLP
● Version 1.0 planned for early October!
● Built continuously
● Active, quickly growing community
https://hop.apache.org
19. Why Apache Hop?
● A quickly diversifying technological landscape
→ Makes it hard to manage complexity
→ Drives the need for rapid innovation
● Development is independent of any single large corporation
● By and for data orchestration professionals
20. Guiding principles
Apache Hop aims to make data orchestration better:
● Easy: setup, build, maintenance, deployment, …
● Fast: startup time, supporting Spark, Flink & Dataflow, ...
● Transparent: before, during and after execution
● Predictable: unit and integration testing
● Innovative: need for the latest tech (digital transformation)
● Best practices: support version control, testing, CI/CD, project,
lifecycle management, ...
21. Apache Hop : key architecture features
● Metadata driven: no code generation
● Modular, pluggable architecture: can be trimmed down to under 30 MB
● Fast startup, minimal overhead
● Apache Beam with support for the Apache Spark, Apache Flink and
Google Cloud Dataflow runners
● Version controlled documentation
● Ease of use: transparent naming and easy-to-use tools
● Integration testing: critical components are tested daily
→ runtime compatibility, stability, ...
22. Apache Hop : key GUI features
● Pluggable GUI features
● Scalable interface for high-DPI displays and visually impaired users
● Perspectives for easy, fast context switching
● Designed for web browsers and mobile users
→ Single click mode for faster navigation
● Full support for 4 platforms: Windows, OSX, Linux & Web
● Support for “dark mode” themes on Linux and OSX
23. Apache Hop : key configuration features
● All GUI configuration options have a command line variant
● Single central system configuration JSON file
● Easy project and lifecycle environment configuration
● Configuration and metadata inheritance from other projects
● Standard docker container
● Stateless server supporting multi-tenancy
● Version control friendly setup
24. Apache Hop : Q4 & 2022 roadmap
● Graduation to Apache Top Level Project (TLP)
● Pluggable field expressions
● Java 11 (mirror Apache Beam)
● New execute/preview/debug GUI
● Improved cloud support
● Airflow runtime support
● Marketplace for 3rd party plugins
● ...
25. Apache Hop : Community!
● Accepted into the ASF Incubator in September 2020, ready to graduate to TLP
● Apache is a community-building organisation
● Great communities deliver great software
● During incubation we are asked to
→ Grow the community
→ Release software the Apache way
26. Apache Hop : Community
● No single company drives the software forward
● The Hop community is growing fast across all social media channels, chat
server, …
● Anyone is welcome with ideas, code, bug fixes, suggestions, documentation,
translations, …
● No bug is too small or too big to fix.
● No improvement suggestion is too small or too big to consider
27. Apache Hop and Neo4j
● Best Neo4j support (including Aura) of any platform!
● Functionality:
○ Neo4j logging perspective
○ Neo4j connection type, graph data type
○ 20+ action and transform plugins to write data to Neo4j, run
Cypher, split graphs into nodes and relationships, etc.
28. Apache Hop : Enterprise support
● Lean With Data wants to help you!
● See: www.leanwithdata.com
→ Support
→ Training
→ Certification
→ Custom development
→ Data orchestration tool migration
Lean Orchestration: professionally supported Apache Hop.
Get the best of both worlds.
Know.bi is a Lean With Data partner
30. Thank you for your interest and time!
@ApacheHop, @know_bi
https://www.linkedin.com/company/apachehop
https://www.linkedin.com/company/knowbi
https://chat.project-hop.org