Deep Dive on Amazon Neptune (DAT403) - AWS re:Invent 2018

© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Amazon Neptune Deep Dive
Brad Bebee
Principal Product Manager
Amazon Web Services
DAT403
Bruce McGaughy
Sr. Manager, Software Development
Amazon Web Services

Agenda
Building applications on highly connected data
Different types of graph models
Amazon Neptune overview
Delivering high availability and enterprise features
Getting started

Related breakouts
Wednesday, November 28
DAT360 - Neptune Performance Tuning: Get the Best out of Amazon Neptune
5:30 – 6:30PM | Mirage, Grand Ballroom D, Table 3
DAT359 - Getting Started with Amazon Neptune and Amazon SageMaker Jupyter Notebooks
2:30PM – 3:30PM | Aria West, Level 3, Starvine 10, Table 7
SRV307-R1 Building Serverless Applications Using AWS AppSync and Amazon Neptune
3:15PM – 5:30PM | MGM, Level 1, Grand Ballroom 120

Relationships enable new applications
Retail fraud detection
Restaurant recommendations
Social networks

Use cases for highly connected data
Social networking
Life Sciences
Network & IT operations
Fraud detection
Recommendations
Knowledge graphs

Whom might I know? What product should I buy?
[Figure: example graphs connecting Dave, Bill, Bob, Alice, and Sara]

Understanding who, what, when, and where…
What museums should Alice visit while in Paris?
Who painted the Mona Lisa?
What artists have paintings in The Louvre?

Navigate a web of global tax policies
“Our customers are increasingly required to navigate a complex web of global tax policies and
regulations. We need an approach to model the sophisticated corporate structures of our largest
clients and deliver an end-to-end tax solution. We use a microservices architecture approach for
our platforms and are beginning to leverage Amazon Neptune as a graph-based system to
quickly create links within the data.”
said Tim Vanderham, chief technology officer, Thomson Reuters Tax & Accounting

The challenges of building apps with highly connected data using a relational database
Unnatural for querying graph
Inefficient graph processing
Rigid schema inflexible for changing data

Different approaches for highly connected data
Purpose-built for a business process
Purpose-built to answer questions about relationships

A graph database is optimized for efficient storage and retrieval of highly connected data

Leading graph models and frameworks
PROPERTY GRAPH
Open Source Apache TinkerPop™
Gremlin Traversal Language
RESOURCE DESCRIPTION FRAMEWORK (RDF)
W3C Standard
SPARQL Query Language

A highly connected university example

Find all of the graduate students who received an undergraduate degree from the same university
[Figure: query pattern. A Graduate Student (name: ?) is Member Of a Department (name: ?), which is subOrganizationOf a University (name: ?); the student also has an Undergraduate Degree From that University.]
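The query pattern above can be sketched in plain Python over a handful of subject-predicate-object triples. This is only an illustration of the pattern match, not Gremlin or SPARQL and not how Neptune evaluates queries; the sample data, the predicate spellings, and the helper names `objects` and `same_university_grads` are all hypothetical.

```python
# Hypothetical sample data: (subject, predicate, object) triples.
TRIPLES = {
    ("alice", "type", "GraduateStudent"),
    ("bob", "type", "GraduateStudent"),
    ("alice", "memberOf", "cs_dept"),
    ("bob", "memberOf", "math_dept"),
    ("cs_dept", "subOrganizationOf", "univ_a"),
    ("math_dept", "subOrganizationOf", "univ_a"),
    ("alice", "undergraduateDegreeFrom", "univ_a"),
    ("bob", "undergraduateDegreeFrom", "univ_b"),
}


def objects(triples, subject, predicate):
    """Return every object reachable from `subject` over `predicate`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def same_university_grads(triples):
    """Find (student, department, university) matches for the pattern:
    student is a GraduateStudent, memberOf a department that is a
    subOrganizationOf the university the student's undergraduate
    degree is from."""
    students = sorted(
        s for s, p, o in triples if p == "type" and o == "GraduateStudent"
    )
    matches = []
    for student in students:
        degree_universities = objects(triples, student, "undergraduateDegreeFrom")
        for dept in sorted(objects(triples, student, "memberOf")):
            for univ in sorted(objects(triples, dept, "subOrganizationOf")):
                if univ in degree_universities:
                    matches.append((student, dept, univ))
    return matches


print(same_university_grads(TRIPLES))
```

With this sample data only alice matches, because bob's undergraduate degree is from a different university than the one his department belongs to.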

Challenges of existing graph databases
Difficult to maintain high availability
Difficult to scale
Limited support for open standards
Too expensive

Amazon Neptune
Fully managed graph database
FAST
Query billions of relationships with millisecond latency
RELIABLE
6 replicas of your data across 3 AZs with full backup and restore
EASY
Build powerful queries easily with Gremlin and SPARQL
OPEN
Supports Apache TinkerPop & W3C RDF graph models

Amazon Neptune high level architecture
Bulk load from S3
Database Mgmt.

Fully managed service
BENEFITS
Easily configurable via the console
Multi-AZ high availability
Support for up to 15 read replicas
Supports encryption at rest
Supports encryption in transit (TLS)
Backup and restore, point-in-time recovery

Security
• Network isolation via Virtual Private Cloud
• Use security groups to control ingress
• HTTPS encrypted client connections using TLS 1.2
• Encryption at rest using AWS Key Management Service (KMS)
• AWS Identity and Access Management (IAM) policies to secure creation of Neptune resources
• IAM-based Authentication for Access control
• Each request is signed with AWS Signature Version 4
• Libraries provided for Gremlin and SPARQL clients
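As a rough illustration of what "signed with AWS Signature Version 4" involves, the sketch below shows only the signing-key derivation step, using Python's standard `hmac` and `hashlib` modules. The service name `neptune-db`, the helper names, and the placeholder credentials are assumptions for illustration; a complete signature also needs a canonical request and a string to sign, which the provided client libraries build for you.

```python
import hashlib
import hmac


def _hmac_sha256(key: bytes, message: str) -> bytes:
    """HMAC-SHA256 of `message` under `key`, returned as raw bytes."""
    return hmac.new(key, message.encode("utf-8"), hashlib.sha256).digest()


def derive_signing_key(secret_key: str, date_stamp: str, region: str,
                       service: str = "neptune-db") -> bytes:
    """Derive a Signature Version 4 signing key.

    date_stamp is YYYYMMDD. The default service name is an assumption
    for Neptune data-plane requests.
    """
    k_date = _hmac_sha256(("AWS4" + secret_key).encode("utf-8"), date_stamp)
    k_region = _hmac_sha256(k_date, region)
    k_service = _hmac_sha256(k_region, service)
    return _hmac_sha256(k_service, "aws4_request")


def sign(signing_key: bytes, string_to_sign: str) -> str:
    """Hex signature of an already-built string to sign."""
    return hmac.new(signing_key, string_to_sign.encode("utf-8"),
                    hashlib.sha256).hexdigest()


# Placeholder values only; never hard-code real credentials.
key = derive_signing_key("EXAMPLE-SECRET-KEY", "20181128", "us-east-1")
print(len(key), sign(key, "example string to sign")[:16])
```

The key is scoped to one date, region, and service, so a key derived for one day or region cannot sign requests for another.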

Neptune GA customers

Neptune general availability
• Announced on 5/30/2018
• Regions
• US East (N. Virginia), US East (Ohio), US West (Oregon), EU (Ireland), EU (London), EU (Frankfurt)
• https://aws.amazon.com/about-aws/whats-new/2018/05/amazon-neptune-is-now-generally-available/

Neptune: Distributed storage architecture
• Performance, availability, durability
• Scale-out replica architecture
• Shared storage volume with 10 GB segments striped across hundreds of nodes
• Data is replicated 6 times across 3 AZs
• Hotspot rebalance, fast database recovery
• Log applicator embedded in storage layer
[Figure: a primary and replicas across AZ1, AZ2, and AZ3, each running Gremlin / SPARQL, transactions, and caching, on top of a shared storage volume; delivered as a managed service]
• Ship only the log
• Less work on engine
• Minimizes network traffic

Six copies across three availability zones
4 out of 6 write quorum; 3 out of 6 read quorum
Many failures possible: disk (segment loss), node, AZ network, AZ power, etc.
Continuous monitoring for failures
Automatic repair by peer-to-peer gossiping and replication
[Figure: two failure scenarios across AZ 1, AZ 2, and AZ 3. Losing one AZ keeps read and write availability; losing one AZ plus one more node keeps read availability.]
6-way replicated storage to survive “AZ+1” failure

Why are 6 copies necessary?
• You need replication across 3 AZs to tolerate an AZ failure.
• Why not just 1 copy per AZ?
• An AZ + 1 node failure would break the quorum
• Also important for performance
• Hides long tail network latencies
• Only 3/6 needed to ack reads
• Only 4/6 needed to ack writes
[Figure: 3 copies, one per AZ (2/3 read, 2/3 write): quorum breaks on AZ + 1 node failure]
[Figure: 6 copies, two per AZ (3/6 read, 4/6 write): quorum survives AZ failure]
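The quorum arithmetic behind this comparison can be checked with a few lines of Python. This is a minimal sketch: the function names are hypothetical, and it assumes three AZs with the same number of copies in each and that the extra failed node sits in a surviving AZ.

```python
def surviving_copies(copies_per_az, azs_down=0, extra_nodes_down=0, num_azs=3):
    """Copies still reachable after losing whole AZs plus extra nodes.

    Assumes the extra failed nodes are in AZs that are still up.
    """
    total = num_azs * copies_per_az
    lost = azs_down * copies_per_az + extra_nodes_down
    return max(total - lost, 0)


def quorum_status(copies_per_az, read_quorum, write_quorum,
                  azs_down=0, extra_nodes_down=0):
    """Report whether read and write quorums can still be met."""
    alive = surviving_copies(copies_per_az, azs_down, extra_nodes_down)
    return {"read": alive >= read_quorum, "write": alive >= write_quorum}


scenarios = [
    ("3 copies, AZ failure", quorum_status(1, 2, 2, azs_down=1)),
    ("3 copies, AZ + 1", quorum_status(1, 2, 2, azs_down=1, extra_nodes_down=1)),
    ("6 copies, AZ failure", quorum_status(2, 3, 4, azs_down=1)),
    ("6 copies, AZ + 1", quorum_status(2, 3, 4, azs_down=1, extra_nodes_down=1)),
]
for label, status in scenarios:
    print(f"{label}: {status}")
```

With one copy per AZ, an AZ failure plus one more node leaves a single copy, which satisfies neither quorum. With two copies per AZ, the same failure leaves three copies: enough for the 3/6 read quorum but not the 4/6 write quorum.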

Continuous backup
[Figure: timeline for Segment 1, Segment 2, and Segment 3 showing segment snapshots, log records, and the recovery point]
• Neptune takes periodic snapshots of each segment in parallel
• Continuously streams the redo logs to Amazon Simple Storage Service (Amazon S3)
• Backup happens continuously without performance or availability impact
• At restore, retrieve the appropriate segment snapshots and log streams to storage nodes
• Apply log streams to segment snapshots in parallel and asynchronously
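The restore steps above can be modeled as a toy in Python: each segment starts from its own snapshot, and only log records newer than that snapshot and no later than the recovery point are applied, with segments processed in parallel. The data layout (a log sequence number per record, key/value state) and the function names are assumptions made for illustration; they do not describe Neptune's storage internals.

```python
from concurrent.futures import ThreadPoolExecutor


def restore_segment(snapshot, log_records, recovery_point):
    """Rebuild one segment.

    snapshot: (snapshot_lsn, state_dict) taken at or before recovery_point.
    log_records: iterable of (lsn, key, value) redo records.
    """
    snapshot_lsn, state = snapshot
    restored = dict(state)
    for lsn, key, value in sorted(log_records, key=lambda record: record[0]):
        # Skip records already in the snapshot or past the recovery point.
        if snapshot_lsn < lsn <= recovery_point:
            restored[key] = value
    return restored


def restore_volume(segments, recovery_point):
    """Restore every segment independently and in parallel."""
    with ThreadPoolExecutor() as pool:
        futures = {
            name: pool.submit(restore_segment, snapshot, logs, recovery_point)
            for name, (snapshot, logs) in segments.items()
        }
        return {name: future.result() for name, future in futures.items()}


# Hypothetical data: two segments with snapshots taken at different times.
segments = {
    "segment-1": ((10, {"a": 1}), [(5, "a", 0), (12, "a", 2), (20, "a", 3)]),
    "segment-2": ((8, {"b": 1}), [(9, "b", 5), (15, "c", 7)]),
}
print(restore_volume(segments, recovery_point=15))
```

Because each segment only needs its own snapshot and its own log stream, no segment waits on another, which is what makes the parallel, asynchronous restore possible.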

Instant crash recovery
Traditional Database
• Have to replay logs since the last checkpoint
• Typically 5 minutes between checkpoints
• Often single threaded
• Crash at T0 requires a re-application of the redo log since last checkpoint
Amazon Neptune
• Normal reads also replay the logs in the storage layer
• Parallel, distributed, asynchronous
• No replay for startup
• Crash at T0 will result in redo logs being applied to each segment on demand, in parallel, asynchronously
[Figure: timelines of checkpointed data and redo log up to the crash at T0]

Database backtrack
Backtrack brings the database to a point in time without requiring restore from backups
• Backtracking from an unintentional insert or delete
• Backtrack is not destructive. You can backtrack multiple times to find the right point in time.
[Figure: timeline t0 to t4. Rewind to t1, then rewind to t3; the rewound spans become invisible.]

Simplified storage management
• Automatic storage scaling up to 64 TB with no performance impact
• Instantly create user snapshots with no performance impact
• Up to 64 TB of storage, auto-incremented in 10 GB units

Neptune read replicas
[Figure: Neptune Primary (30% read, 70% write) sends page cache updates to a Neptune Replica (100% new reads); both sit on shared Multi-AZ storage]
Amazon Neptune read scaling
Performance
• Applications can scale out read traffic across up to 15 read replicas
Low Replica Lag
• Typically < 10ms
• Master ships redo logs to replica
• Cached pages have redo applied
• Un-cached pages from shared storage
Availability
• Failing database nodes are automatically detected and replaced
• If primary fails, a replica replaces it (typically < 60s failover time)
• Primary upgrade by forced failover

Monitoring
AWS CloudTrail
• Log all Neptune API calls to S3 bucket
Event Notifications
• Create Amazon Simple Notification Service (Amazon SNS) subscription via AWS Command Line Interface (AWS CLI) or AWS SDK
Amazon CloudWatch
CPUUtilization GremlinRequestsPerSec Http429 SparqlErrors
ClusterReplicaLag Http100 Http500 SparqlRequests
ClusterReplicaLagMaximum Http101 Http501 SparqlRequestsPerSec
ClusterReplicaLagMinimum Http200 LoaderErrors StatusErrors
EngineUptime Http400 LoaderRequests StatusRequests
FreeableMemory Http403 NetworkReceiveThroughput VolumeBytesUsed
GremlinErrors Http405 NetworkThroughput VolumeReadIOPs
GremlinRequests Http413 NetworkTransmitThroughput VolumeWriteIOPs

View our blog posts on using Neptune and Jupyter Notebooks
https://aws.amazon.com/blogs/database/analyze-amazon-neptune-graphs-using-amazon-sagemaker-jupyter-notebooks/

Launch Amazon Neptune via AWS CloudFormation
https://docs.aws.amazon.com/neptune/latest/userguide/quickstart.html

Check out Amazon Neptune samples on GitHub
https://github.com/aws-samples/amazon-neptune-samples

Check out Amazon Neptune tools on GitHub
https://github.com/awslabs/amazon-neptune-tools

Thank you!
Brad Bebee
beebs@amazon.com
Bruce McGaughy
mcgaughy@amazon.com
