Extensible RESTful
Applications
with Apache TinkerPop
Graph Day SF 2018
About Us
LIKES
{
"first_name": "Varun",
"last_name": "Ganesh"
}
{
"first_name": "Harshvardhan",
"last_name": "Joshi"
}
LIKES
About
• Connecting to business stacks
• Visualisation
• Custom-built infographics
• Natural-language-generated insights
• Export and share stories: email, PowerPoint, TV, web, embedded SDK
• Automating the process of data storytelling
• For more information, visit www.nugit.co
Agenda
• Use Cases
• The Slack APIs
• Defining the Entities
• Graph Design and Considerations
• Making the Graph RESTful
• Building a DSL
• Testing the Application
• Scaling the Graph
Use Cases – Communities
• View contribution to communication
• Participation across channels
• Identify collaborative groups
• Users connected by mentions and reactions
• Identify influential users per channel
Use Cases – Top Posts
• Highlight engaging conversations
• Top videos, GIFs, links
• Get insights across channels
Defining the Entities
Top Post:
• Files shared
• Messages with attachments
• Posts without replies or reactions are not considered
Defining the Entities
Notable Message:
• Messages with reactions or replies
• Replies and comments that have reactions
• Other alerts that gather reactions
Defining the Entities
Mention:
• Replies and comments can have mentions too
• Ignore mentions that are unnecessary or already captured in a relationship
Defining the Entities
• Narrows down data required for the use case
• Helps “whiteboarding” process for graph design
• Allows defining schema for payloads
• Requires understanding the nuances of the platform
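The entity definitions above can be read as simple predicates over a Slack message payload. The following is a minimal sketch, assuming the field names `file`, `attachments`, `reactions`, `replies` and `reply_count` that appear in the sample payloads. The function names are hypothetical and are not taken from the talk.

```python
def has_engagement(message):
    """True if the message gathered at least one reaction or reply."""
    return (bool(message.get("reactions"))
            or bool(message.get("replies"))
            or message.get("reply_count", 0) > 0)


def is_top_post(message):
    """Top Post: a shared file or a message with attachments, with engagement."""
    shares_content = bool(message.get("file")) or bool(message.get("attachments"))
    return shares_content and has_engagement(message)


def is_notable_message(message):
    """Notable Message: any message with reactions or replies."""
    return has_engagement(message)
```

Keeping the rules as small predicates makes it easy to adjust them when the definition of an entity changes during whiteboarding.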
Graph Design and Considerations
• Team node acts as root node
• Allows maintaining separate graphs for different organisations
Graph Design and Considerations
• Top posts, notable messages are
both message nodes
• Differentiated using edge labels
• Edge traversals favoured over
property lookup
Graph Design and Considerations
• Any user can comment on, react to
or be mentioned in any message
• Reaction type modelled as edge
property
• Efficient as use-case does not need
filtering by reaction type
Graph Design and Considerations
• Same file shared across channels
shares common pool of reactions
• Schema respects Slack specific
behaviour
• Handles idempotency based on
unique ID maintained by Slack
Graph Design and Considerations
{
"type": "message",
"user": "U2FQG2G9F",
"text": "next time you want cereal: \n<https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock>",
"attachments": [
{
"service_name": "Instagram",
"title": "Instagram post by @therock • Nov 28, 2017 at 7:14pm UTC",
"title_link": "https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock",
"text": "346.3k Likes, 2,167 Comments - @therock on Instagram:”……”",
"fallback": "Instagram: Instagram post by @therock • Nov 28, 2017 at 7:14pm UTC",
"image_url": "https://scontent-iad3-1.cdninstagram.com/t51.2885-15/e35/24178_n.jpg",
"from_url": "https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock",
"image_width": 334,
"image_height": 250,
"image_bytes": 178559,
"service_icon": "https://www.instagram.com/static/images/ico/appl.png/932e4d9af891.png",
"id": 1
}
],
"thread_ts": "1511936426.000178",
"reply_count": 3,
"replies": [
{
"user": "U193XDML7",
"ts": "1511953167.000138"
},
{
"user": "U2FQG2G9F",
"ts": "1511953180.000044"
},
{
"user": "U193XDML7",
"ts": "1511953192.000230"
}
],
"ts": "1511936426.000178",
"reactions": [
{
"name": "smile",
"users": [
"U193XDML7"
],
"count": 1
},
{
"name": "obesecat",
"users": [
"U193XDML7"
],
"count": 1
}
]
}
The Slack APIs
Endpoint:
https://slack.com/api/conversations.history
Endpoint:
https://slack.com/api/conversations.history
[
{
"type": "message",
"user": "U4BPQR94L",
"text": "Yinghui Malmsteen <@U2FQG2G9F>\n<https://www.youtube.com/watch?v=D4OxW_0qqv8>",
"attachments": [
{
...
}
],
"ts": "1536057373.000100",
"reactions": [
{
"name": "flag-se",
"users": [
"U58LYK8Q6"
],
"count": 1
}
]
}
]
[
{
"user": "U2Q2U37SA",
"inviter": "U0LPSJQR0",
"text": "<@U2Q2U57SA> has joined the channel",
"type": "message",
"subtype": "channel_join",
"ts": "1536138265.000200"
}
]
The Slack APIs
[
{ "id": "U4C0FDU2J",
"team_id": "T028ZLMQN",
"name": "friendlybotdev",
"deleted": true,
"profile": {
"title": "",
"phone": "",
"skype": "",
"real_name": "Friendly Bot",
"real_name_normalized": "Friendly Bot",
"display_name": "friendlybotdev",
"display_name_normalized": "friendlybotdev",
"status_text": "",
"status_emoji": "",
"status_expiration": 0,
"avatar_hash": "123456",
"bot_id": "B4B47T0G3",
"api_app_id": "A4B92ZEER",
"always_active": true,
"image_original": "https://slack-edge.com/2017-06-21/123456_original.png",
"first_name": "Friendly",
"last_name": "Bot",
"image_24": "https://slack-edge.com/2017-06-21/123456_24.png",
"image_32": "https://slack-edge.com/2017-06-21/123456_32.png",
"image_48": "https://slack-edge.com/2017-06-21/123456_48.png",
"image_72": "https://slack-edge.com/2017-06-21/123456_72.png",
"image_192": "https://slack-edge.com/2017-06-21/123456_192.png",
"image_512": "https://slack-edge.com/2017-06-21/123456_512.png",
"image_1024": "https://slack-edge.com/2017-06-21/123456_1024.png",
"status_text_canonical": "",
"team": "T028Z5MQN"
},
"is_bot": true,
"is_app_user": false,
"updated": 1517305013
}
]
[
{ "id": "C8KMHCN5D",
"name": "arandomchannel",
"is_channel": true,
"created": 1507613685,
"creator": "U5BG5XU6T",
"is_shared": false,
"is_member": true,
"is_private": false,
"last_read": "1533892238.000324",
"latest": {
"type": "message",
"user": "U84K3ZTF9",
"text": "let's meetup tomorrow",
"ts": "1536139470.000100"
},
"unread_count": 7,
"unread_count_display": 7,
"members": [
"U08ED90CD",
"U0LPSJQR0",
"U193XDML7",
"U9LKWV9C1",
"UBJ4CHV5L" ],
"topic": {
"value": "place for people who are interested in sharing and learning",
"creator": "U5BGLXU6T",
"last_set": 1507613720
},
"purpose": {
"value": "",
"creator": "",
"last_set": 0
},
"previous_names": []
}
]
Endpoint:
https://slack.com/api/users.list
Endpoint:
https://slack.com/api/channels.info
The Slack APIs
The Journey So Far
• Defining entities and modelling them into Graph
• Iterative feedback-driven process
• Understanding the data available from the API
• Identifying unique IDs
• Filtering out required fields
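The last two points, identifying IDs and filtering out required fields, can be sketched as a small pre-processing step over a raw `conversations.history` message. This is only an illustration: the `REQUIRED_FIELDS` tuple is an assumed subset, and `filter_message` is a hypothetical helper that the talk does not show.

```python
import re

# Assumed subset of fields worth keeping; the talk does not list them exhaustively.
REQUIRED_FIELDS = ("ts", "user", "text", "thread_ts",
                   "attachments", "reactions", "replies")

# Slack encodes a user mention in message text as <@USERID> or <@USERID|name>.
MENTION_PATTERN = re.compile(r"<@([UW][A-Z0-9]+)(?:\|[^>]*)?>")


def filter_message(raw):
    """Keep only the required fields and derive the list of mentioned user IDs."""
    message = {key: raw[key] for key in REQUIRED_FIELDS if key in raw}
    message["mentions"] = MENTION_PATTERN.findall(raw.get("text", ""))
    return message
```

Deriving `mentions` at this stage means the downstream schema can validate it like any other field.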
Data Ingestion and Extraction
• Apache Flink cluster retrieves, parses and filters Slack data
• GraphQL service requests data for visualization
• Flask REST service ingests/queries data to/from TinkerPop
[Diagram labels: POST, PUT, GET; Gremlin-Python; Gremlin Bytecode]
Why TinkerPop?
• Abstraction that lets us avoid vendor lock-in
• Reduces rework when switching data stores
• Gremlin query language
• Hadoop and SparkGraphComputer
Making the Graph RESTful
• Defining REST Endpoints
• Defining the Resources
• Remote Traversals
Defining REST Endpoints
• Write endpoints for seeding
• POST /teams/<team_uid>/channels
• POST /teams/<team_uid>/channels/<channel_uid>/messages
• Read endpoints for queries
• GET /teams/<team_uid>/top_posts
• Handling Idempotency
• Replace default strategy with "ElementIdStrategy"
• Enables creation of nodes with Slack-specific unique IDs
// scripts/empty-sample.groovy
globals << [g : graph.traversal(), sg: graph.traversal().withStrategies(ElementIdStrategy.build().create())]
Making the Graph RESTful
• Setting up REST Endpoints
• Defining the Resources
• Remote Traversals
Defining the Resources
from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema
from marshmallow.exceptions import ValidationError
...
class MessageSchema(Schema):
    """ Holds all the required fields for a message object."""
    ts = fields.Float(required=True)
    text = fields.Str()
    comment = fields.Str()
    subtype = fields.Str()
    bot_id = fields.Str(validate=is_bot_uid)
    user = fields.Str(validate=is_user_uid)
    thread_ts = fields.Str()
    file_share = fields.Nested(FileShareSchema, load_from="file")
    attachments = fields.Nested(AttachmentSchema, many=True)
    reactions = fields.Nested(ReactionSchema, many=True)
    comments = fields.Nested(CommentSchema, many=True, load_from="replies")
    mentions = fields.List(fields.Str(validate=is_user_uid))
class AttachmentSchema(Schema):
    """ Holds all the required fields for an Attachment object."""
class ReactionSchema(Schema):
    """ Holds all the required fields for a reaction object."""
class CommentSchema(Schema):
    """ Holds all the required fields for a comment object."""
...
• Organized code with single point of
reference
Defining the Resources
from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema
from marshmallow.exceptions import ValidationError
...
class MessageSchema(Schema):
    """ Holds all the required fields for a message object."""
    @validates_schema
    def validate_message(self, data):
        """ Validate if the message contains any of comments, mentions or reactions. """
        if not any([f(data) for f in (has_comments, has_mentions, has_reactions)]):
            raise ValidationError("The message must contain comments, mentions or reactions")
    ts = fields.Float(required=True)
    text = fields.Str()
    comment = fields.Str()
    subtype = fields.Str()
    bot_id = fields.Str(validate=is_bot_uid)
    user = fields.Str(validate=is_user_uid)
    thread_ts = fields.Str()
    file_share = fields.Nested(FileShareSchema, load_from="file")
    attachments = fields.Nested(AttachmentSchema, many=True)
    reactions = fields.Nested(ReactionSchema, many=True)
    comments = fields.Nested(CommentSchema, many=True, load_from="replies")
    mentions = fields.List(fields.Str(validate=is_user_uid))
class AttachmentSchema(Schema):
    """ Holds all the required fields for an Attachment object."""
class ReactionSchema(Schema):
    """ Holds all the required fields for a reaction object."""
class CommentSchema(Schema):
    """ Holds all the required fields for a comment object."""
...
• Organized code with single point of
reference
• Validate data before ingestion
• Enforce types and required fields
from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema
from marshmallow.exceptions import ValidationError
...
class MessageSchema(Schema):
    """ Holds all the required fields for a message object."""
class AttachmentSchema(Schema):
    """ Holds all the required fields for an Attachment object."""
    title = fields.Str()
    fallback = fields.Str()
    text = fields.Str()
    thumb_url = fields.Str()
    image_url = fields.Str()
    title_link = fields.Str()
    @post_load
    def reshape_attachment(self, data):
        """ Apply required transformations on the Attachment object. """
        # Create a post_title field
        collapse_keys(data, "post_title", *("fallback", "title", "text"))
        # Create a post_thumbnail field
        collapse_keys(data, "post_thumbnail", *("thumb_url", "image_url", "title_link"))
        # Set post_type to URL
        data["post_type"] = "URL"
class ReactionSchema(Schema):
    """ Holds all the required fields for a reaction object."""
class CommentSchema(Schema):
    """ Holds all the required fields for a comment object."""
class FileShareSchema(Schema):
    """ Holds all the required fields for a File Share object."""
class UserSchema(Schema):
    """ Holds all the required fields for a User object."""
...
• Organized code with single point of
reference
• Validate data before ingestion
• Enforce types and required fields
• Normalize fields with post-
processing
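The slides call a `collapse_keys` helper without showing its body. One plausible behaviour, offered here purely as an assumption, is to take the first non-empty value among the source keys in priority order, store it under the target key, and drop the source keys:

```python
def collapse_keys(data, target, *sources):
    """Set data[target] to the first non-empty source value and drop the sources.

    Hypothetical implementation; the talk only shows the call sites.
    """
    value = None
    for key in sources:
        candidate = data.pop(key, None)
        # Keep the first truthy value; later keys are only removed.
        if value is None and candidate:
            value = candidate
    if value is not None:
        data[target] = value
    return data
```

With this reading, an attachment that has an empty `fallback` but a populated `title` ends up with `post_title` set to the title, which matches the normalisation the bullet describes.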
Defining the Resources
Making the Graph RESTful
• Schema enforcement and validation
• Handling Idempotency of endpoints
• Custom Traversal Source
Remote Traversals
• Bytecode sent over network instead of string
• Allows using custom traversal source for a Domain Specific Language (DSL)
from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection
...
conn = DriverRemoteConnection(GREMLIN_SERVER_HOST, 'sg')
slack = Graph().traversal(SlackTraversalSource).withRemote(conn)
Building a DSL
• Motivations
• Custom Workflows
Building a DSL - Motivations
class SlackTraversalSource(BaseTraversalSource):
    """ Module to initialise a Graph with the methods listed under SlackTraversal. """
    def __init__(self, *args, **kwargs):
        super(SlackTraversalSource, self).__init__(*args, **kwargs)
        self.graph_traversal = SlackTraversal
    def channels(self, *channel_ids):
        """ Shorthand to identify all channel nodes"""
        traversal = self.get_graph_traversal()
        traversal.bytecode.add_step("V")
        traversal.bytecode.add_step("hasLabel", NODES.channel)
        if channel_ids:
            traversal.bytecode.add_step("has", "__id", P.within(channel_ids))
        return traversal
• Custom traversal source can also specify useful shorthands
• E.g. Traversing to all the Channel nodes
Building a DSL - Motivations
class SlackTraversal(BaseTraversal):
    def addPartOfChannelEdges(self, channel_uid, *user_uids, **kwargs):
        """ Add an edge to a channel from the users who were/are a part of the channel. """
        for user_uid in user_uids:
            edge_uid = construct_uid(user_uid, channel_uid, EDGES.part_of.name, delim="|")
            (self.getOrAddEdgeFrom(edge_label=EDGES.part_of, edge_uid=edge_uid,
                                   node_label=NODES.user, node_uid=user_uid)
             .upsertProperties(kwargs.get("properties")).inV())
        return self
• Custom traversal source specifies business logic behind traversals
• E.g. Connecting a User node to a Channel node
Building a DSL - Motivations
• BaseTraversal handles creation of nodes and edges
• These methods should guarantee idempotency
• E.g. Creation of edges between two nodes…
• ...checks for an existing edge
Building a DSL - Motivations
from gremlin_python.process.graph_traversal import GraphTraversal
from gremlin_python.process.graph_traversal import GraphTraversalSource, __
from gremlin_python.process.traversal import T
class BaseTraversal(GraphTraversal):
    def getOrAddEdgeFrom(self, edge_label, edge_uid, node_label, node_uid):
        """
        Adds an edge from the node with the given label and uid only if the edge doesn’t exist.
        """
        # coalesce() returns the first branch that yields a result:
        # the existing edge if present, otherwise a newly added one.
        return self.coalesce(
            __.inE(edge_label).hasId(edge_uid).and_(
                __.outV().hasId(node_uid), __.outV().hasLabel(node_label)),
            __.addE(edge_label).property(T.id, edge_uid).from_(
                __.V().getNode(node_label, node_uid)))
• The edge is created only if it doesn’t already exist
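The get-or-create idea can be illustrated without a graph server. The sketch below is a toy in-memory stand-in, not the talk's implementation: `InMemoryGraph` is hypothetical, and `construct_uid` is an assumed version of the helper used in `addPartOfChannelEdges`, taken here to join its parts with a delimiter.

```python
def construct_uid(*parts, delim="|"):
    """Build a deterministic ID by joining the parts with a delimiter."""
    return delim.join(str(part) for part in parts)


class InMemoryGraph:
    """Toy stand-in for a graph store that accepts caller-supplied IDs."""

    def __init__(self):
        self.edges = {}

    def get_or_add_edge(self, label, from_uid, to_uid):
        """Return the existing edge for this ID, creating it only if absent."""
        edge_uid = construct_uid(from_uid, to_uid, label)
        if edge_uid not in self.edges:
            self.edges[edge_uid] = {"label": label, "from": from_uid, "to": to_uid}
        return self.edges[edge_uid]
```

Because the edge ID is derived from the two endpoint IDs and the label, replaying the same seeding request yields the same ID and therefore no duplicate edge.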
def build_visualization(self, traversal_source, **kwargs):
    """ The below are standardized steps that are
    required to generate data for any visualization."""
    return (self.start(traversal_source)
            .filterByDate(self.date_dimension,
                          kwargs.get("start_time"),
                          kwargs.get("end_time"))
            .filterByFields(self.filters_map,
                            kwargs.get("filters"))
            .sortByFields(self.sorting_map,
                          kwargs.get("sort_field"),
                          kwargs.get("sort_direction"))
            .buildObject(self.object_map).toList())
Building a DSL – Custom Workflows
• Standardized steps for generating a visualization are defined in the BaseTraversal
• Custom maps define traversal paths for fields that vary across visualizations
Building a DSL – Custom Workflows
# Sample filter from frontend
filter_obj = {'_and': [{"field": 'reactions', '_gte': 100},
                       {"field": 'post_creator',
                        '_in': ['bob', 'chloe']
                        }]}
filter_map = {"post_creator": lambda pred:
                  __.in_(EDGES.created_post).has(USER.display_name, pred),
              "reactions": lambda pred:
                  __.inE(EDGES.reacted_to).count().is_(pred)
              }
object_map = {
    "post_creator": {"uid": [__.in_(EDGES.created_post).values("__id"),
                             __.constant("")],
                     "image": ...  # define similar path here
                     },
    "reactions":
        __.inE(EDGES.reacted_to).groupCount().by(__.values(REACTION.name))
}
start = lambda traversal_source: traversal_source.posts()
# DSL generates the required lower level base traversals
(slack.posts().where(
    __.and_(
        __.inE(EDGES.reacted_to).count().is_(P.gte(100)),
        __.in_(EDGES.created_post).has(USER.display_name,
                                       P.within(['bob', 'chloe']))))
 .project("post_creator", "reactions")
 .by(__.project("image", "display_name", "uid")
     # one by() modulator per projected key
     .by(__.in_(EDGES.created_post).values(USER.image))
     .by(__.in_(EDGES.created_post).values(USER.display_name))
     .by(__.in_(EDGES.created_post).values("__id")))
 .by(__.inE(EDGES.reacted_to).groupCount()).toList())
# Inject maps into DSL methods
(start(slack)
 .filterByFields(self.filters_map, kwargs.get("filters"))
 .buildObject(self.object_map)
 .toList())
• The DSL takes in functions/paths that map fields to their traversals
• Maps customized based on the visualization that is needed
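To make the field-map idea concrete without a running graph, here is a small in-memory analogue. It assumes posts are plain dictionaries and that the only operators are the `_gte` and `_in` shown in the sample filter; `FIELD_MAP`, `OPERATORS` and `matches` are hypothetical names. In the talk each map entry is a Gremlin traversal rather than a Python lambda.

```python
import operator

# Operators supported by this sketch; the sample filter uses "_gte" and "_in".
OPERATORS = {
    "_gte": operator.ge,
    "_in": lambda value, options: value in options,
}

# Maps a field name to a function that extracts its value from a post.
FIELD_MAP = {
    "reactions": lambda post: len(post["reactions"]),
    "post_creator": lambda post: post["creator"],
}


def matches(post, filter_obj):
    """Return True if the post satisfies every clause under "_and"."""
    for clause in filter_obj["_and"]:
        value = FIELD_MAP[clause["field"]](post)
        for name, operand in clause.items():
            if name != "field" and not OPERATORS[name](value, operand):
                return False
    return True
```

Swapping in a different `FIELD_MAP` changes which fields a visualization can filter on without touching `matches`, which is the separation the bullets describe.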
Building a DSL – Custom Workflows
{
"reactions": {
"palm_tree": 82,
"robot_face": 18
},
"post_creator": {
"image": "https://url_of_image.jpg",
"display_name": "chloe",
"uid": "U024ZH7HL"
}
}
• The traversals generated churn out the final response objects
• Objects rendered into visualizations by the client
Testing the Application
• Unit Tests
• Validating traversals on Gremlin Server
[Diagram, test cycle: Write a failing test; Write code to make the test pass; Start Gremlin Server; Use Fixtures; Check if test passes]
class TestNodeMethods(object):
    """ Test methods that help in retrieval and creation of Nodes. """
    def test_node_retrieval(self, graph):
        """ Test if getNode retrieves an existing node. """
        assert (graph.V().getNode(label="person", uid=100)
                .count().next() == 1)
        assert (graph.V().getNode(label="person", uid=101)
                .count().next() == 1)
Testing Our Application – Unit Testing
def getNode(self, label, uid):
    """
    Returns the node with the given label and uid.
    Args: label (string): The label of the node to return
          uid (string): Unique ID of the node
    Raises: StopIteration: Node with the given label and uid does not exist
    """
    return self.and_(__.hasLabel(label), __.has(T.id, uid))
Testing Our Application – Unit Testing
$ bin/gremlin-server.sh conf/gremlin-server-neo4j-python.yaml
class TestBasicTraversal(object):
    """
    Tests for methods that help create edges or nodes
    and methods that help populate the properties of these objects.
    """
    @pytest.fixture(scope="module")
    def graph(self):
        """ Graph with two nodes and one edge connecting them. """
        graph = Graph().traversal(CerebroTraversalSource).withRemote(
            DriverRemoteConnection(GREMLIN_SERVER_HOST,
                                   GREMLIN_SERVER_TRAVERSER))
        graph.V().clear()
        from_node = graph.addV("person").property(T.id, 100).next()
        to_node = graph.addV("person").property(T.id, 101).next()
        (graph.addE("knows").from_(from_node).to(to_node)
         .property("__id", "1")
         .next())
        yield graph
        graph.V().clear()
Testing Our Application – Unit Testing
[
{
"reactions": [
{
"name": "joy",
"users": [
"U5K7JUATE"
]
}
],
"attachments": [
{
...
}
],
"text": "<https://www.youtube.com/watch?v=4iEh1ykb13w>",
"ts": "1465895473.000050",
"user": "U37BF9457",
"type": "message"
}
]
Testing Our Application – Unit Testing
class MessageSchema(Schema):
    """ Holds all the required fields for a message object."""
    . . .
• Fixture used to test if the MessageSchema class is implemented correctly
[
{
"reactions": [
{
"name": "joy",
"users": [
"U5K7JUATE"
]
}
],
"attachments": [
{...}
],
"text": " <@U123456> <https://www.youtube.com/watch?v=4iEh1ykb13w>",
"mentions": [
"U123456"
],
"ts": "1465895473.000050",
"type": "message"
}
]
Testing Our Application – Unit Testing
class MessageSchema(Schema):
    """ Holds all the required fields for a message object."""
    mentions = fields.List(fields.Str(validate=is_user_uid))
• MessageSchema needs to include mentions
• Update the fixture to be able to test that the schema includes mentions
• Need to validate if traversals pick up mentions
gremlin> graph.io(graphson()).writeGraph("graph_name.json")
Testing Our Application – Unit Testing
[Diagram, test cycle with an added step: Update JSON & Generate GraphSON]
@pytest.fixture(scope="module")
def slack_graph():
    """ Open a subgraph on localhost for testing. """
    slack.V().clear()
    slack_client = Client(GREMLIN_SERVER_HOST, SLACK_TRAVERSER)
    path_to_fixture = str(Path.cwd().joinpath(
        "tests/fixtures/slack_graph.json"))
    graphson_statement = 'graph.io(graphson()).readGraph("{}")'.format(
        path_to_fixture)
    slack_client.submit(graphson_statement).all().result()
    yield slack
    slack.V().clear()
Testing Our Application – Unit Testing
Testing the Application – CI/CD
• Automated tests using CircleCI
• Custom Configuration for Gremlin Server
• Caching Dependencies for Faster Tests
steps: #CircleCI 2.0
  ...
  - run:
      command: |
        if [ ! -d ./apache-tinkerpop-gremlin-server-3.3.3 ]; then
          curl -O https://archive.apache.org/dist/tinkerpop/3.3.3/apache-tinkerpop-gremlin-server-3.3.3-bin.zip
          unzip -q apache-tinkerpop-gremlin-server-3.3.3-bin.zip
          # Install gremlin-python
          cd ./apache-tinkerpop-gremlin-server-3.3.3 && \
            ./bin/gremlin-server.sh install org.apache.tinkerpop gremlin-python 3.3.3
          # Change max content length and traversal strategy
          sed -i -- 's/.*maxContentLength:.*/maxContentLength: 2621440/g' conf/gremlin-server.yaml
          sed -i -- 's/graph.traversal()]/graph.traversal(),sg: graph.traversal().withStrategies(ElementIdStrategy.build().create())]/g' \
            ./scripts/empty-sample.groovy
        fi
  ...
Testing the Application – CI/CD
Testing the Application – CI/CD
steps: #CircleCI 2.0
  - checkout
  - restore_cache:
      keys:
        - v1-dependencies-{{ .Branch }}
        - v1-dependencies-master
  - run:
      # Download and install Gremlin server
      ...
  # Cache the installation
  - save_cache:
      key: v1-dependencies-{{ .Branch }}
      paths:
        - ~/src/app_name/apache-tinkerpop-gremlin-server-3.3.3
  # Test
  - run:
      # Starting Gremlin Server
      command: |
        cd ./apache-tinkerpop-gremlin-server-3.3.3 && ./bin/gremlin-server.sh \
          ./conf/gremlin-server.yaml
      background: true
  # Sleep to give the gremlin server enough time to start
  - run: sleep 10
  - run: pycodestyle app_name
  - run: coverage run --source=app_name -m pytest tests --capture=no --strict
  - run: coverage report -m --fail-under=95
Testing the Application – CI/CD
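The CI configuration waits a fixed 10 seconds for Gremlin Server to start. A possible alternative, shown only as a sketch and not something the talk uses, is to poll the server port until it accepts connections. The helper name `wait_for_port` and its defaults are assumptions; port 8182 is the one used in the `curl` example later in the deck.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll until a TCP connection to host:port succeeds or the timeout expires."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Server not accepting connections yet; retry after a short pause.
            time.sleep(interval)
    return False
```

A CI step could call `wait_for_port("localhost", 8182)` and fail the build if it returns `False`, so tests start as soon as the server is reachable rather than after a fixed delay.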
Scaling Our Graph
• Async Traversals
• HA Cluster and Load Balancing
Async Traversals
• Seed individual entities using “next”
• Each call to “next” is blocking
def seed_channels(data, team_uid):
    for channel_data in data:
        channel_uid, creator, members = (channel_data.pop(key) for
                                         key in ["uid", "creator", "members"])
        slack.V().addChannel(channel_uid, properties=channel_data).next()
        slack.teams(team_uid).addTeamHasChannelEdge(team_uid, channel_uid).next()
        slack.users(creator).addCreatedChannelEdge(creator, channel_uid).next()
        slack.channels(channel_uid).addPartOfChannelEdges(channel_uid, *members).next()
• Seed subgraph using “next”
• Reduce number of blocking calls to one per channel
def seed_channels(data, team_uid):
    for channel_data in data:
        channel_uid, creator, members = (channel_data.pop(key) for
                                         key in ["uid", "creator", "members"])
        (slack.V().addChannel(channel_uid, properties=channel_data)
         .addTeamHasChannelEdge(team_uid, channel_uid).inV()
         .addCreatedChannelEdge(creator, channel_uid).inV()
         .addPartOfChannelEdges(channel_uid, *members).next())
• Seed subgraph using “promise”
• Make seeding asynchronous, no blocking calls
• Verify that the returned futures were successful
def seed_channels(data, team_uid):
    for channel_data in data:
        channel_uid, creator, members = (channel_data.pop(key) for
                                         key in ["uid", "creator", "members"])
        (slack.V().addChannel(channel_uid, properties=channel_data)
         .addTeamHasChannelEdge(team_uid, channel_uid).inV()
         .addCreatedChannelEdge(creator, channel_uid).inV()
         .addPartOfChannelEdges(channel_uid, *members).promise())
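The difference between blocking on every call and collecting futures can be sketched with the standard library alone. This example does not use gremlin-python: `seed_channel` is a hypothetical stand-in for one seeding traversal, and `concurrent.futures` is swapped in for the future-like object that `promise()` returns.

```python
from concurrent.futures import ThreadPoolExecutor
import time


def seed_channel(channel_uid):
    """Hypothetical stand-in for one seeding traversal; sleeps to mimic I/O."""
    time.sleep(0.05)
    return channel_uid


def seed_all(channel_uids, max_workers=8):
    """Submit every seeding call without blocking, then verify each future."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = [pool.submit(seed_channel, uid) for uid in channel_uids]
        # result() re-raises any exception from the worker, so a failed
        # seeding call surfaces here instead of being silently dropped.
        return [future.result() for future in futures]
```

Checking every future at the end corresponds to the "verify that the returned futures were successful" step: submission is non-blocking, but failures are still observed.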
HA Cluster and Load Balancing
• Preparing for high availability with Neo4j and Gremlin
• Configuring Gremlin Server and Neo4j
• Understanding the Neo4j HA Architecture
• Advantages
• Data replication
• Spread writes across instances
• Handle greater read loads
• HA cluster is fronted by a load balancer like HAProxy
• Reference:
• https://neo4j.com/docs/operations-manual/current/ha-cluster/architecture/
• http://tinkerpop.apache.org/docs/3.3.3/reference/#_high_availability_configuration
HA Cluster and Load Balancing
• Tuning parameters for the cluster
• Frequency of pulling updates from other members of the cluster
• gremlin.neo4j.conf.ha.pull_interval
• Number of slaves a transaction should be committed to
• gremlin.neo4j.conf.ha.tx_push_factor
• Tuning parameters for the Load Balancer
• Routing requests across the cluster
• balance
• Checking if the members in the cluster are responsive
• option httpchk
// gremlin-server-neo4j-ha-{1..3}.yaml
channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer
> curl "http://localhost:8182?gremlin=100-1"
Thank You
Graph Day SF 2018

Más contenido relacionado

La actualidad más candente

Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Philips Kokoh Prasetyo
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneRahul Jain
 
Data Exploration with Elasticsearch
Data Exploration with ElasticsearchData Exploration with Elasticsearch
Data Exploration with ElasticsearchAleksander Stensby
 
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013Query log analytics - using logstash, elasticsearch and kibana 28.11.2013
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013Niels Henrik Hagen
 

La actualidad más candente (6)

Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!Elasticsearch: You know, for search! and more!
Elasticsearch: You know, for search! and more!
 
Introduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of LuceneIntroduction to Elasticsearch with basics of Lucene
Introduction to Elasticsearch with basics of Lucene
 
Data Exploration with Elasticsearch
Data Exploration with ElasticsearchData Exploration with Elasticsearch
Data Exploration with Elasticsearch
 
Elasticsearch
ElasticsearchElasticsearch
Elasticsearch
 
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013Query log analytics - using logstash, elasticsearch and kibana 28.11.2013
Query log analytics - using logstash, elasticsearch and kibana 28.11.2013
 
tutorial2-notes2
tutorial2-notes2tutorial2-notes2
tutorial2-notes2
 

Similar a Extensible RESTful Applications with Apache TinkerPop

2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar Slides2.28.17 Introducing DSpace 7 Webinar Slides
2.28.17 Introducing DSpace 7 Webinar SlidesDuraSpace
 
Building APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformBuilding APIs in an easy way using API Platform
Building APIs in an easy way using API PlatformAntonio Peric-Mazar
 
Socialite, the Open Source Status Feed
Socialite, the Open Source Status FeedSocialite, the Open Source Status Feed
Socialite, the Open Source Status FeedMongoDB
 
Scaling Analytics with elasticsearch
Scaling Analytics with elasticsearchScaling Analytics with elasticsearch
Scaling Analytics with elasticsearchdnoble00
 
Webinar: Build an Application Series - Session 2 - Getting Started
Webinar: Build an Application Series - Session 2 - Getting StartedWebinar: Build an Application Series - Session 2 - Getting Started
Webinar: Build an Application Series - Session 2 - Getting StartedMongoDB
 
Office Dev Day 2018 - Extending Microsoft Teams
Office Dev Day 2018 - Extending Microsoft TeamsOffice Dev Day 2018 - Extending Microsoft Teams
Office Dev Day 2018 - Extending Microsoft TeamsAndré Vala
 
How ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps lifeHow ElasticSearch lives in my DevOps life
How ElasticSearch lives in my DevOps life琛琳 饶
 
An Introduction to Working With the Activity Stream
An Introduction to Working With the Activity StreamAn Introduction to Working With the Activity Stream
An Introduction to Working With the Activity StreamMikkel Flindt Heisterberg
 
Extensible RESTful Applications with Apache TinkerPop

  • 1. Extensible RESTful Applications with Apache TinkerPop Graph Day SF 2018
  • 2. About Us LIKES { "first_name": "Varun", "last_name": "Ganesh", } { "first_name": "Harshvardhan", "last_name": "Joshi", } LIKES
  • 3. CONNECTING TO BUSINESS STACKS VISUALISATION CUSTOM BUILT INFOGRAPHICS NATURAL LANGUAGE GENERATED INSIGHTS EXPORT & SHARE STORIES EMAIL POWERPOINT, TV WEB Embedded SDK About CLIENTS • Automating the process of data storytelling • For more information, visit www.nugit.co
  • 4. Agenda • Use Cases • The Slack APIs • Defining the Entities • Graph Design and Considerations • Making the Graph RESTful • Building a DSL • Testing the Application • Scaling the Graph
  • 5. Use Cases - Communities • View contribution to communication • Participation across channels • Identify collaborative groups • Users connected by mentions and reactions • Identify influential users per channel
  • 6. • Highlight engaging conversations • Top videos, GIFs, links • Get insights across channels Use Cases – Top Posts
  • 7. Defining the Entities Top Post: • Files shared • Messages with attachments • Posts without replies or reactions are not considered
  • 8. Defining the Entities Notable Message: • Messages with reactions or replies • Replies and Comments that have reactions • Other alerts that gather reactions
  • 9. Defining the Entities Mention: • Replies and Comments can have mentions too • Ignore mentions that are unnecessary or already captured in a relationship
  • 10. Defining the Entities • Narrows down data required for the use case • Helps “whiteboarding” process for graph design • Allows defining schema for payloads • Requires understanding the nuances of the platform
  • 11. Graph Design and Considerations • Team node acts as root node • Allows maintaining separate graphs for different organisations
  • 12. Graph Design and Considerations • Top posts, notable messages are both message nodes • Differentiated using edge labels • Edge traversals favoured over property lookup
  • 13. Graph Design and Considerations • Any user can comment on, react to or be mentioned in any message • Reaction type modelled as edge property • Efficient as use-case does not need filtering by reaction type
  • 14. Graph Design and Considerations • Same file shared across channels shares common pool of reactions • Schema respects Slack specific behaviour • Handles idempotency based on unique ID maintained by Slack
  • 15. Graph Design and Considerations
  • 16. The Slack APIs Endpoint: https://slack.com/api/conversations.history { "type": "message", "user": "U2FQG2G9F", "text": "next time you want cereal: \n<https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock>", "attachments": [ { "service_name": "Instagram", "title": "Instagram post by @therock • Nov 28, 2017 at 7:14pm UTC", "title_link": "https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock", "text": "346.3k Likes, 2,167 Comments - @therock on Instagram:”……”", "fallback": "Instagram: Instagram post by @therock • Nov 28, 2017 at 7:14pm UTC", "image_url": "https://scontent-iad3-1.cdninstagram.com/t51.2885-15/e35/24178_n.jpg", "from_url": "https://www.instagram.com/p/BcDN4eWFjac/?taken-by=therock", "image_width": 334, "image_height": 250, "image_bytes": 178559, "service_icon": "https://www.instagram.com/static/images/ico/appl.png/932e4d9af891.png", "id": 1 } ], "thread_ts": "1511936426.000178", "reply_count": 3, "replies": [ { "user": "U193XDML7", "ts": "1511953167.000138" }, { "user": "U2FQG2G9F", "ts": "1511953180.000044" }, { "user": "U193XDML7", "ts": "1511953192.000230" } ], "ts": "1511936426.000178", "reactions": [ { "name": "smile", "users": [ "U193XDML7" ], "count": 1 }, { "name": "obesecat", "users": [ "U193XDML7" ], "count": 1 } ] }
  • 17. The Slack APIs Endpoint: https://slack.com/api/conversations.history [ { "type": "message", "user": "U4BPQR94L", "text": "Yinghui Malmsteen <@U2FQG2G9F>\n<https://www.youtube.com/watch?v=D4OxW_0qqv8>", "attachments": [ { ... } ], "ts": "1536057373.000100", "reactions": [ { "name": "flag-se", "users": [ "U58LYK8Q6" ], "count": 1 } ] } ] [ { "user": "U2Q2U37SA", "inviter": "U0LPSJQR0", "text": "<@U2Q2U57SA> has joined the channel", "type": "message", "subtype": "channel_join", "ts": "1536138265.000200" } ]
  • 18. [ { "id": "U4C0FDU2J", "team_id": "T028ZLMQN", "name": "friendlybotdev", "deleted": true, "profile": { "title": "", "phone": "", "skype": "", "real_name": "Friendly Bot", "real_name_normalized": "Friendly Bot", "display_name": "friendlybotdev", "display_name_normalized": "friendlybotdev", "status_text": "", "status_emoji": "", "status_expiration": 0, "avatar_hash": "123456", "bot_id": "B4B47T0G3", "api_app_id": "A4B92ZEER", "always_active": true, "image_original": "https://slack-edge.com/2017-06-21/123456_original.png", "first_name": "Friendly", "last_name": "Bot", "image_24": "https://slack-edge.com/2017-06-21/123456_24.png", "image_32": "https://slack-edge.com/2017-06-21/123456_32.png", "image_48": "https://slack-edge.com/2017-06-21/123456_48.png", "image_72": "https://slack-edge.com/2017-06-21/123456_72.png", "image_192": "https://slack-edge.com/2017-06-21/123456_192.png", "image_512": "https://slack-edge.com/2017-06-21/123456_512.png", "image_1024": "https://slack-edge.com/2017-06-21/123456_1024.png", "status_text_canonical": "", "team": "T028Z5MQN" }, "is_bot": true, "is_app_user": false, "updated": 1517305013 } ] [ { "id": "C8KMHCN5D", "name": "arandomchannel", "is_channel": true, "created": 1507613685, "creator": "U5BG5XU6T", "is_shared": false, "is_member": true, "is_private": false, "last_read": "1533892238.000324", "latest": { "type": "message", "user": "U84K3ZTF9", "text": "let's meetup tomorrow", "ts": "1536139470.000100" }, "unread_count": 7, "unread_count_display": 7, "members": [ "U08ED90CD", "U0LPSJQR0", "U193XDML7", "U9LKWV9C1", "UBJ4CHV5L" ], "topic": { "value": "place for people who are interested in sharing and learning", "creator": "U5BGLXU6T", "last_set": 1507613720 }, "purpose": { "value": "", "creator": "", "last_set": 0 }, "previous_names": [] } ] Endpoint: https://slack.com/api/users.list Endpoint: https://slack.com/api/channels.info The Slack APIs
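The "filtering out required fields" idea can be sketched in plain Python against message payloads shaped like the `conversations.history` samples above. The `REQUIRED_FIELDS` tuple, the `slim_message` and `is_join_event` helpers, and the `unused_field` key are assumptions made for illustration; they are not taken from the speakers' code.

```python
# Hypothetical whitelist; the field names come from the sample payloads in the slides.
REQUIRED_FIELDS = ("type", "subtype", "user", "text", "ts", "thread_ts",
                   "reactions", "replies", "attachments")


def slim_message(message):
    """Return a copy of a Slack message that keeps only the whitelisted fields."""
    return {key: message[key] for key in REQUIRED_FIELDS if key in message}


def is_join_event(message):
    """Channel-join notices carry no conversation content, so callers can skip them."""
    return message.get("subtype") == "channel_join"


raw_messages = [
    {"type": "message", "user": "U4BPQR94L", "text": "hello",
     "ts": "1536057373.000100", "unused_field": "dropped",
     "reactions": [{"name": "flag-se", "users": ["U58LYK8Q6"], "count": 1}]},
    {"type": "message", "subtype": "channel_join", "user": "U2Q2U37SA",
     "inviter": "U0LPSJQR0", "ts": "1536138265.000200"},
]

# Drop join notices first, then strip every field the graph does not need.
kept = [slim_message(m) for m in raw_messages if not is_join_event(m)]
```

Keeping the whitelist in one place means the ingestion payload schema and the parsing step can be checked against the same list of names.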
  • 19. The Journey So Far • Defining entities and modelling them into Graph • Iterative feedback-driven process • Understanding the data available from the API • Identifying unique IDs • Filtering out required fields
  • 20. Data Ingestion and Extraction • Apache Flink cluster retrieves, parses and filters Slack data • GraphQL service requests data for visualization • Flask REST service ingests/queries data to/from Tinkerpop POST PUT GET Gremlin-Python Gremlin Bytecode
  • 21. Why Tinkerpop? • Abstraction that lets us avoid vendor lock-in • Reduces rework when switching data stores • Gremlin query language • Hadoop and SparkComputer
  • 22. Making the Graph RESTful • Defining REST Endpoints • Defining the Resources • Remote Traversals
  • 23. Defining REST Endpoints • Write endpoints for seeding • POST /teams/<team_uid>/channels • POST /teams/<team_uid>/channels/<channel_uid>/messages • Read endpoints for queries • GET /teams/<team_uid>/top_posts • Handling Idempotency • Replace default strategy with "ElementIdStrategy" • Enables creation of nodes with Slack specific unique IDs // scripts/empty-sample.groovy globals << [g : graph.traversal(), sg: graph.traversal().withStrategies(ElementIdStrategy.build().create())]
  • 24. Making the Graph RESTful • Setting up REST Endpoints • Defining the Resources • Remote Traversals
  • 25. Defining the Resources from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema from marshmallow.exceptions import ValidationError ... class MessageSchema(Schema): """ Holds all the required fields for a message object.""" ts = fields.Float(required=True) text = fields.Str() comment = fields.Str() subtype = fields.Str() bot_id = fields.Str(validate=is_bot_uid) user = fields.Str(validate=is_user_uid) thread_ts = fields.Str() file_share = fields.Nested(FileShareSchema, load_from="file") attachments = fields.Nested(AttachmentSchema, many=True) reactions = fields.Nested(ReactionSchema, many=True) comments = fields.Nested(CommentSchema, many=True, load_from="replies") mentions = fields.List(fields.Str(validate=is_user_uid)) class AttachmentSchema(Schema): """ Holds all the required fields for an Attachment object.""" class ReactionSchema(Schema): """ Holds all the required fields for a reaction object.""" class CommentSchema(Schema): """ Holds all the required fields for a comment object.""" ... • Organized code with single point of reference
  • 26. Defining the Resources from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema from marshmallow.exceptions import ValidationError ... class MessageSchema(Schema): """ Holds all the required fields for a message object.""" @validates_schema def validate_message(self, data): """ Validate if the message contains any of comments, mentions or reactions. """ if not any([f(data) for f in (has_comments, has_mentions, has_reactions)]): raise ValidationError("The message must contain comments, mentions or reactions") ts = fields.Float(required=True) text = fields.Str() comment = fields.Str() subtype = fields.Str() bot_id = fields.Str(validate=is_bot_uid) user = fields.Str(validate=is_user_uid) thread_ts = fields.Str() file_share = fields.Nested(FileShareSchema, load_from="file") attachments = fields.Nested(AttachmentSchema, many=True) reactions = fields.Nested(ReactionSchema, many=True) comments = fields.Nested(CommentSchema, many=True, load_from="replies") mentions = fields.List(fields.Str(validate=is_user_uid)) class AttachmentSchema(Schema): """ Holds all the required fields for an Attachment object.""" class ReactionSchema(Schema): """ Holds all the required fields for a reaction object.""" class CommentSchema(Schema): """ Holds all the required fields for a comment object.""" ... • Organized code with single point of reference • Validate data before ingestion • Enforce types and required fields
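The schema-level check on this slide can be illustrated without marshmallow. The sketch below is a plain-Python stand-in: the `has_comments`, `has_mentions` and `has_reactions` helpers are named on the slide but their bodies are not shown, so the implementations here are assumptions, as is the local `ValidationError` class.

```python
class ValidationError(ValueError):
    """Local stand-in for marshmallow's ValidationError (keeps the sketch dependency-free)."""


# Assumed helper bodies: each one reports whether the payload carries that kind of activity.
def has_comments(data):
    return bool(data.get("comments") or data.get("replies"))


def has_mentions(data):
    return bool(data.get("mentions"))


def has_reactions(data):
    return bool(data.get("reactions"))


def validate_message(data):
    """Reject messages without a timestamp or without any comments, mentions or reactions."""
    if "ts" not in data:
        raise ValidationError("ts is required")
    if not any(check(data) for check in (has_comments, has_mentions, has_reactions)):
        raise ValidationError("The message must contain comments, mentions or reactions")
    return data


accepted = validate_message(
    {"ts": 1511936426.000178, "reactions": [{"name": "smile", "count": 1}]}
)
```

Rejecting such messages before ingestion matches the entity definitions earlier in the talk, where posts without replies or reactions are not considered.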
  • 27. Defining the Resources from marshmallow import Schema, fields, pre_load, pre_dump, post_load, validates_schema from marshmallow.exceptions import ValidationError ... class MessageSchema(Schema): """ Holds all the required fields for a message object.""" class AttachmentSchema(Schema): """ Holds all the required fields for an Attachment object.""" title = fields.Str() fallback = fields.Str() text = fields.Str() thumb_url = fields.Str() image_url = fields.Str() title_link = fields.Str() @post_load def reshape_attachment(self, data): """ Apply required transformations on the Attachment object. """ # Create a post_title field collapse_keys(data, "post_title", *("fallback", "title", "text")) # Create a post_thumbnail field collapse_keys(data, "post_thumbnail", *("thumb_url", "image_url", "title_link")) # Set post_type to URL data["post_type"] = "URL" # Return the reshaped data so it becomes the load result return data class ReactionSchema(Schema): """ Holds all the required fields for a reaction object.""" class CommentSchema(Schema): """ Holds all the required fields for a comment object.""" class FileShareSchema(Schema): """ Holds all the required fields for a File Share object.""" class UserSchema(Schema): """ Holds all the required fields for a User object.""" ... • Organized code with single point of reference • Validate data before ingestion • Enforce types and required fields • Normalize fields with post-processing
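The slide calls a `collapse_keys` helper without showing its body. A minimal sketch of one plausible behaviour follows; it assumes that the first non-empty candidate value wins and that the candidate keys are removed afterwards. Both assumptions are guesses for illustration, not the speakers' implementation.

```python
def collapse_keys(data, new_key, *candidates):
    """Hypothetical helper: fold several optional keys into a single normalized key.

    Assumes the first non-empty candidate (in the order given) wins and that
    the candidate keys are dropped from the dict afterwards.
    """
    values = [data.pop(key, None) for key in candidates]
    chosen = next((value for value in values if value), None)
    if chosen is not None:
        data[new_key] = chosen
    return data


attachment = {
    "fallback": "",
    "title": "Instagram post by @therock",
    "text": "346.3k Likes",
    "image_url": "https://example.com/image.jpg",
    "title_link": "https://example.com/post",
}

collapse_keys(attachment, "post_title", "fallback", "title", "text")
collapse_keys(attachment, "post_thumbnail", "thumb_url", "image_url", "title_link")
attachment["post_type"] = "URL"
```

After these calls every attachment exposes the same three keys regardless of which optional Slack fields were present, which is the normalization the slide describes.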
  • 28. Making the Graph RESTful • Schema enforcement and validation • Handling Idempotency of endpoints • Custom Traversal Source
  • 29. Remote Traversals • Bytecode sent over network instead of string • Allows using custom traversal source for a Domain Specific Language (DSL) from gremlin_python.driver.driver_remote_connection import DriverRemoteConnection ... conn = DriverRemoteConnection(GREMLIN_SERVER_HOST, 'sg') slack = Graph().traversal(SlackTraversalSource).withRemote(conn)
  • 30. Building a DSL • Motivations • Custom Workflows
  • 31. Building a DSL - Motivations • Custom traversal source can also specify useful shorthands • E.g. Traversing to all the Channel nodes class SlackTraversalSource(BaseTraversalSource): """ Module to initialise a Graph with the methods listed under SlackTraversal. """ def __init__(self, *args, **kwargs): super(SlackTraversalSource, self).__init__(*args, **kwargs) self.graph_traversal = SlackTraversal def channels(self, *channel_ids): """ Shorthand to identify all channel nodes""" traversal = self.get_graph_traversal() traversal.bytecode.add_step("V") traversal.bytecode.add_step("hasLabel", NODES.channel) if channel_ids: traversal.bytecode.add_step("has", "__id", P.within(channel_ids)) return traversal
  • 32. Building a DSL - Motivations • Custom traversal source specifies business logic behind traversals • E.g. Connecting a User node to a Channel node class SlackTraversal(BaseTraversal): def addPartOfChannelEdges(self, channel_uid, *user_uids, **kwargs): """ Add an edge to a channel from the users who were/are a part of the channel. """ for user_uid in user_uids: edge_uid = construct_uid(user_uid, channel_uid, EDGES.part_of.name, delim="|") self.getOrAddEdgeFrom(edge_label=EDGES.part_of, edge_uid=edge_uid, node_label=NODES.user, node_uid=user_uid) .upsertProperties(kwargs.get("properties")).inV() return self
  • 33. Building a DSL - Motivations • BaseTraversal handles creation of nodes and edges • These methods should guarantee idempotency • E.g. Creation of edges between two nodes… • ...checks for an existing edge from gremlin_python.process.graph_traversal import GraphTraversal from gremlin_python.process.graph_traversal import GraphTraversalSource, __ from gremlin_python.process.traversal import T class BaseTraversal(GraphTraversal): def getOrAddEdgeFrom(self, edge_label, edge_uid, node_label, node_uid): """ Adds an edge from the node with the given label and uid only if the edge doesn’t exist. """ return self.coalesce( __.inE(edge_label).hasId(edge_uid).and_( __.outV().hasId(node_uid), __.outV().hasLabel(node_label)), __.addE(edge_label).property(T.id, edge_uid).from_( __.V().getNode(node_label, node_uid)))
  • 34. Building a DSL - Motivations • The edge is created only if it doesn’t already exist from gremlin_python.process.graph_traversal import GraphTraversal from gremlin_python.process.graph_traversal import GraphTraversalSource, __ from gremlin_python.process.traversal import T class BaseTraversal(GraphTraversal): def getOrAddEdgeFrom(self, edge_label, edge_uid, node_label, node_uid): """ Adds an edge from the node with the given label and uid only if the edge doesn’t exist. """ return self.coalesce( __.inE(edge_label).hasId(edge_uid).and_( __.outV().hasId(node_uid), __.outV().hasLabel(node_label)), __.addE(edge_label).property(T.id, edge_uid).from_( __.V().getNode(node_label, node_uid)))
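The get-or-add behaviour that `coalesce` provides can be shown with an in-memory stand-in. This is only a sketch of the semantics: `InMemoryGraph` is hypothetical, and `construct_uid` is named on the slides but its body is not shown, so the join-with-delimiter implementation here is an assumption.

```python
def construct_uid(*parts, delim="|"):
    """Assumed behaviour: join the parts into one deterministic edge ID."""
    return delim.join(str(part) for part in parts)


class InMemoryGraph:
    """Hypothetical stand-in for the graph; edges are keyed by their unique ID."""

    def __init__(self):
        self.edges = {}

    def get_or_add_edge(self, label, out_uid, in_uid):
        """Return the existing edge if its ID is known, otherwise create it (like coalesce)."""
        edge_uid = construct_uid(out_uid, in_uid, label)
        if edge_uid not in self.edges:
            self.edges[edge_uid] = {"label": label, "out": out_uid, "in": in_uid}
        return self.edges[edge_uid]


graph = InMemoryGraph()
first = graph.get_or_add_edge("part_of", "U193XDML7", "C8KMHCN5D")
# Replaying the same request, as a retried POST would, must not create a duplicate.
second = graph.get_or_add_edge("part_of", "U193XDML7", "C8KMHCN5D")
```

Because the edge ID is derived from the endpoints and the label, repeating an ingestion request resolves to the same edge, which is what makes the write endpoints safe to retry.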
  • 35. def build_visualization(self, traversal_source, **kwargs): """ The below are standardized steps that are required to generate data for any visualization.""" return self.start(traversal_source) .filterByDate(self.date_dimension, kwargs.get("start_time"), kwargs.get("end_time")) .filterByFields(self.filters_map, kwargs.get("filters")) .sortByFields(self.sorting_map, kwargs.get("sort_field"), kwargs.get("sort_direction")) .buildObject(self.object_map).toList() Building a DSL – Custom Workflows • Standardized steps for generating a visualization are defined in the BaseTraversal • Custom maps define traversal paths for fields that vary across visualizations
  • 36. Building a DSL – Custom Workflows # Sample filter from frontend filter_obj = {'_and': [{"field": 'reactions', '_gte': 100}, {"field": 'post_creator', '_in': ['bob', 'chloe'] }]} filter_map = {"post_creator": lambda pred: __.in_(EDGES.created_post).has(USER.display_name, pred), "reactions": lambda pred: __.inE(EDGES.reacted_to).count().is_(pred) } object_map = { "post_creator": {"uid": [__.in_(EDGES.created_post).values("__id"), __.constant("")], "image": ... # define similar path here, }, "reactions": __.inE(EDGES.reacted_to).groupCount().by(__.values(REACTION.name)) } start = lambda traversal_source: traversal_source.posts() # DSL generates the required lower level base traversals slack.posts().where( __.and_( __.inE(EDGES.reacted_to).count().is_(P.gte(100)), __.in_(EDGES.created_post).has(USER.display_name, P.within(['bob', 'chloe'])))). project("post_creator", "reactions").by( __.project("image", "display_name", "uid").by( __.in_(EDGES.created_post).values(USER.image), __.in_(EDGES.created_post).values(USER.display_name), __.in_(EDGES.created_post).values("__id"))).by( __.inE(EDGES.reacted_to).groupCount()).toList() # Inject maps into DSL methods start(slack) .filterByFields(self.filters_map, kwargs.get("filters")) .buildObject(self.object_map) .toList() • The DSL takes in functions/paths that map fields to their traversals • Maps customized based on the visualization that is needed
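How a frontend filter object might be turned into predicates can be sketched in plain Python. In the slides the operators map to Gremlin predicates such as `P.gte` and `P.within`; here they are evaluated against plain dicts instead, and the `OPERATORS` table and `compile_filter` function are hypothetical names introduced for illustration.

```python
import operator

# Assumed operator table; in the talk these correspond to Gremlin predicates (P.gte, P.within).
OPERATORS = {
    "_gte": operator.ge,
    "_in": lambda value, allowed: value in allowed,
}


def compile_filter(filter_obj):
    """Turn a nested filter object into a function that tests one row (a dict)."""
    if "_and" in filter_obj:
        clauses = [compile_filter(clause) for clause in filter_obj["_and"]]
        return lambda row: all(clause(row) for clause in clauses)
    field = filter_obj["field"]
    op_items = [(key, value) for key, value in filter_obj.items() if key != "field"]
    if len(op_items) != 1:
        raise ValueError("each leaf filter needs exactly one operator")
    op_name, expected = op_items[0]
    compare = OPERATORS[op_name]
    return lambda row: compare(row[field], expected)


filter_obj = {"_and": [{"field": "reactions", "_gte": 100},
                       {"field": "post_creator", "_in": ["bob", "chloe"]}]}

posts = [
    {"id": 1, "reactions": 120, "post_creator": "bob"},
    {"id": 2, "reactions": 150, "post_creator": "dave"},
    {"id": 3, "reactions": 40, "post_creator": "chloe"},
]

matches = compile_filter(filter_obj)
selected = [post["id"] for post in posts if matches(post)]
```

In the DSL the same recursion would emit traversal steps rather than Python callables, with a per-visualization map supplying the path for each field.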
  • 37. Building a DSL – Custom Workflows
    • The generated traversals produce the final response objects
    • Objects are rendered into visualizations by the client

        {
            "reactions": {
                "palm_tree": 82,
                "robot_face": 18
            },
            "post_creator": {
                "image": "https://url_of_image.jpg",
                "display_name": "chloe",
                "uid": "U024ZH7HL"
            }
        }
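    The "reactions" object in the response is a count of reactions grouped by reaction name. As a rough in-memory analogue of that grouping (the graph version counts reacted_to edges inside the traversal), the sketch below counts users per reaction name from a raw Slack-style reactions list. The function name group_reaction_counts is hypothetical.

```python
from collections import Counter


def group_reaction_counts(reactions):
    """Count reactions per name from a list of {"name", "users"} dicts."""
    counts = Counter()
    for reaction in reactions:
        counts[reaction["name"]] += len(reaction["users"])
    return dict(counts)


sample = [
    {"name": "joy", "users": ["U5K7JUATE", "U37BF9457"]},
    {"name": "palm_tree", "users": ["U024ZH7HL"]},
]
reaction_counts = group_reaction_counts(sample)
```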
  • 38. Testing the Application • Unit Tests • Validating traversals on Gremlin Server
  • 39. Testing Our Application – Unit Testing
    Workflow steps: Write a failing test; Write code to make the test pass; Start Gremlin Server; Use Fixtures; Check if test passes

        class TestNodeMethods(object):
            """Test methods that help in retrieval and creation of Nodes."""

            def test_node_retrieval(self, graph):
                """Test if getNode retrieves an existing node."""
                assert (graph.V().getNode(label="person", uid=100)
                        .count().next() == 1)
                assert (graph.V().getNode(label="person", uid=101)
                        .count().next() == 1)
  • 40. Testing Our Application – Unit Testing
    Step: Write code to make the test pass

        def getNode(self, label, uid):
            """
            Returns the node with the given label and uid.

            Args:
                label (string): The label of the node to return
                uid (string): Unique ID of the node

            Raises:
                StopIteration: Node with the given label and uid does not exist
            """
            return self.and_(__.hasLabel(label), __.has(T.id, uid))
  • 41. Testing Our Application – Unit Testing
    Steps: Start Gremlin Server; Use Fixtures

        $ bin/gremlin-server.sh conf/gremlin-server-neo4j-python.yaml

        class TestBasicTraversal(object):
            """
            Tests for methods that help create edges or nodes and methods
            that help populate the properties of these objects.
            """

            @pytest.fixture(scope="module")
            def graph(self):
                """Graph with two nodes and one edge connecting them."""
                graph = Graph().traversal(CerebroTraversalSource).withRemote(
                    DriverRemoteConnection(GREMLIN_SERVER_HOST, GREMLIN_SERVER_TRAVERSER))
                graph.V().clear()
                from_node = graph.addV("person").property(T.id, 100).next()
                to_node = graph.addV("person").property(T.id, 101).next()
                (graph.addE("knows").from_(from_node).to(to_node)
                    .property("__id", "1")
                    .next())
                yield graph
                # Teardown: runs after the last test in the module
                graph.V().clear()
  • 42. Testing Our Application – Unit Testing
    Step: Check if test passes

        class TestNodeMethods(object):
            """Test methods that help in retrieval and creation of Nodes."""

            def test_node_retrieval(self, graph):
                """Test if getNode retrieves an existing node."""
                assert (graph.V().getNode(label="person", uid=100)
                        .count().next() == 1)
                assert (graph.V().getNode(label="person", uid=101)
                        .count().next() == 1)
  • 43. Testing Our Application – Unit Testing
    • Fixture used to test if the MessageSchema class is implemented correctly

        [
            {
                "reactions": [
                    {
                        "name": "joy",
                        "users": ["U5K7JUATE"]
                    }
                ],
                "attachments": [{ ... }],
                "text": "<https://www.youtube.com/watch?v=4iEh1ykb13w>",
                "ts": "1465895473.000050",
                "user": "U37BF9457",
                "type": "message"
            }
        ]

        class MessageSchema(Schema):
            """Holds all the required fields for a message object."""
            ...
  • 44. Testing Our Application – Unit Testing
    • MessageSchema needs to include mentions
    • Update the fixture to be able to test that the schema includes mentions
    • Need to validate that traversals pick up mentions

        [
            {
                "reactions": [
                    {
                        "name": "joy",
                        "users": ["U5K7JUATE"]
                    }
                ],
                "attachments": [{...}],
                "text": "<@U123456> <https://www.youtube.com/watch?v=4iEh1ykb13w>",
                "mentions": ["U123456"],
                "ts": "1465895473.000050",
                "type": "message"
            }
        ]

        class MessageSchema(Schema):
            """Holds all the required fields for a message object."""
            mentions = fields.List(fields.Str(validate=is_user_uid))
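    The slides do not show the body of is_user_uid or how mentions are pulled out of the message text, so the following is only a sketch of one possible shape. It assumes user IDs are a "U" followed by uppercase letters and digits (consistent with the IDs in the fixtures, such as "U123456" and "U5K7JUATE") and that mentions appear in the text as <@UID>. The helper extract_mentions is hypothetical.

```python
import re

# Assumption: IDs look like "U" + uppercase letters/digits, as in the fixtures.
USER_UID_RE = re.compile(r"U[A-Z0-9]+")
# Assumption: a mention is written as <@UID> inside the message text.
MENTION_RE = re.compile(r"<@(U[A-Z0-9]+)>")


def is_user_uid(value):
    """Return True if value looks like a user ID under the assumption above."""
    return isinstance(value, str) and USER_UID_RE.fullmatch(value) is not None


def extract_mentions(text):
    """Return mentioned user IDs in order of appearance, without duplicates."""
    seen = []
    for uid in MENTION_RE.findall(text):
        if uid not in seen:
            seen.append(uid)
    return seen


text = "<@U123456> <https://www.youtube.com/watch?v=4iEh1ykb13w>"
mentions = extract_mentions(text)
```

    A check like this keeps the fixture and the schema in step: every entry in the fixture's "mentions" list should both pass is_user_uid and appear in extract_mentions of the "text" field.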
  • 45. Testing Our Application – Unit Testing
    Workflow steps: Write a failing test; Write code to make the test pass; Start Gremlin Server; Update JSON & Generate GraphSON; Use Fixtures; Check if test passes

        [
            {
                "reactions": [
                    {
                        "name": "joy",
                        "users": ["U5K7JUATE"]
                    }
                ],
                "attachments": [{...}],
                "text": "<@U123456> <https://www.youtube.com/watch?v=4iEh1ykb13w>",
                "mentions": ["U123456"],
                "ts": "1465895473.000050",
                "type": "message"
            }
        ]

        gremlin> graph.io(graphson()).writeGraph("graph_name.json")
  • 46. Testing Our Application – Unit Testing
    Step: Use Fixtures (load the generated GraphSON)

        @pytest.fixture(scope="module")
        def slack_graph():
            """Open a subgraph on localhost for testing."""
            slack.V().clear()
            slack_client = Client(GREMLIN_SERVER_HOST, SLACK_TRAVERSER)
            path_to_fixture = str(Path.cwd().joinpath("tests/fixtures/slack_graph.json"))
            graphson_statement = 'graph.io(graphson()).readGraph("{}")'.format(path_to_fixture)
            slack_client.submit(graphson_statement).all().result()
            yield slack
            slack.V().clear()
  • 47. Testing the Application – CI/CD • Automated tests using CircleCI • Custom Configuration for Gremlin Server • Caching Dependencies for Faster Tests
  • 48. Testing the Application – CI/CD

        steps:  # CircleCI 2.0
          ...
          - run:
              command: |
                if [ ! -d ./apache-tinkerpop-gremlin-server-3.3.3 ]; then
                  curl -O https://archive.apache.org/dist/tinkerpop/3.3.3/apache-tinkerpop-gremlin-server-3.3.3-bin.zip
                  unzip -q apache-tinkerpop-gremlin-server-3.3.3-bin.zip
                  # Install gremlin-python
                  cd ./apache-tinkerpop-gremlin-server-3.3.3 && ./bin/gremlin-server.sh install org.apache.tinkerpop gremlin-python 3.3.3
                  # Change max content length and traversal strategy
                  sed -i -- 's/.*maxContentLength:.*/maxContentLength: 2621440/g' conf/gremlin-server.yaml
                  sed -i -- 's/graph.traversal()]/graph.traversal(),sg: graph.traversal().withStrategies(ElementIdStrategy.build().create())]/g' ./scripts/empty-sample.groovy
                fi
          ...
  • 49. Testing the Application – CI/CD

        steps:  # CircleCI 2.0
          - checkout
          - restore_cache:
              keys:
                - v1-dependencies-{{ .Branch }}
                - v1-dependencies-master
          - run:
              # Download and install Gremlin server
              ...
          # Cache the installation
          - save_cache:
              key: v1-dependencies-{{ .Branch }}
              paths:
                - ~/src/app_name/apache-tinkerpop-gremlin-server-3.3.3
  • 50. Testing the Application – CI/CD

          # Test
          - run:
              # Starting Gremlin Server
              command: |
                cd ./apache-tinkerpop-gremlin-server-3.3.3 && ./bin/gremlin-server.sh ./conf/gremlin-server.yaml
              background: true
          # Sleep to give the gremlin server enough time to start
          - run: sleep 10
          - run: pycodestyle app_name
          - run: coverage run --source=app_name -m pytest tests --capture=no --strict
          - run: coverage report -m --fail-under=95
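    The fixed "sleep 10" works but either wastes time or is too short on a slow container. One alternative, shown here as a sketch and not as something the talk used, is to poll the server port until it accepts connections. The function name wait_for_port and its default values are assumptions; an open port only shows that the process is listening, not that every graph has finished loading.

```python
import socket
import time


def wait_for_port(host, port, timeout=30.0, interval=0.5):
    """Poll host:port until a TCP connection succeeds or timeout elapses."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        try:
            # A successful connect means something is listening on the port.
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            time.sleep(interval)
    return False


if __name__ == "__main__":
    # 8182 is the port used in the curl health check elsewhere in the deck.
    raise SystemExit(0 if wait_for_port("localhost", 8182) else 1)
```

    Saved as a script, this could replace the sleep step with a run step that exits non-zero when the server never comes up, which makes the failure visible before the tests start.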
  • 51. Scaling Our Graph • Async Traversals • HA Cluster and Load Balancing
  • 52. Async Traversals
    • Seed individual entities using "next"
      • Each call to "next" is blocking

        def seed_channels(data, team_uid):
            for channel_data in data:
                channel_uid, creator, members = (
                    channel_data.pop(key) for key in ["uid", "creator", "members"])
                slack.V().addChannel(channel_uid, properties=channel_data).next()
                slack.teams(team_uid).addTeamHasChannelEdge(team_uid, channel_uid).next()
                slack.users(creator).addCreatedChannelEdge(creator, channel_uid).next()
                slack.channels(channel_uid).addPartOfChannelEdges(channel_uid, *members).next()

    • Seed subgraph using "next"
      • Reduce number of blocking calls to one per channel

        def seed_channels(data, team_uid):
            for channel_data in data:
                channel_uid, creator, members = (
                    channel_data.pop(key) for key in ["uid", "creator", "members"])
                (slack.V().addChannel(channel_uid, properties=channel_data)
                    .addTeamHasChannelEdge(team_uid, channel_uid).inV()
                    .addCreatedChannelEdge(creator, channel_uid).inV()
                    .addPartOfChannelEdges(channel_uid, *members).next())

    • Seed subgraph using "promise"
      • Make seeding asynchronous, no blocking calls
      • Verify that the returned futures were successful

        def seed_channels(data, team_uid):
            for channel_data in data:
                channel_uid, creator, members = (
                    channel_data.pop(key) for key in ["uid", "creator", "members"])
                (slack.V().addChannel(channel_uid, properties=channel_data)
                    .addTeamHasChannelEdge(team_uid, channel_uid).inV()
                    .addCreatedChannelEdge(creator, channel_uid).inV()
                    .addPartOfChannelEdges(channel_uid, *members).promise())
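    The last point, verifying that the returned futures were successful, is easy to skip because a failed future stays silent until someone inspects it. The sketch below illustrates the pattern with the standard library only: seed_channel is a hypothetical stand-in for one chained seeding traversal, and a thread pool stands in for the remote connection, so none of this is the talk's code.

```python
from concurrent.futures import ThreadPoolExecutor, wait


def seed_channel(channel):
    """Hypothetical stand-in for one chained seeding traversal."""
    if "uid" not in channel:
        raise ValueError("channel is missing a uid")
    return channel["uid"]


def seed_channels_async(channels, max_workers=4):
    """Submit all channels without blocking, then check every future."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Keep the input next to its future so failures can be reported.
        futures = {pool.submit(seed_channel, c): c for c in channels}
        wait(futures)
    seeded, failed = [], []
    for future, channel in futures.items():
        error = future.exception()
        if error is None:
            seeded.append(future.result())
        else:
            failed.append((channel, error))
    return seeded, failed


seeded, failed = seed_channels_async(
    [{"uid": "C1"}, {"name": "no-uid"}, {"uid": "C2"}])
```

    Collecting the failures alongside their input makes it possible to retry or log only the channels that did not seed, instead of re-running the whole batch.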
  • 53. HA Cluster and Load Balancing
    • Preparing for high availability with Neo4j and Gremlin
      • Configuring Gremlin Server and Neo4j
      • Understanding the Neo4j HA architecture
    • Advantages
      • Data replication
      • Spread writes across instances
      • Handle greater read loads
    • HA cluster is fronted by a load balancer like HAProxy
    • References:
      • https://neo4j.com/docs/operations-manual/current/ha-cluster/architecture/
      • http://tinkerpop.apache.org/docs/3.3.3/reference/#_high_availability_configuration
  • 54. HA Cluster and Load Balancing
    • Tuning parameters for the cluster
      • Frequency of pulling updates from other members of the cluster
        • gremlin.neo4j.conf.ha.pull_interval
      • Number of slaves a transaction should be committed to
        • gremlin.neo4j.conf.ha.tx_push_factor
    • Tuning parameters for the load balancer
      • Routing requests across the cluster
        • balance
      • Checking if the members in the cluster are responsive
        • option httpchk

        // gremlin-server-neo4j-ha-{1..3}.yaml
        channelizer: org.apache.tinkerpop.gremlin.server.channel.WsAndHttpChannelizer

        > curl "http://localhost:8182?gremlin=100-1"