Powering Enterprise Data-driven Applications with Cassandra
Be Right Faster
with Reliable Data, Relevant Insights, Recommended Actions™
#DataManagement
#BigData
#ML
© 2015. All Rights Reserved.
Anastasia Zamyshlyaeva
VP Platform Product Management and Co-founder @ Reltio
• 2011 – started working with C*
• 2012 – selected C* as the persistence store for creating a hybrid Columnar & Graph data-store
• Since 2012 – running in production to support:
  – 24/7 uptime with 99.995% availability
  – Multi-tenancy across customers
  – Both operational and analytical workloads
stasia@reltio.com
www.linkedin.com/in/azamyshlyaeva
“If you focus on the smallest details,
you never get the big picture right”
~ Leroy Hood
[Diagram: enterprise applications: Sales, Web site, Support, Supply, Marketing]
Enterprise Applications Ecosystem
• Is data up-to-date?
• Is data correct?
• Is data complete?
Data Unification Application (based on Relational Databases)
Applications in the diagram: Sales, Web site, Supply, Marketing, Support
• Fixed structure
• No big data
• Expensive
• Hard to support graphs and complex attributes
• Single point of failure (often)
Data Unification Application (based on Cassandra)
Applications in the diagram: Sales, Web site, Supply, Marketing, Support
Why Cassandra?
✓ High performance
✓ Fault tolerance
✓ Linear scalability
✓ Multi-datacenter
Reltio Metadata-driven Model and Operations
[Diagram labels: Schema, configure, UI, REST API, Analytics]
Example schemas shown:
• Doctors and Hospitals
• Oil & Gas
• Asset Catalog
Cassandra is a primary datastore
Metadata-driven Documents in Columnar storage

Example document:
ID: doc1
Type: Individual
Name: John
Email: john@gmail.com, john@yahoo.com
Address: (1) CA, shipping
         (2) NY, billing

Metadata:
Entity type: Individual
- Name: String
- Email: List
- Address: Complex
  - State: String
  - Type: List

Simple metadata-driven attributes in Cassandra (Thrift API)
Entity row doc1:
<Name>.1 = John
…

Multi-value metadata-driven attributes in Cassandra (Thrift API)
Entity row doc1:
…
<Email>.1 = john@gmail.com
<Email>.2 = john@yahoo.com
…

Complex metadata-driven attributes in Cassandra (Thrift API)
Entity row doc1:
…
<Address>.1.<State>.1 = CA
<Address>.1.<Type>.1 = shipping
<Address>.2.<State>.1 = NY
…
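The column-naming scheme on these slides (attribute name, then a 1-based value position, nested for complex attributes) can be sketched in a few lines of Python. This is only an illustration of the naming convention shown above, not Reltio's implementation; the function name `flatten` and the dictionary layout of `doc1` are assumptions made for the example.

```python
def flatten(attrs, prefix=""):
    """Flatten nested attributes into {column_name: value} pairs.

    Hypothetical helper: column names follow the <Attribute>.<position>
    pattern from the slides, with positions starting at 1.
    """
    columns = {}
    for name, values in attrs.items():
        # Treat every attribute as multi-valued; a single value is a one-item list.
        if not isinstance(values, list):
            values = [values]
        for position, value in enumerate(values, start=1):
            column = f"{prefix}<{name}>.{position}"
            if isinstance(value, dict):
                # Complex attribute: recurse and extend the column-name path.
                columns.update(flatten(value, prefix=column + "."))
            else:
                columns[column] = value
    return columns


# The example document from the slides, written as a plain dict (assumed layout).
doc1 = {
    "Name": "John",
    "Email": ["john@gmail.com", "john@yahoo.com"],
    "Address": [
        {"State": "CA", "Type": ["shipping"]},
        {"State": "NY", "Type": ["billing"]},
    ],
}

row = flatten(doc1)
for column, value in row.items():
    print(column, "=", value)
```

Each entry of `row` corresponds to one column in the wide row keyed by `doc1`, so adding a new attribute or another value never requires a schema change.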
Metadata-driven Documents – CQL wide rows
-- doc_id is the partition key; attribute_name is the clustering column,
-- so the attributes of one document are stored sorted by name
CREATE TABLE ENTITIES(
    doc_id int,
    attribute_name text,
    attribute_value text,
    …
    PRIMARY KEY (doc_id, attribute_name)
);
-- select all addresses
SELECT *
FROM ENTITIES
WHERE doc_id = 1
  AND attribute_name >= 'Address.0'
  AND attribute_name <= 'Address.9';
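Because attribute names are the clustering column, the address query is a range scan over names kept in sorted order. The sketch below imitates that with Python's `bisect` on a sorted list; it is a stand-in for Cassandra's behaviour, not driver code, and the attribute names and the helper `range_scan` are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right


def range_scan(sorted_names, low, high):
    """Return the names n with low <= n <= high from an already sorted list."""
    start = bisect_left(sorted_names, low)
    end = bisect_right(sorted_names, high)
    return sorted_names[start:end]


# Attribute names of one document, sorted as a clustering column would keep them.
names = sorted([
    "Name.1", "Email.1", "Email.2",
    "Address.1.State.1", "Address.1.Type.1",
    "Address.2.State.1", "Address.2.Type.1",
])

# Same bounds as the CQL query: every address attribute, nothing else.
addresses = range_scan(names, "Address.0", "Address.9")
print(addresses)

# Boundary check: sub-attributes of a ninth address sort after 'Address.9'.
ninth = range_scan(["Address.9.State.1"], "Address.0", "Address.9")
print(ninth)
```

With these illustrative names, the sub-attributes of a ninth address compare greater than `'Address.9'` and fall outside the range. An upper bound such as `'Address/'` (the character that follows `.` in ASCII) would cover every index, assuming names are compared byte by byte.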
Hybrid Graphs - linked entities with infinite attribution
[Graph example labels: John, Dwight, Dunder Mifflin, Copy Paper; Employee, Individual, Product, Organization]
Cassandra
- Records storage across datacenters
Reltio
- Metadata-driven graphs
- Rich model for entities, relations
- Partitioning
- Effective joins
- Graph operations
Reltio de-duplication
John Smith
Jon Smith
Hybrid Search – without documents!
Cassandra + Elasticsearch* = Hybrid search
* excluded documents
[Charts comparing Elasticsearch alone with Hybrid search (Elasticsearch + Cassandra): Data volume in Elasticsearch index (TB); Elasticsearch indexing performance (OPS); Search performance on large documents (sec)]
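One way to read "without documents" is that the search index holds only terms and document IDs, while the full documents are fetched from Cassandra by ID. The slides do not say how Reltio implements this, so the sketch below is an assumption: two in-memory dictionaries stand in for Elasticsearch and Cassandra, and the class name `HybridSearch` and its methods are hypothetical.

```python
class HybridSearch:
    """Toy model of a search index that stores IDs only, plus a document store."""

    def __init__(self):
        self.index = {}      # stand-in for Elasticsearch: term -> set of doc IDs
        self.documents = {}  # stand-in for Cassandra: doc ID -> full document

    def save(self, doc_id, document, searchable_fields):
        # The full document goes to the document store only.
        self.documents[doc_id] = document
        # The index receives terms from the searchable fields, never the document.
        for field in searchable_fields:
            for term in str(document.get(field, "")).lower().split():
                self.index.setdefault(term, set()).add(doc_id)

    def search(self, term):
        # Step 1: the index answers with matching IDs.
        ids = sorted(self.index.get(term.lower(), set()))
        # Step 2: the document store returns the full documents for those IDs.
        return [self.documents[doc_id] for doc_id in ids]


store = HybridSearch()
store.save(
    "doc1",
    {"Name": "John", "Notes": "a very large free-text payload"},
    searchable_fields=["Name"],
)
hits = store.search("john")
print(hits)
```

Keeping large payloads out of the index is consistent with the smaller index volume the charts report, at the cost of one extra lookup per hit.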
Reltio Cloud Data Components
• Spark
• AWS
• AWS Redshift
• Cassandra
• Elasticsearch
Reltio Use Cases
Thank you
