SlideShare una empresa de Scribd logo
1 de 13
CloudGraph ®

MySql to HBase in 5 Steps
Converting MySql or Oracle databases to Apache HBase™ with on-line
examples using the popular Wordnet® dictionary
Scott Cinnamond – TerraMeta Software Inc.
http://cloudgraph.org
What is Wordnet ?
®

• Large complex lexical (MySql) database of
English.
• Nouns, verbs, adjectives and adverbs
grouped into sets of cognitive synonyms
(synsets), each expressing a distinct
concept.
• Synsets are interlinked by means of
conceptual-semantic and lexical relations.
HBase Conversion Steps
http://wordnet.cloudgraph.org

1) Model Creation: reverse engineer Wordnet DB
into UML®

2) Code Generation: provision persistence and
query-DSL java code

3) HBase™ Table Mapping: map data graphs and
row keys to table(s)

4) Data Migration: MySql to HBase
5) Services / App Creation: build services,
web app
1.) Model Creation
Reverse engineer Wordnet DB into PlasmaSDO™ UML® Model

• Capture entities, properties, data types,
associations, enumerations, comments as UML
• Why UML? Popular standards-based format.
Editable, viewable using standard tools.
Supports enterprise governance processes
• How? Maven build with plasma-maven-plugin
RDB tool (goal:RDB, action:reverse, dialect:mysql)
• Download working example at
https://github.com/cloudgraph/wordnet
Generated Wordnet Model
(core subset of 30 total entities and enumerations)
2.) Code Generation
Provision SDO persistence and query DSL java code

• Generate Java API based on Wordnet UML
Model
• Why? Use across RDB, HBase, other
CloudGraph Services. Compile time checking for
queries, all persistence logic
• How? Maven build with plasma-maven-plugin
SDO and DSL tools
• See generated API Javadocs on-line at
http://wordnet.cloudgraph.org
3.) HBase™ Table Mapping
Map data graphs and row keys to HBase™ table(s)

• Configure delimited, hashed, salted, formatted,
composite row keys with (xpath) paths into
target data graphs
• Map data graph roots to HBase tables
• Why? Automates row-key creation via data
extraction processing from anywhere in your
data graphs
• How? CloudGraph Configuration XML. See
https://github.com/cloudgraph/wordnet
4.) Data Migration
MySql to HBase

• Create RDB-to-HBase standalone
migration app using generated
persistence and DSL query API
incrementally call CloudGraph HBase and
RDB services
• Why? Wordnet data is large and highly
connected, so must be incrementally
extracted/inserted and linked
5.) Services / App Creation
Build services, web app

• Build simple pojo services using
persistence and DSL query API
• Encapsulate Wordnet business logic
• Add adapter/wrapper structures
• Call services called from web-app
Web App
http://wordnet.cloudgraph.org

• Auto-complete field triggers CloudGraph
HBase to use the HBase fuzzy row filter
API
• Find button returns all semantic and
lexical relations for the selected word,
including descriptions and example
sentences
• Resulting relation graphs typically contain
more than 100 nodes and return in less
than 200 milliseconds
Conclusions
• Complex, highly recursive RDB models
can be easily converted and leveraged in
HBase and future CloudGraph services
• Large lexical data graphs can be returned
in single query
• Data migration difficult given complex
recursive model
Resources
• Download the complete CloudGraph Wordnet
example: https://github.com/cloudgraph/wordnet
• Run the example online:
http://wordnet.cloudgraph.org
• Project details, contact information:
http://cloudgraph.org
• Beta Source Repo:
https://github.com/terrameta/cloudgraph
• Production Source Repo (under construction):
https://github.com/cloudgraph
Status / Legal
•
•

•

Project Status
– CloudGraph ® is currently under private beta testing
Licensing
– CloudGraph ® 0.5.5 Community Edition (CE) is open source licensed
under version 2 of the GNU General Public License
Trademarks
– WordNet ® is a registered trademark of Princeton University
– Apache HBase™ is a trademark of Apache Software Foundation
– CloudGraph ® is a trademark of TerraMeta Software LLC, TerraMeta
Software Inc.

Más contenido relacionado

La actualidad más candente

Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInDataWorks Summit
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataCloudera, Inc.
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...DataWorks Summit
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureRyan Hennig
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseHBaseCon
 
Realizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamRealizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamDataWorks Summit
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Joydeep Sen Sarma
 
In Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPIn Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPHBaseCon
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQLEDB
 
Big Data and Hadoop Ecosystem
Big Data and Hadoop EcosystemBig Data and Hadoop Ecosystem
Big Data and Hadoop EcosystemRajkumar Singh
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksDataWorks Summit
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberDataWorks Summit
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Spark Summit
 
Apache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft LibraryApache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft LibraryTsz-Wo (Nicholas) Sze
 
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...Cloudera, Inc.
 
Presto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performancePresto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performanceDataWorks Summit
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureSkillspeed
 

La actualidad más candente (20)

Scaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedInScaling Deep Learning on Hadoop at LinkedIn
Scaling Deep Learning on Hadoop at LinkedIn
 
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big DataHBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
HBaseCon 2012 | You’ve got HBase! How AOL Mail Handles Big Data
 
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
Big Data Meets NVM: Accelerating Big Data Processing with Non-Volatile Memory...
 
Hadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and FutureHadoop @ eBay: Past, Present, and Future
Hadoop @ eBay: Past, Present, and Future
 
Empower Data-Driven Organizations
Empower Data-Driven OrganizationsEmpower Data-Driven Organizations
Empower Data-Driven Organizations
 
Keynote: The Future of Apache HBase
Keynote: The Future of Apache HBaseKeynote: The Future of Apache HBase
Keynote: The Future of Apache HBase
 
Realizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache BeamRealizing the promise of portable data processing with Apache Beam
Realizing the promise of portable data processing with Apache Beam
 
HBase in Practice
HBase in Practice HBase in Practice
HBase in Practice
 
Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015Qubole @ AWS Meetup Bangalore - July 2015
Qubole @ AWS Meetup Bangalore - July 2015
 
Time-oriented event search. A new level of scale
Time-oriented event search. A new level of scale Time-oriented event search. A new level of scale
Time-oriented event search. A new level of scale
 
In Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAPIn Search of Database Nirvana: Challenges of Delivering HTAP
In Search of Database Nirvana: Challenges of Delivering HTAP
 
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 Apache AGE and the synergy effect in the combination of Postgres and NoSQL Apache AGE and the synergy effect in the combination of Postgres and NoSQL
Apache AGE and the synergy effect in the combination of Postgres and NoSQL
 
Big Data and Hadoop Ecosystem
Big Data and Hadoop EcosystemBig Data and Hadoop Ecosystem
Big Data and Hadoop Ecosystem
 
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise NetworksUsing Familiar BI Tools and Hadoop to Analyze Enterprise Networks
Using Familiar BI Tools and Hadoop to Analyze Enterprise Networks
 
HBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at UberHBase Global Indexing to support large-scale data ingestion at Uber
HBase Global Indexing to support large-scale data ingestion at Uber
 
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
Apache Spark—Apache HBase Connector: Feature Rich and Efficient Access to HBa...
 
Apache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft LibraryApache Ratis - In Search of a Usable Raft Library
Apache Ratis - In Search of a Usable Raft Library
 
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...
HBaseCon 2012 | Overcoming Data Deluge with HBase to Help Save the Environmen...
 
Presto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performancePresto query optimizer: pursuit of performance
Presto query optimizer: pursuit of performance
 
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive ArchitectureHadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture
 

Destacado

Social Media Marketing Overview Share
Social Media Marketing Overview ShareSocial Media Marketing Overview Share
Social Media Marketing Overview ShareSROshinski
 
Apache Sqoop: A Data Transfer Tool for Hadoop
Apache Sqoop: A Data Transfer Tool for HadoopApache Sqoop: A Data Transfer Tool for Hadoop
Apache Sqoop: A Data Transfer Tool for HadoopCloudera, Inc.
 
Pentaho Enterprise vs. Pentaho Community
Pentaho Enterprise vs. Pentaho CommunityPentaho Enterprise vs. Pentaho Community
Pentaho Enterprise vs. Pentaho CommunityMauricio Murillo
 
Semantic Technology: The Basics
Semantic Technology: The BasicsSemantic Technology: The Basics
Semantic Technology: The BasicsPeter Berger
 
Managing "Big Data" Application Complexity with CloudGraph
Managing "Big Data" Application Complexity with CloudGraphManaging "Big Data" Application Complexity with CloudGraph
Managing "Big Data" Application Complexity with CloudGraphScott Cinnamond
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebookNiranjan Pandey
 

Destacado (6)

Social Media Marketing Overview Share
Social Media Marketing Overview ShareSocial Media Marketing Overview Share
Social Media Marketing Overview Share
 
Apache Sqoop: A Data Transfer Tool for Hadoop
Apache Sqoop: A Data Transfer Tool for HadoopApache Sqoop: A Data Transfer Tool for Hadoop
Apache Sqoop: A Data Transfer Tool for Hadoop
 
Pentaho Enterprise vs. Pentaho Community
Pentaho Enterprise vs. Pentaho CommunityPentaho Enterprise vs. Pentaho Community
Pentaho Enterprise vs. Pentaho Community
 
Semantic Technology: The Basics
Semantic Technology: The BasicsSemantic Technology: The Basics
Semantic Technology: The Basics
 
Managing "Big Data" Application Complexity with CloudGraph
Managing "Big Data" Application Complexity with CloudGraphManaging "Big Data" Application Complexity with CloudGraph
Managing "Big Data" Application Complexity with CloudGraph
 
Hadoop administration using cloudera student lab guidebook
Hadoop administration using cloudera   student lab guidebookHadoop administration using cloudera   student lab guidebook
Hadoop administration using cloudera student lab guidebook
 

Similar a MySql to HBase in 5 Steps

Database Freedom: Database Week SF
Database Freedom: Database Week SFDatabase Freedom: Database Week SF
Database Freedom: Database Week SFAmazon Web Services
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2Raul Chong
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Precisely
 
Database Freedom: Database Week San Francisco
Database Freedom: Database Week San FranciscoDatabase Freedom: Database Week San Francisco
Database Freedom: Database Week San FranciscoAmazon Web Services
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsAndrew Brust
 
Agile data warehousing
Agile data warehousingAgile data warehousing
Agile data warehousingSneha Challa
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Amazon Web Services
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...Amazon Web Services
 
DBaaS with EDB Postgres on AWS
DBaaS with EDB Postgres on AWSDBaaS with EDB Postgres on AWS
DBaaS with EDB Postgres on AWSEDB
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impalamarkgrover
 
Beyond Relational
Beyond RelationalBeyond Relational
Beyond RelationalLynn Langit
 
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesDropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesKyle Banerjee
 

Similar a MySql to HBase in 5 Steps (20)

Apache Hadoop Hive
Apache Hadoop HiveApache Hadoop Hive
Apache Hadoop Hive
 
Hive
HiveHive
Hive
 
Database Freedom: Database Week SF
Database Freedom: Database Week SFDatabase Freedom: Database Week SF
Database Freedom: Database Week SF
 
AWS-DMS-2023.pptx
AWS-DMS-2023.pptxAWS-DMS-2023.pptx
AWS-DMS-2023.pptx
 
Hive_Pig.pptx
Hive_Pig.pptxHive_Pig.pptx
Hive_Pig.pptx
 
IBM - Introduction to Cloudant
IBM - Introduction to CloudantIBM - Introduction to Cloudant
IBM - Introduction to Cloudant
 
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part20812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
0812 2014 01_toronto-smac meetup_i_os_cloudant_worklight_part2
 
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
Big Data Goes Airborne. Propelling Your Big Data Initiative with Ironcluster ...
 
Database Freedom: Database Week San Francisco
Database Freedom: Database Week San FranciscoDatabase Freedom: Database Week San Francisco
Database Freedom: Database Week San Francisco
 
SQL Server 2012 and Big Data
SQL Server 2012 and Big DataSQL Server 2012 and Big Data
SQL Server 2012 and Big Data
 
Big Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI ProsBig Data and NoSQL for Database and BI Pros
Big Data and NoSQL for Database and BI Pros
 
Agile data warehousing
Agile data warehousingAgile data warehousing
Agile data warehousing
 
Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2Databases in the Cloud - DevDay Austin 2017 Day 2
Databases in the Cloud - DevDay Austin 2017 Day 2
 
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
ABD324_Migrating Your Oracle Data Warehouse to Amazon Redshift Using AWS DMS ...
 
DBaaS with EDB Postgres on AWS
DBaaS with EDB Postgres on AWSDBaaS with EDB Postgres on AWS
DBaaS with EDB Postgres on AWS
 
SQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for ImpalaSQL Engines for Hadoop - The case for Impala
SQL Engines for Hadoop - The case for Impala
 
Beyond Relational
Beyond RelationalBeyond Relational
Beyond Relational
 
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL DatabasesDropping ACID: Wrapping Your Mind Around NoSQL Databases
Dropping ACID: Wrapping Your Mind Around NoSQL Databases
 
What is Database Freedom?
What is Database Freedom?What is Database Freedom?
What is Database Freedom?
 
NoSQL
NoSQLNoSQL
NoSQL
 

Último

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...apidays
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century educationjfdjdjcjdnsjd
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingEdi Saputra
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MIND CTI
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...DianaGray10
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsNanddeep Nachan
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FMESafe Software
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Orbitshub
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusZilliz
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...apidays
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Victor Rentea
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelDeepika Singh
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...apidays
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityWSO2
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodJuan lago vázquez
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDropbox
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdfSandro Moreira
 

Último (20)

Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
Apidays New York 2024 - The Good, the Bad and the Governed by David O'Neill, ...
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..Understanding the FAA Part 107 License ..
Understanding the FAA Part 107 License ..
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
Navigating the Deluge_ Dubai Floods and the Resilience of Dubai International...
 
Exploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with MilvusExploring Multimodal Embeddings with Milvus
Exploring Multimodal Embeddings with Milvus
 
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
Apidays New York 2024 - Accelerating FinTech Innovation by Vasa Krishnan, Fin...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot ModelMcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
Mcleodganj Call Girls 🥰 8617370543 Service Offer VIP Hot Model
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Platformless Horizons for Digital Adaptability
Platformless Horizons for Digital AdaptabilityPlatformless Horizons for Digital Adaptability
Platformless Horizons for Digital Adaptability
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 

MySql to HBase in 5 Steps

  • 1. CloudGraph ® MySql to HBase in 5 Steps Converting MySql or Oracle databases to Apache HBase™ with on-line examples using the popular Wordnet® dictionary Scott Cinnamond – TerraMeta Software Inc. http://cloudgraph.org
  • 2. What is Wordnet ? ® • Large complex lexical (MySql) database of English. • Nouns, verbs, adjectives and adverbs grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. • Synsets are interlinked by means of conceptual-semantic and lexical relations.
  • 3. HBase Conversion Steps http://wordnet.cloudgraph.org 1) Model Creation: reverse engineer Wordnet DB into UML® 2) Code Generation: provision persistence and query-DSL java code 3) HBase™ Table Mapping: map data graphs and row keys to table(s) 4) Data Migration: MySql to HBase 5) Services / App Creation: build services, web app
  • 4. 1.) Model Creation Reverse engineer Wordnet DB into PlasmaSDO™ UML® Model • Capture entities, properties, data types, associations, enumerations, comments as UML • Why UML? Popular standards-based format. Editable, viewable using standard tools. Supports enterprise governance processes • How? Maven build with plasma-maven-plugin RDB tool (goal:RDB, action:reverse, dialect:mysql) • Download working example at https://github.com/cloudgraph/wordnet
  • 5. Generated Wordnet Model (core subset of 30 total entities and enumerations)
  • 6. 2.) Code Generation Provision SDO persistence and query DSL java code • Generate Java API based on Wordnet UML Model • Why? Use across RDB, HBase, other CloudGraph Services. Compile time checking for queries, all persistence logic • How? Maven build with plasma-maven-plugin SDO and DSL tools • See generated API Javadocs on-line at http://wordnet.cloudgraph.org
  • 7. 3.) HBase™ Table Mapping Map data graphs and row keys to HBase™ table(s) • Configure delimited, hashed, salted, formatted, composite row keys with (xpath) paths into target data graphs • Map data graph roots to HBase tables • Why? Automates row-key creation via data extraction processing from anywhere in your data graphs • How? CloudGraph Configuration XML. See https://github.com/cloudgraph/wordnet
  • 8. 4.) Data Migration MySql to HBase • Create RDB-to-HBase standalone migration app using generated persistence and DSL query API incrementally call CloudGraph HBase and RDB services • Why? Wordnet data is large and highly connected, so must be incrementally extracted/inserted and linked
  • 9. 5.) Services / App Creation Build services, web app • Build simple pojo services using persistence and DSL query API • Encapsulate Wordnet business logic • Add adapter/wrapper structures • Call services called from web-app
  • 10. Web App http://wordnet.cloudgraph.org • Auto-complete field triggers CloudGraph HBase to use the HBase fuzzy row filter API • Find button returns all semantic and lexical relations for the selected word, including descriptions and example sentences • Resulting relation graphs typically contain more than 100 nodes and return in less than 200 milliseconds
  • 11. Conclusions • Complex, highly recursive RDB models can be easily converted and leveraged in HBase and future CloudGraph services • Large lexical data graphs can be returned in single query • Data migration difficult given complex recursive model
  • 12. Resources • Download the complete CloudGraph Wordnet example: https://github.com/cloudgraph/wordnet • Run the example online: http://wordnet.cloudgraph.org • Project details, contact information: http://cloudgraph.org • Beta Source Repo: https://github.com/terrameta/cloudgraph • Production Source Repo (under construction): https://github.com/cloudgraph
  • 13. Status / Legal • • • Project Status – CloudGraph ® is currently under private beta testing Licensing – CloudGraph ® 0.5.5 Community Edition (CE) is open source licensed under version 2 of the GNU General Public License Trademarks – WordNet ® is a registered trademark of Princeton University – Apache HBase™ is a trademark of Apache Software Foundation – CloudGraph ® is a trademark of TerraMeta Software LLC, TerraMeta Software Inc.