SlideShare una empresa de Scribd logo
1 de 18
Neptune hjkim http://www.openneptune.com http://dev.naver.com/projects/neptune(korean) babokim@gmail.com
Neptune Distributed Data Storage semi-structured data store(not file system) Use Distributed File System for Data file Supports real time and batch processing Google Bigtable clone Data Model, Architecture, Features Open source http://dev.naver.com/projects/neptune(korean) http://www.openneptune.com Goal 500 nodes 200 GB 이상/per node, Peta bytes
Features Schema Management Create, drop, modify table schema Real-time Transaction Single row operation(no join, group by, order by) Multi row operation: like, between Batch Transaction Scanner, Direct Uploader, MapReduce Adapter Scalability Automatic table split & re-assignment Reliability Data file stored in Distributed File System Commit log stored in ChangeLog Cluster Failover Tablet takeover time: max 1 min. Utility Web Console, Shell(simple query), Data Verifier
Architecture User Application MapReduce Neptune Master Neptune TabletServer #1 TabletServer #2 TabletServer #n Table Distributed File System(Hadoop or other) Physical storage
Components Master Lock Server Neptune Client Neptune Master ZooKeeper Pleidas Neptune Master NTable Scanner Shell NChubby failover / event failover / event Data/Control Control TabletServer #1 (Neptune) TabletServer #2 (Neptune) TabletServer #n (Neptune) LogServer #1 LogServer #2 LogServer #n DFS #1 (DataNode) Computing #1 (Map&Reduce) DFS #2 (DataNode) Computing #2 (Map&Reduce) DFS #n (DataNode) Computing #n (Map&Reduce) Local disk Local disk Local disk
Data Model Table Column#1 Column#n Rowkey row #1 ck-1         v1, t1 TabletA-1 rk-1              v2, t2 ck-2        v3, t2              v4, t3 row #k              v5, t4 row #k+1 ck-n        vn, tn TabletA-2 - Sorted by rowKey - Sorted by columnKey row #m Row#1 Cell row #m+1 Row.Key TabletA-n Column1 Column2 Column-n Cell.Key Cell1 Cell1 Cell1 Cell.Value(t1) Cell2 Cell2 Cell2 … … row #n Cell.Value(t2) Cell3 … … Cell-k Cell-m Cell.Value(tn) Cell-n
Index Root Index M.T1.1000:M1 … Max:mn M.T1.2000:M2 index of Meta Index Meta Tablet m2 m1 mn T1.100:U1 … xx xx … n T1.200:U2 T1.1000:UN … T1.2000:UN T1.1100:U1 T1.1200:U2 … index of User Tablet User defined Tablet U1 U2 10 20 … 100 110 120 … 200 … Index of TableMapFile’sblock(max-key, file-offset) Index Record format:  . Key - TableName.MaxRowKey  . Value – Tablet Name, assigned host scan 64KB TableMapFile(Physical file,sortedby rowkey, columnkey)
Data/index file in HDFS TabletA TabletB Column Data/index file
TabletServer Minor  Compaction MemoryTable ChangeLogServer Data Operation put(key, value) ChangeLog Searcher get(key) Merged MapFile (HDFS) MapFile#2 (HDFS) MapFile#1 (HDFS) MapFile #n (HDFS) Major Compaction
Failover Master fail disabled only Table Schema Management and Tablet Split can execute Multi-Master TabletServer fail assign to other TabletServer by master within 2 minutes
MapReduce with Neptune Hadoop TaskTracker TableA Map Task Map Task TabletInputFormat Map Task TabletA-1 TaskTracker Partitioned  by key TableB Reduce Task TaskTracker Tablet A-2 Tablet B-1 Map Task Map Task Map Task Tablet A-3 TaskTracker Tablet B-2 … Reduce Task TaskTracker Map Task Tablet A-N Map Task Map Task DBMS or HDFS META Table
Client Client API Single row operation: put/get Multi row operation: like, between Batch operation: scanner/uploader MapReduce: TabletInputFormat Command line Shell NQL(Neptune Query Language) JDBC support Web Console
Client API Example TableShematableSchema =                             new TableSchema(“T_TEST”, new String[]{“col1”, “col2”}); NTable.createTable(tableSchema); NTablentable = Ntable.openTable(“T_TEST”); Row row = new Row(new Row.Key(“RK1”)); Row.addCell(“col1”, new Cell(new Cell.Key(“CK1”), “test_value”.getBytes())); ntable.put(row); Row selectedRow = ntable.get(new Row.Key(“RK1”)); System.out.println(selectedRow.getCellList(“col1”).get(0)); TableScanner scanner = ntable.openScanner(ntable, new String[]{“col1”}); Row scanRow = null; while( (scanRow = scanner.next()) = null) { System.out.println(selectedRow.getCellList(“col1”).get(0)); } scanner.close();
Neptune Shell Data Definition CREATE TABLE DROP TABLE SHOW TABLES DESC Data Manipulation SELECT DELETE INSERT TRUNCATE COLUMN TRUNCATE TABLE Cluster Monitoring PING TABLETSERVER             REPORT TABLE
Web Console
Performance Number of 1000-byte values read/written per second
HBase/Bigtable Comparison
Powered by Neptune http://searcus.com/nosql twitter search service

Más contenido relacionado

La actualidad más candente

How to configure the cluster based on Multi-site (WAN) configuration
How to configure the clusterbased on Multi-site (WAN) configurationHow to configure the clusterbased on Multi-site (WAN) configuration
How to configure the cluster based on Multi-site (WAN) configurationAkihiro Kitada
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovAltinity Ltd
 
Use Your MySQL Knowledge to Become a MongoDB Guru
Use Your MySQL Knowledge to Become a MongoDB GuruUse Your MySQL Knowledge to Become a MongoDB Guru
Use Your MySQL Knowledge to Become a MongoDB GuruTim Callaghan
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...Altinity Ltd
 
Hadoop Installation and basic configuration
Hadoop Installation and basic configurationHadoop Installation and basic configuration
Hadoop Installation and basic configurationGerrit van Vuuren
 
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Altinity Ltd
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesAltinity Ltd
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And HbaseEdward Yoon
 
Configuringahadoop
ConfiguringahadoopConfiguringahadoop
Configuringahadoopmensb
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스PgDay.Seoul
 
Dbm 438 Enthusiastic Study / snaptutorial.com
Dbm 438 Enthusiastic Study / snaptutorial.comDbm 438 Enthusiastic Study / snaptutorial.com
Dbm 438 Enthusiastic Study / snaptutorial.comStephenson23
 
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico Caldara
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico CaldaraTrivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico Caldara
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico CaldaraTrivadis
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...Altinity Ltd
 
Lab1-DB-Cassandra
Lab1-DB-CassandraLab1-DB-Cassandra
Lab1-DB-CassandraLilia Sfaxi
 

La actualidad más candente (18)

How to configure the cluster based on Multi-site (WAN) configuration
How to configure the clusterbased on Multi-site (WAN) configurationHow to configure the clusterbased on Multi-site (WAN) configuration
How to configure the cluster based on Multi-site (WAN) configuration
 
ClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei MilovidovClickHouse Features for Advanced Users, by Aleksei Milovidov
ClickHouse Features for Advanced Users, by Aleksei Milovidov
 
Use Your MySQL Knowledge to Become a MongoDB Guru
Use Your MySQL Knowledge to Become a MongoDB GuruUse Your MySQL Knowledge to Become a MongoDB Guru
Use Your MySQL Knowledge to Become a MongoDB Guru
 
PgconfSV compression
PgconfSV compressionPgconfSV compression
PgconfSV compression
 
ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...ClickHouse materialized views - a secret weapon for high performance analytic...
ClickHouse materialized views - a secret weapon for high performance analytic...
 
Hadoop Installation and basic configuration
Hadoop Installation and basic configurationHadoop Installation and basic configuration
Hadoop Installation and basic configuration
 
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
Big Data in Real-Time: How ClickHouse powers Admiral's visitor relationships ...
 
Php dba cache
Php dba cachePhp dba cache
Php dba cache
 
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert HodgesWebinar: Secrets of ClickHouse Query Performance, by Robert Hodges
Webinar: Secrets of ClickHouse Query Performance, by Robert Hodges
 
BigTable And Hbase
BigTable And HbaseBigTable And Hbase
BigTable And Hbase
 
Configuringahadoop
ConfiguringahadoopConfiguringahadoop
Configuringahadoop
 
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
[Pgday.Seoul 2019] Citus를 이용한 분산 데이터베이스
 
Dbm 438 Enthusiastic Study / snaptutorial.com
Dbm 438 Enthusiastic Study / snaptutorial.comDbm 438 Enthusiastic Study / snaptutorial.com
Dbm 438 Enthusiastic Study / snaptutorial.com
 
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico Caldara
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico CaldaraTrivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico Caldara
Trivadis TechEvent 2017 PostgreSQL für die (Orakel) DBA by Ludovico Caldara
 
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
A Practical Introduction to Handling Log Data in ClickHouse, by Robert Hodges...
 
Metadata/Time-Date Tools (Toolkit_MTD)
Metadata/Time-Date Tools (Toolkit_MTD)Metadata/Time-Date Tools (Toolkit_MTD)
Metadata/Time-Date Tools (Toolkit_MTD)
 
Lab1-DB-Cassandra
Lab1-DB-CassandraLab1-DB-Cassandra
Lab1-DB-Cassandra
 
Csql Cache Presentation
Csql Cache PresentationCsql Cache Presentation
Csql Cache Presentation
 

Destacado

Neptune By Leticia
Neptune  By LeticiaNeptune  By Leticia
Neptune By Leticiadebbieglad
 
Types Of Sentences
Types Of SentencesTypes Of Sentences
Types Of Sentencesmelissagkh
 
Creative Thinking
Creative ThinkingCreative Thinking
Creative ThinkingTony Yoo
 
Facebook Powerpoint
Facebook PowerpointFacebook Powerpoint
Facebook Powerpointmyra14
 
solid waste management
solid waste managementsolid waste management
solid waste managementAmit Nakli
 
The Solar System Powerpoint
The Solar System PowerpointThe Solar System Powerpoint
The Solar System Powerpointoliverh
 
How Brands Grow : A summary of Byron Sharp's book on what marketers don't know
How Brands Grow : A summary of Byron Sharp's book on what marketers don't knowHow Brands Grow : A summary of Byron Sharp's book on what marketers don't know
How Brands Grow : A summary of Byron Sharp's book on what marketers don't knowAmie Weller
 
LinkedIn’s Culture of Transformation
LinkedIn’s Culture of TransformationLinkedIn’s Culture of Transformation
LinkedIn’s Culture of TransformationPat Wadors
 

Destacado (18)

Neptune
NeptuneNeptune
Neptune
 
Neptune
NeptuneNeptune
Neptune
 
Neptune By Leticia
Neptune  By LeticiaNeptune  By Leticia
Neptune By Leticia
 
Neptune
NeptuneNeptune
Neptune
 
Neptune
NeptuneNeptune
Neptune
 
Solar system
Solar systemSolar system
Solar system
 
Positive attitude ppt
Positive attitude pptPositive attitude ppt
Positive attitude ppt
 
Types Of Sentences
Types Of SentencesTypes Of Sentences
Types Of Sentences
 
Global warming (EVS Project)
Global warming (EVS Project)Global warming (EVS Project)
Global warming (EVS Project)
 
Solar System Ppt
Solar System PptSolar System Ppt
Solar System Ppt
 
Creative Thinking
Creative ThinkingCreative Thinking
Creative Thinking
 
Facebook Powerpoint
Facebook PowerpointFacebook Powerpoint
Facebook Powerpoint
 
solid waste management
solid waste managementsolid waste management
solid waste management
 
The Solar System Powerpoint
The Solar System PowerpointThe Solar System Powerpoint
The Solar System Powerpoint
 
How Brands Grow : A summary of Byron Sharp's book on what marketers don't know
How Brands Grow : A summary of Byron Sharp's book on what marketers don't knowHow Brands Grow : A summary of Byron Sharp's book on what marketers don't know
How Brands Grow : A summary of Byron Sharp's book on what marketers don't know
 
Slides That Rock
Slides That RockSlides That Rock
Slides That Rock
 
Body language ppt
Body language pptBody language ppt
Body language ppt
 
LinkedIn’s Culture of Transformation
LinkedIn’s Culture of TransformationLinkedIn’s Culture of Transformation
LinkedIn’s Culture of Transformation
 

Similar a Neptune Distributed Data System

Hive ICDE 2010
Hive ICDE 2010Hive ICDE 2010
Hive ICDE 2010ragho
 
Google Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 DayGoogle Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 Dayprogrammermag
 
Hadoop Introduction
Hadoop IntroductionHadoop Introduction
Hadoop IntroductionSNEHAL MASNE
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesappaji intelhunt
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-trainingGeohedrick
 
Whirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIveWhirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIveEdward Capriolo
 
The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1Hassy Veldstra
 
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Andrey Vykhodtsev
 
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainApache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainYahoo Developer Network
 
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Yahoo Developer Network
 
Hadoop & MapReduce
Hadoop & MapReduceHadoop & MapReduce
Hadoop & MapReduceNewvewm
 
My Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataMy Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataRobert Grossman
 
Google Cluster Innards
Google Cluster InnardsGoogle Cluster Innards
Google Cluster InnardsMartin Dvorak
 

Similar a Neptune Distributed Data System (20)

Hive ICDE 2010
Hive ICDE 2010Hive ICDE 2010
Hive ICDE 2010
 
Google Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 DayGoogle Cloud Computing on Google Developer 2008 Day
Google Cloud Computing on Google Developer 2008 Day
 
Hadoop Introduction
Hadoop IntroductionHadoop Introduction
Hadoop Introduction
 
Hadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologiesHadoop training in bangalore-kellytechnologies
Hadoop training in bangalore-kellytechnologies
 
Hadoop online-training
Hadoop online-trainingHadoop online-training
Hadoop online-training
 
Whirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIveWhirlwind tour of Hadoop and HIve
Whirlwind tour of Hadoop and HIve
 
Handout3o
Handout3oHandout3o
Handout3o
 
The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1The Anatomy Of The Google Architecture Fina Lv1.1
The Anatomy Of The Google Architecture Fina Lv1.1
 
Apache Cassandra at Macys
Apache Cassandra at MacysApache Cassandra at Macys
Apache Cassandra at Macys
 
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
Big Data Essentials meetup @ IBM Ljubljana 23.06.2015
 
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit JainApache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
Apache Hadoop India Summit 2011 talk "Hive Evolution" by Namit Jain
 
An Introduction to Hadoop
An Introduction to HadoopAn Introduction to Hadoop
An Introduction to Hadoop
 
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
Apache Hadoop India Summit 2011 talk "Hadoop Map-Reduce Programming & Best Pr...
 
Hadoop & MapReduce
Hadoop & MapReduceHadoop & MapReduce
Hadoop & MapReduce
 
HADOOP
HADOOPHADOOP
HADOOP
 
My Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big DataMy Other Computer is a Data Center: The Sector Perspective on Big Data
My Other Computer is a Data Center: The Sector Perspective on Big Data
 
Unit 1
Unit 1Unit 1
Unit 1
 
Google Cluster Innards
Google Cluster InnardsGoogle Cluster Innards
Google Cluster Innards
 
Lecture 2 part 1
Lecture 2 part 1Lecture 2 part 1
Lecture 2 part 1
 
Hadoop
HadoopHadoop
Hadoop
 

Más de Hyoungjun Kim

Open infradays 2019_msa_k8s
Open infradays 2019_msa_k8sOpen infradays 2019_msa_k8s
Open infradays 2019_msa_k8sHyoungjun Kim
 
오픈소스 프로젝트 따라잡기_공개
오픈소스 프로젝트 따라잡기_공개오픈소스 프로젝트 따라잡기_공개
오픈소스 프로젝트 따라잡기_공개Hyoungjun Kim
 
Presto, Zeppelin을 이용한 초간단 BI 구축 사례
Presto, Zeppelin을 이용한 초간단 BI 구축 사례Presto, Zeppelin을 이용한 초간단 BI 구축 사례
Presto, Zeppelin을 이용한 초간단 BI 구축 사례Hyoungjun Kim
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120Hyoungjun Kim
 
Jco 소셜 빅데이터_20120218
Jco 소셜 빅데이터_20120218Jco 소셜 빅데이터_20120218
Jco 소셜 빅데이터_20120218Hyoungjun Kim
 
Big data 20111203_배포판
Big data 20111203_배포판Big data 20111203_배포판
Big data 20111203_배포판Hyoungjun Kim
 

Más de Hyoungjun Kim (6)

Open infradays 2019_msa_k8s
Open infradays 2019_msa_k8sOpen infradays 2019_msa_k8s
Open infradays 2019_msa_k8s
 
오픈소스 프로젝트 따라잡기_공개
오픈소스 프로젝트 따라잡기_공개오픈소스 프로젝트 따라잡기_공개
오픈소스 프로젝트 따라잡기_공개
 
Presto, Zeppelin을 이용한 초간단 BI 구축 사례
Presto, Zeppelin을 이용한 초간단 BI 구축 사례Presto, Zeppelin을 이용한 초간단 BI 구축 사례
Presto, Zeppelin을 이용한 초간단 BI 구축 사례
 
Tajo_Meetup_20141120
Tajo_Meetup_20141120Tajo_Meetup_20141120
Tajo_Meetup_20141120
 
Jco 소셜 빅데이터_20120218
Jco 소셜 빅데이터_20120218Jco 소셜 빅데이터_20120218
Jco 소셜 빅데이터_20120218
 
Big data 20111203_배포판
Big data 20111203_배포판Big data 20111203_배포판
Big data 20111203_배포판
 

Último

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfpanagenda
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersRaghuram Pandurangan
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Mark Goldstein
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfLoriGlavin3
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Farhan Tariq
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfNeo4j
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsNathaniel Shimoni
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxLoriGlavin3
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality AssuranceInflectra
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxLoriGlavin3
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationKnoldus Inc.
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024BookNet Canada
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...panagenda
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsRavi Sanghani
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfIngrid Airi González
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...AliaaTarek5
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterMydbops
 

Último (20)

So einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdfSo einfach geht modernes Roaming fuer Notes und Nomad.pdf
So einfach geht modernes Roaming fuer Notes und Nomad.pdf
 
Generative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information DevelopersGenerative AI for Technical Writer or Information Developers
Generative AI for Technical Writer or Information Developers
 
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
Arizona Broadband Policy Past, Present, and Future Presentation 3/25/24
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...Genislab builds better products and faster go-to-market with Lean project man...
Genislab builds better products and faster go-to-market with Lean project man...
 
Connecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdfConnecting the Dots for Information Discovery.pdf
Connecting the Dots for Information Discovery.pdf
 
Time Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directionsTime Series Foundation Models - current state and future directions
Time Series Foundation Models - current state and future directions
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptxUse of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
Use of FIDO in the Payments and Identity Landscape: FIDO Paris Seminar.pptx
 
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance[Webinar] SpiraTest - Setting New Standards in Quality Assurance
[Webinar] SpiraTest - Setting New Standards in Quality Assurance
 
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptxA Deep Dive on Passkeys: FIDO Paris Seminar.pptx
A Deep Dive on Passkeys: FIDO Paris Seminar.pptx
 
Data governance with Unity Catalog Presentation
Data governance with Unity Catalog PresentationData governance with Unity Catalog Presentation
Data governance with Unity Catalog Presentation
 
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
New from BookNet Canada for 2024: Loan Stars - Tech Forum 2024
 
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data PrivacyTrustArc Webinar - How to Build Consumer Trust Through Data Privacy
TrustArc Webinar - How to Build Consumer Trust Through Data Privacy
 
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
Why device, WIFI, and ISP insights are crucial to supporting remote Microsoft...
 
Potential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and InsightsPotential of AI (Generative AI) in Business: Learnings and Insights
Potential of AI (Generative AI) in Business: Learnings and Insights
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Generative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdfGenerative Artificial Intelligence: How generative AI works.pdf
Generative Artificial Intelligence: How generative AI works.pdf
 
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
(How to Program) Paul Deitel, Harvey Deitel-Java How to Program, Early Object...
 
Scale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL RouterScale your database traffic with Read & Write split using MySQL Router
Scale your database traffic with Read & Write split using MySQL Router
 

Neptune Distributed Data System

  • 1. Neptune hjkim http://www.openneptune.com http://dev.naver.com/projects/neptune(korean) babokim@gmail.com
  • 2. Neptune Distributed Data Storage semi-structured data store(not file system) Use Distributed File System for Data file Supports real time and batch processing Google Bigtable clone Data Model, Architecture, Features Open source http://dev.naver.com/projects/neptune(korean) http://www.openneptune.com Goal 500 nodes 200 GB 이상/per node, Peta bytes
  • 3. Features Schema Management Create, drop, modify table schema Real-time Transaction Single row operation(no join, group by, order by) Multi row operation: like, between Batch Transaction Scanner, Direct Uploader, MapReduce Adapter Scalability Automatic table split & re-assignment Reliability Data file stored in Distributed File System Commit log stored in ChangeLog Cluster Failover Tablet takeover time: max 1 min. Utility Web Console, Shell(simple query), Data Verifier
  • 4. Architecture User Application MapReduce Neptune Master Neptune TabletServer #1 TabletServer #2 TabletServer #n Table Distributed File System(Hadoop or other) Physical storage
  • 5. Components Master Lock Server Neptune Client Neptune Master ZooKeeper Pleidas Neptune Master NTable Scanner Shell NChubby failover / event failover / event Data/Control Control TabletServer #1 (Neptune) TabletServer #2 (Neptune) TabletServer #n (Neptune) LogServer #1 LogServer #2 LogServer #n DFS #1 (DataNode) Computing #1 (Map&Reduce) DFS #2 (DataNode) Computing #2 (Map&Reduce) DFS #n (DataNode) Computing #n (Map&Reduce) Local disk Local disk Local disk
  • 6. Data Model Table Column#1 Column#n Rowkey row #1 ck-1 v1, t1 TabletA-1 rk-1 v2, t2 ck-2 v3, t2 v4, t3 row #k v5, t4 row #k+1 ck-n vn, tn TabletA-2 - Sorted by rowKey - Sorted by columnKey row #m Row#1 Cell row #m+1 Row.Key TabletA-n Column1 Column2 Column-n Cell.Key Cell1 Cell1 Cell1 Cell.Value(t1) Cell2 Cell2 Cell2 … … row #n Cell.Value(t2) Cell3 … … Cell-k Cell-m Cell.Value(tn) Cell-n
  • 7. Index Root Index M.T1.1000:M1 … Max:mn M.T1.2000:M2 index of Meta Index Meta Tablet m2 m1 mn T1.100:U1 … xx xx … n T1.200:U2 T1.1000:UN … T1.2000:UN T1.1100:U1 T1.1200:U2 … index of User Tablet User defined Tablet U1 U2 10 20 … 100 110 120 … 200 … Index of TableMapFile’sblock(max-key, file-offset) Index Record format: . Key - TableName.MaxRowKey . Value – Tablet Name, assigned host scan 64KB TableMapFile(Physical file,sortedby rowkey, columnkey)
  • 8. Data/index file in HDFS TabletA TabletB Column Data/index file
  • 9. TabletServer Minor Compaction MemoryTable ChangeLogServer Data Operation put(key, value) ChangeLog Searcher get(key) Merged MapFile (HDFS) MapFile#2 (HDFS) MapFile#1 (HDFS) MapFile #n (HDFS) Major Compaction
  • 10. Failover Master fail disabled only Table Schema Management and Tablet Split can execute Multi-Master TabletServer fail assign to other TabletServer by master within 2 minutes
  • 11. MapReduce with Neptune Hadoop TaskTracker TableA Map Task Map Task TabletInputFormat Map Task TabletA-1 TaskTracker Partitioned by key TableB Reduce Task TaskTracker Tablet A-2 Tablet B-1 Map Task Map Task Map Task Tablet A-3 TaskTracker Tablet B-2 … Reduce Task TaskTracker Map Task Tablet A-N Map Task Map Task DBMS or HDFS META Table
  • 12. Client Client API Single row operation: put/get Multi row operation: like, between Batch operation: scanner/uploader MapReduce: TabletInputFormat Command line Shell NQL(Neptune Query Language) JDBC support Web Console
  • 13. Client API Example TableShematableSchema = new TableSchema(“T_TEST”, new String[]{“col1”, “col2”}); NTable.createTable(tableSchema); NTablentable = Ntable.openTable(“T_TEST”); Row row = new Row(new Row.Key(“RK1”)); Row.addCell(“col1”, new Cell(new Cell.Key(“CK1”), “test_value”.getBytes())); ntable.put(row); Row selectedRow = ntable.get(new Row.Key(“RK1”)); System.out.println(selectedRow.getCellList(“col1”).get(0)); TableScanner scanner = ntable.openScanner(ntable, new String[]{“col1”}); Row scanRow = null; while( (scanRow = scanner.next()) = null) { System.out.println(selectedRow.getCellList(“col1”).get(0)); } scanner.close();
  • 14. Neptune Shell Data Definition CREATE TABLE DROP TABLE SHOW TABLES DESC Data Manipulation SELECT DELETE INSERT TRUNCATE COLUMN TRUNCATE TABLE Cluster Monitoring PING TABLETSERVER REPORT TABLE
  • 16. Performance Number of 1000-byte values read/written per second
  • 18. Powered by Neptune http://searcus.com/nosql twitter search service