SlideShare a Scribd company logo
1 of 12
Download to read offline
Hadoop HDFS
The Ultimate Storage
tagomoris
2013/05/20 Cassandra Casual #1
13年5月20日月曜日
Nodes
• NameNode (metadata)
• 1
• or 2 (NamenodeHA + 3 JournalNodes)
• DataNode (blocks)
• 3~ nodes
• Rack awareness
13年5月20日月曜日
Filesystem
• Metadata on Namenode JVM heap
• "OK, Namenode should have giant RAM"
• File with Blocks (default 64MB)
• Block level compression & parallel read
13年5月20日月曜日
Compression
• Gzip, Bzip2, ....
• By filename suffix!
• By HDFS specific container file feature
13年5月20日月曜日
Replication
• Block level replication
• Default 3 replicas
• Automatically replicated
13年5月20日月曜日
Rebalancing
• `start-balancer.sh`
13年5月20日月曜日
Protocol
• Java (DFSClient) Native Protocol
• Binary protocol
• Version sensitive
• All clients communicate with all nodes
13年5月20日月曜日
Protocol #2
• WebHDFS (Hadoop v1.0~)
• HTTP
• Protocol version defined
• All clients communicate with all nodes
• HttpFs (Hadoop v2.0~)
• HTTP proxy server for DFSClient
• All clients communicate with a node
13年5月20日月曜日
Concurrency
• NONE
• Concurrent write(append) breaks file
13年5月20日月曜日
Performance
• HDFS is for sequencial access
• and for large (128MB or more) files
• HDFS is not for random access
• HBase is perfect software for you!
13年5月20日月曜日
Admin tools
• WebUI with poor CSS
• CLI `hdfs dfsadmin`
13年5月20日月曜日
Conclusion
• Use just for Hadoop batches
13年5月20日月曜日

More Related Content

What's hot

エコなWebサーバー
エコなWebサーバーエコなWebサーバー
エコなWebサーバー
emasaka
 
Google Perf Tools (tcmalloc) の使い方
Google Perf Tools (tcmalloc) の使い方Google Perf Tools (tcmalloc) の使い方
Google Perf Tools (tcmalloc) の使い方
Kazuki Ohta
 
Xとかオワコン?
Xとかオワコン?Xとかオワコン?
Xとかオワコン?
Naohiro Aota
 
CategoLJについて
CategoLJについてCategoLJについて
CategoLJについて
Toshiaki Maki
 
20101220 pixiv tech_meeting
20101220 pixiv tech_meeting20101220 pixiv tech_meeting
20101220 pixiv tech_meeting
semind
 
HaskellではじめるCortex-M3組込みプログラミング
HaskellではじめるCortex-M3組込みプログラミングHaskellではじめるCortex-M3組込みプログラミング
HaskellではじめるCortex-M3組込みプログラミング
Kiwamu Okabe
 

What's hot (19)

エコなWebサーバー
エコなWebサーバーエコなWebサーバー
エコなWebサーバー
 
ハイパフォーマンスブラウザネットワーキング 12章「HTTP 2.0」と現在の仕様
ハイパフォーマンスブラウザネットワーキング 12章「HTTP 2.0」と現在の仕様ハイパフォーマンスブラウザネットワーキング 12章「HTTP 2.0」と現在の仕様
ハイパフォーマンスブラウザネットワーキング 12章「HTTP 2.0」と現在の仕様
 
マスタリングJUNOS DHCP
マスタリングJUNOS DHCPマスタリングJUNOS DHCP
マスタリングJUNOS DHCP
 
20190518 27th-chugoku db-lt-pg12
20190518 27th-chugoku db-lt-pg1220190518 27th-chugoku db-lt-pg12
20190518 27th-chugoku db-lt-pg12
 
Google Perf Tools (tcmalloc) の使い方
Google Perf Tools (tcmalloc) の使い方Google Perf Tools (tcmalloc) の使い方
Google Perf Tools (tcmalloc) の使い方
 
NetBSD/evbarm on Raspberry Pi
NetBSD/evbarm on Raspberry PiNetBSD/evbarm on Raspberry Pi
NetBSD/evbarm on Raspberry Pi
 
Varnish
VarnishVarnish
Varnish
 
EWD 3 トレーニング・コース #1 Node.jsとGT.Mの統合方法
EWD 3トレーニング・コース #1 Node.jsとGT.Mの統合方法EWD 3トレーニング・コース #1 Node.jsとGT.Mの統合方法
EWD 3 トレーニング・コース #1 Node.jsとGT.Mの統合方法
 
Xとかオワコン?
Xとかオワコン?Xとかオワコン?
Xとかオワコン?
 
20110205.linux 0.01
20110205.linux 0.0120110205.linux 0.01
20110205.linux 0.01
 
CategoLJについて
CategoLJについてCategoLJについて
CategoLJについて
 
Emacs
EmacsEmacs
Emacs
 
GGEasyMonitor技術情報
GGEasyMonitor技術情報GGEasyMonitor技術情報
GGEasyMonitor技術情報
 
VIOPS06: マルチコア時代のコンピューティング活用術
VIOPS06: マルチコア時代のコンピューティング活用術VIOPS06: マルチコア時代のコンピューティング活用術
VIOPS06: マルチコア時代のコンピューティング活用術
 
20101220 pixiv tech_meeting
20101220 pixiv tech_meeting20101220 pixiv tech_meeting
20101220 pixiv tech_meeting
 
HaskellではじめるCortex-M3組込みプログラミング
HaskellではじめるCortex-M3組込みプログラミングHaskellではじめるCortex-M3組込みプログラミング
HaskellではじめるCortex-M3組込みプログラミング
 
Web Service on SSD
Web Service on SSDWeb Service on SSD
Web Service on SSD
 
OpenBSD/luna88k news at NBUG meeting 2014-04
OpenBSD/luna88k news at NBUG meeting 2014-04OpenBSD/luna88k news at NBUG meeting 2014-04
OpenBSD/luna88k news at NBUG meeting 2014-04
 
ParliamentでGeoSPARQL
ParliamentでGeoSPARQLParliamentでGeoSPARQL
ParliamentでGeoSPARQL
 

Viewers also liked

Hadoop Developer
Hadoop DeveloperHadoop Developer
Hadoop Developer
Edureka!
 
Hadoop operations basic
Hadoop operations basicHadoop operations basic
Hadoop operations basic
Hafizur Rahman
 
Hadoop Installation and basic configuration
Hadoop Installation and basic configurationHadoop Installation and basic configuration
Hadoop Installation and basic configuration
Gerrit van Vuuren
 
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Edureka!
 

Viewers also liked (16)

Hadoop Developer
Hadoop DeveloperHadoop Developer
Hadoop Developer
 
Anatomy of file write in hadoop
Anatomy of file write in hadoopAnatomy of file write in hadoop
Anatomy of file write in hadoop
 
Hadoop operations basic
Hadoop operations basicHadoop operations basic
Hadoop operations basic
 
What's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File SystemWhat's New and Upcoming in HDFS - the Hadoop Distributed File System
What's New and Upcoming in HDFS - the Hadoop Distributed File System
 
Hadoop YARN overview
Hadoop YARN overviewHadoop YARN overview
Hadoop YARN overview
 
Hadoop Installation and basic configuration
Hadoop Installation and basic configurationHadoop Installation and basic configuration
Hadoop Installation and basic configuration
 
HDFS NameNode High Availability
HDFS NameNode High AvailabilityHDFS NameNode High Availability
HDFS NameNode High Availability
 
Learn Big Data & Hadoop
Learn Big Data & Hadoop Learn Big Data & Hadoop
Learn Big Data & Hadoop
 
HDFS Namenode High Availability
HDFS Namenode High AvailabilityHDFS Namenode High Availability
HDFS Namenode High Availability
 
Apache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS FederationApache Hadoop YARN, NameNode HA, HDFS Federation
Apache Hadoop YARN, NameNode HA, HDFS Federation
 
Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
Hadoop Operations Powered By ... Hadoop (Hadoop Summit 2014 Amsterdam)
 
Sqoop
SqoopSqoop
Sqoop
 
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability | Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
Hadoop 2.0 Architecture | HDFS Federation | NameNode High Availability |
 
Introduction to Apache Sqoop
Introduction to Apache SqoopIntroduction to Apache Sqoop
Introduction to Apache Sqoop
 
Apache Hadoop In Theory And Practice
Apache Hadoop In Theory And PracticeApache Hadoop In Theory And Practice
Apache Hadoop In Theory And Practice
 
Big Data & Hadoop Tutorial
Big Data & Hadoop TutorialBig Data & Hadoop Tutorial
Big Data & Hadoop Tutorial
 

Similar to Hadoop HDFS: The Ultimate Storage

Mtddc kyusyu-lightningtalks
Mtddc kyusyu-lightningtalksMtddc kyusyu-lightningtalks
Mtddc kyusyu-lightningtalks
Yuji Takayama
 

Similar to Hadoop HDFS: The Ultimate Storage (11)

Hadoopのシステム設計・運用のポイント
Hadoopのシステム設計・運用のポイントHadoopのシステム設計・運用のポイント
Hadoopのシステム設計・運用のポイント
 
MapR アーキテクチャ概要 - MapR CTO Meetup 2013/11/12
MapR アーキテクチャ概要 - MapR CTO Meetup 2013/11/12MapR アーキテクチャ概要 - MapR CTO Meetup 2013/11/12
MapR アーキテクチャ概要 - MapR CTO Meetup 2013/11/12
 
Mysql casial01
Mysql casial01Mysql casial01
Mysql casial01
 
Hadoop operation chaper 4
Hadoop operation chaper 4Hadoop operation chaper 4
Hadoop operation chaper 4
 
Hadoopによる大規模分散データ処理
Hadoopによる大規模分散データ処理Hadoopによる大規模分散データ処理
Hadoopによる大規模分散データ処理
 
ゆるふわLinux-HA 〜PostgreSQL編〜
ゆるふわLinux-HA 〜PostgreSQL編〜ゆるふわLinux-HA 〜PostgreSQL編〜
ゆるふわLinux-HA 〜PostgreSQL編〜
 
PostgreSQLバックアップの基本
PostgreSQLバックアップの基本PostgreSQLバックアップの基本
PostgreSQLバックアップの基本
 
Cloudera大阪セミナー 20130219
Cloudera大阪セミナー 20130219Cloudera大阪セミナー 20130219
Cloudera大阪セミナー 20130219
 
Mtddc kyusyu-lightningtalks
Mtddc kyusyu-lightningtalksMtddc kyusyu-lightningtalks
Mtddc kyusyu-lightningtalks
 
Yesod on Heroku
Yesod on HerokuYesod on Heroku
Yesod on Heroku
 
WalBの紹介
WalBの紹介WalBの紹介
WalBの紹介
 

More from SATOSHI TAGOMORI

More from SATOSHI TAGOMORI (20)

Ractor's speed is not light-speed
Ractor's speed is not light-speedRactor's speed is not light-speed
Ractor's speed is not light-speed
 
Good Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/OperationsGood Things and Hard Things of SaaS Development/Operations
Good Things and Hard Things of SaaS Development/Operations
 
Maccro Strikes Back
Maccro Strikes BackMaccro Strikes Back
Maccro Strikes Back
 
Invitation to the dark side of Ruby
Invitation to the dark side of RubyInvitation to the dark side of Ruby
Invitation to the dark side of Ruby
 
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)Hijacking Ruby Syntax in Ruby (RubyConf 2018)
Hijacking Ruby Syntax in Ruby (RubyConf 2018)
 
Make Your Ruby Script Confusing
Make Your Ruby Script ConfusingMake Your Ruby Script Confusing
Make Your Ruby Script Confusing
 
Hijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in RubyHijacking Ruby Syntax in Ruby
Hijacking Ruby Syntax in Ruby
 
Lock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive OperationsLock, Concurrency and Throughput of Exclusive Operations
Lock, Concurrency and Throughput of Exclusive Operations
 
Data Processing and Ruby in the World
Data Processing and Ruby in the WorldData Processing and Ruby in the World
Data Processing and Ruby in the World
 
Planet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: BigdamPlanet-scale Data Ingestion Pipeline: Bigdam
Planet-scale Data Ingestion Pipeline: Bigdam
 
Technologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise BusinessTechnologies, Data Analytics Service and Enterprise Business
Technologies, Data Analytics Service and Enterprise Business
 
Ruby and Distributed Storage Systems
Ruby and Distributed Storage SystemsRuby and Distributed Storage Systems
Ruby and Distributed Storage Systems
 
Perfect Norikra 2nd Season
Perfect Norikra 2nd SeasonPerfect Norikra 2nd Season
Perfect Norikra 2nd Season
 
Fluentd 101
Fluentd 101Fluentd 101
Fluentd 101
 
To Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT ToTo Have Own Data Analytics Platform, Or NOT To
To Have Own Data Analytics Platform, Or NOT To
 
The Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and ContainersThe Patterns of Distributed Logging and Containers
The Patterns of Distributed Logging and Containers
 
How To Write Middleware In Ruby
How To Write Middleware In RubyHow To Write Middleware In Ruby
How To Write Middleware In Ruby
 
Modern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real WorldModern Black Mages Fighting in the Real World
Modern Black Mages Fighting in the Real World
 
Open Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud ServiceOpen Source Software, Distributed Systems, Database as a Cloud Service
Open Source Software, Distributed Systems, Database as a Cloud Service
 
Fluentd Overview, Now and Then
Fluentd Overview, Now and ThenFluentd Overview, Now and Then
Fluentd Overview, Now and Then
 

Hadoop HDFS: The Ultimate Storage