HDFS
1. Johan Oskarsson
Developer at Last.fm
Hadoop and Hive committer
2. What is HDFS?
Hadoop Distributed File System
Two server types
Namenode - keeps track of block locations
Datanode - stores blocks
Files are commonly split into 128 MB blocks
Replicated to 3 datanodes by default
Scales well: ~4000 nodes
Write once
Large files
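The block-and-replica model above can be sketched in a few lines. This is an illustration only, not Hadoop code; the function names and the round-robin placement are made up for the example (the real namenode uses rack-aware placement).

```python
# Illustrative sketch (not Hadoop code): split a file into 128 MB blocks
# and assign each block to 3 distinct datanodes, round-robin style.
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB
REPLICATION = 3

def split_into_blocks(file_size, block_size=BLOCK_SIZE):
    """Return the number of blocks needed for a file of file_size bytes."""
    return max(1, -(-file_size // block_size))  # ceiling division

def place_replicas(num_blocks, datanodes, replication=REPLICATION):
    """Map each block index to `replication` distinct datanodes."""
    placement = {}
    n = len(datanodes)
    for b in range(num_blocks):
        placement[b] = [datanodes[(b + r) % n] for r in range(replication)]
    return placement

# A 1 GB file becomes 8 blocks, each stored on 3 of the datanodes.
blocks = split_into_blocks(1024 * 1024 * 1024)
print(blocks)  # 8
print(place_replicas(blocks, ["dn1", "dn2", "dn3", "dn4"])[0])
```

The namenode only keeps the block-to-datanode mapping in memory; the datanodes hold the actual bytes, which is why the namenode's RAM, not disk, bounds the number of files.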
4. Yes
We have used it in production since 2006, but then again we are insane.
5. Who is using HDFS in production?
Yahoo!: largest cluster 4000 nodes (14 PB raw storage)
Facebook: 600 nodes (2 PB raw storage)
Powerset (Microsoft): "up to 400 instances"
Last.fm: 31 nodes (110 TB raw storage)
... see more at http://wiki.apache.org/hadoop/PoweredBy
6. What do they use Hadoop for?
Yahoo!: search index, anti-spam, etc.
Facebook: ad, profile and application monitoring, etc.
Powerset: search index, heavy HBase users
Last.fm: charts, A/B testing stats, site metrics and reporting
8. Use case - MR batch jobs
Scenario
1. Large source data files are inserted into HDFS
2. MapReduce job is run
3. Output is saved to HDFS
HDFS is a great choice for this use case
Short periods of downtime are acceptable
Backups for important data
Permissions + trash to avoid user error
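The three batch steps above can be sketched as a toy in-memory MapReduce word count. This is purely illustrative and not the Hadoop API: the map, shuffle, and reduce phases are modeled as plain Python functions.

```python
# Toy in-memory MapReduce (illustration only, not the Hadoop API):
# map each input line to (word, 1) pairs, group by key, then reduce.
from collections import defaultdict

def map_phase(lines):
    for line in lines:
        for word in line.split():
            yield (word, 1)

def shuffle(pairs):
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    return {word: sum(counts) for word, counts in grouped.items()}

source = ["to be or not to be"]                    # step 1: source data in HDFS
counts = reduce_phase(shuffle(map_phase(source)))  # step 2: the MapReduce job
print(counts)                                      # step 3: output back to HDFS
```

In the real system, steps 1 and 3 are large sequential reads and writes of immutable files, which is exactly the write-once, large-file workload HDFS is built for.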
9. Use case - Serving files to a website
Scenario
1. User visits a website to browse photos
2. Lots of image files are requested from HDFS
Potential issues and solutions
HDFS isn't designed for many small files
Namenode RAM limits the number of files
Use HBase or similar
Namenode goes down
Crazy "double cluster" solution
Standby namenode HADOOP-4539
HDFS isn't really designed for low response times
Work is being done, not high priority
Use GlusterFS or MogileFS instead
10. Use case - Reliable, realtime log storage
Scenario
1. A stream of logging events is generated
2. The stream is written directly to HDFS
Potential issues and solutions
Problems with long write sessions
HDFS-200, HADOOP-6099, HDFS-278
Namenode goes down
Crazy "double cluster" solution
Standby namenode HADOOP-4539
Appends not stable
HDFS-265
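Given the issues above with long write sessions and unstable appends, a common workaround of this era was to avoid long-lived HDFS write sessions entirely: buffer log events locally, roll to a new file at a size threshold, and copy each closed file into HDFS as a whole. The sketch below illustrates the rolling-writer idea with local files; the class and file names are made up for the example.

```python
# Sketch of the "roll and upload" workaround: instead of one long-lived
# write session, accumulate events into a local file and start a new one
# once a size threshold is reached. Each closed file can then be copied
# into HDFS whole. Class and file names are hypothetical.
import os
import tempfile

class RollingLogWriter:
    def __init__(self, directory, max_bytes):
        self.directory = directory
        self.max_bytes = max_bytes
        self.index = 0
        self.written = 0
        self.handle = self._open()

    def _open(self):
        path = os.path.join(self.directory, f"events-{self.index:05d}.log")
        self.written = 0
        return open(path, "w")

    def write(self, event):
        line = event + "\n"
        if self.written + len(line) > self.max_bytes and self.written > 0:
            self.handle.close()  # a closed file is safe to upload to HDFS
            self.index += 1
            self.handle = self._open()
        self.handle.write(line)
        self.written += len(line)

    def close(self):
        self.handle.close()

with tempfile.TemporaryDirectory() as d:
    writer = RollingLogWriter(d, max_bytes=32)
    for i in range(10):
        writer.write(f"event {i}")
    writer.close()
    print(sorted(os.listdir(d)))  # several small closed files, ready to upload
```

The trade-off is latency: events only become visible in HDFS when a file rolls, which is why the JIRAs listed above pushed for stable appends and sync instead.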
11. Potential dealbreakers
Small files problem™
Use archives, sequencefiles or HBase
Appends/sync not stable
Namenode not highly available
Relatively high latency reads
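The arithmetic behind the small files problem can be sketched with the commonly cited rule of thumb of roughly 150 bytes of namenode heap per filesystem object (file, directory, or block); the exact constant is an estimate and varies by version.

```python
# Rough arithmetic behind the small files problem, assuming ~150 bytes
# of namenode heap per filesystem object (an estimate, not an exact
# figure). A small file still costs one file entry plus one block entry.
BYTES_PER_OBJECT = 150
BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB

def namenode_heap(num_files, avg_file_size):
    """Approximate namenode heap: one object per file plus one per block."""
    blocks_per_file = max(1, -(-avg_file_size // BLOCK_SIZE))
    return num_files * (1 + blocks_per_file) * BYTES_PER_OBJECT

# The same ~100 TB of data as 100 million 1 MB files vs. 100,000 1 GB files:
small = namenode_heap(100_000_000, 1024 * 1024)
large = namenode_heap(100_000, 1024 * 1024 * 1024)
print(f"small files: ~{small / 1e9:.1f} GB of namenode heap")
print(f"large files: ~{large / 1e9:.3f} GB of namenode heap")
```

The same data volume costs orders of magnitude more namenode memory when stored as many small files, which is why the slides suggest packing them into archives, sequencefiles, or HBase.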
12. Improvements
In progress or completed
HADOOP-4539 - Streaming edits to a standby NN
HDFS-265 - Appends
HDFS-245 - Symbolic links
Wish list
HDFS-209 - Tool to edit namenode metadata files
HDFS-220 - Transparent data archiving off HDFS
HDFS-503 - Reduce disk space used with erasure coding
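The savings HDFS-503 is after can be shown with simple overhead arithmetic: 3-way replication stores every byte three times, while a Reed-Solomon style (k, m) code stores k data blocks plus m parity blocks. The (10, 4) parameters below are illustrative, not taken from the JIRA.

```python
# Storage overhead: 3-way replication vs. a Reed-Solomon style (k, m)
# erasure code. The (10, 4) parameters are illustrative assumptions.
def replication_overhead(replicas=3):
    """Raw bytes stored per logical byte under n-way replication."""
    return float(replicas)

def erasure_overhead(data_blocks=10, parity_blocks=4):
    """Raw bytes stored per logical byte under a (k, m) erasure code."""
    return (data_blocks + parity_blocks) / data_blocks

data_pb = 1.0  # one petabyte of logical data
print(f"replicated:    {replication_overhead() * data_pb:.1f} PB raw")
print(f"erasure-coded: {erasure_overhead() * data_pb:.1f} PB raw")
```

Both schemes tolerate multiple failures, but the erasure-coded layout stores 1.4 bytes per logical byte instead of 3, at the cost of extra computation to reconstruct lost blocks.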