Link to video: https://youtu.be/lDXdf5q8Yw8
At Indeed, we use massive amounts of data to build our products and services. At first, we relied on rsync to distribute these data to our servers. This rsync system lasted for ten years before we started to encounter scaling challenges. So we built a new system on top of BitTorrent to improve latency, reliability, and throughput. Today, terabytes of data flow around the world every day between our servers. In this talk, we will describe what we needed, what we created, and the lessons we learned building a system at this scale.
3. Indeed is the #1 external source of hire
64% of US job searchers search on Indeed each month
[Chart: Unique Visitors (millions) by year, 2009 to 2015; y-axis 0 to 180; labeled value 180M]
180 million unique users
80.2M unique US visitors per month
16M jobs
50+ countries
28 languages
5. How We Build Systems
fast simple resilient scalable
9. Job Search Browser Rendering
median ~0.5 seconds
[Chart: browser rendering time in milliseconds (y-axis 0 to 800), daily from Feb 24 to Mar 8]
77. Job Search Browser Rendering
median ~0.5 seconds
[Chart: browser rendering time in milliseconds (y-axis 0 to 800), daily from Feb 24 to Mar 8]
207. Rhone stores list of versions by artifact.
artifactA: version 4, version 5, version 6
artifactB: version 221, version 226, version 227, version 228
artifactC: version 1
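The artifact-to-versions listing on this slide can be pictured as a simple mapping from artifact name to an ordered list of version numbers. The sketch below is only an illustration of that shape: the class name `VersionIndex` and its methods are hypothetical, it keeps everything in memory, and it says nothing about how Rhone actually stores or serves this data.

```python
from collections import defaultdict


class VersionIndex:
    """Hypothetical in-memory stand-in for a per-artifact version list.

    This is NOT Rhone's API; it only mirrors the shape shown on the slide.
    """

    def __init__(self):
        # artifact name -> sorted list of known version numbers
        self._versions = defaultdict(list)

    def add_version(self, artifact, version):
        """Record a version for an artifact, ignoring duplicates."""
        versions = self._versions[artifact]
        if version not in versions:
            versions.append(version)
            versions.sort()

    def versions(self, artifact):
        """Return a copy of the known versions, oldest first."""
        return list(self._versions.get(artifact, []))

    def latest(self, artifact):
        """Return the highest known version, or None if there is none."""
        versions = self._versions.get(artifact)
        return versions[-1] if versions else None


# Populate the index with the values shown on the slide.
index = VersionIndex()
for artifact, version in [
    ("artifactA", 4), ("artifactA", 5), ("artifactA", 6),
    ("artifactB", 221), ("artifactB", 226), ("artifactB", 227), ("artifactB", 228),
    ("artifactC", 1),
]:
    index.add_version(artifact, version)

print(index.versions("artifactA"))  # [4, 5, 6]
print(index.latest("artifactB"))    # 228
```

Keeping several versions per artifact, rather than only the newest, is what would let a consumer pick a specific version or fall back to an earlier one; whether Rhone uses the list that way is an assumption here.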
223. 100 artifacts in 10 years
2004: Indeed
2008: 6 countries
2009: 23 countries
2011: 52 countries
2014: rsync limits; 1st artifact migrated to RAD
2015: critical artifacts migrated
2016: 80 RAD artifacts
80 new artifacts in 1 year