Storj Labs Decentralized
Cloud Storage: V3 Network
Overview
JUNE 2018
PURPOSE
Why are we doing V3?
PURPOSE
“Right design at X may be very wrong at 10X or 100X
... design for ~10X growth, but plan to rewrite before
~100X.”
Jeff Dean, Google
100 PB
PEAK NETWORK SIZE
PURPOSE
We have the largest decentralized storage
network in the world!
We have learned a significant amount
about what works and what doesn’t from
our current 100 PB network.
Unfortunately, some of what we’ve learned
requires going back and reworking the
network at a foundational level.
V2 CHALLENGES
What’s wrong with V2
and how are we fixing it?
V2 CHALLENGES
Poor Visibility
We’re often asked for statistics about our
network and we often don’t know the answer!
It’s not that we don’t want to know the answers;
our existing implementation and data models
are not well suited to collecting useful
data at this scale.
Not having a rich set of metrics about health
and growth of the network is like flying blind.
No wonder some Redditors are mad at us!
V3 SOLUTIONS
Visibility
We are baking instrumentation and telemetry
into all aspects of our codebase from the
get-go. We’ll be instrumenting down to the
function call level.
We’re going to be drowning in data.
(Yes, you can opt out.)
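As a rough illustration of instrumenting at the function-call level, the sketch below wraps a function so every call records its duration. It is not the V3 telemetry code: `TELEMETRY_ENABLED`, `CALL_DURATIONS`, `instrumented`, and `store_piece` are hypothetical names invented for this example, and the opt-out is modeled as a simple flag.

```python
import functools
import time
from collections import defaultdict

# Hypothetical opt-out switch and in-process metrics store (illustrative names).
TELEMETRY_ENABLED = True
CALL_DURATIONS = defaultdict(list)  # function name -> list of durations in seconds


def instrumented(func):
    """Record how long each call to func takes, unless telemetry is disabled."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        if not TELEMETRY_ENABLED:
            return func(*args, **kwargs)
        start = time.perf_counter()
        try:
            return func(*args, **kwargs)
        finally:
            # Runs whether the call returned or raised.
            CALL_DURATIONS[func.__name__].append(time.perf_counter() - start)
    return wrapper


@instrumented
def store_piece(data: bytes) -> int:
    """Stand-in for real work; returns the number of bytes 'stored'."""
    return len(data)


store_piece(b"hello")
store_piece(b"world!")
print(len(CALL_DURATIONS["store_piece"]), "calls recorded")
```

A real deployment would ship these measurements to a collector rather than keep them in a dictionary; the decorator only shows where the measurement hooks in.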
V2 CHALLENGES
Poor Durability
Amazon S3 is designed to provide
99.999999999% durability of objects over a
given year (11 9s!). This corresponds to an
average annual expected loss of 0.000000001%
of objects. For example, if you store 10,000,000
objects with Amazon S3, you can on average
expect to incur a loss of a single object once
every 10,000 years.
Getting 11 9s of durability is a big undertaking.
We have quite a ways to go.
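The arithmetic behind the 11 9s figure can be checked directly. This short sketch only reproduces the numbers quoted above, using exact fractions to avoid rounding noise.

```python
from fractions import Fraction

# 99.999999999% durability, written as an exact fraction.
DURABILITY = Fraction(99_999_999_999, 100_000_000_000)

# Expected fraction of objects lost per year: 1 - durability = 1e-11.
annual_loss_rate = 1 - DURABILITY

objects_stored = 10_000_000
expected_losses_per_year = objects_stored * annual_loss_rate
years_per_lost_object = 1 / expected_losses_per_year

print("annual loss rate:", float(annual_loss_rate))
print("years per lost object:", years_per_lost_object)
```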
V2 CHALLENGES
High Bandwidth Usage
North American residential ISPs often have
bandwidth caps. If you have a 1 TB monthly
bandwidth cap, you can only use
385 KB/s all month long before you go over
your cap!
Replication is a poor way to get high durability
while keeping bandwidth usage low. To get high durability with
replication you need more copies. We use six
copies and we still need better durability!
Using only erasure codes (like Reed-Solomon)
is a better choice. You can increase your
durability by spreading out over more nodes
without increasing the amount of data you
need to store.
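The 385 KB/s figure follows from spreading 1 TB evenly over a month. The sketch below assumes a 30-day month and decimal units (1 TB = 10^12 bytes, 1 KB = 1000 bytes). The last line is a simplification that assumes all six replicated copies had to pass through the same capped connection.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def sustained_rate_kb_per_s(cap_bytes: int, days_in_month: int = 30) -> float:
    """Average rate in KB/s (1 KB = 1000 bytes) that exactly uses up the cap."""
    return cap_bytes / (days_in_month * SECONDS_PER_DAY) / 1000


ONE_TB = 10**12
rate = sustained_rate_kb_per_s(ONE_TB)

# Assumption for illustration: with six full copies crossing the capped link,
# only one sixth of that rate carries unique data.
unique_data_rate = rate / 6

print(f"sustained rate: {rate:.1f} KB/s")
print(f"unique data at 6x replication: {unique_data_rate:.1f} KB/s")
```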
V3 SOLUTIONS
Durability & Bandwidth Usage
We are going all in on erasure coding.
Erasure coding allows us to adjust durability
without adjusting the “expansion factor” (how much
extra data we have to store for redundancy
purposes).
The expansion factor directly impacts bandwidth usage.
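To make the trade-off concrete, here is a small probability sketch. It assumes each node fails independently with probability `p_fail` over some window and that nothing is repaired in that window. The failure rate of 0.1 and the (k, n) choices are arbitrary illustrations, not Storj parameters.

```python
from math import comb


def expansion_factor(k: int, n: int) -> float:
    """Stored bytes per byte of user data for a k-of-n erasure code."""
    return n / k


def replication_loss(p_fail: float, copies: int) -> float:
    """Data is lost only if every copy is lost."""
    return p_fail ** copies


def erasure_loss(p_fail: float, k: int, n: int) -> float:
    """Data is lost when fewer than k of the n pieces survive."""
    p_ok = 1 - p_fail
    return sum(comb(n, s) * p_ok**s * p_fail**(n - s) for s in range(k))


p = 0.1
print("6 copies, expansion 6.0:", replication_loss(p, 6))
for k, n in [(10, 20), (20, 40)]:
    print(f"{k}-of-{n}, expansion {expansion_factor(k, n)}:", erasure_loss(p, k, n))
```

Under these assumptions both erasure-coded layouts store a third as much data as six copies. Moving from 10-of-20 to 20-of-40 lowers the loss probability further while the expansion factor stays at 2.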
V2 CHALLENGES
Difficult Onboarding
Experience
We require customers to write code to integrate
with Libstorj.
The industry standard is the S3 storage protocol.
Why don’t we make it easy for people who have
programs that already support S3 to support us?
Farmers will appreciate everything we do to
make customer acquisition as easy as possible.
V3 SOLUTIONS
Improved Onboarding
Experience
S3-compatible object gateway via
our partner Minio!
V2 CHALLENGES
Lack of Functionality
V2 Currently Doesn’t Support...
Streaming
Sharing
“Folders” (S3 delimited path prefixes)
Indefinite storage (contract renewal + data
repair)
Extended attributes
Rich identity and access management
V3 SOLUTIONS
Improved Functionality
V3 Will Support...
Streaming
Sharing
“Folders” (S3 delimited path prefixes)
Indefinite storage
Extended attributes
Rich identity and access management
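“Folders” in the S3 sense are not real directories. They are derived from flat object keys by grouping on a delimiter. The sketch below shows that idea in isolation; `list_folder` and the sample keys are made up for the example and are not part of any Storj or S3 client API.

```python
def list_folder(keys, prefix="", delimiter="/"):
    """Return (objects, subfolders) that sit directly under prefix.

    Keys containing the delimiter after the prefix are rolled up into a
    single subfolder entry, which is how delimited listing behaves.
    """
    objects, subfolders = [], set()
    for key in keys:
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        head, sep, _ = rest.partition(delimiter)
        if sep:
            subfolders.add(prefix + head + delimiter)
        else:
            objects.append(key)
    return sorted(objects), sorted(subfolders)


keys = [
    "photos/2018/a.jpg",
    "photos/2018/b.jpg",
    "photos/readme.txt",
    "notes.txt",
]
print(list_folder(keys))
print(list_folder(keys, prefix="photos/"))
```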
V2 CHALLENGES
Poor Latency
More parallelism can improve throughput, but
not latency.
Time-to-first-byte is a critical performance
measurement.
Improving latency means designing for a
minimum number of network request round
trips at the architecture level.
We need latency measured in milliseconds, not
seconds or minutes.
V3 SOLUTIONS
Latency
We will be tracking time-to-first-byte as a
performance measurement.
The V3 architecture minimizes network request
round trips for retrieving data.
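A minimal sketch of both ideas follows: a back-of-the-envelope model in which time-to-first-byte grows linearly with sequential round trips, and a helper that measures time-to-first-byte on any iterable of chunks. The function names and the 80 ms round-trip time are assumptions for illustration only.

```python
import time


def first_byte_estimate_ms(round_trips: int, rtt_ms: int) -> int:
    """Lower bound on time-to-first-byte if round trips happen sequentially."""
    return round_trips * rtt_ms


def time_to_first_byte(chunks, clock=time.perf_counter):
    """Return (seconds until the first non-empty chunk, that chunk).

    `chunks` is any iterable yielding bytes, for example a download stream.
    Returns (None, b"") if the stream ends without data.
    """
    start = clock()
    for chunk in chunks:
        if chunk:
            return clock() - start, chunk
    return None, b""


# Five sequential round trips versus one, at an assumed 80 ms round-trip time.
print(first_byte_estimate_ms(5, 80), "ms vs", first_byte_estimate_ms(1, 80), "ms")
```

Parallel requests do not shrink the first number; only removing sequential round trips does, which is why the design goal is stated at the architecture level.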
V2 CHALLENGES
Centralized in Some Ways
For being a decentralized storage network, it’s
surprising everyone has to use our bridge.
Our V2 paper expected everyone would be able
to run their own bridge, but the current
situation falls short of our aspirations.
The current bridge is centralized for a number
of reasons, but making sure farmers get paid is
one of them. Decentralized bridges need to
pay farmers!
V3 SOLUTIONS
Decentralization
To avoid a situation where our launch
implementation diverges from our whitepaper
goals (again), we will be launching multiple
reference heavy client installations — all with
completely open-source components.
V2 CHALLENGES
“Cheaters”
Why is cheating even a thing?
Our network should be robust enough to
incentivize all actors to do the right thing and
prevent bad actors from doing the wrong
thing.
We should be sending job offers to people who
figure out problems with our network, not
trying to penalize them!
V3 SOLUTIONS
Cheaters Never Win
Heavy clients will actually perform storage audits to
ensure farmers have the data they are supposed to.
Farmers will be paid based on successfully
responding to audits and data retrievals.
Reputation will be fixed so that farmers who don’t
actually store and return data will not earn new
contracts.
New bandwidth accounting protocol: bandwidth
will be measured by signed agreements that farmers
and clients both agree to at the beginning of every
request (without additional round trips, to be
described in the upcoming whitepaper).
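The slides do not describe the audit mechanism, so the following is only a generic challenge-response sketch of how an auditor could check that a farmer still holds a piece. It assumes the auditor can compute the expected answer itself (here by holding the data), and all function names are hypothetical. It is not Storj's audit protocol.

```python
import hashlib
import hmac
import os


def audit_response(stored_data: bytes, nonce: bytes) -> bytes:
    """What a farmer computes: a digest over a fresh nonce plus the stored data."""
    return hashlib.sha256(nonce + stored_data).digest()


def audit(expected_data: bytes, farmer_respond) -> bool:
    """Challenge a farmer with a random nonce and verify the answer.

    The fresh nonce means a farmer cannot pass by caching an old answer;
    it needs the actual bytes at audit time.
    """
    nonce = os.urandom(16)
    expected = audit_response(expected_data, nonce)
    return hmac.compare_digest(expected, farmer_respond(nonce))


piece = b"erasure-coded piece contents"


def honest_farmer(nonce: bytes) -> bytes:
    return audit_response(piece, nonce)


def cheating_farmer(nonce: bytes) -> bytes:
    # Kept only a plain hash of the piece and discarded the data itself.
    return hashlib.sha256(piece).digest()


print("honest farmer passes:", audit(piece, honest_farmer))
print("cheating farmer passes:", audit(piece, cheating_farmer))
```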
V2 CHALLENGES
Codebase Modularity
Fixing many of the previous issues in isolation
is a big challenge and requires touching many
parts of the codebase.
If the system were designed as a collection of
modular parts at the architecture level, the
code would be easier to iterate and improve
on.
It is hard to know which modular parts are
necessary before actually seeing this scale!
V3 SOLUTIONS
Codebase Modularity
V3 is broken down into a collection of
modular responsibilities:
Network overlay
Node-to-node RPC
Network state and metadata storage
Farmer reputation
Heavy client reputation
Farmer payments
S3 gateway
Farmers
Encryption/erasure encoding
Data maintenance and repair
Identity and access management
V2 CHALLENGES
Codebase Scale
We have more than 350,000 lines of JavaScript
code with few type annotations.
Complex refactors are getting very hairy and
onboarding new developers is difficult.
Lots of state needs to be kept in each
developer’s head.
Say what you will about type systems, but
unless you have 100% unit test coverage (we
don’t), type systems eliminate whole classes of
bugs during system-wide refactorings.
V3 SOLUTIONS
Codebase Scale
We’re writing new systems in Go going forward.
Go is simple.
Go is typed.
Go is compatible with C bindings (Libstorj), great for
concurrency and distributed systems, and also
good for web development (IAM/Admin).
Go has great tooling for supporting large
codebases.
V3 SOLUTIONS
How are we fixing V2’s problems with V3?
Visibility – Baked-in instrumentation
Durability, latency, and bandwidth – Erasure codes
Onboarding – S3 compatibility
Features – Streaming, folders, indefinite storage, IAM, etc.
Decentralized – Heavy clients instead of V2 bridge
“Cheating” – Incentives aligned and holes actually plugged
Codebase modularity – Reorganization into components
Codebase scale – Go language for new components
V3 ARCHITECTURE
What will V3 look like?
V3 ARCHITECTURE
[Architecture diagram: Customer Application, S3 Gateway/Libstorj, Heavy Client, Farmers]
V3 ARCHITECTURE
Heavy Client Responsibilities
Cache high-demand network information for
performance and latency reasons
Keep track of and back up object metadata
Assist with farmer selection and reputation
Identity and access management
(+ web interface)
Manage user funds
Pay farmers
Replace missing data in the network when
redundancy falls below acceptable thresholds
A single heavy client “instance” can be spread
across multiple servers for fault tolerance;
users will register an account on a specific
heavy client instance.
Each heavy client instance will be run by some
specific individual or entity as an admin who will
be responsible for uptime and upkeep of user
data, but will not be able to decrypt or access
user data.
Porting accounts and data between heavy
clients may be a launch feature, but may slip to
a future release.
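The repair responsibility listed above can be sketched as a simple threshold check. The field names and example numbers are assumptions made for this illustration; the slides do not specify the actual thresholds.

```python
from dataclasses import dataclass


@dataclass
class SegmentHealth:
    required_pieces: int    # k: minimum pieces needed to reconstruct the data
    healthy_pieces: int     # pieces currently held by reachable farmers
    repair_threshold: int   # hypothetical: repair once healthy pieces drop below this


def repair_action(seg: SegmentHealth) -> str:
    """Classify a segment as 'ok', 'repair', or 'lost'."""
    if seg.healthy_pieces < seg.required_pieces:
        return "lost"       # too few pieces remain to reconstruct anything
    if seg.healthy_pieces < seg.repair_threshold:
        return "repair"     # still recoverable, so rebuild the missing pieces now
    return "ok"


for healthy in (40, 27, 15):
    seg = SegmentHealth(required_pieces=20, healthy_pieces=healthy, repair_threshold=30)
    print(healthy, "healthy pieces ->", repair_action(seg))
```

Setting the threshold well above the minimum gives the heavy client time to re-create pieces before the data becomes unrecoverable.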
V3 ARCHITECTURE
S3 Gateway/Libstorj Responsibilities
Encrypt data
Coordinate with heavy clients for farmer identification
Erasure code data and send/retrieve to/from farmers
Stream data to/from user applications
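To show what "erasure code data and send/retrieve" means mechanically, here is a toy 2-of-3 code built from XOR parity. It is deliberately much simpler than Reed-Solomon, which is the kind of code the slides name, and it omits the encryption step entirely. The function names are illustrative.

```python
def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes):
    """Split data into two halves plus an XOR parity piece (2-of-3 code)."""
    half = (len(data) + 1) // 2
    a = data[:half]
    b = data[half:].ljust(half, b"\x00")  # pad so both halves have equal length
    return [a, b, xor_bytes(a, b)]


def decode(pieces, length: int) -> bytes:
    """Rebuild the original data from any two of the three pieces.

    A missing piece is passed as None; `length` strips the padding.
    """
    if sum(p is None for p in pieces) > 1:
        raise ValueError("need at least two of the three pieces")
    a, b, parity = pieces
    if a is None:
        a = xor_bytes(b, parity)
    if b is None:
        b = xor_bytes(a, parity)
    return (a + b)[:length]


data = b"hello storj"
pieces = encode(data)           # imagine each piece going to a different farmer

# Simulate losing any single farmer and recovering anyway.
for lost in range(3):
    survivors = [p if i != lost else None for i, p in enumerate(pieces)]
    assert decode(survivors, len(data)) == data
print("recovered from every single-piece loss")
```

This toy stores 1.5 bytes per byte of data and tolerates one lost piece; a k-of-n code generalizes the same idea to tolerate n - k losses.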
V3 ARCHITECTURE
Farmer Responsibilities
Store data reliably
Return data
Have great uptime
Get paid
PROGRESS
How far are we?
PROGRESS
Whitepaper
We’re in the middle of
updating our whitepaper to
reflect changes we’re making
from V2 to V3.
We’ll be sharing this as
soon as we can.
Code
github.com/storj/storj
We have already done some
internal tests where our
streaming object
performance is faster than S3.
We’ve begun to tie some of
the heavy client services
together.
We only started implementation
in March; there is still a long way to
go.
V2
What about V2?
V2
We have decided to halt development on
V2 and focus exclusively on V3 whenever
possible.
This does mean we have decided to halt efforts on
SIP9 deployment.
We have also shut off new account activations.
ROADMAP
Network V3 Timeline
