The document discusses lessons that can be learned from the Voyager space mission about building distributed systems. It outlines several lessons from Voyager's design and trajectory, including embracing redundancy, being creative with failure testing, reusing existing technologies, baking in flexibility, solving problems with available resources, respecting earlier design decisions, thorough instrumentation, seeking expert advice, thorough documentation, and strong leadership. The lessons are presented as questions to encourage reflection on their application to distributed system design.
11. “This is a present from a small distant world, a token of our sounds, our science, our images, our music, our thoughts, and our feelings. We are attempting to survive our time so we may live into yours.”
President Jimmy Carter’s Golden Record message
@rseroter
15. Embrace redundancy
• Redundancy at each layer (infrastructure, networking, software, databases)
• May be active or passive
• Cloud platforms offer redundancy through both configuration and architecture
• Multi-cloud offers new redundancy options
What does this mean for your distributed system?
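One possible answer at the software layer is passive redundancy: keep a standby replica and fail over to it when the primary cannot be reached. The sketch below is illustrative only. `call_with_failover`, `primary`, and `standby` are hypothetical names, and the replicas are plain functions standing in for real network clients.

```python
from typing import Callable, Sequence


class AllReplicasFailed(Exception):
    """Raised when every redundant replica has failed."""


def call_with_failover(replicas: Sequence[Callable[[str], str]], request: str) -> str:
    """Try each replica in order and return the first successful response."""
    errors = []
    for replica in replicas:
        try:
            return replica(request)
        except ConnectionError as exc:
            # This replica is unavailable; fall through to the next one.
            errors.append(exc)
    raise AllReplicasFailed(f"{len(errors)} replicas failed for {request!r}")


# Hypothetical replicas: the primary is down, the standby answers.
def primary(request: str) -> str:
    raise ConnectionError("primary unreachable")


def standby(request: str) -> str:
    return f"standby handled {request}"


result = call_with_failover([primary, standby], "GET /status")
print(result)
```

An active variant would send the request to several replicas at once and take the first answer, trading extra load for lower latency during a failure.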
18. Be creative with “what if” analysis
• In distributed systems, you need to consider more failure points
• Identify each single point of failure (SPOF), and either mitigate or document it
• Consider using a chaos engineering approach
What does this mean for your distributed system?
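A small-scale version of the chaos engineering idea above is to inject faults into a dependency on purpose and observe whether the caller copes. The following is a minimal sketch, not a real chaos tool. `with_injected_faults`, `call_with_retries`, and `lookup_inventory` are hypothetical names, and the 30% failure rate and 3 attempts are arbitrary assumptions.

```python
import random
from typing import Callable


def with_injected_faults(func: Callable[[], str], failure_rate: float,
                         rng: random.Random) -> Callable[[], str]:
    """Wrap func so that it randomly raises, simulating a flaky dependency."""
    def flaky() -> str:
        if rng.random() < failure_rate:
            raise TimeoutError("injected fault")
        return func()
    return flaky


def call_with_retries(func: Callable[[], str], attempts: int) -> str:
    """Call func up to `attempts` times, re-raising the last error."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        try:
            return func()
        except TimeoutError as exc:
            last_error = exc
    raise last_error


def lookup_inventory() -> str:
    return "42 units"


# "What if" experiment: how often do 3 attempts survive a 30% failure rate?
rng = random.Random(7)  # fixed seed so the experiment is repeatable
flaky_lookup = with_injected_faults(lookup_inventory, failure_rate=0.3, rng=rng)
trials = 1000
successes = 0
for _ in range(trials):
    try:
        call_with_retries(flaky_lookup, attempts=3)
        successes += 1
    except TimeoutError:
        pass
print(f"{successes}/{trials} calls succeeded")
```

Varying `failure_rate` and `attempts` turns a vague "what if the dependency is flaky?" into a number that can be compared against a reliability target.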
21. Reuse wherever possible
• Reuse architecture patterns
• Reuse aspect-oriented concerns like identity, logging, monitoring, and configuration
• Leverage existing application platforms and common (operational) abstractions
What does this mean for your distributed system?
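For cross-cutting concerns such as logging, reuse can be as simple as writing the behavior once and applying it everywhere. Here is a minimal sketch using a Python decorator. The `logged` decorator, the `shared.audit` logger name, and the seat-booking functions are all hypothetical and exist only for illustration.

```python
import functools
import logging

logger = logging.getLogger("shared.audit")


def logged(func):
    """Reusable decorator: log each call and its outcome without touching business logic."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        logger.info("calling %s", func.__name__)
        try:
            result = func(*args, **kwargs)
        except Exception:
            logger.exception("%s failed", func.__name__)
            raise
        logger.info("%s returned", func.__name__)
        return result
    return wrapper


@logged
def reserve_seat(flight: str, seat: str) -> str:
    return f"{seat} reserved on {flight}"


@logged
def cancel_seat(flight: str, seat: str) -> str:
    return f"{seat} released on {flight}"
```

Both functions get identical logging from one shared definition, so a change to the logging policy is made in a single place.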
24. Bake in the flexibility you’ll need
• Decompose systems into individually updateable components
• Use dependency injection where it makes sense
• Externalize configuration and environment-specific behavior
• Use things like feature flags to achieve “progressive delivery”
What does this mean for your distributed system?
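The last three bullets can be combined in a few lines: read a rollout percentage from externalized configuration, pass that configuration in as a parameter (a light form of dependency injection), and gate new behavior behind a flag. This is a sketch under stated assumptions. The `FLAG_..._PERCENT` variable naming, the `new_checkout` flag, and the function names are invented for this example and do not belong to any particular feature-flag product.

```python
import hashlib
import os
from typing import Mapping


def rollout_percent(flag: str, env: Mapping[str, str] = os.environ) -> int:
    """Read a rollout percentage (0-100) from externalized configuration."""
    raw = env.get(f"FLAG_{flag.upper()}_PERCENT", "0")
    return max(0, min(100, int(raw)))


def is_enabled(flag: str, user_id: str, env: Mapping[str, str] = os.environ) -> bool:
    """Place each user in a stable bucket from 0 to 99 and compare it to the rollout."""
    digest = hashlib.sha256(f"{flag}:{user_id}".encode()).digest()
    bucket = int.from_bytes(digest[:4], "big") % 100
    return bucket < rollout_percent(flag, env)


def checkout(user_id: str, env: Mapping[str, str] = os.environ) -> str:
    """Serve the new flow only to users inside the current rollout."""
    if is_enabled("new_checkout", user_id, env):
        return "new checkout flow"
    return "classic checkout flow"


# Injecting the configuration makes each stage of the rollout easy to exercise.
print(checkout("user-1", {"FLAG_NEW_CHECKOUT_PERCENT": "0"}))
print(checkout("user-1", {"FLAG_NEW_CHECKOUT_PERCENT": "100"}))
```

Because the bucket is derived from a hash of the flag and user ID, a given user stays on the same side of the flag as the percentage grows, which is what makes the delivery progressive.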
27. Solve problems with what’s available
• Accept that constraints can be freeing
• Use existing people and technology where possible
• Don’t create new “platforms” when current ones are sufficient
• Introduce new components after careful consideration
What does this mean for your distributed system?
30. Have respect for earlier decisions
• Don’t immediately dismiss prior choices
• Review documentation and context
• Recognize the simplicity of the original design
• Be sure to understand the impact of “upgrading”
What does this mean for your distributed system?
33. Don’t skimp on instrumentation
• Make subsystems and services observable
• Use configurations to turn instrumentation on and off
• Get the big picture through distributed tracing and correlation
• Use this data to power your automation and decision making
What does this mean for your distributed system?
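Correlation and configurable instrumentation can be illustrated with the standard library alone: attach a correlation ID to every log line for a request, and let a setting decide how much detail is emitted. This is a minimal sketch rather than a tracing system. The `orders` logger, the `verbose` switch, and the `handle_request` function are hypothetical, and a real service would typically propagate the ID through request headers.

```python
import contextvars
import io
import logging
import uuid
from typing import Optional

# Holds the correlation ID for whatever request is currently being handled.
correlation_id = contextvars.ContextVar("correlation_id", default="-")


class CorrelationFilter(logging.Filter):
    """Attach the current correlation ID to every log record."""
    def filter(self, record: logging.LogRecord) -> bool:
        record.correlation_id = correlation_id.get()
        return True


def build_logger(stream, verbose: bool) -> logging.Logger:
    """`verbose` stands in for a configuration switch that turns detail on or off."""
    logger = logging.getLogger("orders")
    logger.handlers.clear()
    handler = logging.StreamHandler(stream)
    handler.setFormatter(logging.Formatter("%(correlation_id)s %(levelname)s %(message)s"))
    handler.addFilter(CorrelationFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.DEBUG if verbose else logging.WARNING)
    logger.propagate = False
    return logger


def handle_request(logger: logging.Logger, order_id: str,
                   incoming_id: Optional[str] = None) -> str:
    """Reuse the caller's correlation ID if present, otherwise mint a new one."""
    request_id = incoming_id or uuid.uuid4().hex
    token = correlation_id.set(request_id)
    try:
        logger.debug("validating order %s", order_id)
        logger.warning("inventory low for order %s", order_id)
        return request_id
    finally:
        correlation_id.reset(token)


output = io.StringIO()
log = build_logger(output, verbose=True)
handle_request(log, "A-17", incoming_id="req-123")
print(output.getvalue(), end="")
```

Every line written while handling the request carries the same ID, so logs from several services can be joined on it later. Setting `verbose=False` drops the debug line without any code change.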
36. Take advantage of expert advice
• Solicit feedback on your design, early and often
• Use conferences and meetups to discuss assumptions
• Leverage SREs to explore your system’s reliability
What does this mean for your distributed system?
39. Documentation matters
• Write for the later audience (developers, operators), not your current self
• Call out assumptions and constraints
• Be specific about functionality and parameters
• Document expected errors and how they are handled
• Keep it up to date
What does this mean for your distributed system?
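At the code level, the bullets above map naturally onto a docstring that states assumptions, parameters, and expected errors next to the behavior they describe. The example below is only a sketch. `parse_timeout`, its default of 30 seconds, and the fail-at-startup policy are hypothetical choices made for illustration.

```python
def parse_timeout(value: str, default_seconds: float = 30.0) -> float:
    """Convert a timeout setting into seconds.

    Assumptions:
        `value` comes from operator-edited configuration, so it may be
        blank or malformed.

    Parameters:
        value: Text such as "2.5"; surrounding whitespace is ignored.
        default_seconds: Returned unchanged when `value` is blank.

    Returns:
        A positive number of seconds parsed from `value`, or
        `default_seconds` when `value` is blank.

    Raises:
        ValueError: If `value` is not a number or is not positive. Callers
            are expected to fail at startup rather than guess a timeout.
    """
    text = value.strip()
    if not text:
        return default_seconds
    seconds = float(text)  # raises ValueError for non-numeric text
    if not seconds > 0:  # also rejects NaN, which compares false to everything
        raise ValueError(f"timeout must be positive, got {seconds}")
    return seconds
```

Keeping the documentation in the same place as the code also helps with the final bullet, since a change to the behavior and a change to its description land in the same review.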
42. Leadership matters
• Need leaders who can build consensus
• Want leaders who empower others
• Look for passion, but also calm under pressure
What does this mean for your distributed system?