2. History
Concepts borrowed heavily (or stolen) from the
classic papers “Bootstrapping an
Infrastructure” by Steve Traugott and Joel
Huddleston and “Computer Immunology” by
Mark Burgess, and from Promise Theory
Personal experience – syncing
scripts, predicting change, better
communication
3. What does this fix?
How do I keep X (files, permissions, services) from
changing unpredictably?
When did change happen? Is it related to the
downtime incident we had? Or unpredictable
deployments?
Who/what group made that change?
The system is growing (or has grown) arms and legs in
unpredictable, astonishing directions, making it
difficult or impossible to reproduce or to make minor
changes: deployments are the equivalent of leveling
the whole house to change one light bulb.
Critical parts of the infrastructure reside in people's
heads - bad for scaling the company, bad for
individual development. Put the real estate to better
use.
4. Centralized, Automated Standards
Sounds intuitive, but…
Obvious examples in SA world –
LDAP, DNS, logservers, data
consistency, NFS fileservers
Same principle as programmers’ DRY
5. What does this look like?
1) Version-Controlled Published Configurations
2) Master Fileserver Repository
3) Automated Propagation and Maintenance
The heart of where much of today’s DevOps work exists: this is
where tools like cfengine, puppet, and chef “level up”
the way your infrastructure is managed. See links on last
slide for more information.
4) Monitoring the Infrastructure
5) Self-Healing
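A minimal sketch of what "automated propagation and maintenance" means at the level of a single file, assuming a POSIX host. It does not use cfengine, puppet, or chef; the function name `ensure_file` and its parameters are hypothetical, chosen only to illustrate the idea of converging a resource to a declared state and reporting what changed.

```python
import os
import stat


def ensure_file(path, content, mode=0o644):
    """Converge one file to the desired content and permissions.

    Returns a list of the changes made; an empty list means the file
    already matched the desired state (hypothetical helper, not a tool API).
    """
    changes = []

    # Read the current content, if the file exists at all.
    current = None
    if os.path.exists(path):
        with open(path, "r", encoding="utf-8") as fh:
            current = fh.read()

    # Repair content drift only when there is drift.
    if current != content:
        with open(path, "w", encoding="utf-8") as fh:
            fh.write(content)
        changes.append("content")

    # Repair permission drift only when there is drift.
    if stat.S_IMODE(os.stat(path).st_mode) != mode:
        os.chmod(path, mode)
        changes.append("mode")

    return changes
```

Running the same call twice is safe: the second run finds nothing to fix and returns an empty list. That property is what lets an agent run on a schedule and quietly undo unplanned changes.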
6. Version-Controlled Published Configurations
Git, svn, perforce, cvs – SCM of choice
Promise Theory – connected but independent
agents cannot wrest guarantees from each
other – they can only truly obligate
themselves. But this can be leveraged to
coordinate.
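A toy sketch of the Promise Theory point, heavily simplified. The `Agent` class, the `coverage` function, and the promise names are all hypothetical; the only idea illustrated is that each agent declares promises about its own behaviour, and coordination is built by reading those promises rather than by issuing commands.

```python
from dataclasses import dataclass, field


@dataclass
class Agent:
    """An autonomous agent that can only make promises about itself."""
    name: str
    promises: set = field(default_factory=set)

    def promise(self, body):
        # There is deliberately no method for obligating another agent.
        self.promises.add(body)

    def has_promised(self, body):
        return body in self.promises


def coverage(needs, agents):
    """Map each need to the names of the agents that have promised it."""
    return {
        need: sorted(a.name for a in agents if a.has_promised(need))
        for need in needs
    }


dns = Agent("dns01")
dns.promise("serve-dns")
web = Agent("web01")
web.promise("serve-http")

# Needs nobody has promised show up as empty lists, i.e. visible gaps.
plan = coverage(["serve-dns", "serve-ntp"], [dns, web])
```

Here `plan` shows that `dns01` covers DNS and that nobody has promised NTP, so the gap is found by inspection instead of by a failed command.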
9. Monitoring & Self-Healing
What’s the current state
Post-change state
Event-driven hooks from monitoring back to the
automation tool create self-healing
e.g. Nagios, Empirix, monitoring tool of choice
End-to-end change visibility – intended
changes, logged changes, monitoring events
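The event-driven hook described above can be sketched roughly as follows. This is not the API of Nagios, Empirix, or any automation tool: the event dictionary shape, `make_dispatcher`, and the return values are assumptions made for illustration. Every step is logged so that monitoring events and the resulting changes end up in one trail.

```python
import logging

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("selfheal")


def make_dispatcher(remediations):
    """Build a handler that maps monitoring events to remediation callables.

    `remediations` maps a service name to a function taking a host name
    (hypothetical convention, standing in for a call into the automation tool).
    """
    def handle(event):
        # Assumed event shape:
        # {"host": str, "service": str, "state": "OK" | "WARNING" | "CRITICAL"}
        log.info("monitoring event: %s", event)

        # Only act on hard failures; everything else is just recorded.
        if event["state"] != "CRITICAL":
            return "ignored"

        action = remediations.get(event["service"])
        if action is None:
            # No automated fix is known, so a human has to look at it.
            log.warning("no remediation registered for %s", event["service"])
            return "escalated"

        action(event["host"])
        log.info("remediation run for %s on %s", event["service"], event["host"])
        return "remediated"

    return handle
```

In practice the registered callable would trigger a run of the configuration tool on the affected host; here it is left as any function so the sketch stays self-contained.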
10. What do we gain?
A lot:
Known configs/profiles assured to reflect live
system state
auditable easy-to-administer security configurations
predictable change and rollback
Large-scale updates that are
seamless, uniform, and logged.
Agile compliance!
Uptime!
More free time! To devote to higher-level activities
11. Good Reading
Classic “Bootstrapping an Infrastructure,” LISA ’98 -
http://www.infrastructures.org/papers/bootstrap/bootstrap.html
Self-Healing Networks -
http://onlamp.com/pub/a/onlamp/2006/05/25/self-healing-networks.html?page=1
Relative origins of cfengine, puppet, chef -
http://verticalsysadmin.com/blog/uncategorized/relative-origins-of-cfengine-chef-and-puppet
Promises of DevOps -
http://cfengine.com/markburgess/blog_devops.html
Promise Theory -
http://en.wikipedia.org/wiki/Promise_theory
Editor's notes
What does DevOps mean generally?
- Cross-discipline pollination: code as infrastructure, continuous improvement, surgical changes, holistic view, end-to-end visibility, self-healing systems
- Shared risk and responsibility for classic responsibilities
- Potential areas to be careful: blurred responsibilities, defining expertise down, mistaking outsourcing for management (e.g. many startups leveraging Amazon: DevOps as NoOps)
+ add diagram of roadmap (puppet cfengine/chef?)
+ complexity management