PuppetDB: New Adventures in Higher-Order Automation - PuppetConf 2013

The life and times of
PuppetDB
Friday, August 23, 13

DEEPAK GIRIDHARAGOPAL
deepak@puppetlabs.com
@grim_radical

We need to talk!

Puppet agent
Puppet master

Puppet agent
Puppet master
facts

Puppet agent
Puppet master
facts
netmask_lo: 255.0.0.0
augeasversion: 0.10.0
fqdn: pe-debian6.localdomain
manufacturer: "VMware, Inc."
processorcount: "1"
productname: VMware Virtual Platform
physicalprocessorcount: 1
facterversion: 1.6.7
boardproductname: 440BX Desktop Reference Platform
kernelmajversion: "2.6"
hardwareisa: unknown
timezone: PDT
puppetversion: 2.7.12 (Puppet Enterprise 2.5.1)
lsbdistcodename: squeeze
is_virtual: "true"
operatingsystemrelease: 6.0.2
virtual: vmware
type: Other
domain: localdomain
hostname: pe-debian6
selinux: "false"
kernel: Linux
kernelrelease: 2.6.32-5-686
ipaddress: 172.16.245.128
processor0: Intel(R) Core(TM) i7-2635QM CPU @ 2.00GHz
lsbdistrelease: 6.0.2
uniqueid: 007f0101
hardwaremodel: i686
kernelversion: 2.6.32
operatingsystem: Debian
architecture: i386
lsbdistdescription: Debian GNU/Linux 6.0.2 (squeeze)
lsbmajdistrelease: "6"
interfaces: "eth0,lo"
ipaddress_lo: 127.0.0.1
uptime_days: 0
lsbdistid: Debian
rubysitedir: /opt/puppet/lib/site_ruby/1.8
rubyversion: 1.8.7
osfamily: Debian
memorytotal: &id001 502.57 MB
memorysize: *id001
boardmanufacturer: Intel Corporation
path: /usr/local/sbin:/usr/local/bin:/

Puppet agent
Puppet master
catalog

file {"/tmp/foo": content => "This is a test"}

target: &id063 !ruby/object:Puppet::Resource
catalog: *id001
exported: false
file: /etc/puppetlabs/puppet/manifests/site.pp
line: 44
parameters:
!ruby/sym content: This is a test
!ruby/sym backup: main
reference: "File[/tmp/foo]"
tags:
- file
- node
- default
- class
title: /tmp/foo
type: File
file {"/tmp/foo": content => "This is a test"}

(Diagram: catalog resources shown as graph nodes, e.g. User[peadmin], Group[peadmin], File[/var/lib/peadmin], File[/var/lib/peadmin/.bashrc], File[/var/lib/peadmin/.bashrc.custom], File[/var/lib/peadmin/.vim].)

Relationships
(Diagram: dependency graph linking resources such as User[peadmin], Group[peadmin], Pe_accounts::User[peadmin], Pe_accounts::Home_dir[/var/lib/peadmin], File[/var/lib/peadmin], File[/var/lib/peadmin/.ssh/authorized_keys], and Exec[mcollective-client-cert].)

Relationships
(Diagram: a much larger dependency graph covering the peadmin, puppet-dashboard, and mcollective resources, e.g. Service[mcollective], User[puppet-dashboard], Group[puppet-dashboard], Class[Pe_accounts::Data], Exec[mcollective-server-cert], and File[/etc/puppetlabs/mcollective/server.cfg].)

Puppet agent
Puppet master
report

Puppet agent
Puppet master
report
"File[/tmp/foo]": !ruby/object:Puppet::Resource::Status
change_count: 1
changed: true
evaluation_time: 0.001869
events:
- !ruby/object:Puppet::Transaction::Event
audited: false
desired_value: !ruby/sym file
historical_value:
message: *id006
name: !ruby/sym file_created
previous_value: !ruby/sym absent
property: ensure
status: success
time: 2011-10-25 18:51:37.143970 -07:00
failed: false
file: *id007
line: 44
out_of_sync: true
out_of_sync_count: 1
resource: "File[/tmp/foo]"
resource_type: File
skipped: false
tags:
- file
- node
- default
- class
time: 2011-10-25 18:51:37.143396 -07:00
title: /tmp/foo

Puppet agent
Puppet master
PuppetDB

Puppet agent
Puppet master
PuppetDB
facts

Puppet agent
Puppet master
PuppetDB
catalog
facts
catalog

Puppet agent
Puppet master
PuppetDB
catalog
catalog
facts

Puppet agent
Puppet master
PuppetDB
catalog facts

Puppet agent
Puppet master
PuppetDB
report
catalog facts

Active Record
Puppet master
catalog

(Animation: six catalogs line up between the Puppet master and Active Record and pass through one at a time.)

Puppet master
catalog

(Diagram: a crowd of Puppet agents.)

Active Record

Active Record
Which boxes are
running nginx?

Active Record
How many servers
are running a
vulnerable version
of rails?

Active Record
What are the IP
addresses of my
webservers?

Active Record
Which users have
sudo access?

Active Record
LOLWUT

Active Record
LOLWUT
ಠ_ಠ

And now for
something
completely
different

PuppetDB

/resources/Service/nginx
PuppetDB

resources
/resources/Service/nginx
PuppetDB

/resources/Package/rails
PuppetDB

resources
/resources/Package/rails
PuppetDB

/nodes/foo.com/resources/User/deepak
PuppetDB

resources
/nodes/foo.com/resources/User/deepak
PuppetDB

We built something
quite different

1. Asynchrony

Storage &
Querying

Command
Query
Responsibility
Separation
use a different model to update
information than the model you
use to read information

CQRS
write pipeline
async, parallel, MQ-based, with
automatic retry

{
:command "replace catalog"
:version 2
:payload {...}
}

/commands MQ Parse
Delayed
Dead Letter Office
Process
UUID

Command
processors must be
retry-aware
expect failure, because
it *will* happen.

Failures like,
oh I don't know,
a database crash?
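
A minimal sketch of what "retry-aware" could look like, in Python rather than the real implementation. The envelope mirrors the command, version, and payload fields from the slide; the `id` and `attempts` fields, the function names, and the in-memory list standing in for the MQ are all assumptions made for illustration.

```python
import uuid


def make_command(name, version, payload):
    """Build a command envelope like the one on the slide, tagged with a UUID."""
    return {
        "id": str(uuid.uuid4()),   # assumed field: lets us track a command across retries
        "command": name,
        "version": version,
        "payload": payload,
        "attempts": 0,             # assumed field: how many times processing has failed
    }


def process_queue(queue, handler, max_attempts=3):
    """Drain `queue`, retrying failed commands and parking hopeless ones.

    Returns (processed_ids, dead_letter_office).
    """
    processed, dead_letters = [], []
    while queue:
        cmd = queue.pop(0)
        try:
            handler(cmd)
            processed.append(cmd["id"])
        except Exception as err:
            cmd["attempts"] += 1
            if cmd["attempts"] >= max_attempts:
                # Give up, but keep the command and the error for later inspection.
                dead_letters.append((cmd, repr(err)))
            else:
                # Re-enqueue; a real system would wait before retrying.
                queue.append(cmd)
    return processed, dead_letters
```

The handler is free to blow up (say, because the database went away); the command simply goes around again, and only lands in the dead letter office after `max_attempts` failures.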

2. New runtime

Fast,
Free,
Portable,
Multi-core,
Popular,
The JVM is all these things

Haters gonna hate!

Tons and tons of high
quality libraries
Web servers, concurrency
frameworks, databases, fast
parsing/lexing, clustering,
debugging, profiling, etc.

Can ship an uberjar,
makes deployment
straightforward with
few moving pieces

And it's fast.

Nobody cares what
runtime we use.
Users just want stuff
to work.

3. AST querying

Queries
are expressed in their
own “language”
domain-specific, AST-based
query language

["and",
["=", "type", "User"],
["=", "title", "deepak"]]

["and",
["=", ["fact", "operatingsystem"], "Debian"],
["<", ["fact", "uptime_seconds"], 10000]]

["and",
["=", "name", "ipaddress"],
["in", "certname",
["extract", "certname", ["select-resources",
["and",
["=", "type", "Class"],
["=", "title", "Apache"]]]]]]

["or",
["=", "certname", "foo.com"],
["=", "certname", "bar.com"],
["=", "certname", "baz.com"]]

We walk the tree,
compiling it to
efficient SQL
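
To make the tree walk concrete, here is a rough Python sketch of compiling a small subset of the query AST into a parameterized SQL fragment. It is not PuppetDB's compiler: the column whitelist, the function name, and the `?` placeholder style are assumptions, and only `and`, `or`, `not`, and `=` are handled.

```python
ALLOWED_COLUMNS = {"type", "title", "certname"}  # hypothetical column names


def compile_query(ast, params):
    """Recursively turn a query AST into a SQL WHERE fragment.

    Values are appended to `params` and replaced by placeholders, so user
    input never ends up inside the SQL text itself.
    """
    op, *args = ast
    if op in ("and", "or"):
        joiner = f" {op.upper()} "
        return "(" + joiner.join(compile_query(a, params) for a in args) + ")"
    if op == "not":
        (inner,) = args
        return "(NOT " + compile_query(inner, params) + ")"
    if op == "=":
        column, value = args
        if column not in ALLOWED_COLUMNS:
            raise ValueError(f"unknown column: {column!r}")
        params.append(value)
        return f"{column} = ?"
    raise ValueError(f"unknown operator: {op!r}")


params = []
sql = compile_query(
    ["and", ["=", "type", "User"], ["=", "title", "deepak"]], params
)
```

Because every node is either a known operator or rejected outright, there is no string of user-supplied query text to inject into.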

AST-based API lets
users write their own
languages
ah, you’ve got to love
open source!

(Package[httpd] and country=fr)
or country=us
Package["mysql-server"]
and architecture=amd64
Erik Dalén, Spotify
https://github.com/dalen/puppet-puppetdbquery

AST-based API lets
us more safely
manipulate queries
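
Since a query is just nested lists, client code can rewrite it as data instead of splicing strings. A small Python sketch; the helper names are made up for illustration.

```python
import json


def restrict_to_node(query, certname):
    """Wrap any existing query so it only matches one node."""
    return ["and", ["=", "certname", certname], query]


def any_of(field, values):
    """Build an "or" of equality tests over one field."""
    return ["or"] + [["=", field, v] for v in values]


user_query = ["and", ["=", "type", "User"], ["=", "title", "deepak"]]
scoped = restrict_to_node(user_query, "foo.com")
nodes = any_of("certname", ["foo.com", "bar.com", "baz.com"])

# The wire format is plain JSON.
print(json.dumps(scoped))
```

No quoting or escaping is involved at any point, which is most of what "more safely" buys you.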

daenny, Puppetboard
https://github.com/nedap/puppetboard

Puppet Enterprise, Event Inspector
https://puppetlabs.com

Foreman Integration (CERN)
https://github.com/cernops/puppetdb_foreman
Web UI
https://github.com/dima-exe/puppetdb-db
Web UI
https://github.com/gbougeard/puppetdb-frontend

Ruby
https://github.com/dalen/puppet-puppetdbquery
Ruby (DataMapper)
https://github.com/dalen/dm-puppetdb-adapter
Ruby
https://github.com/ripienaar/ruby-puppetdb

Python
https://github.com/nedap/pypuppetdb
Python
https://github.com/arcus-io/puppetdb-python
Python
https://github.com/JHaals/puppetdb-grep

Java
https://github.com/thallgren/puppetdb-javaclient
Go
https://github.com/nightlyone/puppetquery
Scala
https://github.com/gbougeard/puppetdb-frontend
CoffeeScript
https://gist.github.com/pmuellr/5591686
Node.js
https://github.com/nightfly19/minidb

MCollective
https://github.com/ploubser/mcollective-puppetdb-discovery
Rundeck
https://github.com/sirhopcount/puppetdb-rundeck
Rundeck
https://github.com/martin2110/puppetdb-rundeck

OpenStack
https://github.com/bodepd/puppet-openstack_puppetdb
Vagrant
https://github.com/grahamgilbert/vagrant-puppetmaster
PowerDNS
https://github.com/evenup/evenup-pdns

4. Boring technology

Relational Database,
embedded or
PostgreSQL
because they’re actually pretty
fantastic at ad-hoc queries,
aggregation, windowing, etc.
while maintaining safety

Relational Database,
embedded or
PostgreSQL
we use arrays, recursive queries,
indexing inside complex
structures

5. Weird alien
technology

--Jeff Gagliardi

Thousands of deployments,
Hundreds of threads per install,
Zero deadlocks,
Zero bugs involving mutable state
companion Ruby code has
~10x the defect rate

All with a pretty tiny codebase

6. Conjectures
about performance

Posit:
A resource often
exists across multiple
hosts

Feature:
Single-instance
resource storage

Posit:
We’ll often receive the
same catalog for a
host

Feature:
Single-instance
catalog storage

In the field, we
almost always see
resource and catalog
duplication rates of
over 85%.
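
One way to picture single-instance storage is content addressing: hash each resource and each catalog, and only store bodies you have not seen before. The Python sketch below is an illustration under that assumption; the class name, the SHA-1 choice, and the in-memory dicts are not taken from PuppetDB itself.

```python
import hashlib
import json


def content_hash(obj):
    """Hash a JSON-serializable object via its canonical (sorted-key) form."""
    canonical = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha1(canonical.encode("utf-8")).hexdigest()


class DedupStore:
    def __init__(self):
        self.resources = {}     # resource hash -> resource body (stored once)
        self.catalogs = {}      # catalog hash -> sorted list of resource hashes
        self.node_catalog = {}  # certname -> catalog hash

    def store_catalog(self, certname, resources):
        """Store a node's catalog, reusing anything already seen."""
        resource_hashes = sorted(content_hash(r) for r in resources)
        catalog_hash = content_hash(resource_hashes)
        if catalog_hash not in self.catalogs:
            for r in resources:
                self.resources.setdefault(content_hash(r), r)
            self.catalogs[catalog_hash] = resource_hashes
        # An identical catalog costs one pointer update and nothing else.
        self.node_catalog[certname] = catalog_hash
        return catalog_hash
```

With duplication rates like the ones above, most incoming catalogs would hit the cheap path: compute a hash, find it already present, update one pointer.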

Monitoring and
instrumentation are a
big deal. Users want
easy ways to
consume metrics and
analyze performance.

Nagios
https://github.com/jasonhancock/nagios-puppetdb
Nagios
https://github.com/favoretti/puppetdb-external-naginator
Munin
https://github.com/vpetersson/munin_puppetdb
Munin
https://github.com/dalen/puppetdb-muninplugins
Collectd
https://gist.github.com/mfournier/5615125

Turns out, people
appreciate these
efforts

(how many?)

Thousands of
production
deployments
Small shops with a dozen hosts,
large shops with thousands of
hosts, standalone, clustered...

There is a new
deployment of
PuppetDB every
15 minutes.

So...long time since
we last spoke

Availability

Available in PE3
On by default, fully supported,
and the basis for upcoming
reporting and analytics features.

Performance

20% faster storage
Improvements to memoization
and caching, eliminated
double-serialization, nuked superfluous
indexes

Much faster terminus
Better caching and data
structures. For a catalog with
10k resources, drops
serialization time from ~80s to
~6s.

Resilience

Death to keystores
Can now use PEM certificates
directly, eliminating one of the
largest sources of configuration
problems.

Configurable HTTPS
Can customize the set of cipher
suites and SSL protocols you'd
like to use, to match your
security needs.

Automatic:
-Recovery from MQ corruption
-Compression of the DLO
-Purging of inactive node data
-DB connection recycling

Backup and restore
Now integrated into the
daemon, can restore while
PuppetDB is running.

Query changes

V2 API
-No need to ask for only active
nodes
-Full fact queries (instead of
just a list of facts for a node)
-Node metadata

Wildcard Accept
Headers
curl localhost:8080/v2/nodes

Subqueries
You can now correlate data from
resource queries with fact
queries with node queries.
"Give me the IP address of all machines with
the Nginx service configured"
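
As a sketch, that question can be built as a subquery in the same AST form shown earlier in the deck, here assembled in Python. The `/v2/facts` endpoint and `query` parameter name are assumptions based on the other v2 paths on these slides, and the function name is made up; nothing is sent over the network.

```python
import json
from urllib.parse import urlencode


def ip_addresses_of_nodes_with(resource_type, title):
    """Fact query for `ipaddress`, limited to nodes that have a given resource."""
    return ["and",
            ["=", "name", "ipaddress"],
            ["in", "certname",
             ["extract", "certname",
              ["select-resources",
               ["and",
                ["=", "type", resource_type],
                ["=", "title", title]]]]]]


query = ip_addresses_of_nodes_with("Service", "nginx")

# Assumed endpoint and parameter name; adjust to your installation.
url = "http://localhost:8080/v2/facts?" + urlencode({"query": json.dumps(query)})
print(url)
```

The inner `select-resources` finds the nodes, `extract` pulls out their certnames, and the outer fact query returns the `ipaddress` fact for just those nodes.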

Report storage
-Comes with a report
processing plugin
-Store report-level metadata
-Can do queries on events that
span reports
-Basis for PE's Event Inspector

Streaming
queries!

Streaming queries
Stream results to clients
on-the-fly, as they come in from the
database.
Massively lower latency for first
response!
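
The idea can be sketched with a Python generator that emits a JSON array piece by piece instead of building the whole response first. This is only an illustration of the technique; `rows_from_database` is a stand-in for a real database cursor, and both function names are invented.

```python
import json


def stream_json_array(rows):
    """Yield a JSON array chunk by chunk as rows become available."""
    yield "["
    first = True
    for row in rows:
        if not first:
            yield ","
        yield json.dumps(row)
        first = False
    yield "]"


def rows_from_database():
    # Stand-in for a database cursor; rows are produced one at a time.
    for certname in ("foo.com", "bar.com", "baz.com"):
        yield {"certname": certname}


for chunk in stream_json_array(rows_from_database()):
    print(chunk, end="")
print()
```

The first bytes go out as soon as the first row arrives, and memory use stays flat no matter how many rows follow.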

resource resource resource resource resource resource resource
PuppetDB

/v2/resources
PuppetDB

/v2/resources
PuppetDB

Coming up!

We will be developing tools to replicate
data from one PuppetDB daemon to
another. This will help with HA and DR.
PuppetDB
Diff &
Mirror PuppetDB

By initially developing an out-of-band
mirroring tool, we can create more
interesting replication topologies:
PuppetDB
Diff &
Mirror PuppetDB
Diff &
Mirror

We can also later optimize the process to
lower latency, but preserve eventual
consistency:
PuppetDB
Diff &
Mirror
PuppetDB
Direct MQ connection

More flexible routing is coming, allowing
for soft failures and read/write splits:
PuppetDB
Puppetmaster
PuppetDB
Replication
Catalogs, Facts,
Reports
Collection
queries
Log error and
continue

So anyways,

Documented at
http://docs.puppetlabs.com/puppetdb
install, config, upkeep, specs,
the works!

Packaged
as deb and rpm for
open source, part of
Puppet Enterprise
available in the Puppet Labs
package repositories

Puppetized
using the
puppetlabs/puppetdb
module
available now, on the
Module Forge!

Open source
http://github.com/puppetlabs/puppetdb
same license as Puppet itself!

deepak
giridharagopal
deepak@puppetlabs.com
@grim_radical [github twitter freenode]
