SlideShare una empresa de Scribd logo
1 de 39
Descargar para leer sin conexión
Telemetry doesn't have to be scary
Developer focused metrics that make sense.
  
Table of Contents
Why we need metrics
Impact Analysis
Privacy. It's a thing.
Telemetry Data Model
Metric Plugin Design
Using the Data
Get Involved!
New talk, who dis?
Ben Ford
Developer Advocate @ Puppet
@binford2k:   
Why we need metrics
It's not a huge task to design and build a simple, single-purpose, bespoke Puppet
module in a short period of time.

Maintenance is a different story

Hey, can we add ArchLinux support?
What about Windows 2012?
Here's a PR for a defined type to manage the database too... (MariaDB only)
Development resources
  
Every single platform or feature you support takes time to maintain it!


Data driven development decisions
Two roads diverged in a wood, and I—
I took the one less traveled by,
And that has made all the difference.
-- Robert Frost
Photo by https://unsplash.com/@leliejens
Impact Analysis
Rangefinder
[~/Projects/puppetlabs-concat]$ rangefinder manifests/fragment.pp
[concat::fragment] is a _type_
==================================
The enclosing module is declared in 173 of 575 indexed public Puppetfiles
Breaking changes to this file WILL impact these modules:
* nightfly-ssh_keys (https://github.com/nightfly19/puppet-ssh_keys.git)
* viirya-mit_krb5 (git://github.com/viirya/puppet-mit_krb5.git)
* rjpearce-opendkim (https://github.com/rjpearce/puppet-opendkim)
* shadow-tor (git://github.com/LeShadow/puppet-tor.git)
[...]
Breaking changes to this file MAY impact these modules:
* empi89-quagga (UNKNOWN)
* unyonsys-keepalived (UNKNOWN)
* Flameeyes-udevnet (UNKNOWN)
* ricbra-ratbox (git://github.com/ricbra/puppet-ratbox.git)
[...]
https://github.com/puppetlabs/puppet-community-rangefinder
Identify the public Forge modules using parts of your code.
Run it on a pull request!
https://github.com/apps/puppet-community-rangefinder
Rangefinder also comes in a GitHub App flavor.
Static analysis of dependencies
Works by dissecting latest releases of all modules on the Forge
Static code analysis identifies what every module uses
resource types
functions
classes
Runs weekly to generate dependency network
This uses my itemize library: https://github.com/binford2k/binford2k-itemize
Photo by https://unsplash.com/@alinnnaaaa
Dependency data is public
Find more information about how it works and other cool queries you can run on the dataset powering
it at my blog: https://binford2k.com/2020/04/06/rangefinder/
The data Rangefinder uses is stored in a public BigQuery database.
Major limitation
Rangefinder only works on public Forge modules (currently).
Cannot identify how code is used by profiles or private modules.
Photo by https://unsplash.com/@sguestsmith
Privacy. It's a thing.
Business Identifiable Information
Most of you know BII's big cousin, PII.
But many of the same considerations apply to businesses.
Many companies don't allow phone-home metrics at all.
Sharing too much implementation information about how an infrastructure works is just
asking for security compromises.

Fingerprinting
A more insidious danger is fingerprinting.
Identifying individuals by properties or habits.
Also applies to infrastructures.
If you saw that a site was in the CEST time zone, had the locale set to fr-ch , and had
thousands of nodes classified with various HPC modules, then you might guess that you
were looking at CERN.

Live example of fingerprinting
http://panopticlick.eff.org/
Hard requirements

1. Provide real value to the end user. (that's you)
2. Transparent about what data is collected, why, and what it's used for.
3. Give you the ability to opt in or out without gating other benefits.
4. The data collected can be aggregated to share publicly.
Telemetry Data Model
Public dataset
No access to individual records themselves
-- The top ten most used classes in the ecosystem
SELECT name, count
FROM `dataops-puppet-public-data.aggregated.class_usage_count`
ORDER BY count DESC
LIMIT 10
[
{ "name": "Resource_api::Agent", "count": "272" },
{ "name": "Account", "count": "272" },
{ "name": "Ssl::Params", "count": "272" },
{ "name": "Classification", "count": "272" },
{ "name": "Os_patching", "count": "269" },
{ "name": "Ntp", "count": "265" },
{ "name": "Ntp::Install", "count": "265" },
{ "name": "Ntp::Config", "count": "265" },
{ "name": "Ntp::Service", "count": "265" },
{ "name": "Zsh::Params", "count": "265" }
]
You'll need a Google Cloud account and then you can access the dataset with your browser via the
BigQuery Console. Then you can run any queries you'd like.
Aggregating data
Data is generated via SQL queries:
-- Generate an aggregate table of all classes used and their counts.
SELECT classes.name, classes.count
FROM `bto-dataops-datalake-prod.dujour.community_metrics`,
UNNEST(classes) as classes
GROUP BY classes.name, classes.count
https://github.com/puppetlabs/dropsonde-aggregation
Private dataset
So about that sensitive data...
Telemetry reports are not terribly sensitive, but they are fingerprintable.
So we store them only in a secured dataset.
Limited access, only myself and our BigQuery admin team.
Our own developers who want to use collected metrics data have to do it via aggregation
queries just like you would.

Metric Plugin Design
Plugin hook design
Plugins are implemented as hooks.
description
schema
example
run
Some hooks are optional
initialize
setup
cleanup
Schema hook
Describes data that the plugin is allowed to return.
Returns a BigQuery schema.
JSON format with an array of row definitions
def self.schema
[
{
"description": "The number of environments",
"mode": "NULLABLE",
"name": "environment_count",
"type": "INTEGER"
}
]
end
See full schema of all plugins with dropsonde dev schema
Run hook
Generates and returns data for the metric
Returns an array of rows of data.
Can return any data that the Puppet Server node can generate.
Use PuppetDB, codebase, etc.
Methods provided to filter out private modules.
Must match schema or plugin fails.
def self.run
[
:environment_count => Puppet.lookup(:environments).list.count,
]
end
Example hook
Representative example of sample data
Provide one element of randomized representative data.
Used to generate a dataset that's shaped like the real dataset.
Must match the schema or plugin fails.
Used for developers to generate aggregation queries.
def self.example
[
:environment_count => rand(1..100),
]
end
Available as dataops-puppet-public-data:community.community_metrics
Previewing a telemetry report
$ dropsonde --enable environments preview
Puppet Telemetry Report Preview
===============================
Dropsonde::Metrics::Environments
-------------------------------
This group of metrics gathers information about environments.
- environment_count: The number of environments
4
Site ID:
bab5a61cb7af7d37fa65cb1ad97b2495b4bdbc85bac7b4f9ca6932c8cd9038dd7e87be13abb367e124bfdda2de14949
Before you submit data, you can see what's generated
This example selects only a single metric
Site ID is a cryptographic hash that just correlates reports
Can be regenerated by providing a seed value
Data validation
The plugin cannot return data that it does not declare.
The database schema is also defined from the plugin schemas.
Using the Data
Public data
All you need is a Google Cloud account.
Then you can explore the datasets.
Once you have an account, browse to https://console.cloud.google.com/bigquery?p=dataops-
puppet-public-data&d=aggregated to check out the dataops-puppet-public-data project.
Interesting queries
Dataset includes:
Forge module metadata
Static analysis of Forge modules
GitHub repositories that look like Puppet modules
Aggregated telemetry data
Combine them as you will.
Example: a list of the modules on the Forge that define custom native types:
SELECT DISTINCT g.repo_name, f.slug
FROM `dataops-puppet-public-data.community.github_ruby_files` g
JOIN `dataops-puppet-public-data.community.forge_modules` f
ON g.repo_name = REGEXP_EXTRACT(f.source, r'^(?:https?://github.com/)?(.*?)(?:.git)?$'
WHERE STARTS_WITH(g.path, 'lib/puppet/type')
LIMIT 1000
So what else is possible?
Who knows, really.
Maybe we can incorporate Impact Analysis into PDK validations.
Maybe the Forge will start surfacing some of these metrics. (hint: it will)
Maybe Vox Pupuli will incorporate usage data into their Tasks management interface.
Maybe you will write something I haven't even thought of.
Get Involved!
What I'm asking from you
This is totally opt-in only right now.
puppet module install puppetlabs/dropsonde
Check out the example dataset and contribute some aggregation queries.
Develop and contribute a metric plugin.
Doesn't have to be strictly Puppet, just ecosystem:
Reid already made a plugin to see the impact of building a better control repo and
Puppetfile for better r10k deploys.
David could see how many people are using the Rocket.chat integration for
voxpupuli/puppet-webhook .
puppet module install puppetlabs/dropsonde
Build some tooling that uses the public data.
Seriously, install the module and classify your primary Puppet server. It would help a ton.
Dropsonde : an expendable weather reconnaissance device created by the National Center
for Atmospheric Research, designed to be dropped from an aircraft at altitude over water
to measure storm conditions as the device falls to the surface.

Installing Dropsonde is super straightforward. Install the puppetlabs-dropsonde Puppet
module and classify your Puppet server with the dropsonde class. If you have multiple
compile nodes, just classify the primary Puppet server. This will install the tool and
configure a weekly cron job to submit the reports.

Resources and Q&A

https://github.com/puppetlabs/dropsonde
https://github.com/puppetlabs/puppet-community-rangefinder
https://github.com/apps/puppet-community-rangefinder
https://binford2k.com/tags/#telemetry-ref
https://github.com/puppetlabs/dropsonde
https://github.com/puppetlabs/puppetlabs-dropsonde
https://github.com/puppetlabs/dropsonde-aggregation
https://binford2k.com/tags/#telemetry-ref
Impact Analysis of Puppet Modules
Downstream impact of pull requests
Telemetry that doesn't suck
Gathering metrics with a new Dropsonde plugin
Ben ford intro

Más contenido relacionado

La actualidad más candente

Networks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainNetworks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainRussell Jurney
 
Agile Data Science: Building Hadoop Analytics Applications
Agile Data Science: Building Hadoop Analytics ApplicationsAgile Data Science: Building Hadoop Analytics Applications
Agile Data Science: Building Hadoop Analytics ApplicationsRussell Jurney
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooksNatalino Busa
 
Agile Data Science 2.0 - Big Data Science Meetup
Agile Data Science 2.0 - Big Data Science MeetupAgile Data Science 2.0 - Big Data Science Meetup
Agile Data Science 2.0 - Big Data Science MeetupRussell Jurney
 
Agile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDBAgile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDBRussell Jurney
 
Networks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainNetworks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainRussell Jurney
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data AnalyticsEdureka!
 
Social Network Analysis in Your Problem Domain
Social Network Analysis in Your Problem DomainSocial Network Analysis in Your Problem Domain
Social Network Analysis in Your Problem DomainRussell Jurney
 
Big Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopBig Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopIMC Institute
 
Wikipedia Views As A Proxy For Social Engagement
Wikipedia Views As A Proxy For Social EngagementWikipedia Views As A Proxy For Social Engagement
Wikipedia Views As A Proxy For Social EngagementDaniel Cuneo
 
How to Light a Beacon
How to Light a BeaconHow to Light a Beacon
How to Light a BeaconMiro Cupak
 
OGCE Project Overview
OGCE Project OverviewOGCE Project Overview
OGCE Project Overviewmarpierc
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data AnalyticsEdureka!
 
OREChem Services and Workflows
OREChem Services and WorkflowsOREChem Services and Workflows
OREChem Services and Workflowsmarpierc
 
Agile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsAgile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsRussell Jurney
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignMichael Noll
 
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On Labs
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On LabsBig Data Hadoop using Amazon Elastic MapReduce: Hands-On Labs
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On LabsIMC Institute
 
Power of Python with Big Data
Power of Python with Big DataPower of Python with Big Data
Power of Python with Big DataEdureka!
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0Russell Jurney
 
Hadoop for Java Professionals
Hadoop for Java ProfessionalsHadoop for Java Professionals
Hadoop for Java ProfessionalsEdureka!
 

La actualidad más candente (20)

Networks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainNetworks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domain
 
Agile Data Science: Building Hadoop Analytics Applications
Agile Data Science: Building Hadoop Analytics ApplicationsAgile Data Science: Building Hadoop Analytics Applications
Agile Data Science: Building Hadoop Analytics Applications
 
Data science apps: beyond notebooks
Data science apps: beyond notebooksData science apps: beyond notebooks
Data science apps: beyond notebooks
 
Agile Data Science 2.0 - Big Data Science Meetup
Agile Data Science 2.0 - Big Data Science MeetupAgile Data Science 2.0 - Big Data Science Meetup
Agile Data Science 2.0 - Big Data Science Meetup
 
Agile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDBAgile Data Science 2.0: Using Spark with MongoDB
Agile Data Science 2.0: Using Spark with MongoDB
 
Networks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domainNetworks All Around Us: Extracting networks from your problem domain
Networks All Around Us: Extracting networks from your problem domain
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data Analytics
 
Social Network Analysis in Your Problem Domain
Social Network Analysis in Your Problem DomainSocial Network Analysis in Your Problem Domain
Social Network Analysis in Your Problem Domain
 
Big Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop WorkshopBig Data Programming Using Hadoop Workshop
Big Data Programming Using Hadoop Workshop
 
Wikipedia Views As A Proxy For Social Engagement
Wikipedia Views As A Proxy For Social EngagementWikipedia Views As A Proxy For Social Engagement
Wikipedia Views As A Proxy For Social Engagement
 
How to Light a Beacon
How to Light a BeaconHow to Light a Beacon
How to Light a Beacon
 
OGCE Project Overview
OGCE Project OverviewOGCE Project Overview
OGCE Project Overview
 
Python for Big Data Analytics
Python for Big Data AnalyticsPython for Big Data Analytics
Python for Big Data Analytics
 
OREChem Services and Workflows
OREChem Services and WorkflowsOREChem Services and Workflows
OREChem Services and Workflows
 
Agile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics ApplicationsAgile Data Science: Hadoop Analytics Applications
Agile Data Science: Hadoop Analytics Applications
 
Apache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - VerisignApache Storm 0.9 basic training - Verisign
Apache Storm 0.9 basic training - Verisign
 
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On Labs
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On LabsBig Data Hadoop using Amazon Elastic MapReduce: Hands-On Labs
Big Data Hadoop using Amazon Elastic MapReduce: Hands-On Labs
 
Power of Python with Big Data
Power of Python with Big DataPower of Python with Big Data
Power of Python with Big Data
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0
 
Hadoop for Java Professionals
Hadoop for Java ProfessionalsHadoop for Java Professionals
Hadoop for Java Professionals
 

Similar a Ben ford intro

Vipul divyanshu mahout_documentation
Vipul divyanshu mahout_documentationVipul divyanshu mahout_documentation
Vipul divyanshu mahout_documentationVipul Divyanshu
 
Advanced Web Development
Advanced Web DevelopmentAdvanced Web Development
Advanced Web DevelopmentRobert J. Stein
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaborationJulien Pivotto
 
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET Journal
 
Googleappengineintro 110410190620-phpapp01
Googleappengineintro 110410190620-phpapp01Googleappengineintro 110410190620-phpapp01
Googleappengineintro 110410190620-phpapp01Tony Frame
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleJim Dowling
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0Russell Jurney
 
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioAlluxio, Inc.
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Cloudera, Inc.
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OW2
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016Marc Dutoo
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseHao Chen
 
Webinar | Building Apps with the Cassandra Python Driver
Webinar | Building Apps with the Cassandra Python DriverWebinar | Building Apps with the Cassandra Python Driver
Webinar | Building Apps with the Cassandra Python DriverDataStax Academy
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0Russell Jurney
 
Big data Big Analytics
Big data Big AnalyticsBig data Big Analytics
Big data Big AnalyticsAjay Ohri
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySparkRussell Jurney
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in CytoscapeKeiichiro Ono
 

Similar a Ben ford intro (20)

Vipul divyanshu mahout_documentation
Vipul divyanshu mahout_documentationVipul divyanshu mahout_documentation
Vipul divyanshu mahout_documentation
 
Advanced Web Development
Advanced Web DevelopmentAdvanced Web Development
Advanced Web Development
 
Monitoring as an entry point for collaboration
Monitoring as an entry point for collaborationMonitoring as an entry point for collaboration
Monitoring as an entry point for collaboration
 
IRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOPIRJET - Survey Paper on Map Reduce Processing using HADOOP
IRJET - Survey Paper on Map Reduce Processing using HADOOP
 
Googleappengineintro 110410190620-phpapp01
Googleappengineintro 110410190620-phpapp01Googleappengineintro 110410190620-phpapp01
Googleappengineintro 110410190620-phpapp01
 
Serverless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData SeattleServerless ML Workshop with Hopsworks at PyData Seattle
Serverless ML Workshop with Hopsworks at PyData Seattle
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0
 
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & AlluxioUltra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
Ultra Fast Deep Learning in Hybrid Cloud Using Intel Analytics Zoo & Alluxio
 
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
Hadoop World 2011: Building Web Analytics Processing on Hadoop at CBS Interac...
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...OCCIware: extensible and standard-based XaaS platform to manage everything in...
OCCIware: extensible and standard-based XaaS platform to manage everything in...
 
OCCIware@OW2con 2016
OCCIware@OW2con 2016OCCIware@OW2con 2016
OCCIware@OW2con 2016
 
Apache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real TimeApache Eagle: Secure Hadoop in Real Time
Apache Eagle: Secure Hadoop in Real Time
 
Apache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San JoseApache Eagle at Hadoop Summit 2016 San Jose
Apache Eagle at Hadoop Summit 2016 San Jose
 
Webinar | Building Apps with the Cassandra Python Driver
Webinar | Building Apps with the Cassandra Python DriverWebinar | Building Apps with the Cassandra Python Driver
Webinar | Building Apps with the Cassandra Python Driver
 
Agile Data Science 2.0
Agile Data Science 2.0Agile Data Science 2.0
Agile Data Science 2.0
 
Big data Big Analytics
Big data Big AnalyticsBig data Big Analytics
Big data Big Analytics
 
10 Ways To Improve Your Code
10 Ways To Improve Your Code10 Ways To Improve Your Code
10 Ways To Improve Your Code
 
Introduction to PySpark
Introduction to PySparkIntroduction to PySpark
Introduction to PySpark
 
What's New in Cytoscape
What's New in CytoscapeWhat's New in Cytoscape
What's New in Cytoscape
 

Más de Puppet

Puppet camp2021 testing modules and controlrepo
Puppet camp2021 testing modules and controlrepoPuppet camp2021 testing modules and controlrepo
Puppet camp2021 testing modules and controlrepoPuppet
 
Puppetcamp r10kyaml
Puppetcamp r10kyamlPuppetcamp r10kyaml
Puppetcamp r10kyamlPuppet
 
2021 04-15 operational verification (with notes)
2021 04-15 operational verification (with notes)2021 04-15 operational verification (with notes)
2021 04-15 operational verification (with notes)Puppet
 
Puppet camp vscode
Puppet camp vscodePuppet camp vscode
Puppet camp vscodePuppet
 
Modules of the twenties
Modules of the twentiesModules of the twenties
Modules of the twentiesPuppet
 
Applying Roles and Profiles method to compliance code
Applying Roles and Profiles method to compliance codeApplying Roles and Profiles method to compliance code
Applying Roles and Profiles method to compliance codePuppet
 
KGI compliance as-code approach
KGI compliance as-code approachKGI compliance as-code approach
KGI compliance as-code approachPuppet
 
Enforce compliance policy with model-driven automation
Enforce compliance policy with model-driven automationEnforce compliance policy with model-driven automation
Enforce compliance policy with model-driven automationPuppet
 
Keynote: Puppet camp compliance
Keynote: Puppet camp complianceKeynote: Puppet camp compliance
Keynote: Puppet camp compliancePuppet
 
Automating it management with Puppet + ServiceNow
Automating it management with Puppet + ServiceNowAutomating it management with Puppet + ServiceNow
Automating it management with Puppet + ServiceNowPuppet
 
Puppet: The best way to harden Windows
Puppet: The best way to harden WindowsPuppet: The best way to harden Windows
Puppet: The best way to harden WindowsPuppet
 
Simplified Patch Management with Puppet - Oct. 2020
Simplified Patch Management with Puppet - Oct. 2020Simplified Patch Management with Puppet - Oct. 2020
Simplified Patch Management with Puppet - Oct. 2020Puppet
 
Accelerating azure adoption with puppet
Accelerating azure adoption with puppetAccelerating azure adoption with puppet
Accelerating azure adoption with puppetPuppet
 
Puppet catalog Diff; Raphael Pinson
Puppet catalog Diff; Raphael PinsonPuppet catalog Diff; Raphael Pinson
Puppet catalog Diff; Raphael PinsonPuppet
 
ServiceNow and Puppet- better together, Kevin Reeuwijk
ServiceNow and Puppet- better together, Kevin ReeuwijkServiceNow and Puppet- better together, Kevin Reeuwijk
ServiceNow and Puppet- better together, Kevin ReeuwijkPuppet
 
Take control of your dev ops dumping ground
Take control of your  dev ops dumping groundTake control of your  dev ops dumping ground
Take control of your dev ops dumping groundPuppet
 
100% Puppet Cloud Deployment of Legacy Software
100% Puppet Cloud Deployment of Legacy Software100% Puppet Cloud Deployment of Legacy Software
100% Puppet Cloud Deployment of Legacy SoftwarePuppet
 
Puppet User Group
Puppet User GroupPuppet User Group
Puppet User GroupPuppet
 
Continuous Compliance and DevSecOps
Continuous Compliance and DevSecOpsContinuous Compliance and DevSecOps
Continuous Compliance and DevSecOpsPuppet
 
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick Maludy
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick MaludyThe Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick Maludy
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick MaludyPuppet
 

Más de Puppet (20)

Puppet camp2021 testing modules and controlrepo
Puppet camp2021 testing modules and controlrepoPuppet camp2021 testing modules and controlrepo
Puppet camp2021 testing modules and controlrepo
 
Puppetcamp r10kyaml
Puppetcamp r10kyamlPuppetcamp r10kyaml
Puppetcamp r10kyaml
 
2021 04-15 operational verification (with notes)
2021 04-15 operational verification (with notes)2021 04-15 operational verification (with notes)
2021 04-15 operational verification (with notes)
 
Puppet camp vscode
Puppet camp vscodePuppet camp vscode
Puppet camp vscode
 
Modules of the twenties
Modules of the twentiesModules of the twenties
Modules of the twenties
 
Applying Roles and Profiles method to compliance code
Applying Roles and Profiles method to compliance codeApplying Roles and Profiles method to compliance code
Applying Roles and Profiles method to compliance code
 
KGI compliance as-code approach
KGI compliance as-code approachKGI compliance as-code approach
KGI compliance as-code approach
 
Enforce compliance policy with model-driven automation
Enforce compliance policy with model-driven automationEnforce compliance policy with model-driven automation
Enforce compliance policy with model-driven automation
 
Keynote: Puppet camp compliance
Keynote: Puppet camp complianceKeynote: Puppet camp compliance
Keynote: Puppet camp compliance
 
Automating it management with Puppet + ServiceNow
Automating it management with Puppet + ServiceNowAutomating it management with Puppet + ServiceNow
Automating it management with Puppet + ServiceNow
 
Puppet: The best way to harden Windows
Puppet: The best way to harden WindowsPuppet: The best way to harden Windows
Puppet: The best way to harden Windows
 
Simplified Patch Management with Puppet - Oct. 2020
Simplified Patch Management with Puppet - Oct. 2020Simplified Patch Management with Puppet - Oct. 2020
Simplified Patch Management with Puppet - Oct. 2020
 
Accelerating azure adoption with puppet
Accelerating azure adoption with puppetAccelerating azure adoption with puppet
Accelerating azure adoption with puppet
 
Puppet catalog Diff; Raphael Pinson
Puppet catalog Diff; Raphael PinsonPuppet catalog Diff; Raphael Pinson
Puppet catalog Diff; Raphael Pinson
 
ServiceNow and Puppet- better together, Kevin Reeuwijk
ServiceNow and Puppet- better together, Kevin ReeuwijkServiceNow and Puppet- better together, Kevin Reeuwijk
ServiceNow and Puppet- better together, Kevin Reeuwijk
 
Take control of your dev ops dumping ground
Take control of your  dev ops dumping groundTake control of your  dev ops dumping ground
Take control of your dev ops dumping ground
 
100% Puppet Cloud Deployment of Legacy Software
100% Puppet Cloud Deployment of Legacy Software100% Puppet Cloud Deployment of Legacy Software
100% Puppet Cloud Deployment of Legacy Software
 
Puppet User Group
Puppet User GroupPuppet User Group
Puppet User Group
 
Continuous Compliance and DevSecOps
Continuous Compliance and DevSecOpsContinuous Compliance and DevSecOps
Continuous Compliance and DevSecOps
 
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick Maludy
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick MaludyThe Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick Maludy
The Dynamic Duo of Puppet and Vault tame SSL Certificates, Nick Maludy
 

Último

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking MenDelhi Call girls
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slidevu2urc
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024The Digital Insurer
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityPrincipled Technologies
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking MenDelhi Call girls
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Allon Mureinik
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreternaman860154
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...gurkirankumar98700
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 

Último (20)

08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men08448380779 Call Girls In Greater Kailash - I Women Seeking Men
08448380779 Call Girls In Greater Kailash - I Women Seeking Men
 
Histor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slideHistor y of HAM Radio presentation slide
Histor y of HAM Radio presentation slide
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Boost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivityBoost PC performance: How more available memory can improve productivity
Boost PC performance: How more available memory can improve productivity
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
08448380779 Call Girls In Diplomatic Enclave Women Seeking Men
 
Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)Injustice - Developers Among Us (SciFiDevCon 2024)
Injustice - Developers Among Us (SciFiDevCon 2024)
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Presentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreterPresentation on how to chat with PDF using ChatGPT code interpreter
Presentation on how to chat with PDF using ChatGPT code interpreter
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
Kalyanpur ) Call Girls in Lucknow Finest Escorts Service 🍸 8923113531 🎰 Avail...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 

Ben ford intro

  • 1. Telemetry doesn't have to be scary Developer focused metrics that make sense.   
  • 2. Table of Contents Why we need metrics Impact Analysis Privacy. It's a thing. Telemetry Data Model Metric Plugin Design Using the Data Get Involved!
  • 3. New talk, who dis? Ben Ford Developer Advocate @ Puppet @binford2k:   
  • 4. Why we need metrics
  • 5. It's not a huge task to design and build a simple, single-purpose, bespoke Puppet module in a short period of time. 
  • 6. Maintenance is a different story  Hey, can we add ArchLinux support? What about Windows 2012? Here's a PR for a defined type to manage the database too... (MariaDB only)
  • 7. Development resources    Every single platform or feature you support takes time to maintain it!
  • 8.   Data driven development decisions Two roads diverged in a wood, and I— I took the one less traveled by, And that has made all the difference. -- Robert Frost Photo by https://unsplash.com/@leliejens
  • 10. Rangefinder [~/Projects/puppetlabs-concat]$ rangefinder manifests/fragment.pp [concat::fragment] is a _type_ ================================== The enclosing module is declared in 173 of 575 indexed public Puppetfiles Breaking changes to this file WILL impact these modules: * nightfly-ssh_keys (https://github.com/nightfly19/puppet-ssh_keys.git) * viirya-mit_krb5 (git://github.com/viirya/puppet-mit_krb5.git) * rjpearce-opendkim (https://github.com/rjpearce/puppet-opendkim) * shadow-tor (git://github.com/LeShadow/puppet-tor.git) [...] Breaking changes to this file MAY impact these modules: * empi89-quagga (UNKNOWN) * unyonsys-keepalived (UNKNOWN) * Flameeyes-udevnet (UNKNOWN) * ricbra-ratbox (git://github.com/ricbra/puppet-ratbox.git) [...] https://github.com/puppetlabs/puppet-community-rangefinder Identify the public Forge modules using parts of your code.
  • 11. Run it on a pull request! https://github.com/apps/puppet-community-rangefinder Rangefinder also comes in a GitHub App flavor.
  • 12. Static analysis of dependencies Works by dissecting latest releases of all modules on the Forge Static code analysis identifies what every module uses resource types functions classes Runs weekly to generate dependency network This uses my itemize library: https://github.com/binford2k/binford2k-itemize Photo by https://unsplash.com/@alinnnaaaa
  • 13. Dependency data is public Find more information about how it works and other cool queries you can run on the dataset powering it at my blog: https://binford2k.com/2020/04/06/rangefinder/ The data Rangefinder uses is stored in a public BigQuery database.
  • 14. Major limitation Rangefinder only works on public Forge modules (currently). Cannot identify how code is used by profiles or private modules. Photo by https://unsplash.com/@sguestsmith
  • 15. Privacy. It's a thing.
  • 16. Business Identifiable Information Most of you know BII's big cousin, PII. But many of the same considerations apply to businesses. Many companies don't allow phone-home metrics at all. Sharing too much implementation information about how an infrastructure works is just asking for security compromises. 
  • 17. Fingerprinting A more insidious danger is fingerprinting. Identifying individuals by properties or habits. Also applies to infrastructures. If you saw that a site was in the CEST time zone, had the locale set to fr-ch , and had thousands of nodes classified with various HPC modules, then you might guess that you were looking at CERN. 
  • 18. Live example of fingerprinting http://panopticlick.eff.org/
  • 19. Hard requirements  1. Provide real value to the end user. (that's you) 2. Transparent about what data is collected, why, and what it's used for. 3. Give you the ability to opt in or out without gating other benefits. 4. The data collected can be aggregated to share publicly.
  • 21. Public dataset No access to individual records themselves -- The top ten most used classes in the ecosystem SELECT name, count FROM `dataops-puppet-public-data.aggregated.class_usage_count` ORDER BY count DESC LIMIT 10 [ { "name": "Resource_api::Agent", "count": "272" }, { "name": "Account", "count": "272" }, { "name": "Ssl::Params", "count": "272" }, { "name": "Classification", "count": "272" }, { "name": "Os_patching", "count": "269" }, { "name": "Ntp", "count": "265" }, { "name": "Ntp::Install", "count": "265" }, { "name": "Ntp::Config", "count": "265" }, { "name": "Ntp::Service", "count": "265" }, { "name": "Zsh::Params", "count": "265" } ] You'll need a Google Cloud account and then you can access the dataset with your browser via the BigQuery Console. Then you can run any queries you'd like.
  • 22. Aggregating data Data is generated via SQL queries: -- Generate an aggregate table of all classes used and their counts. SELECT classes.name, classes.count FROM `bto-dataops-datalake-prod.dujour.community_metrics`, UNNEST(classes) as classes GROUP BY classes.name, classes.count https://github.com/puppetlabs/dropsonde-aggregation
  • 23. Private dataset So about that sensitive data... Telemetry reports are not terribly sensitive, but they are fingerprintable. So we store them only in a secured dataset. Limited access, only myself and our BigQuery admin team. Our own developers who want to use collected metrics data have to do it via aggregation queries just like you would. 
  • 25. Plugin hook design Plugins are implemented as hooks. description schema example run Some hooks are optional initialize setup cleanup
  • 26. Schema hook Describes data that the plugin is allowed to return. Returns a BigQuery schema. JSON format with an array of row definitions def self.schema [ { "description": "The number of environments", "mode": "NULLABLE", "name": "environment_count", "type": "INTEGER" } ] end See full schema of all plugins with dropsonde dev schema
  • 27. Run hook Generates and returns data for the metric Returns an array of rows of data. Can return any data that the Puppet Server node can generate. Use PuppetDB, codebase, etc. Methods provided to filter out private modules. Must match schema or plugin fails. def self.run [ :environment_count => Puppet.lookup(:environments).list.count, ] end
  • 28. Example hook Representative example of sample data Provide one element of randomized representative data. Used to generate a dataset that's shaped like the real dataset. Must match the schema or plugin fails. Used for developers to generate aggregation queries. def self.example [ :environment_count => rand(1..100), ] end Available as dataops-puppet-public-data:community.community_metrics
  • 29. Previewing a telemetry report $ dropsonde --enable environments preview Puppet Telemetry Report Preview =============================== Dropsonde::Metrics::Environments ------------------------------- This group of metrics gathers information about environments. - environment_count: The number of environments 4 Site ID: bab5a61cb7af7d37fa65cb1ad97b2495b4bdbc85bac7b4f9ca6932c8cd9038dd7e87be13abb367e124bfdda2de14949 Before you submit data, you can see what's generated This example selects only a single metric Site ID is a cryptographic hash that just correlates reports Can be regenerated by providing a seed value
  • 30. Data validation The plugin cannot return data that it does not declare. The database schema is also defined from the plugin schemas.
  • 32. Public data All you need is a Google Cloud account. Then you can explore the datasets. Once you have an account, browse to https://console.cloud.google.com/bigquery?p=dataops- puppet-public-data&d=aggregated to check out the dataops-puppet-public-data project.
  • 33. Interesting queries Dataset includes: Forge module metadata Static analysis of Forge modules GitHub repositories that look like Puppet modules Aggregated telemetry data Combine them as you will. Example: a list of the modules on the Forge that define custom native types: SELECT DISTINCT g.repo_name, f.slug FROM `dataops-puppet-public-data.community.github_ruby_files` g JOIN `dataops-puppet-public-data.community.forge_modules` f ON g.repo_name = REGEXP_EXTRACT(f.source, r'^(?:https?://github.com/)?(.*?)(?:.git)?$' WHERE STARTS_WITH(g.path, 'lib/puppet/type') LIMIT 1000
  • 34. So what else is possible? Who knows, really. Maybe we can incorporate Impact Analysis into PDK validations. Maybe the Forge will start surfacing some of these metrics. (hint: it will) Maybe Vox Pupuli will incorporate usage data into their Tasks management interface. Maybe you will write something I haven't even thought of.
  • 36. What I'm asking from you This is totally opt-in only right now. puppet module install puppetlabs/dropsonde Check out the example dataset and contribute some aggregation queries. Develop and contribute a metric plugin. Doesn't have to be strictly Puppet, just ecosystem: Reid already made a plugin to see the impact of building a better control repo and Puppetfile for better r10k deploys. David could see how many people are using the Rocket.chat integration for voxpupuli/puppet-webhook . puppet module install puppetlabs/dropsonde Build some tooling that uses the public data. Seriously, install the module and classify your primary Puppet server. It would help a ton. Dropsonde : an expendable weather reconnaissance device created by the National Center for Atmospheric Research, designed to be dropped from an aircraft at altitude over water to measure storm conditions as the device falls to the surface.  Installing Dropsonde is super straightforward. Install the puppetlabs-dropsonde Puppet module and classify your Puppet server with the dropsonde class. If you have multiple compile nodes, just classify the primary Puppet server. This will install the tool and configure a weekly cron job to submit the reports. 
  • 37.