A cornerstone of the DevOps philosophy, investment in automation at all stages across the SDLC has increased over recent years. Automation promises velocity and reduced errors, helps foster repeatable processes, and removes the need for long hours on dull, repetitive tasks. So what’s not to like? The downside of automation is that unless applied at the right place in your SDLC it can make a bad process worse. Automation also raises questions around job security, the need for re-skilling in other areas, and tool sprawl if different teams each choose their preferred technology. This session will outline:
-A short chronology of where automation has impacted the modern software stack
-Where it makes the most sense to automate (by identifying your key constraints)
-Best practices for adopting automation and how to identify where it’s working — and where it isn’t
For more information, visit: www.appdynamics.com
Automation: The Good, The Bad and The Ugly with DevOpsGuys - AppD Summit Europe
1. www.devopsguys.com | Phone: 0800 368 7378 | e-mail: team@devopsguys.com | 2017
Automation:
The Good, the Bad and the Ugly
Getting your Automation strategy right
6. 6@DevOpsGuys
The End Result
“Database data such as projects, issues,
snippets, etc. created between January 31st
17:20 UTC and 23:30 UTC has been lost.”
“It's hard to estimate how much data has
been lost exactly, but we estimate we have
lost at least 5000 projects, 5000 comments,
and roughly 700 users.”
https://about.gitlab.com/2017/02/10/postmortem-of-database-outage-of-january-31/
7. 7@DevOpsGuys
The Good, The Bad and The Ugly
• An automated spam attack, plus
• An automated user deletion process, manually triggered by an employee
incorrectly approving the abuse report against a GitLab employee account,
• Created a replication delay issue, exacerbated because automated write-ahead
log archiving wasn’t enabled
• That led to the accidental manual deletion of data
• Compounded by automated backups failing
• That no one noticed, because the notification email was automatically
blocked by DMARC
• They plan to fix some of this by automating the backup / restore validation
cycle
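An automated backup validation step, as mentioned in the last bullet, might look something like the sketch below. It is a minimal illustration and not GitLab's actual fix: the `BackupInfo` record, the 25-hour freshness window and the minimum-size threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass
class BackupInfo:
    """Hypothetical record describing the newest backup artefact."""
    name: str
    created_at: datetime
    size_bytes: int


def validate_backup(backup, now, max_age=timedelta(hours=25), min_size_bytes=1_000_000):
    """Return a list of problems; an empty list means the backup looks sane."""
    problems = []
    # A daily backup that is more than ~a day old means a run was missed.
    if now - backup.created_at > max_age:
        problems.append(f"{backup.name}: older than {max_age}")
    # A tiny file usually means the dump failed part-way through.
    if backup.size_bytes < min_size_bytes:
        problems.append(f"{backup.name}: only {backup.size_bytes} bytes, suspiciously small")
    return problems


# Illustrative data only (assumed values, not taken from the incident).
now = datetime(2017, 1, 31, 23, 30, tzinfo=timezone.utc)
healthy = BackupInfo("db-dump", now - timedelta(hours=6), 50_000_000_000)
broken = BackupInfo("db-dump", now - timedelta(hours=6), 120)

print(validate_backup(healthy, now))
print(validate_backup(broken, now))
```

A check like this only helps if its result reaches a channel that someone actually watches. A fuller validation cycle would also restore the backup somewhere and query it.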
18. 18@DevOpsGuys
Plan - Requirements
• Atlassian or VSTS Or GitHub Enterprise
• Issue/Work Item Tracking
• Sprint/Kanban Boards
• Wiki for Requirements & other docs*
• Source Code & CI Integrations for Feedback loops
* Confluence has an edge here!
19. 19@DevOpsGuys
Plan - Communicate
• Informal communication is very important!
• The key is to be able to communicate and share
information with the minimum of “friction”
• Act as a point of integration for “ChatOps”
• We currently use Slack (www.slack.com)
• Microsoft Teams is getting better (rapidly)
20. 20@DevOpsGuys
Code – Source Code
• The de facto DVCS is Git
• Excellent for distributed teams, remote working etc.
• GitHub – Cloud and on-premise Enterprise version
• VSTS – Git online or TFS on-premise
21. 21@DevOpsGuys
Code – Automated Developer Environments
• We (strongly) recommend Vagrant for virtualised local development
environments
• Faster provisioning of local environments
• Push out new environment updates and tools
• Keep teams in Sync
• Combine with Packer (https://www.packer.io/) and your preferred CM tools
(e.g. Ansible) for complete environment control
• Check everything into source control for version mgmt.
• Use Vagrant + VMware Workstation for better performance & compatibility
• https://www.vagrantup.com/vmware
23. 23@DevOpsGuys
Code – Database & SQL
• Your database schemas and static data are also part of your CI process
(but often overlooked)
• They should be treated like code and checked into source control!
• Our tool(set) of choice is Redgate SQL Source Control
• SQL Server
• http://www.red-gate.com/products/sql-development/sql-source-control/
• Oracle
• http://www.red-gate.com/products/oracle-development/source-control-for-oracle/
25. 25@DevOpsGuys
Build - Continuous Integration
• Our CI tool of choice is TeamCity from JetBrains
• https://www.jetbrains.com/teamcity/
• Easy to configure / extend
• Very cost-effective (free for small teams)
• Support when you need it
• VSTS has its own Build server OR you can use TC, Jenkins etc
• Lots of open-source & Cloud alternatives
• Jenkins
• Travis-CI
• Wercker etc
26. 27@DevOpsGuys
Build – Unit Testing
• There are many, many unit testing frameworks so
it’s hard to say “what’s best”…
• Some are language-specific, some are ported to
multiple languages
• In a Java world… JUnit is probably the best
known, along with TestNG.
• In a .Net world … NUnit, which is a port of JUnit
27. 28@DevOpsGuys
Build – Static Code Analysis
• There are (again) a lot of different tools for different types of
static code analysis
• Some are integrated with Build servers e.g. Sonar
(http://www.sonarqube.org/features/)
• Some are integrated with the IDE (e.g. ReSharper for
C#.Net which is almost a “must have” product
https://www.jetbrains.com/resharper/)
TOP TIP: InfoSec love this stuff…
28. 29@DevOpsGuys
Build – Package Automation
• Part of the build process is creating a releasable
package.
• In a Linux world the de-facto standard is an RPM or DEB
• In a Windows world the de-facto standard is a NuGet
package
• http://nuget.codeplex.com/
• You can even use Chocolatey on Windows like a Linux
package manager to install NuGet packages!
https://chocolatey.org/
• Should be created automatically as part of your Build
process.
29. 30@DevOpsGuys
Test – Test Management
• Some don’ts to start with…
• Don’t use Excel spreadsheets to track test execution & status – it
rapidly becomes a time-wasting exercise in futility
• Don’t use HP Quality Centre, it’s just woeful. Full stop.
• VSTS has test case management (but I haven’t used it personally)
• Zephyr for Jira is a good option if you’ve gone the Jira route.
30. 31@DevOpsGuys
Test – Acceptance Testing
• Many, many Acceptance testing frameworks out there…
• Fitnesse is very popular (and cross-platform)
• We are also (huge) fans of Gherkin (GWT) syntax and
Cucumber-based BDD Acceptance Testing frameworks
• Cucumber for Java
http://cukes.info/install-cucumber-jvm.html
• SpecFlow for .Net
http://www.specflow.org/
31. 32@DevOpsGuys
Testing – Browser UI Testing
• For web-driven UIs the widely adopted industry standard is
Selenium. http://www.seleniumhq.org/
• It’s common
• It’s easy to find people with Selenium skills
• It’s easy to get Selenium training
• It works
• It’s free
• SauceLabs and VSTS will cloud-host your Selenium testing (as
will many others)
32. 33@DevOpsGuys
Release – Artifacts & Release Mgmt
• Store your Artifacts (packages, binaries, jars/wars etc) in:
• Nexus – http://www.sonatype.org/nexus/
• Artifactory - http://www.jfrog.com/article/devops/
• ProGet (.Net specific) - http://inedo.com/proget/overview
• Use Jira or VSTS to Manage your release processes
• #KillTheCAB
Given Release Package is Ready for Deployment
When deployed via an Automated Release Pipeline
Then “ITIL Standard Change” is True
And No CAB is Required
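The Given/When/Then rule above can also be expressed as a small, testable policy function. This is only a sketch of the idea: the `Release` record and its field names are hypothetical, and a real pipeline would take these flags from its own tooling.

```python
from dataclasses import dataclass


@dataclass
class Release:
    """Hypothetical release record; the field names are illustrative."""
    package_ready: bool
    deployed_via_automated_pipeline: bool


def is_standard_change(release):
    """Given the package is ready, when deployed via an automated pipeline."""
    return release.package_ready and release.deployed_via_automated_pipeline


def cab_required(release):
    """Then it is a standard change, and no CAB is required."""
    return not is_standard_change(release)


automated = Release(package_ready=True, deployed_via_automated_pipeline=True)
manual = Release(package_ready=True, deployed_via_automated_pipeline=False)

print(cab_required(automated))
print(cab_required(manual))
```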
33. 34@DevOpsGuys
Here is where it gets blurry…
Env. Provisioning
Configuration Management
Application Release Automation
34. 35@DevOpsGuys
Deploy – Environment Provisioning
• In this context “Environment Provisioning” means “the ability to instantiate
(create) compute resources (IaaS or PaaS), normally in a Cloud
environment, and then trigger further configuration management and
provisioning activities”
• HashiCorp Terraform is our weapon of choice. Lately we’ve been using the
HashiCorp products:
• Vagrant – e.g. with custom Rackspace provider -
https://github.com/mitchellh/vagrant-rackspace
• Terraform* - https://www.terraform.io/
• *Note - currently doesn’t have a VMware provider
35. 36@DevOpsGuys
Deploy – Server Configuration Mgmt
• Pick one:
• Chef
• Puppet
• Ansible
• PowerShell DSC (cross-platform!)
• You can find about 100 DevOps people who know one, or more, of these 4
tools for every 1 person that knows anything about any of the “Enterprise
DevOps Tools”
36. 37@DevOpsGuys
Deploy – Application Release Automation (ARA)
• Lots of choices in this area but our preferred
patterns are:
• Linux - Ansible triggers YUM / Apt-Get package
managers to deploy the RPM / Deb packages
• Windows – Octopus Deploy or VSTS Release
Manager to deploy NuGet packages
37. 38@DevOpsGuys
Deploy – Docker & Containerisation
“Docker is an open platform for developers and
sysadmins to build, ship, and run distributed
applications… Docker enables apps to be quickly
assembled from components and eliminates the friction
between development, QA, and production
environments. As a result, IT can ship faster and run the
same app, unchanged, on laptops, data center VMs, and
any cloud” – Docker.com
41. 42@DevOpsGuys
Alerting
• Who gets woken up by what notification method
• PagerDuty
• OpsGenie
• VictorOps
• Notification Channels
• Slack
• Mobile App
• SMS
• Email
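"Who gets woken up by what notification method" is, at heart, a routing table. The sketch below shows one way to write that table down so it can be reviewed and tested. The rota names, severity levels and channel assignments are all assumptions for illustration, and no real PagerDuty, OpsGenie or VictorOps API is called.

```python
# Hypothetical on-call rota: role -> person.
ON_CALL = {"primary": "alice", "secondary": "bob"}

# Hypothetical routing policy: severity -> ordered (role, channel) steps.
ROUTES = {
    "critical": [("primary", "pager"), ("secondary", "sms")],
    "warning": [("primary", "slack")],
    "info": [("primary", "email")],
}


def notifications_for(severity):
    """Return the (person, channel) pairs to notify for a given severity."""
    # Unknown severities fall back to the quietest route rather than paging.
    steps = ROUTES.get(severity, ROUTES["info"])
    return [(ON_CALL[role], channel) for role, channel in steps]


print(notifications_for("critical"))
print(notifications_for("warning"))
```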
42. 43@DevOpsGuys
Summary
•Automation is Good, Bad and Ugly
•Automation is inevitable
•Start at your Constraint
•There are lots of choices
•YMMV
Don’t spend months on evaluations. Pick one, trial it, start learning.
44. 45@DevOpsGuys
About DevOpsGuys
• Founded 2013
• 70 Staff
• 30+ Clients
• Headquartered in Cardiff, Wales
• AppDynamics Partner
• team@devopsguys.com
• Established as thought leaders in
DevOps
• Quoted by Gartner and Forrester
in research
• Founded winops.org
• Top ranked DevOps blog
“DevOpsGuys are luminaries in the UK DevOps space.”
Gene Kim, Author – “The Phoenix Project”
Editor's Notes
At about 11:30pm a tired and somewhat frustrated engineer is dealing with some replication issues on a Postgres database, partly caused by an external spam attack and partly caused by an automated process deleting an account flagged for abuse… which turns out to be a GitLab engineer’s account with a lot of associated projects etc.
To try to get replication to initialise properly on the slave, he clears out the data directory on the slave… only to realise his SSH session is currently on the PRIMARY (master), not the SECONDARY (slave). Despite quickly cancelling the command, only 4.5 GB of about 310 GB of data is left.
2017/01/31 23:00-ish
YP thinks that perhaps pg_basebackup is being super pedantic about there being an empty data directory, decides to remove the directory. After a second or two he notices he ran it on db1.cluster.gitlab.com, instead of db2.cluster.gitlab.com
2017/01/31 23:27 YP - terminates the removal, but it’s too late. Of around 310 GB only about 4.5 GB is left - Slack
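One cheap safeguard against running a destructive step on the wrong machine is to make the script itself check where it is running. The sketch below is an illustration only, not something the postmortem describes: `require_host` and `WrongHostError` are hypothetical names, and the host names are the two quoted in the timeline above.

```python
import socket


class WrongHostError(RuntimeError):
    """Raised when a destructive step is attempted on an unexpected host."""


def require_host(expected, actual=None):
    """Abort unless the current host is the one the operator intended."""
    # `actual` can be injected for testing; by default ask the OS.
    current = actual if actual is not None else socket.gethostname()
    if current != expected:
        raise WrongHostError(
            f"refusing to continue: on {current!r}, expected {expected!r}"
        )
    return current


# Simulate the incident: the operator intends db2 but the session is on db1.
try:
    require_host("db2.cluster.gitlab.com", actual="db1.cluster.gitlab.com")
except WrongHostError as err:
    print(err)
```

Calling `require_host(...)` as the first line of any clean-up script turns a late-night slip into an error message instead of a deletion.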
And because we live in a modern age… we live tweet and live stream everything…
But it’s OK they had automated backups!
Every 24 hours a backup is generated using pg_dump, this backup is uploaded to Amazon S3. Old backups are automatically removed after some time.
Every 24 hours we generate an LVM snapshot of the disk storing the production database data. This snapshot is then loaded into the staging environment, allowing us to more safely test changes without impacting our production environment. Direct access to the staging database is restricted, similar to our production database.
For various servers (e.g. the NFS servers storing Git data) we use Azure disk snapshots. These snapshots are taken once per 24 hours.
Replication between PostgreSQL hosts, primarily used for failover purposes and not for disaster recovery.
pg_dump backups were borked due to a version mismatch between Postgres versions…
Azure snapshots not enabled for DB Servers
LVM snapshot was 6hrs (the one taken before the maintenance) or 24hrs old (the one from last night)
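The pg_dump failure is the kind of problem a pre-flight check can catch, because pg_dump cannot dump from a server newer than its own major version. The sketch below compares two version strings. The function names are hypothetical and the example strings are illustrative; in practice they would come from `pg_dump --version` and the server's reported version.

```python
import re


def major_version(version_text):
    """Extract a comparable major version from a PostgreSQL version string."""
    match = re.search(r"(\d+)\.(\d+)", version_text)
    if not match:
        raise ValueError(f"no version number found in {version_text!r}")
    first, second = int(match.group(1)), int(match.group(2))
    # Before PostgreSQL 10 the major version was the first two numbers (e.g. 9.6).
    return (first, second) if first < 10 else (first, 0)


def pg_dump_is_compatible(client_text, server_text):
    """pg_dump must be at least as new (by major version) as the server."""
    return major_version(client_text) >= major_version(server_text)


# Illustrative version strings (assumed for the example).
print(pg_dump_is_compatible("pg_dump (PostgreSQL) 9.2.18", "9.6.1"))
print(pg_dump_is_compatible("pg_dump (PostgreSQL) 9.6.1", "9.6.1"))
```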
The Julie Andrews approved method.
This statement from the DevOps report is pretty relevant here.
IT shops that utilize best practices around continuous delivery deploy code more frequently and with more confidence. That enables them to be more agile in their software delivery process and makes the company twice as likely to exceed their profitability goals.
We were co-founded by 2 experienced technologists, with a track record of delivering results at enterprise scale.