Marc embraces database virtualization and containerization to help Dave's team adopt DataOps practices. This gives team members self-service virtual test environments on demand. It increases data accessibility by 10%, which for a typical Fortune 1000 company translates into more than $65 million in additional net income. DataOps removes the biggest barrier by automating and accelerating data delivery to support fast development and testing cycles.
Data Gravity is the tendency of data to attract applications, services, and more data. As with the laws of physical gravity, data, by virtue of its mass, attracts everything of lesser mass.
Data sources grow larger every day. Databases sit at the center of this friction, and the natural life of a database is growth.
By 2020, a third of all data will be in the cloud, and 58% of it will be big data.
By 2020, we'll grow from today's 4.4 zettabytes to an approximate, but staggering, 44 zettabytes, or 44 trillion gigabytes.
And by 2020, a third of that data will pass through the cloud.
More data has been created in just the last two years than in all of humanity's previous history.
1.7 MB of new information is created every second for every human on the planet.
Introducing big data often brings the same development pain points.
At the Agile 2008 conference, Andrew Clay Shafer and Patrick Debois discussed "Agile Infrastructure."
The term DevOps was popularized through a series of "devopsdays" events, starting in 2009 in Belgium.
I made an attempt to introduce it to my local user group in 2012 and it failed miserably. Last year, a second attempt met with great success.
It will be their first Scrum "sprint."
They're looking to increase productivity by 20%.
They expect a 25% increase in collaboration.
Gains in revenue are predicted.
And they’re looking forward to succeeding…
George has worked with the team to pick out the right tools for their environment, including Git for their repository, Jenkins for continuous integration and deployment, Juju for orchestration, and Ansible for some of the automation.
They’re well on their way with a great introduction of tools.
The team builds out the sprint backlog and plans what they must accomplish in the two weeks they have for the sprint. Everyone is assigned their tasks and ready to begin.
Each morning starts out great with the daily Scrum standup: just a few minutes to get everyone aligned, assign tasks, and set the day's goals from the burndown list.
The developers are starting to trip over each other in their traditional waterfall data environment. DBAs are busy and having difficulty providing them the data they need.
They all want to succeed, but without the data, they start to miss the deadlines for the daily scrum burndown list.
Our developer, Dave, has built a new automated test script, but it needs to run against a fresh copy of production data in development and test.
The problem is, the DBAs can't get him the data fast enough through traditional methods, even when they use DevOps methodologies.
DevOps alone can't fix the current technologies the DBAs are employing.
Over 80% of their time is spent waiting for relational databases (RDBMS) to be refreshed. Developers and testers are stuck waiting for data to do their primary jobs.
Solving this bottleneck allows for faster and less costly migrations to the cloud, too.
The developers feel mounting pressure because they can't get the data they need, and the DBAs are pressured for space, time, and resources.
This has become a paradox: databases are growing larger and DBAs are blamed for slowing down the process, so we'll remove the very people who understand best how to manage them?
George is at his wits' end trying to figure out how they will ever succeed at DevOps and at Scrum if they can't get through a simple two-week sprint.
Where we once had one department with money to throw at an odd project, buying a server, developing what they needed, and then handing us the problem, we now have companies running 30% or more of their business in the cloud and auditing cloud projects to deem whether they are viable.
BUT WHAT HAPPENS WHEN THEY DON'T, or when we don't address the problem?
For a typical Fortune 1000 company, just a 10% increase in data accessibility will result in more than $65 million additional net income.
Leveraging data could increase revenue by as much as 60%.
Marc gets it. He sees how much he and his team are in demand and knows that something needs to change.
In computing, virtualization means creating a virtual version of a device or resource, such as a server, storage device, network, or even a database. The framework divides the resource into one or more execution environments. For data, this can produce a golden copy, or source, that serves as a centralized location and eliminates duplicated data. For reads and writes, each virtual copy keeps only the data unique to it, while duplicated blocks are kept as a single shared instance.
Oracle: RMAN duplicates, cold backups to restores, Data Pump, and other archaic data transfer processes are time consuming.
SQL Server: Import/Export, restore from backup, replication with log shipping.
By virtualizing, we remove the "weight" of the data. We know that roughly 80% of the data won't change between copies, so why keep individual copies of it? The source is then deduplicated and compressed to conserve even more space.
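To make the shared-block idea concrete, here is a minimal copy-on-write sketch in Python. It is an illustration under stated assumptions, not how any particular engine is implemented; real products do this at the storage layer, and every class and variable name here is hypothetical.

    # Copy-on-write sketch: virtual copies share the golden source's blocks
    # and store only the blocks they change themselves.
    class GoldenSource:
        def __init__(self, blocks):
            self.blocks = blocks      # block_id -> bytes; the deduped source

    class VirtualCopy:
        def __init__(self, source):
            self.source = source
            self.changed = {}         # only this copy's modified blocks

        def read(self, block_id):
            # Prefer the local change; otherwise read the shared block.
            return self.changed.get(block_id, self.source.blocks[block_id])

        def write(self, block_id, data):
            # Writes never touch the source; the delta stays local.
            self.changed[block_id] = data

    source = GoldenSource({0: b"orders", 1: b"customers"})
    dev = VirtualCopy(source)
    dev.write(1, b"masked-customers")
    assert dev.read(0) == b"orders"             # shared; costs no extra space
    assert dev.read(1) == b"masked-customers"   # unique to this copy

Ten such copies cost one shared source plus ten small deltas, which is the 80%-unchanged observation put to work.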
I work with Delphix, so you would think I know our virtualization the best, but the truth is, I also know many other virtualization tools at a very detailed level.
The amount of information I know on Oracle virtualization tools is pretty insane, in fact.
How do we “rewind” data and code changes now?
Why should the DBA rewind changes made in dev and test?
Why should you be the one to do this in test?
Virtualization removes this.
Virtual databases are read/write, so even maintenance tasks, like DBCC checks, can be offloaded to one (see the sketch after this list).
Ability to version control not just the metadata, but the user data!
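As a hedged illustration of that offload, here is what running an integrity check against a VDB might look like from Python with pyodbc; the server name, database name, and connection string are assumptions, not a prescribed setup.

    # Run DBCC CHECKDB against a virtual database instead of production.
    # Host and database names are hypothetical; requires the pyodbc package.
    import pyodbc

    conn = pyodbc.connect(
        "DRIVER={ODBC Driver 17 for SQL Server};"
        "SERVER=vdb-host;DATABASE=Sales_VDB;Trusted_Connection=yes;",
        autocommit=True,  # DBCC commands should not run in a user transaction
    )
    cursor = conn.cursor()
    cursor.execute("DBCC CHECKDB (Sales_VDB) WITH NO_INFOMSGS;")
    print("Integrity check completed; corruption would raise a database error.")
    conn.close()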
Point out the engine and size after we’ve compressed and de-duplicated.
Note that each of the VDBs will take approximately 5-10 GB, vs. 1 TB, to offer a FULL read/write copy of the production system.
It will do so in just a matter of minutes.
That this can also be done for the application tier!
Each Virtual Database (VDB) will no longer require its own storage footprint, only memory (SGA/PGA, etc.) and local redo logs. This is a considerable savings, but…
If we take this a step further by writing only the blocks changed from the source, then we get 10-20 copies of a database in about the same space that one database requires.
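The arithmetic behind that claim, as a quick back-of-the-envelope sketch; the 1 TB source size and 10 GB per-VDB delta are illustrative assumptions, not measurements.

    # Back-of-the-envelope storage math for virtual copies (illustrative numbers).
    source_gb = 1024         # one compressed, deduped golden copy (~1 TB)
    delta_gb_per_vdb = 10    # unique changed blocks per virtual database
    copies = 15

    full_copies_gb = 1024 * copies                       # 15,360 GB
    virtual_gb = source_gb + delta_gb_per_vdb * copies   # 1,174 GB
    print(f"Full copies: {full_copies_gb:,} GB; virtual: {virtual_gb:,} GB")
    # ~15 read/write copies land in roughly the space of one database.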
Containers package software into standardized units for development, shipment, and deployment. A container image is a lightweight, standalone, executable package of a piece of software that includes everything needed to run it: code, runtime, system tools, system libraries, and settings.
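As one possible illustration, building and running such a unit with Docker could be scripted like this; the image tag and build context are hypothetical placeholders.

    # Build a container image from a local Dockerfile, then run it once.
    import subprocess

    subprocess.run(["docker", "build", "-t", "dataops-demo:latest", "."], check=True)
    subprocess.run(["docker", "run", "--rm", "dataops-demo:latest"], check=True)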
The next step is the ability to migrate to the cloud or from one cloud to another. Right now, 60% of customers are using 2-5 clouds on average. The ability to move a Data Pod from one cloud to another is incredibly powerful.
Companies now spend increasing time migrating not just to the cloud, but between clouds. If it were as simple as migrating a data pod, with a few changes for the new storage location (i.e., the cloud), it could save companies millions of dollars.
We refer to a container as a template in our product.
Note that a data pod can be moved here or to the cloud…
This may appear to be a traffic disaster of changes, but to developers with Agile experience, a "sprint" looks just like this. Sprints are quick runs and merges where developers work separately on code that must merge successfully at the correct intersection and be deployed.
Versioning with source control is displayed at the top, using Virtual images. You can see each iteration of the sprints.
The middle section shows the branches that occur during the development process. A virtual database can be spun from another virtual, which makes it easier for developers to build on the work another developer has produced.
Stopping points and releases via a clone take minutes vs. hours or days.
This is the interface for developers and testers: they can bookmark before important tasks or rewind to any point in the process, and they can bookmark and branch for their full development and testing needs.
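In script form, that bookmark/branch/rewind workflow looks something like this minimal sketch. The VirtualDatabase class and all of its methods are hypothetical stand-ins for whatever API a data virtualization tool exposes, not a real product interface.

    # Hypothetical data-control API showing the bookmark/branch/rewind flow.
    class VirtualDatabase:
        def __init__(self, name):
            self.name = name
            self.bookmarks = {}    # label -> point on the data timeline
            self.clock = 0         # stand-in for the database's timeline

        def bookmark(self, label):
            self.bookmarks[label] = self.clock

        def rewind(self, label):
            self.clock = self.bookmarks[label]   # drop changes since the mark

        def branch(self, name):
            child = VirtualDatabase(name)        # spin a virtual from a virtual
            child.clock = self.clock
            child.bookmarks = dict(self.bookmarks)
            return child

    vdb = VirtualDatabase("sales_dev")
    vdb.bookmark("before_schema_change")
    vdb.clock += 5                        # a destructive test runs here
    vdb.rewind("before_schema_change")    # back in minutes, no DBA ticket
    tester = vdb.branch("sales_test")     # a tester builds on dev's state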
Data can be processed only if there is at least one lawful basis to do so. The lawful bases for processing data are:
the data subject has given consent to the processing of his or her personal data for one or more specific purposes.
processing is necessary for the performance of a contract to which the data subject is party or in order to take steps at the request of the data subject prior to entering into a contract.
processing is necessary for compliance with a legal obligation to which the controller is subject.
processing is necessary in order to protect the vital interests of the data subject or of another natural person.
processing is necessary for the performance of a task carried out in the public interest or in the exercise of official authority vested in the controller.
processing is necessary for the purposes of the legitimate interests pursued by the controller or by a third party, except where such interests are overridden by the interests or fundamental rights and freedoms of the data subject which require protection of personal data, in particular where the data subject is a child.
Or does it shift the problem toward authentication and authorization?
Masking personally identifiable information (PII, and data regulated under HIPAA, PCI, etc.) renders it worthless from a security standpoint.
Resolves both the technical and personal responsibility issue.
The data can be masked before it moves to non-production, removing unnecessary risk.
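As a rough sketch of what masking on the way to non-production can look like: the column names and salt below are illustrative assumptions, and real masking tools additionally preserve formats and referential integrity.

    # Deterministic masking sketch: the same input always yields the same token,
    # so joins across masked tables still line up. Columns and salt are examples.
    import hashlib

    SALT = b"rotate-me-per-environment"

    def mask(value: str) -> str:
        digest = hashlib.sha256(SALT + value.encode()).hexdigest()
        return digest[:12]   # short, stable token in place of the real value

    row = {"customer_id": 42, "email": "dave@example.com", "ssn": "078-05-1120"}
    PII_COLUMNS = {"email", "ssn"}
    masked = {k: mask(str(v)) if k in PII_COLUMNS else v for k, v in row.items()}
    print(masked)   # customer_id survives; email and ssn become opaque tokens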
As we discussed, on average, 80% of data is non-production.
GDPR is on the horizon, and US companies handling EU personal data must be compliant by May 2018.
And yet we state that we won't need DBAs? That data isn't at the center of the challenge?
The business is able to provision new environments or refresh existing ones in a matter of minutes.
Developers and testers who've worked with bookmarks and branching for their code changes can now do the same with database changes, rewinding and refreshing as they need without impacting the DBA's day. This frees the DBA to do more with their time.
Having tools that include the database in the Agile development cycle is a pivotal change in how the DBA can be part of DevOps.