SlideShare una empresa de Scribd logo
1 de 86
Pythian Operational Visibility
Percona Live Santa Clara, 2015
April 13, 2015
Derek Downey
MySQL Principal Consultant
Laine Campbell
Co-Founder, Open Source Database Practice
ABOUT PYTHIAN
• 200+ leading brands trust us to keep their systems fast,
relaible, and secure
• Elite DBA & SysAdmin workforce: 7 Oracle ACEs, 2 Oracle
ACE Directors, 5 Microsoft MVPs, 1 Cloudera Champion of
Big Data, 1 Datastax Platinum Administrator — More than any
other company, regardless of head count
• Oracle, Microsoft, MySQL, Hadoop, Cassandra, MongoDB,
and more.
• Infrastructure, Cloud, SRE, DevOps, and application expertise
• Big data practice includes architects, R&D, data scientists,
and operations capabilities
• Zero lock-in, utility billing model, easily blended into existing
teams.
10,000Pythian currently manages more than 10,000
systems.
350Pythian currently employs more than 350
people in 25 countries worldwide.
1997Pythian was founded in 1997
content
© 2014 Pythian
● discussion
○ laine campbell
○ one hour
● set-up host environments to monitor
○ derek downey
○ 30 minutes
● review observability stack
○ laine campbell
○ 30 minutes
● attach to observability stack, hands-on, Q&A
○ derek downey and laine campbell
○ one hour
Our Goals: to understand
© 2014 Pythian
● observability objectives, principles and outcomes
● current state and problems
● metrics
● observability architecture
● choosing what to measure
○ business KPIs and ways to track them
○ pre-emptive and diagnostics measurements
● the Pythian opsviz stack
○ what it is
○ how to set it up
○ how to visualize data
operational visibility
continuous improvement
kaizen recognizes improvement can be small or large.
many small improvements can make a big change.
to improve a system, you must...
© 2014 Pythian
● understand it
● describe it
● involve, and motivate all
stakeholders
enabling kaizen
© 2014 Pythian
plan do
act study
we can do none of this...
© 2014 Pythian
without visibility
the objectives of observability
© 2014 Pythian
● business velocity
● business availability
● business efficiency
● business scalability
the principles of observability
© 2014 Pythian
● store business and operations data together
● store at low resolution for core KPIs
● support self-service visualization
● keep your architecture simple and scalable
● democratize data for all
● collect and store once
the outcomes of observability
© 2014 Pythian
● high trust and transparency
● continuous deployment
● engineering velocity
● happy staff (in all groups)
● cost-efficiency
state of the union
© 2014 Pythian
starting to address the traditional problems with
opsviz
traditional monitoring
© 2014 Pythian
the problems
© 2014 Pythian
● too many dashboards
● data collected multiple
times
● resolution much too
high
● does not support
ephemeral
● hard to automate
● logs not centralized
better...
© 2014 Pythian
● telemetry collected
once
● logs centralized
● logs alerted on and
graphed
● 1 second resolution
possible
● supports ephemeral
● plays well with CM
● database table data
into dashboards
what must we improve?
© 2014 Pythian
● architectural component complexity and fragility
● functional automation and ephemeral support
● storage and summarization
● naive alerting and anomaly detection
● not understanding and using good math
● insufficient visualization and analysis
what’s in a metric?
© 2014 Pythian
resolution
latency
diversity
telemetry
● counters
● gauges
events
traditional
synthetic
architectural components
© 2014 Pythian
sensing
collecting
analysis
storage
visualization
alerting
architectural components
© 2014 Pythian
● Telemetry
● Events and Logs
● Applications
● Databases and SQL
● Servers and Resources
sensing
architectural components
© 2014 Pythian
● Agent or Agentless
● Push and Pull
● Filtering and Tokenizing
● Scaling
● Performance Impact
sensing
collecting
architectural components
© 2014 Pythian
● In-Stream
● Feeding into Automation
● Anomaly Detection
● Aggregation and
Calculations
sensing
collecting
analysis
architectural components
© 2014 Pythian
● Telemetry
● Events
● Resolution and
Aggregation
● Backends
sensing
collecting
analysis
storage
architectural components
© 2014 Pythian
● rules-based processing
● notification routing
● event aggregation and
management
● under, not over paging
● actionable alerts
sensing
collecting
analysis
storage
alerting
architectural components
© 2014 Pythian
● executive dashboards
● operational dashboards
● anomaly identification
● capacity planning
sensing
collecting
analysis
storage
alerting
visualization
what to measure?
© 2014 Pythian
we measure to support our KPIs
we measure to pre-empt incidents
we measure to diagnose problems
we alert when customers feel the pain
supporting our KPIs
© 2014 Pythian
velocity
efficiency
security
performance
availability
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
velocity
© 2014 Pythian
velocity
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
how fast can the org push new features?
how fast can the org pivot?
how fast can the org scale up or down?
deployment counts
DB object changes
data loads and changes
provisioning counts
cluster add/removal
member add/removal
engineering support
query review turnaround
data model review
turnaround
deployment time
DDL timings
data load timings
provisioning timing
cluster add/removal
member add/removal
deployment errors
failed DDL/DML
schema mismatch
provisioning errors
cluster add/removal
member add/removal
efficiency
© 2014 Pythian
velocity
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
how cost-efficient is our environment?
how elastic is our environment?
cloud spend
data storage costs
data compute costs
data in/out costs
physical spend
data storage costs
data compute costs
data in/out costs
staffing spend
DBE spend
Ops spend
cloud utilization
database capacity
database utilization
physical utilization
database capacity
database utilization
staffing utilization
DBE utilization
Ops utilization
provisioning counts
cluster add/removal
member add/removal
application utilization
percent capacity used
mapped to product or
feature
staffing elasticity
DBE/ops hiring time
DBE/ops training time
efficiency
security
© 2014 Pythian
velocity
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
how secure is our environment?
penetration tests
frequency
success
classified storage
live
in backups
audit results
frequency
results
audit trail data
utilization
access
users with access
account access
account audit
infosec incidents
event frequency
efficiency
security
performance
© 2014 Pythian
velocity
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
What is the AppDex of our environment?
AppDex(n), where n is the latency
AppDex(2.5), score of 95 indicates:
● 70% of queries under 2.5
● 25% of queries tolerable (5)
● 5% of queries as outliers
efficiency
security
performance
© 2014 Pythian
availability
© 2014 Pythian
velocity
https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
how available is our environment to
customers?
how available is each component to the
application?
external response
pings
websites
APIs
system availability
server uptime
daemon uptime
accessibility to app
resource consumption
CPU
storage
memory
network
efficiency
security
performance
availability
pre-empting incidents
supporting diagnostics
© 2014 Pythian
identify anomalies in latency or utilization
identify dangerous trends in latency or utilization
identify error rates indicating potential failure
measure as much as possible
alert on as little as possible
© 2014 Pythian
● align alerts to customer pain
● automate remediation if possible
● use metrics and events to solve
what cannot be remediated
https://fc04.deviantart.net/fs71/i/2012/110/0/1/ninja_lyra_by_x72assassin-d4x0xqn.png
https://lockerdome.com/vox.com/7101308691941396
when measuring, understand...
© 2014 Pythian
your artificial bounds, which:
● to ensure resources are not exhausted
● to ensure application concurrency stays in control
your resource constraints
compromising on storage
© 2014 Pythian
telemetry data
© 2014 Pythian
collect and then flush
storing more than just averages
● min/max
● standard deviation
● percentiles for outlier removal
averages lie
storing histograms
http://s3.amazonaws.com/bronibooru/941e30f3cb2129951df0d7d674fefcad.png
the application
© 2014 Pythian
closest to perceived customer experience
documenting application components in SQL
understanding end to end
transactions
prioritization by latency
budgets
the application
© 2014 Pythian
application performance management tools
application logging to logstash
fire and forget to an event processing system
telemetry for occurrence counts
histograms for visualizing
the server
© 2014 Pythian
the basics, resource utilization, process behavior and the
network
log aggregation and measuring
● syslogs
● mysql logs
● cron, authentication, mail logs
aggregation up in distributed systems
https://lockerdome.com/vox.com/7101308691941396
the database: mysql
© 2014 Pythian
exposed database metrics
sql analytics and metrics
connection layer
how does they impact:
● availability KPIs
● concurrency KPIs
● latency KPIs
database metrics: workload
© 2014 Pythian
generic workload distribution
impacts to latency budgets
how fast are we hitting our resource and concurrency bounds?
● selects
● prepared statements
● ddl - data definition language
● dml - data manipulation language
● administrative commands
correlate DDL and admin to availability impacts
measure shifts in workload that may be impacting latency
database metrics: workload
© 2014 Pythian
data access behavior
● sort statistics
● join statistics
● handler status variables
○ index scans vs. full table scans
○ key index access
○ commits and rollbacks
how are we impacting our latency
budgets?
https://e621.net/data/e4/e4/e4e434d480672a67550c7c4113c85c73.jpg
database metrics: workload
event metrics, inside out
© 2014 Pythian
statement event
stage/sql/creating
sort event
stage/sql/copying
to tmp table
stage/sql/checking
query cache for
sql
sql stage
stage/com/ping
stage/com/quit
stage/com/error
com stage
helps identify where
time is spent, in SQL
statements
this helps to
diagnose and
improve
performance
database metrics: workload
where is time being spent?
© 2014 Pythian
wait event
event name
event source code
event operation
(read|write|lock)
stage/com/ping
stage/com/quit
stage/com/error
joined with threads,
allows for trending of
counts of wait
events.
this helps to
diagnose and
improve concurrency
and performance
issues
database metrics: sql
© 2014 Pythian
all sql should be logged, with context
● comments pointing to application source code
● latency, and the components therein
● resources consumed
● data access paths taken
performance schema -> event and log analysis and visualization
slow logs -> event and log analysis and visualization
network sniffing -> event and log analysis and visualization
© 2014 Pythian
database metrics: sql
© 2014 Pythian
all SQL should be logged, with context
THIS IS HARD
solutions like vivid cortex can radically improve velocity
the database: connection layer
© 2014 Pythian
this is key to latency and availability KPIs
you must connect to your database
for latency, you have a fixed budget
getting to your network, other transaction
components and SQL consume much of it
for availability, you must understand
your bounds
tcp ports and mysql
© 2014 Pythian
max_connections (mysql)
● take one tcp port
time_wait (kernel)
● how long the port stays open
● 60 in most linux kernels
● effectively reduces port range by a factor of 60
port range of 30,000 limits to 5,000 total network
connections
http://fc06.deviantart.net/fs71/f/2012/119/d/3/fat_pinkie_pie_by_nice123456-d4xy2w3.png
the database: network
mysql (5.6)
© 2014 Pythian
OS fixed amounts
● max tcp ports
● max tcp backlog
● network bandwidth
the database: network
mysql (5.6)
© 2014 Pythian
mysql configuration
● max_connections
● back_log
● max_connect_errors
● connect_timeout
● net_read_timeout
● net_write_timeout
● open_files_limit
connections: putting it together
use your network bounds
monitor proximity
© 2014 Pythian
status counters - network
● packets per sec
● request times
● requests per sec
● connection state (time_wait, listen, established)
● backlog
● socket queue drops and overflows
● time to get a tcp connection
the database: memory
operating system
© 2014 Pythian
shared memory, file descriptors, semaphores
● max shared memory segment size
● max number of segments
● max total of all memory available
● max file_descriptors per system and user
● max semaphores
the database: memory
mysql (5.6)
© 2014 Pythian
global memory bounds
● key_buffer_size
● innodb_buffer_pool_size
● innodb_additional_mem_pool_size
● innodb_log_buffer_size
● query_cache_size
the database: memory
mysql (5.6)
© 2014 Pythian
connection memory (max_connections)
● stack (thread_stack)
● connection and result buffers (net_buffer_length)
○ up to max_allowed_packet
● random read buffer (read_rnd_buffer_size)
● sequential read buffer (read_buffer_size)
● sort buffer (max of sort_buffer_size or max_heap_table_size)
● join buffer (join_buffer_size)
● (binlog_cache_size)
the database: memory
mysql (5.6)
© 2014 Pythian
max_connections = 1000
● thread_stack = 256k
● net_buffer_length x2 = 16k x 2 = 32k
○ up to max_allowed_packet x2 = 1m x2 = 2m
● read_rnd_buffer_size = 256k
● read_buffer_size = 128k
● sort_buffer_size = 256k max_heap_table_size = 16M
● join_buffer_size = 128k
● binlog_cache_size = 32k
total = 18.78m (reduce to ~3m by reducing max_heap…)
1000 connections = 2.9 GB
connections: putting it together
understand mysql impact to latency
understand proximity to mysql bounds
© 2014 Pythian
status counters: mysql
● processlist: connection and state counts
● thread statistics (thread_xxx)
● connection durations
● open_tables and open_files
● semaphores globally and per thread
● aborted_clients
● aborted_connects
● connection high water marks
● mysql network traffic
● mysql handlers - for buffer usage
● query response times
what’s next?
© 2014 Pythian
better time series storage
● automatically distributed and federated, thus manageable and scalable
● leverage parallelism and data aggregation
● proper data consistency, backup and recovery
● instrumented and tuneable
● can consume billions of metrics and store them
what’s next?
© 2014 Pythian
machine learning
● using code to pull out the signal from the noise
● easier correlation of metrics
● anomaly detection
● incident prediction
● capacity prediction
● REAL MATH
what’s next?
© 2014 Pythian
consolidation
● business metrics
● telemetry data
● event and log text data
lab time
● set-up your hosts
● stretch, hydrate and use facilities
● 30 minutes
setup ec2
● https://github.com/dtest/plsc15-opvis
asbolus
the pythian opsviz stack
© 2014 Pythian
https://github.com/pythian/opsviz
http://dashboard.pythian.asbol.us/
resides in AWS, built via cloudformation and opsworks
internet facing rabbitMQ listener
● for external logstash/statsd/sensu clients
● using AMQP, SSL elastic load balancer
● in AWS VPC
asbolus
the pythian opsviz stack
© 2014 Pythian
originally conceived and built by blackbird devops team
● taylor ludwig https://github.com/taylorludwig
● jonathan dietz https://github.com/jonathandietz
● aaron lee https://github.com/aaronmlee
continued development by pythian
● alex lovell-troy https://github.com/alexlovelltroy
● derek downey https://github.com/dtest
● laine campbell https://github.com/lainevcampbell
● dennis walker https://github.com/denniswalker
asbolus
© 2014 Pythian
telemetry data: sensu
© 2014 Pythian
generated from sensu agent
● sensu agent on host polls from 1 to 60 seconds
● agent pushes to rabbitMQ
● rabbitMQ sends to sensu
once in sensu
● event handlers review
● flushed to carbon/graphite
telemetry data: logstash
© 2014 Pythian
generated from logstash agent
● logstash agent on host pushes to logstash server
● logstash server tokenizes and submits to statsD
● statsD flushes and sends to carbon/graphite
event data: logstash
© 2014 Pythian
generated from logstash agent
● logstash agent on host pushes to logstash server
● logstash server tokenizes and submits to elasticsearch
monitoring: sensu
© 2014 Pythian
sensu server receives data from
● sensu agents
● statsD on logstash host
sensu handlers:
● flush to graphite
● send alerts to pagerduty
● create tickets in jira
● send messages to chat rooms
sensu
architecture
© 2014 Pythian
monitoring: why sensu?
© 2014 Pythian
clients subscribe to checks, supporting ephemeral hosts
sensu server can be parallelized and ephemeral
clients easily added to configuration management
backwards compatible to nagios
multiple multi-site strategies available
excellent API
telemetry storage: graphite
© 2014 Pythian
why graphite?
● works with many different pollers, to graph everything!
● combines maturity with functionality better than others
● can be clustered for scale
what are the limitations?
● clustering for scale is complex, not native
● flat files means no joining multiple series for complex
queries
● advanced statistical analysis not easy
event storage: elasticsearch
© 2014 Pythian
why?
● native distribution via clustering and sharding
● performant indexing and querying
● elasticity (unicorn scale)
what are the limitations?
● security still minimal
● enterprise features becoming available (at a price)
visualization
© 2014 Pythian
telemetry: grafana
logs/events: kibana
incidents/alerts: uchiwa
scaled and available sensu
© 2014 Pythian
rabbitMQ
● network partition
○ pause-minority
● node failures
○ mirrored queues
○ tcp load balancers
● AZ failures
○ multiple availability zones
○ pause minority recovery
● scaling concerns
○ auto-scaling nodes
○ elastic load balancer
scaled and available sensu
© 2014 Pythian
sensu servers
● redis failure
○ elasticache, multi-AZ
● sensu main host
○ multiple hosts
○ use the same redis service
○ multi-AZ
monitoring your monitor
© 2014 Pythian
rabbitMQ
● monitor queue growth for anomalies
● monitor for network partitions
● monitor auto scaling cluster size
sensu
● sensu cluster size (n+1 sensu hosts)
● redis availability
© 2014 Pythian
workflow
● metric sent to ELB
● ELB sends to rabbitMQ cluster
● rabbitMQ
○ writes to master
○ replicates to mirror
● rabbitMQ sends to sensu
scaled and available graphite
© 2014 Pythian
carbon cache (tcp daemon, listening for metrics)
● scale with multiple caches on each host
carbon relay
● used to distribute to multiple carbon caches
● by metric name, or consistent hashing
● can be redundant, using load balancers
whisper (flat file database, storage)
● can be replicated at the relay level
● running out of capacity and having to grow requires
rehashing
© 2014 Pythian
workflow
● metric sent to ELB
● ELB sends to carbon relay
● carbon relay
○ chooses carbon cache
○ replicates as needed
● carbon cache flushes to
whisper
scaling and available
elasticsearch
© 2014 Pythian
node and cluster scaling
● clustering scales reads
● distribute across availability zones
● sharding indices allows for distributing data
● multiple clusters for multiple indexes
network partitions
● running masters on dedicated nodes
● running data nodes on dedicated nodes
● run search load balancers on dedicated nodes
© 2014 Pythian
workflow
● master/replica nodes route data
and manage the cluster
● client nodes redirect queries
● data nodes store index shards
what’s next?
© 2014 Pythian
visualization
● use sensu API to get incidents/alerts into graphite/
● merge kibana and grafana to one page
monitoring
● integrate flapjack for event aggregation and routing
● continue to add more metrics
full stack
● anomaly detection via heka or skyline
● influxdb for storage
lab time
● work on dashboards!
● 15 minute walkthrough w/ derek
● 45 minute play time
Q&A
email: downey@pythian.com
twitter: https://twitter.com/derek_downey

Más contenido relacionado

La actualidad más candente

Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxDataStax
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...DataStax Academy
 
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus Webinar
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus WebinarBuild and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus Webinar
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus WebinarImpetus Technologies
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Impetus Technologies
 
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...DataStax
 
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...DataStax
 
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...DataStax
 
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerceDon't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerceDataStax
 
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...DataStax
 
DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesLars Albertsson
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGateJeffrey T. Pollock
 
Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...Ontico
 
How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?OVHcloud
 
End to End Supply Chain Control Tower
End to End Supply Chain Control TowerEnd to End Supply Chain Control Tower
End to End Supply Chain Control TowerDatabricks
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseDataStax
 
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohika
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohikaIn-Memory Stream Processing with Hazelcast Jet @MorningAtLohika
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohikaNazarii Cherkas
 
How to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataHow to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataDataStax
 
Introduction: Architecting for Scale
Introduction: Architecting for ScaleIntroduction: Architecting for Scale
Introduction: Architecting for ScaleDataStax
 
Building and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxBuilding and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxDataStax
 

La actualidad más candente (20)

Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStaxWebinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
Webinar: Get On-Demand Education Anytime, Anywhere with Coursera and DataStax
 
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
ProtectWise Revolutionizes Enterprise Network Security in the Cloud with Data...
 
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus Webinar
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus WebinarBuild and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus Webinar
Build and Manage Hadoop & Oracle NoSQL DB Solutions- Impetus Webinar
 
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
Leveraging NoSQL Database Technology to Implement Real-time Data Architecture...
 
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
Don’t Get Caught in a PCI Pickle: Meet Compliance and Protect Payment Card Da...
 
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...
Webinar | Real-time Analytics for Healthcare: How Amara Turned Big Data into ...
 
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
Webinar: Buckle Up: The Future of the Distributed Database is Here - DataStax...
 
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerceDon't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
Don't Let Your Shoppers Drop; 5 Rules for Today’s eCommerce
 
DataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra RockstarDataStax Training – Everything you need to become a Cassandra Rockstar
DataStax Training – Everything you need to become a Cassandra Rockstar
 
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
Webinar - Macy’s: Why Your Database Decision Directly Impacts Customer Experi...
 
DataOps - Lean principles and lean practices
DataOps - Lean principles and lean practicesDataOps - Lean principles and lean practices
DataOps - Lean principles and lean practices
 
Microservices Patterns with GoldenGate
Microservices Patterns with GoldenGateMicroservices Patterns with GoldenGate
Microservices Patterns with GoldenGate
 
Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...Key trends in Big Data and new reference architecture from Hewlett Packard En...
Key trends in Big Data and new reference architecture from Hewlett Packard En...
 
How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?How to scale your PaaS with OVH infrastructure?
How to scale your PaaS with OVH infrastructure?
 
End to End Supply Chain Control Tower
End to End Supply Chain Control TowerEnd to End Supply Chain Control Tower
End to End Supply Chain Control Tower
 
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax EnterpriseTop 10 Best Practices for Apache Cassandra and DataStax Enterprise
Top 10 Best Practices for Apache Cassandra and DataStax Enterprise
 
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohika
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohikaIn-Memory Stream Processing with Hazelcast Jet @MorningAtLohika
In-Memory Stream Processing with Hazelcast Jet @MorningAtLohika
 
How to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph dataHow to Successfully Visualize DSE Graph data
How to Successfully Visualize DSE Graph data
 
Introduction: Architecting for Scale
Introduction: Architecting for ScaleIntroduction: Architecting for Scale
Introduction: Architecting for Scale
 
Building and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStaxBuilding and Maintaining Bulletproof Systems with DataStax
Building and Maintaining Bulletproof Systems with DataStax
 

Destacado

Velocity pythian operational visibility
Velocity pythian operational visibilityVelocity pythian operational visibility
Velocity pythian operational visibilityLaine Campbell
 
ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016Derek Downey
 
Using Vault to decouple MySQL Secrets
Using Vault to decouple MySQL SecretsUsing Vault to decouple MySQL Secrets
Using Vault to decouple MySQL SecretsDerek Downey
 
Recruiting for diversity in tech
Recruiting for diversity in techRecruiting for diversity in tech
Recruiting for diversity in techLaine Campbell
 
RDS for MySQL, No BS Operations and Patterns
RDS for MySQL, No BS Operations and PatternsRDS for MySQL, No BS Operations and Patterns
RDS for MySQL, No BS Operations and PatternsLaine Campbell
 
Scaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web ServicesScaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web ServicesLaine Campbell
 

Destacado (6)

Velocity pythian operational visibility
Velocity pythian operational visibilityVelocity pythian operational visibility
Velocity pythian operational visibility
 
ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016ProxySQL Tutorial - PLAM 2016
ProxySQL Tutorial - PLAM 2016
 
Using Vault to decouple MySQL Secrets
Using Vault to decouple MySQL SecretsUsing Vault to decouple MySQL Secrets
Using Vault to decouple MySQL Secrets
 
Recruiting for diversity in tech
Recruiting for diversity in techRecruiting for diversity in tech
Recruiting for diversity in tech
 
RDS for MySQL, No BS Operations and Patterns
RDS for MySQL, No BS Operations and PatternsRDS for MySQL, No BS Operations and Patterns
RDS for MySQL, No BS Operations and Patterns
 
Scaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web ServicesScaling MySQL in Amazon Web Services
Scaling MySQL in Amazon Web Services
 

Similar a Pythian operational visibility

Does Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus WebinarDoes Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus WebinarImpetus Technologies
 
18 May 2017 - Vuzion Love Cloud
18 May 2017 - Vuzion Love Cloud18 May 2017 - Vuzion Love Cloud
18 May 2017 - Vuzion Love CloudVuzion
 
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...CA Technologies
 
Australian Cloud and Data Centre Strategy Summit 2016 gus sabatino
Australian Cloud and Data Centre Strategy Summit 2016   gus sabatinoAustralian Cloud and Data Centre Strategy Summit 2016   gus sabatino
Australian Cloud and Data Centre Strategy Summit 2016 gus sabatinoGus Sabatino
 
A DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scaleA DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scaleSanjeev Sharma
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsImply
 
Microservices in production 15/12/2015
Microservices in production 15/12/2015Microservices in production 15/12/2015
Microservices in production 15/12/2015Damien Daly
 
Pivotal Big Data Roadshow
Pivotal Big Data Roadshow Pivotal Big Data Roadshow
Pivotal Big Data Roadshow VMware Tanzu
 
The 3 Pillars of Remote Application Development
The 3 Pillars of Remote Application DevelopmentThe 3 Pillars of Remote Application Development
The 3 Pillars of Remote Application DevelopmentJenna Starmer
 
Enabling Resource Management — The Right People for the Right Projects
Enabling Resource Management — The Right People for the Right ProjectsEnabling Resource Management — The Right People for the Right Projects
Enabling Resource Management — The Right People for the Right ProjectsCA Technologies
 
Best Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture SetupBest Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture SetupEDB
 
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSBig Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSMatt Stubbs
 
Driving Real Insights Through Data Science
Driving Real Insights Through Data ScienceDriving Real Insights Through Data Science
Driving Real Insights Through Data ScienceVMware Tanzu
 
Modernizing IT in the Platform Era
Modernizing IT in the Platform EraModernizing IT in the Platform Era
Modernizing IT in the Platform EraApcera
 
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...DevOps.com
 
Monitoring As a Service
Monitoring As a ServiceMonitoring As a Service
Monitoring As a ServiceAmit Panchal
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best PracticesCapgemini
 
Taking AppSec to 11 - BSides Austin 2016
Taking AppSec to 11 - BSides Austin 2016Taking AppSec to 11 - BSides Austin 2016
Taking AppSec to 11 - BSides Austin 2016Matt Tesauro
 

Similar a Pythian operational visibility (20)

Does Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus WebinarDoes Big Data Spell Big Costs- Impetus Webinar
Does Big Data Spell Big Costs- Impetus Webinar
 
18 May 2017 - Vuzion Love Cloud
18 May 2017 - Vuzion Love Cloud18 May 2017 - Vuzion Love Cloud
18 May 2017 - Vuzion Love Cloud
 
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...
Case Study: Vivo Automated IT Capacity Management to Optimize Usage of its Cr...
 
Backup manager
Backup manager Backup manager
Backup manager
 
Australian Cloud and Data Centre Strategy Summit 2016 gus sabatino
Australian Cloud and Data Centre Strategy Summit 2016   gus sabatinoAustralian Cloud and Data Centre Strategy Summit 2016   gus sabatino
Australian Cloud and Data Centre Strategy Summit 2016 gus sabatino
 
A DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scaleA DevOps adoption playbook- achieving business value at scale
A DevOps adoption playbook- achieving business value at scale
 
Why data warehouses cannot support hot analytics
Why data warehouses cannot support hot analyticsWhy data warehouses cannot support hot analytics
Why data warehouses cannot support hot analytics
 
Microservices in production 15/12/2015
Microservices in production 15/12/2015Microservices in production 15/12/2015
Microservices in production 15/12/2015
 
Pivotal Big Data Roadshow
Pivotal Big Data Roadshow Pivotal Big Data Roadshow
Pivotal Big Data Roadshow
 
The 3 Pillars of Remote Application Development
The 3 Pillars of Remote Application DevelopmentThe 3 Pillars of Remote Application Development
The 3 Pillars of Remote Application Development
 
Enabling Resource Management — The Right People for the Right Projects
Enabling Resource Management — The Right People for the Right ProjectsEnabling Resource Management — The Right People for the Right Projects
Enabling Resource Management — The Right People for the Right Projects
 
Best Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture SetupBest Practices for a Complete Postgres Enterprise Architecture Setup
Best Practices for a Complete Postgres Enterprise Architecture Setup
 
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICSBig Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
Big Data LDN 2018: MICROLISE: USING BIG DATA AND AI IN TRANSPORT AND LOGISTICS
 
Driving Real Insights Through Data Science
Driving Real Insights Through Data ScienceDriving Real Insights Through Data Science
Driving Real Insights Through Data Science
 
Modernizing IT in the Platform Era
Modernizing IT in the Platform EraModernizing IT in the Platform Era
Modernizing IT in the Platform Era
 
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...
5 Steps to Achieving the Single Pane of Glass Across DevOps -- APM, NPM, Metr...
 
Monitoring As a Service
Monitoring As a ServiceMonitoring As a Service
Monitoring As a Service
 
Business Data Lake Best Practices
Business Data Lake Best PracticesBusiness Data Lake Best Practices
Business Data Lake Best Practices
 
Taking AppSec to 11 - BSides Austin 2016
Taking AppSec to 11 - BSides Austin 2016Taking AppSec to 11 - BSides Austin 2016
Taking AppSec to 11 - BSides Austin 2016
 
Kumar Anil
Kumar AnilKumar Anil
Kumar Anil
 

Más de Laine Campbell

An Introduction To Palomino
An Introduction To PalominoAn Introduction To Palomino
An Introduction To PalominoLaine Campbell
 
Hybrid my sql_hadoop_datawarehouse
Hybrid my sql_hadoop_datawarehouseHybrid my sql_hadoop_datawarehouse
Hybrid my sql_hadoop_datawarehouseLaine Campbell
 
Methods of Sharding MySQL
Methods of Sharding MySQLMethods of Sharding MySQL
Methods of Sharding MySQLLaine Campbell
 
CouchConf SF 2012 Lightning Talk - Operational Excellence
CouchConf SF 2012 Lightning Talk - Operational ExcellenceCouchConf SF 2012 Lightning Talk - Operational Excellence
CouchConf SF 2012 Lightning Talk - Operational ExcellenceLaine Campbell
 
Understanding MySQL Performance through Benchmarking
Understanding MySQL Performance through BenchmarkingUnderstanding MySQL Performance through Benchmarking
Understanding MySQL Performance through BenchmarkingLaine Campbell
 

Más de Laine Campbell (6)

Running MySQL in AWS
Running MySQL in AWSRunning MySQL in AWS
Running MySQL in AWS
 
An Introduction To Palomino
An Introduction To PalominoAn Introduction To Palomino
An Introduction To Palomino
 
Hybrid my sql_hadoop_datawarehouse
Hybrid my sql_hadoop_datawarehouseHybrid my sql_hadoop_datawarehouse
Hybrid my sql_hadoop_datawarehouse
 
Methods of Sharding MySQL
Methods of Sharding MySQLMethods of Sharding MySQL
Methods of Sharding MySQL
 
CouchConf SF 2012 Lightning Talk - Operational Excellence
CouchConf SF 2012 Lightning Talk - Operational ExcellenceCouchConf SF 2012 Lightning Talk - Operational Excellence
CouchConf SF 2012 Lightning Talk - Operational Excellence
 
Understanding MySQL Performance through Benchmarking
Understanding MySQL Performance through BenchmarkingUnderstanding MySQL Performance through Benchmarking
Understanding MySQL Performance through Benchmarking
 

Último

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUK Journal
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProduct Anonymous
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsJoaquim Jorge
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...apidays
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherRemote DBA Services
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Enterprise Knowledge
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)wesley chun
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesBoston Institute of Analytics
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...Martijn de Jong
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAndrey Devyatkin
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfhans926745
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobeapidays
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc
 

Último (20)

Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdfUnderstanding Discord NSFW Servers A Guide for Responsible Users.pdf
Understanding Discord NSFW Servers A Guide for Responsible Users.pdf
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Artificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and MythsArtificial Intelligence: Facts and Myths
Artificial Intelligence: Facts and Myths
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...Driving Behavioral Change for Information Management through Data-Driven Gree...
Driving Behavioral Change for Information Management through Data-Driven Gree...
 
Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)Powerful Google developer tools for immediate impact! (2023-24 C)
Powerful Google developer tools for immediate impact! (2023-24 C)
 
HTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation StrategiesHTML Injection Attacks: Impact and Mitigation Strategies
HTML Injection Attacks: Impact and Mitigation Strategies
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
+971581248768>> SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHA...
 
How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law DevelopmentsTrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
TrustArc Webinar - Stay Ahead of US State Data Privacy Law Developments
 

Pythian operational visibility

  • 1. Pythian Operational Visibility Percona Live Santa Clara, 2015 April 13, 2015 Derek Downey MySQL Principal Consultant Laine Campbell Co-Founder, Open Source Database Practice
  • 2. ABOUT PYTHIAN • 200+ leading brands trust us to keep their systems fast, relaible, and secure • Elite DBA & SysAdmin workforce: 7 Oracle ACEs, 2 Oracle ACE Directors, 5 Microsoft MVPs, 1 Cloudera Champion of Big Data, 1 Datastax Platinum Administrator — More than any other company, regardless of head count • Oracle, Microsoft, MySQL, Hadoop, Cassandra, MongoDB, and more. • Infrastructure, Cloud, SRE, DevOps, and application expertise • Big data practice includes architects, R&D, data scientists, and operations capabilities • Zero lock-in, utility billing model, easily blended into existing teams. 10,000Pythian currently manages more than 10,000 systems. 350Pythian currently employs more than 350 people in 25 countries worldwide. 1997Pythian was founded in 1997
  • 3. content © 2014 Pythian ● discussion ○ laine campbell ○ one hour ● set-up host environments to monitor ○ derek downey ○ 30 minutes ● review observability stack ○ laine campbell ○ 30 minutes ● attach to observability stack, hands-on, Q&A ○ derek downey and laine campbell ○ one hour
  • 4. Our Goals: to understand © 2014 Pythian ● observability objectives, principles and outcomes ● current state and problems ● metrics ● observability architecture ● choosing what to measure ○ business KPIs and ways to track them ○ pre-emptive and diagnostics measurements ● the Pythian opsviz stack ○ what it is ○ how to set it up ○ how to visualize data
  • 5. operational visibility continuous improvement kaizen recognizes improvement can be small or large. many small improvements can make a big change.
  • 6. to improve a system, you must... © 2014 Pythian ● understand it ● describe it ● involve, and motivate all stakeholders
  • 7. enabling kaizen © 2014 Pythian plan do act study
  • 8. we can do none of this... © 2014 Pythian without visibility
  • 9. the objectives of observability © 2014 Pythian ● business velocity ● business availability ● business efficiency ● business scalability
  • 10. the principles of observability © 2014 Pythian ● store business and operations data together ● store at low resolution for core KPIs ● support self-service visualization ● keep your architecture simple and scalable ● democratize data for all ● collect and store once
  • 11. the outcomes of observability © 2014 Pythian ● high trust and transparency ● continuous deployment ● engineering velocity ● happy staff (in all groups) ● cost-efficiency
  • 12. state of the union © 2014 Pythian starting to address the traditional problems with opsviz
  • 14. the problems © 2014 Pythian ● too many dashboards ● data collected multiple times ● resolution much too high ● does not support ephemeral ● hard to automate ● logs not centralized
  • 15. better... © 2014 Pythian ● telemetry collected once ● logs centralized ● logs alerted on and graphed ● 1 second resolution possible ● supports ephemeral ● plays well with CM ● database table data into dashboards
  • 16. what must we improve? © 2014 Pythian ● architectural component complexity and fragility ● functional automation and ephemeral support ● storage and summarization ● naive alerting and anomaly detection ● not understanding and using good math ● insufficient visualization and analysis
  • 17. what’s in a metric? © 2014 Pythian resolution latency diversity telemetry ● counters ● gauges events traditional synthetic
  • 18. architectural components © 2014 Pythian sensing collecting analysis storage visualization alerting
  • 19. architectural components © 2014 Pythian ● Telemetry ● Events and Logs ● Applications ● Databases and SQL ● Servers and Resources sensing
  • 20. architectural components © 2014 Pythian ● Agent or Agentless ● Push and Pull ● Filtering and Tokenizing ● Scaling ● Performance Impact sensing collecting
  • 21. architectural components © 2014 Pythian ● In-Stream ● Feeding into Automation ● Anomaly Detection ● Aggregation and Calculations sensing collecting analysis
  • 22. architectural components © 2014 Pythian ● Telemetry ● Events ● Resolution and Aggregation ● Backends sensing collecting analysis storage
  • 23. architectural components © 2014 Pythian ● rules-based processing ● notification routing ● event aggregation and management ● under, not over paging ● actionable alerts sensing collecting analysis storage alerting
  • 24. architectural components © 2014 Pythian ● executive dashboards ● operational dashboards ● anomaly identification ● capacity planning sensing collecting analysis storage alerting visualization
  • 25. what to measure? © 2014 Pythian we measure to support our KPIs we measure to pre-empt incidents we measure to diagnose problems we alert when customers feel the pain
  • 26. supporting our KPIs © 2014 Pythian velocity efficiency security performance availability https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg
  • 27. velocity © 2014 Pythian velocity https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg how fast can the org push new features? how fast can the org pivot? how fast can the org scale up or down? deployment counts DB object changes data loads and changes provisioning counts cluster add/removal member add/removal engineering support query review turnaround data model review turnaround deployment time DDL timings data load timings provisioning timing cluster add/removal member add/removal deployment errors failed DDL/DML schema mismatch provisioning errors cluster add/removal member add/removal
  • 28. efficiency © 2014 Pythian velocity https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg how cost-efficient is our environment? how elastic is our environment? cloud spend data storage costs data compute costs data in/out costs physical spend data storage costs data compute costs data in/out costs staffing spend DBE spend Ops spend cloud utilization database capacity database utilization physical utilization database capacity database utilization staffing utilization DBE utilization Ops utilization provisioning counts cluster add/removal member add/removal application utilization percent capacity used mapped to product or feature staffing elasticity DBE/ops hiring time DBE/ops training time efficiency
  • 29. security © 2014 Pythian velocity https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg how secure is our environment? penetration tests frequency success classified storage live in backups audit results frequency results audit trail data utilization access users with access account access account audit infosec incidents event frequency efficiency security
  • 30. performance © 2014 Pythian velocity https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg What is the AppDex of our environment? AppDex(n), where n is the latency AppDex(2.5), score of 95 indicates: ● 70% of queries under 2.5 ● 25% of queries tolerable (5) ● 5% of queries as outliers efficiency security performance
  • 32. availability © 2014 Pythian velocity https://derpicdn.net/img/view/2012/8/3/65841__safe_fluttershy_tank_scooter_artist-colon-giantmosquito_tortoise_vespa.jpg how available is our environment to customers? how available is each component to the application? external response pings websites APIs system availability server uptime daemon uptime accessibility to app resource consumption CPU storage memory network efficiency security performance availability
  • 33. pre-empting incidents supporting diagnostics © 2014 Pythian identify anomalies in latency or utilization identify dangerous trends in latency or utilization identify error rates indicating potential failure
  • 34. measure as much as possible alert on as little as possible © 2014 Pythian ● align alerts to customer pain ● automate remediation if possible ● use metrics and events to solve what cannot be remediated https://fc04.deviantart.net/fs71/i/2012/110/0/1/ninja_lyra_by_x72assassin-d4x0xqn.png
  • 35. https://lockerdome.com/vox.com/7101308691941396 when measuring, understand... © 2014 Pythian your artificial bounds, which: ● to ensure resources are not exhausted ● to ensure application concurrency stays in control your resource constraints
  • 37. telemetry data © 2014 Pythian collect and then flush storing more than just averages ● min/max ● standard deviation ● percentiles for outlier removal averages lie storing histograms
  • 38. http://s3.amazonaws.com/bronibooru/941e30f3cb2129951df0d7d674fefcad.png the application © 2014 Pythian closest to perceived customer experience documenting application components in SQL understanding end to end transactions prioritization by latency budgets
  • 39. the application © 2014 Pythian application performance management tools application logging to logstash fire and forget to an event processing system telemetry for occurrence counts histograms for visualizing
  • 40. the server © 2014 Pythian the basics, resource utilization, process behavior and the network log aggregation and measuring ● syslogs ● mysql logs ● cron, authentication, mail logs aggregation up in distributed systems
  • 41. https://lockerdome.com/vox.com/7101308691941396 the database: mysql © 2014 Pythian exposed database metrics sql analytics and metrics connection layer how does they impact: ● availability KPIs ● concurrency KPIs ● latency KPIs
  • 42. database metrics: workload © 2014 Pythian generic workload distribution impacts to latency budgets how fast are we hitting our resource and concurrency bounds? ● selects ● prepared statements ● ddl - data definition language ● dml - data manipulation language ● administrative commands correlate DDL and admin to availability impacts measure shifts in workload that may be impacting latency
  • 43. database metrics: workload © 2014 Pythian data access behavior ● sort statistics ● join statistics ● handler status variables ○ index scans vs. full table scans ○ key index access ○ commits and rollbacks how are we impacting our latency budgets? https://e621.net/data/e4/e4/e4e434d480672a67550c7c4113c85c73.jpg
  • 44. database metrics: workload event metrics, inside out © 2014 Pythian statement event stage/sql/creating sort event stage/sql/copying to tmp table stage/sql/checking query cache for sql sql stage stage/com/ping stage/com/quit stage/com/error com stage helps identify where time is spent, in SQL statements this helps to diagnose and improve performance
  • 45. database metrics: workload where is time being spent? © 2014 Pythian wait event event name event source code event operation (read|write|lock) stage/com/ping stage/com/quit stage/com/error joined with threads, allows for trending of counts of wait events. this helps to diagnose and improve concurrency and performance issues
  • 46. database metrics: sql © 2014 Pythian all sql should be logged, with context ● comments pointing to application source code ● latency, and the components therein ● resources consumed ● data access paths taken performance schema -> event and log analysis and visualization slow logs -> event and log analysis and visualization network sniffing -> event and log analysis and visualization
  • 48. database metrics: sql © 2014 Pythian all SQL should be logged, with context THIS IS HARD solutions like vivid cortex can radically improve velocity
  • 49. the database: connection layer © 2014 Pythian this is key to latency and availability KPIs you must connect to your database for latency, you have a fixed budget getting to your network, other transaction components and SQL consume much of it for availability, you must understand your bounds
  • 50. tcp ports and mysql © 2014 Pythian max_connections (mysql) ● take one tcp port time_wait (kernel) ● how long the port stays open ● 60 in most linux kernels ● effectively reduces port range by a factor of 60 port range of 30,000 limits to 5,000 total network connections
  • 51. http://fc06.deviantart.net/fs71/f/2012/119/d/3/fat_pinkie_pie_by_nice123456-d4xy2w3.png the database: network mysql (5.6) © 2014 Pythian OS fixed amounts ● max tcp ports ● max tcp backlog ● network bandwidth
  • 52. the database: network mysql (5.6) © 2014 Pythian mysql configuration ● max_connections ● back_log ● max_connect_errors ● connect_timeout ● net_read_timeout ● net_write_timeout ● open_files_limit
  • 53. connections: putting it together use your network bounds monitor proximity © 2014 Pythian status counters - network ● packets per sec ● request times ● requests per sec ● connection state (time_wait, listen, established) ● backlog ● socket queue drops and overflows ● time to get a tcp connection
  • 54. the database: memory operating system © 2014 Pythian shared memory, file descriptors, semaphores ● max shared memory segment size ● max number of segments ● max total of all memory available ● max file_descriptors per system and user ● max semaphores
  • 55. the database: memory mysql (5.6) © 2014 Pythian global memory bounds ● key_buffer_size ● innodb_buffer_pool_size ● innodb_additional_mem_pool_size ● innodb_log_buffer_size ● query_cache_size
  • 56. the database: memory mysql (5.6) © 2014 Pythian connection memory (max_connections) ● stack (thread_stack) ● connection and result buffers (net_buffer_length) ○ up to max_allowed_packet ● random read buffer (read_rnd_buffer_size) ● sequential read buffer (read_buffer_size) ● sort buffer (max of sort_buffer_size or max_heap_table_size) ● join buffer (join_buffer_size) ● (binlog_cache_size)
  • 57. the database: memory mysql (5.6) © 2014 Pythian max_connections = 1000 ● thread_stack = 256k ● net_buffer_length x2 = 16k x 2 = 32k ○ up to max_allowed_packet x2 = 1m x2 = 2m ● read_rnd_buffer_size = 256k ● read_buffer_size = 128k ● sort_buffer_size = 256k max_heap_table_size = 16M ● join_buffer_size = 128k ● binlog_cache_size = 32k total = 18.78m (reduce to ~3m by reducing max_heap…) 1000 connections = 2.9 GB
  • 58. connections: putting it together understand mysql impact to latency understand proximity to mysql bounds © 2014 Pythian status counters: mysql ● processlist: connection and state counts ● thread statistics (thread_xxx) ● connection durations ● open_tables and open_files ● semaphores globally and per thread ● aborted_clients ● aborted_connects ● connection high water marks ● mysql network traffic ● mysql handlers - for buffer usage ● query response times
  • 59. what’s next? © 2014 Pythian better time series storage ● automatically distributed and federated, thus manageable and scalable ● leverage parallelism and data aggregation ● proper data consistency, backup and recovery ● instrumented and tuneable ● can consume billions of metrics and store them
  • 60. what’s next? © 2014 Pythian machine learning ● using code to pull out the signal from the noise ● easier correlation of metrics ● anomaly detection ● incident prediction ● capacity prediction ● REAL MATH
  • 61. what’s next? © 2014 Pythian consolidation ● business metrics ● telemetry data ● event and log text data
  • 62. lab time ● set-up your hosts ● stretch, hydrate and use facilities ● 30 minutes
  • 64. asbolus the pythian opsviz stack © 2014 Pythian https://github.com/pythian/opsviz http://dashboard.pythian.asbol.us/ resides in AWS, built via cloudformation and opsworks internet facing rabbitMQ listener ● for external logstash/statsd/sensu clients ● using AMQP, SSL elastic load balancer ● in AWS VPC
  • 65. asbolus the pythian opsviz stack © 2014 Pythian originally conceived and built by blackbird devops team ● taylor ludwig https://github.com/taylorludwig ● jonathan dietz https://github.com/jonathandietz ● aaron lee https://github.com/aaronmlee continued development by pythian ● alex lovell-troy https://github.com/alexlovelltroy ● derek downey https://github.com/dtest ● laine campbell https://github.com/lainevcampbell ● dennis walker https://github.com/denniswalker
  • 67. telemetry data: sensu © 2014 Pythian generated from sensu agent ● sensu agent on host polls from 1 to 60 seconds ● agent pushes to rabbitMQ ● rabbitMQ sends to sensu once in sensu ● event handlers review ● flushed to carbon/graphite
  • 68. telemetry data: logstash © 2014 Pythian generated from logstash agent ● logstash agent on host pushes to logstash server ● logstash server tokenizes and submits to statsD ● statsD flushes and sends to carbon/graphite
  • 69. event data: logstash © 2014 Pythian generated from logstash agent ● logstash agent on host pushes to logstash server ● logstash server tokenizes and submits to elasticsearch
  • 70. monitoring: sensu © 2014 Pythian sensu server receives data from ● sensu agents ● statsD on logstash host sensu handlers: ● flush to graphite ● send alerts to pagerduty ● create tickets in jira ● send messages to chat rooms
  • 72. monitoring: why sensu? © 2014 Pythian clients subscribe to checks, supporting ephemeral hosts sensu server can be parallelized and ephemeral clients easily added to configuration management backwards compatible to nagios multiple multi-site strategies available excellent API
  • 73. telemetry storage: graphite © 2014 Pythian why graphite? ● works with many different pollers, to graph everything! ● combines maturity with functionality better than others ● can be clustered for scale what are the limitations? ● clustering for scale is complex, not native ● flat files means no joining multiple series for complex queries ● advanced statistical analysis not easy
  • 74. event storage: elasticsearch © 2014 Pythian why? ● native distribution via clustering and sharding ● performant indexing and querying ● elasticity (unicorn scale) what are the limitations? ● security still minimal ● enterprise features becoming available (at a price)
  • 75. visualization © 2014 Pythian telemetry: grafana logs/events: kibana incidents/alerts: uchiwa
  • 76. scaled and available sensu © 2014 Pythian rabbitMQ ● network partition ○ pause-minority ● node failures ○ mirrored queues ○ tcp load balancers ● AZ failures ○ multiple availability zones ○ pause minority recovery ● scaling concerns ○ auto-scaling nodes ○ elastic load balancer
  • 77. scaled and available sensu © 2014 Pythian sensu servers ● redis failure ○ elasticache, multi-AZ ● sensu main host ○ multiple hosts ○ use the same redis service ○ multi-AZ
  • 78. monitoring your monitor © 2014 Pythian rabbitMQ ● monitor queue growth for anomalies ● monitor for network partitions ● monitor auto scaling cluster size sensu ● sensu cluster size (n+1 sensu hosts) ● redis availability
  • 79. © 2014 Pythian workflow ● metric sent to ELB ● ELB sends to rabbitMQ cluster ● rabbitMQ ○ writes to master ○ replicates to mirror ● rabbitMQ sends to sensu
  • 80. scaled and available graphite © 2014 Pythian carbon cache (tcp daemon, listening for metrics) ● scale with multiple caches on each host carbon relay ● used to distribute to multiple carbon caches ● by metric name, or consistent hashing ● can be redundant, using load balancers whisper (flat file database, storage) ● can be replicated at the relay level ● running out of capacity and having to grow requires rehashing
  • 81. © 2014 Pythian workflow ● metric sent to ELB ● ELB sends to carbon relay ● carbon relay ○ chooses carbon cache ○ replicates as needed ● carbon cache flushes to whisper
  • 82. scaling and available elasticsearch © 2014 Pythian node and cluster scaling ● clustering scales reads ● distribute across availability zones ● sharding indices allows for distributing data ● multiple clusters for multiple indexes network partitions ● running masters on dedicated nodes ● running data nodes on dedicated nodes ● run search load balancers on dedicated nodes
  • 83. © 2014 Pythian workflow ● master/replica nodes route data and manage the cluster ● client nodes redirect queries ● data nodes store index shards
  • 84. what’s next? © 2014 Pythian visualization ● use sensu API to get incidents/alerts into graphite/ ● merge kibana and grafana to one page monitoring ● integrate flapjack for event aggregation and routing ● continue to add more metrics full stack ● anomaly detection via heka or skyline ● influxdb for storage
  • 85. lab time ● work on dashboards! ● 15 minute walkthrough w/ derek ● 45 minute play time