How to stop finger-pointing when your application is down
1. How to stop Finger-pointing
when your Application is down
Deepak Kaul
Solution Consultant, Compuware APM
Moving away from a reactive to a proactive IT organization
2. We help organizations optimize the performance
and value of their business-critical applications
• Web, non-Web, mobile, streaming, cloud-based applications
• Rapid issue notification with actionable diagnostics
• Insight into how performance impacts the business (revenue, brand, cost)
SaaS,
Cloud-Based and
On-Premises
Offerings
• Rapid startup and
payback
*“Trends: The Diversification Of End User Experience Monitoring”, Forrester Research, Inc., July 5, 2011
4,000+ Customers
Worldwide
• 2,500+ enterprise
customers
• 1,500+ SMB
customers
Global Reach
• 80+ offices
• 29 countries
• 100s of partners
• Global service
delivery
Recognized as
Industry Leader
• Gartner:
Leader in APM Magic
Quadrant
• Forrester Research:
“…a complete view
of end user
experience”*
• Ovum:
“Game-changing”
A New Generation of APM
3. Gartner Positions Compuware as a Leader in APM
Magic Quadrant
Gartner, Inc. Magic
Quadrant for Application
Performance Monitoring
August 16, 2012
Jonah Kowall, Will Cappelli
This Magic Quadrant graphic was published by Gartner, Inc.
as part of a larger research document and should be
evaluated in the context of the entire document.
The Gartner report is available upon request from
Compuware. Gartner does not endorse any vendor, product
or service depicted in its research publications, and does not
advise technology users to select only those vendors with the
highest ratings. Gartner research publications consist of the
opinions of Gartner's research organization and should not be
construed as statements of fact. Gartner disclaims all
warranties, expressed or implied, with respect to this
research, including any warranties of merchantability or
fitness for a particular purpose.
Gartner has positioned Compuware
as a Leader in the APM Magic
Quadrant for the last 3 years
4. Compuware APM: Growing list of World-class Customers…
Financial Services
eCommerce
SaaS & Cloud
Other
ISV
Government
Telco
Insurance
5. Traditional monitoring – How finger pointing starts..
CUSTOMERS
INTERNET: Mobile Carriers, Content Delivery Networks, Major ISP, Local ISP, Third-party/Cloud Services
DATA CENTER: Storage, DB Servers, Web Servers, App Servers, Middleware Servers, Mainframe, Load Balancers, Network
Teams: NETWORK TEAM, APPLICATION TEAM, SERVER TEAM, MAINFRAME TEAM
“This application is slow!”
“Expand the network capacity!”
“Problem solved.. I’m on it!”
6. Traditional monitoring – How finger pointing starts..
CUSTOMERS
INTERNET: Mobile Carriers, Content Delivery Networks, Major ISP, Local ISP, Third-party/Cloud Services
DATA CENTER: Storage, DB Servers, Web Servers, App Servers, Middleware Servers, Mainframe, Load Balancers, Network
Teams: NETWORK TEAM, APPLICATION TEAM, SERVER TEAM, MAINFRAME TEAM
“This application is slow!”
“Not my Problem!” (x4)
7. Traditional monitoring – How finger pointing starts..
CUSTOMERS
INTERNET: Mobile Carriers, Content Delivery Networks, Major ISP, Local ISP, Third-party/Cloud Services
DATA CENTER: Storage, DB Servers, Web Servers, App Servers, Middleware Servers, Mainframe, Load Balancers, Network
Teams: NETWORK TEAM, APPLICATION TEAM, SERVER TEAM, MAINFRAME TEAM
“This application is slow!”
“Not my Problem!” (x4)
“Increase infrastructure capacity!”
“More Servers!”
“More Bandwidth!”
“Increase Capacity!”
“Increase Storage!”
“This application is still slow!”
8. Traditional monitoring – How finger pointing starts..
CUSTOMERS
INTERNET: Mobile Carriers, Content Delivery Networks, Major ISP, Local ISP, Third-party/Cloud Services
DATA CENTER: Storage, DB Servers, Web Servers, App Servers, Middleware Servers, Mainframe, Load Balancers, Network
Teams: NETWORK TEAM, APPLICATION TEAM, SERVER TEAM, MAINFRAME TEAM
“This application is slow!”
“All my lights are green!” (x4)
CTO
Service Manager
War Room: “blah blah blah blah …. !!!!!... ????????”
9. Typical App Performance Lifecycle – Where it all begins..
Development
(local, remote, outsourced)
Test/QA
(local, remote, outsourced)
• Load testing
Business
Production
(local, remote, outsourced)
• Cloud load testing
• Monitoring
10. ✘What?
✘Who?
✘When?
✘How?
✘Code?
✘Recreate?
✘Business impact?
✘Priority?
✘Competitive info?
Problems with Typical App Performance Lifecycle
Too much time
reproducing problems!
Not engineered for performance!
Too many iterations!
Too many business
impacting issues!
Not enough business context!
$$$$$$
Development
(local, remote, outsourced)
Test/QA
(local, remote, outsourced)
• Load testing
Production
(local, remote, outsourced)
• Cloud load testing
• Monitoring
Business
12. All transactions
Click-to-code
All details
Which users
$$ amount
Conversions
Abandonment
Etc.
Compuware Lifecycle-Oriented APM: Single System
No need to
reproduce issues
Performance from the start
Fewer iterations
24x7, all transactions
Business impact
$
Development
(local, remote, outsourced)
Test/QA
(local, remote, outsourced)
• Load testing
Production
(local, remote, outsourced)
• Cloud load testing
• Monitoring
Business
Fewer issues
13. ONE APM System: 5 Modern APM Solutions
Application-Centric
World
14. Deep
analysis
Application
Browsers
Mobile
apps
Compuware APM: Driven by End-User Experience
C/C++
Private
agents
Private
Last
Mile
150,000+
consumer-grade
desktops
168+
countries
2,500+
ISPs
Major
mobile
carriers
around
the globe
Backbone
• Synthetic
monitoring
• Load testing
Last Mile
• Synthetic
monitoring
• Load testing
Enterprise
• Synthetic
monitoring
• Broad view of end-user experience and
multi-tier transactions (real-user monitoring)
Data Center and Cloud
• Deep application transaction management
All tiers, all transactions, all users
150+
enterprise-grade
nodes
Data
centers
and cloud
providers
PurePath
Real Users
• User
experience
management
16. The dynaTrace Difference - Real Business Impact
dynaTrace Technology Business Impact
4
5
Zero Configuration
Auto-discovery, auto-adaptive
Auto-diagnostics, Auto-BTs & more
Fastest Time To Value
10x the apps in 1/10th the time
Lowest TCO available
1
All Transactions, 24x7
True trace and capture, across tiers
Less than 2% overhead
Deep visibility, to code-level
Proactive
See issues before they impact users
10x-100x faster time to resolve
Give dev & testing a production
view
2
User Perspective
Know user experience & behavior
All devices, all browsers, all clicks
Extensible to native mobile apps
More Revenue & Loyalty
Assure optimal performance
Understand impact
Delight customers & partners
3
17. Synthetics
dynaTrace - How It Works
Browser / Rich-Client, Web Server, Java, .NET, Other, Database
Performance
Warehouse
PurePath
Collector
dynaTrace Server
dynaTrace Client
Sessions
Store
Exported
Session
Offline
Session
Analysis
Lowest overhead
with externalized
data processing. No
app.-side data
processing
Only 24x7 heterogeneous
always-on distributed global
deep transaction trace. No
after-the-fact tracing
CPU, RT, Mem., Method
arguments / returns, SQLs,
Remoting, Msgs., Logs
Exceptions, Sync. No
statistical guesses
Shared full-depth
transaction, context
information.
No guesswork
Self-learning,
Auto-discovery,
Auto placement.
Low maintenance
Real-time transaction
analysis, business
transaction mapping,
alerting. No averages
Globally scalable
collector architecture,
secure. For cloud, virtual
environments
User experience, Web 2.0
page actions, clicks, end-
to-end transactions.
Transparent in production
Zero-config.
Deploy w/
single file.
No config.
files
Trace & compare real
& synthetic
transactions.
One system
PCI Compliant
18. A Common Language Cross Lifecycle
End-to-End Transaction
Execution Path
• Across tiers: browser –
servers - database
• Remoting
• Web Services
• External services
• Code-level depth
• Heterogeneous: .NET,
Java and more
Contextual
Transaction
Information
• Method arguments
• SQL bind variables
• Synchronization
• Exceptions
• Logs
+ +
Environmental Data
• Memory Dumps
• Thread Dumps
• Monitoring data
• PMI, JMX, CLR
• Win, Unix, DB,
VMware, etc.
=
Browser / Rich-Client, Web Server, Java, .NET, Other, Database
=
Production, Architecture, Test/QA
Development, Business
dynaTrace
Session
Synthetics
Last updated or created: April ‘11
Key themes: Intro
Talk track
This presentation covers the Compuware approach to APM, which is driven by end user experience. We believe the most important part of APM, and the only way to do it right, is to focus on the end user experience. We’ll explain all that in this session.
Note to the speaker:
This slide deck tells a story about the importance of end user experience in APM and how Compuware addresses it. When you’ve learned this story, you can tune the delivery of it to meet your particular audience. You don’t need to use every slide in exactly the same sequence. The important objective is to properly convey the message: describe the problems companies experience and explain how we uniquely address them. There are also optional slides that you can add in as required. Ideally, you should also add slides that you create about your prospect to convey that you understand their unique situation.
Last updated or created: Sept ‘11
Key themes: Gartner recognizes Compuware as a leader in the APM market and the company with the most “completeness of vision.”
Talk track
In Sept ‘11, Gartner published this year’s update to their APM Magic Quadrant report. The report evaluates 29 APM vendors on “ability to execute” and “completeness of vision.” The vendors are placed in one of four quadrants depending on their evaluation results.
Gartner placed Compuware in the Leaders quadrant, and you can see from the placement that Gartner gives Compuware the highest rating for “completeness of vision.” No other company even comes close on this scale. Why? Gartner gives Compuware high marks for our “First Mile to Last Mile” strategy that covers the entire application delivery chain. No other company has this vision, and it places Compuware clearly in the top position for vision and strategy. It means that Compuware is building a future for its customers that will allow them to retain the competitive advantage they get from maximizing the performance of their applications.
Compuware also ranks very highly on the “ability to execute” axis.
Compuware’s position in this evaluation moved up substantially from last year’s report. The movement reflects the value that Gartner places on our overall vision and strategy, which includes our major product releases and our acquisitions and integrations.
Gartner may be the most respected analyst firm in the world. It’s clear from their analysis that Compuware is in a leadership position in the APM market.
Please note that this evaluation did NOT include the benefits of the dynaTrace acquisition, which happened in July 2011 and was too late to be included in this MQ. We believe that when dynaTrace is rolled into the Compuware positioning next year, it will only improve our placement, especially on the “completeness of vision” axis.
The entire APM MQ report is available for viewing from the Compuware website.
Note to the speaker:
This is an incredibly powerful message, and it should be delivered to every prospect.
Application performance is a central concern to many groups in a company. They each have their own particular concerns related to app performance.
• Business is concerned with competitive advantage: how can they use app performance to get ahead?
• Development is concerned with agility: how can they build performance into an application, and fix it, faster?
• QA is concerned with quality: how can they ensure performance?
• Operations is concerned with stability: how can they keep things running smoothly?
They are linked via an Application Performance Lifecycle. This lifecycle starts with business requirements, flows to Development, into QA, and then into Operations. As issues arise or the business changes, the lifecycle repeats. Because of the importance of application performance, businesses are looking to move through this lifecycle with rapid agility.
Applications Change Rapidly ~ Business Demands Increasing
Applications have become:
• mobile and distributed
• reliant on third parties
• cloud-based
• increasingly complex and fragile
Cloud invasion challenge
• How to get visibility into every end user and automate collection of client-side performance and functional problems?
• How to best prioritize and reduce mean time to repair of identified problems?
• What are the best practices to analyze and optimize application performance on these new sets of technologies and platforms?
• It takes us too long to reproduce and resolve app issues in production, in test, and in development. We spend way too much time guessing and not enough time fixing.
• We need to reduce app release cycle time without sacrificing code quality and are struggling to achieve this.
• We need deeper visibility across a wider range of tiers, technologies and services to support our increasingly complex environment.
• We need to enable production and pre-production to work more closely together to improve efficiency and produce better results, faster.
Continuous Application Performance Management is a critical component across the Application Lifecycle. The earlier in the lifecycle you get your performance under control, the less you have to worry about actual problems later on when you ship your product.
Compuware unifies the Application Performance Lifecycle by providing two main items:
• Common metrics for business, development, and operations teams. These common metrics provide a common view of how app performance affects the business, and help these groups prioritize requirements and issues.
• Common tools, data, and diagnostics for Dev, QA, and Operations teams. This is based on PurePath technology that tracks all transactions all the time. This eliminates the need to reproduce issues (because the data is contained in the tracked transactions). It creates a “lingua franca” for Operations, Dev, and QA so they have a common view of the issues. Ultimately this saves a lot of time and money.
Last updated or created: April ‘11
Key themes: major change #3: the Cloud has arrived
Talk track
If it wasn’t complicated enough to have the data center and the web be more complex, now we also have the cloud as part of the equation.
More and more companies are moving some or all of their applications to a private or public cloud. And that certainly changes the way you do APM: the cloud is opaque, so you can’t monitor its inner workings, and the cloud is shared, so you need to be careful that someone else’s app is not making yours slow.
THIS is today’s app delivery chain. Far more complex than just a few years ago.
Set up the context up front: what the demo is going to highlight, and how with dynaTrace (dT) we can get to root cause analysis (RCA) in easy steps.
These are the five key points that differentiate dynaTrace from the other solutions out there.
First off, it was created with a transaction-centric view in mind. It captures every transaction that flows through the system. This isn’t sampling, it isn’t mathematical averaging. It’s everything! We pull all of the context of that transaction together: the code methods, logs, exceptions, SQL and even the bind variables!
It also isn’t something that you have waiting in the wings to turn on when there is a problem, hoping that you can catch it again. It’s on 24/7, production safe, with very low overhead. When the issue happens you have all of the details you need to troubleshoot it and quickly diagnose the cause of the issue.
The tool is of great value to developers, reducing the number of bugs that make it into production. We all know that those bugs can be very expensive: they take longer to find and fix. They also can damage the integrity of your production data as well as hurt your reputation with your customers. But with a complex system, you can’t capture all of the bugs before they get into production. Development never has the same quantity or type of data, and seldom sees the types of load that production does. Real users will use your system differently than testers, and issues like thread contention and synchronization often only come up during real-world load.
With dynaTrace everyone is speaking the same language. When an issue comes up, operations only has to export the session and send it to the development team. They have all the details they need to fix the issue, including the full context of what was going on. No longer are they digging through a large volume of code trying to reproduce the issue. And you also don’t have to give them access to a production system. No more gathering up logs from disparate systems. Everything is there in one place!
We also have customers that utilize our dynaTrace solution in an automated way as part of their regular build cycle. Our dynaTrace solution can automate a regression comparison from one build to the next. The report lets you know if the code is running as fast or faster, or if something you did hurt your performance. It will also let you know if new errors or exceptions have appeared. The system integrates with load testing tools like Gomez, LoadRunner and Performance Center.
It was also developed to be an open system. Extensive APIs exist, and we have a large quantity of plug-ins that were written by our SEs or our customers. They are all shared on a community portal. We also integrate with many development tools, load testing tools and bug tracking tools. This allows you to leverage your existing infrastructure rather than trying to figure out how to tie dynaTrace into your systems.
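To make the build-to-build regression comparison concrete, here is a minimal Python sketch of the idea. It is an illustration only, not dynaTrace's implementation or API: the function name `compare_builds`, the data layout, and the 10% tolerance are all assumptions made for the example.

```python
from statistics import median

def compare_builds(baseline, candidate, tolerance=0.10):
    """Compare per-transaction timings (ms) and exceptions between two builds.

    Hypothetical data layout: {transaction_name: {"timings_ms": [...], "exceptions": [...]}}.
    A transaction is flagged when its median time grows by more than `tolerance`
    or when an exception type appears that the baseline build did not have.
    """
    report = {}
    for name, cand in candidate.items():
        base = baseline.get(name)
        if base is None:
            # Transaction did not exist in the baseline build.
            report[name] = {"status": "new"}
            continue
        base_ms = median(base["timings_ms"])
        cand_ms = median(cand["timings_ms"])
        slower = cand_ms > base_ms * (1 + tolerance)
        new_exceptions = sorted(set(cand["exceptions"]) - set(base["exceptions"]))
        report[name] = {
            "status": "regressed" if slower or new_exceptions else "ok",
            "baseline_ms": base_ms,
            "candidate_ms": cand_ms,
            "new_exceptions": new_exceptions,
        }
    return report

# Made-up measurements for two consecutive builds.
build_41 = {
    "search": {"timings_ms": [120, 130, 125], "exceptions": []},
    "checkout": {"timings_ms": [300, 310, 305], "exceptions": ["TimeoutError"]},
}
build_42 = {
    "search": {"timings_ms": [121, 128, 126], "exceptions": []},
    "checkout": {"timings_ms": [400, 420, 410], "exceptions": ["TimeoutError", "ValueError"]},
}

report = compare_builds(build_41, build_42)
for name, result in report.items():
    print(name, result["status"])
```

In this made-up data, `search` stays within tolerance while `checkout` is flagged both for being slower and for a newly appearing exception type, which is the kind of signal a build pipeline could act on.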
The implementation is very simple. There is a 1 megabyte library that is loaded on the JVM or CLR. You update the argument that starts the JVM or CLR and restart. That’s it. dynaTrace is now automatically discovering the code being used. No need to configure what to watch and what to ignore. dynaTrace sees everything that affects performance.
The data is offloaded from the agent onto a Collector, which then sends the data to our Server. Since the agent does no real processing, it helps keep the overhead very low. On smaller installs it is also possible to have the Server and Collector on the same machine.
We also have facilities for collecting other data (OS, application-specific non-JVM/non-.NET) by using custom plug-ins. These can pull in other information that might be relevant to the performance or analysis. We also make it possible for memory dump analysis to be done on a separate system. When your JVMs have 40 gigs of RAM, that makes for a 40 gig file. This can be sent to a separate system for analysis, reducing the impact on the dynaTrace Server.
Now this diagram is meant to highlight how we can scale dynaTrace. On smaller deployments, the Collector and Server can be the same system, and you won’t need the Analysis Server.
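The agent-to-Collector-to-Server flow described above can be sketched in a few lines of Python. This is a simplified illustration of the offloading pattern under stated assumptions, not dynaTrace code: the class names `Agent`, `Collector` and `Server`, and the use of an in-process queue in place of a network connection, are hypothetical.

```python
import queue

class Agent:
    """Runs inside the application; records raw events and does no analysis."""

    def __init__(self, outbox):
        self.outbox = outbox

    def record(self, transaction_id, method, duration_ms):
        # Only a cheap enqueue happens on the application side.
        self.outbox.put((transaction_id, method, duration_ms))

class Server:
    """Does the actual processing, away from the monitored application."""

    def __init__(self):
        self.total_ms_by_transaction = {}

    def ingest(self, batch):
        for transaction_id, _method, duration_ms in batch:
            current = self.total_ms_by_transaction.get(transaction_id, 0)
            self.total_ms_by_transaction[transaction_id] = current + duration_ms

class Collector:
    """Drains events from the agent and forwards them to the server in batches."""

    def __init__(self, inbox, server):
        self.inbox = inbox
        self.server = server

    def drain(self):
        batch = []
        while True:
            try:
                batch.append(self.inbox.get_nowait())
            except queue.Empty:
                break
        if batch:
            self.server.ingest(batch)
        return len(batch)

# Wire the three parts together; the queue stands in for the network link.
channel = queue.Queue()
agent = Agent(channel)
server = Server()
collector = Collector(channel, server)

agent.record("tx-1", "SearchService.find", 12)
agent.record("tx-1", "JourneyDao.query", 48)
agent.record("tx-2", "BookingService.book", 30)

forwarded = collector.drain()
print(forwarded, server.total_ms_by_transaction)
```

The design point the sketch tries to show is that the part living inside the application only enqueues raw data, so the cost of aggregation and analysis is paid elsewhere.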
So again – dynaTrace sees the whole transaction as it flows through the system. It gathers up all the details about what services were used as well as what code was called. It also pulls in the arguments passed by the methods along with SQL bind variables, all exceptions and the logs. This creates what we call a PurePath. That is then combined with system-level data to give us the full context of that transaction as it flowed through the systems. We have all the details we need to troubleshoot why a transaction ran slow or generated errors.
For pages that have a large amount of client-side JavaScript, we also have a client-side agent that can time renderings and provide information on errors and other issues.
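As a rough mental model of what a per-transaction trace with full context might hold, here is a small Python sketch. It does not reflect the real PurePath data format: the `Step` and `TransactionTrace` classes, their field names, and the sample values are assumptions for illustration, and each `duration_ms` is taken to be the time spent in that step alone.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class Step:
    """One hop of a transaction, with the context captured alongside it."""
    tier: str
    method: str
    duration_ms: float  # assumed to be time spent in this step only
    arguments: dict = field(default_factory=dict)
    sql_bind_values: list = field(default_factory=list)
    exception: Optional[str] = None

@dataclass
class TransactionTrace:
    """A single end-to-end transaction, kept in full rather than averaged."""
    transaction_id: str
    steps: list = field(default_factory=list)

    def total_ms(self):
        return sum(step.duration_ms for step in self.steps)

    def slowest_step(self):
        return max(self.steps, key=lambda step: step.duration_ms)

    def failed_steps(self):
        return [step for step in self.steps if step.exception is not None]

# Made-up trace for one booking request crossing several tiers.
trace = TransactionTrace("tx-42", [
    Step("browser", "click:bookNow", 40),
    Step("web server", "GET /book", 15),
    Step("Java", "BookingService.book", 60, arguments={"journeyId": 7}),
    Step("database", "INSERT INTO booking VALUES (?, ?)", 900,
         sql_bind_values=[7, "alice"]),
    Step(".NET", "PaymentService.charge", 25, exception="PaymentDeclined"),
])

slowest = trace.slowest_step()
print(trace.total_ms(), slowest.tier, slowest.sql_bind_values)
print([step.tier for step in trace.failed_steps()])
```

Because every step keeps its own arguments, bind values and exception, the slow database call and the failed payment step can be identified from the record itself, without reproducing the request.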
Use this image in your slide deck when you transition from your PowerPoint into your demo. When you show the Transaction Flow in dynaTrace, point out how it compares to this diagram, and how dynaTrace discovered it all on its own.
The demo will be conducted using the sample application Easy Travel. Easy Travel is a multi-tier web application implemented in .NET and Java.
The overall architecture consists of:
• Two Java processes providing the Customer Frontend and the Business backend server
• Two .NET processes providing the B2B Frontend and the Payment backend server
• A C++ application which receives credit card numbers via IPC/Named Pipe and simulates verifying the number against a third-party provider
• A Launcher GUI which controls the processes and also hosts the Java Derby Database
• A Java Derby Database for storing the travel data