4. 1. The need for diagnostic data in cloud applications
2. Data we can monitor
3. Using the Azure Diagnostic Agent
4. Real-world guidance for troubleshooting Azure apps
10. Resolution
1. Enable Azure diagnostics
– Set key performance counters
2. Add logging statements around key functionality
– Especially external services
3. Test, test, test
4. Analyze
5. Fix it
Scenario
17.
Diagnostic Item                          Table Name                             Blob Container Name
Windows Event Logs                       WADWindowsEventLogsTable
Performance Counters                     WADPerformanceCountersTable
Trace Log Statements                     WADLogsTable
Azure Diagnostic Infrastructure Logs     WADDiagnosticInfrastructureLogsTable
Custom Logs (e.g. log4net, NLog, etc.)                                          <custom>
IIS Logs                                 WADDirectoriesTable*                   wad-iis-logfiles
IIS Failed Request Logs                  WADDirectoriesTable*                   wad-iis-failedreqlogfiles
Crash Dumps                              WADDirectoriesTable*
* The location of the blob log file is specified in the Container field and the name of the blob in the RelativePath field. The AbsolutePath field contains the name of the file as it existed on the role instance.
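When querying the tables above for a time window, filtering on PartitionKey is usually much cheaper than filtering on Timestamp. The sketch below assumes the commonly described convention that WAD tables use a PartitionKey of "0" followed by the UTC tick count of the event; verify this against your own table data before relying on it. The class and method names (WadQuery, PartitionKeyFor, TimeRangeFilter) are hypothetical helpers, not part of any SDK.

```csharp
using System;
using System.Globalization;

// Hypothetical helper: builds a table query filter for a UTC time window.
public static class WadQuery
{
    // Assumption: WAD partition keys are "0" + UTC ticks.
    public static string PartitionKeyFor(DateTime utc)
    {
        return "0" + utc.Ticks.ToString(CultureInfo.InvariantCulture);
    }

    // Returns a filter string covering [fromUtc, toUtc).
    public static string TimeRangeFilter(DateTime fromUtc, DateTime toUtc)
    {
        return string.Format(
            CultureInfo.InvariantCulture,
            "PartitionKey ge '{0}' and PartitionKey lt '{1}'",
            PartitionKeyFor(fromUtc),
            PartitionKeyFor(toUtc));
    }
}
```

The resulting string can be pasted into whichever table viewer you use (Visual Studio, LinqPad, or similar) as the query filter.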
18. 1. Role starts
2. Diagnostic monitor agent starts
3. Diagnostics configured
4. Data buffered locally
5. Data transferred to storage
20. o Trace logs
o IIS logs
o Infrastructure logs
o No transfer
o OnStart()
o Overrides default
o diagnostics.wadcfg
o Root of worker or bin of web
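For the declarative option, a diagnostics.wadcfg file might look roughly like the sketch below. It mirrors the settings used in the OnStart() example in this deck (one CPU counter sampled every 30 seconds, one-minute transfers). The quota values and the infrastructure-log settings are illustrative assumptions, and the element names follow the SDK 1.x/2.x-era schema as I understand it, so check them against the schema shipped with your SDK version.

```xml
<?xml version="1.0" encoding="utf-8"?>
<!-- Illustrative sketch only; quota values are assumptions. -->
<DiagnosticMonitorConfiguration
    xmlns="http://schemas.microsoft.com/ServiceHosting/2010/10/DiagnosticsConfiguration"
    configurationChangePollInterval="PT1M"
    overallQuotaInMB="4096">
  <DiagnosticInfrastructureLogs bufferQuotaInMB="0"
      scheduledTransferLogLevelFilter="Warning"
      scheduledTransferPeriod="PT30M" />
  <Logs bufferQuotaInMB="0"
      scheduledTransferLogLevelFilter="Information"
      scheduledTransferPeriod="PT1M" />
  <PerformanceCounters bufferQuotaInMB="0" scheduledTransferPeriod="PT1M">
    <PerformanceCounterConfiguration
        counterSpecifier="\Processor(_Total)\% Processor Time"
        sampleRate="PT30S" />
  </PerformanceCounters>
  <WindowsEventLog bufferQuotaInMB="0"
      scheduledTransferLogLevelFilter="Warning"
      scheduledTransferPeriod="PT1M">
    <DataSource name="Application!*" />
    <DataSource name="System!*" />
  </WindowsEventLog>
</DiagnosticMonitorConfiguration>
```

Durations use the ISO 8601 form (PT1M is one minute, PT30S is thirty seconds).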
21.
// Requires: using System; using Microsoft.WindowsAzure.Diagnostics; using Microsoft.WindowsAzure.ServiceRuntime;
public override bool OnStart()
{
    // Create the DiagnosticMonitorConfiguration object to use for configuring the monitoring agent.
    DiagnosticMonitorConfiguration config = DiagnosticMonitor.GetDefaultInitialConfiguration();

    // Performance Counter configuration: sample total CPU every 30 seconds.
    // Note the backslashes in the counter path; without them the counter is not found.
    config.PerformanceCounters.DataSources.Add(new PerformanceCounterConfiguration
    {
        CounterSpecifier = @"\Processor(_Total)\% Processor Time",
        SampleRate = TimeSpan.FromSeconds(30)
    });
    config.PerformanceCounters.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    // Log configuration: transfer trace statements at Information level and above.
    config.Logs.ScheduledTransferLogLevelFilter = LogLevel.Information;
    config.Logs.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    // Event Log configuration: transfer Application and System events at Warning level and above.
    config.WindowsEventLog.DataSources.Add("Application!*");
    config.WindowsEventLog.DataSources.Add("System!*");
    config.WindowsEventLog.ScheduledTransferLogLevelFilter = LogLevel.Warning;
    config.WindowsEventLog.ScheduledTransferPeriod = TimeSpan.FromMinutes(1);

    // Start the diagnostic monitor with the new configuration.
    DiagnosticMonitor.Start("Microsoft.WindowsAzure.Plugins.Diagnostics.ConnectionString", config);
    return base.OnStart();
}
Impacts local agent only!
30. o Vital information
o Day-to-day operational data
31. Process for Azure hosted solutions is not that different from traditional, on-premises solutions.
Compute node resource usage
Windows Event logs
Database queries response times
Application specific exceptions
Database connection & cmd failures
Microsoft Azure Storage Analytics
Successful projects share one common trait... Not what you might think
Latest hot language
Hot platform
Smartest people
Agile vs. waterfall
Money
http://assets.bitnami.com/assets/windows_azure_logo-metro.png
http://technologiesreview.com/wp-content/uploads/2011/02/AWS_LOGO_CMYK.png
http://www.istockphoto.com/stock-photo-35165202-portrait-of-male-college-student.php?st=73c78c9
The #1 problem I see over & over
Multiple servers – more difficult to handle
Keep locally? Hard
What if a server dies?
Need a central location
Configure in Visual Studio
Show the declarative way
Show where the file is located – bin and root for Web and Worker respectively (D:\Projects\Demos\JustAzure\AzureDiagnostics\AzureDiagnostics\csx\Release\roles\WorkerRole1\approot)
Show in storage – show using AMS.
Show the file in blob storage (pghtechfest14)
Show viewing data in Visual Studio
Show LinqPad
Show AMS
Log all calls to external services
Include as much detail as possible (destination, method, timing info, result, etc.)
Log details of transient faults
Number of retry actions
Cause of the fault
Did the application fail over to a secondary instance?
Detect an emerging problem!
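The logging guidance above could be sketched as a small wrapper around each external call. This is a minimal illustration, not a production retry policy: ExternalCallLogger and Invoke are hypothetical names, every exception is treated as retryable, and there is no delay between attempts. It records destination, method, timing, result, the attempt number, and the cause of each fault using System.Diagnostics.Trace, which the diagnostic agent collects as trace log statements.

```csharp
using System;
using System.Diagnostics;

// Hypothetical wrapper that traces every external call and each retry.
public static class ExternalCallLogger
{
    // Runs `call`, retrying up to maxAttempts times in total.
    // Simplification: any exception is treated as a transient fault.
    public static T Invoke<T>(string destination, string method, Func<T> call, int maxAttempts = 3)
    {
        if (maxAttempts < 1) throw new ArgumentOutOfRangeException("maxAttempts");

        for (int attempt = 1; ; attempt++)
        {
            Stopwatch timer = Stopwatch.StartNew();
            try
            {
                T result = call();
                Trace.TraceInformation(
                    "External call {0}.{1} succeeded in {2} ms (attempt {3} of {4})",
                    destination, method, timer.ElapsedMilliseconds, attempt, maxAttempts);
                return result;
            }
            catch (Exception ex)
            {
                if (attempt >= maxAttempts)
                {
                    // Out of retries: log as an error and rethrow to the caller.
                    Trace.TraceError(
                        "External call {0}.{1} failed after {2} attempts in {3} ms: {4}",
                        destination, method, attempt, timer.ElapsedMilliseconds, ex.Message);
                    throw;
                }

                // Log the cause of the fault and the retry number.
                Trace.TraceWarning(
                    "External call {0}.{1} faulted on attempt {2} after {3} ms, retrying: {4}",
                    destination, method, attempt, timer.ElapsedMilliseconds, ex.Message);
            }
        }
    }
}
```

A call site would look like `ExternalCallLogger.Invoke("InventoryService", "GetStock", () => client.GetStock(id))`, where the service, method, and client are placeholders. A rising count of warning entries for one destination is the kind of signal that helps detect an emerging problem.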
Partition telemetry data by date (or hour) – reduce impact of data aggregation or reporting
Use a different storage account!
Remove old / non-relevant telemetry data
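One way to apply the partitioning and clean-up advice above is to derive an hourly partition key from each record's timestamp and to decide expiry from the key alone. This is a sketch under assumptions: the "yyyyMMddHH" key format, the TelemetryPartition class, and its method names are all hypothetical choices for your own telemetry store, not something the Azure diagnostics agent produces.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for partitioning custom telemetry by hour.
public static class TelemetryPartition
{
    private const string KeyFormat = "yyyyMMddHH";

    // Builds an hourly partition key such as "2014060713" from a timestamp.
    // Timestamps with Kind == Unspecified are treated as local time by ToUniversalTime().
    public static string HourlyKey(DateTime timestamp)
    {
        return timestamp.ToUniversalTime().ToString(KeyFormat, CultureInfo.InvariantCulture);
    }

    // Returns true when the whole partition is older than the retention window.
    public static bool IsExpired(string partitionKey, DateTime nowUtc, int retentionDays)
    {
        DateTime hour = DateTime.ParseExact(
            partitionKey,
            KeyFormat,
            CultureInfo.InvariantCulture,
            DateTimeStyles.AssumeUniversal | DateTimeStyles.AdjustToUniversal);
        return hour < nowUtc.AddDays(-retentionDays);
    }
}
```

Because expiry is decided per partition, a clean-up job can drop old data one hour at a time without scanning individual records, and reports can read only the partitions they need.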