Showbox is the new Exchange Data Center site, which monitors every Exchange server in the entire world. This site helps our engineers and on-call incident managers stay informed about the health and status of Exchange, and get it back up and healthy as quickly and efficiently as possible.
2. showbox scenarios
Alert > Assess > Act > Evaluate
Incident Management
Alert: Investigations are triggered from service thresholds, partner service teams, or from customer support. A page is sent to the On Call
Engineer's phone and the engineer Acknowledges (acks) the alert so it doesn't roll over to other on call staff.
Assess: The alert is read, and provides a place to start investigation. If the alert is not a problem the engineer can fix, they will Lateral it to
someone who can solve it. The scope of the issue is assessed. In the case of a database outage, the backup copies are checked.
Act: If the copies are good, the service is restarted. The Engineer waits until the service indicates it is back online. Alerts are monitored for
related problems and, when possible, the Engineer suppresses them so as not to wake up other Engineers unnecessarily. The Engineer goes
back to bed and does further investigation the following day.
Evaluate: Failover logs and debug scripts are launched for later root cause analysis. Bugs are edited and filed as appropriate.
If the impact is significant, an Incident Manager will be engaged.
Alert: IM Requests happen when there is a significant customer impact. These requests engage the Incident Manager, and the
Communication Manager.
Assess: The IM works with the On Call Engineer to assess the impact, informs the CM who publishes external posts to the public if
required. The IM will make the call when additional people need to be brought in, which may include partner teams, Ops, and other
engineers to diagnose and generate a recovery plan. Minutes count!
Act: The plan is put into play. Once recovery is completed, the service is monitored.
Evaluate: Post-mortem will be done.
3. showbox scenarios
Alert > Assess > Act > Evaluate
Customer Service Requests
Alert: A customer calls the Frontline Engineer, who must then verify who the customer is over the phone, usually by finding the primary
domain or the company name.
Assess: While the customer is relating their problem, CSS checks whether there are any known issues that might be impacting the domain.
These might be existing escalations, bugs, service issues, or known workarounds.
Act: If the call has a high impact or if the Frontline CSS cannot solve the problem, they escalate to the Escalation Engineer, who can fix the
problem or escalate to Engineering.
Evaluate: Always reviewed by the Customer Experience Team (CXP) monthly.
Change Request External
Alert: Escalation request is received.
Assess: The Escalation Engineer attempts to identify the correct recovery action
Act: A customer request such as moving a mailbox to another server is generated and goes through a triage process. The Frontline
Engineer is apprised of the status of the request and then contacts the company with the outcome.
Evaluate: Always reviewed by the Customer Experience Team (CXP) monthly.
Change Request Internal
Alert: A change request (a bug or upgrade) is communicated to the Engineer.
Assess: Engineering assesses the impact of the change.
Act: The change is rolled out and monitored closely for a period of time.
Evaluate: Analysis. Why did we need the change right away? Why wasn't it rolled out as a checked-in build? Why wasn't it automated?
4. technical ability varies
Scenarios focus on alert and overview in the portal. The portal is a great place for overviews and a starting place for deep dives.
[Figure: technical ability and tolerance scale, from LOW to OFF THE CHART: SMB admin, LORG admin, CSS, EE, SLT, Exchange engineer.]
5. reports for everyone
Chart gardens are where we are more open ended
M1 focus
[Figure: the same scale, from LOW to OFF THE CHART: SMB admin, LORG admin, CSS, EE, SLT, Exchange engineer. Portal pages: overviews answer the question of “How is my stuff doing?” Chart gardens are for troubleshooting.]
Configurable pop-outs are a useful scenario for:
• IMs
• On call engineers
• Product unit engineers
• …anyone doing deep comparative analysis of specific issues
6. anatomy of showbox
Scope control
Noun: a place or object within
the topology. Defined by the
scope control. Scope control
can also be changed by a
link within a chart or piece of
data.
Navigation
Supports key
scenarios, navigates to content
Content
Adjective: A
description, state, or other
information about the scoped
selection. Some links in content
can change scope
Actions
Verbs: Actions taken are in
relation to the scope and are a
reaction to the description.
Optimally actions like “Escalate
Now”, “Lateral”, and
“Communicate Now” include
the scope as an editable text
field in the form, and include
state information if possible to
make it easier on the
recipient, and to help alert mails
click through to the correct
scope
7. primary navigation
health escalation changes optics
Secondary Nav Overview Availability Customer Service Performance
Control Data block list + Rotator Tap Rotator Tap Rotator Tap Rotator Tap
Preview
Content Availability, Keynote, OWA, OLK, LiveID, OrgId, Auto CAS, Hub, MBX, more
Spike-o-meter Mobile+RIM, Mailflow, Discover Service, Auto
UM, EWS Discover Xml, OAB
Content changes for narrow scope:
Overview: Yes. Add domain-specific metrics, or for a server change to a role-specific view.
Availability: No. Just narrow scope to selection.
Customer: No. Just narrow scope to selection.
Service: No. Just narrow scope to selection.
Performance: Yes. Add domain-specific metrics, or for a server change to a role-specific view.
8. primary navigation
health escalation changes optics
Secondary Nav Alerts Support Calls People Directory Protocols
Control List+Preview+alert block List+Preview+alert block table table
Content Alerts Support calls Directory Protocols and PDFs
Content changes for When scoped to a server or
domain
scope
9. primary navigation
health escalation changes optics
Secondary Nav Overview Inventory Deployment Requests
Control Data block list + Preview Rotator Rotator List view and network in
details
Content Pivot+Network Timeline of future List of requests with request
changes+Network info and network chart in
details
Content changes for scope: Yes. Add domain-specific metrics, or for a server change to a role-specific view.
10. primary navigation
health escalation changes optics
Secondary Nav MSR Service Triage AdHoc More…
Control Link farm Link farm Query submission Pivot
Content Executive level reports that Useful perf counters and Place to submit a query to Everything in ESP now
summarize core service reports grouped by feature black box server and review
statistics areas for Engineers results
11. 2 key layouts
1. Overviews
A quick scan of the content should
answer the question, “Is there anything
wrong?”
Overviews should summarize the
contents of a primary navigational
area, and the general state of the
scoped selection.
If there are three additional secondary
nav tabs, each tab should be
represented in the overview. Quick
investigation on this page (preview)
should let the user drill down quickly to
a specific point of interest.
It contains:
• Data Blocks
• and a preview pane.
• Possibly a map too
This proposed design is being built for o365 Wave 15
12. 2 key layouts
2. List and preview
Standard UMC control with the addition of the alert block.
Use for Alerts, Support calls, Change requests
List is sortable, searchable, can be filtered, and can add and remove items.
Preview is highly configurable and can display custom layouts if needed, e.g. inject a chart or, as E14 Discovery did, insert a table
Visual treatment
UnAcked incident
Open incident
Resolved incident or alert
14. 3
Showbox ! Steven McQueen
region: all forest: all dag: all site: all copygroup: all server: all !
Secondary navigation 1 secondary navigation 2 secondary navigation 3 secondary
Primary navigation item 1 navigation 4
Primary navigation item 2
Primary navigation item 3
Primary navigation item 4
Content which is filtered by scope control
15. Using dropdowns to select scope parent: all node2: all node3: all node4: all node5: all leaf: all
parent1
parent: all node2: all node3: all node4: all node5: all leaf: all
all parents
parent1
parent2
parent3
parent4
parent1 node2-1
node2: all node3: all node4: all node5: all leaf: all
all node2
node2-1
node2-2
node2-3
node2-4
parent1 node2-1 node3: all node4: all node5: all leaf: all
16. Using type down and dot to parent: all node2: all node3: all node4: all node5: all leaf: all
advance to the next field
parent1
parent: all node2: all node3: all node4: all node5: all leaf: all
all parents
parent1
parent2
parent3
parent4
parent:node2-
parent1 . all node2: all node3: all node4: all node5: all leaf: all
all node2
node2-1
node2-2
node2-3
node2-4
parent:node2-1 . node3-1
parent1 . all node2: all node3: all node4: all node5: all leaf: all
all node3
node3-1
node3-2
node3-3
node3-4
parent1 node2-1 node3-1 node4: all node5: all leaf: all
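The type-down behaviour on this slide could be sketched roughly as follows. This is only an illustration under assumptions: the level names come from the wireframe, while `LEVELS`, `parse_typed_scope`, `render_scope`, and the rule that an untyped level stays at "all" are hypothetical and not taken from the Showbox implementation.

```python
# Hypothetical sketch: level names follow the wireframe, parsing rules are assumed.
LEVELS = ["parent", "node2", "node3", "node4", "node5", "leaf"]


def parse_typed_scope(typed):
    """Turn dot-separated input such as 'parent1.node2-1' into a scope dict."""
    parts = [part.strip() for part in typed.split(".")]
    if len(parts) > len(LEVELS):
        raise ValueError("more fields typed than there are scope levels")
    scope = {level: "all" for level in LEVELS}
    for level, value in zip(LEVELS, parts):
        if value:  # an empty field keeps the level at "all"
            scope[level] = value
    return scope


def render_scope(scope):
    """Format the scope the way the scope control displays it."""
    return " ".join(
        scope[level] if scope[level] != "all" else f"{level}: all"
        for level in LEVELS
    )


print(render_scope(parse_typed_scope("parent1.node2-1.node3-1")))
# prints: parent1 node2-1 node3-1 node4: all node5: all leaf: all
```

Each dot advances to the next field, so a partially typed path simply leaves the remaining levels at "all".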
17. Zooming in and out of data
Content area in UI shows the State 1
appropriate content for the
parent1 node2-1 node3-1 node4-2 node5: all leaf: all
selection.
Original state is not changed until
user explicitly changes it. selection
parent1 node2-1 node3-1 node4-2 node5: all leaf: all
State 2 parent1 node2-1 node3-1 node4-2 node5: all leaf: all
selection parent1 node2-1 node3-1 node4-2 node5: all leaf: all
State 3 parent1 node2-1 node3-1 node4-2 node5: all leaf: all
Explicit Change parent1 node2-1 node3-1 node4-2 node5: all leaf: all
…change continued parent1 node2-1
node2-1 node3-1 node4-2 node5: all leaf: all
all node2
node2-1
node2-2
node2-3
node2-4
State 1 parent: all node2-2 node3: all node4: all node5: all leaf: all
18. Rendering data clusters using
parentheses and simple Boolean parent1 node2-1 (node3-1 - node3-5) node4: all node5: all leaf: all
queries.
parent1 node2-1 (node3-1 , node3-5) node4: all node5: all leaf: all
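One possible reading of the two examples above is that `(a - b)` selects an inclusive range and `(a , b)` selects an explicit list. The sketch below expands a cluster expression under that assumption; `expand_cluster` is a hypothetical helper and the range/list interpretation is not confirmed by the slide.

```python
def expand_cluster(expr):
    """Expand '(node3-1 - node3-5)' or '(node3-1 , node3-5)' into node names."""
    text = expr.strip()
    if not (text.startswith("(") and text.endswith(")")):
        return [text]  # a plain node name selects just that node
    inner = text[1:-1]
    if "," in inner:
        # Assumed meaning of the comma: an explicit list of nodes.
        return [item.strip() for item in inner.split(",")]
    if " - " in inner:
        # Assumed meaning of the spaced dash: an inclusive numeric range.
        start, end = [item.strip() for item in inner.split(" - ")]
        prefix, _, low = start.rpartition("-")
        end_prefix, _, high = end.rpartition("-")
        if prefix != end_prefix:
            raise ValueError("range endpoints must share the same prefix")
        return [f"{prefix}-{n}" for n in range(int(low), int(high) + 1)]
    return [inner.strip()]


print(expand_cluster("(node3-1 - node3-5)"))
print(expand_cluster("(node3-1 , node3-5)"))
```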
19. Searching for an object State 1 parent: all node2: all node3: all node4: all node5: all leaf: all
query parent: all node2: all node3: all node4: all node5: all leaf: all
Leaf12-
parent: all node2: all node3: all node4: all node5: all leaf: all
Leaf12-1
Leaf12-10
Leaf12-11
Leaf12-12
resolution parent2 node2-2 node3-6 node4-4 node5-1 leaf12-10
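The search flow above (type a prefix, pick a suggestion, resolve to the full scope) might look like the sketch below. The lookup table is sample data: only the ancestors of leaf12-10 come from the slide, all other entries are invented for illustration, and `suggest` and `resolve` are hypothetical names.

```python
# Sample topology. Only the leaf12-10 ancestors are taken from the slide;
# the remaining entries are made up so the example has something to filter.
ANCESTORS = {
    "leaf12-1": ["parent2", "node2-2", "node3-6", "node4-4", "node5-1"],
    "leaf12-10": ["parent2", "node2-2", "node3-6", "node4-4", "node5-1"],
    "leaf12-11": ["parent2", "node2-2", "node3-6", "node4-4", "node5-1"],
    "leaf12-12": ["parent2", "node2-2", "node3-6", "node4-4", "node5-1"],
    "leaf13-1": ["parent1", "node2-1", "node3-1", "node4-2", "node5-3"],
}


def suggest(query):
    """Return leaf names that start with the typed text, ignoring case."""
    prefix = query.strip().lower()
    return sorted(name for name in ANCESTORS if name.startswith(prefix))


def resolve(leaf):
    """Fill in every ancestor level for the chosen leaf."""
    name = leaf.strip().lower()
    return " ".join(ANCESTORS[name] + [name])


print(suggest("Leaf12-"))
print(resolve("Leaf12-10"))
# prints: parent2 node2-2 node3-6 node4-4 node5-1 leaf12-10
```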
Searching for a clustering concept State 1 parent: all node2: all node3: all node4: all node5: all leaf: all
query parent: all node2: all node3: all node4: all node5: all leaf: all
Concept
parent: all node2: all node3: all node4: all node5: all leaf: all
resolution parent1 node2-1 (node3-1 , node3-5) Concept
21. data block
Stackable UX Lego blocks that for v1 will be organized statically within layouts; however, since the data they get is subject to the scope, the data will change appropriately.
How it works
Each instance of the data block is encoded with a superset of data fields. When there is no data for a label the label will not be shown, and when there is no data for the block the entire block is hidden.
The same commandlet is called from a given layout all the time, but since it is scoped by the UX different combinations of data can be returned.
This control supports flagging, links, trends, and simple tabular data layouts.
Layout
Data blocks should be fixed width, have a maximum of four columns and those columns should align with all the columns for critical data so they can be easily scanned.
Scope dictates the query; only fields which have data are shown.
Lots of labels for one data block: Tenants, Domains, Mailboxes, Active users, Sent mail, Etc…
Example: usage, scoped to all regions
Tenants +2% 486,012
Mailboxes +5% 82,763,121
Active -7% 8,453,454
Sent mail +10% 50 million
Example: usage, scoped to a tenant
Mailboxes +2% 1000
Active +5% 33
Sent mail -7% 321
Example: APC
Current 99.90% 9:00 AM
Low 96.76% 8:45 AM
Average 97.70% 1 hour
Example: availability
Current 99.90%
Low 96.76%
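The hiding rules for the data block (drop a label that has no data, hide the whole block when nothing has data) can be sketched as below. This is a minimal illustration, not the UMC control: `render_data_block` and the `SUPERSET` list are hypothetical, and the values are borrowed from the tenant-scoped usage example.

```python
# Assumed superset of labels for a "usage" block, based on the slide's label list.
SUPERSET = ["Tenants", "Domains", "Mailboxes", "Active users", "Sent mail"]


def render_data_block(title, superset, data):
    """Return the lines to display, or None when the block should be hidden."""
    rows = [
        (label, data[label])
        for label in superset
        if data.get(label) is not None  # labels without data are not shown
    ]
    if not rows:
        return None  # no data at all: hide the entire block
    return [title] + [f"{label} {value}" for label, value in rows]


tenant_scope = {"Mailboxes": 1000, "Active users": 33, "Sent mail": 321}
print(render_data_block("usage", SUPERSET, tenant_scope))
print(render_data_block("usage", SUPERSET, {}))
```

The same call is made for every scope; only the returned data differs, which is why the tenant-scoped block shows fewer rows than the all-regions block.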
25. stacker plot heatmap
How it works
• Every site is represented, and each site is represented only once
• Each chart has four selectable regions (black, red, yellow, green), which load the corresponding list view.
Pros
• Outliers are bigger, and more in focus
• Chart scales to very large data sets
• With list view, very meaningful information is available
• Groups of items with the same capacity are selectable
• Scope control is now the way to change scope instead of drilling down a lot from the chart, making back more difficult
Cons
• Harder to compare regions or other large groups, e.g. APC vs. NAM, Namprod01 vs. Namprod02
27. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health overview availability customer service performance
escalations
alerts availability and alert volume Updated: 2/9/2012 9:00AM
changes 4 active alerts APC
99.5
LAM
optics
availability 15min 1 hr NAM
EUR
Active monitoring 99.7% 99.5%
ESC
Keynote 99.5% 94.8%
Alerts
2
customer latency failures 95
Outlook +2% 5 7
Mobile +5% 14
8 AM 8:05 8:10 8:15 8:20 8:25 8:30 8:35 8:40 8:45 8:50 8:55 9 AM
Mailflow -7% 35
Provisioning +10% 3 TIME: 1h 2h 6h Custom
service latency failures
Network +2% 12
availability and alerts / 8:50am – 9:00am
Live ID +5% 3 REGION TYPE TIME AVAILABILITY
[SERVICE INCIDENT]
Monitoring -7% 4 NAM [SERVICE INCIDENT] keynote failures 12/9 9:12 95% ACP keynote failures
AD +10% 10 for connections via
NAM [RESOLVED INCIDENT] Quis nostrud 12/9 9:12 99% Singapore SingTel
FOPE -1% 2
OWNER
Datacenter - Ack Now!
Engage IM!
SCOPE
NAM/NAMPROD07/CH1PRO
D702/CH1PRD0702CA017
IMPACT
Outlook Connectivity
28. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health alerts support calls people directory protocols
escalations
changes
ALERT TIME OWNER
[SERVICE INCIDENT] ACP
optics [SERVICE INCIDENT] keynote failures for connections via Singapore SingTel 09/27 9:12 Pending – Datacen… keynote failures for connections
[SERVICE INCIDENT] This database has had only one good copy for 20 minutes 09/27 9:12 Jessed-High Availabi… via Singapore SingTel
[SERVICE INCIDENT] one healthy copy for TestADReplication: One or more 09/27 9:12 Jessed-High Availabi… !
[SERVICE INCIDENT] Lorem ipsum dolor sit amet, consectetur adipisicing elit, 09/27 9:12 Jessed-High Availabi…
[INVESTIGATION] Lorem ipsum dolor sit amet, consectetur adipisicing elit, 09/27 9:12 Jessed-High Availabi… owner:
Datacenter - Ack Now!
[INVESTIGATION] Ut enim ad minim veniam, quis nostrud exercitation ulla 09/27 9:12 Jessed-High Availabi…
Engage IM!
[INVESTIGATION] Quis nostrud exercitation ullamco laboris nisi ut aliquip ex 09/27 9:12 Jessed-High Availabi…
[RESOLVED INCIDENT] Quis nostrud exercitation ullamco laboris nisi ut aliqui 09/27 9:12 Jessed-High Availabi… scope:
CH1PRD0702CA017
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco 09/27 9:12 Jessed-High Availabi…
More…
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco laboris nisi 09/27 9:12 Jessed-High Availabi…
[RESOLVED INCIDENT] Quis nostrud exercitation ullamco laboris nisi ut aliqui 09/27 9:12 Jessed-High Availabi… impact
Tenants: 363
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco 09/27 9:12 Jessed-High Availabi…
Users: 8834
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco laboris nisi 09/27 9:12 Jessed-High Availabi… More…
[RESOLVED INCIDENT] Quis nostrud exercitation ullamco laboris nisi ut aliqui 09/27 9:12 Jessed-High Availabi…
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco 09/27 9:12 Jessed-High Availabi…
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco laboris nisi 09/27 9:12 Jessed-High Availabi…
[RESOLVED INCIDENT] Quis nostrud exercitation ullamco laboris nisi ut aliqui 09/27 9:12 Jessed-High Availabi…
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco 09/27 9:12 Jessed-High Availabi…
[RESOLVED INVESTIGATION] Quis nostrud exercitation ullamco laboris nisi 09/27 9:12 Jessed-High Availabi…
29. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health monthly service review service triage ad hoc more
escalations
monthly service review
changes
Key Usage Stats Server and Hardware
optics Total Mailbox Count and Active Mailbox Count Lorem Ipsum Dolor Sit and Consectetur Adipisicing Elit
Provisioning Support
Provisioning Latency and Failures, Tenant Growth by Offering, Lorem Ipsum Dolor Sit and Consectetur Adipisicing Elit
and Tenant Growth by Segment
Upgrades
Availability & Incidents Lorem Ipsum Dolor Sit and Consectetur Adipisicing Elit
Keynote Availability, SCOM Availability, and Availability
Incidents
Migration
Lorem Ipsum Dolor Sit and Consectetur Adipisicing Elit
Escalation Analysis
Top Escalations by Type, Top Root Causes
Site Resiliency
Lorem Ipsum Dolor Sit and Consectetur Adipisicing Elit
Networking, Directory & Capacity Heatmap
Migrations, Connections, Load Balancer, AD Health , and
Capacity Heatmap,
Build Release & Operations Scorecard
Build Release Scorecard , Operations Scorecard , and Data
Protection
30. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health overview availability customer service performance
escalations
Updated: 06/25/2011 9:00AM
changes
CAS CPU HUB CPU HUB IO MBX CPU MBX SPACE MBX IO AD CPU AD IO F5 CPU F5 MEM UM
optics
CAS CPU failures
SITE Failover Resource Value MBXs ACTVE MBXs DBs MACHINES
state
NAM06/SN2PRD0602 Unstable 79% 0 0 0 41/48 NAM01/SN2PRD0602
details
State: Provisioned
Version: R5
Build: 14.01.0225.071
More…
impact
Client Session Concurrency: 113,076
Deliveries/Sec: 267
related CAS CPU
SN2PRD0602
SN2PRD0102
CH1PRD0106
31. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health overview availability customer service performance
escalations
Updated: 06/25/2011 9:00AM
changes
CAS CPU HUB CPU HUB IO MBX CPU MBX SPACE MBX IO AD CPU AD IO F5 CPU F5 MEM UM
optics
CAS CPU at 60% capacity
SITE Failover state Resource Value MBXs ACTVE MBXs DBs MACHINES
NAM06/SN2PRD0602 Unstable 75% 0 0 0 41/48 NAM01/SN2PRD0602
NAM06/CH1PRD0602 Unstable 75% 0 0 0 41/48 details
NAM04/SN2PRD0402 Critical 65% 0 0 0 41/48 State: Provisioned
Version: R5
NAM04/CH1PRD0402 Warning 55% 0 0 0 41/48 Build: 14.01.0225.071
NAM06/SN2PRD0604 Warning 55% 0 0 0 41/48 More…
NAM02/SN2PRD0202 Warning 55% 0 0 0 41/48
impact
APC01/HKNPRD0102 Warning 55% 0 0 0 41/48 Client Session Concurrency: 113,076
Deliveries/Sec: 267
EUR01/AMSPRD0302 Warning 45% 0 0 0 41/48
NAM01/SN2PRD0102 Warning 45% 0 0 0 41/48 related
CAS CPU
SN2PRD0602
SN2PRD0102
CH1PRD0106
33. chart gardens
How it works
• Pop out to stock chart configurations, from
links in the page or from the chart drop
down.
• URL is visible and equals a parameterized
link to the visible configuration of charts.
This is an aid to IMs and Engineers who
want to get back to this view (add it to
favorites, copy it to an email) to get others
quickly up to speed on the thing they are
focused on.
• Charts can be modified,
• Allow users to add charts from the entire
suite of reports in Showbox
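The idea of a URL that equals the visible chart configuration could be sketched with the standard library as follows. Everything specific here is an assumption: the base address, the parameter names `scope`, `charts`, and `range`, and the comma-joined chart list are hypothetical and do not describe the real Showbox URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE_URL = "https://showbox.example/chartgarden"  # hypothetical address


def garden_url(scope, charts, time_range):
    """Encode the visible chart garden configuration into a shareable link."""
    query = urlencode(
        {"scope": scope, "charts": ",".join(charts), "range": time_range}
    )
    return f"{BASE_URL}?{query}"


def garden_config(url):
    """Recover the configuration from a link someone saved or emailed."""
    params = parse_qs(urlsplit(url).query)
    return {
        "scope": params["scope"][0],
        "charts": params["charts"][0].split(","),
        "range": params["range"][0],
    }


link = garden_url("NAM/NAMPROD07", ["CAS CPU", "CAS MEMORY"], "1h")
print(link)
print(garden_config(link))
```

Because the link round-trips to the same configuration, it can be added to favorites or pasted into an email. Chart names containing commas would need a different separator in this sketch.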
35. https://pod51005.outlook.com/showbox/CAS/ChartGardenx.aspx?pwmcid=1&ReturnObjectType=1
[Wireframe: chart garden pop-out window (Exchange, SharePoint, Lync) overlaid on the performance page. Scope control: region, forest, dag, site, copygroup, server (all). Time range: 1h, 3h, 8h, Custom. Charts: "CAS CPU TOP 5" and "CAS MEMORY TOP 5" (updated 06/25/2011 9:00AM, 8:00 am to 10:00), each with a site list (SITE, PERF, MACHINES; e.g. NAM01/SN2PRD0602 75% 41/48) and a details pane (failover state: unstable; State: Provisioned; Version: R5; Build: 14.01.0225.071; impact: Client Session Concurrency 113,076, Deliveries/Sec 267; related CAS CPU: SN2PRD0602, SN2PRD0102, CH1PRD0106, CHPRD0102). Actions: add charts, share, close.]
36. chart gardens plus
How it works
• More cowbell
• Don't start here; only use this if your scenarios typically require it.
37. https://pod51005.outlook.com/showbox/CAS/ChartGardenx.aspx?pwmcid=1&ReturnObjectType=1
[Wireframe: chart gardens plus. Scope control: region, forest, dag, site, copygroup, server (all). Side menu: CHARTS, DATA, HISTORY, MORE… Time range: 1h, 3h, 8h, Custom. Two stacked "CAS CPU TOP 5" charts (updated 06/25/2011 9:00AM, 8:00 am to 10:00), each with a site list (SITE, PERF; e.g. NAM01/SN2PRD0602 75%) and a details pane (failover state: unstable; State: Provisioned; Version: R5; Build: 14.01.0225.071; impact: Client Session Concurrency 113,076, Deliveries/Sec 267). Actions: add charts, share, close.]
38. chart library
How it works
• All charts that can be natively shown in showbox are available (excludes non-UMC optics)
• Major chart grouping/taxonomy needs to be rationalized across all teams in showbox
• Charts can be added as groups or individually
[Wireframe: "Chart Library" dialog in a Windows Internet Explorer window. Instruction: "Select individual charts or groups to add to your page." Categories: Active Directory, Availability, CAS, EAS, HUB, Mailbox, MRS. Selected group: CAS CPU, with charts Connections, Request rate per protocol, Memory, Related counters, OWA Logons, Keynote. Actions: save, cancel.]
40. Availability rotator model – how it works
The rotator tap works by allowing easy correlation of large amounts of data. Here are the key elements that make up the rotator:
Rotation – graphs will rotate as explained in the interaction model.
Hero chart – compares three or four data sets (overall availability and other data sets), with selectable regions which correspond with the list view and mini charts
Time – the rotator provides an 'in the now' view of data
Static – graphs will be static unless interacted with by the user
Pop out – Dependency on EDS to build the chart gardens – plan for Beta 3/12. Will allow adding more graphs for comparison in a new window (Early March timeframe)
Scale graphs – each graph will need to be in the hero spot as well as mini graph
Error handling – need to follow up with Sean and Srdjan
Time selection – width of selection area (for time) will remain constant, no matter what the time scale is scoped to (see example). Selection will change on mini charts to match hero chart selection.
Legend links – the legend links that are locations/regions etc. will be links to change scope
List view – title and data correspond with selection on hero chart
Mini charts – Quickly scannable for patterns or outliers, can quickly move into the hero spot for deeper analysis (on click)
[Figure: rotator layout with numbered callouts 1 to 5, showing the hero chart, list view, and mini charts/data blocks.]
41. Availability interaction model
Interactions:
Hero chart
• Clickable time increments (shaded area)
• Wish list is to have the shaded selected area expand and collapse on mouse drag (e.g. think stock charting timelines) (future)
• User can change the time scope of the chart – mini charts also update to this time
• Click on the pop out icon to open a new window that contains the charts and allows for addition of others
• Legend locations are links – allowing for quick scope change
List view
• Clicking selects and highlights the item, providing details for that incident, links/actions
• User can ack an incident
• List can be filtered
Mini charts/data blocks
• Click on the desired chart and it will move (rotate) into the hero spot (carousel counter-clockwise)
Other
• Custom time range can go from 15 min to ???
• Max of “X” mini charts in the rotator; if the user wants to see more charts they must open up in a chart garden
• Animation will fade out charts and data when the actual 'rotation' of the charts occurs to minimize noise and confusion.
Order of rotation commands
1. Clear
2. Rotate
3. Render chart
4. Render list view
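The order of rotation commands above can be sketched as a small click handler. This is an illustration only: it assumes the hero chart is the first entry of the carousel and that a click rotates the ring until the clicked chart reaches that position; `rotate_to_hero`, `handle_click`, and the chart names are hypothetical, and which on-screen direction counts as counter-clockwise depends on the layout.

```python
from collections import deque


def rotate_to_hero(charts, clicked):
    """Rotate the carousel so the clicked chart lands in the hero spot (index 0)."""
    ring = deque(charts)
    ring.rotate(-ring.index(clicked))  # shift left until clicked is first
    return list(ring)


def handle_click(charts, clicked):
    """Apply the four commands in order and record what was done."""
    steps = ["clear"]                            # 1. Clear
    charts = rotate_to_hero(charts, clicked)
    steps.append("rotate")                       # 2. Rotate
    steps.append(f"render chart: {charts[0]}")   # 3. Render chart
    steps.append(f"render list view: {charts[0]}")  # 4. Render list view
    return charts, steps


carousel = ["availability", "CTPs", "STPs", "mailflow", "networking"]
new_order, log = handle_click(carousel, "STPs")
print(new_order)
print(log)
```

Rotating the whole ring, rather than swapping two charts, keeps the mini charts in a stable relative order across clicks.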
42. Time selection on hero chart
The time selection width will remain the same regardless of what timeframe you choose. The only change will be the specific time span you will see in the list view.
The selection on the hero chart will be reflected in the mini charts.
1 hour timeframe: 10 min selection
8 hour timeframe: 80 min selection
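The two figures on this slide (10 minutes for 1 hour, 80 minutes for 8 hours) are consistent with a selection that always covers one sixth of the visible timeframe. The sketch below works through that arithmetic; the one-sixth ratio is inferred from those two data points rather than stated in the deck, and the function names are hypothetical.

```python
from datetime import datetime, timedelta

SELECTION_DIVISOR = 6  # inferred: 60 / 10 == 480 / 80 == 6


def selection_minutes(timeframe_minutes):
    """Time span covered by the fixed-width selection for a given timeframe."""
    return timeframe_minutes / SELECTION_DIVISOR


def selection_window(end, timeframe_minutes):
    """Start and end of the selection when it sits at the right edge of the chart."""
    start = end - timedelta(minutes=selection_minutes(timeframe_minutes))
    return start, end


print(selection_minutes(60))   # 1 hour timeframe
print(selection_minutes(480))  # 8 hour timeframe
print(selection_window(datetime(2012, 2, 9, 9, 0), 60))
```

With a 1 hour timeframe ending at 9:00 AM, the window starts at 8:50 AM, which lines up with the "8:50am – 9:00am" list view title used in the wireframes.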
43. 3
Showbox ! Steven McQueen
!
parent: all node2: all node3: all node4: all node5: all leaf: all
health overview availability customer service performance
escalations
alerts CTPs
changes 4 active alerts availability latency failures
100% -2% 112
optics
availability 15min 1 hr
Active monitoring 99.7% 99.5% STPs
Keynote 99.5% 94.8%
99.2% 6% 36
customer latency failures network
Outlook +2% 5
65% 17% 154
Mobile +5% 14
Mailflow -7% 35
Provisioning +10% 3
mailflow
service latency 78% -5% 1,018
failures
Network +2% 12
Live ID +5% 3
Monitoring -7% 4
AD +10% 10
FOPE -1% 2
44. 3
Showbox ! Steven McQueen
!
region: all forest: all site: all dag: all server: all
health overview availability customer service performance
escalations
availability and alert volume Updated: 2/9/2012 9:00AM CTPs
changes APC
99.5 4.2
LAM
optics NAM
EUR
ESC
8AM 8:15 830 845 9AM
Alerts
2
STPs
95
57
7
8 AM 8:05 8:10 8:15 8:20 8:25 8:30 8:35 8:40 8:45 8:50 8:55 9 AM
TIME: 1h 2h 6h 24hr 1wk
8AM 8:15 830 845 9AM
mailflow
availability and alerts 8:50am – 9:00am 174
REGION TYPE TIME AVAILABILITY [SERVICE
INCIDENT] ACP
NAM [SERVICE INCIDENT] keynote failu 12/9 9:12 95%
keynote failures for
NAM [RESOLVED INCIDENT] Quis nostrud 12/9 9:12 99% connections via 8AM 8:15 830 845 9AM
Singapore SingTel networking
People:
Owner
5,236
Datacenter - Ack Now!
IM:
Engage IM!
8AM 8:15 830 845 9AM
MORE
Scope:
45. 3
Showbox ! Steven McQueen
!
region: all forest: all site: all dag: all server: all
health overview availability customer service performance
escalations
availability and alert volume Updated: 2/9/2012 9:00AM CTPs
changes APC
99.5 4.2
LAM
optics NAM
EUR
ESC
8AM 8:15 830 845 9AM
Alerts
2
STPs
95
57
7
8 AM 8:05 8:10 8:15 8:20 8:25 8:30 8:35 8:40 8:45 8:50 8:55 9 AM
TIME: 1h 2h 6h 24hr 1wk
8AM 8:15 830 845 9AM
mailflow
availability and alerts 8:30am – 8:40am 174
REGION TYPE TIME AVAILABILITY [SERVICE
INCIDENT] outlook
NAM [SERVICE INCIDENT] outlook failu 12/9 9:12 99%
failures for
connections via 8AM 8:15 830 845 9AM
Singapore SingTel networking
People:
Owner
5,236
Datacenter - Ack Now!
IM:
Engage IM!
8AM 8:15 830 845 9AM
MORE
Scope:
47. 3
Showbox ! Steven McQueen
!
region: NAM forest: all site: all dag: all server: all
health overview availability customer service performance
escalations
availability and alert volume Updated: 2/9/2012 9:00AM CTPs
changes NAMPROD01
99.5 4.2
NAMPROD02
optics NAMPROD03
NAMPROD04
NAMPROD05
8AM 8:15 830 845 9AM
NAMPROD06
2 NAMPROD07
STPs
95
NAMPROD08
ESC 57
7 Alerts
8 AM 8:05 8:10 8:15 8:20 8:25 8:30 8:35 8:40 8:45 8:50 8:55 9 AM
TIME: 1h 2h 6h 24hr 1wk
8AM 8:15 830 845 9AM
mailflow
availability and alerts 8:50am – 9:00am 174
REGION TYPE TIME AVAILABILITY [SERVICE
INCIDENT] outlook
NAM [SERVICE INCIDENT] MB server 12/9 9:12 99%
failures for
NAM [INVESTIGATION] Quis nostrud 12/9 9:15 100% connections via 8AM 8:15 830 845 9AM
Singapore SingTel networking
People:
Owner
5,236
Datacenter - Ack Now!
IM:
Engage IM!
8AM 8:15 830 845 9AM
MORE
Scope: