A presentation titled "Logging and Monitoring APIs" that Heather O'Sullivan, Dan Cundiff, and Eric Helgeson from Target Corporation gave at API Strategy & Practice Conference 2013.
Blog post about the conference and the presentation we gave: http://seasidetea.blogspot.com/2013/03/our-presentation-at-api-strategy.html
2. Context: APIs @ Target
• RESTful APIs
• APIs across all domains in our business
• Products (inventory, price, description, etc.)
• Locations
• Promotions, etc.
• Used by mobile, applications, partners on the outside, etc.
• Constantly evolving, rapidly improving, all the time
3. API story: Guest price matching an Xbox
APIs behind the API Gateway: Catalogs, Daily Deals, Products, Product Availability, Locations
Step | Guest Experience | API | Result
1 | Opens Target Mobile App | Catalogs, Daily Deals |
2 | Scans barcode | Products |
3 | Views product page | Products | Xbox item description, image and price of $179
4 | Checks store availability | Products, Product Availability | Xbox available at the Nicollet Mall Target
5 | Views store location on map | Locations | Map
6 | LOD checks price on iPad Price Checker app | Products | Validates $179 price, instead of $189 that was listed on the Xbox in-store
4. Problem
• First API go-live:
• Millions of log events per day, logs everywhere
• Needed end to end visibility of APIs
• Needed ability to discover information in logs
• Can we be proactive? React faster?
• Looming horizon:
• BILLIONS of log events coming
• Questions changing every day from business, ops, execs, developers
5. Log all the things
• Consumer apps
• Provider systems
• External API gateway logs
• Anything in between (OS, firewalls, proxies)
• Correlate with logs from apps a few degrees removed (e.g. .com web logs)
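Correlating logs from several sources can be sketched as below. This is only an illustration, not how the team did it: it assumes every system stamps its events with a shared `request_id`, and the event fields and sample values are made up for the example.

```python
from collections import defaultdict

# Hypothetical events from three log sources. The field names
# (source, request_id, ts, msg) are assumptions for this sketch.
EVENTS = [
    {"source": "mobile-app", "request_id": "r-1", "ts": 1.00, "msg": "GET /products"},
    {"source": "gateway", "request_id": "r-1", "ts": 1.02, "msg": "200 OK"},
    {"source": "provider", "request_id": "r-1", "ts": 1.01, "msg": "lookup item"},
    {"source": "gateway", "request_id": "r-2", "ts": 2.00, "msg": "500 error"},
]


def correlate(events):
    """Group events by request_id and order each group by timestamp."""
    traces = defaultdict(list)
    for event in events:
        traces[event["request_id"]].append(event)
    return {rid: sorted(group, key=lambda e: e["ts"]) for rid, group in traces.items()}


traces = correlate(EVENTS)
for rid, trace in traces.items():
    # One line per request, showing the path it took through the systems.
    print(rid, " -> ".join(e["source"] for e in trace))
```

A request that shows up at the gateway but never at the provider (like `r-2` here) is exactly the kind of gap that is invisible when each log is read on its own.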
6. Metrics for APIs
• Traffic Metrics: total calls, top methods, call chains, quota faults
• Service Metrics: performance, availability, error rates, code defects
• Support Metrics: support tickets, response time, community metrics
• Developer Metrics: total developer count, number of active developers, top developers, trending apps, retention
• Marketing Metrics: developer registrations, developer portal funnel, traffic sources, event metrics
• Business Metrics: direct revenue, indirect revenue, market share, costs
(source: http://blog.programmableweb.com/2012/08/02/the-api-measurement-secret-know-what-metrics-matter/)
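A few of the traffic and service metrics above can be derived straight from call logs. The sketch below is a minimal illustration with invented sample data; the record fields (`method`, `status`, `ms`) are assumptions, and treating HTTP 429 as a quota fault is one possible convention, not something the slides specify.

```python
from collections import Counter

# Hypothetical call records; field names and values are made up.
CALL_LOG = [
    {"method": "GET /products", "status": 200, "ms": 40},
    {"method": "GET /products", "status": 200, "ms": 60},
    {"method": "GET /locations", "status": 500, "ms": 900},
    {"method": "GET /products", "status": 429, "ms": 5},
]


def summarize(calls):
    """Compute total calls, top methods, error rate, quota faults, and mean latency."""
    total = len(calls)
    server_errors = sum(1 for c in calls if c["status"] >= 500)
    return {
        "total_calls": total,
        "top_methods": Counter(c["method"] for c in calls).most_common(3),
        "error_rate": server_errors / total if total else 0.0,
        # Assumption: the gateway answers 429 when a consumer exceeds its quota.
        "quota_faults": sum(1 for c in calls if c["status"] == 429),
        "avg_ms": sum(c["ms"] for c in calls) / total if total else 0.0,
    }


summary = summarize(CALL_LOG)
print(summary)
```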
15. Monitoring API Development
• Story: Practice code as documentation. On every commit, Jenkins runs, extracts documentation from the code, and puts it in the respective wiki pages (automated, no humans)
• Monitor documentation changes using the MediaWiki API
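One way to watch documentation changes is the MediaWiki `recentchanges` list query. The sketch below only builds the request URL and parses a response of the documented shape; the wiki address, the sample response, and the `jenkins` user are hypothetical, and no network call is made.

```python
import json
from urllib.parse import urlencode

# Hypothetical wiki endpoint; replace with your own api.php URL.
WIKI_API = "https://wiki.example.com/api.php"


def recent_changes_url(limit=10):
    """Build a MediaWiki API URL listing the most recent page changes."""
    params = {
        "action": "query",
        "list": "recentchanges",
        "rcprop": "title|timestamp|user",
        "rclimit": limit,
        "format": "json",
    }
    return WIKI_API + "?" + urlencode(params)


def changed_titles(response_text):
    """Extract the page titles from a recentchanges JSON response."""
    data = json.loads(response_text)
    return [change["title"] for change in data["query"]["recentchanges"]]


# Made-up response body in the shape the recentchanges query returns.
SAMPLE_RESPONSE = json.dumps({"query": {"recentchanges": [
    {"title": "Products API", "timestamp": "2013-03-01T12:00:00Z", "user": "jenkins"},
]}})

print(recent_changes_url(5))
print(changed_titles(SAMPLE_RESPONSE))
```

Feeding the parsed changes into the same log pipeline as everything else puts documentation activity next to the API metrics.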
17. Business intelligence from APIs
• Where are people searching?
• Where should we build our next store?
• How far are people traveling?
• What time of day?
• Mobile vs website?
• iOS vs Android?
• International?
[H] Here’s the context for all the material that follows. “Enterprise Services” program is all about…
[H] Guest is at a competing retailer and sees an Xbox that he wants to buy. It’s $199, but he thinks he can get it cheaper at Target.
1, 2: Guest uses iPhone to scan the Xbox bar code; price comes back at $179! The Catalogs API, Daily Deals API, and a few others display the content on the first page and as you start to navigate.
3: The Products API displays the detail around the Xbox item, including description, image, and price. The guest likes the idea of saving $20, but they’re not sure where the closest Target is or if there are any available at the store.
4, 5: Product Availability API shows our Guest that the Xbox is available at the downtown Target and, with another click, a map is displayed using the Locations API. So far, we’ve seen how 4 APIs are used on the iPhone app. If Guest’s buddy had an Android, they would be calling the same APIs and the same data would be displayed. Functionality is developed once and leveraged across multiple platforms.
6: Guest arrives at Target, but is surprised to see the Xbox marked at $189 vs. the $179 he had expected. A team member sees the Guest and asks if she can help them find something. Using the Price Checker app on her iPad, the team member confirms the $179 price and helps the Guest check out. Again, leveraging the Products API.
[E] Logs scattered everywhere = complex ecosystem. Looming horizon = data explosion. Story: going live, millions of hits start coming in, try to figure out what is actually happening.
[E] The more you log, the more complete the monitoring picture can be.
[D] Now that we have all of those logs, we can measure things.
[E] Have to know the profile of your APIs over time to understand. Have to know what's normal to know what’s wrong.
[E] APIs behave differently over time, monitor over time. See the problem before it happens.
[E] Same thing for errors, normalize over time; what’s normal and what do you need to investigate further. Batch job that runs every day at 2am? Does that affect our APIs?
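Knowing "what's normal" can be approximated with a per-hour baseline. The sketch below is one simple approach (a z-score against past observations for the same hour of day), not the method used in the talk; the history values and the threshold of 3 standard deviations are assumptions.

```python
from statistics import mean, stdev

# Hypothetical history: error counts seen at a given hour on previous days.
# Hour 2 is noisy because of a nightly batch job; hour 14 is usually quiet.
HISTORY = {
    2: [50, 55, 48, 52, 51],
    14: [5, 4, 6, 5, 5],
}


def is_anomalous(hour, observed, history, threshold=3.0):
    """Flag a count that deviates strongly from the baseline for that hour."""
    samples = history[hour]
    mu = mean(samples)
    sigma = stdev(samples)
    if sigma == 0:
        # No variation in the baseline: any different value stands out.
        return observed != mu
    return abs(observed - mu) / sigma > threshold


# The same count is normal at 2am but a problem mid-afternoon.
print(is_anomalous(2, 53, HISTORY))
print(is_anomalous(14, 53, HISTORY))
```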
[E] A list of consumers across all APIs over a 24 hour period. Story: Identify bad API key before the developer knew what was wrong.
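Spotting a bad API key from a consumer list could look like the sketch below: count calls and failures per key and flag keys whose failure rate is unusually high. The key names, status codes, and both cut-offs (`min_calls`, `max_error_rate`) are assumptions chosen for the example.

```python
from collections import defaultdict

# Hypothetical (api_key, http_status) pairs from a 24 hour window.
KEY_CALLS = [
    ("key-a", 200), ("key-a", 200),
    ("key-b", 401), ("key-b", 401), ("key-b", 401),
    ("key-c", 200), ("key-c", 500),
]


def suspicious_keys(calls, min_calls=3, max_error_rate=0.5):
    """Return keys with enough traffic and a failure rate above the limit."""
    totals = defaultdict(int)
    failures = defaultdict(int)
    for key, status in calls:
        totals[key] += 1
        if status >= 400:
            failures[key] += 1
    return sorted(
        key for key in totals
        if totals[key] >= min_calls and failures[key] / totals[key] > max_error_rate
    )


print(suspicious_keys(KEY_CALLS))
```

The `min_calls` floor keeps a single failed request from a low-volume consumer from raising an alert.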
[E] What APIs are popular, which ones are growing or shrinking, where do we need to look at scaling?
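Growth or shrinkage per API can be estimated by comparing call counts between two periods. This is a rough sketch with invented numbers; the API names and the choice to report `None` for an API with no earlier traffic are assumptions.

```python
# Hypothetical call counts per API for two consecutive weeks.
LAST_WEEK = {"products": 1000, "locations": 400}
THIS_WEEK = {"products": 1500, "locations": 300, "promotions": 50}


def growth(previous, current):
    """Return the relative change in calls per API between two periods."""
    result = {}
    for api in sorted(set(previous) | set(current)):
        before = previous.get(api, 0)
        after = current.get(api, 0)
        # A brand-new API has no baseline, so relative growth is undefined.
        result[api] = None if before == 0 else (after - before) / before
    return result


for api, change in growth(LAST_WEEK, THIS_WEEK).items():
    label = "new" if change is None else f"{change:+.0%}"
    print(api, label)
```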
[D] API might seem healthy but underneath the infrastructure might not be. Monitor it too!
[D] Monitor your development too! Git, Jenkins, etc. The small things add up.
[D] A big story to draw you in! Anonymized lat/long data of guests searching for stores in the last 15 minutes. If a store wasn’t near those 61 people in Idaho, did they go somewhere else to buy Tide, diapers, or socks? Conceptually, maybe we should build a store there (we don’t actually plan our stores with a sole data point like that, but it gives you an idea).
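Aggregating anonymized search coordinates can be sketched by snapping each lat/long to a grid cell and counting searches per cell. The cell size of one degree and the sample points are made up; this is an illustration of the idea, not the dashboard from the talk.

```python
import math
from collections import Counter


def cell(lat, lon, size=1.0):
    """Snap a coordinate to the south-west corner of its grid cell."""
    return (math.floor(lat / size) * size, math.floor(lon / size) * size)


# Hypothetical anonymized store searches (lat, lon) from the last 15 minutes.
SEARCHES = [(43.6, -116.2), (43.7, -116.3), (44.97, -93.27)]

counts = Counter(cell(lat, lon) for lat, lon in SEARCHES)
for corner, n in counts.most_common():
    print(corner, n)
```

Comparing busy cells against the list of existing store locations would show where guests search but no store is nearby.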
[D] Global dashboard summarizing all APIs; BI dashboards; executive dashboards; environment dashboards for each API (CI, Test, Stage, Prod); alert trending dashboards for each API.