How do you build a general purpose data collection tool that can be integrated into 100+ sites, handle 15,000 1K writes/second, and have nearly zero downtime? How do you do that and have an architecture virtually unchanged for 3+ years? Start with using Mongo. Michael Makunas and Ramesh Nuthalapati from Viacom Media Networks, home of MTV, Comedy Central, Nickelodeon, and dozens of other brands, detail the architecture of the voting, polling, and data collection system that powers everything from voting for the MTV Video Music Awards, to sliming celebrities at the Nickelodeon Kids Choice Awards, to contests to meet Miley Cyrus.
2. Who Are We?
Michael Makunas
Director - Multiplatform Engineering
@mmakunas
Ramesh Nuthalapati
Architect - Multiplatform Engineering
@nuthalapati
3. Who Are We?
Viacom Media Networks
Viacom creates original entertainment content for every
audience, on every screen around the world. The company's
brands strive for creative excellence and cultural impact across
programming, motion pictures and digital experiences that live
online, on mobile, and via apps.
4. Today We Are Talking About...
Funnel
Viacom’s internal high volume polling, voting, and data
collection service.
5. …and How MongoDB Just Works.
• Launched in 2011 on 2.0
• Currently on 2.4
6. Use Cases
Award Show Voting
•MTV Video Music Awards (VMAs)
•MTV Europe Music Awards (EMAs)
•Kids Choice Awards (KCAs)
•CMT Music Awards
Polling
•MTV March Musical Madness
•Spike Bellator On Air Match Voting
Contests
•MTV Ultimate Fan Experience
7. What is “High Volume” to Viacom?
•From big to little, continuous “horizontal” support across
100+ global sites
•Voting, polling, and data collection occurs 24/7
•No “maintenance windows”
•Sudden spikes to 15K+ votes per second from on-air
“Throws” during prime time viewing: “Go to the site and
Vote Now!”
•Storage of 100s of millions of votes for analysis after events
10. Statistics (Pushing The Limit)
2015 Kids Choice Awards
•531 Million votes (a new record!)
•7 Million votes during the show
11. Some Definitions
Funnel
•Our internal name for our system
Site
•A brand and/or site specific namespace for collecting data
Tent-pole
•A major programming (entertainment) event
Collection
•An individual poll, voting event, sweepstakes, etc.
•NOT a MongoDB Collection
Entry
•A single vote or poll entry
Processor
•Performs an action on an entry (validates it, persists it, etc.)
13. Why Did We Build It?
•Previous systems unmanageable
•One system for data collection
•Contests, sweepstakes, small polls
•Built in-house with ATG Dynamo and MySQL
•Per-site deployment model meant multiple installations per
brand
•Flexible schema == MySQL “XML blobs”
•Separate system for high volume polling
•Commercial
•Lack of customization
•Per-brand installation
14. Why Did We Choose Mongo?
•Flexible-schema, document-style database provides the
quick-to-market support needed by 100+ sites, brands, and
countries (all of which want things “just a little bit different”)
•Proven scalability
•JSON in and JSON out
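To make the "flexible schema" and "JSON in and JSON out" points concrete, here is a minimal sketch of two entries with different shapes passing through the same serialization path. The field names (`site`, `collection`, `choice`, `email`, `answers`) and the helper functions are hypothetical and chosen only for illustration; the talk does not show Funnel's actual document layout.

```python
import json

# Hypothetical entries: field names are illustrative, not Funnel's real schema.
# A vote and a contest entry carry different fields, yet need no schema change.
vote = {"site": "example-site", "collection": "example-poll", "choice": "option-1"}
contest = {
    "site": "example-site",
    "collection": "example-contest",
    "email": "fan@example.com",
    "answers": ["a", "c"],
}


def to_wire(entry: dict) -> str:
    """Serialize an entry to JSON, as a client might submit it."""
    return json.dumps(entry, sort_keys=True)


def from_wire(payload: str) -> dict:
    """Parse a JSON payload back into a document-shaped dict."""
    return json.loads(payload)
```

Because both entries are plain documents, a site that wants things "just a little bit different" only adds or omits fields; nothing in the serialization path has to change.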
15. Architectural Goals
• High write throughput over read throughput and
consistency
• Handle up to 10,000 1KB writes per second. Read
throughput is secondary
• High Availability
• Replace dozens of separate installations and services
• Wide adoption across internal brands
17. Server Layout
• High availability
• Primary, Secondary configuration with automatic
failover
• Primary serves all reads and writes; the secondary stays
in sync with the primary
• Monitoring with MongoDB Management Service (MMS)
• Email alerts from MMS for high-volume insert operations
21. Processors
• Each entry received by Funnel is handled by an ordered list of
processors (the ProcessorChain). As each processor in the list is visited, it
decides whether to continue to the next one or to terminate
further processing
• A Processor can raise an error, and its detail message is configurable
and localizable. When an error is raised, it can signal the system to take
one of a few actions, depending on the severity of the error condition:
• Log the error but continue processing
• Bubble up the error to the client and cease further processing
• Ignore the error, cease further processing, and return an OK response
to the client
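The chain behavior described above can be sketched roughly as follows. This is an illustration under assumptions, not Funnel's actual code: the names `Severity`, `ProcessorError`, `Result`, `run_chain`, `require_choice`, and `make_persister` are all hypothetical, and a processor is modeled as a plain function that returns `True` to continue or `False` to stop.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Callable, List


class Severity(Enum):
    """Hypothetical names for the three error actions on the slide."""
    LOG_AND_CONTINUE = "log"     # log the error, keep processing
    FAIL = "fail"                # return the error to the client, stop
    SILENT_STOP = "silent_stop"  # stop, but still answer OK


class ProcessorError(Exception):
    def __init__(self, message: str, severity: Severity):
        super().__init__(message)
        self.severity = severity


@dataclass
class Result:
    ok: bool
    message: str = "OK"


# A processor returns True to continue to the next one, False to stop.
Processor = Callable[[dict], bool]


def run_chain(entry: dict, chain: List[Processor], log: List[str]) -> Result:
    """Visit processors in order, applying the severity rules on errors."""
    for processor in chain:
        try:
            if not processor(entry):
                break  # processor chose to terminate the chain
        except ProcessorError as err:
            if err.severity is Severity.LOG_AND_CONTINUE:
                log.append(str(err))
                continue
            if err.severity is Severity.FAIL:
                return Result(ok=False, message=str(err))
            return Result(ok=True)  # SILENT_STOP: hide the error
    return Result(ok=True)


def require_choice(entry: dict) -> bool:
    """Example validator: an entry must carry a 'choice' field."""
    if "choice" not in entry:
        raise ProcessorError("missing choice", Severity.FAIL)
    return True


def make_persister(store: List[dict]) -> Processor:
    """Example persister: appends to an in-memory list, not a database."""
    def persist(entry: dict) -> bool:
        store.append(entry)
        return True
    return persist
```

With `chain = [require_choice, make_persister(store)]`, a valid entry is stored and answered with OK, while an entry without a `choice` field is rejected before it reaches the persister.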
30. Optimizations
We didn’t get it perfect the first time:
• Correct indexes: index on the created date of the Entry collection.
ensureIndex is called during collection creation and
update
• Batch update of vote tallies: update the count once per second
instead of once per Entry
• Hourly stats: pre-calculating hourly totals saved resources
during live events when getting stats for the past 24 hours, 7 days, and
30 days
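The batch tally update can be illustrated with a small in-memory buffer that accumulates votes and hands the summed counts to a flush callback about once per second. This is a sketch under assumptions: `TallyBuffer`, its parameters, and the callback shape are hypothetical, and the talk does not say how Funnel implements the batching. In a MongoDB-backed version, the callback might issue one `$inc` update per choice instead of one update per entry.

```python
import time
from collections import Counter
from typing import Callable, Dict


class TallyBuffer:
    """Accumulates vote counts and flushes them at most once per interval.

    Hypothetical helper: flushing is triggered by incoming votes, so a
    real service would also flush on a timer and at shutdown.
    """

    def __init__(
        self,
        flush: Callable[[Dict[str, int]], None],
        interval: float = 1.0,
        clock: Callable[[], float] = time.monotonic,
    ):
        self._flush = flush
        self._interval = interval
        self._clock = clock
        self._pending: Counter = Counter()
        self._last_flush = clock()

    def record(self, choice: str) -> None:
        """Count one vote; flush if the interval has elapsed."""
        self._pending[choice] += 1
        if self._clock() - self._last_flush >= self._interval:
            self.flush()

    def flush(self) -> None:
        """Send the pending counts downstream as one batch, then reset."""
        if self._pending:
            self._flush(dict(self._pending))
            self._pending.clear()
        self._last_flush = self._clock()
```

The trade-off is that displayed tallies can lag by up to one interval, which fits the stated goal of favoring write throughput over read throughput and consistency.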
31. SPAM Prevention
•De-duplicator processor: caps entries at X per
defined period (e.g., 30 entries per minute).
Internal to the Funnel application
•Verisign DDoS Protection service
•Akamai WAF (part of Kona Web Application Firewall)
•CAPTCHA
•Authentication (FLUX - Viacom Internal System)
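The de-duplicator's "X entries per defined period" rule can be sketched as a sliding-window limiter. Everything here is an assumption made for illustration: the class name `EntryLimiter`, the idea of keying on a caller-supplied string (for example a user ID or IP address), and the sliding-window approach itself. The talk gives only the example limit of 30 entries per minute.

```python
from collections import defaultdict, deque
from typing import Deque, Dict


class EntryLimiter:
    """Allows at most max_entries per key within a sliding time window."""

    def __init__(self, max_entries: int = 30, window_seconds: float = 60.0):
        self.max_entries = max_entries
        self.window = window_seconds
        # Timestamps of accepted entries, oldest first, per key.
        self._seen: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, key: str, now: float) -> bool:
        """Return True and record the entry if the key is under its limit."""
        stamps = self._seen[key]
        # Discard timestamps that have aged out of the window.
        while stamps and now - stamps[0] >= self.window:
            stamps.popleft()
        if len(stamps) >= self.max_entries:
            return False
        stamps.append(now)
        return True
```

A processor built on this would call `allow(key, time.time())` for each entry and stop the chain when it returns `False`. The in-memory state is per process, so a multi-server deployment would need shared storage for the counters.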
32. Before & After
Previous Systems
•Decentralized
•Multiple Installations
•Outdated App Server
•Rigid schema requiring rigid change control process
Funnel
•Service Oriented
•Performant
•Flexible schema; easily extended