Over the last few years, the ad industry has been decisive and diligent about demanding better media quality. As a result, we've seen dramatic reductions in waste and made huge improvements in the areas of Viewability, Fraud and Brand Safety. Now it's time for the industry to set its sights on the new leading cause of waste. Join Jake Moskowitz, head of the Emodo Institute, for a glimpse into the pervasive problem of data inaccuracy. In this session, Jake will outline the causes, scope and magnitude of today's data quality issues, and discuss tactical ways to ensure advertisers get what they pay for.
Emodo is the data arm of Ericsson, the telecommunications company that powers roughly 80% of US mobile traffic.
Media vs Data: Why the Double Standard?
1. Media vs. Data: Why the Double Standard?
Jake Moskowitz, Head of the Emodo Institute
2. Let’s Get to Know Each Other
The Emodo Institute
• Research, education & resolution of data concerns that challenge mobile advertising.
• Helps media planners, buyers & service providers sharpen the efficacy of mobile data, so campaigns have a greater impact.
3. Media vs. Data
MEDIA (WHERE): We focus so much on verifying where our ads run…
− Whitelists/blacklists
− Brand Safety
− Viewability
− Fraud
DATA (HOW): …and so little on verifying how we decide what to buy.
− Segment designations
− Bid request metadata that triggers bidder decisions
− Attribution studies that lead to reallocation
6. Case Study: Demographic Data
• Nielsen DAR benchmarks show 50-60% of data is wrong for even medium-level specificity of demo targeting
• And it’s not improving in any substantive way
[Chart: demographic accuracy by age span. Source: Nielsen DAR Mobile Demo Benchmarks, Q1’18]
8. Right. But it points (very precisely) to the wrong place.
Not just 10 feet to the left or right. Often, the data is tens, hundreds, even thousands of miles off.
9. 9
Only 39% of device location
data points were within one
mile of where they were
claimed to be.
Case Study: Location Data
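The deck doesn’t say how the one-mile test was run. As a sketch only, one way to audit a claimed device location against a trusted ground-truth fix (the function names and the one-mile threshold from the slide are the only givens; everything else is my assumption) is a haversine distance check:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_miles(lat1, lon1, lat2, lon2):
    """Great-circle distance in miles between two lat/long points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    a = (sin((lat2 - lat1) / 2) ** 2
         + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2)
    # 3958.8 miles is the mean Earth radius
    return 2 * 3958.8 * asin(sqrt(a))

def within_one_mile(claimed, observed):
    """True if a claimed (lat, lon) is within one mile of a trusted observation."""
    return haversine_miles(*claimed, *observed) <= 1.0
```

Run over a sample of bid requests with known ground truth, the share of `True` results is the accuracy rate the slide is quoting.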
10. Case Study: Pattern Recognition
Across all raw data tested, vendors only eliminate 28.4% of data inaccuracies. The pattern recognition techniques used by vendors don’t seem to work.
11. Case Study: SDK Data
Question assumptions: Emodo has tested a wide range of 3rd-party SDKs.
• SDK scores are extremely consistent
• All accuracy scores are between 60% and 75%
12. Case Study: Vendor Consistency
Audience scores for a single vendor:
• For the same use case,
• For two separate campaigns,
• From two separate brands,
• Via two separate agencies,
• Less than 60 days apart.
Campaign #1: 88.78% accurate
Campaign #2: 11.65% accurate (audience A), 25.44% accurate (audience B)
13. We care about accuracy. So why do we prioritize other factors?
Source: Factual Survey, May 2019
14. So, what do we end up with?
67M devices that have visited a Hyundai dealership in the last 30 days?
15. 15
More segment stuffing...
Step 2: Understand Why is the Data Inaccurate?
102M devices that are
active members of the BP
Motor Club?
21. A Day in the Life of a Data Point
Occurrence → Categorize → Define → Expand → Match → Cross-platform → Use
22. Introducing the “aCPM”
Inaccurate data is a primary cause of wasted impressions.
• The aCPM adjusts cost for the value lost to data inaccuracy
• It applies the accuracy rate to the original CPM to calculate the cost of only the accurate impressions
• Example: if the CPM is $3.00 and the data is 50% accurate, the aCPM is actually $6.00
• Taking steps to improve accuracy can significantly reduce aCPM
23. Calculate an aCPM

Data example:
SEGMENT: BMW Intender
PRICE: $0.80
DEFINITION: visited the BMW part of a 3rd-party auto site in the last 90 days
LOOK-A-LIKE MODELING: yes (70% modeled)
DETERMINISTIC DATA USED: yes
Calculation: $0.80 x 2 x 4 = $6.40 aCPM

Supply example:
SUPPLY: QSR Lunch
PRICE: $1.50
GEOFENCE CRITERIA: within 1 mile of a store
POI DATABASE: updated quarterly
LAT/LONG FILTERING: no
Calculation: $1.50 x 2 x 1.5 = $4.50 aCPM
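Slide 23 applies a chain of penalty multipliers to the quoted price (the deck doesn’t label which multiplier corresponds to which quality factor, so treat them as opaque penalties). A minimal sketch of that arithmetic, with a function name of my own choosing:

```python
from functools import reduce

def acpm_from_multipliers(price: float, *penalties: float) -> float:
    """Apply each inaccuracy-penalty multiplier to the quoted CPM in turn."""
    return reduce(lambda cost, m: cost * m, penalties, price)

# The deck's two worked examples:
segment_acpm = acpm_from_multipliers(0.80, 2, 4)    # BMW Intender: $6.40
supply_acpm = acpm_from_multipliers(1.50, 2, 1.5)   # QSR Lunch: $4.50
```

The point of the exercise: the $0.80 segment, after accounting for modeling and loose definitions, is actually more expensive per accurate impression than the $1.50 one looks at first glance.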
24. Step 3: Ask Your Data Vendors Revealing Questions

What could go wrong? Bad data sources:
1. Tech problems: data captured isn’t correct because technology failed, such as “last known location”
2. Low-quality data: data that isn’t persistently collected, honestly provided, adequately scalable, etc.
3. Wrong categorizations: store definitions are wrong or too liberal; irresponsible assumptions about meaning
4. Privacy restrictions: no use of deterministic data due to privacy concerns (only modeled data used)

Questions to ask your data vendor:
• What % of your data do you throw out?
• How do you verify accuracy?
• How do you verify your POI?
• What restrictions on the use of deterministic data?
25. Step 3: Ask Your Data Vendors Revealing Questions

What could go wrong? Data trade-offs:
1. Optimizing scale at the cost of accuracy: the look-alike model extrapolation ratio is high
2. Data loss due to low match rates
3. Inaccurate cross-device matching due to probabilistic methods, or incorrect assignment of a person within a household
4. Use of wrong segments (accidental or purposeful), such as to increase scale or deliver in full

Questions to ask your data vendor:
• What % are modeled?
• What’s your match rate to x?
• What % are running on the source platform?
• Exactly which segments were used?
26. Summary: Crucial Steps to Better Data
1. Prioritize: establish data accuracy as a top priority, equal to media quality
2. Recognize: remember that price and scale are negatively correlated with quality
3. Calculate: assess the value of data options by doing a simple aCPM calculation
4. Ask: seek deeper answers to revealing vendor questions
What’s new in this slide? This is basically a Verizon vs. AT&T commercial for the past 10 years.
Here’s another example. If I told you that this Lat / Long location data was extremely precise you’d probably agree, right?
- It looks very precise. Those are certainly very precise numbers.
- Each one of those digits to the right of the decimal point increases the location resolution ten-fold. Pretty exact.
Now, what if I told you that it points (albeit very precisely) to the wrong place?
That may be confusing. It should be. Those Lat/Long coordinates can be precise. But often, those precise numbers simply point to the wrong place.
It’s kind of like a musician playing a piece of music perfectly, but playing the wrong song. Except the musician’s faux pas doesn’t chip away at your return on investment; inaccurate location data does.
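The notes above observe that each digit to the right of the decimal point increases location resolution ten-fold. That claim is easy to quantify: one degree of latitude spans roughly 111,320 meters, so the north-south resolution of a coordinate shrinks by a factor of ten per decimal place. A minimal sketch (function name is mine):

```python
def lat_resolution_meters(decimal_places: int) -> float:
    """Approximate north-south resolution, in meters, of a latitude
    value carried to the given number of decimal places.
    One degree of latitude spans roughly 111,320 meters."""
    return 111_320 / 10 ** decimal_places
```

Six decimal places pin latitude down to about 11 centimeters, which is exactly the trap: the number can be centimeter-precise and still point thousands of miles from where the device actually was.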
What’s new in this slide? This is basically a Verizon vs. ATT commercial for the past 10 years
So, when you go to select segments for a campaign, how does all of this data sourcing and processing come into play? As sophisticated as you’ve become with data, when you buy programmatically, your opportunities to use that expertise are intentionally limited. Data stores and vendors have designed a marketplace that games your expertise and takes advantage of your need for scale. Here are two ways they do this:
Data stores encourage over-buying and discourage informed decisions by revealing no distinguishing details about segments, implying that buying more and more segments means more unique qualified reach. You’ll notice that the only attributes exposed are price, reach and name: nothing about quality, like source, accuracy or % look-alike audience. That leads us to #2.
Segment Stuffing: Data vendors often over-inflate the segments they sell because they know you’re looking for scale
Data stores are set up to encourage a “spray and pray” approach. They make it very easy to add as many segments as you’d like to your campaign and somehow each new segment adds significant “unique” targeted reach. But if you look closely, the numbers start to call attention to themselves and reveal their flaws. Somehow, without a lot of effort, the world of Hyundai intenders, for example, can end up including just about everybody in the country – if those numbers were actually uniques and accurate. It’s definitely scale, but at some point targeting becomes pointless if you’re targeting everybody. And you have the honor of paying the vendor a $1.25 CPM for the pleasure.
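The “everybody in the country” observation above suggests a simple sanity check: compare a segment’s claimed unique reach to the largest plausible audience for that behavior. The 67M figure is from the deck; the ~17M comparison figure (approximate annual US new-vehicle sales) and the function name are my illustrative assumptions, not the deck’s:

```python
def stuffing_ratio(claimed_uniques: int, plausible_population: int) -> float:
    """Ratio of claimed segment size to the largest plausible audience.
    Values near or above 1.0 suggest the segment has been stuffed
    with modeled or mis-categorized devices."""
    return claimed_uniques / plausible_population

# Deck example: 67M devices said to have visited a Hyundai dealership
# in the last 30 days, vs. an assumed ~17M US new-car buyers per year.
ratio = stuffing_ratio(67_000_000, 17_000_000)
```

A ratio of roughly 4 means the segment claims several times more recent dealership visitors than there are annual car buyers in the entire market, before any brand- or 30-day filtering, which is the kind of red flag worth raising with the vendor.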
The biggest segment often wins.
Segment stuffing is common. It’s difficult to know how broad reach is shaped in large segments. Often, the segment is expanded with “look-alike” audiences. This PlaceIQ example targets 60M iOS and Android devices that have visited a US Hyundai dealership in the last 30 days. Do you believe that’s accurate? It’s definitely a surprising number, but it raises the question: what is it based on? Can you imagine the extrapolations that go into a segment like that?
What would happen if you targeted that segment? You’d likely pay for a lot of wasted impressions.
Here’s another one.
102M devices that are active members of the BP Motor Club. Have you ever heard of the BP Motor Club? I hadn't. But I can assure you their membership is far lower than that. Not just that, what is that based on? Can you imagine the extrapolations that go into a segment like that? And this is from Acxiom, one of the most respected names in data.
And another...
128M likely Millstone coffee drinkers. I'm sure that's really high quality data there. I'll bet their regression model threshold is 0.15. Okay, these segments are pretty obvious. But that’s the point: most segment stuffing isn’t obvious. Every segment you choose could easily have the same problem and produce a significant amount of waste. You don’t know where the data came from, how it’s been processed or how it’s been inflated for scale. How would you know?
That’s next. Let’s talk about how you can be more certain.
Competition: Protect the vendor’s secret sauce
Sales Cycle: Reduce technical concerns / additional questions
Revenue: Keep marketers from gaming the system
Evolution: Data science is nascent & continues to change
Weakness: Conceal flaws and lagging capabilities
Perception: Some vendors, at best, use “pseudo” science