11. General Approach
The customer query is turned into a numerical representation (an embedding).
We also keep the latest embeddings of all the help articles.
The top recommendations are the help articles whose embeddings are most similar to the query embedding (ranked by cosine distance).
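As a rough illustration of the ranking step, here is a minimal sketch using NumPy. The article titles, the three-dimensional vectors, and the function name `rank_articles` are made up for the example; real embeddings would come from the encoder model and have many more dimensions.

```python
import numpy as np

def rank_articles(query_vec, article_vecs, top_k=3):
    """Return (indices, similarities) of the top_k articles, best match first."""
    # Normalise to unit length so the dot product equals cosine similarity.
    q = query_vec / np.linalg.norm(query_vec)
    a = article_vecs / np.linalg.norm(article_vecs, axis=1, keepdims=True)
    sims = a @ q
    # Highest similarity (lowest cosine distance) first.
    order = np.argsort(-sims)[:top_k]
    return order.tolist(), sims[order].tolist()

# Toy data, invented for illustration only.
article_titles = ["Change your PIN", "Download statements", "Order a replacement card"]
article_vecs = np.array([
    [0.9, 0.1, 0.0],
    [0.0, 1.0, 0.1],
    [0.2, 0.0, 0.9],
])
query_vec = np.array([1.0, 0.2, 0.1])

idx, scores = rank_articles(query_vec, article_vecs, top_k=2)
for i, s in zip(idx, scores):
    print(f"{article_titles[i]}: {s:.3f}")
```

Because the article embeddings are stored ahead of time, only the query has to be embedded when a customer types a message.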
12. Question: How do I change my PIN?
+ Answer: You can change your card PIN at any large bank (HSBC, Barclays, etc.) ATM in the UK by selecting PIN services ☺
- Answer: You can see and download your bank statements through the Monzo app.
13. Baseline
Train an encoder-decoder model from scratch, using Monzo chat data only
Challenger
Use a pre-trained BERT encoder, and fine-tune it using Monzo chat data
Result!
Challenger ⬆ self-service rate by ~9%
15. From our ongoing analysis, we can identify topics that are:
Eligible for self-service (e.g., PIN reset)
Predictable (in our chat data)
Valuable for customer service (i.e., they account for a high volume of their work)
... perhaps we could approach this as a classification problem instead?
16. We were looking for an approach that is:
🎯 Accurate
🚢 Quick to ship, so we could get some customer feedback
🏦 Able to use all the relevant state that is stored across many backend services
17. Approach
Fine-tune a single binary classifier about a topic: is the customer asking about needing a replacement card?
If they are (& a card hasn’t already been ordered for them), then give them the answer directly!
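The decision logic on this slide could be sketched roughly as follows. Everything named here is hypothetical: `classify_replacement_card` is a keyword stand-in for the fine-tuned classifier, `CustomerState` stands in for state fetched from backend services, and the threshold and answer text are invented for illustration.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class CustomerState:
    # Hypothetical flag; in practice this would be looked up from a backend service.
    replacement_card_ordered: bool

def classify_replacement_card(message: str) -> float:
    """Stand-in for the fine-tuned binary classifier: returns a score in [0, 1]."""
    keywords = ("replacement card", "new card", "lost my card")
    return 0.95 if any(k in message.lower() for k in keywords) else 0.05

def respond(message: str, state: CustomerState, threshold: float = 0.9) -> Optional[str]:
    """Return a direct answer only when the topic matches AND the state allows it."""
    score = classify_replacement_card(message)
    if score >= threshold and not state.replacement_card_ordered:
        return "Here is how to order a replacement card in the app."
    return None  # otherwise fall back to the normal support flow

print(respond("I lost my card and need a new card", CustomerState(False)))
print(respond("I lost my card and need a new card", CustomerState(True)))
```

Keeping the state check separate from the model means the classifier only has to answer one narrow question about the message.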
18. Offline validation
Topic              Precision   Out-of-sample precision
Card replacement   0.86        0.69
Card not arrived   0.93        0.60
Update details     0.93        0.52 😫
...                ...         ...
We deployed a number of these models in shadow prediction mode. This lets a model make predictions on live data without sending customers any answers.
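To make the precision figures concrete, here is a minimal sketch of how precision might be computed from a shadow-mode log, assuming each logged prediction is later paired with the label the conversation eventually received. The log contents below are invented, not Monzo data.

```python
def precision(predictions, labels):
    """Fraction of positive predictions that were actually positive."""
    true_pos = sum(1 for p, y in zip(predictions, labels) if p and y)
    false_pos = sum(1 for p, y in zip(predictions, labels) if p and not y)
    predicted_pos = true_pos + false_pos
    return true_pos / predicted_pos if predicted_pos else 0.0

# Hypothetical shadow log: the model's prediction was recorded but never shown
# to the customer; the label is what the conversation was later tagged as.
shadow_predictions = [1, 1, 1, 0, 1]
eventual_labels = [1, 0, 1, 0, 1]

shadow_precision = precision(shadow_predictions, eventual_labels)
print(shadow_precision)
```

Precision is the natural metric when a wrong automated answer is costly: it measures how often the model is right when it decides to speak up.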
19. What the 🤬?
The main culprit was poorly tagged
conversations. But the definition of
“correctly” tagged was subjective!
20. Large set of noisy tags
Small set of re-labelled examples
Hi there! How can I transfer money into a savings pot?
21. F1 Score
Fine-tune on tags                    69
Fine-tune on re-labelled examples    78
“Ultra” fine-tune on both            85
Example result: Update details (48% of tagged chats had their labels changed when re-labelled)
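One way to read fine-tuning "on both" is as two sequential stages: train on the large noisy set first, then continue from those weights on the small re-labelled set. The slides do not state the order or the training details, so that reading is an assumption. The sketch below swaps in a tiny NumPy logistic regression on synthetic data for the actual BERT-based classifier; the noise rate, learning rates, and step counts are all invented.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def log_loss(X, y, w):
    p = np.clip(sigmoid(X @ w), 1e-9, 1 - 1e-9)
    return float(-np.mean(y * np.log(p) + (1 - y) * np.log(1 - p)))

def fine_tune(X, y, w, lr, steps):
    """Full-batch gradient descent on the logistic loss, starting from weights w."""
    for _ in range(steps):
        w = w - lr * X.T @ (sigmoid(X @ w) - y) / len(y)
    return w

rng = np.random.default_rng(0)
true_w = np.array([2.0, -1.0, 0.5])  # synthetic ground truth, illustration only

# Stage 1 data: large set with noisy tags (30% of labels flipped, a made-up rate).
X_noisy = rng.normal(size=(2000, 3))
y_noisy = (X_noisy @ true_w > 0).astype(float)
flip = rng.random(2000) < 0.3
y_noisy[flip] = 1 - y_noisy[flip]

# Stage 2 data: small set of carefully re-labelled examples.
X_clean = rng.normal(size=(100, 3))
y_clean = (X_clean @ true_w > 0).astype(float)

w_stage1 = fine_tune(X_noisy, y_noisy, np.zeros(3), lr=0.1, steps=200)
# Continue from the stage-1 weights with a smaller learning rate.
w_stage2 = fine_tune(X_clean, y_clean, w_stage1, lr=0.05, steps=200)

loss_before = log_loss(X_clean, y_clean, w_stage1)
loss_after = log_loss(X_clean, y_clean, w_stage2)
print(f"loss on re-labelled set: {loss_before:.3f} -> {loss_after:.3f}")
```

The idea is that the noisy set supplies volume while the small clean set corrects what the model learned from the mislabelled examples. This toy run is not meant to reproduce the F1 figures above.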
27. 1. Helping customers find the right answers to their queries (in the app)
//
2. Helping agents to diagnose and respond to customer queries swiftly.