Quality Prediction for Speech-based Telecommunication Services
1. Quality Prediction for Speech-based Telecommunication Services
Sebastian Möller, Stefan Hillmann, Klaus-Peter Engelbrecht, Florian Hinterleitner, Friedemann Köster, Florian Kretzschmar, Matthias Schulz, Stefan Schaffer
Quality and Usability Lab, Telekom Innovation Laboratories, TU Berlin
3. Motivation
Quality of Service (QoS) vs. Quality of Experience (QoE).

System developer's point-of-view:
Performance: "The ability of a unit to provide the function it has been designed for." (Möller, 2005)
Quality of Service (QoS): "The collective effect of service performance which determines the degree of satisfaction of the user of the service." (ITU-T Rec. E.800, 1994)
Includes service support, service operability, serveability, and service security

User's point-of-view:
Quality: "Result of appraisal of the perceived composition of the service with respect to its desired composition." (ITU-T Rec. P.851, 2003, following Jekosch, 2000, 2005)
Quality of Experience (QoE): "The overall acceptability of an application or service, as perceived subjectively by the end user." (ITU-T Rec. P.10, 2007)
Includes the complete end-to-end system effects
4. Motivation
Quality of Service (QoS) vs. Quality of Experience (QoE).

Qualinet White Paper on Definitions of Quality of Experience (2012):
"Quality of Experience (QoE) is the degree of delight or annoyance of the user of an application or service. It results from the fulfillment of his or her expectations with respect to the utility and/or enjoyment of the application or service in the light of the user's personality and current state."

Service: "An event in which an entity takes the responsibility that something desirable happens on the behalf of another entity." (Dagstuhl Seminar 09192, May 2009)
Acceptability: "Acceptability is the outcome of a decision which is partially based on the Quality of Experience." (Dagstuhl Seminar 09192, May 2009)
5. Motivation
Quality of Service (QoS) vs. Quality of Experience (QoE).

(Figure: Service provider vs. user. On the provider side, service design rests on quality elements and is assessed as Quality of Service (QoS); on the user side, service perception rests on quality features and is assessed as Quality of Experience (QoE).)
6. Motivation.
Quality perception and judgment processes.

(Figure: A physical signal (physical nature) undergoes perception, modified by response-modifying factors and adjustment. Anticipation yields the desired nature, perception yields the perceived nature; reflexion turns each into desired and perceived quality features. Their comparison and judgment produce the perceived quality, which the user encodes into a quality rating (description).) (Jekosch, 2004; Raake, 2006)
7. Quality of Service (QoS) vs. Quality of Experience (QoE): Taxonomy.

(Figure: Taxonomy. Influencing factors comprise user factors, context factors (static, dynamic, environmental), and system factors (service, agent, functional). Quality of Service covers output modality appropriateness, input modality appropriateness, form appropriateness, contextual appropriateness, perceptual effort, cognitive workload, physical response effort, and interaction performance (dialog management, user performance, system performance, interpretation performance, cooperativity, input quality, output quality). Quality of Experience covers interaction quality, system quality, aesthetics, personality, appeal, joy of use, ease of use, learnability, intuitivity, effectiveness, efficiency, and utility, which feed usability and usefulness; hedonic and pragmatic aspects together determine acceptability.) (Möller et al., 2009)
9. Speech Quality Prediction
Transmitted-speech quality prediction.

Approaches:
(Figure: User factors (linguistic background, attitude, emotions, experience, motivation, goals) shape the user's subjective judgment of transmission quality. In parallel, the system yields speech signals and system parameters, which a model maps to an estimated quality index.)
10. Speech Quality Prediction
Transmitted-speech quality prediction.

Signal-comparison approach:
(Figure: Natural speech x(k) passes through the transmission system, yielding y(k). Both signals are pre-processed into x'(k) and y'(k) and mapped to internal representations; the distance between the representations is averaged and transformed into a MOS estimate.) (Hauenstein, 1997)
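The signal-comparison pipeline can be sketched in a few lines of Python. The sketch below stands in for the approach in the diagram: it uses per-frame log power spectra as a stand-in internal representation, an absolute spectral distance, and an affine mapping to MOS. The representation, the distance measure, and the coefficients `a` and `b` are illustrative assumptions, not the published model.

```python
import numpy as np

def internal_representation(signal, frame_len=256, hop=128):
    """Per-frame log power spectra as a stand-in for a perceptual
    internal representation."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    window = np.hanning(frame_len)
    spectra = [np.abs(np.fft.rfft(f * window)) ** 2 for f in frames]
    return np.log10(np.array(spectra) + 1e-10)

def predict_mos(reference, degraded, a=4.5, b=0.15):
    """Signal-comparison estimate: distance between the internal
    representations of reference and degraded signal, averaged over
    frames and affinely mapped to the MOS scale. The coefficients
    a and b are illustrative, not fitted to subjective data."""
    x = internal_representation(np.asarray(reference, dtype=float))
    y = internal_representation(np.asarray(degraded, dtype=float))
    n = min(len(x), len(y))                     # align frame counts
    d = np.mean(np.abs(x[:n] - y[:n]))          # average spectral distance
    return float(np.clip(a - b * d, 1.0, 5.0))  # MOS on the 1..5 scale
```

With identical reference and degraded signals the distance is zero and the estimate sits at the top of the scale; any degradation increases the distance and lowers the predicted MOS.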
11. Speech Quality Prediction
Transmitted-speech quality prediction.

No-reference approach:
(Figure: Only the degraded signal y(k) is available. A reference is generated from it, both are pre-processed into internal representations, and their distance is averaged and transformed into a MOS estimate. A parametric analysis additionally detects high additional noises, time-varying characteristics, and unnatural voice.) (ITU-T Rec. P.563, 2004)
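A single-ended estimator in the spirit of this diagram can be sketched without any reference signal. The parametric analysis below is reduced to one crude feature: an SNR proxy taken from the energy spread between speech-like and pause-like frames. The feature and the mapping coefficients are illustrative assumptions, not the P.563 algorithm.

```python
import numpy as np

def frame_energies(signal, frame_len=256, hop=128):
    """Mean power per analysis frame."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, hop)]
    return np.array([np.mean(f ** 2) for f in frames])

def predict_mos_noref(degraded, a=4.5, b=0.1):
    """Single-ended sketch: the spread between high-energy (speech-like)
    and low-energy (pause-like) frames serves as a crude SNR proxy, and
    the resulting noisiness estimate is mapped to MOS. The feature and
    the coefficients a, b are illustrative assumptions."""
    e = frame_energies(np.asarray(degraded, dtype=float)) + 1e-12
    speech = np.percentile(e, 90)           # loudest frames
    noise = np.percentile(e, 10)            # quietest frames
    snr_db = 10.0 * np.log10(speech / noise)
    noisiness = max(0.0, 30.0 - snr_db)     # degradation below ~30 dB SNR
    return float(np.clip(a - b * noisiness, 1.0, 5.0))
```

A bursty signal with silent pauses yields a large energy spread and a high estimate; adding background noise raises the pause-frame energy and lowers the predicted MOS.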
13. Speech Quality Prediction
TTS quality prediction.

Signal-comparison approach for TTS quality prediction:
(Figure: Utterances from a natural-speech inventory and the synthesizer output are pre-processed into internal representations x'(k) and y'(k); their distance is averaged and transformed into a MOS estimate.) (Cernak & Rusko, 2005)
14. Speech Quality Prediction
TTS quality prediction.

Parametric approach for TTS quality prediction:
(Figure: Starting from text, the synthesizer output y(k) is analyzed without a natural reference: a reference is generated internally, internal representations are compared, and a parametric analysis detects high additional noises, time-varying characteristics, and unnatural voice; the result is transformed into a MOS estimate.) (ITU-T Rec. P.563, 2004)
15. Speech Quality Prediction
Dimension-based quality prediction.

(Figure: The transmission system output is pre-processed into internal representations, from which four dimension indicators are estimated: a discontinuity indicator Idis, a noisiness indicator Inoi, a coloration indicator Icol, and a loudness indicator Ilou. Comparison, integration, and transformation yield the MOS estimate.) (2008; Wältermann et al., 2008b,c)
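The integration step of the dimension-based approach can be illustrated as a linear combination of the four indicators. The weights and intercept below are hypothetical placeholders for illustration, not the fitted values from the cited work:

```python
def integrate_dimensions(i_dis, i_noi, i_col, i_lou,
                         weights=(0.8, 0.9, 0.6, 0.5), intercept=4.6):
    """Combine the four dimension indicators (Idis, Inoi, Icol, Ilou)
    into one MOS estimate. Each indicator is assumed normalized to
    [0, 1], where 0 means no degradation on that dimension; the result
    is clipped to the 1..5 MOS scale. Weights are illustrative."""
    penalty = (weights[0] * i_dis + weights[1] * i_noi
               + weights[2] * i_col + weights[3] * i_lou)
    return min(5.0, max(1.0, intercept - penalty))
```

Larger values on any dimension lower the integrated estimate, which mirrors the idea that overall quality decomposes into additive perceptual degradations.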
25. Spoken-dialogue Quality Prediction
Principle.

Approaches:
(Figure: User factors (linguistic background, attitude, emotions, experience, task knowledge, flexibility, motivation, goals) shape the user's subjective judgment of dialog quality. In parallel, the system yields speech signals, system parameters, and interaction parameters, which a model maps to an estimated quality index.)
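The interaction-parameter branch of this diagram is commonly realized as a linear regression from logged interaction parameters to subjective judgments, in the spirit of the PARADISE framework. The sketch below fits such a model with ordinary least squares; the parameter set and every data point are fabricated for illustration.

```python
import numpy as np

# Fabricated training data. Columns of X: number of turns,
# ASR error rate, task success (0/1); y: mean opinion scores.
X = np.array([[ 5, 0.05, 1],
              [ 8, 0.10, 1],
              [12, 0.30, 0],
              [ 6, 0.00, 1],
              [15, 0.40, 0],
              [ 9, 0.20, 1]], dtype=float)
y = np.array([4.5, 4.0, 2.0, 4.8, 1.5, 3.5])

A = np.hstack([X, np.ones((len(X), 1))])       # add intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)   # least-squares fit

def predict_quality(turns, asr_error_rate, task_success):
    """Estimated quality index for one dialogue, from its logged
    interaction parameters."""
    return float(np.array([turns, asr_error_rate, task_success, 1.0]) @ coef)
```

Given such a model, short, successful dialogues with few recognition errors score higher than long, failed ones.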
26. Spoken-dialogue Quality Prediction
MeMo Workbench.

Idea:
Make assumptions about (models of) the behavior of user and application
Partially replace the user in initial evaluations by a user model
Set up a workbench for automated testing and usability prediction

(Figure: The simulation engine control unit couples a system behavior model with the user's mental model; automated testing on these models yields a usability prediction.)
27. Spoken-dialogue Quality Prediction
MeMo Workbench.

For usable applications, three different world descriptions have to match:
User's Mental Model: image the user has of the application (which tasks to carry out, i.e. the user task model, and how to reach the task goal, i.e. the user interaction model)
System Interaction Model: model underlying the interaction, coded in the application
System Task Model: model of the task a user can perform with the help of the application

(Figure: The user task model and user interaction model make up the user's mental model of the system; the system task model and system interaction model describe the application. Usability requires that the user and system sides match.)
28. Spoken-dialogue Quality Prediction
MeMo Workbench.

Workbench set-up:
Step 1: Model acquisition
Step 2: Workbench set-up
Step 3: Prediction algorithm derivation
Step 4: Interaction simulation & problem detection

(Figure: System task, system interaction, user interaction, user task, and user behavior models feed the simulation engine control unit. Automatic testing with simulated test users leads to problem identification & weighting, model training, and finally a usability prediction and a usability profile.)
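Step 4, interaction simulation and problem detection, can be illustrated with a toy user model. The sketch below lets a probabilistic user interact with a minimal slot-filling dialog and aggregates the runs into a simple usability profile; the slot names, probabilities, and indicators are invented for illustration and are not the MeMo models.

```python
import random

# Toy domain: a system that must fill three slots, and a user whose
# answers are understood with probability p_understood. Repeated
# re-prompts are logged as potential usability problems.
SLOTS = ["date", "time", "destination"]

def simulate_dialogue(p_understood=0.8, max_turns=20, seed=None):
    rng = random.Random(seed)
    filled, problems, turns = set(), [], 0
    while len(filled) < len(SLOTS) and turns < max_turns:
        turns += 1
        slot = next(s for s in SLOTS if s not in filled)  # system prompt
        if rng.random() < p_understood:                   # answer understood
            filled.add(slot)
        else:
            problems.append(f"turn {turns}: re-prompt for '{slot}'")
    return {"success": len(filled) == len(SLOTS),
            "turns": turns, "problems": problems}

def usability_profile(n_runs=500, **kwargs):
    """Aggregate simulated runs into simple usability indicators."""
    runs = [simulate_dialogue(seed=i, **kwargs) for i in range(n_runs)]
    return {
        "task_success_rate": sum(r["success"] for r in runs) / n_runs,
        "avg_turns": sum(r["turns"] for r in runs) / n_runs,
        "avg_problems": sum(len(r["problems"]) for r in runs) / n_runs,
    }
```

Running the profile for different understanding rates shows the expected pattern: worse recognition means longer dialogues and more logged problems, which is exactly the kind of signal a usability prediction can be trained on.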
31. Spoken-dialogue Quality Prediction. New developments.

Modality Selection:
System model with multiple (serial) input modalities: which modality should be used for interaction?
Various factors influence users' modality selection:
User side: familiarity/expertise, static/dynamic user attributes, cognitive workload
System side: errors (e.g. ASR), number of interaction steps
Task: complexity, dual-task, time
Environment: home, public
The user model needs a mechanism to adjust interaction probabilities
Study: investigating efficiency- and effectiveness-guided modality selection
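One plausible mechanism for adjusting interaction probabilities is a softmax over the expected cost of each modality, with the speech cost inflated by the ASR error rate. This cost model is an assumption made for illustration, not the mechanism from the study:

```python
import math

def speech_usage_probability(gui_steps, speech_steps=1.0,
                             asr_error_rate=0.0, temperature=1.0):
    """Probability that the simulated user selects the speech modality,
    as a softmax over expected modality costs. The cost model and the
    temperature are illustrative assumptions."""
    # Expected speech cost grows as recognition errors force repetitions.
    speech_cost = speech_steps / max(1e-6, 1.0 - asr_error_rate)
    gui_cost = float(gui_steps)
    e_speech = math.exp(-speech_cost / temperature)
    e_gui = math.exp(-gui_cost / temperature)
    return e_speech / (e_speech + e_gui)
```

Under this mechanism speech usage rises when the GUI path needs more interaction steps and falls as the ASR error rate grows, matching the two model input parameters named on the next slide.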
32. Spoken-dialogue Quality Prediction. New developments.

Modality Selection: study & model data
(Figure: Four panels plot speech usage [%] (y-axis) against the speech shortcut [interaction steps] (x-axis), comparing model predictions with human data: baseline experiment with 0% ASR errors, predicted data with 20% ASR errors, and Experiment 2 with 10% and 30% ASR errors.)
Model input parameters: number of interaction steps, ASR error rate
Mechanism will be integrated
34. Other Application Examples
Tradeoff between usability and security.

Modified Tetris game to evaluate the tradeoff between usability and security
The game was attacked by viruses which stole the user's rows (rows could be saved, standing in for actual money)
Users could choose the security level:
High level produces many false alarms, but warns each time before an attack occurs
Low level produces fewer false alarms, but some attacks are not announced
Parameters like security-level changes and the number of collected rows were analyzed
35. Other Application Examples.
Results.

Results of the user test:
More security-level changes in the high attack-likelihood condition
Earlier saving of rows in the high attack-likelihood condition
Higher average security level in the high attack-likelihood condition
Simulation of the user behavior: MeMo Workbench
Probabilistic and rule-based modeling
Good qualitative prediction of security-level changes for both conditions
Good prediction of row clearing for up to seven rows
Prediction of other aspects needs improvement
Extension of the approach to more realistic scenarios (mWallet, others)
36. Quality Prediction of Speech-based Services
Conclusions.

Modelling the human user:
(Figure: A complete user model combines a perception model, a judgment model, and a description model drawing on a model of references, together with an action model and a behavior model drawing on a model of goals and a model of experiences; interacting with the system, it produces a subjective dialog quality judgment.)
37. Thank you for your attention!
Visit www.qu.tu-berlin.de for more information.