The Effect of Explanations & Algorithmic Accuracy on Visual Recommender Systems of Artistic Images - ACM IUI 2019

The Effect of Explanations and Algorithmic Accuracy
on
Visual Recommender Systems of Artistic Images
Pontificia UniversidadCatólica de Chile (PUC Chile)
Vicente
Domínguez
Pablo
Messina
Ivania
Donoso-Guzmán
Denis
Parra

The Effect of Explanations and Algorithmic Accuracy
on
Visual Recommender Systems of Artistic Images
(candidate to best paper)
Pontificia UniversidadCatólica de Chile (PUC Chile)
Vicente
Domínguez
Pablo
Messina
Ivania
Donoso-Guzmán
Denis
Parra

The Online Artwork Market
• Online artwork market: Growing since 2008,
despite global crises!
• In 2011, art received $11.57 billion in total global
annual revenue, over $2 billion versus 2010
(*forbes)
• Recommender systems could play an important
role, just like in other e-commerce industries.
[forbes] https://www.forbes.com/sites/abigailesman/2012/02/29/ the- worlds- strongest- economy- the- global- art- market/ (2012)
March 17th 2019 Dominguez et al ~ ACM IUI 2019 3

Related Work: Art Recommendation
• Previous art recommendationprojects date for as long as 2007, such
as the CHIP project to recommend paintings from Rijksmuseum.
(Aroyo et al., 2007)
• Some approaches have used:
– Ratings (Aroyo et al., 2007),
– Tags, Keywords or other textual features (Semeraro et al. 2012)
– Traditional visual features (van den Broek et al., 2006)
• Deep Learning (DL) have revolutionized visual feature learning, but few
works use them for recommending art (Het et al. 2016,Messina et al.
2018)

• Deep Learning (DL) have revolutionized visual feature learning, but few
works use them for recommending art (Het et al. 2016,Messina et al.
2018)

• Deep Learning (DL) have revolutionized computer vision, but few works
use DL methods for recommending art (He et al. 2016,Messina et al.
2018)

Our Approach to Image Recommendation
• Since 2017 we have been working on recommending art
images, using mostly content-based methods.
• Four papers published:
– ACM DLRS 2017: Dominguez,V.,Messina,P., Parra, D., Mery, D., Trattner,C., & Soto,A.
(2017,August).Comparing Neural and Attractivenessbased Visual Features forArtwork
Recommendation.In Proceedingsof the2nd Workshop on Deep Learning for
RecommenderSystems(pp.55-59).ACM.
– UMUAI 2018: Messina,P., Dominguez,V.,Parra, D., Trattner,C., & Soto, A. (2018).Content-
based artwork recommendation:integrating paintingmetadatawith neural and manually-
engineered visual features.UserModeling and User-Adapted Interaction,1-40.
– Two additional workshop papers in ACM RecSys 2018.

Open Questions from Content-Based RecSys
• We learned that visual features from DNNs perform better than
attractiveness visual features.
Average brightness
Saturation
Sharpness
Entropy
RGB-contrast
Colorfulness
Naturalness
Predictive Accuracy
Attractiveness
visual features
Deep Learning
visual features

Open Questions from Content-Based RecSys
• We learned that visual features from DNNs perform better than
attractiveness visual features, but they are harder to explain.
Predictive Accuracy
Explainability
Average brightness
Saturation
Sharpness
Entropy
RGB-contrast
Colorfulness
Naturalness
Attractiveness
visual features
Deep Learning
visual features

Explainability in Recommender Systems
• Explaining recommendations is a well-established and
active area of research, but there is little research on
explaining image recommendations.
March 17th 2019 Dominguez et al ~ ACM IUI 2019
Herlocker, J. L., Konstan, J. A., & Riedl, J. (2000).
Explaining collaborative filtering
recommendations. In Proceedings of the 2000
ACM conference on Computer supported
cooperative work (pp. 241-250)
10
Tintarev, N., & Masthoff, J. (2015). Explaining
recommendations: Design and evaluation.
In Recommender systems handbook (pp. 353-
382). Springer, Boston, MA.

Gaining Inspiration from DARPA XAI
DARPA XAI (Gunning, 2017)
Decision Tree
DNN

DARPA XAI inspiration
DARPA XAI (Gunning, 2017)
Decision Tree
DNN
Our research
AVF
DNN VF
Explainability Potential
PredictiveAccuracy
Is there an effect on
user perception if
we provide a UI with
explanations?

Our General Research Question
• Explainable &
Transparent UIs are well
perceived by users in
previous RecSys
research, but: Does it
matter to UX if you do
not produce accurate
enough predictions?
AVF
DNN VF
Explainability Potential
PredictiveAccuracy
- UI w/explanation
based on examples
- UI w/explanation
based on examples
or visual features

Research Questions
• RQ1. Given three different types of interfaces, one baseline interface without
explanations and two with explanations but different levels of transparency,
which one is perceived as most useful?
• RQ2. Furthermore, based on the visual content-based recommenderalgorithm
chosen (DNN or AVF), are there observable differences in how the three
interfaces are perceived?
• RQ3. How do independent variables such as algorithm, explainable interface
and domain knowledge interact in order to explain the user experience with the
recommender system in terms of perception of relevance, diversity,
explainabilityand user satisfaction?

Research Questions

Materials and Methodology
1. We collect image data from UGallery, an online store of
physical artworks (paintings and photographs).
2. We implement a system which records user preferences and
generates recommendations. Some of these
recommendations will have explanations.
3. We conduct a user study in Amazon Mechanical Turk to
assess the impact of algorithmic accuracy and explainable UI
on several user dimensions.

Data: UGallery
• Online Artwork
Store, based on CA,
USA.
• Mostly sales one-
of-a-kind physical
artwork.

CB RecSys algorithm: Visual Features
• (DNN) Deep Neural
Networks
• (AVF) Attractiveness-
based

Visual Features: DNN vs. AVF
• Deep Neural Networks (DNN):
we used an AlexNet DNN
(pre-trained with ImageNet
ILSVRC 2012 dataset)
(Krizhevsky et al, 2012)
• We map each artwork image
to its corresponding latent
vector of 4,096 dimensions
obtained at the fc6 layer of
the AlexNet network.

Visual Features: AVF
• Average brightness,
• Saturation,
• Sharpness,
• Entropy,
• RGB-contrast,
• Colorfulness,
• Naturalness
Jose San Pedro and Stefan Siersdorfer. 2009.
Ranking andClassifying Attractiveness of Photos
in Folksonomies.
In Proceedings of the 18thInternational
Conference on World Wide Web (WWW’09).

UGallery: Making Recommendations
• Scoring items based on cosine similarity between user
model and item model:

Generating Explanations
Friedrich, G., & Zanker, M. (2011). A
taxonomy for generating
explanations in recommender
systems. AI Magazine, 32(3), 90-98.

White-box Explanation (AVF)

Black-box explanation (DNN and AVF)

Study Procedure

Preference Elicitation
• We collect user preferences from a Pinterest-like interface

Study Procedure

Interface 1: no explanation, no transparency

Study Procedure

Interface 2: explainable, no transparency

Study Procedure

Interface 3: explainable & transparent

Study Procedure

Evaluation & Results
• Post-algorithm survey aspects (agreement1-100)
– Explainable:I understood why the art images were recommended to
me.
– Relevance:The art images recommended matched my interests.
– Diverse: The art images recommended were diverse.
– Interface Satisfaction: Overall, I am satisfied with the recommender
interface.
– Use Again: I would use this recommender system again for finding
art images in the future.
– Trust: I trusted the recommendations made.
• Finally, NASA TLX (cognitive effort)

Demo

Study on Amazon Mechanical Turk:
● 121 valid users completed correctly the study.
● Task took them around 10 minutes to complete.
● ~56% female, 44% male.
● 80% attended to 1 or more art classes at high school level or
above.
● 80% visited museums or art galleries at least once a year.

Results
Interface 1: UI without explanation
Interface 2: UI with example-based explanation
Interface 3: UI with transparent explanation (AVF)

Results
7 dimensions evaluated, for DNN and AVF (scale 1-100):
Perception of:
- Explainability
- Relevance
- Diversity
- Satisfaction w/UI
- Intention of use
- Trust on RecSys
- Avg. Rating

Results
• Result 1: DNN better or equal than AVF except on diversity
- AVF is perceived significantly more
diverse than DNN excepting when AVF
uses transparent and explainable
recommendations.

Results
• Result 2: Explainable interfaces increase perception of
explainability.
- Result expected: people perceive the system as
more explainable using the explainable
interfaces than non explainable.
43

Results
• Result 2: Explainable interfaces increase perception of
explainability.
- Result expected: people perceive the system as
more explainable using the explainable
interfaces than non explainable.
- But this perception is significantly higher using
DNN visual features than AVF.
44

• Result 2: Perception of relevance changes just by adding
explanations -> User Interface really matters!!
- Algorithm is the same (DNN), but by
adding explanations people perceive
recommendations as more relevant,
- Result is significant only with DNN.
45

• Result 3: No difference in Trust between DNN and AVF in I1
(without explanations)
- The difference in Trust between DNN
and AVF becomes significant only
when using explainable interfaces.

• Result 4: Without explanations, No difference between DNN
and AVF upon intention of use and satisfaction with interface.
- The difference on Satisfaction and
Intention of use between DNN and
AVF becomes significant only when
using explainable interfaces.

General Model: Knijnenburg Framework
Knijnenburg,B. P., & Willemsen,M. C. (2015).Evaluatingrecommender
systemswith user experiments.In RecommenderSystems
Handbook (pp.309-352).Springer,Boston,MA.

Visual Art RecSys Explanation Model

- Compared to the baseline interface
I1, the interface I2 (β=0.551) and
interface I3 (β=0.556) have a positive
effect on Understandability.
- The model explains 23.3% of the
variance of Understandability.

- Compared to the
baseline AVF, DNN
features have an strong
positive effect on
understandability
(β=0.710), as well as on
ratings (β=0.710).

- Perceived Effort (Insecurity, rush,
mental demand) has a significant
negative effect (β=-0.283) on
understandability.

- Satisfaction is strongly
affected by the Trust on
the system (+), as well
as by Understandability
(+), Time (+) and
Experience with the
domain (+).
- Trust is affected by
Understandability (+)
and by effort (-)

Conclusion
• DNN features performed better than AVF features, probably because
AlexNet is able to capture more complex patterns. Results confirm
previous work by Messina et al. (2018).
• Providing a UI with explanations has a big effect on several
dimensions evaluated. This study provides further evidence upon the
importance of user interfaces and not only algorithms as important
effects on UX with RecSys.
• Results indicate that explainability has a positive effect on UX with a
RecSys, but is not always well perceived with additional transparency.

Future Work
• (limitation) If additional transparency is provided in an
explanation, make sure the explanation is actually being
understood.
• Develop a full end-to-end Deep Learning art recommendation
model, rather than using pre-trained embeddings and memory-
based content-based recommendation.
• Test recent advances in XAI and visualization of CNNs into the
recommendation framework.

Acknowledgment
• The were funded by PUC Chile, Conicyt, Fondecyt grant
11150783, as well as by the Millennium Institute for
Foundational Research on Data (IMFD).

THANKS!
Denis Parra
Assistant Professor
Pontificia Universidad Católicade Chile
dparras@uc.cl

References
● Vicente Dominguez, Pablo Messina, Denis Parra, Domingo Mery, Christoph Trattner, and
Alvaro Soto. 2017. Comparing Neural and Attractiveness-based Visual Features for
Artwork Recommendation. In Proceedings of the Workshop on Deep Learning for
Recommender Systems, co-located at RecSys 2017.
● Ruining He, Chen Fang, Zhaowen Wang, and Julian McAuley. 2016. Vista: A Visually,
Socially, and Temporally-aware Model for Artistic Recommendation. In Proceedings of
the 10th ACM Conference on Recommender Systems (RecSys ’16).
● Alex Krizhevsky, Ilya Sutskever, and Geo rey E Hinton. 2012. Imagenet classi cation with
deep convolutional neural networks. In Advances in neural information processing
systems.
● Pablo Messina, Vicente Dominguez, , Denis Parra, Christoph Trattner, and Alvaro Soto.
2018. Content-Based Artwork Recommendation: Integrating Painting Metadata with
Neural and Manually-Engineered Visual Features. User Modeling and User-Adapted
Interaction (2018).
● Jose San Pedro and Stefan Siersdorfer. 2009. Ranking and Classifying Attractiveness of
Photos in Folksonomies. In Proceedings of the 18th International Conference on World
Wide Web (WWW ’09).

The Effect of Explanations & Algorithmic Accuracy on Visual Recommender Systems of Artistic Images - ACM IUI 2019

Recomendados

Recomendados

Más contenido relacionado

Similar a The Effect of Explanations & Algorithmic Accuracy on Visual Recommender Systems of Artistic Images - ACM IUI 2019

Similar a The Effect of Explanations & Algorithmic Accuracy on Visual Recommender Systems of Artistic Images - ACM IUI 2019 (20)

Más de Denis Parra Santander

Más de Denis Parra Santander (15)

Último

Último (20)

The Effect of Explanations & Algorithmic Accuracy on Visual Recommender Systems of Artistic Images - ACM IUI 2019