Analyzing Reviews and Code of Mobile Apps for Better Release Planning
Adelina Ciurumelea, Andreas Schaufenbühl, Sebastiano Panichella, Harald C. Gall
software evolution & architecture lab
2
Extremely Popular Apps
8,087,067 reviews
3,505,905 reviews
38,742,600 reviews
3
Open Source Apps
62,707 reviews
4
The number of reviews is large compared
to the available development resources.
5
Importance of reviews
• reviews contain valuable feedback directly from the users
• users often report bugs, describe their user experience and request features
• the review content influences the number of downloads
6
INFORMATIVE NON-INFORMATIVE
“AR-Miner: Mining informative reviews for developers from mobile app marketplace”
N. Chen, J. Lin, S. Hoi, X. Xiao, and B. Zhang
7
BUG, FEATURE REQUEST, OTHER
“Release planning of mobile apps based on user reviews”
L. Villarroel, G. Bavota, B. Russo, R. Oliveto, and M. Di Penta
8
BUG, FEATURE REQUEST
• the developer has to manually analyse the unstructured groups of
reviews, understand what they talk about and extract actionable change
tasks
• what does a particular cluster talk about? Does it talk about the UI or
about the performance of the app, etc.?
9
What are the mobile specific topics
users talk about in their reviews?
10
manual analysis of ~1600
reviews
11
Hmmm...
Mm No…
This is IT
Nope Nopity nope
• not all reviews are useful
12
Hmmm...
Mm No…
This is IT
Nope Nopity nope
Sucks Way to many errors
0 stars Garbage.
problem bro
Garbage Bla bla bla
• not all reviews are useful
• some are even offensive
13
Pretty close to perfect, this app is
way better than any comic book
reader I've ever used. It's small, it
operates fast, and the interface is
incredibly clean and simple.
• others can provide valuable
information for the developer
14
Pretty close to perfect, this app is
way better than any comic book
reader I've ever used. It's small,
it operates fast, and the
interface is incredibly clean and
simple.
Resources
Usage
15
For info (in case dev not already
aware!), there is a graphical
glitch when scrolling output in
marshmallow on a nexus 5.
Compatibility
Usage
Complaint
16
Building the taxonomy
Content analysis in 2 passes:
• first pass: start with an empty list of categories
• analyse each review and add a new category
(including definition and keywords) if necessary
• label the review with all the matching categories
• second pass: revisit the list of reviews and label
them with the appropriate categories
17
High Level Taxonomy
Category        Description
Compatibility   mentions the OS, mobile device or a specific hardware component.
Usage           talks about the UI or the usability of the app.
Resources       mentions the app’s influence on the battery and memory usage or the performance of the app/phone.
Pricing         statements mentioning the license model or the price of the app.
Protection      statements referring to security or privacy issues.
Complaint       the user reports or complains about an issue with the app.
18
specialise the taxonomy
further
19
Liked it and worked very well in
lollipop, but not MM The plugins
don't refresh, manual navigation
to next image doesn't work.
Some plugins give error.
Altogether seems broken after
MM update on Note 4.
Compatibility
20
Liked it and worked very well in
lollipop, but not MM The plugins
don't refresh, manual navigation
to next image doesn't work.
Some plugins give error.
Altogether seems broken after
MM update on Note 4.
Compatibility
Device
Android Version
21
Low Level Taxonomy
High Level      Low Level Categories
Compatibility   Device, Android Version, Hardware
Usage           App Usability, UI
Resources       Performance, Battery, Memory
Pricing         Licensing, Price
Protection      Security, Privacy
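The two-level taxonomy can be written down as a small lookup structure. The sketch below is illustrative only: the names `TAXONOMY`, `PARENT`, `with_parents` and `category_count` are hypothetical, and the rule that a low-level label implies its high-level parent is an assumption based on the labelled examples in the slides, not something the slides state.

```python
# Two-level taxonomy as listed on the slides; Complaint has no sub-categories.
TAXONOMY = {
    "Compatibility": ["Device", "Android Version", "Hardware"],
    "Usage": ["App Usability", "UI"],
    "Resources": ["Performance", "Battery", "Memory"],
    "Pricing": ["Licensing", "Price"],
    "Protection": ["Security", "Privacy"],
    "Complaint": [],
}

# Reverse index: low-level category -> its high-level parent.
PARENT = {low: high for high, lows in TAXONOMY.items() for low in lows}


def with_parents(labels):
    """Return the labels plus the parent of every low-level label (assumed rule)."""
    expanded = set(labels)
    for label in labels:
        if label in PARENT:
            expanded.add(PARENT[label])
    return expanded


def category_count():
    """Total number of high- and low-level categories."""
    return len(TAXONOMY) + len(PARENT)
```

Counting the entries gives 6 high-level plus 12 low-level categories, which matches the 18 classifiers mentioned on the training slide.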
22
Automated Classification
23
ML Approach
[Diagram: Preprocessing & Feature Extraction → Gradient Boosted Trees Training → Multi-label Classification]
24
Preprocessing & Feature
Extraction
• preprocessing: stop-word removal and stemming
• feature extraction: TF-IDF scores and 2- and 3-gram counts
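A minimal sketch of these two steps, using only the Python standard library. Everything here is an assumption made for illustration: the stop-word list is a tiny sample, `crude_stem` is a naive suffix stripper standing in for a real stemmer, and the TF-IDF variant (relative term frequency times log(N/df)) is one common formulation. The slides do not say which stemmer, stop-word list or weighting the authors used.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list; a real pipeline would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "this", "to", "of", "in", "on"}


def crude_stem(word):
    """Strip a few common suffixes. A stand-in for a real stemmer."""
    for suffix in ("ing", "ed", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def preprocess(text):
    """Lowercase, tokenise, drop stop words, stem."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]


def ngram_counts(tokens, sizes=(2, 3)):
    """Count 2-grams and 3-grams over the token sequence."""
    counts = Counter()
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i : i + n])] += 1
    return counts


def tfidf(docs):
    """TF-IDF per document: tf = count / doc length, idf = log(N / df)."""
    n_docs = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        tf = Counter(doc)
        scores.append(
            {t: (c / len(doc)) * math.log(n_docs / df[t]) for t, c in tf.items()}
        )
    return scores
```

With three short reviews, a term that occurs in only one review (such as "drain") receives a higher score than a term shared by two reviews (such as "battery").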
25
Training
• one-vs-all strategy: separate classifier for each
high and low level category (18 in total)
• used the Gradient Boosted Trees model
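The one-vs-all decomposition can be sketched as follows: each category gets its own binary target vector and its own independent model, and a review receives every label whose model fires. Note that `KeywordStub` is a hypothetical placeholder so the example runs without third-party libraries; it is not a Gradient Boosted Trees model, and all function names are illustrative.

```python
def one_vs_all_targets(labelled_reviews, categories):
    """For each category, build a binary target list: 1 if the review has it."""
    return {
        cat: [1 if cat in labels else 0 for _, labels in labelled_reviews]
        for cat in categories
    }


class KeywordStub:
    """Hypothetical stand-in for a trained binary model (NOT gradient boosted trees).

    It memorises tokens that occur only in positive training reviews and
    predicts 1 when a new review contains any of them.
    """

    def fit(self, texts, targets):
        pos, neg = set(), set()
        for text, y in zip(texts, targets):
            (pos if y else neg).update(text.lower().split())
        self.keywords = pos - neg
        return self

    def predict(self, text):
        return int(any(tok in self.keywords for tok in text.lower().split()))


def train_one_vs_all(labelled_reviews, categories, make_model=KeywordStub):
    """Train one independent binary model per category."""
    texts = [text for text, _ in labelled_reviews]
    targets = one_vs_all_targets(labelled_reviews, categories)
    return {cat: make_model().fit(texts, targets[cat]) for cat in categories}


def classify(models, text):
    """Multi-label prediction: every category whose binary model fires."""
    return {cat for cat, model in models.items() if model.predict(text)}
```

Because the models are independent, a single review can end up with several high- and low-level labels at once, which is what makes the overall classification multi-label.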
26
Multi-label Classification
[Diagram: review → Preprocessing → Feature Extraction → Classification → High & Low Level Categories, e.g. Battery, UI, Complaint, Resources, Usage]
27
Example
RQ2: Does our approach correctly recommend the software
artifacts that need to be modified in order to handle user
requests and complaints?
• 752 user reviews from our dataset
belong to AcDisplay
• analyse Compatibility and
Complaint reviews (61 reviews)
• Complaint and Android Version (22
reviews)
28
Example
“Good but has some issues with Marshmallow I used this on
my old phone and if was flawless and I loved it. I noticed that
sometimes when I had AcDisplay activated I would not be
able to use the fingerprint sensor even after I unlocked
AcDisplay and had to enter a password. This is very frustrating
so I cannot use AcDisplay.”
“Love the design I love the app. It’s super sleek and nice. But
ever since my phone updated to marshmallow it’s stopped
working. Hope it comes back soon.”
“On Marshmallow, the screen is buggy and sometimes shows
the notification shade.”
29
Source Code Localisation
• can we link reviews to the related source code?
• IR methods based on the vector space model (VSM); a hard task, because the vocabulary used by reviews and source code is different
• use additional Android project specific information (e.g. UI functionality is implemented in Activity classes)
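A rough sketch of VSM-based linking under stated assumptions: reviews and source files are turned into term-frequency vectors and compared by cosine similarity, and files whose name contains "Activity" get a boost when the review is labelled UI. The function names, the camelCase tokeniser and the `ui_boost` weight are all hypothetical; the slides only say that Android project structure information is used, not how it is weighted.

```python
import math
import re
from collections import Counter


def tokens(text):
    """Split on non-letters and camelCase boundaries, then lowercase."""
    return [w.lower() for w in re.findall(r"[A-Za-z][a-z]*", text)]


def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def rank_files(review, labels, files, ui_boost=1.5):
    """Rank source files by similarity to a review, best match first.

    files maps a file name to its text (identifiers, comments).
    ui_boost is a hypothetical weight for Activity classes on UI reviews.
    """
    query = Counter(tokens(review))
    ranked = []
    for name, text in files.items():
        score = cosine(query, Counter(tokens(name + " " + text)))
        if "UI" in labels and "activity" in tokens(name):
            score *= ui_boost
        ranked.append((score, name))
    return sorted(ranked, reverse=True)
```

Splitting identifiers such as `unlockWithFingerprint` into plain words is one simple way to narrow the vocabulary gap between reviews and code; it does not remove it, since users and developers often choose different words for the same thing.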
30
Source Code Localisation
[Diagram: User Reviews + App’s Source Code → IR - VSM + Android Project Structure Info → Software Artifacts]
31
Evaluation
RQ1: To what extent does our approach organise reviews
according to meaningful maintenance and evolution tasks
for developers?
RQ2: Does our approach correctly recommend the software
artifacts that need to be modified in order to handle user
requests and complaints?
32
Reviews Source Code
33
Study RQ1
• ~7800 user reviews from 39 apps
34
Study RQ1
• 2 external evaluators
• evaluate 200 reviews for
each category (3600 total)
35
Results RQ1
High Level Category   Precision   Recall   F1 Score
Compatibility         71%         97%      82%
Usage                 89%         94%      91%
Resources             79%         99%      88%
Pricing               85%         97%      90%
Protection            89%         98%      93%
Complaint             85%         80%      82%
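For reference, the F1 score is the harmonic mean of precision and recall. The helper below (the name `f1_score` is illustrative) recomputes it from the rounded percentages in the table; because the inputs are already rounded, a recomputed value can differ from the reported one by a point.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall; 0.0 when both are zero."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)
```

For example, `f1_score(0.71, 0.97)` is about 0.82, in line with the Compatibility row.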
36
Results RQ1
High Level Category   Low Level Category   Precision   Recall   F1 Score
Compatibility         Device               85%         98%      91%
Compatibility         OS Version           89%         86%      87%
Compatibility         Hardware             61%         95%      74%
Usage                 App Usability        92%         91%      91%
Usage                 UI                   83%         93%      88%
Resources             Performance          64%         97%      77%
Resources             Battery              78%         95%      86%
Resources             Memory               68%         95%      79%
Pricing               Licensing            91%         98%      94%
Pricing               Price                85%         96%      90%
Protection            Security             87%         98%      92%
Protection            Privacy              83%         96%      89%
37
Results RQ1
Our approach is able to classify reviews with high precision
and recall according to the mobile specific topics we derived.
The most important categories are Usage, Resources and
Compatibility.
38
Study RQ2
• 1 external evaluator
• 91 user reviews from 2 apps
39
Results RQ2
Quality of Reviews   Precision   Recall   F1 Score
Difficult to Link    41%         83%      55%
Easier to Link       52%         79%      63%
All                  51%         79%      62%
40
Results RQ2
Our approach achieves promising results in recommending
related software artifacts for specific user reviews. Furthermore,
better-quality reviews are easier to link than lower-quality ones.
41
Conclusion & Future Work
• reviews can be classified with high precision and recall
using machine learning according to mobile specific
topics
• linking reviews to source code using textual similarity
based methods is difficult
• future work: summarise reviews, improve localisation
(static analysis)
42
Discussion
What mechanisms can we adopt for enabling a reliable
and practical solution for code localisation?

Analyzing Reviews and Code of Mobile Apps for Better Release Planning

  • 1. Analyzing Reviews and Code of Mobile Apps for Better Release Planning. Adelina Ciurumelea, Andreas Schaufenbühl, Sebastiano Panichella, Harald C. Gall. software evolution & architecture lab
  • 2. Extremely Popular Apps: 8,087,067 reviews; 3,505,905 reviews; 38,742,600 reviews
  • 4. The number of reviews is large compared to the available development resources.
  • 5. Importance of reviews • reviews contain valuable feedback directly from the users • users often report bugs, comment on the user experience and request features • the review content influences the number of downloads
  • 6. INFORMATIVE / NON-INFORMATIVE. “AR-Miner: Mining informative reviews for developers from mobile app marketplace” N. Chen, J. Lin, S. Hoi, X. Xiao, and B. Zhang
  • 7. BUG / FEATURE REQUEST / OTHER. “Release planning of mobile apps based on user reviews” L. Villarroel, G. Bavota, B. Russo, R. Oliveto, and M. Di Penta
  • 8. BUG / FEATURE REQUEST • the developer has to manually analyse the unstructured groups of reviews, understand what they talk about and extract actionable change tasks • what does a particular cluster talk about? Does it talk about the UI or about the performance of the app, etc.?
  • 9. What are the mobile specific topics users talk about in their reviews?
  • 10. manual analysis of ~1600 reviews
  • 11. Hmmm... / Mm No… / This is IT / Nope Nopity nope • not all reviews are useful
  • 12. Hmmm... / Mm No… / This is IT / Nope Nopity nope / Sucks Way to many errors / 0 stars Garbage. / problem bro / Garbage Bla bla bla • not all reviews are useful • some are even offensive
  • 13. “Pretty close to perfect, this app is way better than any comic book reader I've ever used. It's small, it operates fast, and the interface is incredibly clean and simple.” • others can provide valuable information for the developer
  • 14. “Pretty close to perfect, this app is way better than any comic book reader I've ever used. It's small, it operates fast, and the interface is incredibly clean and simple.” Labels: Resources, Usage
  • 15. “For info (in case dev not already aware!), there is a graphical glitch when scrolling output in marshmallow on a nexus 5.” Labels: Compatibility, Usage, Complaint
  • 16. Building the taxonomy. Content analysis in 2 passes: • start with an empty list of categories • analyse each review and add a new category (including definition and keywords) if necessary • label the review with all the matching categories • second pass: revisit the list of reviews and label them with the appropriate categories
  • 17. High Level Taxonomy
      Category: Description
      Compatibility: mentions the OS, mobile device or a specific hardware component.
      Usage: talks about the UI or the usability of the app.
      Resources: mentions the app’s influence on the battery and memory usage or the performance of the app/phone.
      Pricing: statements mentioning the license model or the price of the app.
      Protection: statements referring to security or privacy issues.
      Complaint: the user reports or complains about an issue with the app.
  • 19. “Liked it and worked very well in lollipop, but not MM The plugins don't refresh, manual navigation to next image doesn't work. Some plugins give error. Altogether seems broken after MM update on Note 4.” Label: Compatibility
  • 20. “Liked it and worked very well in lollipop, but not MM The plugins don't refresh, manual navigation to next image doesn't work. Some plugins give error. Altogether seems broken after MM update on Note 4.” Labels: Compatibility, Device, Android Version
  • 21. Low Level Taxonomy
      High Level: Low Level Categories
      Compatibility: Device, Android Version, Hardware
      Usage: App Usability, UI
      Resources: Performance, Battery, Memory
      Pricing: Licensing, Price
      Protection: Security, Privacy
  • 23. ML Approach: Preprocessing & Feature Extraction; Training (Gradient Boosted Trees); Multi-label Classification
  • 24. Preprocessing & Feature Extraction • preprocessing: stop word removal and stemming • feature extraction: TF-IDF scores and 2-gram and 3-gram counts
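The preprocessing and feature extraction steps on this slide can be sketched roughly as follows. This is a minimal, standard-library-only illustration and not the authors' implementation: the stop-word list, the `crude_stem` suffix stripper (a stand-in for a real stemmer such as Porter's) and the exact TF-IDF weighting are all assumptions, since the slides do not specify them.

```python
import math
import re
from collections import Counter

# Tiny illustrative stop-word list (assumption; the slides do not name one).
STOP_WORDS = {"the", "a", "an", "and", "is", "it", "in", "on", "of", "to", "this", "my"}

def crude_stem(token):
    """Placeholder stemmer: strips a few common suffixes. Not the Porter algorithm."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Lowercase, tokenize, drop stop words, then stem."""
    tokens = re.findall(r"[a-z0-9']+", text.lower())
    return [crude_stem(t) for t in tokens if t not in STOP_WORDS]

def ngram_counts(tokens, sizes=(2, 3)):
    """Count 2-grams and 3-grams over the preprocessed tokens."""
    counts = Counter()
    for n in sizes:
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts

def tfidf(docs):
    """docs: list of token lists. Returns one {term: score} dict per document."""
    n_docs = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        term_freq = Counter(doc)
        vectors.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in term_freq.items()
        })
    return vectors

# Made-up example reviews, for illustration only.
reviews = ["The battery drains fast on Marshmallow",
           "The screen is buggy on Marshmallow"]
tokens = [preprocess(r) for r in reviews]
vectors = tfidf(tokens)
bigrams_and_trigrams = ngram_counts(tokens[0])
```

With this weighting, a term that occurs in every review (here "marshmallow") gets a TF-IDF score of zero, while terms specific to one review get a positive score.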
  • 25. Training • one-vs-all strategy: a separate classifier for each high and low level category (18 in total) • used the Gradient Boosted Trees model
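The one-vs-all setup can be illustrated by turning each review's label set into one binary target vector per category. The sketch below only shows that decomposition, under the assumption that Complaint has no low-level categories (as in the taxonomy slides); training the Gradient Boosted Trees model on each target vector is not shown, and the helper names are hypothetical.

```python
# High- and low-level categories as listed in the taxonomy slides.
TAXONOMY = {
    "Compatibility": ["Device", "Android Version", "Hardware"],
    "Usage": ["App Usability", "UI"],
    "Resources": ["Performance", "Battery", "Memory"],
    "Pricing": ["Licensing", "Price"],
    "Protection": ["Security", "Privacy"],
    "Complaint": [],
}

def all_categories(taxonomy):
    """Flatten the taxonomy into one list of high- and low-level categories."""
    categories = []
    for high_level, low_levels in taxonomy.items():
        categories.append(high_level)
        categories.extend(low_levels)
    return categories

def one_vs_all_targets(label_sets, categories):
    """For each category, build a 0/1 target vector over all reviews."""
    return {
        category: [1 if category in labels else 0 for labels in label_sets]
        for category in categories
    }

# Label sets of three example reviews shown earlier in the deck.
label_sets = [
    {"Compatibility", "Device", "Android Version"},
    {"Resources", "Usage"},
    {"Compatibility", "Usage", "Complaint"},
]
categories = all_categories(TAXONOMY)
targets = one_vs_all_targets(label_sets, categories)
```

Each entry of `targets` would then be the training signal for one independent binary classifier, so a review can receive several labels at once.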
  • 26. Multi-label Classification. Pipeline: Preprocessing → Feature Extraction → Classification → High & Low Level Categories (example output labels: Battery, UI, Complaint, Resources, Usage)
  • 27. Example. RQ2: Does our approach correctly recommend the software artifacts that need to be modified in order to handle user requests and complaints? • 752 user reviews from our dataset belong to AcDisplay • analyse Compatibility and Complaint reviews (61 reviews) • Complaint and Android Version (22 reviews)
  • 28. Example. “Good but has some issues with Marshmallow I used this on my old phone and if was flawless and I loved it. I noticed that sometimes when I had AcDisplay activated I would not be able to use the fingerprint sensor even after I unlocked AcDisplay and had to enter a password. This is very frustrating so I cannot use AcDisplay.” “Love the design I love the app. It’s super sleek and nice. But ever since my phone updated to marshmallow it’s stopped working. Hope it comes back soon.” “On Marshmallow, the screen is buggy and sometimes shows the notification shade.”
  • 29. Source Code Localisation • can we link reviews to the related source code? • IR methods based on the VSM (hard task: the vocabulary used by reviews and source code is different) • use additional Android project specific information (e.g. UI functionality is implemented in Activity classes)
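A rough sketch of VSM-based linking: the review and each source file are turned into term vectors and ranked by cosine similarity. Everything specific here is an assumption made for illustration: the class names and file contents are invented, raw term frequencies are used instead of whatever weighting the authors chose, and the `activity_boost` factor is a hypothetical way of using the hint that UI functionality lives in Activity classes.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Split camelCase identifiers, then lowercase and keep alphabetic terms."""
    text = re.sub(r"([a-z])([A-Z])", r"\1 \2", text)
    return re.findall(r"[a-z]+", text.lower())

def cosine(a, b):
    """Cosine similarity between two term-frequency vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = (math.sqrt(sum(v * v for v in a.values()))
            * math.sqrt(sum(v * v for v in b.values())))
    return dot / norm if norm else 0.0

def rank_artifacts(review, artifacts, ui_related=False, activity_boost=1.5):
    """Rank source artifacts by similarity to the review.

    activity_boost is a hypothetical heuristic: for reviews labelled as
    UI-related, classes whose name ends in "Activity" are weighted higher.
    """
    query = Counter(tokenize(review))
    scored = []
    for name, source in artifacts.items():
        score = cosine(query, Counter(tokenize(name + " " + source)))
        if ui_related and name.endswith("Activity"):
            score *= activity_boost
        scored.append((name, score))
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

# Invented artifacts standing in for an app's source files.
artifacts = {
    "FingerprintActivity": "unlock screen with fingerprint sensor",
    "NotificationService": "post notification to status bar",
}
ranking = rank_artifacts("cannot use the fingerprint sensor after unlock",
                         artifacts, ui_related=True)
```

The vocabulary mismatch mentioned on the slide shows up directly in this model: an artifact that shares no terms with the review gets a similarity of zero, no matter how relevant it is.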
  • 30. Source Code Localisation. Inputs: User Reviews, App’s Source Code, Android Project Structure Info; method: IR (VSM); output: Software Artifacts
  • 31. Evaluation. RQ1: To what extent does our approach organise reviews according to meaningful maintenance and evolution tasks for developers? RQ2: Does our approach correctly recommend the software artifacts that need to be modified in order to handle user requests and complaints?
  • 33. Study RQ1 • ~7800 user reviews from 39 apps
  • 34. Study RQ1 • 2 external evaluators • evaluate 200 reviews for each category (3600 total)
  • 35. Results RQ1
      High Level Category | Precision | Recall | F1 Score
      Compatibility | 71% | 97% | 82%
      Usage | 89% | 94% | 91%
      Resources | 79% | 99% | 88%
      Pricing | 85% | 97% | 90%
      Protection | 89% | 98% | 93%
      Complaint | 85% | 80% | 82%
  • 36. Results RQ1
      High Level Category | Low Level Category | Precision | Recall | F1 Score
      Compatibility | Device | 85% | 98% | 91%
      Compatibility | OS Version | 89% | 86% | 87%
      Compatibility | Hardware | 61% | 95% | 74%
      Usage | App Usability | 92% | 91% | 91%
      Usage | UI | 83% | 93% | 88%
      Resources | Performance | 64% | 97% | 77%
      Resources | Battery | 78% | 95% | 86%
      Resources | Memory | 68% | 95% | 79%
      Pricing | Licensing | 91% | 98% | 94%
      Pricing | Price | 85% | 96% | 90%
      Protection | Security | 87% | 98% | 92%
      Protection | Privacy | 83% | 96% | 89%
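As a quick sanity check on how the three columns relate, the F1 score is the harmonic mean of precision and recall. The sketch below recomputes it from the high-level precision and recall values of slide 35; because those inputs are already rounded to whole percentages, the recomputed value is only expected to agree with the reported one to within one percentage point.

```python
def f1_score(precision, recall):
    """Harmonic mean of precision and recall (both given as fractions in [0, 1])."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# (precision, recall, reported F1) for the high-level categories of slide 35.
reported = {
    "Compatibility": (0.71, 0.97, 0.82),
    "Usage": (0.89, 0.94, 0.91),
    "Resources": (0.79, 0.99, 0.88),
    "Pricing": (0.85, 0.97, 0.90),
    "Protection": (0.89, 0.98, 0.93),
    "Complaint": (0.85, 0.80, 0.82),
}

# Recompute F1 per category as a fraction.
recomputed = {name: f1_score(p, r) for name, (p, r, _) in reported.items()}
```

The pattern in the tables, with recall well above precision for most categories, means the classifiers rarely miss a relevant review but do flag some irrelevant ones.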
  • 37. Results RQ1. Our approach is able to classify reviews with high precision and recall according to the mobile specific topics we derived. The most important categories are Usage, Resources and Compatibility.
  • 38. Study RQ2 • 1 external evaluator • 91 user reviews from 2 apps
  • 39. Results RQ2
      Quality of Reviews | Precision | Recall | F1 Score
      Difficult to Link | 41% | 83% | 55%
      Easier to Link | 52% | 79% | 63%
      All | 51% | 79% | 62%
  • 40. Results RQ2. Our approach achieves promising results in recommending related software artifacts for specific user reviews. Better quality reviews are easier to link than lower quality ones.
  • 41. Conclusion & Future Work • reviews can be classified with high precision and recall using machine learning according to mobile specific topics • linking reviews to source code using textual similarity based methods is difficult • future work: summarise reviews, improve localisation (static analysis)
  • 42. Discussion. What mechanisms can we adopt for enabling a reliable and practical solution for code localisation?