Natural Language Processing for
Requirements Engineering
Tasks, Techniques, Tools, and Technologies
Alessio Ferrari, CNR-ISTI, Italy
Liping Zhao, University of Manchester, U.K.
Waad Alhoshan, IMSIU, Saudi Arabia
Technical Briefing
© 2021 Alessio Ferrari, Liping Zhao and Waad Alhoshan. Cite this presentation as: A. Ferrari, L. Zhao and W. Alhoshan, "NLP for Requirements Engineering: Tasks, Techniques, Tools,
and Technologies," 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), 2021, pp. 322-323, doi:
10.1109/ICSE-Companion52605.2021.00137.
Objectives
● Provide an overview of NLP tasks for RE (NLP4RE)
● Present the results of a mapping study on NLP4RE
● Point to relevant tools and resources to practice NLP4RE
● Introduce you to transfer learning for NLP4RE
What is Natural Language Processing?
Technologies enabling extraction and manipulation of information
from natural language (NL) - English, Italian, Swedish, etc.
Wait...What is a Requirement?
● Jackson and Zave: Condition over phenomena of the environment that we
want to make true by developing the system
● Lamsweerde: Goal under the responsibility of a single agent of the
software-to-be
● ISO/IEC/IEEE 29148 Standard: Statement which translates or expresses a
need and its associated constraints and conditions
● Wikipedia: Singular documented physical or functional Need that a particular
design, product or process aims to satisfy
● No agreed INTENSIONAL definition
● Some confusion on the types of requirements (e.g., user, system, software,
business, functional, non-functional), the concept of specification, etc.
● So, let us give some EXAMPLES, and give an EXTENSIONAL definition
This, and anything that resembles
this, is a requirement!
Why are requirements so special?
● Requirements are heterogeneous (specs, app reviews, legal docs)
● Requirements specifications use a more restricted vocabulary (about
half of generic texts), and longer sentences
● App reviews are unstructured and informal
● 62% of the words used in reqs do not appear in generic texts
This suggests that NLP tools trained on generic texts
may need to be tailored for requirements
Ferrari, A., Spagnolo, G. O., & Gnesi, S. PURE: A dataset
of public requirements documents. IEEE RE 2017.
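The vocabulary comparison above can be sketched in a few lines. This is only an illustration of the kind of measurement involved, not the procedure used for the PURE dataset: the function names and toy texts are invented here, and the ratio is computed over distinct words, which is an assumption.

```python
import re

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

def unseen_word_ratio(domain_text, generic_text):
    """Share of distinct domain words that never occur in the generic text."""
    domain_vocab = set(tokenize(domain_text))
    generic_vocab = set(tokenize(generic_text))
    if not domain_vocab:
        return 0.0
    return len(domain_vocab - generic_vocab) / len(domain_vocab)

# Toy inputs, invented for illustration only.
requirements = "The system shall encrypt the telemetry payload before transmission."
generic = "The dog ran before the rain. The system was down."

ratio = unseen_word_ratio(requirements, generic)
print(f"Share of requirement words unseen in the generic sample: {ratio:.3f}")
```

On real data the generic side would be a large reference corpus rather than two sentences.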
…A non-exhaustive list
Observations
● Most RE problems could be solved top-down
● I can enforce tracing when writing requirements
● I can use constrained natural languages to improve quality
● I can tag classes in advance
● I can write a glossary in advance
● Unfortunately, this does not happen, that’s why we need NLP
● We need NLP also to recover from errors when RE problems
are addressed top-down by fallible humans
NLP4RE
Techniques, Tools &
Resources
Zhao, L., Alhoshan, W., Ferrari, A., et al. Natural Language Processing for Requirements
Engineering: A Systematic Mapping Study. ACM Comput. Surv. 54, 3, Article 55 (April 2021),
41 pages. DOI:https://doi.org/10.1145/3444689
The Mapping Study
[Figure: the NLP4RE landscape addressed by research questions RQ1 to RQ5. Areas shown: research status, research focus, publication status, technology development, NLP4RE tools, NLP technologies]
Publication Status of NLP4RE Literature
Key takeaways:
● 404 relevant papers in 4 decades
● 1st papers published in 1983
● Fast growth since 2004
● NLP4RE an active and thriving
research area in RE
State of Empirical Research in NLP4RE
The majority of NLP4RE studies (> 67%) are solution proposals, typically involving the
development of a novel solution, a new method or a new technique
Only a small number of studies (7%) are conducted in an industrial setting via a
case study or field study
About a third (35%) of the solution proposals are not evaluated but only illustrated
using examples, discussion or simulation
Remaining solution proposals are evaluated in a lab experiment using either
students or software subjects
Limited evidence of industrial uptake of NLP4RE results
A typical NLP4RE study is a solution proposal, possibly evaluated internally
through an experiment or example, but without evaluation in the real world
State of Tool Development in NLP4RE Research
Key takeaways:
● 130 tools developed over 30 years
● Only 17 (13%) of them available online
● No evidence these 17 tools still in use
● State of tool development very poor - tools
for NLP4RE don’t exist!
NLP Technologies Used in NLP4RE Research
● NLP technique: an underlying technique
for performing a basic NLP task (e.g.,
POS tagging, parsing or tokenization)
● NLP resource: a linguistic data resource
for supporting NLP tools
- Lexical resources (e.g., WordNet,
VerbNet)
- Text corpus/dataset (e.g., British
National Corpus and Brown Corpus)
● NLP tool: a software system or library
supporting NLP pipelines (e.g., Stanford
CoreNLP, NLTK or OpenNLP)
Frequently Used NLP Techniques in NLP4RE Research
Key takeaways:
● Only 32/140 (< 1 in 4) NLP techniques
in frequent use
● 90% of these frequently used NLP
techniques are word/syntactic based
● Baseline techniques, e.g., POS
tagging, tokenization, syntactic
parsing most in use
[Chart: NLP techniques by frequency of use]
A Typical NLP Pipeline for Processing Requirements Text
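The pipeline figure is not reproduced in this transcript. As a rough stand-in, the sketch below chains a few of the baseline steps named in this briefing (sentence splitting, tokenization) with stop-word removal and a crude suffix stripper. The choice and order of steps, the stop-word list, and all function names are assumptions made for illustration. A real pipeline would rely on a toolkit such as NLTK or Stanford CoreNLP and would typically add POS tagging and parsing.

```python
import re

# Tiny illustrative stop-word list; real toolkits ship much larger ones.
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "in", "is"}

def split_sentences(text):
    """Split on sentence-final punctuation followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def tokenize(sentence):
    """Return lowercase alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())

def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]

def crude_stem(token):
    """Strip a few common English suffixes (a rough stand-in for a real stemmer)."""
    for suffix in ("ing", "ed", "es", "s"):
        if token.endswith(suffix) and len(token) - len(suffix) >= 3:
            return token[: -len(suffix)]
    return token

def preprocess(text):
    """Run the whole pipeline and return one list of stems per sentence."""
    return [
        [crude_stem(t) for t in remove_stop_words(tokenize(s))]
        for s in split_sentences(text)
    ]

doc = "The system shall log all failed logins. Users are notified of updates."
for stems in preprocess(doc):
    print(stems)
```

The suffix stripper over-stems some words (for example "notified" becomes "notifi"), which is one reason established stemmers or lemmatizers are preferred in practice.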
Frequently Used NLP Tools in NLP4RE Research
[Chart: NLP tools by frequency of use]
Key takeaways:
● 1 in 5 (14/66) NLP tools in frequent
use
● The top 5 most used tools –
Stanford CoreNLP, GATE, NLTK,
OpenNLP, and Weka – are toolkits,
supporting multiple NLP pipelines
Top Most Used NLP Tools in NLP4RE Research
● Stanford CoreNLP (https://stanfordnlp.github.io/CoreNLP/): supports
common core NLP tasks, open source, datasets not available
● OpenNLP (https://opennlp.apache.org/): supports both lower-level (POS,
chunking) and higher-level NLP tasks (language detection, classification),
open source, datasets available
● NLTK (https://www.nltk.org): supports a wide range of NLP tasks, open
source + over 50 datasets available
● GATE (https://gate.ac.uk): supports text processing tasks such as
information extraction and text annotation
● WEKA (https://www.cs.waikato.ac.nz/ml/weka/): supports data mining tasks
such as data pre-processing, classification, regression, clustering,
association rules, and visualization, open source
Frequently Used NLP Resources in NLP4RE Research
[Chart: NLP resources by frequency of use]
Key takeaways:
● 1 in 2 (13/25) NLP resources in
frequent use
● Lexical resources used most
(i.e., WordNet, VerbNet)
● Only two RE related datasets:
MODIS and CM-1
Other RE Related NLP Resources
● PROMISE Software Engineering Repository
(http://promise.site.uottawa.ca/SERepository/datasets-page.html)
○ Created by Sayyad and Menzies from University of Ottawa, Canada in 2005
○ Contains 20 publicly available datasets (including MODIS and CM-1)
● PURE Dataset (https://zenodo.org/record/1414117)
○ Created by Alessio Ferrari, et al. in 2017
○ Contains 79 publicly available requirements documents collected from the Web
● User Stories (https://data.mendeley.com/datasets/7zbk8zsd8y/1)
○ Created by Fabiano Dalpiaz in 2018
○ Contains 22 datasets, each with 50+ requirements, expressed as user stories
● FN-RE (https://zenodo.org/record/1291660)
○ Created by Waad Alhoshan et al. in 2018
○ Contains a dataset of requirements documents annotated with FrameNet semantics
● App Reviews
○ 13 annotated datasets are reported in this paper: Dabrowski, Jacek, et al. "App Review Analysis for
Software Engineering: A Systematic Literature Review." University College London, Tech. Rep (2020).
○ Mobile App Market (https://sites.google.com/site/appsimilarity/)
Emerging Trends in NLP4RE Research – Some Observations
● 2007 onwards: Consistent rise in developing ML-based approaches (using ML algorithms such as SVM, DT, NB, KNN) for automatic requirements classification
● 2013 onwards: Increase in using non-traditional requirements texts (e.g., app reviews and user stories) in NLP4RE research
● 2020: Upsurge in developing DL-based approaches (e.g., BERT and Bi-LSTM) for automatic requirements classification
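To make the ML-based classification trend concrete, here is a minimal sketch of one of the algorithms named above, Naive Bayes (NB), written from scratch so that it runs without extra libraries. The training requirements and the US/SE labels are invented for illustration. Published approaches use far larger labelled datasets and established ML toolkits.

```python
import math
import re
from collections import Counter, defaultdict

def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())

class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for c in self.word_counts.values() for w in c}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label

# Invented toy requirements, labelled US (usability) or SE (security).
train = [
    ("The interface shall be easy to learn for new users", "US"),
    ("Users shall complete a booking within three clicks", "US"),
    ("The system shall encrypt all stored passwords", "SE"),
    ("Only authenticated administrators shall access audit logs", "SE"),
]
model = NaiveBayes().fit(*zip(*train))
print(model.predict("Passwords shall be encrypted in transit"))
```

Note that "encrypted" is ignored at prediction time because only "encrypt" was seen in training. This is the kind of gap that stemming, or the contextual models discussed later, helps to close.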
Emerging Trends in NLP4RE Research – Some Observations
These trends indicate ...
NLP4RE
Transfer Learning
Tutorial on using BERT
Transfer Learning & Humans
Transfer Learning & Machines
To train a model to learn from one type of
problem and leverage that model (i.e., the
knowledge) to solve a new BUT related
problem.
Pan, Sinno Jialin, and Qiang Yang. "A survey on transfer learning." IEEE
Transactions on knowledge and data engineering 22.10 (2009): 1345-1359.
Benefit: less training data (dataset) and less overall
computing power
Transfer Learning for NLP: Language Model
A language model is a way to represent the relations between words in a language
Example, words related to "Woman": queen (8%), king (3%), prince (0.2%), princess (5%), daughter (45%), son (0.1%), father (2%), mother (25%)
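One simple way to read "relations between words" is as probabilities of which word follows another. The sketch below is a count-based bigram model, a classical stand-in and not one of the neural models discussed next. The corpus is invented, and the resulting probabilities are unrelated to the percentages in the example above.

```python
from collections import Counter, defaultdict

def train_bigram_model(sentences):
    """Count, for each word, which words follow it in the corpus."""
    followers = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.lower().split()
        for current, nxt in zip(words, words[1:]):
            followers[current][nxt] += 1
    return followers

def next_word_probabilities(followers, word):
    """Return P(next word | word) as relative frequencies."""
    counts = followers.get(word.lower(), Counter())
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()} if total else {}

# Invented toy corpus.
corpus = [
    "the woman is a mother",
    "the woman is a daughter",
    "the woman is a queen",
    "the woman is a daughter",
]
bigrams = train_bigram_model(corpus)
probabilities = next_word_probabilities(bigrams, "a")
print(probabilities)
```

A bigram model only looks one word back, so it cannot use wider context.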
Neural Language Models
(Contextual Embedding)
Designed to overcome ambiguity
and language variations
Ex. Transformer-based models such as BERT,
GPT-2 & GPT-3, XLNet, and more!
BERT Language Model
Bi-directional Encoder Representations from Transformers
Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional
transformers for language understanding." arXiv preprint
arXiv:1810.04805 (2018).
A pre-trained LM originally developed
by Google AI, also used in the Google
search engine to better interpret users' queries
BERTBase
→ 12 encoder layers
BERTLarge
→ 24 encoder layers
Vaswani, Ashish, et al. "Attention is all you need." arXiv
preprint arXiv:1706.03762 (2017).
BERT Language Model
Bi-directional Encoder Representations from Transformers
So, what can we do with a BERT model?
● Use the pre-trained BERT model: use it as a classifier, or extract features and train a new classifier
● Fine-tune the BERT model
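The "use it as a classifier" option can be sketched with the Hugging Face `transformers` zero-shot pipeline. Several things here are assumptions rather than the briefing's own code: the model name is a commonly used NLI checkpoint and is not a BERT model, and the helper and function names are hypothetical. Running `demo()` requires `transformers`, a backend such as PyTorch, and a model download.

```python
def build_zero_shot_classifier(model_name="facebook/bart-large-mnli"):
    """Create a zero-shot classifier (downloads the model on first use)."""
    from transformers import pipeline  # imported lazily: heavy dependency
    return pipeline("zero-shot-classification", model=model_name)

def top_label(result):
    """Pick the highest-scoring label from a zero-shot result dict.

    The pipeline returns a dict with parallel "labels" and "scores" lists.
    """
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])

def demo():
    """Classify one invented requirement without any task-specific training."""
    classifier = build_zero_shot_classifier()
    requirement = "The system shall lock the account after five failed logins."
    result = classifier(requirement, candidate_labels=["usability", "security"])
    return top_label(result)
```

Calling `demo()` returns a `(label, score)` pair. No labelled requirements are needed, which is the appeal of this option, but accuracy depends on how well the label names match what the model has learned.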
NLP Transfer Learning for RE: Getting Started
Problem to Solve:
Classifying a set of non-functional requirements, distinguishing usability from security
aspects.
The dataset can be downloaded from https://github.com/tobhey/NoRBERT
Part 1: Using pre-trained BERT with Zero-Shot Learning Classifier
Part 2: Using pre-trained BERT model to extract features and train an LR classifier
Part 3: Fine-tune BERT model and use it in a new classifier
https://colab.research.google.com/drive/158H-lEJE1pc-xHc1ISBAKGDHMt_eg4Gn?usp=sharing
https://colab.research.google.com/drive/1B_5ow3rvS0Qz1y-KyJtlMNnmgmx9w3kJ?usp=sharing
https://colab.research.google.com/drive/1Xrm0gNaa41YwlM5g2CRYYXcRvpbDnTRT?usp=sharing
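The feature-extraction route (Part 2) might look roughly like the sketch below. It is not taken from the linked notebooks. It assumes that "LR" means logistic regression, that the `bert-base-uncased` checkpoint is used, and that the [CLS] vector serves as the sentence feature. The helper names are hypothetical, and `transformers`, PyTorch, and scikit-learn must be installed to call `embed` or `train_lr`.

```python
def split_examples(examples):
    """Turn (text, label) pairs into parallel lists of texts and labels."""
    texts = [text for text, _ in examples]
    labels = [label for _, label in examples]
    return texts, labels

def embed(texts, model_name="bert-base-uncased"):
    """Return one [CLS] vector per text from a frozen pre-trained BERT."""
    import torch
    from transformers import AutoModel, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    model.eval()
    batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")
    with torch.no_grad():  # BERT weights are not updated
        output = model(**batch)
    # Position 0 of the last hidden layer is the [CLS] token.
    return output.last_hidden_state[:, 0, :].numpy()

def train_lr(examples):
    """Fit a logistic regression classifier on BERT features."""
    from sklearn.linear_model import LogisticRegression

    texts, labels = split_examples(examples)
    classifier = LogisticRegression(max_iter=1000)
    classifier.fit(embed(texts), labels)
    return classifier
```

A trained classifier is then applied with `train_lr(examples).predict(embed(new_texts))`. Only the small logistic regression model is trained, so this route is much cheaper than fine-tuning BERT itself (Part 3).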
Transfer Learning for NLP4RE Tasks: Key Takeaways
● On-the-fly requirement categorization/classification
● Auto-completing requirements at requirements elicitation phase
● Enabling re-usability of requirements from large repositories
Language models assist in identifying contextual information ...
This is very useful for NLP4RE tasks
“Unfortunately, LMs trained on unfiltered text corpora suffer from degenerate and
biased behaviour.”
Using the pre-trained model directly might not get the best out of these language
models ⇒ Fine-tuning is highly encouraged for domain-specific tasks
https://github.com/tobhey/NoRBERT
Schramowski, Patrick, et al. "Language Models have a Moral
Dimension." arXiv preprint arXiv:2103.11790 (2021).
Progress Made: NLP4RE in 2018
From my previous technical
briefing at ICSE 2018
Progress Made: NLP4RE in 2021
Some tiny but important
improvements, especially in
Classification and Feedback Analysis
What’s Next?
Requirements are NOT normally expressed in natural language...
We need to provide means to analyse multi-modal requirements
State of Empirical Studies in NLP4RE
The majority of NLP4RE studies (> 67%) are lab based with experiments
using either human or software subjects
A typical NLP4RE study is a solution proposal, possibly evaluated internally
through an experiment or example, but without evaluation in the real world
Only a small number (7%) are conducted in an industrial setting via a
case study or field study
About 1 in 3 of the studies are not evaluated, as they are only
illustrated through an example, discussion or simulation
Limited evidence of industrial uptake of NLP4RE results

+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...Health
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comFatema Valibhai
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsArshad QA
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsAndolasoft Inc
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesVictorSzoltysek
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...panagenda
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...harshavardhanraghave
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...OnePlan Solutions
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnAmarnathKambale
 

Último (20)

The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
The Guide to Integrating Generative AI into Unified Continuous Testing Platfo...
 
Right Money Management App For Your Financial Goals
Right Money Management App For Your Financial GoalsRight Money Management App For Your Financial Goals
Right Money Management App For Your Financial Goals
 
Exploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdfExploring the Best Video Editing App.pdf
Exploring the Best Video Editing App.pdf
 
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) SolutionIntroducing Microsoft’s new Enterprise Work Management (EWM) Solution
Introducing Microsoft’s new Enterprise Work Management (EWM) Solution
 
10 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 202410 Trends Likely to Shape Enterprise Technology in 2024
10 Trends Likely to Shape Enterprise Technology in 2024
 
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected WorkerHow To Troubleshoot Collaboration Apps for the Modern Connected Worker
How To Troubleshoot Collaboration Apps for the Modern Connected Worker
 
How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...How to Choose the Right Laravel Development Partner in New York City_compress...
How to Choose the Right Laravel Development Partner in New York City_compress...
 
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdfintroduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
introduction-to-automotive Andoid os-csimmonds-ndctechtown-2021.pdf
 
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdfLearn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
Learn the Fundamentals of XCUITest Framework_ A Beginner's Guide.pdf
 
8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students8257 interfacing 2 in microprocessor for btech students
8257 interfacing 2 in microprocessor for btech students
 
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
+971565801893>>SAFE AND ORIGINAL ABORTION PILLS FOR SALE IN DUBAI AND ABUDHAB...
 
HR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.comHR Software Buyers Guide in 2024 - HRSoftware.com
HR Software Buyers Guide in 2024 - HRSoftware.com
 
Software Quality Assurance Interview Questions
Software Quality Assurance Interview QuestionsSoftware Quality Assurance Interview Questions
Software Quality Assurance Interview Questions
 
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICECHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
CHEAP Call Girls in Pushp Vihar (-DELHI )🔝 9953056974🔝(=)/CALL GIRLS SERVICE
 
How To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.jsHow To Use Server-Side Rendering with Nuxt.js
How To Use Server-Side Rendering with Nuxt.js
 
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM TechniquesAI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
AI Mastery 201: Elevating Your Workflow with Advanced LLM Techniques
 
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
W01_panagenda_Navigating-the-Future-with-The-Hitchhikers-Guide-to-Notes-and-D...
 
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
Reassessing the Bedrock of Clinical Function Models: An Examination of Large ...
 
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
Tech Tuesday-Harness the Power of Effective Resource Planning with OnePlan’s ...
 
VTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learnVTU technical seminar 8Th Sem on Scikit-learn
VTU technical seminar 8Th Sem on Scikit-learn
 

Natural language processing for requirements engineering: ICSE 2021 Technical Briefing

  • 1. Natural Language Processing for Requirements Engineering Tasks, Techniques, Tools, and Technologies Alessio Ferrari CNR-ISTI, Italy Liping Zhao University of Manchester, U.K. Waad Alhoshan IMSIU, Saudi Arabia Technical Briefing © 2021 Alessio Ferrari, Liping Zhao and Waad Alhoshan. Cite this presentation as: A. Ferrari, L. Zhao and W. Alhoshan, "NLP for Requirements Engineering: Tasks, Techniques, Tools, and Technologies," 2021 IEEE/ACM 43rd International Conference on Software Engineering: Companion Proceedings (ICSE-Companion), 2021, pp. 322-323, doi: 10.1109/ICSE-Companion52605.2021.00137.
  • 2. Objectives ● Provide an overview of NLP tasks for RE (NLP4RE) ● Present the results of a mapping study on NLP4RE ● Point to relevant tools and resources to practice NLP4RE ● Introduce you to transfer learning for NLP4RE
  • 3. What is Natural Language Processing? Technologies enabling extraction and manipulation of information from natural language (NL) - English, Italian, Swedish, etc.
  • 4. Wait...What is a Requirement? ● Jackson and Zave: Condition over phenomena of the environment that we want to make true by developing the system ● Lamsweerde: Goal under the responsibility of a single agent of the software-to-be ● ISO/IEC/IEEE 29148 Standard: Statement which translates or expresses a need and its associated constraints and conditions ● Wikipedia: Singular documented physical or functional Need that a particular design, product or process aims to satisfy ● No agreed INTENSIONAL definition ● Some confusion on the types of requirements (e.g., user, system, software, business, functional, non-functional), the concept of specification, etc. ● So, let us give some EXAMPLES, and give an EXTENSIONAL definition
  • 6. This, and anything that resembles this, is a requirement!
  • 7. Why are requirements so special? ● Requirements are heterogeneous (specs, app reviews, legal docs) ● Requirements specifications use a more restricted vocabulary (about half that of generic texts), and longer sentences ● App reviews are unstructured and informal ● 62% of the words used in requirements do not appear in generic texts. This suggests that NLP tools trained on generic texts may need to be tailored for requirements. Ferrari, A., Spagnolo, G. O., & Gnesi, S. PURE: A dataset of public requirements documents. IEEE RE 2017.
  • 17. Observations ● Most RE problems could be solved top-down ● I can enforce tracing when writing requirements ● I can use constrained natural languages to improve quality ● I can tag classes in advance ● I can write a glossary in advance ● Unfortunately, this does not happen, that’s why we need NLP ● We need NLP also to recover from errors when RE problems are addressed top-down by fallible humans
  • 18. NLP4RE Techniques, Tools & Resources Zhao, L., Alhoshan, W., Ferrari, A., et al. Natural Language Processing for Requirements Engineering: A Systematic Mapping Study. ACM Comput. Surv. 54, 3, Article 55 (April 2021), 41 pages. DOI:https://doi.org/10.1145/3444689
  • 20. Publication Status of NLP4RE Literature Key takeaways: ● 404 relevant papers in 4 decades ● First papers published in 1983 ● Fast growth since 2004 ● NLP4RE is an active and thriving research area in RE
  • 21. State of Empirical Research in NLP4RE ● The majority of NLP4RE studies (> 67%) are solution proposals, typically involving the development of a novel solution, a new method or a new technique ● Only a small number of studies (7%) are conducted in an industrial setting via a case study or field study ● About a third (35%) of the solution proposals are not evaluated but only illustrated using examples, discussion or simulation ● The remaining solution proposals are evaluated in a lab experiment using either students or software subjects ● There is limited evidence of industrial uptake of NLP4RE results ● A typical NLP4RE study is a solution proposal, possibly evaluated internally through an experiment or example, but without evaluation in the real world
  • 22. State of Tool Development in NLP4RE Research Key takeaways: ● 130 tools developed over 30 years ● Only 17 (13%) of them available online ● No evidence these 17 tools still in use ● State of tool development very poor - tools for NLP4RE don’t exist!
  • 23. NLP Technologies Used in NLP4RE Research ● NLP technique: an underlying technique for performing a basic NLP task (e.g., POS tagging, parsing or tokenization) ● NLP resource: a linguistic data resource for supporting NLP tools - Lexical resources (e.g., WordNet, VerbNet) - Text corpus/dataset (e.g., British National Corpus and Brown Corpus) ● NLP tool: a software system or library supporting NLP pipelines (e.g., Stanford CoreNLP, NLTK or OpenNLP)
  • 24. Frequently Used NLP Techniques in NLP4RE Research Key takeaways: ● Only 32/140 (< 1 in 4) NLP techniques are in frequent use ● 90% of these frequently used NLP techniques are word/syntactic based ● Baseline techniques, e.g., POS tagging, tokenization and syntactic parsing, are most in use
  • 25. A Typical NLP Pipeline for Processing Requirements Text
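The pipeline figure itself is not reproduced in this transcript, so the following is only a minimal sketch of the baseline steps named on the surrounding slides (sentence splitting, tokenization, and a simple term count). It uses only the Python standard library rather than any of the toolkits listed in the briefing; the function names, the tiny stop-word list, and the two example requirements are assumptions made for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list (an assumption; real pipelines use larger lists).
STOP_WORDS = {"the", "a", "an", "of", "to", "and", "shall", "be", "in", "is"}


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def tokenize(sentence):
    """Lower-case the sentence and keep alphanumeric (optionally hyphenated) tokens."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


def remove_stop_words(tokens):
    """Drop tokens that appear in the stop-word list."""
    return [t for t in tokens if t not in STOP_WORDS]


def term_frequencies(text):
    """Run the three steps in order and count the remaining terms."""
    counts = Counter()
    for sentence in split_sentences(text):
        counts.update(remove_stop_words(tokenize(sentence)))
    return counts


# Hypothetical requirements text, invented for this example.
requirements = (
    "The system shall encrypt the user password. "
    "The system shall log every failed login attempt."
)
freqs = term_frequencies(requirements)
print(freqs.most_common(3))
```

A toolkit such as NLTK or Stanford CoreNLP would replace the regular expressions with trained tokenizers and add POS tagging and parsing on top of these steps.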
  • 26. Frequently Used NLP Tools in NLP4RE Research Key takeaways: ● 1 in 5 (14/66) NLP tools are in frequent use ● The top 5 most used tools (Stanford CoreNLP, GATE, NLTK, OpenNLP, and Weka) are toolkits, supporting multiple NLP pipelines
  • 27. Top Most Used NLP Tools in NLP4RE Research ● Stanford CoreNLP (https://stanfordnlp.github.io/CoreNLP/): supports common core NLP tasks, open source, datasets not available ● OpenNLP (https://opennlp.apache.org/): supports both lower-level (POS, chunking) and high-level NLP tasks (language detection, classification), open source, datasets available ● NLTK (https://www.nltk.org): supports a wide range of NLP tasks, open source + over 50 datasets available ● GATE (https://gate.ac.uk): supports text processing tasks such as information extraction and text annotation ● WEKA (https://www.cs.waikato.ac.nz/ml/weka/): supports data mining tasks such as data pre-processing, classification, regression, clustering, association rules, and visualization, open source
  • 28. Frequently Used NLP Resources in NLP4RE Research Key takeaways: ● 1 in 2 (13/25) NLP resources are in frequent use ● Lexical resources are used most (i.e., WordNet, VerbNet) ● Only two RE related datasets: MODIS and CM-1
  • 29. Other RE Related NLP Resources ● PROMISE Software Engineering Repository (http://promise.site.uottawa.ca/SERepository/datasets-page.html) ○ Created by Sayyad and Menzies from University of Ottawa, Canada in 2005 ○ Contains 20 publicly available datasets (including MODIS and CM-1) ● PURE Dataset (https://zenodo.org/record/1414117) ○ Created by Alessio Ferrari, et al. in 2017 ○ Contains 79 publicly available requirements documents collected from the Web ● User Stories (https://data.mendeley.com/datasets/7zbk8zsd8y/1) ○ Created by Fabiano Dalpiaz in 2018 ○ Contains 22 datasets, each with 50+ requirements, expressed as user stories ● FN-RE (https://zenodo.org/record/1291660) ○ Created by Waad Alhoshan et al. in 2018 ○ Contains a dataset of requirements documents annotated with FrameNet semantics ● App Reviews ○ 13 annotated datasets are reported in this paper: Dabrowski, Jacek, et al. "App Review Analysis for Software Engineering: A Systematic Literature Review." University College London, Tech. Rep (2020). ○ Mobile App Market (https://sites.google.com/site/appsimilarity/)
  • 30. Emerging Trends in NLP4RE Research – Some Observations ● 2007 onwards: Consistent rise in developing ML-based approaches (using ML algorithms such as SVM, DT, NB, KNN) for automatic requirements classification ● 2013 onwards: Increase in using non-traditional requirements texts (e.g., app reviews and user stories) in NLP4RE research ● 2020: Upsurge in developing DL-based approaches (e.g., BERT and Bi-LSTM) for automatic requirements classification
  • 31. Emerging Trends in NLP4RE Research – Some Observations
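To make the ML-based requirements classification trend concrete, here is a minimal sketch of a multinomial Naive Bayes (NB) classifier with add-one smoothing, written with the standard library only. It is not taken from any of the surveyed studies: the class name, the four training requirements, and their labels are all invented for illustration, and a real study would use a labelled dataset and a library implementation.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lower-case the text and keep alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(n_docs / total_docs)
            total_words = sum(self.word_counts[label].values())
            for word in tokenize(text):
                if word not in self.vocab:
                    continue  # ignore words never seen in training
                count = self.word_counts[label][word]
                score += math.log((count + 1) / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy training data, invented for this example.
train = [
    ("The system shall encrypt all stored passwords", "security"),
    ("Only authenticated users shall access the admin page", "security"),
    ("The interface shall be easy to learn for new users", "usability"),
    ("Error messages shall be clear and easy to understand", "usability"),
]
model = NaiveBayesClassifier().fit([t for t, _ in train], [l for _, l in train])
print(model.predict("The system shall encrypt user passwords"))
print(model.predict("The menu shall be easy to understand"))
```

With only four training sentences the predictions depend on a handful of word overlaps, which is exactly why the mapping study's concern about evaluation on realistic data matters.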
  • 35. Transfer Learning & Machines To train a model to learn from one type of problem and leverage that model (i.e., the knowledge) to solve a new BUT related problem. Benefits: less training data (dataset) and less overall computing power. Pan, Sinno Jialin, and Qiang Yang. "A survey on transfer learning." IEEE Transactions on knowledge and data engineering 22.10 (2009): 1345-1359.
  • 36. Transfer Learning for NLP: Language Model ● A language model is a way to represent the relations between words in a language. Example: "Woman": queen (8%), king (3%), prince (0.2%), princess (5%), daughter (45%), son (0.1%), father (2%), mother (25%) ● Neural language models (contextual embeddings) are designed to overcome ambiguity and language variations, e.g., Transformer-based models such as BERT, GPT-2 & GPT-3, XLNet, and more!
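The idea of a language model as a set of word-to-word probabilities can be illustrated with a count-based bigram model. This is a deliberately simple stand-in, not a neural or contextual model like BERT, and the four-sentence corpus and function names are assumptions invented for the example.

```python
from collections import Counter, defaultdict


def train_bigram_model(sentences):
    """Count how often each word is followed by each other word."""
    counts = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for previous, following in zip(tokens, tokens[1:]):
            counts[previous][following] += 1
    return counts


def next_word_probabilities(model, word):
    """Turn the follower counts of `word` into relative frequencies."""
    followers = model.get(word.lower(), Counter())
    total = sum(followers.values())
    if total == 0:
        return {}
    return {w: c / total for w, c in followers.items()}


# Hypothetical mini-corpus of requirement-like sentences.
corpus = [
    "the system shall log errors",
    "the system shall encrypt data",
    "the user shall log in",
    "the system must respond quickly",
]
bigram_model = train_bigram_model(corpus)
probs = next_word_probabilities(bigram_model, "shall")
print(probs)
```

A bigram model only looks one word back, so it cannot separate different senses of the same word; contextual models condition on the whole sentence instead.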
  • 37. BERT Language Model: Bi-directional Encoder Representations from Transformers ● A pretrained LM originally developed by Google AI to help the Google search engine predict users' queries ● BERTBase → 12 encoder layers ● BERTLarge → 24 encoder layers. Devlin, Jacob, et al. "Bert: Pre-training of deep bidirectional transformers for language understanding." arXiv preprint arXiv:1810.04805 (2018). Vaswani, Ashish, et al. "Attention is all you need." arXiv preprint arXiv:1706.03762 (2017).
  • 38. BERT Language Model: Bi-directional Encoder Representations from Transformers So, what can we do with the BERT model? ● Use the pre-trained BERT model: use it as a classifier, or extract features and train a new classifier ● Fine-tune the BERT model
  • 39. NLP Transfer Learning for RE: Getting Started Problem to solve: classifying a set of non-functional requirements that concern usability and security aspects. The dataset can be downloaded from https://github.com/tobhey/NoRBERT ● Part 1: Using pre-trained BERT with a zero-shot learning classifier ● Part 2: Using the pre-trained BERT model to extract features and train an LR classifier ● Part 3: Fine-tune the BERT model and use it in a new classifier. Notebooks: https://colab.research.google.com/drive/158H-lEJE1pc-xHc1ISBAKGDHMt_eg4Gn?usp=sharing https://colab.research.google.com/drive/1B_5ow3rvS0Qz1y-KyJtlMNnmgmx9w3kJ?usp=sharing https://colab.research.google.com/drive/1Xrm0gNaa41YwlM5g2CRYYXcRvpbDnTRT?usp=sharing
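The notebooks' code is not reproduced in this transcript, so the following is only a rough sketch of what zero-shot classification of a requirement into usability or security might look like with the Hugging Face `transformers` library. The model name `facebook/bart-large-mnli` is an assumption and is a BART-based NLI model rather than BERT; the helper names and label list are likewise invented here. The library is imported lazily so that the result-handling helper can be used on its own.

```python
CANDIDATE_LABELS = ["usability", "security"]


def build_zero_shot_classifier(model_name="facebook/bart-large-mnli"):
    """Create a zero-shot classification pipeline (downloads the model on first use).

    The default model name is an assumption, not the one used in the briefing.
    """
    # Imported here so the rest of the module works without transformers installed.
    from transformers import pipeline

    return pipeline("zero-shot-classification", model=model_name)


def top_label(result):
    """Return the (label, score) pair with the highest score.

    `result` is the dict returned by the pipeline, with parallel
    "labels" and "scores" lists.
    """
    return max(zip(result["labels"], result["scores"]), key=lambda pair: pair[1])


def classify(classifier, requirement):
    """Classify one requirement sentence against the candidate labels."""
    result = classifier(requirement, candidate_labels=CANDIDATE_LABELS)
    return top_label(result)
```

A call such as `classify(build_zero_shot_classifier(), "The system shall lock the account after three failed login attempts.")` would return the best label and its score; no labelled training data is needed, which is the appeal of the zero-shot setting.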
  • 40. Transfer Learning for NLP4RE Tasks: Key Takeaways ● On-the-fly requirement categorization/classification ● Auto-completing requirements at the requirements elicitation phase ● Enabling re-usability of requirements from large repositories ● Language models assist in identifying contextual information ... This is very useful for NLP4RE tasks ● “Unfortunately, LMs trained on unfiltered text corpora suffer from degenerate and biased behaviour.” (Schramowski, Patrick, et al. "Language Models have a Moral Dimension." arXiv preprint arXiv:2103.11790 (2021).) ● Using the pre-trained model directly might not bring out the best of these language techniques ⇒ Fine-tuning is highly encouraged for domain-specific tasks https://github.com/tobhey/NoRBERT
  • 41. Progress Made: NLP4RE in 2018 From my previous technical briefing at ICSE 2018
  • 42. Progress Made: NLP4RE in 2021 Some tiny but important improvements, especially in Classification and Feedback Analysis
  • 43. What’s Next? Requirements are NOT normally expressed in natural language... We need to provide means to analyse multi-modal requirements
  • 44. State of Empirical Studies in NLP4RE ● The majority of NLP4RE studies (> 67%) are lab based, with experiments using either human or software subjects ● A typical NLP4RE study is a solution proposal, possibly evaluated internally through an experiment or example, but without evaluation in the real world ● Only a small number (7%) are conducted in an industrial setting via a case study or field study ● About 1 in 3 of the studies are not evaluated, as they are only illustrated through an example, discussion or simulation ● There is limited evidence of industrial uptake of NLP4RE results