‘Big models’
The success and pitfalls of Transformer models in NLP
Suzan Verberne | NOTaS | March 2023
Today’s talk
• Large Language Models
• BERT
• Hugging Face
• Generative Pretrained Transformers (GPT)
• Challenges and problems
• Consequences for work and education
Suzan Verberne 2023
Large Language Models
How it all started…
• Transformers: “Attention is all you need” (2017)
• Designed for sequence-to-sequence tasks (e.g. translation)
• Encoder-decoder architecture
Explanation of this paper: https://www.youtube.com/watch?v=iDulhoQ2pro
Large Language Models
Transformers are powerful because of
• the long-distance relations between all words (attention)
• parallel processing instead of sequential processing
• unsupervised pre-training on a HUGE amount of data
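The attention idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration, not the implementation from the paper: the queries, keys and values are simply the raw toy vectors (a real Transformer first applies learned projections and uses several attention heads), and the numbers are made up.

```python
import math

def softmax(scores):
    """Turn raw scores into weights that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def self_attention(queries, keys, values):
    """Scaled dot-product attention: every position attends to every position."""
    d_k = len(keys[0])
    outputs, all_weights = [], []
    for q in queries:
        # One score per word in the sentence, regardless of distance.
        scores = [dot(q, k) / math.sqrt(d_k) for k in keys]
        weights = softmax(scores)
        # The output is a weighted average of the value vectors.
        out = [sum(w * v[i] for w, v in zip(weights, values))
               for i in range(len(values[0]))]
        outputs.append(out)
        all_weights.append(weights)
    return outputs, all_weights

# Toy 2-dimensional vectors for a three-word sentence (hypothetical values).
x = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
outputs, weights = self_attention(x, x, x)
for row in weights:
    print([round(w, 3) for w in row])
```

Each row of weights is computed independently of the others, which is what allows the positions to be processed in parallel rather than one after another.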
Large Language Models
BERT (Bidirectional Encoder Representations from Transformers)
• An encoder-only transformer
• Input is text, output is embeddings
Next… some linguistics
BERT is based on the distributional hypothesis
• The context of a word defines its meaning
• Words that occur in similar contexts tend to be similar
Harris, Z. (1954). “Distributional structure”. Word, 10(2–3): 146–162.
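A rough way to see the distributional hypothesis at work is to count context words and compare the counts. The sketch below is only an illustration under simple assumptions: the five-sentence corpus is invented, the window size is arbitrary, and raw co-occurrence counts stand in for the learned embeddings that BERT produces.

```python
import math
from collections import Counter

# Tiny invented corpus; real models see billions of words.
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

def context_vector(target, sentences, window=2):
    """Count the words that occur within `window` positions of `target`."""
    counts = Counter()
    for sentence in sentences:
        tokens = sentence.split()
        for i, tok in enumerate(tokens):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(tokens), i + window + 1)
            counts.update(t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i)
    return counts

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    shared = sum(a[w] * b[w] for w in a if w in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return shared / norm if norm else 0.0

cat, dog, car = (context_vector(w, corpus) for w in ("cat", "dog", "car"))
print("cat~dog:", round(cosine(cat, dog), 3))
print("cat~car:", round(cosine(cat, car), 3))
```

In this toy corpus "cat" and "dog" share most of their contexts, so they end up more similar to each other than either is to "car".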
Word Embeddings
• BERT embeddings are learned from unlabelled data
• Through a process called ‘masked language modelling’ with self-supervision
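Self-supervision means that the training labels come from the text itself. The sketch below shows how a masked language modelling example could be built from a plain sentence. It is a simplification: the function name and the whitespace tokenisation are assumptions for illustration, and BERT's actual recipe (which masks about 15% of the tokens) also sometimes substitutes a random token or leaves the token unchanged.

```python
import random

MASK = "[MASK]"

def make_mlm_example(tokens, mask_prob=0.15, rng=None):
    """Hide some tokens; the hidden originals become the training labels."""
    rng = rng or random.Random()
    masked, labels = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels[i] = tok  # the model must recover this token from its context
        else:
            masked.append(tok)
    return masked, labels

tokens = "the context of a word defines its meaning".split()
# A higher mask_prob than BERT's 15% so that this short sentence gets some masks.
masked, labels = make_mlm_example(tokens, mask_prob=0.3, rng=random.Random(0))
print(masked)
print(labels)
```

No human annotation is involved: any unlabelled text can be turned into (masked input, original token) pairs this way.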
BERT
• BERT is so powerful because it is used in a transfer learning setting
• Pre-training: learning embeddings from huge unlabeled data (self-supervised)
• Fine-tuning: learning the classification model from smaller labeled data (supervised) for any NLP task (e.g. sentiment, named entities)
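The two-stage idea can be illustrated with a deliberately small sketch. Everything here is an assumption made for illustration: the "pre-trained" embeddings are hand-written 2-dimensional vectors, the encoder is a plain average, and the encoder stays frozen while only a logistic-regression classifier is trained. Real BERT fine-tuning usually updates the encoder weights as well.

```python
import math

# Stand-in for pre-trained embeddings (made-up 2-d values; real ones are learned
# from huge unlabelled corpora and have hundreds of dimensions).
PRETRAINED = {
    "i": [0.0, 0.5], "you": [0.0, 0.4], "this": [0.0, 0.3], "is": [0.0, 0.3],
    "love": [1.0, 0.2], "great": [0.9, 0.1],
    "hate": [-1.0, 0.1], "awful": [-0.9, 0.2],
}

def encode(sentence):
    """Frozen 'encoder': average the pre-trained vectors of the words."""
    vectors = [PRETRAINED[w] for w in sentence.lower().split()]
    return [sum(col) / len(vectors) for col in zip(*vectors)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_classifier(examples, epochs=200, lr=1.0):
    """Supervised step: fit a logistic-regression head on a few labelled examples."""
    features = [(encode(text), label) for text, label in examples]
    weights, bias = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for x, y in features:
            error = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias) - y
            grad_w = [g + error * xi for g, xi in zip(grad_w, x)]
            grad_b += error
        # Full-batch gradient descent on the averaged gradient.
        weights = [w - lr * g / len(features) for w, g in zip(weights, grad_w)]
        bias -= lr * grad_b / len(features)
    return weights, bias

def predict(text, weights, bias):
    score = sigmoid(sum(w * xi for w, xi in zip(weights, encode(text))) + bias)
    return "POSITIVE" if score >= 0.5 else "NEGATIVE"

labelled = [("I love you", 1), ("I hate you", 0), ("this is great", 1), ("this is awful", 0)]
weights, bias = train_classifier(labelled)
print(predict("you love this", weights, bias))
print(predict("you hate this", weights, bias))
```

Four labelled sentences are enough here only because the embeddings already separate positive from negative words; that is the point of transferring what was learned during pre-training.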
Hugging Face
But also because:
• The authors (from Google) open-sourced the model implementation
• And publicly released pretrained models (which are computationally expensive to pretrain from scratch)
• Hugging Face (https://huggingface.co/) provides the standard implementation package, transformers, for training and applying Transformer models
• Currently over 150k models have been published on Hugging Face
Hugging Face
Working with Hugging Face
• Take a pre-trained model
• Run it ‘zero-shot’ (off the shelf, without any training of your own):
    from transformers import pipeline

    # With no model argument the pipeline uses its default model:
    # distilbert-base-uncased-finetuned-sst-2-english
    sentiment_pipeline = pipeline("sentiment-analysis")
    data = ["I love you", "I hate you"]
    output = sentiment_pipeline(data)
    print(output)
  Output:
    [{'label': 'POSITIVE', 'score': 0.9998656511306763},
     {'label': 'NEGATIVE', 'score': 0.9991129040718079}]
• Or fine-tune on your own data
Generative Pretrained Transformers (GPT)
GPT
• GPT is a decoder-only transformer model
• It does not have an encoder
• Instead: use the prompt to generate outputs
• A growing family of models since 2018: GPT-2, DialoGPT, GPT-3, GPT-3.5, ChatGPT, GPT-4
GPT-3
• GPT is trained to generate the most probable/plausible text
• Trained on crawled internet data, open source books, Wikipedia, sampled early 2022
• After each word, predict the most probable next word given all the previous words
• It will give you fluent text that looks very real
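The generation loop described on this slide can be illustrated with a toy model. The sketch below swaps in a bigram counter for the Transformer: it looks only at the single previous word, whereas GPT conditions on all previous words, and the training text is invented. What it does show is the loop itself: predict the most probable next word, append it, repeat.

```python
from collections import Counter, defaultdict

def train_bigram_model(text):
    """Count, for every word, which words follow it in the training text."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts

def generate(counts, prompt, max_new_words=5):
    """Greedy decoding: repeatedly append the most probable next word."""
    words = prompt.split()
    for _ in range(max_new_words):
        followers = counts.get(words[-1])
        if not followers:
            break  # nothing ever followed this word in the training text
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

# Invented training text for illustration only.
training_text = (
    "transformers generate fluent text . "
    "transformers generate plausible text . "
    "transformers generate fluent text ."
)
model = train_bigram_model(training_text)
print(generate(model, "transformers", max_new_words=3))
```

The model continues with whatever was most frequent in its training text. Nothing in the loop checks whether the result is true, which is why the output can be fluent without being correct.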
Few-shot learning
Few-shot learning: learn from a small number of examples
'Old paradigm'
• pre-training
• fine-tuning with hundreds to thousands of training samples
'New paradigm'
• pre-training
• prompting with ~3-50 examples in the prompt
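In the 'new paradigm' the labelled examples go into the prompt instead of into a training run. A minimal sketch of how such a prompt could be assembled is shown below; the function name, the "Text:" / "Sentiment:" layout and the example sentences are assumptions for illustration, not a format that any particular model requires.

```python
def build_few_shot_prompt(instruction, examples, query):
    """Put a handful of labelled examples in the prompt, followed by the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines += [f"Text: {text}", f"Sentiment: {label}", ""]
    # The prompt ends where the model is expected to continue.
    lines += [f"Text: {query}", "Sentiment:"]
    return "\n".join(lines)

examples = [
    ("I love you", "positive"),
    ("I hate you", "negative"),
    ("What a wonderful day", "positive"),
]
prompt = build_few_shot_prompt(
    "Classify the sentiment of each text as positive or negative.",
    examples,
    "This talk was great",
)
print(prompt)
```

The resulting string is sent to a generative model as is; the model's weights are not updated, so the same model can be given a different task simply by changing the examples.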
Few-shot learning with ChatGPT
ChatGPT
• ChatGPT = GPT-3.5
  + fine-tuning for conversations
  + reinforcement learning from human feedback for better answers
https://openai.com/blog/chatgpt
Why are LLMs so powerful?
• Because they are HUGE (many parameters)
• And trained on HUGE data
https://huggingface.co/blog/large-language-models
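To give a feeling for what HUGE means, the memory needed just to store a model's weights can be estimated from its parameter count. The sketch below assumes 16-bit weights (2 bytes per parameter) and uses the published sizes of BERT-base (about 110 million parameters) and GPT-3 (175 billion parameters); it ignores activations, optimizer state and everything else needed for training.

```python
def weights_memory_gb(n_parameters, bytes_per_parameter=2):
    """Memory needed just to hold the weights, in gigabytes (10**9 bytes)."""
    return n_parameters * bytes_per_parameter / 10**9

# Published parameter counts: BERT-base ~110 million, GPT-3 175 billion.
for name, n in [("BERT-base", 110_000_000), ("GPT-3", 175_000_000_000)]:
    print(f"{name}: {weights_memory_gb(n):.2f} GB in 16-bit precision")
```

Under these assumptions BERT-base fits comfortably on a single consumer GPU, while the weights of GPT-3 alone need several hundred gigabytes spread over many accelerators.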
Challenges and problems with LLMs
Challenges and problems
• Computational power
• Environmental footprint
• Heavy GPU computing required for training models
• Lengthy texts are challenging
• Low resource languages
• Low resource domains
• Closed models (‘OpenAI’) vs open source models
https://lessen-project.nl/: “Together, the project partners will develop, implement and evaluate state-of-the-art safe and transparent chat-based conversational AI agents based on state-of-the-art neural architectures. The focus is on lesser resourced tasks, domains, and scenarios.”
Challenges and problems
• Factuality / consistency
• The output is fluent but not always correct
• Hallucination
Challenges and problems
• Search engines allow us to verify the source of the information
• Interfaces to generative language models should do the same
Consequences for work and education
Consequences for work and education
• Do not replace humans, but assist them to do their work better
• When the boring part of the work is done by computational models, the human can do the interesting part
• (think about graphic designers using generative models for creating images)
Consequences for work and education
• Computational methods can help humans (students)
  • Search engines
  • Spelling correction
  • Grammarly
  • … Generative language models?
• New regulations
  • We have to stress the importance of sources
  • and of writing your own texts (and code!)
  • and carefully pick our homework assignments
Research opportunities
Use generative models to
• develop tools (e.g. QA systems, chatbots, summarizers)
• generate training data [1]
  • The prompting can be engineered to be more effective
• study linguistic phenomena
  • which errors does the model make?
• study social phenomena
  • simulate communication (opinionated / political content) [2]
1. https://github.com/arian-askari/ChatGPT-RetrievalQA
2. Chris Congleton, Peter van der Putten, and Suzan Verberne. Tracing Political Positioning of Dutch Newspapers. In: Disinformation in Open Online Media. MISDOOM 2022.
Final recommendations
• Listen to the interview with Emily Bender
Find me: https://duckduckgo.com/?t=ffab&q=suzan+verberne&ia=web

Biopesticide (2).pptx .This slides helps to know the different types of biop...
 
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdfAnalytical Profile of Coleus Forskohlii | Forskolin .pdf
Analytical Profile of Coleus Forskohlii | Forskolin .pdf
 
Natural Polymer Based Nanomaterials
Natural Polymer Based NanomaterialsNatural Polymer Based Nanomaterials
Natural Polymer Based Nanomaterials
 

‘Big models’: the success and pitfalls of Transformer models in natural language processing

  • 2. Today’s talk
      • Large Language Models
      • BERT
      • Huggingface
      • Generative Pretrained Transformers (GPT)
      • Challenges and problems
      • Consequences for work and education
      Suzan Verberne 2023
  • 3. Large Language Models
  • 4. Large Language Models: How it all started…
      • Transformers: “Attention is all you need” (2017)
      • Designed for sequence-to-sequence tasks (e.g. translation)
      • Encoder-decoder architecture
      • Explanation of this paper: https://www.youtube.com/watch?v=iDulhoQ2pro
  • 5. Large Language Models: Transformers are powerful because of
      • the long-distance relations between all words (attention)
      • parallel processing instead of sequential processing
      • unsupervised pre-training on a HUGE amount of data
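The attention idea on this slide can be sketched in a few lines of plain Python. This is a minimal illustration of scaled dot-product attention, not the full Transformer layer: the toy vectors are made up, and the learned query/key/value projections and multiple heads of the real architecture are left out as a simplifying assumption.

```python
import math

def softmax(scores):
    """Turn raw scores into positive weights that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention over lists of equal-length vectors."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        # Compare this word with every word in the sentence, near or far
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        # The new representation is a weighted mix of all value vectors
        mixed = [sum(w * v[j] for w, v in zip(weights, values))
                 for j in range(len(values[0]))]
        outputs.append(mixed)
    return outputs

# Toy, made-up 2-dimensional vectors for a three-word sentence
vectors = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
contextual = attention(vectors, vectors, vectors)
print(contextual)
```

Each output row depends only on the inputs, not on the other output rows, which is why all positions can be computed in parallel instead of one after the other.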
  • 6. Large Language Models: Next… BERT (Bidirectional Encoder Representations from Transformers)
      • An encoder-only transformer
      • Input is text, output is embeddings
  • 7. Some linguistics… BERT is based on the distributional hypothesis
      • The context of a word defines its meaning
      • Words that occur in similar contexts tend to be similar
      Harris, Z. (1954). “Distributional structure”. Word. 10 (2–3): 146–162
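The distributional hypothesis can be illustrated with simple co-occurrence counts. The sketch below is not how BERT computes embeddings; it assumes a tiny made-up corpus and a context window of two words, and compares words by the cosine similarity of their context counts.

```python
import math
from collections import Counter, defaultdict

# Tiny made-up corpus, for illustration only
corpus = [
    "the cat drinks milk",
    "the dog drinks water",
    "the cat chases the mouse",
    "the dog chases the ball",
    "the car needs fuel",
]

def context_vectors(sentences, window=2):
    """Count, for every word, the words within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.split()
        for i, word in enumerate(tokens):
            start, end = max(0, i - window), min(len(tokens), i + window + 1)
            for j in range(start, end):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

vectors = context_vectors(corpus)
sim_cat_dog = cosine(vectors["cat"], vectors["dog"])
sim_cat_fuel = cosine(vectors["cat"], vectors["fuel"])
print(sim_cat_dog, sim_cat_fuel)
```

In this toy corpus "cat" and "dog" share most of their contexts and score high, while "cat" and "fuel" share none and score zero.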
  • 8. Word Embeddings
      • BERT embeddings are learned from unlabelled data
      • Through a process called ‘masked language modelling’ with self-supervision
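To make ‘masked language modelling’ concrete, the sketch below shows how a self-supervised training example can be made from raw text: some tokens are hidden and the hidden originals become the labels, so no human annotation is needed. It is a simplification under stated assumptions: the helper name `mask_tokens` is hypothetical, tokens are whitespace-separated words rather than BERT's subword pieces, and every selected token is replaced by `[MASK]`.

```python
import random

MASK = "[MASK]"

def mask_tokens(tokens, mask_prob=0.15, seed=0):
    """Hide a random subset of tokens and keep the originals as labels."""
    rng = random.Random(seed)  # fixed seed so the example is reproducible
    masked, labels = [], {}
    for position, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            labels[position] = token  # the model must predict this word
        else:
            masked.append(token)
    return masked, labels

tokens = "the context of a word defines its meaning".split()
masked, labels = mask_tokens(tokens, mask_prob=0.3, seed=1)
print(masked)
print(labels)
```

During pre-training the model sees `masked` as input and is scored on how well it recovers the words stored in `labels`.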
  • 9. BERT: BERT is so powerful because it is used in a transfer learning setting
      • Pre-training: learning embeddings from huge unlabeled data (self-supervised)
      • Fine-tuning: learning the classification model from smaller labeled data (supervised) for any NLP task (e.g. sentiment, named entities)
  • 10. Huggingface: But also because
      • The authors (from Google) open-sourced the model implementation
      • And publicly released pretrained models (which are computationally expensive to pretrain from scratch)
      • https://huggingface.co/ is the standard implementation package for training and applying Transformer models
      • Currently over 150k models have been published on Huggingface
  • 13. Huggingface: Working with Huggingface
      • Take a pre-trained model
      • Run ‘zero-shot’:
            from transformers import pipeline

            # With no model specified, the default model is used:
            # distilbert-base-uncased-finetuned-sst-2-english
            sentiment_pipeline = pipeline("sentiment-analysis")
            data = ["I love you", "I hate you"]
            output = sentiment_pipeline(data)
            print(output)
        Output:
            [{'label': 'POSITIVE', 'score': 0.9998656511306763}, {'label': 'NEGATIVE', 'score': 0.9991129040718079}]
      • Or fine-tune on your own data
  • 15. GPT
      • GPT is a decoder-only transformer model
      • It does not have an encoder
      • Instead: use the prompt to generate outputs
      • A growing family of models since 2018: GPT-2, DialoGPT, GPT-3, GPT-3.5, ChatGPT, GPT-4
  • 16. GPT-3
      • GPT is trained to generate the most probable/plausible text
      • Trained on crawled internet data, open source books, Wikipedia, sampled early 2022
      • After each word, predict the most probable next word given all the previous words
      • It will give you fluent text that looks very real
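The ‘predict the most probable next word’ loop can be sketched with a toy bigram model. This is an illustration only: GPT conditions on all previous words with a neural network, whereas this sketch assumes a made-up three-sentence corpus and looks only at the last word.

```python
from collections import Counter, defaultdict

# Made-up training text, for illustration only
corpus = "the cat sat on the mat . the cat ate the fish . the cat sat on the sofa ."

def train_bigrams(text):
    """Count how often each word follows each other word."""
    counts = defaultdict(Counter)
    tokens = text.split()
    for previous, following in zip(tokens, tokens[1:]):
        counts[previous][following] += 1
    return counts

def generate(counts, start, max_words=5):
    """Greedy generation: always append the most frequent next word."""
    words = [start]
    for _ in range(max_words):
        followers = counts.get(words[-1])
        if not followers:
            break
        words.append(followers.most_common(1)[0][0])
    return " ".join(words)

model = train_bigrams(corpus)
text = generate(model, "the")
print(text)  # the cat sat on the cat
```

The result reads fluently but says something the training text never did, a small-scale version of the gap between plausible and correct output.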
  • 17. Few-shot learning: learn from a small number of examples
      • ‘Old paradigm’: pre-training, then fine-tuning with ~100s-1000s of training samples
      • ‘New paradigm’: pre-training, then prompting with ~3-50 examples in the prompt
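In the ‘new paradigm’ the labelled examples go into the prompt itself instead of into a fine-tuning run. The sketch below only builds such a prompt as a string. The helper name `build_few_shot_prompt`, the instruction text and the "Review:/Sentiment:" layout are assumptions made for illustration, not a format prescribed by any particular model.

```python
def build_few_shot_prompt(examples, query,
                          instruction="Classify the sentiment of each review."):
    """Put a handful of labelled examples in front of the new input."""
    lines = [instruction, ""]
    for text, label in examples:
        lines.append(f"Review: {text}")
        lines.append(f"Sentiment: {label}")
        lines.append("")
    # The final item is left unlabelled for the model to complete
    lines.append(f"Review: {query}")
    lines.append("Sentiment:")
    return "\n".join(lines)

examples = [
    ("I love you", "POSITIVE"),
    ("I hate you", "NEGATIVE"),
    ("This was a great talk", "POSITIVE"),
]
prompt = build_few_shot_prompt(examples, "The slides were hard to read")
print(prompt)
```

The resulting string would be sent to a generative model as is, and the text the model produces after the last "Sentiment:" is read as its prediction.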
  • 19. ChatGPT: ChatGPT = GPT-3.5 + fine-tuning for conversations + reinforcement learning for better answers (https://openai.com/blog/chatgpt)
  • 20. Why are LLMs so powerful?
      • Because they are HUGE (many parameters)
      • And trained on HUGE data
      https://huggingface.co/blog/large-language-models
  • 21. Challenges and problems with LLMs
  • 22. Challenges and problems
      • Computational power
      • Environmental footprint
      • Heavy GPU computing required for training models
      • Lengthy texts are challenging
      • Low resource languages
      • Low resource domains
      • Closed models (‘OpenAI’) vs open source models
      https://lessen-project.nl/: “Together, the project partners will develop, implement and evaluate state-of-the-art safe and transparent chat-based conversational AI agents based on state-of-the-art neural architectures. The focus is on lesser resourced tasks, domains, and scenarios.”
  • 23. Challenges and problems
      • Factuality / consistency
      • The output is fluent but not always correct
      • Hallucination
  • 27. Challenges and problems
      • Search engines allow us to verify the source of the information
      • Interfaces to generative language models should do the same
  • 28. Consequences for work and education
  • 29. Consequences for work and education
      • Do not replace humans, but assist them to do their work better
      • When the boring part of the work is done by computational models, the human can do the interesting part
      • (think about graphic designers using generative models for creating images)
  • 30. Consequences for work and education
      • Computational methods can help humans (students)
          • Search engines
          • Spelling correction
          • Grammarly
          • … Generative language models?
      • New regulations
          • We have to stress the importance of sources
          • and of writing your own texts (and code!)
          • and carefully pick our homework assignments
  • 31. Research opportunities: Use generative models to
      • develop tools (e.g. QA-systems, chatbots, summarizers)
      • generate training data [1]
          • The prompting can be engineered to be more effective
      • study linguistic phenomena
          • which errors does the model make?
      • study social phenomena
          • simulate communication (opinionated / political content) [2]
      [1] https://github.com/arian-askari/ChatGPT-RetrievalQA
      [2] Chris Congleton, Peter van der Putten, and Suzan Verberne. Tracing Political Positioning of Dutch Newspapers. In: Disinformation in Open Online Media. MISDOOM 2022.
  • 32. Final recommendations
      • Listen to the interview with Emily Bender
      Find me: https://duckduckgo.com/?t=ffab&q=suzan+verberne&ia=web