Implications of GPT-3
Raven Jiang
raven@cs.stanford.edu
Overview
This document covers:
1. What is GPT-3?
2. How is GPT-3 different?
3. OpenAI’s API strategy
4. Potential commercial implications
Details and charts from the OpenAI paper and a talk given on 7/24 by Ben Mann, the second author.
Please send feedback to Raven Jiang (raven@cs.stanford.edu)
Disclaimer: I am not affiliated with OpenAI, nor am I an expert in deep learning. I possess practical knowledge of its implementation.
What is GPT-3?
• Text generator deep learning model trained by OpenAI
• Transformer architecture pioneered by Google
• Same architecture family as GPT-2, BERT, XLNet, and RoBERTa
• Task agnostic
• Unsupervised learning
• Over 100x larger (more parameters) than its predecessor GPT-2 (2019)
• Estimated to have cost $12 million in compute to train
• Trained on text data from books and websites
What does it do?
• Seemingly many things
• Translation
• Write new poetry
• Generate stories
• Have a conversation
• Answer questions
• Generate working React code
• Generate Figma designs
• Magical VLOOKUP backed by the Internet
• Maybe creativity is not hard for AI
GPT-3 training data
Share of training mix:
• Common Crawl: 59%
• WebText2: 22%
• Books1: 8%
• Books2: 8%
• Wikipedia: 3%
Notes:
• Common Crawl is scraped web data automatically filtered for quality
• Books1 and Books2 are mostly fiction
• Books2 includes non-English content
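One way to read the mix is as sampling weights: each training example is drawn from a dataset with probability proportional to its share. The sketch below assumes the slide's percentages are used directly as weights. The names `TRAINING_MIX` and `sample_sources` and the fixed seed are illustrative and do not come from the OpenAI paper.

```python
import random
from collections import Counter

# Shares of the training mix as listed on the slide (percent).
TRAINING_MIX = {
    "Common Crawl": 59,
    "WebText2": 22,
    "Books1": 8,
    "Books2": 8,
    "Wikipedia": 3,
}


def sample_sources(n, mix=TRAINING_MIX, seed=0):
    """Draw n dataset names with probability proportional to their share."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    names = list(mix)
    weights = [mix[name] for name in names]
    return rng.choices(names, weights=weights, k=n)


if __name__ == "__main__":
    total = 10_000
    counts = Counter(sample_sources(total))
    for name, count in counts.most_common():
        print(f"{name}: {count / total:.1%}")
```

With enough draws, the observed frequencies should land close to the listed shares.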
Transformer architecture
Transformer-based models
• Examples: Google’s BERT, OpenAI’s GPT-3, Microsoft’s Turing-NLG
• Task: Task-agnostic. The same model is successful at many different language tasks without additional training
• Training: Unsupervised training. Model is trained with large collections of text without special annotations
Older NLP neural network models
• Examples: Google’s GNMT
• Task: Trained for a specific task. Models are usually trained for a certain task and fine-tuned for a related task with additional training data (e.g. Transfer Learning)
• Training: Supervised training. Trained with large quantities of input annotated with expected output that are usually human-generated
Transformers are the state of the art for NLP neural networks
Example translation workflow
Goal 1: Find the French translation for “You pass butter.”
Transformer
1. Train on unrelated English and French text
2. Query describes desired pattern:
"""Translate these sentences:
Hello => Bonjour
That is a cat => C'est un chat
You pass butter =>"""
3. Result:
"""Translate these sentences:
Hello => Bonjour
That is a cat => C'est un chat
You pass butter => Tu passes du beurre"""
Pre-Transformer Language Models
1. Create dataset of English-French text examples
2. Train on dataset
3. Query:
"You pass butter"
4. Result:
"Tu passes du beurre"
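The Transformer query in this workflow is plain text assembled from an instruction, a few worked examples, and an unfinished final line. The sketch below shows one way such a query could be built. The function name `build_few_shot_prompt` and its parameters are illustrative and not part of any OpenAI API.

```python
def build_few_shot_prompt(instruction, examples, query, sep=" => "):
    """Build a prompt: an instruction line, worked examples, then the open query."""
    lines = [instruction]
    for source, target in examples:
        lines.append(f"{source}{sep}{target}")
    # The last line stops right after the separator so the model continues it.
    lines.append(f"{query}{sep}".rstrip())
    return "\n".join(lines)


if __name__ == "__main__":
    prompt = build_few_shot_prompt(
        "Translate these sentences:",
        [("Hello", "Bonjour"), ("That is a cat", "C'est un chat")],
        "You pass butter",
    )
    print(prompt)
```

Running it prints the same four-line query shown on the slide, ending with the open line `You pass butter =>`.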
Generalizability of Transformer models
Goal 2: Tell some dad jokes
Transformer
1. Use the same model as the previous task
2. Query describes new pattern:
"""Here are some great dad jokes:
Q: How do you make a lemon drop? A: Let it fall.
Q: What has ears but cannot hear? A: A cornfield.
Q:"""
3. Result:
"""Here are some great dad jokes:
Q: How do you make a lemon drop? A: Let it fall.
Q: What has ears but cannot hear? A: A cornfield.
Q: How does a vampire start a letter? A: Dear blood."""
Pre-Transformer Language Models
1. Create/source a new annotated dataset suited for the new task
2. Retrain the model either with Transfer Learning or from scratch
3. Query
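In the result shown on this slide, the model's output repeats the query and then continues it, so an application has to separate the new text from the echoed query. The sketch below assumes the result always begins with the exact query text; a real API may return only the continuation. The helper names `extract_completion` and `split_joke` are illustrative.

```python
PROMPT = (
    "Here are some great dad jokes:\n"
    "Q: How do you make a lemon drop? A: Let it fall.\n"
    "Q: What has ears but cannot hear? A: A cornfield.\n"
    "Q:"
)
# The continuation below is the one shown on the slide.
RESULT = PROMPT + " How does a vampire start a letter? A: Dear blood."


def extract_completion(prompt, result):
    """Return only the text that follows the echoed prompt."""
    if not result.startswith(prompt):
        raise ValueError("result does not begin with the prompt")
    return result[len(prompt):].strip()


def split_joke(completion):
    """Split 'question A: answer' into a (question, answer) pair."""
    question, _, answer = completion.partition(" A: ")
    return question.strip(), answer.strip()


if __name__ == "__main__":
    question, answer = split_joke(extract_completion(PROMPT, RESULT))
    print("Q:", question)
    print("A:", answer)
```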
How is GPT-3 different?
• It is huge.
• 175 billion parameters
• Its predecessor GPT-2 has 1.5 billion parameters
• The previous record holder, Microsoft’s Turing-NLG, has 17 billion parameters
• Innovation of scale, not technique
[Chart: Parameters (billions) by model: GPT-2 (2019/02), Turing-NLG (2020/02), GPT-3 (2020/05)]
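The relative sizes follow directly from the parameter counts quoted above. This short sketch works through the arithmetic; the dictionary and function names are illustrative.

```python
# Parameter counts in billions, as quoted on this slide.
PARAMS_BILLIONS = {"GPT-2": 1.5, "Turing-NLG": 17, "GPT-3": 175}


def scale_factor(larger, smaller, params=PARAMS_BILLIONS):
    """How many times more parameters `larger` has than `smaller`."""
    return params[larger] / params[smaller]


if __name__ == "__main__":
    print(f"GPT-3 vs GPT-2: {scale_factor('GPT-3', 'GPT-2'):.0f}x")
    print(f"GPT-3 vs Turing-NLG: {scale_factor('GPT-3', 'Turing-NLG'):.1f}x")
```

This gives roughly 117x over GPT-2 and about 10.3x over Turing-NLG.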
Power of scale
• Scale made a dramatic difference in performance
• Accuracy increased from 25% to 65% for a specific benchmarking task going from 13B parameters to 175B parameters
Uncanny Valley
• Participants were asked to spot fake news generated by GPT-3
• More parameters = harder to spot
• Detection accuracy is very close to 50-50 (chance) at GPT-3 scale
Returns to scale
• Task performance appears to continue improving with scale
• How will GPT-4 perform?
Consequences of scale
• Querying is extremely powerful
• Unexpectedly good performance on a large variety of tasks
• Compared to older task-specific models, API-only access is useful for a broader range of applications
• Caveat: performance is probably still inferior to task-specific models
• Caveat 2: performance may continue to improve with scale
OpenAI’s API strategy
• Gated API access to selected partners
• No access to underlying GPT-3 model and its trained weights
• Turns NLP from an annotation/training problem into meta-programming
• Designing queries to yield useful results on a range of language problems
• Much friendlier paradigm for small teams and product-driven startups
• MaaS (Model as a Service) may be a viable business(?)
• Extremely large NLP models as OpEx instead of CapEx
• No need to fine-tune models on additional training data for each problem
• Concerns over AGI risk
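The "meta-programming" point can be pictured as wrapping a single text-completion function into task-specific functions, each defined only by a query template. The sketch below is hypothetical throughout: `make_task`, `CompletionFn`, and the canned `fake_complete` backend are stand-ins for illustration, and no real OpenAI API call is shown.

```python
from typing import Callable

# Hypothetical type: any function that maps a prompt to generated text.
CompletionFn = Callable[[str], str]


def make_task(complete: CompletionFn, template: str) -> Callable[[str], str]:
    """Turn a query template with an {input} slot into a task function."""
    def task(text: str) -> str:
        return complete(template.format(input=text)).strip()
    return task


# Canned answers so the sketch runs without any network access.
CANNED = {"You pass butter =>": " Tu passes du beurre"}


def fake_complete(prompt: str) -> str:
    """Stand-in backend: looks up a canned continuation for the last line."""
    return CANNED.get(prompt.splitlines()[-1], "")


TRANSLATE_TEMPLATE = (
    "Translate these sentences:\n"
    "Hello => Bonjour\n"
    "That is a cat => C'est un chat\n"
    "{input} =>"
)

if __name__ == "__main__":
    translate = make_task(fake_complete, TRANSLATE_TEMPLATE)
    print(translate("You pass butter"))
```

Adding a new task under this design means writing a new template, with no new dataset or training run.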
Potential commercial implications
• Access to the GPT-3 (or a future GPT-4) API accelerates the go-to-market speed of a startup doing applied NLP
• Build MVP using GPT-3 without investing in any training data or infrastructure
• Switch to better performing fine-tuned models over time
• Companies like Grammarly may face low-cost competitors
• Building NLP-powered product features may be as simple as programming GPT-3 to answer the right questions
• Caveat: Only if GPT-3 (or GPT-4) turns out to be good enough for these applications. Unclear without wider access to the API
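The path from an API-based MVP to a fine-tuned model is easier if product code depends on a small interface and not on a specific backend. The sketch below shows one possible design; the class names, the `complete` method, and the grammar-correction query are all hypothetical, and the backends are stubbed with plain callables.

```python
from typing import Callable, Protocol


class TextModel(Protocol):
    """Anything that can continue a prompt with generated text."""
    def complete(self, prompt: str) -> str: ...


class HostedApiModel:
    """Placeholder for a general-purpose model reached through a hosted API."""
    def __init__(self, send: Callable[[str], str]):
        self._send = send  # hypothetical callable that performs the request

    def complete(self, prompt: str) -> str:
        return self._send(prompt)


class FineTunedModel:
    """Placeholder for an in-house model trained for one task."""
    def __init__(self, predict: Callable[[str], str]):
        self._predict = predict

    def complete(self, prompt: str) -> str:
        return self._predict(prompt)


def correct_grammar(model: TextModel, sentence: str) -> str:
    """Product feature written against the interface, not a backend."""
    prompt = f"Correct the grammar:\n{sentence} =>"
    return model.complete(prompt).strip()


if __name__ == "__main__":
    mvp = HostedApiModel(lambda prompt: " She goes home.")
    later = FineTunedModel(lambda prompt: "She goes home.")
    print(correct_grammar(mvp, "She go home."))
    print(correct_grammar(later, "She go home."))
```

Swapping `mvp` for `later` changes the backend without touching `correct_grammar`.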
Apps using OpenAI API
Conclusion
• Task-agnostic NLP models that deliver acceptable performance may soon be available as a service
• GPT-3 may be that model
• Potential explosion of startups building MVPs on such an API
• Investor warning: startups dependent on the API may lack the expertise and tools to iterate beyond the MVP
• Happy to chat more: raven@cs.stanford.edu

Implications of GPT-3

  • 1. Implications of GPT-3 Raven Jiang raven@cs.stanford.edu
  • 2. Overview This document covers: 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications Details and charts from the OpenAI paper and a talk given on 7/24 by Ben Mann, the second author. Please send feedback to Raven Jiang (raven@cs.stanford.edu) Disclaimer: I am not affiliated with OpenAI, nor an expert in deep learning. I possess practical knowledge of its implementation
  • 3. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 4. What is GPT-3? • Text-generating deep learning model trained by OpenAI • Transformer architecture pioneered by Google • Same family as GPT-2, BERT, XLNet, and RoBERTa • Task agnostic • Unsupervised learning • More than 100x larger (by parameter count) than its predecessor GPT-2 (2019) • Estimated to have cost $12 million in compute to train • Trained on text data from books and websites
  • 5. What does it do? • Seemingly many things • Translation • Write new poetry • Generate stories • Have a conversation • Answer questions • Generate working React code • Generate Figma designs • Magical VLOOKUP backed by the Internet • Maybe creativity is not hard for AI
  • 6. GPT-3 training data [Pie chart: Common Crawl 59%, WebText2 22%, Books1 8%, Books2 8%, Wikipedia 3%] • Common Crawl is scraped web data manually filtered for some quality issues • Books1 and Books2 are mostly fiction • Books2 includes non-English content
  • 7. Transformer architecture
      Transformer-based models
      • Examples: Google’s BERT, OpenAI’s GPT-3, Microsoft’s Turing-NLG
      • Task: Task-agnostic. The same model is successful at many different language tasks without additional training
      • Training: Unsupervised training. Model is trained with large collections of text without special annotations
      Older NLP neural network models
      • Examples: Google’s GNMT
      • Task: Trained for a specific task. Models are usually trained for a certain task and fine-tuned for a related task with additional training data (e.g. Transfer Learning)
      • Training: Supervised training. Trained with large quantities of input annotated with expected output that are usually human-generated
      Transformers are the state of the art for NLP neural networks
  • 8. Example translation workflow. Goal 1: Find the French translation for “You pass butter.”
      Transformer
      1. Train on unrelated English and French text
      2. Query describes desired pattern: """Translate these sentences: Hello => Bonjour That is a cat => C'est un chat You pass butter =>"""
      3. Result: """Translate these sentences: Hello => Bonjour That is a cat => C'est un chat You pass butter => Tu passes du beurre"""
      Pre-Transformer Language Models
      1. Create dataset of English-French text examples
      2. Train on dataset
      3. Query: "You pass butter"
      4. Result: "Tu passes du beurre"
  • 9. Generalizability of Transformer models. Goal 2: Tell some dad jokes
      Transformer
      1. Use the same model as the previous task
      2. Query describes new pattern: """Here are some great dad jokes: Q: How do you make a lemon drop? A: Let it fall. Q: What has ears but cannot hear? A: A cornfield. Q:"""
      3. Result: """Here are some great dad jokes: Q: How do you make a lemon drop? A: Let it fall. Q: What has ears but cannot hear? A: A cornfield. Q: How does a vampire start a letter? A. Dear blood."""
      Pre-Transformer Language Models
      1. Create/source a new annotated dataset suited for the new task
      2. Retrain the model either with Transfer Learning or from scratch
      3. Query
  • 10. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 11. How is GPT-3 different? • It is huge. • 175 billion parameters • Its predecessor GPT-2 has 1.5 billion parameters • The previous record holder, Microsoft’s Turing-NLG, has 17 billion parameters • Innovation of scale, not technique [Bar chart: Parameters (Billions) for GPT-2 (2019/02), Turing-NLG (2020/02), and GPT-3 (2020/07)]
  • 12. Power of scale • Scale made a dramatic difference in performance • Accuracy increased from 25% to 65% for a specific benchmarking task going from 13B parameters to 175B parameters
  • 13. Uncanny Valley • Participants were asked to spot fake news generated by GPT-3 • More parameters = harder to spot • Detection accuracy is very close to 50%, i.e. chance, at GPT-3 scale
  • 14. Returns to scale • Task performance appears to continue improving with scale • How will GPT-4 perform?
  • 15. Consequences of scale • Querying is extremely powerful • Unexpectedly good performance on a large variety of tasks • Compared to older task-specific models, API-only access is useful for a broader range of applications • Caveat 1: performance is probably still inferior to task-specific models • Caveat 2: performance may continue to improve with scale
  • 16. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 17. OpenAI’s API strategy • Gated API access to selected partners • No access to underlying GPT-3 model and its trained weights • Turns NLP from an annotation/training problem into meta-programming • Designing queries to yield useful results on a range of language problems • Much friendlier paradigm for small teams and product-driven startups • MaaS (Model as a Service) is a viable business (?) • Extremely large NLP models as OpEx instead of CapEx • No need to fine-tune models for problems with more training data • Concerns over AGI risk
  • 18. 1. What is GPT-3? 2. How is GPT-3 different? 3. OpenAI’s API strategy 4. Potential commercial implications
  • 19. Potential commercial implications • Access to GPT-3 (or future GPT-4) API accelerates go-to-market speed of a startup doing applied NLP • Build MVP using GPT-3 without investing in any training data or infrastructure • Switch to better performing fine-tuned models over time • Companies like Grammarly may face low-cost competitors • Building NLP-powered product features may be as simple as programming GPT-3 to answer the right questions • Caveat: Only if GPT-3 (or GPT-4) turns out to be Good Enough for these applications. Unclear without wider access to the API
  • 21. Conclusion • Task-agnostic NLP models that deliver acceptable performance may soon be available as a service • GPT-3 may be that model • Potential explosion of startups building MVPs on such an API • Investor warning: startups dependent on the API may lack expertise and tools to iterate off MVP • Happy to chat more: raven@cs.stanford.edu
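
The query pattern from slides 8 and 9 (a short instruction, a few worked examples, and an unfinished last line) can be sketched in Python as below. This is only an illustration under stated assumptions: `build_few_shot_prompt`, `run_task`, and `canned_model` are hypothetical names, and `canned_model` is a hard-coded stand-in for a text-completion service, not the OpenAI API, whose interface is not described in these slides.

```python
from typing import Callable, List, Sequence, Tuple


def build_few_shot_prompt(
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
    separator: str = " => ",
) -> str:
    """Lay out an instruction, worked examples, and an unfinished final line."""
    lines: List[str] = [instruction]
    for source, target in examples:
        lines.append(f"{source}{separator}{target}")
    # The last line stops right after the separator so that a text-completion
    # model is nudged to continue the pattern with the missing answer.
    lines.append(f"{query}{separator}".rstrip())
    return "\n".join(lines)


def run_task(
    complete: Callable[[str], str],
    instruction: str,
    examples: Sequence[Tuple[str, str]],
    query: str,
) -> str:
    """Send the prompt to `complete` and keep the first line of its reply.

    `complete` is assumed (hypothetically) to return only the text that
    follows the prompt, which may run on past the answer we want.
    """
    prompt = build_few_shot_prompt(instruction, examples, query)
    continuation = complete(prompt)
    reply_lines = continuation.strip().splitlines()
    return reply_lines[0].strip() if reply_lines else ""


def canned_model(prompt: str) -> str:
    """Stand-in for a real model: always returns the same continuation."""
    return " Tu passes du beurre\nGood night => Bonne nuit"


if __name__ == "__main__":
    examples = [("Hello", "Bonjour"), ("That is a cat", "C'est un chat")]
    print(build_few_shot_prompt("Translate these sentences:", examples, "You pass butter"))
    print(run_task(canned_model, "Translate these sentences:", examples, "You pass butter"))
```

Moving to a new task, such as the dad jokes on slide 9, would mean changing only the instruction and examples passed in; nothing is retrained, which is the sense in which the slides call this meta-programming.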