Retrieval
Augmented
Generation
Making a Podcast
“interactive”
Richard Rodger, Voxgig
The Culture’s inhabitants “could
record their mind-states,
effectively taking a reading of the
person’s personality which could be
stored, duplicated, read, transmitted
and installed …”
Iain M Banks, Look to Windward
How do I build this?
1. Design
2. Code
3. Practicalities
“Using the podcast audio
recordings, build a chat
interface that responds like a
podcast guest”
- ~130 episodes
- ~30 minutes per episode
- ~7000 words per transcript
- 2 new episodes per week
- Metadata:
- guest details
- show notes
1. Ingestion
Get the audio and metadata into
the “AI”
2. Querying
Get the conversational responses
out of the “AI”
Ingestion
Querying
Seems like hard work…?
Why not just concatenate all
transcripts, and use that as
the context prompt?
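A rough back-of-the-envelope check on the concatenation idea, using the episode figures from the earlier slide. The tokens-per-word ratio and the context window size below are assumptions for illustration, not figures from the talk; check them against whichever model you use.

```javascript
// Corpus figures from the slides.
const episodes = 130
const wordsPerTranscript = 7000

// Assumption: roughly 1.3 tokens per English word (varies by tokenizer).
const tokensPerWord = 1.3

// Assumption: a hypothetical model with a 100,000-token context window.
const contextWindowTokens = 100000

const totalWords = episodes * wordsPerTranscript
const totalTokens = Math.round(totalWords * tokensPerWord)
const fitsInContext = totalTokens <= contextWindowTokens

console.log({ totalWords, totalTokens, fitsInContext })
// → { totalWords: 910000, totalTokens: 1183000, fitsInContext: false }
```

Under these assumptions the full corpus is about ten times larger than the window, and it grows by two transcripts every week. Even with a window that did fit, every single query would pay for the whole corpus.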
“Vector Embedding”
Text -> Concepts
(using a “Model”)
But why vectors?
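One way to see why vectors help: once text is a list of numbers, "similar meaning" becomes "small angle between vectors", which is cheap to compute. The sketch below uses cosine similarity on made-up three-dimensional vectors; the numbers are invented for illustration and real embedding models produce vectors with hundreds or thousands of dimensions.

```javascript
// Cosine similarity: close to 1 means same direction, 0 means orthogonal (unrelated).
function cosineSimilarity(a, b) {
  if (a.length !== b.length) throw new Error('vectors must have equal length')
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// Made-up toy "embeddings" (assumption: values chosen by hand, not from a model).
const toy = {
  'developer relations': [0.9, 0.1, 0.2],
  'community building': [0.8, 0.2, 0.3],
  'database indexing': [0.1, 0.9, 0.1],
}

// Rank the other texts by similarity to the query vector, best match first.
const query = toy['developer relations']
const ranked = Object.keys(toy)
  .filter((text) => text !== 'developer relations')
  .map((text) => ({ text, score: cosineSimilarity(query, toy[text]) }))
  .sort((x, y) => y.score - x.score)

console.log(ranked)
```

With these toy values, 'community building' ranks above 'database indexing'. A vector database performs the same kind of ranking, but with approximate indexes so it does not have to compare the query against every stored vector.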
The vector database does the
embedding for you
Large Language Models
(LLMs)
- Very large neural networks
- Use the Transformer* architecture
- A vector embedding
- Cares about word order
- Cares about word context
* Attention is All You Need, Vaswani et al. 2017
Retrieval Augmented Generation (RAG)
Code
// Microservice messages
aim: ingest: {
// Get transcription
transcribe: episode: {}
// Do “embedding”
ingest: transcript: {}
}
aim: chat: {
// Use prompt context to reply
chat: query: {}
}
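The message patterns above could be routed with something like the sketch below. This is a hypothetical, dependency-free dispatcher written only to illustrate pattern matching on message properties; `addHandler` and `act` are invented names for this example, not the API of the framework used in the Voxgig implementation. Handlers are synchronous stubs here, whereas real ones would be async and call external services.

```javascript
// Hypothetical pattern-matching dispatcher (illustration only).
const handlers = []

// Register a handler for messages that contain all of the pattern's key/value pairs.
function addHandler(pattern, fn) {
  handlers.push({ pattern, fn })
}

// Dispatch to the most specific matching handler (the one with the most keys).
function act(msg) {
  const matches = handlers
    .filter((h) => Object.entries(h.pattern).every(([k, v]) => msg[k] === v))
    .sort((a, b) => Object.keys(b.pattern).length - Object.keys(a.pattern).length)
  if (matches.length === 0) return { ok: false, why: 'no-handler' }
  return matches[0].fn(msg)
}

// Stub handlers mirroring the three messages on the slide.
addHandler({ aim: 'ingest', transcribe: 'episode' }, (msg) =>
  ({ ok: true, step: 'transcribe', episode: msg.episode }))
addHandler({ aim: 'ingest', ingest: 'transcript' }, (msg) =>
  ({ ok: true, step: 'embed', episode: msg.episode }))
addHandler({ aim: 'chat', chat: 'query' }, (msg) =>
  ({ ok: true, why: '', answer: 'stub answer for: ' + msg.query }))

console.log(act({ aim: 'chat', chat: 'query', query: 'what is developer relations?' }))
```

Keeping each message small and named by its properties means the same handler can run inside a Lambda, a local process, or a REPL without changing its code.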
// aim:ingest, transcribe:episode
// lambda called when there’s
// a new audio file in s3
const audio = loadAudio(event)
// Use deepgram.com to get the conversation text
const result = await deepgram
.listen
.prerecorded
.transcribeFile(
audio.content,
options
)
// Save transcript to s3
saveTranscript(result)
// aim:ingest, ingest:transcript
// lambda called when there’s
// a new transcript file in s3
const transcript = loadTranscript(event)
// Split into “chunks”, each to be added to
// an OpenSearch vector collection
const chunks = chunkify(transcript)
for (const chunk of chunks) {
  // Call AWS Bedrock, specify model
  const embedding =
    await bedrockClient.embed(chunk, model)
  // Store embedding vectors in OpenSearch
  await openSearchClient.store(embedding)
}
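The slide does not show what `chunkify` does, so here is one possible sketch: fixed-size word windows with a small overlap, assuming the transcript is available as plain text. The default sizes (200 words, 40 overlapping) are assumptions for illustration and would need tuning against real transcripts.

```javascript
// Split text into chunks of at most `size` words. Neighbouring chunks share
// `overlap` words so text cut at a boundary keeps some surrounding context.
function chunkify(text, size = 200, overlap = 40) {
  if (overlap >= size) throw new Error('overlap must be smaller than size')
  const words = text.split(/\s+/).filter(Boolean)
  const chunks = []
  const step = size - overlap
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + size).join(' '))
    if (start + size >= words.length) break // this window reached the end
  }
  return chunks
}

const sample = 'one two three four five six seven eight nine ten'
console.log(chunkify(sample, 4, 1))
// → [ 'one two three four', 'four five six seven', 'seven eight nine ten' ]
```

Smaller chunks give more precise matches but less context per match; splitting on speaker turns or sentences instead of raw word counts is another option for conversational transcripts.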
// aim:chat, chat:query
const query = event.body.query // HTTP POST
// Call AWS Bedrock, specify model
const embedding =
  await bedrockClient.embed(query, model)
// Use embedding to get context chunk text
const context = await openSearchClient
  .search(embedding)
// Do prompt engineering here!
const prompt = "Answer question with Context: "
  + context + "\nQuestion: " + query
// Get answer!
const answer = await bedrockClient.invoke(prompt)
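The "prompt engineering" step can be taken a little further than string concatenation. The helper below is a hypothetical sketch, not code from the talk: it assumes the search returns an array of chunk strings ordered best match first, and the character budget and instruction wording are placeholders to adapt to your model.

```javascript
// Hypothetical helper: build the prompt from retrieved chunks and the user's question.
function buildPrompt(contextChunks, question, maxChars = 4000) {
  const picked = []
  let used = 0
  for (const chunk of contextChunks) {
    // Stop before the context exceeds the budget (chunks arrive best match first).
    if (used + chunk.length > maxChars) break
    picked.push(chunk)
    used += chunk.length
  }
  return [
    'Answer the question using only the context below.',
    'If the context does not contain the answer, say so.',
    '',
    'Context:',
    ...picked.map((text, i) => `[${i + 1}] ${text}`),
    '',
    'Question: ' + question,
  ].join('\n')
}

console.log(buildPrompt(
  ['DevRel builds relationships between companies and developers.'],
  'What is developer relations?'
))
```

Numbering the chunks makes it easier to ask the model which passage it relied on, and capping the context keeps cost and latency predictable as the corpus grows.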
// Pro Tip: use a REPL!
wovs/pdm-local> aim:chat,chat:query,query:"what
is developer relations?"
{
ok: true,
why: '',
answer: "Developer Relations is the practice of
building and maintaining relationships between
companies and developers...The goal of Developer
Relations is to make the company's products as
easy to use, understand, and integrate into a
developer's workflow as possible."
}
// Open Source
// Reference implementation
github.com/mikaelvesavuori/bedrock-rag-demo
// Voxgig microservice implementation
github.com/voxgig/podmind
// Blog post (next week)
richardrodger.com
Practicalities
…, you are face to face with the
champion privy builder of
Sangamon County
Paul Savage
brightbeam.com
It’s the Wild West
“I’ve got my stuff rigged to hit mixtral-8x7, and dolphin locally, and
3.5-turbo, and the 4-series preview all with easy comparison in
emacs and stuff, and in fairness the 4.5-preview is starting to show
some edge on 8x7 …
Until I realized Perplexity will give you a decent amount of Mistral
Medium for free ….
Who is sama kidding they’re still leading here? Mistral Medium
destroys the 4.5 preview. And Perplexity wouldn’t be giving it away
in any quantity if it had a cost structure like 4.5 …
Mistral is the new “RenTech of AI”, DPO and Alibi and sliding window
and modern mixtures are well-understood so the money is in the lag
between some new edge and TheBloke having it quantized for a Mac
Mini or 4070 Super …”
https://news.ycombinator.com/item?id=38948291
It’s the Wild West
Here's a glossary to understand this post:
- mixtral-8x7 or 8x7: Open source model by Mistral AI.
- Dolphin: An uncensored version of the mistral model
- 3.5-turbo: GPT-3.5 Turbo, the cheapest API from OpenAI
- 4-series preview OR "4.5 preview": GPT-4 Turbo, the most capable API from OpenAI
- mistral-medium: A new model by Mistral AI that they are only serving through their API. It's in
private beta and there's a waiting list to access it.
- Perplexity: A new search engine that is challenging Google by applying LLM to search
- Sama: Sam Altman, CEO of OpenAI
- RenTech: Renaissance Technologies, a secretive hedge fund known for delivering
impressive returns improving on the work of others
- DPO: Direct Preference Optimization. It is a technique that leverages AI feedback to
optimize the performance of smaller, open-source models like Zephyr-7B.
- Alibi: a Python library that provides tools for machine learning model inspection and
interpretation. It can be used to explain the predictions of any black-box model,
including LLMs.
- Sliding window: a type of attention mechanism introduced by Mistral-7B. It is used to
support longer sequences in LLMs.
- Modern mixtures: The process of using multiple models together, like "mixtral" is a
mixture of several mistral models.
- TheBloke: Open source developer that is very quick at quantizing all new models that
come out
- Quantize: Decreasing memory requirements of a new model by decreasing the precision
of weights, typically with just minor performance degradation.
- 4070 Super: NVIDIA 4070 Super, new graphics card announced just a week ago
https://github.com/mikaelvesavuori/bedrock-rag-demo
So you want to deliver a RAG
project…
Do you want good
performance?
Do you want high quality
answers?
ann-benchmarks.com
Do you like regressions?
Do you like unrealistic
expectations?
Do you like being unable to solve
fundamental limitations? (maybe)
…my experience over the past few
months suggests that for system
programming, LLMs almost never
provide acceptable solutions…
antirez.com/news/140
(Salvatore Sanfilippo - wrote Redis)
But…
I am not an animal brain, I am
not even some attempt to
produce an AI through
software running on a
computer. I am a Culture Mind.
We are close to gods, … we
are quicker; we live faster and
more completely than you do,
with so many more senses,
such a greater store of
memories and at such a fine
level of detail.
- Look to Windward,
Iain M Banks
Thanks!
richardrodger.com

Using RAG to create your own Podcast conversations.pdf

  • 2.
  • 3. The Culture’s inhabitants “could record their mind-states, effectively taking a reading of the person’s personality which could be stored, duplicated, read, transmitted and installed …” Iain M Banks, Look to Windward
  • 4.
  • 5.
  • 6.
  • 7.
  • 8.
  • 9. How do I build this?
  • 10. 1. Design 2. Code 3. Practicalities
  • 11. “Using the podcast audio recordings, build a chat interface that responds like a podcast guest”
  • 12. - ~130 episodes
        - ~30 minutes per episode
        - ~7000 words per transcript
        - 2 new episodes per week
        - Metadata:
          - guest details
          - show notes
  • 13. 1. Ingestion: Get the audio and metadata into the “AI”
        2. Querying: Get the conversational responses out of the “AI”
  • 16. Seems like hard work…? Why not just concatenate all transcripts, and use that as the context prompt?
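
A rough back-of-envelope check on the concatenation idea, using the episode figures from slide 12. The tokens-per-word ratio and the context window size are assumptions chosen for illustration, not figures from the talk.

```javascript
// Figures from the slides: ~130 episodes, ~7000 words per transcript.
const episodes = 130
const wordsPerTranscript = 7000

// ASSUMPTION: roughly 1.3 tokens per English word (varies by tokenizer).
const tokensPerWord = 1.3
// ASSUMPTION: a hypothetical model with a 128k-token context window.
const contextWindowTokens = 128000

const totalWords = episodes * wordsPerTranscript
const totalTokens = Math.round(totalWords * tokensPerWord)
// How many full context windows the concatenated corpus would occupy.
const windowsNeeded = Math.ceil(totalTokens / contextWindowTokens)

console.log({ totalWords, totalTokens, windowsNeeded })
```

Under these assumptions the concatenated corpus is several times larger than a single prompt, and it grows by two episodes per week, which is one reason to retrieve only the relevant chunks per query.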
  • 17. “Vector Embedding” Text -> Concepts (using a “Model”)
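
One way to picture "Text -> Concepts": texts about similar ideas map to vectors that point in similar directions, so closeness can be measured numerically. The sketch below uses cosine similarity, a common choice for this. The three-dimensional vectors are made up for illustration (real embedding models return hundreds or thousands of dimensions), and `cosineSimilarity` is a hypothetical helper, not code from the talk.

```javascript
// Hypothetical helper: cosine of the angle between two equal-length vectors.
function cosineSimilarity(a, b) {
  if (a.length !== b.length) {
    throw new Error('Vectors must have equal length')
  }
  let dot = 0
  let normA = 0
  let normB = 0
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i]
    normA += a[i] * a[i]
    normB += b[i] * b[i]
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB))
}

// ASSUMPTION: toy "embeddings", invented for this example.
const devrel = [0.9, 0.1, 0.2]    // "what is developer relations?"
const community = [0.8, 0.2, 0.3] // "building a developer community"
const cooking = [0.1, 0.9, 0.1]   // "how to bake sourdough"

const related = cosineSimilarity(devrel, community)
const unrelated = cosineSimilarity(devrel, cooking)

console.log({ related, unrelated })
```

With these toy values the two developer-related texts score much closer to 1 than the unrelated pair, which is the property a vector search relies on.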
  • 19. The vector database does the embedding for you
  • 21. Large Language Models (LLMs)
        - Very large neural networks
        - Use the Transformer* architecture
        - A vector embedding
        - Cares about word order
        - Cares about word context
        * Attention is All You Need, Vaswani et al. 2017
  • 23. Code
  • 24. // Microservice messages
        aim: ingest: {
          // Get transcription
          transcribe: episode: {}
          // Do “embedding”
          ingest: transcript: {}
        }
        aim: chat: {
          // Use prompt context to reply
          chat: query: {}
        }
  • 25. // aim:ingest, transcribe:episode
        // lambda called when there’s
        // a new audio file in s3
        const audio = loadAudio(event)
        // Use deepgram.com to get the conversation text
        const result = await deepgram
          .listen
          .prerecorded
          .transcribeFile(
            audio.content,
            options
          )
        // Save transcript to s3
        saveTranscript(result)
  • 26. // aim:ingest, ingest:transcript
        // lambda called when there’s
        // a new transcript file in s3
        const transcript = loadTranscript(event)
        // Split into “chunks”, each to be added to
        // an OpenSearch vector collection
        const chunks = chunkify(transcript)
        // for...of visits each chunk (for...in would yield array indexes)
        for (const chunk of chunks) {
          // Call AWS Bedrock, specify model
          // (await assumes the client wrappers return Promises)
          const embedding = await bedrockClient.embed(chunk, model)
          // Store embedding vectors in OpenSearch
          await openSearchClient.store(embedding)
        }
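
The slide calls `chunkify(transcript)` without showing it. A minimal sketch of one possible approach, fixed-size word windows with some overlap so that a sentence cut at a boundary still appears whole in one chunk. It assumes the transcript is available as plain text; the default sizes are illustrative guesses, not values from the talk.

```javascript
// Hypothetical implementation: split text into overlapping word windows.
// chunkSize and overlap are counted in words; the defaults are assumptions.
function chunkify(text, chunkSize = 200, overlap = 40) {
  if (overlap >= chunkSize) {
    throw new Error('overlap must be smaller than chunkSize')
  }
  const words = text.split(/\s+/).filter(Boolean)
  const chunks = []
  const step = chunkSize - overlap
  for (let start = 0; start < words.length; start += step) {
    chunks.push(words.slice(start, start + chunkSize).join(' '))
    // Stop once this window has reached the end of the text.
    if (start + chunkSize >= words.length) break
  }
  return chunks
}

// Usage with a tiny ten-word "transcript" and small windows.
const sample = 'w0 w1 w2 w3 w4 w5 w6 w7 w8 w9'
const sampleChunks = chunkify(sample, 4, 1)
console.log(sampleChunks)
```

Smaller chunks give more precise matches but less surrounding context per hit; splitting on speaker turns or sentences is another reasonable choice for conversational transcripts.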
  • 27. // aim:chat, chat:query
        const query = event.body.query // HTTP POST
        // Call AWS Bedrock, specify model
        // (await assumes the client wrappers return Promises)
        const embedding = await bedrockClient.embed(query, model)
        // Use embedding to get context chunk text
        const context = await openSearchClient
          .search(embedding)
        // Do prompt engineering here!
        const prompt = "Answer question with Context: " + context +
          "\nQuestion: " + query
        // Get answer!
        const answer = await bedrockClient.invoke(prompt)
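
The "prompt engineering" step can be sketched as a small function. This version assumes the search returns an array of chunk strings ordered most relevant first, and it stops adding chunks once a character budget is reached. `buildPrompt`, the separator, and the budget are hypothetical choices for illustration; only the prompt wording comes from the slide.

```javascript
// Hypothetical helper: assemble a prompt from retrieved chunks.
// ASSUMPTION: contextChunks is ordered most relevant first.
function buildPrompt(query, contextChunks, maxContextChars = 4000) {
  const selected = []
  let used = 0
  for (const chunk of contextChunks) {
    // Stop before the context budget would be exceeded.
    if (used + chunk.length > maxContextChars) break
    selected.push(chunk)
    used += chunk.length
  }
  return 'Answer question with Context: ' + selected.join('\n---\n') +
    '\nQuestion: ' + query
}

// Usage: with a budget of 8 characters only the first two chunks fit.
const examplePrompt = buildPrompt('q', ['aaaa', 'bbbb', 'cccc'], 8)
console.log(examplePrompt)
```

Keeping the budget explicit makes it easier to compare answer quality against prompt size when tuning.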
  • 28. // Pro Tip: use a REPL!
        wovs/pdm-local> aim:chat,chat:query,query:"what is developer relations?"
        {
          ok: true,
          why: '',
          answer: "Developer Relations is the practice of building and maintaining relationships between companies and developers...The goal of Developer Relations is to make the company's products as easy to use, understand, and integrate into a developer's workflow as possible."
        }
  • 29. // Open Source
        // Reference implementation
        github.com/mikaelvesavuori/bedrock-rag-demo
        // Voxgig microservice implementation
        github.com/voxgig/podmind
        // Blog post (next week)
        richardrodger.com
  • 30. Practicalities …, you are face to face with the champion privy builder of Sangamon County
  • 32. It’s the Wild West
        “I’ve got my stuff rigged to hit mixtral-8x7, and dolphin locally, and 3.5-turbo, and the 4-series preview all with easy comparison in emacs and stuff, and in fairness the 4.5-preview is starting to show some edge on 8x7 … Until I realized Perplexity will give you a decent amount of Mistral Medium for free …. Who is sama kidding they’re still leading here? Mistral Medium destroys the 4.5 preview. And Perplexity wouldn’t be giving it away in any quantity if it had a cost structure like 4.5 … Mistral is the new “RenTech of AI”, DPO and Alibi and sliding window and modern mixtures are well-understood so the money is in the lag between some new edge and TheBloke having it quantized for a Mac Mini or 4070 Super …”
        https://news.ycombinator.com/item?id=38948291
  • 33. It’s the Wild West
        Here's a glossary to understand this post:
        - mixtral-8x7 or 8x7: Open source model by Mistral AI.
        - Dolphin: An uncensored version of the mistral model
        - 3.5-turbo: GPT-3.5 Turbo, the cheapest API from OpenAI
        - 4-series preview OR "4.5 preview": GPT-4 Turbo, the most capable API from OpenAI
        - mistral-medium: A new model by Mistral AI that they are only serving through API. It's in private beta and there's a waiting list to access it.
        - Perplexity: A new search engine that is challenging Google by applying LLM to search
        - Sama: Sam Altman, CEO of OpenAI
        - RenTech: Renaissance Technologies, a secretive hedge fund known for delivering impressive returns improving on the work of others
        - DPO: Direct Preference Optimization. It is a technique that leverages AI feedback to optimize the performance of smaller, open-source models like Zephyr-7B.
        - Alibi: a Python library that provides tools for machine learning model inspection and interpretation. It can be used to explain the predictions of any black-box model, including LLMs.
        - Sliding window: a type of attention mechanism introduced by Mistral-7B. It is used to support longer sequences in LLMs.
        - Modern mixtures: The process of using multiple models together, like "mixtral" is a mixture of several mistral models.
        - TheBloke: Open source developer that is very quick at quantizing all new models that come out
        - Quantize: Decreasing memory requirements of a new model by decreasing the precision of weights, typically with just minor performance degradation.
        - 4070 Super: NVIDIA 4070 Super, new graphics card announced just a week ago
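
The glossary's "Quantize" entry can be made concrete with a toy example. This is a minimal sketch of symmetric 8-bit quantization of a handful of weights; the weight values and helper names are invented for illustration, and quantizers used for real models are considerably more elaborate.

```javascript
// Hypothetical helper: map floats to int8 using one shared scale factor.
function quantizeInt8(weights) {
  const maxAbs = Math.max(...weights.map((w) => Math.abs(w)))
  // Avoid dividing by zero when every weight is 0.
  const scale = maxAbs === 0 ? 1 : maxAbs / 127
  const q = Int8Array.from(weights, (w) => Math.round(w / scale))
  return { q, scale }
}

// Hypothetical helper: recover approximate floats from the int8 values.
function dequantize({ q, scale }) {
  return Array.from(q, (v) => v * scale)
}

// ASSUMPTION: made-up weights, not taken from any real model.
const weights = [0.12, -0.5, 0.33, 0.9, -0.07]
const packed = quantizeInt8(weights)
const restored = dequantize(packed)

// One byte per weight, versus four bytes each as 32-bit floats.
console.log(packed.q.byteLength, new Float32Array(weights).byteLength)
console.log(restored)
```

Each restored weight differs from the original by at most half the scale factor, which is the "minor degradation" traded for the smaller memory footprint.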
  • 35. So you want to deliver a RAG project…
  • 36. Do you want good performance?
  • 37. Do you want high quality answers? ann-benchmarks.com
  • 38. Do you like regressions?
  • 39. Do you like unrealistic expectations?
  • 40. Do you like being unable to solve fundamental limitations? (maybe) …my experience over the past few months suggests that for system programming, LLMs almost never provide acceptable solutions… antirez.com/news/140 (Salvatore Sanfilippo - wrote Redis)
  • 42. I am not an animal brain, I am not even some attempt to produce an AI through software running on a computer. I am a Culture Mind. We are close to gods, … we are quicker; we live faster and more completely than you do, with so many more senses, such a greater store of memories and at such a fine level of detail. - Look to Windward, Iain M Banks