SlideShare una empresa de Scribd logo
1 de 5
Descargar para leer sin conexión
Application of W-curves and TSP to Clustering HIV1 Sequences

                  Douglas  Cork1,2,3, Steven Lembark4, Nelson Michael1,5, Jerome Kim1,5

  US Military HIV Research Program1; Henry M. Jackson Foundation for the Advancement of Military 
  Medicine2, Rockville, MD.  20850, BCPS Dept., Illinois Institute of Technology3, Chicago, IL. 60616, 
  Workhorse Computing4, Woodhaven, NY.,  Walter Reed Army Institute of Research5, Rockville, MD.

Abstract - The high mutation rate in HIV-1 makes it             automated process that can identify and compare drug
difficult to treat and analyze. Monitoring the evolution of     resistance sites would be an enormous help. The W-
drug resistance requires frequent resequencing, but             curve's scoring process produces a set of localized
comparing and visualizing the progress is difficult. One        comparison values that can be used as landmarks that
difficulty is simply locating the areas of interest: gaps and   can be used to automate the process of locating relevant
crossover mutations make it difficult to isolate clinically     areas for comparison. Clustering new samples with
significant sequences for comparison. Effectively               known ones using the Traveling Salesman Problem
displaying the results of comparisons grouped according         (“TSP”) can automate the grouping of new samples into
to multiple regions is also a problem. Our comparison           drug-resistance categories. Combining them leads to an
algorithm based on the W-curve helps automate the               automated process for tracking medication in
comparison process, producing results suitable for              individuals or analysis of groups.
clustering via a modified solution to the Traveling
Salesman Problem (“TSP”). Appropriate color-coding of
the TSP results allows us to display the results of multiple    3       Background
comparisons effectively for single samples or time-series.      Analyzing the data in many HIV studies is made
The results can be useful for providing guidance in             difficult by a combination of HIV's genetics and the
treatment, analyzing the membership in anonymous study          sources of data. Patients in many HIV clinics are
populations, tracking the evolution of drug resistance in       anonymous, often because they are illegally engaged in
populations, or rates of co-infection within study groups.      prostitution, illegal drug use, or homosexual sex.
                                                                Studies of groups look for new strains and how they
Key words:  HIV­1 Genomic Clustering, W­curve,                  spread and have to check for changing members of the
Traveling Salesman Problem, Drug Resistance                     sample population; individual patients have to be
                                                                sequenced frequently to evaluate appropriate drugs.
1        Disclaimer                                         HIV's high mutation and crossover rates combined by
                                                            discontinuous sequences for the drug responses make
The opinions or assertions contained herein are the private the analysis difficult. The high mutation in areas
views of the author, and are not to be construed as         between the regions of interest confound any analysis
official, or as reflecting true views of the Dept. of the   based on whole genes or regions: there is simply too
Army or the DoD.                                            much white noise on the genome to use large portions
                                                            of it for comparison. High rates of mutation and co-
 2        Introduction                                      infection leave many individuals with multiple strains
                                                            of HIV, further confounding the analysis, and HIV
One step in managing HIV infection response is              strains have frequent crossover mutations, making
monitoring individuals and study populations for            things even worse.
evolution of drug resistance. Given in vivo replication
                                                            Treatment of HIV is still largely a manual process:
kinetics with more than 109 new cells infected every day,
                                                            patients have to be sampled and sequenced frequently
each and every possible single-point mutation occurs
                                                            and doctors have to make informed decisions on how to
between 104 and 105 times per day in an HIV infected
                                                            deal with evolution of the strains infecting them.
individuals [5]. This leaves identifying the markers for
                                                            Presenting sequence comparison results is particularly
drug resistance difficult, requiring multiple manual steps
                                                            important: physicians have to evaluate the similarity of
in many cases. Especially for population studies, an
                                                            a single individual to multiple known drug-resistant
strains.                                                      sequences will leave the curves in the same quadrant.
                                                              This keeps our same-quadrant measure small for SNP's.
In the US 98% of the infections are type-B, and the single
FDA-approved program for treatment using genetics             We use a two-pass process for comparing W-curves.
treats type-B only. The U.S. Army has to treat soldiers       The first pass produces an array of alignment regions
who acquire HIV all over the world and only 85% of their      with starts, stops, and a difference measure that we call
cases are type-B, leaving them without any good options       “chunks”. SNP's increase the difference measure, gaps
for 15% of their cases. Being able to at least classify the   show up as differences in the starting values on
non-B cases and evaluate their treatment outcomes             successive chunks of the comparison, indels as
effectively would be a huge help.                             successive chunks with no change in the relative offset.
                                                              The second pass summarizes the chunks into whatever
                                                              measures are useful, for example by averaging the
4          Methodology                                        differences over the length of a sequence.
                                                              The advantage of chunking the results first is that
4.1        W-curves                                           similar chunks can be used to locate landmarks for
The W-curve was originally designed as the basis for a        aligning sequences with one another. Areas with small
graphical tool for visualizing very large regions of DNA      differences provide an automated means of locating the
[3,4]. Over time it has been adapted for numerical            offsets between start and stop values in the sequences.
comparison as well [6]. By abstracting the genetic            Given a library of sequences with known landmarks
sequences into a three-dimensional space, W-curves offer      such as points of known drug resistance, we can score a
a wider range of comparisons rather than comparisons          single incoming sequence against all of them.
based on searching, aligning and tree-building (assuming
a mutation model) with unidimensional strings of
characters. W-curves make it easier to find patterns in       4.2      TSP and Clustering
sequences or locate common features between genes.        The Traveling Salesman Problem (“TSP”) is quite easy
Their 3D nature also makes it possible to align smaller   to describe but difficult to solve. The problem is to take
features than are possible with string based techniques.  a list of distances between cities and make a tour of
The design of our comparison utilities builds on these    them which visits each city once with the least total
capabilities to provide a technique for matching the smalldistance. Substitute cost and the algorithm will find the
regions of HIV-1's CD4 epitopes along its gp120 gene.     “cheapest” route through all of the cities. As it turns
                                                          out, this problem is NP-complete, requiring analysis of
The order of bases along the corners is significant:      all routes to guarantee the least distance. Much work
number of hydrogen bonds (2 or 3) and chemical structure has gone into developing heuristics for solving this
(purine or pyramidine) share quadrants around the square. problem and there are fast algorithms for approximating
This means that most synonymous SNP's in the gene         the solution.
                                                              The utility of TSP problems is that an optimal tour will
                                                              cluster the closest cities together. If the difference
                                                              measures are for genes, they can similarly be clustered
                                                              on any region of interest. A number of techniques for
                                                              determining inter- and intra-clade distances have been
                                                              developed. One technique developed by Climer and
                                                              Zhang at Washington University is to add a fixed
                                                              number of “dummy cities” to the list [7]. Each dummy
                                                              has a small distance to all other cities (we use 2-20). The
                                                              non-zero distance leaves these cities in the intra-cluster
                                                              gaps. We display the resulting tours as color-coded
                                                              pinwheel diagrams. Appropriate color-coding makes
                                                              these relatively easy to analyze individually or compare
                                                              to one another.



Figure 1: W­curve generation showing initial  
points for curve "CG".
Figure 2: Example of tour generated by R's TSP package of the wild, Library of drug­resistant  
(“dr”), and random sample strains of HIV­1. The library strains cluster tightly together at the 
top (red) with the wild strain (#33) outside the drug­resistant cluster to the right. The approach 
shown here works well if individuals are sampled over time: pre­assigning the colors from a first 
tour makes it relatively easy to watch if samples from individuals drift into different groups.

4.3      Generating and Analyzing TSP                        have found that this works better than simply assigning 
         Clusters with R                                     the colors sequentially from one to the sample size. 
The R statistical package includes a TSP library, available  Assigning the colors by position along the tour gives 
from CRAN. We have used it here to generate an               some additional visual feedback from the similarity of 
approximate solution for clustering the genes. In our case  colors. This can be seen in Figure 2 where the closely 
an optimal tour is not required: any good approximation      related “library” sequences are all red. We have found 
will cluster the genes properly. Our approach starts with a  this color scheme also works well for comparing tours 
square distance matrix having zeros on the diagonals. The  generated from different sections of the same genes: 
TSP package does not require a symmetric matrix but in       even if the nodes appear in different orders on different 
our case we use one.                                         graphs the similar colors grouping together help 
                                                             identify the clusters.
The colors shown in Figure 2 are assigned by generating a 
list of 1024 colors and rotating the tour so that it starts  A slightly different color scheme can be used to observe 
with the first library sequence. After that the colors are   the evolution of drug resistant strains in a population. In 
assigned by taking the fractional tour length times 1024.    that case coloring the time scale allows us to watch 
For example, if the total tour length were 20 and one of     which direction the group is evolving. With the library 
the sequences fell at a cumulative distance of 9 then it     sequences colored red and the population samples green 
would be assigned a color of 9 / 20 * 1024, or 460. We       through blue over time, the migration of blue dots 
Figure 3: Example W­curves POL gene from wild (top) and example drug resistant HIV­1 strains  
[6]. Local features of the W­curve geometry can provide landmarks for isolating smaller  
regions of the sequences for comparison [8].

towards library samples is easy to pick out. The library      sequences with known clinical results. With the library 
samples can be drug resistant, or susceptible to different    samples in one color and the individual's sequence of 
drugs. Either way, the progression across clusters is         samples colored over time we can see how the various 
relatively easy to view.                                      samples migrate between clusters. This provides a nice 
                                                              way to integrate treatment information about an 
This approach also works well for integrating samples of 
                                                              individual that may rely on samples of unrelated genes.
an individual over time. We can display results of 
comparing various regions of an individual to a library of 
4.4      Combining the TSP and W-Curve
The TSP approach shown here will work for any 
comparison mechanism: ClustalW, Fasta, or Blast provide 
a suitable square comparison matrix and generate the 
                                                              7        References
graphic results from R. Our use of the W­curve has an             [1] Leitner T, Korber B, Daniels M, Calef C, Foley
advantage due to chunks: we can automate scoring                      B. (2005) HIV-1 Subtype and Circulating
discontinuous regions that may have differing alignments.             Recombinant Form (CRF) Reference
This matters with HIV­1 where the high rate of SNP's                  Sequences., http://hiv.lanl.gov
leaves too much white noise in larger areas and the high          [2] Brown BK, Sanders-Buell E, Rosa Borges A,
rate of gaps makes locating the often small, discontinuous            Robb ML, Birx DL, et al. (2008) Crossclade
areas causing drug resistance difficult. The curve's                  neutralization patterns among HIV1 strains from
geometry also provides us a more feature­rich                         the six major clades of the pandemic evaluated
environment in which to compare the curves. Figure 3                  and compared in two different models. Virology
shows the wild (HXB2 standard) and five drug resistant                375: 529–538.
sequences [citation: website with the sequences we used].         [3] Wu D, Roberge J, Cork DJ, Nguyen BG, Grace
Differences in geometry are visible, even when viewed at              T (1993) Computer visualization of long
different scales [citation: last year's publication]. The             genomic sequences, in Visualization 1993, IEEE
geometric representation also offers us more options for              Press, New York City, New York, CP 33:308–
approximate matching using discrete spatial mathematics               315.
than string comparison techniques allow.                          [4] Cork D, Lembark S, Tovanabutra S, Sanders-
                                                                      Buell E, Brown B, Robb M, Wieczorek L,
                                                                      Polonis V, Michael N, Kim J.
5        Conclusions                                                  Application of W-curves and TSP to Clustering
The W-curve provides us with a way of using landmarks                 HIV-1 Sequences by Epitope. 847-853
to identify regions of interest and score only the relevant          http://www.informatik.uni-
portions of sample sequences. The TSP with Climer &                  trier.de/~ley/db/conf/biocomp/biocomp2010.html
Zhang's boundary techniques offers a fast, effective way          [5]. Coffin JM, (1995) HIV population dynamics in
to cluster genes. The R statistical package provides us               vivo: implications for genetic variation,
with the tools to analyze and color-code the results for              pathogenesis, and therapy. Science 267, 483-
analysis. Taken together this provides us with a useful               489.
tool for comparing the status and evolution drug response         [6] Stanford HIV RT and Protease Sequence
for HIV-1.                                                            Database
                                                                     http://hivdb.stanford.edu/pages/genotype-rx.html
6        Acknowledgments                                          [7] http://mypages.iit.edu/˜cork/houstep/introduction.
Military HIV-1 Research Project (MHRP)/Henry Jackson              [8] http://www.bioinformatics.org/wcurve/
Foundation (HJF) and Workhorse Computing. This work               [9] Cork DJ, Lembark S, Tovanabutra S, Robb ML,
was funded by MHRP (Military HIV-1 Research Project)                  Kim JH (2010) W-curve Alignments for HIV1
cooperative agreement (W81XWH-07-2-0067-P00001)                       Genomic Comparisons. PLoS ONE 5(6):
between the Henry M. Jackson Foundation for the                       e10829. doi:10.1371/journal.plone.0010829
Advancement of Military Medicine, Inc., and the U.S.              [10] Climer S, Zhang W (2004) Take a walk and
Department of Defense (DOD). The cohort work was                      cluster genes: a TSPbased approach to optimal
supported through NICHD by R01 HD34343-03 and                         rearrangement clustering. ACM International
partly by a cooperative agreement between the Henry M.                Conference Proceeding Series; Vol. 69, p. 22.
Jackson Foundation for the Advancement of Military                    http://portal.acm.org/citation.cfm?id=1015419
Medicine and the U.S. Dept. of Defense. The funders had           [11] Zhou T, Xu L, Dey B, Hessell AJ, Van Ryk D,
no role in study design, data collection and analysis,                Xiang SH, Yang X, Zhang MY, Zwick MB,
decision to publish, or preparation of the manuscript.                Arthos J, Burton DR, Dimitrov DS, Sodroski J,
The authors wish to thank Pranamee Sarma and Scott                    Wyatt R, Nabel GJ, Kwong PD (2007)
Zintek for their assistance preparing the W-curve graphics            Structural definition of a conserved
used in this paper.                                                   neutralization epitope on HIV-1 gp120. Feb
                                                                      15;445(7129):732-7.

Más contenido relacionado

La actualidad más candente

2014QBioPoster_Jie
2014QBioPoster_Jie2014QBioPoster_Jie
2014QBioPoster_JieZhao Jie
 
The Recent advances in gene delivery using nanostructures and future prospects
The Recent advances in gene delivery using nanostructures and future prospectsThe Recent advances in gene delivery using nanostructures and future prospects
The Recent advances in gene delivery using nanostructures and future prospectsAANBTJournal
 
Bioinformatics-driven discovery of EGFR mutant Lung Cancer
Bioinformatics-driven discovery of EGFR mutant Lung CancerBioinformatics-driven discovery of EGFR mutant Lung Cancer
Bioinformatics-driven discovery of EGFR mutant Lung CancerPreveenRamamoorthy
 
Gb 2011-12-9-228
Gb 2011-12-9-228Gb 2011-12-9-228
Gb 2011-12-9-228鋒博 蔡
 
Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...koda004
 
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...Université de Dschang
 
Integrative analysis and visualization of clinical and molecular data for can...
Integrative analysis and visualization of clinical and molecular data for can...Integrative analysis and visualization of clinical and molecular data for can...
Integrative analysis and visualization of clinical and molecular data for can...Lake Como School of Advanced Studies
 
Association mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeAssociation mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeSenthil Natesan
 
Recent advances in genetic Predisposition of Myasthenia Gravis
Recent advances in genetic Predisposition of Myasthenia GravisRecent advances in genetic Predisposition of Myasthenia Gravis
Recent advances in genetic Predisposition of Myasthenia Gravisangelisralopez
 
Heinrich_et_al-2003-International_Journal_of_Cancer
Heinrich_et_al-2003-International_Journal_of_CancerHeinrich_et_al-2003-International_Journal_of_Cancer
Heinrich_et_al-2003-International_Journal_of_CancerBianca Heinrich
 
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...Antoaneta Vladimirova
 
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...CrimsonPublishers-SBB
 
NGS in cancer treatment
NGS in cancer treatmentNGS in cancer treatment
NGS in cancer treatmentNur Suhaida
 
Joseph Levy MedicReS World Congress 2013 - 2
Joseph Levy MedicReS World Congress 2013 - 2Joseph Levy MedicReS World Congress 2013 - 2
Joseph Levy MedicReS World Congress 2013 - 2MedicReS
 

La actualidad más candente (20)

2014QBioPoster_Jie
2014QBioPoster_Jie2014QBioPoster_Jie
2014QBioPoster_Jie
 
Review paper
Review paperReview paper
Review paper
 
The Recent advances in gene delivery using nanostructures and future prospects
The Recent advances in gene delivery using nanostructures and future prospectsThe Recent advances in gene delivery using nanostructures and future prospects
The Recent advances in gene delivery using nanostructures and future prospects
 
Bioinformatics-driven discovery of EGFR mutant Lung Cancer
Bioinformatics-driven discovery of EGFR mutant Lung CancerBioinformatics-driven discovery of EGFR mutant Lung Cancer
Bioinformatics-driven discovery of EGFR mutant Lung Cancer
 
Editorial frontiers ov
Editorial frontiers ovEditorial frontiers ov
Editorial frontiers ov
 
Gb 2011-12-9-228
Gb 2011-12-9-228Gb 2011-12-9-228
Gb 2011-12-9-228
 
Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...Whole genome scanning, resolving clinical diagnosis and management amaist com...
Whole genome scanning, resolving clinical diagnosis and management amaist com...
 
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...
Probability Models for Estimating Haplotype Frequencies and Bayesian Survival...
 
AIDS_21(7)_807-811
AIDS_21(7)_807-811AIDS_21(7)_807-811
AIDS_21(7)_807-811
 
Integrative analysis and visualization of clinical and molecular data for can...
Integrative analysis and visualization of clinical and molecular data for can...Integrative analysis and visualization of clinical and molecular data for can...
Integrative analysis and visualization of clinical and molecular data for can...
 
Nrgastro.2011.105
Nrgastro.2011.105Nrgastro.2011.105
Nrgastro.2011.105
 
Association mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maizeAssociation mapping approaches for tagging quality traits in maize
Association mapping approaches for tagging quality traits in maize
 
Recent advances in genetic Predisposition of Myasthenia Gravis
Recent advances in genetic Predisposition of Myasthenia GravisRecent advances in genetic Predisposition of Myasthenia Gravis
Recent advances in genetic Predisposition of Myasthenia Gravis
 
10.1128@jcm.00298 17
10.1128@jcm.00298 1710.1128@jcm.00298 17
10.1128@jcm.00298 17
 
Heinrich_et_al-2003-International_Journal_of_Cancer
Heinrich_et_al-2003-International_Journal_of_CancerHeinrich_et_al-2003-International_Journal_of_Cancer
Heinrich_et_al-2003-International_Journal_of_Cancer
 
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...
A Retrospective Analysis of Exome Sequencing Cases Using the GenePool™ Genomi...
 
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...
Crimson Publishers- New Hope for Cancer Immunotherapy: Viral Based Cancer Vac...
 
JCV2010
JCV2010JCV2010
JCV2010
 
NGS in cancer treatment
NGS in cancer treatmentNGS in cancer treatment
NGS in cancer treatment
 
Joseph Levy MedicReS World Congress 2013 - 2
Joseph Levy MedicReS World Congress 2013 - 2Joseph Levy MedicReS World Congress 2013 - 2
Joseph Levy MedicReS World Congress 2013 - 2
 

Destacado

Vl3 - The Web
Vl3 - The WebVl3 - The Web
Vl3 - The Webrobertour
 
Nasser Mansour CV_ 2016
Nasser Mansour CV_  2016Nasser Mansour CV_  2016
Nasser Mansour CV_ 2016Nasser Mansour
 
Nagorik uddyog’s economic and social contribution
Nagorik uddyog’s economic and social contributionNagorik uddyog’s economic and social contribution
Nagorik uddyog’s economic and social contributionAshraf Ahmed
 
How to-write-academic-acknowledgements
How to-write-academic-acknowledgementsHow to-write-academic-acknowledgements
How to-write-academic-acknowledgementsNaveed Ahmed Unar
 
LASUG Insite Plus Sitecore Connector
LASUG Insite Plus Sitecore ConnectorLASUG Insite Plus Sitecore Connector
LASUG Insite Plus Sitecore ConnectorKautilya Prasad
 
Hunting terminal dan pasar
Hunting terminal dan pasarHunting terminal dan pasar
Hunting terminal dan pasarYovie Mar
 
Regulation of atp7 a gene expression by the grx1 as an inducer in menkes d...
Regulation of atp7 a gene expression by the    grx1 as an inducer in menkes d...Regulation of atp7 a gene expression by the    grx1 as an inducer in menkes d...
Regulation of atp7 a gene expression by the grx1 as an inducer in menkes d...Pranamee Sarma
 
Extending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectExtending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectKautilya Prasad
 
Extending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectExtending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectKautilya Prasad
 
Qlik viewご紹介 v1.0
Qlik viewご紹介 v1.0Qlik viewご紹介 v1.0
Qlik viewご紹介 v1.0Yusuke-Ishii
 

Destacado (14)

[STAY TUNED- Estigues connectat!]
[STAY TUNED- Estigues connectat!][STAY TUNED- Estigues connectat!]
[STAY TUNED- Estigues connectat!]
 
Vl3 - The Web
Vl3 - The WebVl3 - The Web
Vl3 - The Web
 
Nasser Mansour CV_ 2016
Nasser Mansour CV_  2016Nasser Mansour CV_  2016
Nasser Mansour CV_ 2016
 
Cancer research paper
Cancer research paperCancer research paper
Cancer research paper
 
Homeschooling
HomeschoolingHomeschooling
Homeschooling
 
Nagorik uddyog’s economic and social contribution
Nagorik uddyog’s economic and social contributionNagorik uddyog’s economic and social contribution
Nagorik uddyog’s economic and social contribution
 
How to-write-academic-acknowledgements
How to-write-academic-acknowledgementsHow to-write-academic-acknowledgements
How to-write-academic-acknowledgements
 
LASUG Insite Plus Sitecore Connector
LASUG Insite Plus Sitecore ConnectorLASUG Insite Plus Sitecore Connector
LASUG Insite Plus Sitecore Connector
 
Hunting terminal dan pasar
Hunting terminal dan pasarHunting terminal dan pasar
Hunting terminal dan pasar
 
Regulation of atp7 a gene expression by the grx1 as an inducer in menkes d...
Regulation of atp7 a gene expression by the    grx1 as an inducer in menkes d...Regulation of atp7 a gene expression by the    grx1 as an inducer in menkes d...
Regulation of atp7 a gene expression by the grx1 as an inducer in menkes d...
 
Extending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectExtending Sitecore Commerce Connect
Extending Sitecore Commerce Connect
 
Extending Sitecore Commerce Connect
Extending Sitecore Commerce ConnectExtending Sitecore Commerce Connect
Extending Sitecore Commerce Connect
 
Qlik viewご紹介 v1.0
Qlik viewご紹介 v1.0Qlik viewご紹介 v1.0
Qlik viewご紹介 v1.0
 
Sim week 06 chapter 01
Sim week 06   chapter 01Sim week 06   chapter 01
Sim week 06 chapter 01
 

Similar a Wc 2011-final

Gene sequencing and tb – a new age approach.pptx
Gene sequencing and tb – a new age approach.pptxGene sequencing and tb – a new age approach.pptx
Gene sequencing and tb – a new age approach.pptxShajahanPS
 
The emerging picture of host genetic control of susceptibility and outcome in...
The emerging picture of host genetic control of susceptibility and outcome in...The emerging picture of host genetic control of susceptibility and outcome in...
The emerging picture of host genetic control of susceptibility and outcome in...Meningitis Research Foundation
 
Developing a framework for for detection of low frequency somatic genetic alt...
Developing a framework for for detection of low frequency somatic genetic alt...Developing a framework for for detection of low frequency somatic genetic alt...
Developing a framework for for detection of low frequency somatic genetic alt...Ronak Shah
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencingIncedo
 
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...Tobias Meißner
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategiesAshfaq Ahmad
 
From reads to pathways for efficient disease gene finding
From reads to pathways for efficient disease gene findingFrom reads to pathways for efficient disease gene finding
From reads to pathways for efficient disease gene findingJoaquin Dopazo
 
Nanomaterials for Virus Detection
Nanomaterials for Virus DetectionNanomaterials for Virus Detection
Nanomaterials for Virus DetectionRichardJGray
 
Crossectional studies 2
Crossectional studies 2Crossectional studies 2
Crossectional studies 2Bruno Mmassy
 
BMJ Open-2016-Conway Morris--2
BMJ Open-2016-Conway Morris--2BMJ Open-2016-Conway Morris--2
BMJ Open-2016-Conway Morris--2Tracey Mare
 
Report- Genome wide association studies.
Report- Genome wide association studies.Report- Genome wide association studies.
Report- Genome wide association studies.Varsha Gayatonde
 
Systems biology for Medicine' is 'Experimental methods and the big datasets
Systems biology for Medicine' is 'Experimental methods and the big datasetsSystems biology for Medicine' is 'Experimental methods and the big datasets
Systems biology for Medicine' is 'Experimental methods and the big datasetsimprovemed
 
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia Líquida
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia LíquidaTEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia Líquida
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia LíquidaComunicaoIberlab
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation SequencingShelomi Karoon
 
A New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicA New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicJoaquin Dopazo
 

Similar a Wc 2011-final (20)

Gene sequencing and tb – a new age approach.pptx
Gene sequencing and tb – a new age approach.pptxGene sequencing and tb – a new age approach.pptx
Gene sequencing and tb – a new age approach.pptx
 
HIV Resistance (Journal Club)
HIV Resistance (Journal Club)HIV Resistance (Journal Club)
HIV Resistance (Journal Club)
 
Scientific Sessions 2015: Typing of STI pathogens
Scientific Sessions 2015: Typing of STI pathogensScientific Sessions 2015: Typing of STI pathogens
Scientific Sessions 2015: Typing of STI pathogens
 
GWAS
GWASGWAS
GWAS
 
The emerging picture of host genetic control of susceptibility and outcome in...
The emerging picture of host genetic control of susceptibility and outcome in...The emerging picture of host genetic control of susceptibility and outcome in...
The emerging picture of host genetic control of susceptibility and outcome in...
 
Developing a framework for for detection of low frequency somatic genetic alt...
Developing a framework for for detection of low frequency somatic genetic alt...Developing a framework for for detection of low frequency somatic genetic alt...
Developing a framework for for detection of low frequency somatic genetic alt...
 
Next generation sequencing
Next generation sequencingNext generation sequencing
Next generation sequencing
 
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...
OncoRep: A n-of-1 reporting tool to support genome-guided treatment for breas...
 
Gene hunting strategies
Gene hunting strategiesGene hunting strategies
Gene hunting strategies
 
From reads to pathways for efficient disease gene finding
From reads to pathways for efficient disease gene findingFrom reads to pathways for efficient disease gene finding
From reads to pathways for efficient disease gene finding
 
Nanomaterials for Virus Detection
Nanomaterials for Virus DetectionNanomaterials for Virus Detection
Nanomaterials for Virus Detection
 
Crossectional studies 2
Crossectional studies 2Crossectional studies 2
Crossectional studies 2
 
BMJ Open-2016-Conway Morris--2
BMJ Open-2016-Conway Morris--2BMJ Open-2016-Conway Morris--2
BMJ Open-2016-Conway Morris--2
 
Report- Genome wide association studies.
Report- Genome wide association studies.Report- Genome wide association studies.
Report- Genome wide association studies.
 
Systems biology for Medicine' is 'Experimental methods and the big datasets
Systems biology for Medicine' is 'Experimental methods and the big datasetsSystems biology for Medicine' is 'Experimental methods and the big datasets
Systems biology for Medicine' is 'Experimental methods and the big datasets
 
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia Líquida
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia LíquidaTEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia Líquida
TEMPUS xT, Biópsia sólida vs TEMPUS xF, Biópsia Líquida
 
Next Generation Sequencing
Next Generation SequencingNext Generation Sequencing
Next Generation Sequencing
 
A New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The ClinicA New Generation Of Mechanism-Based Biomarkers For The Clinic
A New Generation Of Mechanism-Based Biomarkers For The Clinic
 
A genetic model for neurodevelopmental disease
A genetic model for neurodevelopmental diseaseA genetic model for neurodevelopmental disease
A genetic model for neurodevelopmental disease
 
journal.pone.0006828.PDF
journal.pone.0006828.PDFjournal.pone.0006828.PDF
journal.pone.0006828.PDF
 

Último

Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...adilkhan87451
 
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋TANUJA PANDEY
 
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...adilkhan87451
 
Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Vipesco
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappInaaya Sharma
 
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...Sheetaleventcompany
 
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls * UPA...
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls  * UPA...Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls  * UPA...
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls * UPA...mahaiklolahd
 
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...GENUINE ESCORT AGENCY
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls ServiceGENUINE ESCORT AGENCY
 
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...parulsinha
 
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...chandars293
 
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableGENUINE ESCORT AGENCY
 
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service AvailableDipal Arora
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In AhmedabadGENUINE ESCORT AGENCY
 
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Dipal Arora
 

Último (20)

Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Amritsar Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
Call Girls in Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service Avai...
 
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Coimbatore Just Call 8250077686 Top Class Call Girl Service Available
 
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
VIP Hyderabad Call Girls Bahadurpally 7877925207 ₹5000 To 25K With AC Room 💚😋
 
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
Russian Call Girls Lucknow Just Call 👉👉7877925207 Top Class Call Girl Service...
 
Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510Kollam call girls Mallu aunty service 7877702510
Kollam call girls Mallu aunty service 7877702510
 
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on WhatsappMost Beautiful Call Girl in Bangalore Contact on Whatsapp
Most Beautiful Call Girl in Bangalore Contact on Whatsapp
 
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
💚Call Girls In Amritsar 💯Anvi 📲🔝8725944379🔝Amritsar Call Girl No💰Advance Cash...
 
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls * UPA...
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls  * UPA...Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls  * UPA...
Call Girl in Indore 8827247818 {LowPrice} ❤️ (ahana) Indore Call Girls * UPA...
 
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...
Models Call Girls In Hyderabad 9630942363 Hyderabad Call Girl & Hyderabad Esc...
 
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
9630942363 Genuine Call Girls In Ahmedabad Gujarat Call Girls Service
 
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
Premium Call Girls In Jaipur {8445551418} ❤️VVIP SEEMA Call Girl in Jaipur Ra...
 
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...Top Rated  Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
Top Rated Hyderabad Call Girls Erragadda ⟟ 9332606886 ⟟ Call Me For Genuine ...
 
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service AvailableCall Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
Call Girls Ahmedabad Just Call 9630942363 Top Class Call Girl Service Available
 
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Mumbai Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Varanasi Just Call 8250077686 Top Class Call Girl Service Available
 
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service AvailableCall Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service Available
Call Girls Hyderabad Just Call 8250077686 Top Class Call Girl Service Available
 
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
8980367676 Call Girls In Ahmedabad Escort Service Available 24×7 In Ahmedabad
 
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7Call Girls in Gagan Vihar (delhi) call me [🔝  9953056974 🔝] escort service 24X7
Call Girls in Gagan Vihar (delhi) call me [🔝 9953056974 🔝] escort service 24X7
 
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
Top Rated Pune Call Girls (DIPAL) ⟟ 8250077686 ⟟ Call Me For Genuine Sex Serv...
 

Wc 2011-final

  • 1. Application of W-curves and TSP to Clustering HIV1 Sequences Douglas  Cork1,2,3, Steven Lembark4, Nelson Michael1,5, Jerome Kim1,5 US Military HIV Research Program1; Henry M. Jackson Foundation for the Advancement of Military  Medicine2, Rockville, MD.  20850, BCPS Dept., Illinois Institute of Technology3, Chicago, IL. 60616,  Workhorse Computing4, Woodhaven, NY.,  Walter Reed Army Institute of Research5, Rockville, MD. Abstract - The high mutation rate in HIV-1 makes it automated process that can identify and compare drug difficult to treat and analyze. Monitoring the evolution of resistance sites would be an enormous help. The W- drug resistance requires frequent resequencing, but curve's scoring process produces a set of localized comparing and visualizing the progress is difficult. One comparison values that can be used as landmarks that difficulty is simply locating the areas of interest: gaps and can be used to automate the process of locating relevant crossover mutations make it difficult to isolate clinically areas for comparison. Clustering new samples with significant sequences for comparison. Effectively known ones using the Traveling Salesman Problem displaying the results of comparisons grouped according (“TSP”) can automate the grouping of new samples into to multiple regions is also a problem. Our comparison drug-resistance categories. Combining them leads to an algorithm based on the W-curve helps automate the automated process for tracking medication in comparison process, producing results suitable for individuals or analysis of groups. clustering via a modified solution to the Traveling Salesman Problem (“TSP”). Appropriate color-coding of the TSP results allows us to display the results of multiple 3 Background comparisons effectively for single samples or time-series. Analyzing the data in many HIV studies is made The results can be useful for providing guidance in difficult by a combination of HIV's genetics and the treatment, analyzing the membership in anonymous study sources of data. Patients in many HIV clinics are populations, tracking the evolution of drug resistance in anonymous, often because they are illegally engaged in populations, or rates of co-infection within study groups. prostitution, illegal drug use, or homosexual sex. Studies of groups look for new strains and how they Key words:  HIV­1 Genomic Clustering, W­curve,  spread and have to check for changing members of the Traveling Salesman Problem, Drug Resistance sample population; individual patients have to be sequenced frequently to evaluate appropriate drugs. 1 Disclaimer HIV's high mutation and crossover rates combined by discontinuous sequences for the drug responses make The opinions or assertions contained herein are the private the analysis difficult. The high mutation in areas views of the author, and are not to be construed as between the regions of interest confound any analysis official, or as reflecting true views of the Dept. of the based on whole genes or regions: there is simply too Army or the DoD. much white noise on the genome to use large portions of it for comparison. High rates of mutation and co- 2 Introduction infection leave many individuals with multiple strains of HIV, further confounding the analysis, and HIV One step in managing HIV infection response is strains have frequent crossover mutations, making monitoring individuals and study populations for things even worse. evolution of drug resistance. Given in vivo replication Treatment of HIV is still largely a manual process: kinetics with more than 109 new cells infected every day, patients have to be sampled and sequenced frequently each and every possible single-point mutation occurs and doctors have to make informed decisions on how to between 104 and 105 times per day in an HIV infected deal with evolution of the strains infecting them. individuals [5]. This leaves identifying the markers for Presenting sequence comparison results is particularly drug resistance difficult, requiring multiple manual steps important: physicians have to evaluate the similarity of in many cases. Especially for population studies, an a single individual to multiple known drug-resistant
  • 2. strains. sequences will leave the curves in the same quadrant. This keeps our same-quadrant measure small for SNP's. In the US 98% of the infections are type-B, and the single FDA-approved program for treatment using genetics We use a two-pass process for comparing W-curves. treats type-B only. The U.S. Army has to treat soldiers The first pass produces an array of alignment regions who acquire HIV all over the world and only 85% of their with starts, stops, and a difference measure that we call cases are type-B, leaving them without any good options “chunks”. SNP's increase the difference measure, gaps for 15% of their cases. Being able to at least classify the show up as differences in the starting values on non-B cases and evaluate their treatment outcomes successive chunks of the comparison, indels as effectively would be a huge help. successive chunks with no change in the relative offset. The second pass summarizes the chunks into whatever measures are useful, for example by averaging the 4 Methodology differences over the length of a sequence. The advantage of chunking the results first is that 4.1 W-curves similar chunks can be used to locate landmarks for The W-curve was originally designed as the basis for a aligning sequences with one another. Areas with small graphical tool for visualizing very large regions of DNA differences provide an automated means of locating the [3,4]. Over time it has been adapted for numerical offsets between start and stop values in the sequences. comparison as well [6]. By abstracting the genetic Given a library of sequences with known landmarks sequences into a three-dimensional space, W-curves offer such as points of known drug resistance, we can score a a wider range of comparisons rather than comparisons single incoming sequence against all of them. based on searching, aligning and tree-building (assuming a mutation model) with unidimensional strings of characters. W-curves make it easier to find patterns in 4.2 TSP and Clustering sequences or locate common features between genes. The Traveling Salesman Problem (“TSP”) is quite easy Their 3D nature also makes it possible to align smaller to describe but difficult to solve. The problem is to take features than are possible with string based techniques. a list of distances between cities and make a tour of The design of our comparison utilities builds on these them which visits each city once with the least total capabilities to provide a technique for matching the smalldistance. Substitute cost and the algorithm will find the regions of HIV-1's CD4 epitopes along its gp120 gene. “cheapest” route through all of the cities. As it turns out, this problem is NP-complete, requiring analysis of The order of bases along the corners is significant: all routes to guarantee the least distance. Much work number of hydrogen bonds (2 or 3) and chemical structure has gone into developing heuristics for solving this (purine or pyramidine) share quadrants around the square. problem and there are fast algorithms for approximating This means that most synonymous SNP's in the gene the solution. The utility of TSP problems is that an optimal tour will cluster the closest cities together. If the difference measures are for genes, they can similarly be clustered on any region of interest. A number of techniques for determining inter- and intra-clade distances have been developed. One technique developed by Climer and Zhang at Washington University is to add a fixed number of “dummy cities” to the list [7]. Each dummy has a small distance to all other cities (we use 2-20). The non-zero distance leaves these cities in the intra-cluster gaps. We display the resulting tours as color-coded pinwheel diagrams. Appropriate color-coding makes these relatively easy to analyze individually or compare to one another. Figure 1: W­curve generation showing initial   points for curve "CG".
  • 3. Figure 2: Example of tour generated by R's TSP package of the wild, Library of drug­resistant   (“dr”), and random sample strains of HIV­1. The library strains cluster tightly together at the  top (red) with the wild strain (#33) outside the drug­resistant cluster to the right. The approach  shown here works well if individuals are sampled over time: pre­assigning the colors from a first  tour makes it relatively easy to watch if samples from individuals drift into different groups. 4.3 Generating and Analyzing TSP have found that this works better than simply assigning  Clusters with R the colors sequentially from one to the sample size.  The R statistical package includes a TSP library, available  Assigning the colors by position along the tour gives  from CRAN. We have used it here to generate an  some additional visual feedback from the similarity of  approximate solution for clustering the genes. In our case  colors. This can be seen in Figure 2 where the closely  an optimal tour is not required: any good approximation  related “library” sequences are all red. We have found  will cluster the genes properly. Our approach starts with a  this color scheme also works well for comparing tours  square distance matrix having zeros on the diagonals. The  generated from different sections of the same genes:  TSP package does not require a symmetric matrix but in  even if the nodes appear in different orders on different  our case we use one. graphs the similar colors grouping together help  identify the clusters. The colors shown in Figure 2 are assigned by generating a  list of 1024 colors and rotating the tour so that it starts  A slightly different color scheme can be used to observe  with the first library sequence. After that the colors are  the evolution of drug resistant strains in a population. In  assigned by taking the fractional tour length times 1024.  that case coloring the time scale allows us to watch  For example, if the total tour length were 20 and one of  which direction the group is evolving. With the library  the sequences fell at a cumulative distance of 9 then it  sequences colored red and the population samples green  would be assigned a color of 9 / 20 * 1024, or 460. We  through blue over time, the migration of blue dots 
  • 4. Figure 3: Example W­curves POL gene from wild (top) and example drug resistant HIV­1 strains   [6]. Local features of the W­curve geometry can provide landmarks for isolating smaller   regions of the sequences for comparison [8]. towards library samples is easy to pick out. The library  sequences with known clinical results. With the library  samples can be drug resistant, or susceptible to different  samples in one color and the individual's sequence of  drugs. Either way, the progression across clusters is  samples colored over time we can see how the various  relatively easy to view. samples migrate between clusters. This provides a nice  way to integrate treatment information about an  This approach also works well for integrating samples of  individual that may rely on samples of unrelated genes. an individual over time. We can display results of  comparing various regions of an individual to a library of 
  • 5. 4.4 Combining the TSP and W-Curve The TSP approach shown here will work for any  comparison mechanism: ClustalW, Fasta, or Blast provide  a suitable square comparison matrix and generate the  7 References graphic results from R. Our use of the W­curve has an  [1] Leitner T, Korber B, Daniels M, Calef C, Foley advantage due to chunks: we can automate scoring  B. (2005) HIV-1 Subtype and Circulating discontinuous regions that may have differing alignments.  Recombinant Form (CRF) Reference This matters with HIV­1 where the high rate of SNP's  Sequences., http://hiv.lanl.gov leaves too much white noise in larger areas and the high  [2] Brown BK, Sanders-Buell E, Rosa Borges A, rate of gaps makes locating the often small, discontinuous  Robb ML, Birx DL, et al. (2008) Crossclade areas causing drug resistance difficult. The curve's  neutralization patterns among HIV1 strains from geometry also provides us a more feature­rich  the six major clades of the pandemic evaluated environment in which to compare the curves. Figure 3  and compared in two different models. Virology shows the wild (HXB2 standard) and five drug resistant  375: 529–538. sequences [citation: website with the sequences we used].  [3] Wu D, Roberge J, Cork DJ, Nguyen BG, Grace Differences in geometry are visible, even when viewed at  T (1993) Computer visualization of long different scales [citation: last year's publication]. The  genomic sequences, in Visualization 1993, IEEE geometric representation also offers us more options for  Press, New York City, New York, CP 33:308– approximate matching using discrete spatial mathematics  315. than string comparison techniques allow. [4] Cork D, Lembark S, Tovanabutra S, Sanders- Buell E, Brown B, Robb M, Wieczorek L, Polonis V, Michael N, Kim J. 5 Conclusions Application of W-curves and TSP to Clustering The W-curve provides us with a way of using landmarks HIV-1 Sequences by Epitope. 847-853 to identify regions of interest and score only the relevant http://www.informatik.uni- portions of sample sequences. The TSP with Climer & trier.de/~ley/db/conf/biocomp/biocomp2010.html Zhang's boundary techniques offers a fast, effective way [5]. Coffin JM, (1995) HIV population dynamics in to cluster genes. The R statistical package provides us vivo: implications for genetic variation, with the tools to analyze and color-code the results for pathogenesis, and therapy. Science 267, 483- analysis. Taken together this provides us with a useful 489. tool for comparing the status and evolution drug response [6] Stanford HIV RT and Protease Sequence for HIV-1. Database http://hivdb.stanford.edu/pages/genotype-rx.html 6 Acknowledgments [7] http://mypages.iit.edu/˜cork/houstep/introduction. Military HIV-1 Research Project (MHRP)/Henry Jackson [8] http://www.bioinformatics.org/wcurve/ Foundation (HJF) and Workhorse Computing. This work [9] Cork DJ, Lembark S, Tovanabutra S, Robb ML, was funded by MHRP (Military HIV-1 Research Project) Kim JH (2010) W-curve Alignments for HIV1 cooperative agreement (W81XWH-07-2-0067-P00001) Genomic Comparisons. PLoS ONE 5(6): between the Henry M. Jackson Foundation for the e10829. doi:10.1371/journal.plone.0010829 Advancement of Military Medicine, Inc., and the U.S. [10] Climer S, Zhang W (2004) Take a walk and Department of Defense (DOD). The cohort work was cluster genes: a TSPbased approach to optimal supported through NICHD by R01 HD34343-03 and rearrangement clustering. ACM International partly by a cooperative agreement between the Henry M. Conference Proceeding Series; Vol. 69, p. 22. Jackson Foundation for the Advancement of Military http://portal.acm.org/citation.cfm?id=1015419 Medicine and the U.S. Dept. of Defense. The funders had [11] Zhou T, Xu L, Dey B, Hessell AJ, Van Ryk D, no role in study design, data collection and analysis, Xiang SH, Yang X, Zhang MY, Zwick MB, decision to publish, or preparation of the manuscript. Arthos J, Burton DR, Dimitrov DS, Sodroski J, The authors wish to thank Pranamee Sarma and Scott Wyatt R, Nabel GJ, Kwong PD (2007) Zintek for their assistance preparing the W-curve graphics Structural definition of a conserved used in this paper. neutralization epitope on HIV-1 gp120. Feb 15;445(7129):732-7.