These are the first lecture slides of the BITS bioinformatics training session on the UCSC Genome Browser.
See http://www.bits.vib.be/index.php?option=com_content&view=article&id=17203990:orange-genome-browsers-ucsc-training&catid=81:training-pages&Itemid=190
2. Introduction
§
Browse genes in their genomic context
§
See features in and around a specific gene
§
Investigate genome organization and explore larger
chromosome regions
§
Search and retrieve information on a gene- and
genome-scale
§
Compare genomes
3. Introduction
§
Collaboration between main genome browsers
Ensembl, UCSC and NCBI
» use same genome assemblies
» interlinking between sites
§
Ensembl Genome Browser: http://www.ensembl.org/
§
NCBI Map Viewer: http://www.ncbi.nlm.nih.gov/mapview/
§
UCSC Genome Browser: http://genome.ucsc.edu/
19. § The UCSC Genome Browser was created by the
Genome Bioinformatics Group
at the University of California Santa Cruz (UCSC).
http://genome.ucsc.edu/
20. §
The Genome Browser zooms and scrolls
over chromosomes, showing the work of
annotators worldwide.
21. § BLAT quickly maps your sequence to the genome.
BLAT is not BLAST!
BLAT works by keeping an index of the entire genome in memory.
The index consists of all non-overlapping DNA 11-mers or protein 4-mers.
The index is used to find areas of probable homology, which are then
loaded into memory for a detailed alignment.
BLAT on DNA can quickly find sequences of 95% and greater similarity
of length 40 bases or more.
BLAT on proteins finds sequences of 80% and greater similarity of length
20 amino acids or more.
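The indexing idea above can be sketched in a few lines of Python. This is a minimal illustration, not the BLAT implementation: the function names build_index and candidate_hits are hypothetical, the genome is a plain string, and the detailed alignment step that follows a hit is left out.

```python
from collections import defaultdict


def build_index(genome, k=11):
    """Map each non-overlapping k-mer of the genome to its start positions."""
    index = defaultdict(list)
    # Step through the genome in jumps of k so the indexed k-mers do not overlap.
    for start in range(0, len(genome) - k + 1, k):
        index[genome[start:start + k]].append(start)
    return index


def candidate_hits(index, query, k=11):
    """Return (query_offset, genome_position) pairs for query k-mers found in the index."""
    hits = []
    # Every overlapping k-mer of the query is looked up, so a match is found
    # regardless of how the query is phased relative to the indexed k-mers.
    for offset in range(len(query) - k + 1):
        for position in index.get(query[offset:offset + k], []):
            hits.append((offset, position))
    return hits
```

Each hit marks an area of probable homology: subtracting the query offset from the genome position gives the approximate genome coordinate where the query would start, and that region is what a full aligner would then examine in detail.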
22. §
The Table Browser provides convenient
access to the underlying database.
23. § The Gene Sorter displays a sorted table of genes
that are related to one another.
The relationship can be one of several types, including protein-
level homology,
similarity of gene expression profiles,
or genomic proximity.
24. § In-Silico PCR searches a sequence database with a pair of PCR
primers, using an indexing strategy for fast performance.
§ When successful, the search returns a file (fasta) containing all
sequences in the database that lie between and include the
primer pair.
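As a rough sketch of what the search returns, the Python below finds every stretch of a template that starts with the forward primer and ends with the reverse complement of the reverse primer. It is a simplification under stated assumptions: exact matches only, plus strand only, a plain string scan instead of an index, and hypothetical names (in_silico_pcr, max_size). It is not the UCSC tool's algorithm.

```python
def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    complement = {"A": "T", "C": "G", "G": "C", "T": "A"}
    return "".join(complement[base] for base in reversed(seq))


def in_silico_pcr(template, forward, reverse, max_size=4000):
    """Return all products that lie between and include the primer pair.

    max_size is a hypothetical cap on product length, used here only to
    keep distant primer sites from being paired.
    """
    products = []
    # The reverse primer anneals to the opposite strand, so on the plus
    # strand its binding site reads as its reverse complement.
    reverse_site = reverse_complement(reverse)
    start = template.find(forward)
    while start != -1:
        end = template.find(reverse_site, start + len(forward))
        while end != -1:
            stop = end + len(reverse_site)
            if stop - start <= max_size:
                products.append(template[start:stop])
            end = template.find(reverse_site, end + 1)
        start = template.find(forward, start + 1)
    return products
```

Each returned string corresponds to one record of the fasta output described above.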
25. § Genome Graphs is a tool for displaying
genome-wide data sets such as the results
of genome-wide SNP association studies,
linkage studies and homozygosity mapping.
26. § Galaxy allows you to do analyses you cannot do
anywhere else without the need to install or
download anything.
§ You can analyze multiple alignments, compare
genomic annotations and much more...
27. § VisiGene lets you browse through a large
collection of in situ mouse and frog images.
28. § The Proteome Browser provides a wealth of
protein information presented in the form of
graphical images of tracks and histograms
and links to other sites.
29. § The Utilities page contains links to some tools
created by the UCSC Genome Bioinformatics Group.
§ DNA Duster & Protein Duster remove non-sequence
related characters from an input sequence.
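The effect of DNA Duster can be approximated with a one-line filter. This is only a sketch: the function name dust_dna is hypothetical, and the set of letters kept (A, C, G, T, N in either case) is an assumption, not the tool's documented behaviour.

```python
import re


def dust_dna(text):
    """Drop every character that is not an A, C, G, T or N (either case)."""
    # Digits, whitespace, line breaks and punctuation are all removed.
    return re.sub(r"[^ACGTNacgtn]", "", text)
```

For example, a sequence pasted from a numbered listing such as "1 acgt ACGT" followed by "61 nnac-gt" on the next line comes back as one uninterrupted string of bases.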
46. Navigation: position control
§
Click the zoom in and zoom out buttons on top
to zoom in or out 1.5, 3 or 10-fold
on the center of the window
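The arithmetic behind the zoom buttons can be illustrated as follows. This is a sketch, not the browser's code: the function name zoom is hypothetical, coordinates are plain integers, and clamping the start at 0 is an assumption.

```python
def zoom(start, end, factor, zoom_in=True):
    """Return the window after zooming by `factor` around the window center."""
    center = (start + end) / 2
    width = end - start
    # Zooming in divides the displayed width; zooming out multiplies it.
    new_width = width / factor if zoom_in else width * factor
    new_start = max(0, round(center - new_width / 2))
    new_end = round(center + new_width / 2)
    return new_start, new_end
```

Zooming in 3-fold on a 3,000-base window from 1,000 to 4,000 keeps the center at 2,500 and leaves a 1,000-base window from 2,000 to 3,000.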
47. Navigation: position control
§
Zoom in 3-fold by clicking anywhere
on the base position track
§
Zoom to a specific region using “drag and zoom”
48. Navigation: position control
§
To scroll the view of the display horizontally
by set increments of 10%, 50% or 95%
of the displayed size (as given in base pairs)
click the corresponding move arrow
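Scrolling by a fraction of the displayed size amounts to shifting both ends of the window by the same number of bases. The sketch below assumes a hypothetical function name (scroll) and assumes that a left scroll stops at position 0; it does not model the end of the chromosome.

```python
def scroll(start, end, fraction, direction="right"):
    """Shift the window by `fraction` (0.10, 0.50 or 0.95) of its displayed size."""
    shift = round((end - start) * fraction)
    if direction == "left":
        # Do not scroll past the start of the chromosome.
        shift = -min(shift, start)
    return start + shift, end + shift
```

The window size stays the same: a 1,000-base window from 1,000 to 2,000 scrolled right by 10% becomes 1,100 to 2,100.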
49. Navigation: position control
§
To scroll the left or right side by a specified number of
vertical gridlines while keeping the opposite side fixed
click the appropriate move start or move end
arrow
50. Navigation: position control
§
To display a (completely) different position
enter the new location in the position/search text
box
§
You can also jump to another gene location
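A location typed into the position/search box commonly takes the form chromosome:start-end, with optional commas in the numbers. The sketch below parses that one form; the function name parse_position and the regular expression are illustrative assumptions, and other inputs the box accepts (such as gene names) are not handled.

```python
import re

# chromosome name, then start and end coordinates that may contain commas
POSITION = re.compile(r"^(chr\w+):([\d,]+)-([\d,]+)$")


def parse_position(text):
    """Split a 'chrN:start-end' string into (chromosome, start, end)."""
    match = POSITION.match(text.strip())
    if match is None:
        raise ValueError(f"not a chrN:start-end position: {text!r}")
    chrom, start, end = match.groups()
    return chrom, int(start.replace(",", "")), int(end.replace(",", ""))
```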
52. HIDE = removes a track from view
FULL = each item on a separate line
53. DENSE = all items collapsed into a single line
SQUISH = all items packed on several lines, at 50% height
PACK = each item separate and efficiently stacked (full height)
78. Browser graphics in PDF
[Screenshot callouts: Table Browser; Get DNA; Current browser graphic in PDF; To get other data; Click line]
79. [Screenshot callouts: Current browser graphic in PDF; To get other data]
82. Exercises (I)
1) Search for your gene of interest
on Human Feb. 2009 (GRCh37/hg19) Assembly
» Include 1000 base pairs up- and downstream
» Only show the tracks:
RefSeq Genes (pack)
Conservation (full, primates only)
» Save graphical view as PDF (exercises1_1)
83. Exercises (I)
2) How many transcripts are there?
» Compare UCSC Genes with RefSeq and Ensembl genes!
» Save graphical view as PDF (exercises1_2)
84. Exercises (I)
3) What are the flanking genes?
Are these conserved outside mammals?
» Zoom out until you can see at least
two or three flanking genes
(may need to hide some tracks, leave RefSeq on)
» Now have a look in the chicken genome
» Save graphical view as PDF
(exercises1_3a and exercises1_3b)
85. Exercises (I)
4) Is there any regulatory information available?
» Change the view to see the genomic region upstream
(exon 1 and ~2000 bp upstream) and open some regulatory tracks
e.g. ORegAnno, TFBS Conserved, TS miRNA sites
» Save graphical view as PDF (exercises1_4)