PopGenBase introduction meeting at the American Society of Tropical Medicine and Hygiene, aimed at vector biologists, ecologists and epidemiologists. PopGenBase is designed to hold genetic variation data for natural populations of mosquitoes.
2. VectorBase - PopGenBase Group Meeting
About Ag PopGenBase
There are presently no open databases that combine population, ecological and genomics information for the
vector community. Besides enabling new understanding, a population biology resource would help to preserve
data that has already been acquired, facilitate collaboration among laboratories and permit better, more integrated
studies of vector problems.
In October 2008 VectorBase initiated a plan for the development of a web-based data repository for information
about the population genomics of invertebrate vectors of human disease. This endeavor is being pursued under
the direction of Dr. Gregory C. Lanzaro, University of California-Davis and Dr. Charles E. Taylor, University
of California-Los Angeles. Development of this new feature of VectorBase will begin with Ag PopGenBase, a
population genomics database for Anopheles gambiae, with plans to develop similar resources for all of the taxa
currently represented in VectorBase.
Whereas current research aimed at describing the genetics of natural vector populations involves determining
genotypes at tens of loci, it is clear that in the near future such studies will involve tens or even hundreds of
thousands of loci. This dramatic change will be achieved by incorporating genomic information into the
design of population-level studies. Maintaining and analyzing such large volumes of data is challenging.
Furthermore, integrating information across individual studies will be daunting.
We foresee the following outcomes resulting from this project:
1. Encouraging the vector population genetics community to store raw data publicly and make it accessible
to other investigators, much in the manner of the public data resources of NCBI and EMBL-EBI.
2. Bringing much-needed standardization of genetic markers and inter-study compatibility to the field of
vector population genetics, by providing such a resource and encouraging its use.
3. Directly linking markers used in population studies with emerging knowledge of the genome. This will
generate far more powerful population data sets, because genetic loci can be selected using the best
information available on the function of those loci and on genome-level constraints on their variation.
4. Encouraging studies in other areas of vector biology, such as ecology, behavior and epidemiology, to
incorporate genetic information in order to identify genes affecting complex phenotypes.
We have recently completed Stage 1 of Ag PopGenBase (Ag PGB). This first stage includes data for 5,870
individual An. gambiae s.l. collected from 54 sites in Mali and for 3,245 individuals from 51 sites in Cameroon. The
data include species identification (An. gambiae and An. arabiensis), molecular form (M vs. S), chromosomal
form (Forest, Bamako, Mopti and Savanna) and karyotype (for paracentric inversions on the right arm of
chromosome 2 and the 2La inversion on the left arm of chromosome 2). These data can be visualized on maps
through the Google Maps API.
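As an illustration of the kind of per-individual record described above, the following minimal Python sketch groups the listed attributes into one structure. The class name, field names and the 0/1/2 karyotype encoding are assumptions made for this example and do not describe the actual Ag PGB schema; the sample values are made up.

```python
from dataclasses import dataclass
from typing import Dict, Optional

# Hypothetical vocabularies taken from the categories named in the text.
MOLECULAR_FORMS = {"M", "S"}
CHROMOSOMAL_FORMS = {"Forest", "Bamako", "Mopti", "Savanna"}


@dataclass(frozen=True)
class Specimen:
    """One collected mosquito (illustrative layout, not the Ag PGB schema)."""
    specimen_id: str
    country: str
    site: str
    species: str                      # e.g. "An. gambiae" or "An. arabiensis"
    molecular_form: Optional[str]     # "M" or "S"; None when not typed
    chromosomal_form: Optional[str]   # one of CHROMOSOMAL_FORMS; None when not typed
    # Assumed encoding: inversion name -> copies of the inverted arrangement (0, 1 or 2).
    karyotype: Dict[str, int]

    def __post_init__(self):
        # Reject values outside the vocabularies so that records stay comparable.
        if self.molecular_form is not None and self.molecular_form not in MOLECULAR_FORMS:
            raise ValueError(f"unknown molecular form: {self.molecular_form!r}")
        if self.chromosomal_form is not None and self.chromosomal_form not in CHROMOSOMAL_FORMS:
            raise ValueError(f"unknown chromosomal form: {self.chromosomal_form!r}")
        for inversion, copies in self.karyotype.items():
            if copies not in (0, 1, 2):
                raise ValueError(f"{inversion}: expected 0, 1 or 2 copies, got {copies!r}")


# Made-up example record.
example = Specimen(
    specimen_id="example-0001",
    country="Mali",
    site="example-site",
    species="An. gambiae",
    molecular_form="M",
    chromosomal_form="Mopti",
    karyotype={"2La": 2, "2Rb": 1},
)
```

Validating the controlled vocabularies when a record is created is one simple way to keep entries from different collections comparable.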
In Stage 2 of Ag PGB, members of the community may request database service to set up password-protected
files, allowing each researcher to use a pilot Ag PGB to manage and analyze data from their
own ongoing projects. Feedback will be solicited so that the particular interests and needs of the community can be
considered for inclusion in future versions of Ag PGB, and to assist us in expanding the population database to
other taxa.
1 Current PopGenBase
Access
Although we are still working on establishing an Ag PopGenBase link to vectorbase.org, the database can now be accessed at the UCD site: https://grassi2.ucdavis.edu.
Genetic Data
Ag PGB provides Anopheles gambiae population genetic data from Mali and Cameroon. Molecular forms, karyotypes (chromosome inversion genotypes for 2La and 2Rj, b, c, d, and u), and chromosome images for individual mosquitoes are available from this site. These data originated from the labs of Gregory Lanzaro and Charles Taylor. We will begin soliciting data from other labs soon.
GoogleMaps API
The geographic distribution of molecular forms and chromosomal forms can be viewed using the GoogleMaps interface. Population genetic data are retrieved directly from the database, which makes this view ideal for checking the progress of projects in real time.
GoogleEarth
We provide not only pie charts of various genetic markers for each collection site, pre-generated from the database, but also an IGBP land cover layer, so that associations between genetic markers and environmental features can be visualized.
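To give a rough idea of how per-site marker summaries could be prepared for a GoogleEarth view, the sketch below counts molecular forms per collection site and writes one KML placemark per site. It is only an illustration under stated assumptions: the function names, the input tuple layout and the sample coordinates are hypothetical, and the frequencies are written as text in the placemark description rather than drawn as pie charts.

```python
from collections import Counter, defaultdict
from xml.sax.saxutils import escape


def form_frequencies(records):
    """Summarize records given as (site, latitude, longitude, molecular_form) tuples."""
    counts = defaultdict(Counter)
    coords = {}
    for site, lat, lon, form in records:
        counts[site][form] += 1
        coords[site] = (lat, lon)
    summary = {}
    for site, counter in counts.items():
        total = sum(counter.values())
        summary[site] = {
            "coords": coords[site],
            "n": total,
            "freq": {form: n / total for form, n in counter.items()},
        }
    return summary


def to_kml(summary):
    """Return a KML document with one placemark per collection site."""
    placemarks = []
    for site, info in sorted(summary.items()):
        lat, lon = info["coords"]
        desc = ", ".join(
            f"{form}: {freq:.2f}" for form, freq in sorted(info["freq"].items())
        )
        placemarks.append(
            "  <Placemark>\n"
            f"    <name>{escape(site)}</name>\n"
            f"    <description>{escape(desc)} (n={info['n']})</description>\n"
            # KML expects longitude first, then latitude, then altitude.
            f"    <Point><coordinates>{lon},{lat},0</coordinates></Point>\n"
            "  </Placemark>"
        )
    return (
        '<?xml version="1.0" encoding="UTF-8"?>\n'
        '<kml xmlns="http://www.opengis.net/kml/2.2">\n<Document>\n'
        + "\n".join(placemarks)
        + "\n</Document>\n</kml>\n"
    )


# Made-up sample records; site names and coordinates are placeholders.
sample = [
    ("site-A", 12.0, -8.0, "M"),
    ("site-A", 12.0, -8.0, "M"),
    ("site-A", 12.0, -8.0, "S"),
    ("site-B", 4.0, 11.0, "S"),
]
kml_text = to_kml(form_frequencies(sample))
```

The resulting text can be saved with a .kml extension and opened in GoogleEarth. Drawing actual pie-chart icons would need an additional image-generation step that is not shown here.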
Query page: selection panels (top) and output in various formats (bottom). Chromosome image links
are provided in the HTML output format for karyotype verification.
GoogleMaps API (top) and GoogleEarth (right) screenshots. The database tool makes visualization of population data in a geographic context easy. GoogleMaps API and GoogleEarth are becoming more powerful for mapping various types of data.
2 Pilot Workbench Project
Goal
We would like to recruit beta users for our pilot workbench project. Research groups may request database service for their own ongoing projects. These data will be password protected and accessible only to those permitted by the principal investigator of the project. Feedback will be solicited so that the particular interests and needs of the community can be considered for inclusion in the next stage of PopGenBase.
Contributing data
If you have population genetic data that you would like to make available to the vector population biology community, please consider providing it to us for PopGenBase. Any data dealing with genetic polymorphism in natural vector populations may be appropriate for PopGenBase.
Contact
Program
Gregory Lanzaro (PI): gclanzaro@ucdavis.edu
Charles Taylor (PI): taylor@biology.ucla.edu
Technical Inquiry
Yoosook Lee: yoslee@ucdavis.edu
Workplan
Although the current PopGenBase is designed specifically for An. gambiae s.s., we would like to provide assistance to those organizing population data for other vector species. Based on sample data provided by researchers, PopGenBase programmers will adjust default settings or write new tools to handle new data.
Our goal is to create a database that researchers can populate and update themselves. It is, however, likely that data from different labs will have different formats and organization schemes. Also, some laboratories may not have convenient access to personnel who can write scripts to convert data into an appropriate format. Ag PopGenBase curators will be available to help you populate your database during the next six-month development cycle. We intend to develop a robust data submission script by the end of this project (Jun. 29th, 2009). PopGenBase workbench beta users will be expected to provide feedback to assist us in developing a data management tool with the broadest utility.
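The format problem described above can be pictured with a small conversion sketch in Python. It rewrites one lab's delimited export so that it uses a shared set of column names. This is not the PopGenBase submission script: the target columns, the function name and the sample file content are all assumptions chosen for illustration.

```python
import csv
import io

# Hypothetical target columns; the real submission format is not described here.
TARGET_COLUMNS = ["specimen_id", "site", "species", "molecular_form", "karyotype_2La"]


def convert(source_text, column_map, delimiter=","):
    """Rewrite a lab's delimited export using the shared column names.

    column_map maps a target column name to the header used in the lab's file.
    Target columns without a mapping are left empty.
    """
    reader = csv.DictReader(io.StringIO(source_text), delimiter=delimiter)
    available = reader.fieldnames or []
    missing = [src for src in column_map.values() if src not in available]
    if missing:
        raise KeyError(f"columns not found in source file: {missing}")

    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=TARGET_COLUMNS, lineterminator="\n")
    writer.writeheader()
    for row in reader:
        writer.writerow({
            target: (row[column_map[target]] or "").strip() if target in column_map else ""
            for target in TARGET_COLUMNS
        })
    return out.getvalue()


# Made-up tab-delimited export from an imaginary lab.
lab_export = "ID\tVillage\tSp\tForm\nexample-0001\tsite-1\tAn. gambiae\tM\n"
lab_columns = {
    "specimen_id": "ID",
    "site": "Village",
    "species": "Sp",
    "molecular_form": "Form",
}
converted = convert(lab_export, lab_columns, delimiter="\t")
```

Keeping the per-lab differences in a small mapping, rather than in the code itself, means that a new lab's format would only need a new mapping and not a new script.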
Data query, retrieval and visualization tools will be provided by PopGenBase. We would like to hear from you about any other computational needs that you would like to see addressed in PopGenBase. This project was launched for the community of vector biologists, ecologists and epidemiologists. Through this interaction, we would like to provide relevant services to these underserved communities.