Scientific research is inherently a collaborative task; in our case it is a dialog among different researchers to reach a shared understanding of the underlying biology. To facilitate this dialog we have developed two web-based annotation tools: Apollo (http://genomearchitect.org/), a genomic feature editor designed to support structural annotation of gene models, and Noctua (http://noctua.berkeleybop.org/), a biological-process model builder designed for describing the functional roles of gene products. Here we outline an inventory of essential requirements that, in our experience, enable an annotation tool to meet the needs of both professional biocurators and other members of the research community. These are general requirements, beyond specific functional requirements, that any annotation tool must satisfy.
Essential Requirements for Community Annotation Tools
Scientific research is inherently a collaborative task. Berkeley
Bioinformatics Open-Source Projects (BBOP) develops software to support
the work of biocurators. Our tools foster dialog among researchers to
reach a shared understanding of the underlying biology.
INHERENTLY
COLLABORATIVE
Annotation tools help improve curation
quality, and in many research groups
they are the means of introducing
human curation to the annotation
process for the first time.
Apollo supports curation of gene models for multiple organisms in
one server and generates analysis-ready data as well as progress
reports. We make continuous improvements to support the needs
of the growing research community that forms our user base.
Apollo is a collaborative genome annotation editor.
It is instantaneous, web-based, and built on top of
JBrowse.
[Figure: Apollo interface. Labeled regions: 1. User-created annotations; 2. Evidence tracks (experimental data, alignments); 3. Annotator panel.]
LATEST IMPROVEMENTS
ANNOTATION ACROSS SCAFFOLDS
[Figure: annotation across scaffolds, with gene models displayed across Scaffold 17 and Scaffold 223.]
Noctua directly models information as a graph, escaping many of the
pitfalls of more tabular modeling. A rich, interactive and collaborative
interface allows users to assemble graphs to represent biological
knowledge, including aspects such as references and evidence.
Noctua is a modern web-based application and
stack for modeling complex biological processes.
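To make the graph idea concrete, the following is a minimal sketch of an annotation graph in which every edge must carry evidence and a reference. It is not Noctua's data model or API: the class names (`Evidence`, `Edge`, `Model`), the relation names, and the evidence and reference strings are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Evidence:
    """Support for one edge: an evidence type plus the reference it comes from."""
    evidence_type: str  # placeholder label, not a real evidence code
    reference: str      # placeholder citation, not a real identifier


@dataclass
class Edge:
    subject: str
    relation: str
    target: str
    evidence: list = field(default_factory=list)


@dataclass
class Model:
    """A tiny annotation graph: named nodes joined by evidence-carrying edges."""
    title: str
    nodes: dict = field(default_factory=dict)  # node id -> human-readable label
    edges: list = field(default_factory=list)

    def add_node(self, node_id, label):
        self.nodes[node_id] = label

    def add_edge(self, subject, relation, target, evidence):
        # Refuse edges that point at unknown nodes or that carry no evidence.
        if subject not in self.nodes or target not in self.nodes:
            raise KeyError("both ends of an edge must be existing nodes")
        if not evidence:
            raise ValueError("every edge needs at least one piece of evidence")
        edge = Edge(subject, relation, target, list(evidence))
        self.edges.append(edge)
        return edge

    def references(self):
        # Collect every distinct reference cited anywhere in the model.
        return sorted({ev.reference for e in self.edges for ev in e.evidence})


# Hypothetical model loosely following the poster's example topic.
model = Model("Arhgap42 (illustrative)")
model.add_node("activity", "GTPase activator activity")
model.add_node("gene_product", "Mouse Arhgap42")
model.add_node("process", "blood pressure (placeholder label)")
support = Evidence("experimental (placeholder)", "REF:placeholder-1")
model.add_edge("activity", "enabled_by", "gene_product", [support])
model.add_edge("activity", "part_of", "process", [support])
```

Unlike a row in a table, each statement here is an edge that can be traversed, and the evidence travels with the edge rather than living in a separate column.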
Immediate communication between curators through parallel chat
mechanisms.
Real-time updates to allow geographically dispersed curators to
conduct joint, simultaneous efforts.
Well-supported history mechanisms providing the ability to comment
on versions, browse versions to see different edits and commentary,
and revert to earlier versions.
Rigorously documenting the experimental or inferential basis for all of
the annotations that are made, with credit assigned through citations.
Offering incentives for adoption, such as facilitating the publication
process.
Providing different levels of permissions for users and administrators,
for example so that a curator might “doodle” within their own work
area before releasing their version for feedback from others.
Functional stability and ease of migrating forward when new software
is released.
Prompt responsiveness to users’ requests and informative documentation,
and dedicated resources for training and user support, from online
seminars to video tutorials to repositories with teaching materials.
And, most importantly, a publishing mechanism, such that biocurators
and other contributors receive credit for their insights and
contributions to our collective understanding of biology.
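The history requirement listed above (commenting on versions, browsing them, and reverting) can be sketched as an append-only log. This is an illustrative design under stated assumptions, not how Apollo or Noctua store history: the `Version` and `History` classes, the curator names, and the example annotation states are hypothetical.

```python
import copy
from dataclasses import dataclass


@dataclass
class Version:
    number: int
    author: str
    comment: str
    state: dict


class History:
    """Append-only history: a revert adds a new version, so nothing is lost."""

    def __init__(self, initial_state, author):
        first = Version(1, author, "initial version", copy.deepcopy(initial_state))
        self._versions = [first]

    @property
    def current(self):
        # Return a copy so callers cannot alter stored history by accident.
        return copy.deepcopy(self._versions[-1].state)

    def commit(self, state, author, comment):
        number = len(self._versions) + 1
        self._versions.append(Version(number, author, comment, copy.deepcopy(state)))
        return number

    def log(self):
        # Browse versions with their authors and commentary.
        return [(v.number, v.author, v.comment) for v in self._versions]

    def revert(self, number, author):
        if not 1 <= number <= len(self._versions):
            raise IndexError(f"no such version: {number}")
        old = self._versions[number - 1]
        return self.commit(old.state, author, f"revert to version {number}")


# Hypothetical usage with two curators editing one gene model.
history = History({"gene_model": "exons 1-3"}, author="curator_a")
history.commit({"gene_model": "exons 1-4"}, "curator_b", "added exon 4")
history.revert(1, "curator_a")
```

Recording the revert as a new version keeps the full record of who changed what and why, which also supports assigning credit to contributors.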
Monica Munoz-Torres1, Chris Mungall1, Nathan Dunn1, Seth Carbon1,
Heiko Dietze1, Nicole Washington1, Jeremy Nguyen1,
Paul Thomas2, Suzanna Lewis1.
1 Environmental Genomics and Systems Biology Division,
Lawrence Berkeley National Laboratory, Berkeley, CA
2 Keck School of Medicine, Department of Preventive Medicine,
University of Southern California, Los Angeles, CA
Apollo is funded by NIH grants 5R01GM080203 from NIGMS, and 5R01HG004483 from
NHGRI. Also supported by the Director, Office of Science, Office of Basic Energy Sciences,
of the U.S. Department of Energy under Contract No. DE-AC02-05CH11231.
Noctua also presents a
complete set of tooling
for data extraction and
integration.
Noctua model for GTPase activator activity as described in published literature on blood pressure for mouse Arhgap42.
AmiGO's "CytoView" of GO model for
mouse Arhgap42 (blood pressure).
Annotations included in the GO model for mouse Arhgap42 (blood
pressure). These are shown in a tabulated view of the LEGO graph using
AmiGO. LEGO is a graph-based abstraction of biological knowledge,
versatile beyond a GAF file.
This is the same model represented in the Web Ontology Language
(OWL) using Noctua, as seen in its command line interface.