Wide Search Molecular Replacement and the NEBioGrid portal interface

Wide-Search Molecular Replacement
Ian Stokes-Rees
http://portal.nebiogrid.org/

When WS-MR is suitable

• You’ve got good data (<4 Å)
• You’ve tried MR with lots of good candidates
  • a priori knowledge
  • sequence similarity (PSI-BLAST search)
• Or
  • protein not sequenced
  • no a priori knowledge of expected fold
• You haven’t found any good models to use for phasing
• Time to try a brute-force search: WS-MR

When WS-MR is not suitable

• Complexes containing significant DNA or RNA
  • at least right now, these will probably not work
• You haven’t tried MR and just want a “quick fix”
• Very large or very small structures
  • both are computationally difficult
• Low resolution (> 4.5 Å)
  • experience so far suggests these aren’t going to be helped much

Requirements

• Reflection data in MTZ file format
  • Must have amplitude columns (e.g. FP, SIGFP)
  • Doesn’t work with intensities (I, SIGI)
• Time
  • To analyze results
  • To take next steps
• Managed expectations
  • We identify good MR candidates in about 1 in 4 cases
  • We don’t produce a fully phased structure, only a list of good MR candidates and their best placements as returned by Phaser
• Experience with Phaser to interpret results and re-run candidate models

Background

• Utilizes Phaser for MR
• Utilizes Open Science Grid for computing
• References
  • Stokes-Rees, Sliz, Protein structure determination by exhaustive search of Protein Data Bank derived databases, Proc. Nat'l Academy of Sciences, doi:10.1073/pnas.1012095107
  • Stokes-Rees, Sliz, Compute and data management strategies for grid deployment of high throughput protein structure studies, IEEE Workshop on Many Task Computing on Grids and Supercomputers 2010 (MTAGS10), Seattle, November 2010
  • Phaser: McCoy, Grosse-Kunstleve, Adams, Winn, Storoni, Read; J. Appl. Cryst. (2007). 40, 658-674
  • Murzin A. G., Brenner S. E., Hubbard T., Chothia C. (1995). SCOP: a structural classification of proteins database for the investigation of sequences and structures. J. Mol. Biol. 247, 536-540.
• Requires 20,000-50,000 hours of computing
• Produces 300,000 files
• Attempts 100,000 single-domain MR trials using all SCOP domains

Step 1: Register to use Portal
https://portal.nebiogrid.org/d/accounts/create

Step 2: Submit Computational Task
https://portal.nebiogrid.org/d/apps/wsmr/create

Side Note: MTZ columns

• Use CCP4 tool “mtzdmp” to check column names and resolution if you’re not sure

$ mtzdmp GAS.mtz | less
...
* Column Labels :
H K L FP SIGFP FreeRflag
...
* Resolution Range :
0.00050 0.25197 ( 44.699 - 1.992 A )
...
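
The check above can also be scripted. The following is a minimal sketch, assuming the mtzdmp output has been captured as text and looks like the excerpt on this slide. The function name `summarize_mtzdmp`, its parameters, and the 4.0 Å cut-off (taken from the "good data" guideline earlier in these slides) are illustrative assumptions, not part of CCP4 or the portal.

```python
import re


def summarize_mtzdmp(text, amplitude_labels=("FP", "SIGFP"), max_resolution=4.0):
    """Pull column labels and the high-resolution limit out of mtzdmp text.

    Hypothetical helper: the parsing assumes the layout shown on the slide.
    """
    lines = text.splitlines()
    labels = []
    for i, line in enumerate(lines):
        if line.strip().startswith("* Column Labels"):
            # Labels sit on the first non-empty line after the heading.
            labels = next((nxt.split() for nxt in lines[i + 1:] if nxt.strip()), [])
            break

    # Only look for the "( low - high A )" pair after the resolution heading.
    start = text.find("* Resolution Range")
    match = None
    if start >= 0:
        match = re.search(r"\(\s*([\d.]+)\s*-\s*([\d.]+)\s*A\s*\)", text[start:])
    high_res = float(match.group(2)) if match else None

    return {
        "labels": labels,
        # Column names vary between data sets; FP/SIGFP is only the slide's example.
        "has_amplitudes": all(name in labels for name in amplitude_labels),
        "high_resolution": high_res,
        "resolution_ok": high_res is not None and high_res < max_resolution,
    }


sample = """\
* Column Labels :
H K L FP SIGFP FreeRflag
* Resolution Range :
0.00050 0.25197 ( 44.699 - 1.992 A )
"""
summary = summarize_mtzdmp(sample)
print(summary)
```

If your amplitude columns use other names, pass them via `amplitude_labels` rather than editing the function.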

Step 3a: Review active task list on portal

(Screenshot callout: click here to access task)

Step 3b: Check email for task details and link

(Screenshot callout: click here to access task)

Step 4: Log into job page

Step 5a: Review web page

Step 5b: Check status

(Screenshot callout: click here)

Job status codes:
R = Running
I = Idle
H = Held

Remember: Someone from SBGrid will manually review your job and release it. Until that happens, your job won’t even be in the queue. Even after that, it could be in the queue for several days before it starts running.

Do email us if you have questions or if it seems stuck or not running.

Step 5c: Check status

(Screenshot callouts: summary of active jobs; outcomes to date)

Step 6a: Review scatter graphs

Look for a cluster of high TFZ and high LLG results distinct from the rest

NOTE: This graph is a static image
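
The same "distinct cluster" inspection can be approximated numerically. This is a rough sketch, not the portal's method: it flags results whose TFZ and LLG both sit far above the bulk, using a median/MAD score. The domain identifiers, the scores, and the threshold of 5 are made-up illustrative values.

```python
from statistics import median


def robust_scores(values):
    """Distance of each value from the median, in units of the MAD."""
    med = median(values)
    mad = median([abs(v - med) for v in values])
    if mad == 0:
        # All values (nearly) identical: nothing stands out.
        return [0.0 for _ in values]
    return [(v - med) / mad for v in values]


def standout_candidates(results, threshold=5.0):
    """Return ids whose TFZ and LLG are both well above the bulk.

    `results` is a list of (domain_id, tfz, llg) tuples. The threshold is an
    arbitrary assumption for illustration, not a Phaser criterion.
    """
    tfz_scores = robust_scores([r[1] for r in results])
    llg_scores = robust_scores([r[2] for r in results])
    return [
        r[0]
        for r, z_tfz, z_llg in zip(results, tfz_scores, llg_scores)
        if z_tfz > threshold and z_llg > threshold
    ]


# Hypothetical results: one candidate clearly separated from the rest.
results = [
    ("d1aaa_", 3.1, 12.0),
    ("d1bbb_", 3.4, 15.0),
    ("d1ccc_", 2.9, 10.0),
    ("d1ddd_", 3.3, 14.0),
    ("d1eee_", 3.0, 11.0),
    ("d1fff_", 9.8, 160.0),
    ("d1ggg_", 3.2, 13.0),
]
print(standout_candidates(results))
```

An empty list corresponds to the common case described in Step 6b, where no strong MR candidate separates from the rest.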

Step 6b: Cases with no strong MR candidates*

* Remember this is usually the case, unfortunately

Step 6c: Review scatter graphs

Click this button to load data and enable clickable image

NOTE: This graph is a dynamic clickable image. Only the first 5000 results by LLG are currently available because of memory constraints.

Step 6d: Review scatter graphs

• Click data point to view details
• Click large cartoon image to add to PDB basket

(Screenshot callout: image details)

Step 7: Review tabular data

• live results (space delimited)
• sorted results (tab delimited), generated by “check status”
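
Both result files can be read with the same code, since splitting on any whitespace handles space- and tab-delimited rows alike. The sketch below assumes a column order of domain id, TFZ, LLG; that layout is a guess for illustration, so check the real files and adjust the indices before relying on it.

```python
def parse_results(text):
    """Parse whitespace-delimited result rows into (domain_id, tfz, llg) tuples.

    Assumed (hypothetical) column order: id, TFZ, LLG. Rows that do not fit,
    such as headers or truncated lines, are skipped.
    """
    rows = []
    for line in text.splitlines():
        fields = line.split()  # handles both spaces and tabs
        if len(fields) < 3:
            continue
        try:
            rows.append((fields[0], float(fields[1]), float(fields[2])))
        except ValueError:
            continue  # header or malformed row
    return rows


def top_by_llg(rows, limit=5000):
    """Sort by LLG, highest first, and keep the first `limit` rows."""
    return sorted(rows, key=lambda row: row[2], reverse=True)[:limit]


# Hypothetical sample mixing space- and tab-delimited rows.
sample = "id tfz llg\nd1aaa_ 3.1 12.0\nd1fff_\t9.8\t160.0\nd1bbb_ 3.4 15.0\n"
rows = parse_results(sample)
print(top_by_llg(rows, limit=2))
```

The default `limit` mirrors the 5000-result cap mentioned for the clickable graph; it is not a requirement of the files themselves.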

Step 8: Wait for job to finish

A finished job shows:
• No running jobs (all done)
• results approx. 100,000
• errors < 5,000

NOTE: This job is not yet finished!
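
The three indicators above can be combined into a simple check. This is only a sketch: the function name, the 10% tolerance used for "approx. 100,000", and the example counts are assumptions, and the counts themselves must be read off the job page by hand.

```python
def job_looks_finished(running, results, errors,
                       expected_results=100_000, max_errors=5_000,
                       tolerance=0.10):
    """Apply the slide's rule of thumb to counts read from the job page.

    The tolerance for "approximately 100,000 results" is an arbitrary
    assumption for illustration.
    """
    no_running_jobs = running == 0
    results_close = abs(results - expected_results) <= tolerance * expected_results
    few_errors = errors < max_errors
    return no_running_jobs and results_close and few_errors


# Hypothetical counts: one completed job, one still in progress.
print(job_looks_finished(running=0, results=98_500, errors=1_200))
print(job_looks_finished(running=350, results=60_000, errors=800))
```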

Step 9: Download finalized augmented results

• augmented: contains static SCOP domain class and name (25 MB)
• final: contains a sorted, cleaned set of results (5 MB)

Step 10: Review and download specific SCOP PDB

• Use the tabular results to identify specific SCOP codes that look promising
• PDBs can be fetched using one of these resources:
http://portal.nebiogrid.org/biodb/scop/v1.75/clean/code2/
http://abitibi.sbgrid.org/cgi/pdbview.py
http://abitibi.sbgrid.org/cgi/tmalign.py
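
For a list of promising codes, the download locations can be built in a loop. The sketch below assumes the "code2" directory is keyed by the second and third characters of the domain code, which is inferred only from the path `00/200la_.pdb` in the Phaser input of Step 11. Treat that layout, and the helper name `scop_pdb_url`, as unverified assumptions.

```python
SCOP_BASE = "http://portal.nebiogrid.org/biodb/scop/v1.75/clean/code2/"


def scop_pdb_url(domain_code, base=SCOP_BASE):
    """Build a candidate URL for one SCOP domain PDB file.

    Assumption: files live in a subdirectory named after characters 2-3 of
    the code (e.g. "200la_" -> "00/200la_.pdb").
    """
    subdir = domain_code[1:3]
    return f"{base}{subdir}/{domain_code}.pdb"


# "200la_" is the code used in the Step 11 Phaser input.
for code in ["200la_"]:
    print(scop_pdb_url(code))
```

If the URL does not resolve, browse the base directory listed above to confirm the actual layout.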

Step 11: Recreate Phaser output

This is the command input to Phaser:
ROOT 2vlj-test
MODE MR_AUTO
HKLIn ../2vlj.mtz
LABIn F=FP SIGF=SIGFP
ENSEmble 200la_ PDB 00/200la_.pdb IDENtity 0.3
COMPosition SOLVENT 50.0
RESOlution 2.4
SEARch ENSEmble 200la_ NUM 1

Click on “test” directory (bottom of job page)
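
To re-run several candidate models, the keyword input shown above can be generated rather than edited by hand. This is a minimal sketch that reproduces the slide's eight lines from parameters. The function name `phaser_input` and its defaults are illustrative, and the values for identity, solvent content, and resolution are simply those from this example, not recommendations.

```python
def phaser_input(root, mtz, ensemble, pdb, identity=0.3, solvent=50.0,
                 resolution=2.4, f_label="FP", sigf_label="SIGFP"):
    """Return Phaser keyword input mirroring the example on this slide."""
    lines = [
        f"ROOT {root}",
        "MODE MR_AUTO",
        f"HKLIn {mtz}",
        f"LABIn F={f_label} SIGF={sigf_label}",
        f"ENSEmble {ensemble} PDB {pdb} IDENtity {identity}",
        f"COMPosition SOLVENT {solvent}",
        f"RESOlution {resolution}",
        f"SEARch ENSEmble {ensemble} NUM 1",
    ]
    return "\n".join(lines) + "\n"


script = phaser_input("2vlj-test", "../2vlj.mtz", "200la_", "00/200la_.pdb")
print(script)
```

The generated text can be saved to a file and passed to Phaser on standard input, assuming a CCP4 installation with the `phaser` executable on your path.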

Step 12: Over to you

• You now need to refine your structure
• WS-MR only gets you as far as attempting to identify promising MR candidates if you haven’t had success with conventional model identification methods
• Some further MR options that exist:
  • Second domain search with first domain fixed
  • homo-dimer/homo-trimer searches
  • Custom PDB search library: you give us the PDBs, we can run WS-MR over the set

Conclusion and Thanks

• We welcome ideas for improvements
• Special processing requirements?
  • We may be able to do this from the command line interface
• Please contact us if you have any questions
  • hpc@sbgrid.org
• Open Science Grid is a big enabler here!
  • http://opensciencegrid.org
• Thanks to SBGrid team:
  • http://www.sbgrid.org
• Thanks to the Sliz Lab at Harvard Medical School:
  • http://hkl.hms.harvard.edu
