Energy Landscapes and Dynamics of Biopolymers

             Michael Thomas Wolfinger

                  University of Vienna


                  5 March 2012
Outline



   1 Biopolymer structure


   2 Energy landscapes


   3 Folding kinetics


   4 RNA refolding


   5 Summary
The RNA model


    GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCACCA

    [Figure: secondary structure drawing of the sequence above]

    (((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))....



  A secondary structure is a list of base pairs that fulfills two constraints:
     A base may participate in at most one base pair.
     Base pairs must not cross, i.e., no two pairs (i, j) and (k, l) may have
     i < k < j < l (no pseudo-knots).

  The optimal as well as the suboptimal structures can be computed recursively.
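
The recursive computation mentioned above can be illustrated with a deliberately simplified stand-in: instead of the loop-based free energy, the sketch below maximizes the number of non-crossing base pairs (a Nussinov-style recursion). The pairing rules in PAIRS and the minimum hairpin size MIN_LOOP are assumptions made for this illustration, not parameters taken from the talk.

```python
from functools import lru_cache

# Watson-Crick and GU wobble pairs (assumed pairing rules).
PAIRS = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G"), ("G", "U"), ("U", "G")}
MIN_LOOP = 3  # assumed minimum number of unpaired bases inside a hairpin


def max_base_pairs(seq: str) -> int:
    """Maximum number of non-crossing base pairs (Nussinov-style recursion)."""

    @lru_cache(maxsize=None)
    def best(i: int, j: int) -> int:
        # Intervals too short to close a hairpin hold no pairs.
        if j - i <= MIN_LOOP:
            return 0
        # Case 1: base i stays unpaired.
        result = best(i + 1, j)
        # Case 2: base i pairs with k, which splits the interval into an
        # enclosed part (i+1 .. k-1) and an independent part (k+1 .. j).
        for k in range(i + MIN_LOOP + 1, j + 1):
            if (seq[i], seq[k]) in PAIRS:
                result = max(result, 1 + best(i + 1, k - 1) + best(k + 1, j))
        return result

    return best(0, len(seq) - 1)
```

Because every pair (i, k) splits the sequence into two independent intervals, the no-crossing constraint is what makes this decomposition possible.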
The HP-model

 In this simplified model, a conformation is a self-avoiding walk (SAW) on a
 given lattice in 2 or 3 dimensions. Each bond is a straight line, and bond
 angles take a few discrete values. The 20-letter alphabet of amino acids
 (monomers) is reduced to a two-letter alphabet, namely H and P. H represents
 hydrophobic monomers, P represents hydrophilic or polar monomers.

 [Figure: example lattice conformation of H and P monomers with the move
 string FRRLLFLF, and diagrams of the move labels for different lattices]

 Advantages:
     lattice-independent folding algorithms
     simple energy function
     hydrophobicity can be reasonably modeled
Energy functions

   RNA

   The standard energy model expresses the free energy of a secondary
   structure S as the sum of the energies of its loops l:

       E(S) = \sum_{l \in S} E(l)

   [Figure: example RNA secondary structure, E = −17.5 kcal/mol]

   Lattice Proteins

   The energy function for a sequence with n residues S = s_1 s_2 ... s_n,
   with s_i \in A = {a_1, a_2, ..., a_b}, the alphabet of b residues, and an
   overall configuration x = (x_1, x_2, ..., x_n) on a lattice L can be
   written as the sum of pair potentials

       E(S, x) = \sum_{\substack{i < j-1 \\ |x_i - x_j| = 1}} \Psi[s_i, s_j].

   Pair potentials Ψ for the HP and HPNX alphabets:

            H   P                    H   P   N   X
        H  −1   0                H  −4   0   0   0
        P   0   0                P   0   1  −1   0
                                 N   0  −1   1   0
                                 X   0   0   0   0

   [Figure: example lattice protein conformation, E = −16]
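
The pair-potential sum for lattice proteins can be sketched as follows. This is a minimal illustration for square or cubic lattices, where |x_i − x_j| = 1 is tested as a unit Manhattan distance; the function name lattice_energy and the dictionary layout of the potential are assumptions made for this sketch.

```python
from itertools import combinations

# HP contact potential from the table above: only H-H contacts contribute.
HP_POTENTIAL = {("H", "H"): -1, ("H", "P"): 0, ("P", "H"): 0, ("P", "P"): 0}


def lattice_energy(sequence, coords, potential=HP_POTENTIAL):
    """Sum pair potentials over non-bonded residues that are lattice neighbors."""
    if len(sequence) != len(coords):
        raise ValueError("sequence and coordinates must have equal length")
    if len(set(coords)) != len(coords):
        raise ValueError("conformation is not self-avoiding")
    energy = 0
    for i, j in combinations(range(len(sequence)), 2):
        if j - i < 2:
            continue  # skip chain neighbors: the sum runs over i < j - 1
        # Unit Manhattan distance means nearest neighbors on this lattice.
        distance = sum(abs(a - b) for a, b in zip(coords[i], coords[j]))
        if distance == 1:
            energy += potential[(sequence[i], sequence[j])]
    return energy


# A four-residue chain folded into a unit square: residues 0 and 3 touch.
square = [(0, 0), (1, 0), (1, 1), (0, 1)]
example_energy = lattice_energy("HPPH", square)
```

Passing a different dictionary as potential (for example one built from the HPNX table) reuses the same sum unchanged.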
Folding landscape - energy landscape


   The energy landscape of a biopolymer molecule is a complex surface of
   the (free) energy versus the conformational degrees of freedom.


    Number of RNA secondary structures
        c_n \sim 1.86^n \cdot n^{-3/2}
        dynamic programming algorithms available

    Number of LP structures
        c_n \sim \mu^n \cdot n^{\gamma - 1}
        problem is NP-hard

    dim   Lattice Type      µ         γ
     2        SQ         2.63820   1.34275
     2        TRI        4.15076   1.343
     2        HEX        1.84777   1.345
     3        SC         4.68391   1.161
     3        BCC        6.53036   1.161
     3        FCC        10.0364   1.162



   Formally, three things are needed to construct an energy landscape:
       a set X of configurations
       an energy function f : X → R
       a symmetric neighborhood relation N ⊆ X × X
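
To give a feeling for the exponential growth of the configuration set X, the following sketch enumerates self-avoiding walks on the square lattice by brute force. It counts all walks starting at the origin without removing symmetric copies, which is an assumption about the counting convention; the ratios of successive counts only roughly indicate the growth constant µ at such small n.

```python
def count_saws(n: int) -> int:
    """Count self-avoiding walks with n steps on the square lattice."""
    steps = ((1, 0), (-1, 0), (0, 1), (0, -1))

    def extend(pos, visited, remaining):
        if remaining == 0:
            return 1
        total = 0
        for dx, dy in steps:
            nxt = (pos[0] + dx, pos[1] + dy)
            if nxt not in visited:
                # Occupy the site, recurse, then free it again (backtracking).
                visited.add(nxt)
                total += extend(nxt, visited, remaining - 1)
                visited.remove(nxt)
        return total

    return extend((0, 0), {(0, 0)}, n)


counts = [count_saws(n) for n in range(1, 9)]  # starts 4, 12, 36, 100, 284
ratios = [counts[i + 1] / counts[i] for i in range(len(counts) - 1)]
```

Exhaustive enumeration like this becomes infeasible quickly, which is one reason the later slides restrict attention to the low-energy part of the landscape.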
The move set


     [Figure: example moves, including the move-string changes
     FRLLRLFRR → FRLLFLFRR and FUDLUDLU → FURLUDLU]


     For each move there must be an inverse move
     Resulting structure must be in X
     Move set must be ergodic
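
A move set of this kind can be sketched for relative move strings on the square lattice. The decoding convention below (start at the origin heading east; F keeps the heading, L and R turn by 90 degrees before stepping) is an assumption of this sketch. A move changes exactly one letter, so every move has an inverse, and candidates that are not self-avoiding are rejected so the result stays in X. Ergodicity is not checked here.

```python
TURN = {
    "F": lambda dx, dy: (dx, dy),    # keep the current heading
    "L": lambda dx, dy: (-dy, dx),   # rotate 90 degrees counter-clockwise
    "R": lambda dx, dy: (dy, -dx),   # rotate 90 degrees clockwise
}


def decode(moves):
    """Turn a relative move string into square-lattice coordinates."""
    x, y, dx, dy = 0, 0, 1, 0
    coords = [(x, y)]
    for m in moves:
        dx, dy = TURN[m](dx, dy)
        x, y = x + dx, y + dy
        coords.append((x, y))
    return coords


def is_self_avoiding(moves):
    coords = decode(moves)
    return len(set(coords)) == len(coords)


def neighbors(moves):
    """All self-avoiding strings that differ from `moves` in exactly one letter."""
    result = []
    for i, old in enumerate(moves):
        for new in "FLR":
            if new != old:
                candidate = moves[:i] + new + moves[i + 1:]
                if is_self_avoiding(candidate):
                    result.append(candidate)
    return result
```

Under this convention the two strings FRLLRLFRR and FRLLFLFRR are mutual neighbors, since they differ in a single letter and both decode to self-avoiding walks.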
Energy barriers and barrier trees

   Some topological definitions:
   A structure is a
        local minimum if its energy is lower than the energy of all neighbors
        local maximum if its energy is higher than the energy of all neighbors
        saddle point if there are at least two local minima that can be
        reached by a downhill walk starting at this point

   [Figure: energy axis E with points labelled 1* to 5 and a to d]

   We further define
        a walk between two conformations x and y as a list of conformations
        x = x_1 ... x_{m+1} = y such that \forall 1 \le i \le m : N(x_i, x_{i+1})
        the lower part of the energy landscape (written as X^{\le\eta}) as all
        conformations x such that E(S, x) \le \eta (with a predefined threshold \eta).

        C. Flamm, I. L. Hofacker, P. F. Stadler, and M. T. Wolfinger.
        Barrier trees of degenerate landscapes.
        Z. Phys. Chem., 216:155–173, 2002.
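
These definitions can be tried out on a small example. The sketch below uses a hypothetical toy landscape (seven states on a path graph with made-up energies) and strict inequalities, so it does not cover degenerate landscapes with equal-energy neighbors.

```python
ENERGIES = [3, 1, 4, 0, 2, 6, 5]  # hypothetical toy landscape on a path graph
STATES = range(len(ENERGIES))


def energy(x):
    return ENERGIES[x]


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y < len(ENERGIES)]


def is_local_minimum(x):
    return all(energy(x) < energy(y) for y in neighbors(x))


def is_local_maximum(x):
    return all(energy(x) > energy(y) for y in neighbors(x))


def downhill_minima(x):
    """Local minima reachable from x by walks with strictly decreasing energy."""
    if is_local_minimum(x):
        return {x}
    found = set()
    for y in neighbors(x):
        if energy(y) < energy(x):
            found |= downhill_minima(y)
    return found


def is_saddle_point(x):
    return len(downhill_minima(x)) >= 2


def is_walk(path):
    """Check that consecutive conformations are neighbors."""
    return all(b in neighbors(a) for a, b in zip(path, path[1:]))


def lower_part(eta):
    return [x for x in STATES if energy(x) <= eta]
```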
The lower part of the energy landscape

   Two conformations x and y are mutually accessible at the level η
   (written as x \leftrightarrow_\eta y) if there is a walk from x to y such
   that all conformations z in the walk satisfy E(S, z) \le \eta. The saddle
   height \hat{f}(x, y) of x and y is defined by

       \hat{f}(x, y) = \min\{\eta \mid x \leftrightarrow_\eta y\}

   Given the set of all local minima X_{min}^{\le\eta} below threshold η, the
   lower energy part X^{\le\eta} of the energy landscape can alternatively be
   written as

       X^{\le\eta} = \{y \mid \exists x \in X_{min}^{\le\eta} : \hat{f}(x, y) \le \eta\}

   Given a restricted set of low-energy conformations, Xinit , and a reasonable
   value for η , the lower part of the energy landscape can be calculated.

       M. T. Wolfinger, S. Will, I. L. Hofacker, R. Backofen, and P. F. Stadler.
       Exploring the lower part of discrete polymer model energy landscapes.
       Europhys. Lett., 2006.
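
The saddle height is a minimax quantity: among all walks from x to y, take the one whose highest point is lowest. One way to compute it on a small, fully enumerated landscape is a Dijkstra-like search in which the cost of a walk is its maximum energy. This is a generic graph technique used here for illustration, not the method of the cited paper, and the toy landscape is hypothetical.

```python
import heapq

ENERGIES = [3, 1, 4, 0, 2, 6, 5]  # hypothetical toy landscape on a path graph
STATES = range(len(ENERGIES))


def energy(x):
    return ENERGIES[x]


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y < len(ENERGIES)]


def saddle_height(start, goal):
    """Lowest eta such that some walk from start to goal stays at or below eta."""
    best = {start: energy(start)}
    heap = [(energy(start), start)]
    while heap:
        level, x = heapq.heappop(heap)
        if x == goal:
            return level
        if level > best[x]:
            continue  # stale heap entry
        for y in neighbors(x):
            new_level = max(level, energy(y))  # highest point seen on this walk
            if new_level < best.get(y, float("inf")):
                best[y] = new_level
                heapq.heappush(heap, (new_level, y))
    return float("inf")  # goal is not reachable


def lower_part(minima, eta):
    """All y connected to some local minimum below eta by a walk below eta."""
    low_minima = [m for m in minima if energy(m) <= eta]
    return [y for y in STATES
            if any(saddle_height(m, y) <= eta for m in low_minima)]
```

On this toy landscape, lower_part([1, 3, 6], 4) returns the same states as the direct definition E ≤ 4, which is the equivalence stated above.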
The Flooder approach


   [Figure: two panels with an energy axis E, a threshold level Ep, and
   regions labelled A, B, C, and D]
The concept of Barriers
The algorithm of Barriers


  Barriers
  Require: all suboptimal secondary structures within a certain energy range
           from mfe, sorted by increasing energy
   B ⇐ ∅                                ▷ set of local minima (basins)
   for all x ∈ subopt do
      K ⇐ ∅                             ▷ basins adjacent to x
      N ⇐ generate_neighbors(x)
      for all y ∈ N do
          if b ⇐ lookup_hash(y) then     ▷ y was seen before and lies in basin b
             K ⇐ K ∪ {b}
          end if
      end for
      if K = ∅ then
          B ⇐ B ∪ {x}                    ▷ x is a new local minimum
      end if
      if |K| ≥ 2 then
          merge_basins(K)                ▷ x is a saddle point connecting the basins in K
      end if
      write_hash(x)
   end for
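
The pseudocode above can be sketched in Python as follows. The dictionary basin_of plays the role of the hash, and a small union-find structure stands in for merge_basins. Function names, the returned data, and the toy landscape are assumptions of this sketch; ties between equal energies are simply broken by sort order, which is cruder than a proper treatment of degenerate landscapes.

```python
ENERGIES = [3, 1, 4, 0, 2, 6, 5]  # hypothetical toy landscape on a path graph


def energy(x):
    return ENERGIES[x]


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y < len(ENERGIES)]


def barrier_tree(states, energy, neighbors):
    """Flood the landscape from below; return local minima and merge events."""
    basin_of = {}  # processed conformation -> basin label when it was inserted
    parent = {}    # union-find forest over basin labels (the local minima)

    def find(b):
        while parent[b] != b:
            parent[b] = parent[parent[b]]  # path halving
            b = parent[b]
        return b

    minima, merges = [], []
    for x in sorted(states, key=energy):
        touched = {find(basin_of[y]) for y in neighbors(x) if y in basin_of}
        if not touched:
            # No neighbor has been seen yet: x opens a new basin.
            parent[x] = x
            minima.append(x)
            basin_of[x] = x
            continue
        # The deepest basin survives; x is the saddle for all others.
        survivor = min(touched, key=energy)
        for b in sorted(touched - {survivor}, key=energy):
            parent[b] = survivor
            merges.append((b, survivor, energy(x)))
        basin_of[x] = survivor
    return minima, merges


minima, merges = barrier_tree(range(len(ENERGIES)), energy, neighbors)
```

Each entry (b, survivor, e) in merges says that basin b joins the deeper basin survivor at saddle energy e, which is the information needed to draw the tree.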
The flooding algorithm




   [Figure: four successive stages (1 to 4) of the flooding algorithm]
Barrier tree example
Information from the barrier trees




       Local minima
       Saddle points
       Barrier heights
       Gradient basins
       Partition functions and free energies of (gradient) basins
   A gradient basin is the set of all initial points from which a gradient walk
   (steepest descent) ends in the same local minimum.
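
A gradient walk and the resulting basins can be sketched on a toy landscape. The landscape, the function names, and the use of kT = 1 in the example are assumptions; the free energy of a basin is computed from its partition function as G = −kT ln Z.

```python
import math

ENERGIES = [3, 1, 4, 0, 2, 6, 5]  # hypothetical toy landscape on a path graph


def energy(x):
    return ENERGIES[x]


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y < len(ENERGIES)]


def gradient_walk(x):
    """Follow steepest descent until no neighbor is lower in energy."""
    while True:
        lowest = min(neighbors(x), key=energy, default=x)
        if energy(lowest) >= energy(x):
            return x
        x = lowest


def gradient_basins(states):
    """Group states by the local minimum their gradient walk ends in."""
    basins = {}
    for x in states:
        basins.setdefault(gradient_walk(x), []).append(x)
    return basins


def basin_free_energy(members, kT):
    """G = -kT ln Z with Z summed over the members of one basin."""
    z = sum(math.exp(-energy(x) / kT) for x in members)
    return -kT * math.log(z)


basins = gradient_basins(range(len(ENERGIES)))
```

Since every state belongs to exactly one basin, the basins partition the landscape, and the free energy of a basin is never higher than the energy of its minimum.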
Folding kinetics




   Biomolecules may have kinetic traps which prevent them from reaching
   equilibrium within the lifetime of the molecule. Long molecules are often
   trapped in such meta-stable states during transcription.

   Possible solutions are
        Stochastic folding simulations (predict folding pathways)
       Predicting structures for growing fragments of the sequence
       Analysis of the energy landscape based on complete suboptimal
       folding
Biopolymer dynamics



   The probability distribution P of structures as a function of time is ruled
   by a set of forward equations, also known as the master equation

       \frac{dP_t(x)}{dt} = \sum_{y \ne x} [P_t(y) k_{xy} - P_t(x) k_{yx}]

   Given an initial population distribution, how does the system evolve in
   time? (What is the population distribution after n time-steps?)

       \frac{d}{dt} P_t = U P_t \implies P_t = e^{tU} P_0
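
For a handful of states the formal solution P_t = e^{tU} P_0 can be evaluated directly. The sketch below builds U from rates k[x][y] (read as the rate of the transition y → x, matching the master equation) and computes the matrix exponential in pure Python by scaling and squaring with a truncated Taylor series. The helper names and the two-state example rates are assumptions; for larger systems a numerical library would be the more sensible choice.

```python
import math


def rate_matrix(k):
    """Build U from k[x][y], the rate of the transition y -> x."""
    n = len(k)
    U = [[float(k[x][y]) if x != y else 0.0 for y in range(n)] for x in range(n)]
    for x in range(n):
        # Diagonal: total rate of leaving x, so every column sums to zero.
        U[x][x] = -sum(k[y][x] for y in range(n) if y != x)
    return U


def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][m] * B[m][j] for m in range(n)) for j in range(n)]
            for i in range(n)]


def mat_exp(A, terms=12):
    """exp(A) by scaling and squaring with a truncated Taylor series."""
    n = len(A)
    norm = max(sum(abs(a) for a in row) for row in A)
    squarings = max(0, math.ceil(math.log2(norm))) + 4 if norm > 0 else 0
    scaled = [[a / 2 ** squarings for a in row] for row in A]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms + 1):
        # term holds scaled^order / order!
        term = [[v / order for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def propagate(U, p0, t):
    """Population vector at time t, starting from p0."""
    E = mat_exp([[u * t for u in row] for row in U])
    n = len(p0)
    return [sum(E[x][y] * p0[y] for y in range(n)) for x in range(n)]


# Hypothetical two-state system: rate 1 for 1 -> 0 and rate 3 for 0 -> 1.
k = [[0.0, 1.0], [3.0, 0.0]]
p = propagate(rate_matrix(k), [1.0, 0.0], 0.5)
```

For this two-state system the exact solution is P_t(0) = 1/4 + (3/4) e^{-4t}, which gives a simple check of the numerical result.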
Kinfold: A stochastic kinetic folding algorithm

    Simulate folding kinetics by a rejection-less Monte-Carlo type
    algorithm:




    Generate all neighbors using the move-set
    Assign rates to each move, e.g.

        P_i = \min\{1, \exp(-\Delta E / kT)\}

    Select a move with probability proportional to its rate
    Advance clock 1 / \sum_i P_i.

    [Figure: moves out of the current structure, labelled P1 to P8]
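
The steps above can be sketched as a small rejection-less simulation on a toy landscape. This is not the Kinfold implementation; the landscape, function names, and seed handling are assumptions. The clock is advanced by the mean waiting time 1 / Σ P_i as stated on the slide, whereas a Gillespie-type simulation would draw an exponentially distributed waiting time with that mean.

```python
import math
import random

ENERGIES = [3, 1, 4, 0, 2, 6, 5]  # hypothetical toy landscape on a path graph


def energy(x):
    return ENERGIES[x]


def neighbors(x):
    return [y for y in (x - 1, x + 1) if 0 <= y < len(ENERGIES)]


def metropolis_rate(delta_e, kT):
    """min{1, exp(-dE/kT)}; downhill moves are capped at rate 1."""
    if delta_e <= 0:
        return 1.0
    return math.exp(-delta_e / kT)


def kinfold_step(x, kT, rng):
    """Pick one move with probability proportional to its rate."""
    moves = neighbors(x)
    rates = [metropolis_rate(energy(y) - energy(x), kT) for y in moves]
    chosen = rng.choices(moves, weights=rates, k=1)[0]
    return chosen, 1.0 / sum(rates)  # new state and clock increment


def simulate(x, kT, t_max, seed=0):
    """Return the trajectory [(time, state), ...] up to time t_max."""
    rng = random.Random(seed)
    t, trajectory = 0.0, [(0.0, x)]
    while True:
        x_next, dt = kinfold_step(x, kT, rng)
        if t + dt > t_max:
            return trajectory
        t, x = t + dt, x_next
        trajectory.append((t, x))
```

Every iteration performs a move, so no proposals are rejected; slow processes show up as small rates and therefore large clock increments.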
Treekin: Barrier tree kinetics

       Local minima
       Saddle points
       Barrier heights
       Gradient basins
       Partition functions
       Free energies of (gradient) basins
   With this information, a reduced dynamics can be formulated as a Markov
   process by means of macrostates (i.e. basins in the barrier tree) and
   Arrhenius-like transition rates between them.

       \frac{d}{dt} P_t = U P_t \implies P_t = e^{tU} P_0

       macro-states form a partition of the full configuration space
       transition rates between macro-states

           r_{\beta\alpha} = \Gamma_{\beta\alpha} \exp(-(E^*_{\beta\alpha} - G_\alpha) / kT)

   [Figure: population percentage (0 to 1) versus time (logarithmic axis
   from 10^-2 to 10^6)]


       M. T. Wolfinger, W. A. Svrcek-Seiler, C. Flamm, I. L. Hofacker, and P. F. Stadler.
       Efficient computation of RNA folding dynamics.
       J. Phys. A: Math. Gen., 37(17):4731–4741, 2004.
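
The master equation and the Arrhenius-type rates above can be tried out on a toy system. The following is a minimal sketch, not the authors' implementation: the three macro-state free energies, the two saddle energies, the prefactor Γ = 1 and kT = 0.61632 kcal/mol (about 37 °C) are all assumptions chosen for illustration. The matrix exponential is computed with a plain Taylor series plus scaling and squaring so that the example needs only the standard library; it does not use diagonalization.

```python
import math

KT = 0.61632  # kcal/mol at about 37 degrees C (assumed temperature)

def rate_matrix(G, saddles, gamma=1.0):
    """U[b][a] is the rate a -> b; each column sums to zero."""
    n = len(G)
    U = [[0.0] * n for _ in range(n)]
    for (a, b), e_star in saddles.items():
        U[b][a] = gamma * math.exp(-(e_star - G[a]) / KT)  # a -> b
        U[a][b] = gamma * math.exp(-(e_star - G[b]) / KT)  # b -> a
    for a in range(n):
        U[a][a] = -sum(U[b][a] for b in range(n) if b != a)
    return U

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm(U, t, terms=20):
    """exp(t*U) by Taylor series with scaling and squaring."""
    n = len(U)
    norm = t * max(sum(abs(x) for x in row) for row in U)
    squarings = max(0, math.ceil(math.log2(norm)) + 1) if norm > 0 else 0
    M = [[x * t / 2 ** squarings for x in row] for row in U]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms + 1):
        term = [[x / k for x in row] for row in matmul(term, M)]
        result = [[r + s for r, s in zip(rr, tr)]
                  for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = matmul(result, result)
    return result

def populations(U, p0, t):
    """P_t = exp(t*U) P_0."""
    E = expm(U, t)
    n = len(p0)
    return [sum(E[i][j] * p0[j] for j in range(n)) for i in range(n)]

# Hypothetical toy landscape: three macro-states, two saddles.
G = [-4.0, -3.0, -1.0]                # macro-state free energies (kcal/mol)
SADDLES = {(0, 1): 0.0, (1, 2): 0.5}  # saddle energies between basins
U = rate_matrix(G, SADDLES)
P0 = [0.0, 0.0, 1.0]                  # start in the highest basin

for t in (1e-2, 1e0, 1e2, 1e4, 1e6):
    print(t, [round(p, 4) for p in populations(U, P0, t)])
```

Because the forward and backward rates share the same saddle energy, the long-time populations of this toy system approach the Boltzmann weights of the macro-states.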
Dynamics of a short artificial RNA

   CUGCGGCUUUGGCUCUAGCC, n = 20

   [Figure: barrier tree with local minima 1-34, energy axis from -4.0 to 6.0]
   [Figure: population probability vs. time (log scale, 10^-2 to 10^4) for the mfe structure and minima 2, 3, 4, 5, 8]
Dynamics of tRNA
   GCGGAUUUAGCUCAGDDGGGAGAGCGCCAGACUGAAYAUCUGGAGGUCCUGUGTPCGAUCCACAGAAUUCGCACCA

   [Figure, two panels: population probability vs. time (log scale, 10^0 to 10^8 and 10^3 to 10^10); legend: mfe, 5, 80, 56]
Dynamics of lattice proteins: HEX/TET lattice


   NNHHPPNNPHHHHPXP, n = 16

   [Figure, first panel row: barrier tree with local minima 1-66 (energy axis from -5.0 to 2.0); population probability vs. time (log scale, 10^-3 to 10^5) for minima 1, 4, 7, 24, 25 and 25 (pinfold)]
   [Figure, second panel row: barrier tree with local minima 1-50 (energy axis from -14.0 to -2.0); population probability vs. time (log scale, 10^-2 to 10^8) for minima 1, 4, 16, 17, 37 and 37 (pinfold)]
Barrier tree kinetics - problems and pitfalls


   The method works fine for moderately sized systems.
   Currently, we consider approx. 100 million structures within a single run
   of Barriers to calculate the topology of the landscape.
   However, we are interested in larger systems:
      biologically relevant RNA switches
      large 3D lattice proteins

   The next steps:
       use high-level diagonalization routines for sparse matrices
       calculate low-energy structures
       sample (thermodynamic properties of) individual basins
       sample low-energy refolding paths
The PathFinder tool

    A heuristic approach to efficiently estimate low-energy refolding paths
  Overall procedure for direct paths:
         1 Calculate the distance between start and target structure
         2 Generate all neighbors of the start structure whose distance to the target is less
              than the distance of the start structure
         3 Sort those neighbor structures by their energies
         4 Keep the n energetically best structures as new starting points and
              repeat the procedure until the target (stop) structure is reached
         5 If a path has been found, try to find another one with lower energy barrier
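
The five steps above can be sketched as a breadth-limited search. This is a minimal illustration under stated assumptions, not the PathFinder code: structures are sets of base pairs, the distance is the base pair distance, and `energy` is a hypothetical toy function (-1 per base pair, -1 more per stacked pair) standing in for a real RNA energy model. Step 5 is approximated by rerunning the search with several values of n and keeping the lowest saddle.

```python
def crosses(p, q):
    """True if base pairs p and q would form a pseudoknot."""
    (i, j), (k, l) = p, q
    return i < k < j < l or k < i < l < j

def compatible(pair, structure):
    """A pair can be added if its positions are free and nothing crosses."""
    i, j = pair
    return not any(i in q or j in q or crosses(pair, q) for q in structure)

def energy(structure):
    """Toy energy (assumption): -1 per pair, -1 more per stacked pair."""
    e = 0.0
    for (i, j) in structure:
        e -= 1.0
        if (i + 1, j - 1) in structure:
            e -= 1.0
    return e

def distance(a, b):
    """Step 1: base pair distance between two structures."""
    return len(a ^ b)

def closer_neighbors(current, target):
    """Step 2: neighbors that are one move closer to the target."""
    for pair in current - target:
        yield current - {pair}
    for pair in target - current:
        if compatible(pair, current):
            yield current | {pair}

def direct_path(start, target, width):
    """Return (saddle energy, path) keeping `width` structures per step."""
    start, target = frozenset(start), frozenset(target)
    beam = {start: (energy(start), [start])}
    for _ in range(distance(start, target)):
        candidates = {}
        for current, (saddle, path) in beam.items():
            for nxt in closer_neighbors(current, target):
                new_saddle = max(saddle, energy(nxt))
                if nxt not in candidates or new_saddle < candidates[nxt][0]:
                    candidates[nxt] = (new_saddle, path + [nxt])
        # Steps 3 and 4: sort by energy and keep the best `width` structures.
        best = sorted(candidates, key=energy)[:width]
        beam = {s: candidates[s] for s in best}
    return beam.get(target)

# Hypothetical example: two mutually exclusive helices plus one extra pair.
START = {(0, 11), (1, 10), (2, 9)}
TARGET = {(4, 15), (5, 14), (6, 13), (16, 20)}

# Step 5: try several beam widths and keep the path with the lowest saddle.
saddle, path = min((direct_path(START, TARGET, w) for w in (1, 2, 4, 8)),
                   key=lambda result: result[0])
print("saddle energy:", saddle)
for s in path:
    print(energy(s), sorted(s))
```

In this example every start pair crosses every pair of the target helix, so each path has to open the start helix completely; adding the independent pair (16, 20) first lowers the saddle from 0 to -1 in the toy energy.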
   [Figure: two schematic energy profiles, energy vs. distance to the target (8 down to 0), running from START to STOP]
PathFinder example - SV11




  SV11 is an RNA switch of length 115

        metastable structure: E = −69.2 kcal/mol, template for Qβ replicase
        stable structure:     E = −96.4 kcal/mol, not a template
SV11 refolding path 1/3: Esaddle = −52.2 kcal/mol

   [Figure: energy (kcal/mol, axis from -100 to -50) vs. steps (0 to 80) along refolding path 1, with arrows marking selected structures]
SV11 refolding path 2/3: Esaddle = −57.7 kcal/mol


   [Figure: energy (kcal/mol, axis from -100 to -50) vs. steps (0 to 80) along refolding path 2, with arrows marking selected structures]
SV11 refolding path 3/3: Esaddle = −59.2 kcal/mol


   [Figure: energy (kcal/mol, axis from -100 to -50) vs. steps (0 to 80) along refolding path 3, with arrows marking selected structures and an inset secondary structure drawing]
SV11 refolding paths

   [Figure: energy (kcal/mol, axis from -100 to -50) vs. steps (0 to 80) for the SV11 refolding paths]
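
The saddle energies of the three paths translate directly into barrier heights. The short calculation below is a sketch under two assumptions: the paths run from the metastable to the stable structure, and kT = 0.61632 kcal/mol (about 37 °C). The Arrhenius factor is only a rough illustration of how strongly the barrier height enters a rate of the form exp(−ΔE/kT); it is not a result from the talk.

```python
import math

E_METASTABLE = -69.2   # kcal/mol, from the SV11 example slide
E_STABLE = -96.4       # kcal/mol, from the SV11 example slide
E_SADDLE = {1: -52.2, 2: -57.7, 3: -59.2}  # kcal/mol, paths 1 to 3
KT = 0.61632           # kcal/mol at about 37 degrees C (assumption)

def barrier(e_saddle, e_start):
    """Barrier height seen from a structure with energy e_start."""
    return e_saddle - e_start

forward = {p: barrier(e, E_METASTABLE) for p, e in E_SADDLE.items()}
backward = {p: barrier(e, E_STABLE) for p, e in E_SADDLE.items()}

# Rate ratio relative to path 1, assuming identical prefactors.
speedup = {p: math.exp((forward[1] - forward[p]) / KT) for p in E_SADDLE}

for p in sorted(E_SADDLE):
    print(f"path {p}: forward {forward[p]:.1f} kcal/mol, "
          f"backward {backward[p]:.1f} kcal/mol, "
          f"Arrhenius factor vs. path 1: {speedup[p]:.3g}")
```

The forward barriers come out as 17.0, 11.5 and 10.0 kcal/mol for paths 1, 2 and 3.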
libPF - a generic path sampling library




      In practice, this path sampling heuristic is implemented as a C
      library
      All structures along a path are stored in a hash and therefore
      available for the next iterations
      Heuristic routines are strictly separated from model-dependent
      routines, i.e. the library is completely generic
      Currently, RNA secondary structures and lattice proteins are
      implemented
      It is easy to extend the functionality to other discrete systems
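
The separation of heuristic and model-dependent routines described above can be illustrated with a small sketch. It is written in Python for brevity and is not libPF's actual (C) interface: the names `Model`, `PathSampler` and `greedy_direct_path` are hypothetical, and the bit-string model with its "domain wall" energy is a made-up stand-in for a further discrete system.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Hashable, Iterable, List, Tuple

@dataclass
class Model:
    """Model-dependent routines, supplied as plain callbacks."""
    neighbors: Callable[[Hashable], Iterable[Hashable]]
    energy: Callable[[Hashable], float]
    distance: Callable[[Hashable, Hashable], int]

@dataclass
class PathSampler:
    """Generic heuristic; knows nothing about the underlying model."""
    model: Model
    seen: Dict[Hashable, float] = field(default_factory=dict)

    def energy(self, state: Hashable) -> float:
        # Every evaluated state is kept in a hash for later iterations.
        if state not in self.seen:
            self.seen[state] = self.model.energy(state)
        return self.seen[state]

    def greedy_direct_path(self, start, target) -> Tuple[List, float]:
        """Always step to the lowest-energy neighbor closer to the target."""
        path = [start]
        while path[-1] != target:
            d = self.model.distance(path[-1], target)
            closer = [s for s in self.model.neighbors(path[-1])
                      if self.model.distance(s, target) < d]
            path.append(min(closer, key=self.energy))
        return path, max(self.energy(s) for s in path)

# Hypothetical toy model: bit strings, single-bit flips, Hamming distance.
def flips(s: str) -> List[str]:
    return [s[:i] + ("1" if c == "0" else "0") + s[i + 1:]
            for i, c in enumerate(s)]

def hamming(a: str, b: str) -> int:
    return sum(x != y for x, y in zip(a, b))

def walls(s: str) -> float:
    """Toy energy: number of 01/10 boundaries in the string."""
    return float(sum(x != y for x, y in zip(s, s[1:])))

sampler = PathSampler(Model(neighbors=flips, energy=walls, distance=hamming))
path, saddle = sampler.greedy_direct_path("0000", "1111")
print(path, saddle)
print(len(sampler.seen), "states cached")
```

Swapping in another discrete system only requires three new callbacks; the sampler and its cache stay unchanged.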
Conclusion



      Discrete models allow a detailed study of the energy surface

      Barrier trees represent the landscape topology

      The lower part of the energy landscape is accessible by a flooding
      approach

      Macrostate folding kinetics reduces simulation time drastically

      A path sampling approach yields low-energy refolding paths and is a
      valuable tool for further kinetics studies
References



                  Christoph Flamm, Ivo Hofacker, Peter Stadler
                  Rolf Backofen, Sebastian Will, Martin Mann

      M. T. Wolfinger, W. A. Svrcek-Seiler, C. Flamm, I. L. Hofacker, and P. F. Stadler.
      Efficient computation of RNA folding dynamics.
      J. Phys. A: Math. Gen., 37(17):4731–4741, 2004.
      M. T. Wolfinger, S. Will, I. L. Hofacker, R. Backofen, and P. F. Stadler.
      Exploring the lower part of discrete polymer model energy landscapes.
      Europhys. Lett., 74(4):725–732, 2006.
      C. Flamm, W. Fontana, I. L. Hofacker, and P. Schuster.
      RNA folding at elementary step resolution.
      RNA, 6:325–338, 2000.
      C. Flamm, I. L. Hofacker, P. F. Stadler, and M. T. Wolfinger.
      Barrier trees of degenerate landscapes.
      Z. Phys. Chem., 216:155–173, 2002.

Energy Landscapes and Dynamics of Biopolymers

  • 1. Energy Landscapes and Dynamics of Biopolymers Michael Thomas Wolfinger University of Vienna 5 March 2012
  • 2. Outline 1 Biopolymer structure 2 Energy landscapes 3 Folding kinetics 4 RNA refolding 5 Summary
  • 3. The RNA model
    Sequence:  GCGGAUUUAGCUCAGUUGGGAGAGCGCCAGACUGAAGAUCUGGAGGUCCUGUGUUCGAUCCACAGAAUUCGCACCA
    Structure: (((((((..((((........)))).(((((.......))))).....(((((.......))))))))))))....
    [Figure: drawing of the secondary structure of this sequence]
    A secondary structure is a list of base pairs that fulfills two constraints: (1) A base may participate in at most one base pair. (2) Base pairs must not cross, i.e., no two pairs (i, j) and (k, l) may have i < k < j < l (no pseudoknots). The optimal as well as the suboptimal structures can be computed recursively.
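The two constraints can be checked mechanically. The following is a minimal sketch, assuming the structure is given in dot-bracket notation with 1-based positions; the function names `parse_dot_bracket` and `is_secondary_structure` are hypothetical and not taken from any library, and the example string is assumed to correspond to the structure shown on the slide.

```python
def parse_dot_bracket(structure):
    """Return the sorted list of base pairs (i, j), 1-based, of a dot-bracket string."""
    stack, pairs = [], []
    for pos, char in enumerate(structure, start=1):
        if char == "(":
            stack.append(pos)
        elif char == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at position {pos}")
            pairs.append((stack.pop(), pos))
        elif char != ".":
            raise ValueError(f"unexpected character {char!r}")
    if stack:
        raise ValueError(f"unmatched '(' at position {stack[-1]}")
    return sorted(pairs)


def is_secondary_structure(pairs):
    """Check both constraints: every base pairs at most once, and no two pairs cross."""
    bases = [base for pair in pairs for base in pair]
    if len(bases) != len(set(bases)):
        return False
    return not any(i < k < j < l for (i, j) in pairs for (k, l) in pairs)


# Example string assembled piece by piece (four stems, 76 positions).
trna = (
    "(((((((" + ".." + "((((" + "........" + "))))" + "."
    + "(((((" + "......." + ")))))" + "....."
    + "(((((" + "......." + ")))))" + ")))))))" + "...."
)
pairs = parse_dot_bracket(trna)
print(len(trna), len(pairs), is_secondary_structure(pairs))
```

A pair list such as `[(1, 5), (3, 8)]` is rejected because the two pairs cross, which is exactly the pseudoknot case excluded above.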
  • 4. The HP-model In this simplified model, a conformation is a self-avoiding walk (SAW) on a given lattice in 2 or 3 dimensions. Each bond is a straight line, and bond angles have a few discrete values. The 20-letter alphabet of amino acids (monomers) is reduced to a two-letter alphabet, namely H and P. H represents hydrophobic monomers, P represents hydrophilic or polar monomers. Advantages: lattice-independent folding algorithms; simple energy function; hydrophobicity can be reasonably modeled. [Figure: an example walk encoded by the move string FRRLLFLF, together with the move alphabets of several lattices]
  • 5. Energy functions
    RNA: The standard energy model expresses the free energy of a secondary structure S as the sum of the energies of its loops l:
      $E(S) = \sum_{l \in S} E(l)$
    Lattice proteins: The energy function for a sequence with n residues $S = s_1 s_2 \ldots s_n$ with $s_i \in A = \{a_1, a_2, \ldots, a_b\}$, the alphabet of b residues, and an overall configuration $x = (x_1, x_2, \ldots, x_n)$ on a lattice L can be written as the sum of pair potentials:
      $E(S, x) = \sum_{i < j-1,\ |x_i - x_j| = 1} \Psi[s_i, s_j]$
    Pair potential Ψ, HP alphabet:
           H    P
      H   −1    0
      P    0    0
    Pair potential Ψ, HPNX alphabet:
           H    P    N    X
      H   −4    0    0    0
      P    0    1   −1    0
      N    0   −1    1    0
      X    0    0    0    0
    [Figure: example structures for both models, labeled E = −17.5 kcal/mol and E = −16]
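The lattice protein energy function can be illustrated with a short sketch on the square lattice. The pair potentials are the HP and HPNX tables from the slide; the absolute move alphabet `E`, `N`, `W`, `S` and the function names `walk` and `energy` are assumptions made for this example only.

```python
# Pair potentials from the slide; pairs that are not listed contribute 0.
HP = {("H", "H"): -1}
HPNX = {("H", "H"): -4, ("P", "P"): 1, ("N", "N"): 1,
        ("P", "N"): -1, ("N", "P"): -1}

# Hypothetical absolute move alphabet for the square lattice.
STEPS = {"E": (1, 0), "N": (0, 1), "W": (-1, 0), "S": (0, -1)}


def walk(moves):
    """Decode a move string into lattice coordinates and enforce self-avoidance."""
    coords = [(0, 0)]
    for move in moves:
        dx, dy = STEPS[move]
        x, y = coords[-1]
        coords.append((x + dx, y + dy))
    if len(set(coords)) != len(coords):
        raise ValueError("walk is not self-avoiding")
    return coords


def energy(sequence, coords, potential):
    """Sum the pair potential over all contacts with i < j - 1 and |x_i - x_j| = 1."""
    total = 0
    for i in range(len(coords)):
        for j in range(i + 2, len(coords)):  # j >= i + 2 skips chain neighbors
            dist = abs(coords[i][0] - coords[j][0]) + abs(coords[i][1] - coords[j][1])
            if dist == 1:
                total += potential.get((sequence[i], sequence[j]), 0)
    return total


square = walk("ENW")  # four residues folded into a unit square
print(energy("HPPH", square, HP), energy("HPPH", square, HPNX))
```

In the unit square the only non-bonded contact is between the first and the last residue, so the sequence HPPH collects exactly one H-H contact.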
  • 6. Folding landscape - energy landscape The energy landscape of a biopolymer molecule is a complex surface of the (free) energy versus the conformational degrees of freedom.
    Number of RNA secondary structures: $c_n \sim 1.86^n \cdot n^{-3/2}$; dynamic programming algorithms available.
    Number of LP structures: $c_n \sim \mu^n \cdot n^{\gamma - 1}$; problem is NP-hard.
      dim   Lattice type   µ         γ
      2     SQ             2.63820   1.34275
      2     TRI            4.15076   1.343
      2     HEX            1.84777   1.345
      3     SC             4.68391   1.161
      3     BCC            6.53036   1.161
      3     FCC            10.0364   1.162
    Formally, three things are needed to construct an energy landscape: a set X of configurations; an energy function f : X → R; a symmetric neighborhood relation N : X × X.
  • 10. The move set [Figure: example moves between conformations, e.g. FRLLRLFRR ↔ FRLLFLFRR and FUDLUDLU ↔ FURLUDLU] For each move there must be an inverse move. The resulting structure must be in X. The move set must be ergodic.
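For RNA secondary structures, a simple move set that satisfies these requirements consists of inserting or deleting a single base pair. The sketch below assumes 0-based positions, ignores the sequence (any two unpaired positions may pair), and uses a hypothetical minimum loop size `min_loop`; it is an illustration, not the move set of a particular program.

```python
def neighbors(pairs, n, min_loop=3):
    """Return all structures that differ from `pairs` by exactly one base pair.

    `pairs` is a frozenset of (i, j) tuples with i < j on a chain of length n.
    """
    result = [pairs - {pair} for pair in pairs]  # deletion moves
    paired = {base for pair in pairs for base in pair}
    for i in range(n):
        if i in paired:
            continue
        for j in range(i + min_loop + 1, n):
            if j in paired:
                continue
            # Reject insertions that would cross an existing pair.
            if any(k < i < l < j or i < k < j < l for (k, l) in pairs):
                continue
            result.append(pairs | {(i, j)})  # insertion move
    return result


x = frozenset({(0, 8)})
for y in neighbors(x, 10):
    # Every move has an inverse: x is always a neighbor of its neighbors.
    assert x in neighbors(y, 10)
print(len(neighbors(frozenset(), 6)))
```

Every insertion is undone by a deletion and vice versa, and every structure can be reached from the open chain by inserting its pairs one at a time, so this move set is symmetric and ergodic on the set of valid structures.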
  • 11. Energy barriers and barrier trees Some topological definitions: A structure is a local minimum if its energy is lower than the energy of all neighbors; a local maximum if its energy is higher than the energy of all neighbors; a saddle point if there are at least two local minima that can be reached by a downhill walk starting at this point. We further define a walk between two conformations x and y as a list of conformations $x = x_1 \ldots x_{m+1} = y$ such that $\forall\, 1 \le i \le m : N(x_i, x_{i+1})$, and the lower part of the energy landscape (written as $X^{\le\eta}$) as all conformations x such that $E(S, x) \le \eta$ (with a predefined threshold η). [Figure: barrier tree on an energy axis E with local minima 1* to 5 and saddle points a to d] C. Flamm, I. L. Hofacker, P. F. Stadler, and M. T. Wolfinger. Barrier trees of degenerate landscapes. Z. Phys. Chem., 216:155–173, 2002.
  • 13. The lower part of the energy landscape Two conformations x and y are mutually accessible at the level η (written as $x \xleftrightarrow{\eta} y$) if there is a walk from x to y such that all conformations z in the walk satisfy $E(S, z) \le \eta$. The saddle height $\hat{f}(x, y)$ of x and y is defined by
      $\hat{f}(x, y) = \min\{\eta \mid x \xleftrightarrow{\eta} y\}$
    Given the set of all local minima $X_{\min}^{\le\eta}$ below threshold η, the lower energy part $X^{\le\eta}$ of the energy landscape can alternatively be written as
      $X^{\le\eta} = \{y \mid \exists x \in X_{\min}^{\le\eta} : \hat{f}(x, y) \le \eta\}$
    Given a restricted set of low-energy conformations, $X_{\mathrm{init}}$, and a reasonable value for η, the lower part of the energy landscape can be calculated. M. T. Wolfinger, S. Will, I. L. Hofacker, R. Backofen, and P. F. Stadler. Exploring the lower part of discrete polymer model energy landscapes. Europhys. Lett., 2006.
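Both definitions can be tried out on a small explicit landscape. The sketch below uses a hypothetical six-state toy landscape (`ENERGY`, `NEIGHBORS`); `saddle_height` is a minimax variant of Dijkstra's algorithm and `flood` is a plain graph search restricted to conformations with energy at most η. Neither is meant to reproduce the implementation described in the cited paper.

```python
import heapq

# Hypothetical toy landscape: energies and a symmetric neighborhood relation.
ENERGY = {"a": 0.0, "b": 3.0, "c": 1.0, "d": 5.0, "e": 2.0, "f": 4.0}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["c", "e"]}


def saddle_height(energy, neighbors, x, y):
    """Smallest level eta at which x and y are mutually accessible."""
    best = {x: energy[x]}
    heap = [(energy[x], x)]
    while heap:
        level, u = heapq.heappop(heap)
        if u == y:
            return level
        if level > best[u]:
            continue  # stale heap entry
        for v in neighbors[u]:
            new_level = max(level, energy[v])  # highest point of the walk so far
            if new_level < best.get(v, float("inf")):
                best[v] = new_level
                heapq.heappush(heap, (new_level, v))
    return float("inf")


def flood(energy, neighbors, x_init, eta):
    """All conformations reachable from x_init by walks that stay at or below eta."""
    seen = {x for x in x_init if energy[x] <= eta}
    stack = list(seen)
    while stack:
        x = stack.pop()
        for y in neighbors[x]:
            if y not in seen and energy[y] <= eta:
                seen.add(y)
                stack.append(y)
    return seen


print(saddle_height(ENERGY, NEIGHBORS, "a", "e"))
print(sorted(flood(ENERGY, NEIGHBORS, {"a", "e"}, 3.0)))
```

In this toy landscape the walk from `a` to `e` via `f` peaks at energy 4, lower than the alternative via `d`, so the saddle height is 4; at η = 3 the minimum `e` is only part of the lower landscape if it is included in the initial set.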
  • 14. The Flooder approach [Figure: schematic energy landscape with energy axis E, flooding level Ep, and basins A, B, C, D]
  • 15. The concept of Barriers
  • 16. The algorithm of Barriers
    Require: all suboptimal secondary structures within a certain energy range from the mfe
      B ⇐ ∅
      for all x ∈ subopt do
        K ⇐ ∅
        N ⇐ generate_neighbors(x)
        for all y ∈ N do
          if b ⇐ lookup_hash(y) then
            K ⇐ K ∪ b
          end if
        end for
        if K = ∅ then
          B ⇐ B ∪ {x}
        end if
        if |K| ≥ 2 then
          merge_basins(K)
        end if
        write_hash(x)
      end for
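The flooding loop above can be sketched in a few lines of Python. The sketch assumes a small, fully enumerated, non-degenerate toy landscape (`ENERGY`, `NEIGHBORS`, both hypothetical) and uses a union-find structure for merging basins; the function `barrier_tree` is an illustration of the idea, not the Barriers program itself.

```python
# Hypothetical toy landscape with pairwise distinct energies.
ENERGY = {"a": 0.0, "b": 3.0, "c": 1.0, "d": 5.0, "e": 2.0, "f": 4.0}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["c", "e"]}


def barrier_tree(energy, neighbors):
    """Visit conformations by increasing energy and record minima and basin merges."""
    basin_of = {}  # plays the role of the hash: conformation -> basin label
    parent = {}    # union-find over local minima
    minima, merges = [], []

    def find(m):
        while parent[m] != m:
            parent[m] = parent[parent[m]]  # path compression
            m = parent[m]
        return m

    for x in sorted(energy, key=energy.get):
        touched = {find(basin_of[y]) for y in neighbors[x] if y in basin_of}
        if not touched:
            # No neighbor seen so far: x is a new local minimum.
            parent[x] = x
            basin_of[x] = x
            minima.append(x)
            continue
        ordered = sorted(touched, key=energy.get)
        root = ordered[0]
        for other in ordered[1:]:
            # x connects two basins: it is the saddle at which they merge.
            parent[other] = root
            merges.append((other, root, x))
        basin_of[x] = root
    return minima, merges


minima, merges = barrier_tree(ENERGY, NEIGHBORS)
for child, into, saddle in merges:
    print(child, "merges into", into, "at", saddle,
          "barrier", ENERGY[saddle] - ENERGY[child])
```

Each merge record corresponds to one internal node of the barrier tree, and the barrier of a minimum is the energy difference between its merge saddle and the minimum itself.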
  • 22. Information from the barrier trees: local minima; saddle points; barrier heights; gradient basins; partition functions and free energies of (gradient) basins. A gradient basin is the set of all initial points from which a gradient walk (steepest descent) ends in the same local minimum.
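Gradient basins and their free energies follow directly from this definition. The sketch below assigns every conformation of a hypothetical toy landscape to the local minimum reached by steepest descent and then evaluates the basin partition function $Z_B = \sum_{x \in B} \exp(-E(x)/kT)$ and free energy $G_B = -kT \ln Z_B$; the value of `KT` (roughly 0.6163 kcal/mol at 37 °C) is an assumption for the example.

```python
import math

# Hypothetical toy landscape with pairwise distinct energies (kcal/mol).
ENERGY = {"a": 0.0, "b": 3.0, "c": 1.0, "d": 5.0, "e": 2.0, "f": 4.0}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["c", "e"]}
KT = 0.6163  # assumed thermal energy in kcal/mol


def gradient_basins(energy, neighbors):
    """Map each local minimum to the conformations whose gradient walk ends there."""
    def descend(x):
        while True:
            lowest = min(neighbors[x], key=energy.get)
            if energy[lowest] >= energy[x]:
                return x  # no lower neighbor: x is a local minimum
            x = lowest

    basins = {}
    for x in energy:
        basins.setdefault(descend(x), []).append(x)
    return basins


def basin_free_energies(energy, basins, kT):
    """Free energy -kT ln Z of every basin, with Z summed over the basin members."""
    result = {}
    for minimum, members in basins.items():
        z = sum(math.exp(-energy[x] / kT) for x in members)
        result[minimum] = -kT * math.log(z)
    return result


basins = gradient_basins(ENERGY, NEIGHBORS)
free_energies = basin_free_energies(ENERGY, basins, KT)
print(basins)
print(free_energies)
```

A basin that contains only its minimum has a free energy equal to the energy of that minimum; additional members lower the free energy slightly.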
  • 23. Folding kinetics Biomolecules may have kinetic traps which prevent them from reaching equilibrium within the lifetime of the molecule. Long molecules are often trapped in such meta-stable states during transcription. Possible solutions are: stochastic folding simulations (predict folding pathways); predicting structures for growing fragments of the sequence; analysis of the energy landscape based on complete suboptimal folding.
  • 24. Biopolymer dynamics The probability distribution P of structures as a function of time is ruled by a set of forward equations, also known as the master equation
      $\frac{dP_t(x)}{dt} = \sum_{y \neq x} \left[ P_t(y)\, k_{xy} - P_t(x)\, k_{yx} \right]$
    Given an initial population distribution, how does the system evolve in time? (What is the population distribution after n time-steps?)
      $\frac{d}{dt} P_t = U P_t \implies P_t = e^{tU} P_0$
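The formal solution $P_t = e^{tU} P_0$ can be evaluated numerically for small systems. The sketch below uses only the standard library and computes the matrix exponential by scaling and squaring with a truncated Taylor series; this numerical scheme and the names `expm` and `propagate` are choices made for the example. Following the master equation, `U[x][y]` holds the rate $k_{xy}$ from y to x, and each diagonal entry makes its column sum to zero.

```python
import math


def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def expm(m):
    """Matrix exponential via scaling and squaring with a Taylor series."""
    n = len(m)
    norm = max(sum(abs(v) for v in row) for row in m)
    squarings = 0
    while norm > 0.5:  # scale the matrix down until the series converges quickly
        norm /= 2.0
        squarings += 1
    scaled = [[v / 2 ** squarings for v in row] for row in m]
    result = [[float(i == j) for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, 20):
        term = [[v / k for v in row] for row in mat_mul(term, scaled)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    for _ in range(squarings):
        result = mat_mul(result, result)
    return result


def propagate(u, p0, t):
    """Population distribution P_t = exp(tU) P_0."""
    e = expm([[t * v for v in row] for row in u])
    return [sum(e[i][j] * p0[j] for j in range(len(p0))) for i in range(len(p0))]


# Two-state example: rate 2 from state 0 to state 1, rate 1 back.
U = [[-2.0, 1.0],
     [2.0, -1.0]]
for t in (0.1, 1.0, 10.0):
    print(t, propagate(U, [1.0, 0.0], t))
```

For this two-state system the exact solution is $P_t(0) = 1/3 + (2/3)\, e^{-3t}$, so the population relaxes to the stationary distribution (1/3, 2/3).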
  • 25. Kinfold: A stochastic kinetic folding algorithm Simulate folding kinetics by a rejection-less Monte-Carlo type algorithm: generate all neighbors using the move-set; assign rates to each move, e.g. $P_i = \min\{1, \exp(-\Delta E / kT)\}$; select a move with probability proportional to its rate; advance the clock by $1/\sum_i P_i$. [Figure: a structure surrounded by its neighbors, with move rates P1 to P8]
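The four steps can be sketched on an explicit toy landscape. The landscape (`ENERGY`, `NEIGHBORS`), the value of `kT`, and the function `simulate` are hypothetical; the rate rule and the clock increment follow the slide, while a real simulation would generate neighbors from the move set on the fly.

```python
import math
import random

# Hypothetical toy landscape: energies and a symmetric neighborhood relation.
ENERGY = {"a": 0.0, "b": 3.0, "c": 1.0, "d": 5.0, "e": 2.0, "f": 4.0}
NEIGHBORS = {"a": ["b"], "b": ["a", "c"], "c": ["b", "d", "f"],
             "d": ["c", "e"], "e": ["d", "f"], "f": ["c", "e"]}


def simulate(energy, neighbors, start, stop, kT, rng, max_steps=100000):
    """Rejection-less Monte Carlo walk from start until stop is reached."""
    x, t = start, 0.0
    path = [x]
    for _ in range(max_steps):
        if x == stop:
            break
        moves = neighbors[x]  # all neighbors of the current state
        rates = [min(1.0, math.exp(-(energy[y] - energy[x]) / kT)) for y in moves]
        t += 1.0 / sum(rates)  # advance the clock by 1 / sum of rates
        x = rng.choices(moves, weights=rates)[0]  # pick a move proportional to its rate
        path.append(x)
    return t, path


rng = random.Random(1)  # fixed seed so that the run is reproducible
t, path = simulate(ENERGY, NEIGHBORS, "e", "a", kT=1.0, rng=rng)
print(f"reached {path[-1]} after {len(path) - 1} moves, time {t:.2f}")
```

Because every step performs a move, no time is wasted on rejected proposals; the waiting time is accounted for by the clock increment instead.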
  • 26. Treekin: Barrier tree kinetics Information from the barrier tree: local minima; saddle points; barrier heights; gradient basins; partition functions; free energies of (gradient) basins. With this information, a reduced dynamics can be formulated as a Markov process by means of macrostates (i.e. basins in the barrier tree) and Arrhenius-like transition rates between them.
      $\frac{d}{dt} P_t = U P_t \implies P_t = e^{tU} P_0$
    Macro-states form a partition of the full configuration space. Transition rates between macro-states:
      $r_{\beta\alpha} = \Gamma_{\beta\alpha} \exp\left(-(E^{*}_{\beta\alpha} - G_{\alpha})/kT\right)$
    [Figure: population percentage versus time, with a logarithmic time axis from 10^-2 to 10^6]
    M. T. Wolfinger, W. A. Svrcek-Seiler, C. Flamm, I. L. Hofacker, and P. F. Stadler. Efficient computation of RNA folding dynamics. J. Phys. A: Math. Gen., 37(17):4731–4741, 2004.
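The Arrhenius-like rates can be assembled into the rate matrix U as sketched below. The macrostate free energies `G`, the saddle energies `SADDLE`, the value of `KT`, and a constant prefactor `gamma` are hypothetical inputs for the example, and `rate_matrix` is not the interface of Treekin.

```python
import math

# Hypothetical macrostates: basin free energies and saddle energies (kcal/mol).
G = {"A": -4.0, "B": -2.5, "C": -1.0}
SADDLE = {("A", "B"): 0.5, ("B", "C"): 1.0}
KT = 0.6163  # assumed thermal energy in kcal/mol


def rate_matrix(free_energy, saddle, kT, gamma=1.0):
    """Build U with U[beta][alpha] = gamma * exp(-(E*_{beta alpha} - G_alpha) / kT)."""
    states = sorted(free_energy)
    index = {s: i for i, s in enumerate(states)}
    u = [[0.0] * len(states) for _ in states]
    for (a, b), e_star in saddle.items():
        for src, dst in ((a, b), (b, a)):
            rate = gamma * math.exp(-(e_star - free_energy[src]) / kT)
            u[index[dst]][index[src]] += rate  # gain of dst from src
            u[index[src]][index[src]] -= rate  # matching loss of src
    return states, u


states, U = rate_matrix(G, SADDLE, KT)
for state, row in zip(states, U):
    print(state, ["%.3e" % v for v in row])
```

With a symmetric prefactor and a common saddle energy for both directions, these rates satisfy detailed balance with respect to the Boltzmann weights $\exp(-G_\alpha/kT)$ of the macrostates.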
  • 27. Dynamics of a short artificial RNA CUGCGGCUUUGGCUCUAGCC, n = 20 [Figure: barrier tree of the energy landscape, and population probability versus time (logarithmic time axis) for the mfe structure and selected local minima]
  • 28. Dynamics of tRNA GCGGAUUUAGCUCAGDDGGGAGAGCGCCAGACUGAAYAUCUGGAGGUCCUGUGTPCGAUCCACAGAAUUCGCACCA [Figure: two panels showing population probability versus time (logarithmic time axis) for the mfe structure and selected local minima]
  • 29. Dynamics of lattice proteins: HEX/TET lattice NNHHPPNNPHHHHPXP, n = 16 [Figure: for each lattice, the barrier tree and the population probability versus time (logarithmic time axis) for selected local minima, including curves labeled (pinfold)]
  • 30. Barrier tree kinetics - problems and pitfalls The method works fine for moderately sized systems. Currently, we consider approx. 100 million structures within a single run of Barriers to calculate the topology of the landscape. However, we are interested in larger systems: biologically relevant RNA switches; large 3D lattice proteins. The next steps: use high-level diagonalization routines for sparse matrices; calculate low-energy structures; sample (thermodynamic properties of) individual basins; sample low-energy refolding paths.
  • 32. The PathFinder tool A heuristic approach to efficiently estimate low-energy refolding paths. Overall procedure for direct paths:
    1. Calculate the distance between start and target structure.
    2. Generate all neighbors of the start structure whose distance to the target is less than the distance of the start structure.
    3. Sort those neighbor structures by their energies.
    4. Take the n energetically best structures, take them as new starting points and repeat the procedure until the stop structure is found.
    5. If a path has been found, try to find another one with a lower energy barrier.
    [Figure: two panels showing energy versus distance to the target (8 down to 0) for paths from START to STOP]
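Steps 1 to 4 amount to a breadth-limited search over direct paths. The sketch below represents structures as sets of base pairs, uses the base pair distance (size of the symmetric difference), and takes a hypothetical toy energy that simply rewards every base pair; a realistic application would plug in a proper energy function. The names `direct_path`, `compatible`, `barrier`, and `toy_energy` are assumptions, and step 5 (repeating the search for a lower barrier) is left out.

```python
def compatible(pairs, new):
    """True if the pair `new` shares no base with, and does not cross, any pair in `pairs`."""
    i, j = new
    for k, l in pairs:
        if len({i, j, k, l}) < 4 or k < i < l < j or i < k < j < l:
            return False
    return True


def barrier(path, energy):
    """Highest energy encountered along a path."""
    return max(energy(s) for s in path)


def direct_path(start, target, energy, width=10):
    """Breadth-limited search in which every move reduces the distance to the target by one."""
    start, target = frozenset(start), frozenset(target)
    beam = [[start]]
    while beam[0][-1] != target:
        candidates = {}
        for path in beam:
            s = path[-1]
            steps = [s - {p} for p in s - target]  # remove a pair absent from the target
            steps += [s | {p} for p in target - s if compatible(s, p)]  # add a target pair
            for nxt in steps:
                new_path = path + [nxt]
                old = candidates.get(nxt)
                if old is None or barrier(new_path, energy) < barrier(old, energy):
                    candidates[nxt] = new_path
        # Keep the `width` energetically best structures as new starting points.
        beam = sorted(candidates.values(), key=lambda p: energy(p[-1]))[:width]
    return min(beam, key=lambda p: barrier(p, energy))


def toy_energy(structure):
    """Hypothetical energy: every base pair contributes -1."""
    return -len(structure)


path = direct_path({(0, 9), (1, 8)}, {(2, 12), (3, 11)}, toy_energy)
print([sorted(s) for s in path], barrier(path, toy_energy))
```

In the example both start pairs cross the target pairs, so the path has to open the start helix completely before the target helix can form, and the barrier equals the energy of the open chain.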
  • 33. PathFinder example - SV11 SV11 is an RNA switch of length 115. Metastable structure: E = −69.2 kcal/mol, template for Qβ replicase. Stable structure: E = −96.4 kcal/mol, not a template.
  • 34. SV11 refolding path 1/3: Esaddle = −52.2 kcal/mol [Figure: energy (kcal/mol) versus steps along the refolding path]
  • 35. SV11 refolding path 2/3: Esaddle = −57.7 kcal/mol [Figure: energy (kcal/mol) versus steps along the refolding path]
  • 36. SV11 refolding path 3/3: Esaddle = −59.2 kcal/mol [Figure: energy (kcal/mol) versus steps along the refolding path, with drawings of intermediate structures]
  • 37. SV11 refolding paths [Figure: energy (kcal/mol) versus steps for the refolding paths]
  • 38. libPF - a generic path sampling library In practice, this path sampling heuristic is implemented as a C library. All structures along a path are stored in a hash and are therefore available for the next iterations. Heuristic routines are strictly separated from model-dependent routines, i.e. the library is completely generic. Currently, RNA secondary structures and lattice proteins are implemented. It is easy to extend the functionality to other discrete systems.
  • 39. Conclusion Discrete models allow a detailed study of the energy surface. Barrier trees represent the landscape topology. The lower part of the energy landscape is accessible by a flooding approach. Macrostate folding kinetics reduces simulation time drastically. A path sampling approach yields low-energy refolding paths and is a valuable tool for further kinetics studies.
  • 45. References
    Christoph Flamm, Ivo Hofacker, Peter Stadler
    Rolf Backofen, Sebastian Will, Martin Mann
    M. T. Wolfinger, W. A. Svrcek-Seiler, C. Flamm, I. L. Hofacker, and P. F. Stadler. Efficient computation of RNA folding dynamics. J. Phys. A: Math. Gen., 37(17):4731–4741, 2004.
    M. T. Wolfinger, S. Will, I. L. Hofacker, R. Backofen, and P. F. Stadler. Exploring the lower part of discrete polymer model energy landscapes. Europhys. Lett., 74(4):725–732, 2006.
    C. Flamm, W. Fontana, I. L. Hofacker, and P. Schuster. RNA folding at elementary step resolution. RNA, 6:325–338, 2000.
    C. Flamm, I. L. Hofacker, P. F. Stadler, and M. T. Wolfinger. Barrier trees of degenerate landscapes. Z. Phys. Chem., 216:155–173, 2002.