Characterization of Chemical Libraries Using Scaffolds and Network Models
Dac-Trung Nguyen, Rajarshi Guha
NIH NCATS
ACS National Meeting, Boston 2015
  
Outline
[Graphical slide; only the label “OR” survives in the transcript]
  
Motivations
• Library comparison usually driven by a need to construct or expand a library
  – Often with constraints on resources
• Two classes of features to consider
  – Compound-centric (physchem properties, bioactivity, target preferences)
  – Library-centric (diversity, chemical space coverage)
• Library comparisons generally reduce to
  – Distributions of compound features (univariate)
  – Overlap in some chemical space (multivariate)
  
Comparing Libraries
• Most comparisons employ a reduced (numerical) representation of the structure
  – Fingerprints, BCUTs, physicochemical descriptors
• Perform comparisons in the new space
  – PCA, SOM, MDS, GTM, …
Schamberger et al, DDT, 2011, 16, 636-641; Kireeva et al, Mol. Inf., 2012, 31, 301-312
  
Scaffolds & Networks
• Scaffolds represent a chemically meaningful reduced representation of the structures
• Can be challenging to define what a (good) scaffold is
• A network representation of the collection of structures allows for novel ways to perform library comparisons
  – How fine grained can such comparisons be?
  
Scaffold Network Representations
• Scaffolds are generated by exhaustive enumeration of the SSSR (smallest set of smallest rings)
• Scaffolds are nodes, connected by directed edges
• Nodes are labeled by a hash key of the scaffold
[Figures: scaffold networks for 4 compounds and for 1912 compounds]
  
Scaffold Network Construction
• A scaffold network is a directed graph
• Edges denote sub/super-structure relationships between scaffolds
• Each node in the network represents a unique scaffold
• Singletons are acyclic molecules
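As a rough illustration of the construction described above, the sketch below builds a tiny scaffold network in plain Python. The scaffold strings, the `hash_key` helper, the SHA-1 choice and the `ScaffoldNetwork` class are all assumptions made for the example: the slides do not say how the hash key is computed, and real sub/superstructure relationships would come from a cheminformatics toolkit.

```python
import hashlib
from collections import defaultdict

def hash_key(scaffold: str) -> str:
    """Hypothetical node label: a short hash of a canonical scaffold string."""
    return hashlib.sha1(scaffold.encode("utf-8")).hexdigest()[:12]

class ScaffoldNetwork:
    """Directed graph; an edge parent -> child means child is a substructure of parent."""

    def __init__(self):
        self.label = {}                    # hash key -> scaffold string
        self.children = defaultdict(set)   # hash key -> hash keys of sub-scaffolds

    def add_scaffold(self, scaffold):
        key = hash_key(scaffold)
        self.label[key] = scaffold         # identical scaffolds collapse to one node
        return key

    def add_relationship(self, superstructure, substructure):
        parent = self.add_scaffold(superstructure)
        child = self.add_scaffold(substructure)
        self.children[parent].add(child)

    def isolated(self):
        """Nodes with no incoming or outgoing edges."""
        linked = set(self.children)
        for kids in self.children.values():
            linked |= kids
        return {k for k in self.label if k not in linked}

net = ScaffoldNetwork()
net.add_relationship("c1ccc2ccccc2c1", "c1ccccc1")    # naphthalene contains benzene
net.add_relationship("c1ccc2[nH]ccc2c1", "c1ccccc1")  # indole contains benzene
net.add_scaffold("CCCC")  # acyclic molecule stored as an isolated node (assumption)
```

Because nodes are keyed by hash, the benzene scaffold reached from two different superstructures ends up as a single shared node.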
  
	
  
Datasets
CL1420, 31320 compounds
CL886, 3552 compounds
MIPE, 1920 compounds
Natural Products, 5000 compounds
LOPAC, 1280 compounds
Mathews and Guha et al, PNAS, 2014, 111, 11365; Singh et al, JCIM, 2009, 49, 1010
Library descriptions on the slide: “Approved, investigational drugs, constructed for functional diversity”; “Diverse library, designed for enrichment of bioactivity”
Network sizes (in slide order; the pairing with libraries is not recoverable from the transcript):
• 1079 nodes, 115287 edges, 69 trees
• 2131 nodes, 1843 edges, 129 trees
• 15283 nodes, 13622 edges, 729 trees
• 5563 nodes, 4832 edges, 239 trees
• 23716 nodes, 21468 edges, 750 trees
  
Scaffold Network Representations
• The overall structure of the complete network can characterize the library
• But distributions of vertex-level network metrics may be informative
• We can also consider approaches to identify “important” scaffolds
  
Metrics for the Complete Network
• Examined vertex-level measures of centrality
  – Closeness, betweenness, …
  – High similarity of MIPE & NP and low similarity of LOPAC & NP is surprising (Ertl et al, JCIM, 2008)
  
[Figure: density of log10(Betweenness) by library (CL1420, CL886, LOPAC, MIPE, NP)]
[Figure: number of scaffolds vs. log closeness (in-degree) by library (CL1420, CL886, LOPAC, MIPE, NP)]
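To make the vertex-level measures concrete, here is a minimal sketch of closeness centrality computed with a breadth-first search. The normalization (reachable nodes divided by summed distance), the toy graph and the function names are assumptions; the slides do not state which definition or software was used. Reversing the edges gives the variant that follows incoming edges.

```python
from collections import deque

def closeness(adj, source):
    """Closeness of `source`: reachable-node count divided by summed BFS distance."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj.get(u, ()):
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total else 0.0

def reverse(adj):
    """Flip every edge so that BFS follows incoming edges instead."""
    rev = {}
    for u, targets in adj.items():
        rev.setdefault(u, set())
        for v in targets:
            rev.setdefault(v, set()).add(u)
    return rev

# Toy network: edges point from a superstructure to its sub-scaffolds.
toy = {"ABC": {"AB", "BC"}, "AB": {"A", "B"}, "BC": {"B", "C"}}
toy_rev = reverse(toy)
out_closeness = {n: closeness(toy, n) for n in toy_rev}
in_closeness = {n: closeness(toy_rev, n) for n in toy_rev}
```

Collecting these per-node values for every library gives the kind of distribution that the density plots compare.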
Metrics for the Complete Network
• Useful to summarize distributions by scalar metrics
• Path length metrics are not discriminatory due to many short paths
• Extent of clustering differs but is quite low overall
[Figure: centralization, CPL and transitivity values for each library (CL1420, CL886, LOPAC, MIPE, NP)]
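Two of the scalar summaries named in the figure can be sketched in a few lines, assuming that CPL stands for characteristic path length and that the network is treated as undirected for these metrics (both are assumptions, as are the function names and the toy graph).

```python
from collections import deque
from itertools import combinations

def characteristic_path_length(adj):
    """Mean shortest-path length over all connected node pairs."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs if pairs else 0.0

def transitivity(adj):
    """Global clustering: closed triples divided by all connected triples."""
    closed = triples = 0
    for u in adj:
        for v, w in combinations(adj[u], 2):
            triples += 1
            if w in adj[v]:
                closed += 1
    return closed / triples if triples else 0.0

# Undirected toy graph: a triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
cpl = characteristic_path_length(toy)
trans = transitivity(toy)
```

A tree-like network has no triangles, so its transitivity is zero, which is consistent with the low clustering values noted above.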
  
Comparing Complete Networks
• Library overlap is characterized by the set of common scaffolds
• Scaffolds can be ranked (e.g., PageRank)
  – Small fragments have low PR
  – Large frameworks have high PR
  – Interesting scaffolds lie in between?
• Similar libraries will have common scaffolds with similar PageRank values
[Diagram: for each library, PageRank vector → subset to common fragments; the two subsets are compared by a normalized dot product]

Comparing Complete Networks
Pairwise comparison matrix:
        CL1420  CL886  LOPAC  MIPE  NP
CL1420  1       0      0      0     0
CL886   0       1      0      0     0
LOPAC   0       0      1      0.2   0.3
MIPE    0       0      0.2    1     0.3
NP      0       0      0.3    0.3   1
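The comparison pipeline in the diagram (PageRank per library, restrict to common scaffolds, normalized dot product) can be sketched as follows. This is a plain power-iteration PageRank, not necessarily the implementation used for the slides; the damping factor, the edge direction, the handling of nodes without outgoing edges and the toy libraries are assumptions.

```python
import math

def pagerank(adj, damping=0.85, iterations=100):
    """Power-iteration PageRank on a directed graph given as {node: set(successors)}."""
    nodes = set(adj)
    for targets in adj.values():
        nodes |= set(targets)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without outgoing edges is spread evenly.
        dangling = sum(rank[v] for v in nodes if not adj.get(v))
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for u, targets in adj.items():
            for v in targets:
                new[v] += damping * rank[u] / len(targets)
        rank = new
    return rank

def pagerank_similarity(rank_a, rank_b):
    """Normalized dot product (cosine) over the scaffolds common to both networks."""
    common = sorted(set(rank_a) & set(rank_b))
    if not common:
        return 0.0
    dot = sum(rank_a[k] * rank_b[k] for k in common)
    norm_a = math.sqrt(sum(rank_a[k] ** 2 for k in common))
    norm_b = math.sqrt(sum(rank_b[k] ** 2 for k in common))
    return dot / (norm_a * norm_b)

# Two toy libraries keyed by hypothetical scaffold hashes; they share AB, A and B.
lib_a = {"ABC": {"AB", "BC"}, "AB": {"A", "B"}, "BC": {"B", "C"}}
lib_b = {"ABD": {"AB"}, "AB": {"A", "B"}}
rank_a, rank_b = pagerank(lib_a), pagerank(lib_b)
similarity = pagerank_similarity(rank_a, rank_b)
```

Libraries with no scaffold in common score 0 under this sketch, matching the zero entries in the matrix above.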
Scaffold Recognition
• What is a scaffold?
• Can be addressed through the scaffold network
  – A scaffold is a hub within the scaffold network
• Provides a practical answer to “What are the missing scaffolds in my library?”
  
Scaffold Comparison
• Examples of unique scaffolds in MIPE but not in NP
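Both ideas on these two slides reduce to simple operations once scaffolds are keyed by hash. The sketch below treats “missing” scaffolds as a set difference and “hubs” as nodes above a degree threshold; the threshold, the hash keys `h1`…`h9` and the function names are hypothetical.

```python
def unique_scaffolds(library_a, library_b):
    """Hash keys present in library_a but absent from library_b."""
    return set(library_a) - set(library_b)

def hubs(adj, min_degree):
    """Nodes whose total degree (in + out) is at least min_degree."""
    degree = {}
    for u, targets in adj.items():
        degree[u] = degree.get(u, 0) + len(targets)
        for v in targets:
            degree[v] = degree.get(v, 0) + 1
    return {v for v, d in degree.items() if d >= min_degree}

# Toy network over made-up hash keys.
network = {"h1": {"h2", "h3", "h4"}, "h2": {"h4"}}
only_in_first = unique_scaffolds({"h1", "h2", "h3", "h4"}, {"h2", "h4", "h9"})
hub_nodes = hubs(network, min_degree=3)
```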
  
Reduced Network Representation
• The complete network can be reduced to a forest of trees
• Order nodes by out-degree
• From each node, traverse network until a terminal node is reached
• Result is a set of spanning trees
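The steps above leave several details open, so the following is only one possible reading: visit nodes in order of decreasing out-degree, walk depth-first from each unvisited node, and let every node keep the first parent that reaches it. The tie-breaking by name, the function name and the toy network are assumptions.

```python
def reduce_to_forest(adj):
    """Spanning forest of a directed graph {node: set(successors)}.

    Returns {node: parent}; roots map to None.
    """
    nodes = set(adj)
    for targets in adj.values():
        nodes |= set(targets)
    # Visit high out-degree nodes first; ties are broken by name for determinism.
    order = sorted(nodes, key=lambda v: (-len(adj.get(v, ())), v))
    parent = {}
    for root in order:
        if root in parent:
            continue
        parent[root] = None
        stack = [root]
        while stack:                      # depth-first walk down to terminal nodes
            u = stack.pop()
            for v in sorted(adj.get(u, ())):
                if v not in parent:       # each node keeps exactly one parent
                    parent[v] = u
                    stack.append(v)
    return parent

toy = {
    "ABC": {"AB", "BC", "AC"},
    "AB": {"A", "B"},
    "BC": {"B", "C"},
    "AC": {"A", "C"},
    "XY": {"X"},
}
forest = reduce_to_forest(toy)
roots = {node for node, p in forest.items() if p is None}
```

In the toy network the fragment `B` is reachable from both `AB` and `BC`; the reduction keeps only one of those edges, which is what turns the network into trees.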
  
Reduced Network Representation
[Figure: MIPE, 1912 compounds]
  
Network Structure
• A scaffold forest is characterized by
  – Disconnected components
    • structurally related scaffolds, scaffold diversity
  – Singletons
    • scaffolds with no superstructure
  – Branching within connected components
    • scaffold complexity
  
Forest Size vs Library Size
• A large library doesn’t imply a large forest
• Forest size is a function of scaffold diversity
[Figures: CL1420, 31K combinatorial library; MIPE, 1912 (target) diverse library]
  
Summarizing Forests
• A key feature is the nature of branching in individual trees
• Characterized by ID, an information theoretic descriptor of branching derived from the distance matrix
Bonchev & Trinajstić, IJQC, 1978, 14, 293-303
[Figure panels: ID = 978; ID = 90794; ID = 3456; ID = 979252]
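For orientation, here is a sketch of one commonly quoted Bonchev-Trinajstić index, the total information on distance magnitudes, I_D = W·log2(W) − Σ_d g_d·d·log2(d), where W is the sum of all pairwise distances and g_d the number of node pairs at distance d. Whether the slides use exactly this variant is an assumption, as are the function names; the input is an undirected tree given as a symmetric adjacency dict.

```python
import math
from collections import Counter, deque

def distance_counts(adj):
    """Count unordered node pairs at each shortest-path distance (BFS from every node)."""
    counts = Counter()
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        counts.update(d for d in dist.values() if d > 0)
    return {d: c // 2 for d, c in counts.items()}   # every pair was seen twice

def information_on_distances(adj):
    """I_D = W*log2(W) - sum_d g_d * d * log2(d), with W the sum of all distances."""
    g = distance_counts(adj)
    w = sum(d * c for d, c in g.items())
    if w == 0:
        return 0.0
    return w * math.log2(w) - sum(c * d * math.log2(d) for d, c in g.items())

path3 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b"}}
path4 = {"a": {"b"}, "b": {"a", "c"}, "c": {"b", "d"}, "d": {"c"}}
star4 = {"hub": {"x", "y", "z"}, "x": {"hub"}, "y": {"hub"}, "z": {"hub"}}
id_path3 = information_on_distances(path3)
id_path4 = information_on_distances(path4)
id_star4 = information_on_distances(star4)
```

With this definition the four-node star scores lower than the four-node path, so the value responds to how a tree branches and not only to its size.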
  
Summarizing Forests
• Distribution of ID distinguishes datasets primarily in the tails
• Aggregating by mean ID still discriminates well
  – Driven by the tails
[Figure: density of log10(ID) by library (CL1420, CL886, LOPAC, MIPE, NP)]
[Figure: mean log10(ID) for each library (CL1420, CL886, LOPAC, MIPE, NP)]

Exploring the Forest
• The metric also allows us to drill down
  – Select scaffolds of given branching complexity
  – Identify scaffolds of given complexity range across different libraries (equivalent to finding holes in scaffold coverage)
[Figure: LOPAC, ID = 10214 ≈ MIPE, ID = 10197]
  
Library Comparison via Merging
• … reduces to comparing networks
• We compute a graph union and construct new edges between nodes with the same hash
• How does the network structure of the union differ from the original networks?
• Can be extended to merge more than two networks
  
Source Forests
• Structurally similar networks
• 2659 identical nodes
• Construct union by connecting nodes with identical hash
[Figures: LOPAC forest; MIPE forest]
  
Merged Network
• Green edges “bridge” the two networks
• Trees can now have two types of nodes
• How can we characterize the
  – Contraction?
  – Degree of mixing?
  
Contraction to Measure Overlap
• Merging very similar libraries should generate a smaller forest compared to the original forests

$C_{norm} = \frac{|F_{12}|}{|F_1| + |F_2|}$, where $F_i = \{G_{1i}, G_{2i}, \ldots, G_{ni}\}$

• But this doesn’t really describe how the individual trees become (more) connected
[Figure: Cnorm for the library pairs CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]

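The merge and the contraction ratio can be sketched together. The code assumes that each forest is given as a set of node hash keys plus a set of edges, that a "tree" is a connected component (isolated nodes included), and that sharing a hash key is equivalent to adding a bridging edge between the two copies of a node; these choices and all names are assumptions.

```python
def count_trees(nodes, edges):
    """Number of connected components, via union-find with path halving."""
    parent = {v: v for v in nodes}

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]
            v = parent[v]
        return v

    for u, v in edges:
        parent[find(u)] = find(v)
    return len({find(v) for v in nodes})

def contraction(forest_1, forest_2):
    """C_norm = trees in the merged forest / (trees in forest 1 + trees in forest 2)."""
    nodes_1, edges_1 = forest_1
    nodes_2, edges_2 = forest_2
    merged = count_trees(nodes_1 | nodes_2, edges_1 | edges_2)
    return merged / (count_trees(nodes_1, edges_1) + count_trees(nodes_2, edges_2))

# Toy forests over hypothetical hash keys; "b" and "c" occur in both libraries.
forest_1 = ({"a", "b", "c", "d"}, {("a", "b"), ("c", "d")})
forest_2 = ({"b", "c", "e"}, {("b", "c")})
disjoint = ({"x", "y"}, {("x", "y")})
c_overlap = contraction(forest_1, forest_2)
c_disjoint = contraction(forest_1, disjoint)
```

Two libraries with nothing in common give a ratio of 1, and the ratio drops as shared scaffolds pull trees together.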
Assortativity to Measure Overlap
• Quantifies the notion that “like connects to like”
• Undefined for trees that only have one type of vertex (i.e., only from a single library)
• The number of trees that are assortative is a global indicator of library similarity
Newman, Phys. Rev. E., 2003, 026126
[Figure: % of trees, assortative vs. not assortative, for CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]
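A minimal sketch of Newman's assortativity coefficient for a categorical node attribute, r = (Σ e_ii − Σ a_i·b_i) / (1 − Σ a_i·b_i), applied to one merged tree whose nodes are tagged by source library. Treating the tree as undirected and the tags "L" and "M" are assumptions for the example.

```python
def assortativity(edges, kind):
    """Newman's assortativity coefficient for a categorical node attribute.

    edges: iterable of (u, v) undirected edges; kind: {node: category}.
    Returns None when it is undefined (for example, only one category present).
    """
    edges = list(edges)
    if not edges:
        return None
    ends = 2 * len(edges)                 # each edge is counted in both directions
    same = 0.0                            # fraction of edge ends joining equal categories
    share = {}                            # fraction of edge ends attached to each category
    for u, v in edges:
        if kind[u] == kind[v]:
            same += 2 / ends
        for node in (u, v):
            share[kind[node]] = share.get(kind[node], 0.0) + 1 / ends
    expected = sum(a * a for a in share.values())
    if abs(1.0 - expected) < 1e-12:
        return None
    return (same - expected) / (1.0 - expected)

# A path a-b-c-d whose first two nodes come from one library and last two from another.
tree_edges = [("a", "b"), ("b", "c"), ("c", "d")]
source = {"a": "L", "b": "L", "c": "M", "d": "M"}
r_mixed = assortativity(tree_edges, source)
r_single = assortativity(tree_edges, {n: "L" for n in source})
```

The `None` result for a single-library tree corresponds to the undefined case mentioned in the second bullet.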
  
Assortativity to Measure Overlap
• We then examine the distribution of assortativity across assortative trees
• Dissimilar libraries have few assortative trees
  – But they have high values of assortativity
• However, high assortativity doesn’t imply high overlap
[Figure: density of assortativity for CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]
  
Assortativity to Measure Overlap
[Figures: Assortativity = 0.85 (MIPE & NP); Assortativity = 0.95 (CL886 & CL1420)]
  
Overlap via Tree Complexity
• Similar libraries lead to fewer trees in the merged network, but also denser trees
• Change in density (branching) across the forest can also measure the extent of overlap
[Figures: MIPE; LOPAC; Merged]
  
Summarizing via Tree Complexity
• Distributions of ID before and after merging don’t differ very much, visually
• However, a KS test does discriminate them
[Figures: density of log10(ID), individual vs. merged forests, for CL886 / CL1420 and MIPE / NP]
KS results in slide order: D = 0.0173, p = 1; D = 0.0582, p = .0008
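The D values quoted above are two-sample Kolmogorov-Smirnov statistics. A small sketch of how D is obtained from two samples follows; it computes only the statistic, not the p-value, and the function name and sample values are made up for illustration.

```python
from bisect import bisect_right

def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov D: the largest gap between the empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    d = 0.0
    for x in a + b:                       # the gap can only change at observed values
        cdf_a = bisect_right(a, x) / len(a)
        cdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(cdf_a - cdf_b))
    return d

# Hypothetical log10(ID) values before and after merging.
individual = [1.0, 2.0, 3.0, 4.0]
merged = [3.0, 4.0, 5.0, 6.0]
d_shifted = ks_statistic(individual, merged)
d_same = ks_statistic(individual, individual)
```

Because D is the maximum vertical gap between the two cumulative curves, distributions that look alike as density plots can still be separated when the samples are large.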
  
Summary
• Scaffold networks are a relatively objective way to characterize & compare libraries
  – Supports fast comparisons between libraries
• The approach supports multiplexing information into a single data structure
  – Physchem properties, bioactivities, …
• “What is a good comparison?” quickly becomes a philosophical question
  

More Related Content

Viewers also liked

Molecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryMolecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryJeremy Yang
 
Cannes lions 2016 -150 Slides Plus
Cannes lions 2016 -150 Slides PlusCannes lions 2016 -150 Slides Plus
Cannes lions 2016 -150 Slides PlusZohar Urian
 
Ad Blocking: A Consumer Right.
Ad Blocking: A Consumer Right. Ad Blocking: A Consumer Right.
Ad Blocking: A Consumer Right. Shine Technologies
 
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...Samba Tech
 
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans DBL Law
 
2011 Flash Games Market Survey
2011 Flash Games Market Survey2011 Flash Games Market Survey
2011 Flash Games Market Surveymochimedia
 
Why Can't I Be Happy?
Why Can't I Be Happy?Why Can't I Be Happy?
Why Can't I Be Happy?OH TEIK BIN
 
Understanding intent data raab
Understanding intent data raabUnderstanding intent data raab
Understanding intent data raabdraab
 
Pitching Like a Boss - Silicon Valley Comes to the Baltics 2014
Pitching Like a Boss  - Silicon Valley Comes to the Baltics 2014Pitching Like a Boss  - Silicon Valley Comes to the Baltics 2014
Pitching Like a Boss - Silicon Valley Comes to the Baltics 2014Vitaly Golomb
 
When God is Silent | Elijah on Mt. Horeb
When God is Silent | Elijah on Mt. HorebWhen God is Silent | Elijah on Mt. Horeb
When God is Silent | Elijah on Mt. HorebSteve Thomason
 
Talent Base ja Nitor Creations: Pragmatic Agile
Talent Base ja Nitor Creations: Pragmatic AgileTalent Base ja Nitor Creations: Pragmatic Agile
Talent Base ja Nitor Creations: Pragmatic AgileLoihde Advisory
 
Lead gen, sales & budget model sample
Lead gen, sales & budget model sampleLead gen, sales & budget model sample
Lead gen, sales & budget model sampleHeinz Marketing Inc
 
Starting With Strengths: The Stories We Build #edfling
Starting With Strengths: The Stories We Build #edflingStarting With Strengths: The Stories We Build #edfling
Starting With Strengths: The Stories We Build #edflingChris Wejr
 
Biomeiler nach Jean Pain
Biomeiler nach Jean PainBiomeiler nach Jean Pain
Biomeiler nach Jean PainOlaf Sadzio
 
Relentless Mobile Threats to Avoid
Relentless Mobile Threats to AvoidRelentless Mobile Threats to Avoid
Relentless Mobile Threats to AvoidLookout
 

Viewers also liked (18)

Molecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discoveryMolecular scaffolds are special and useful guides to discovery
Molecular scaffolds are special and useful guides to discovery
 
Cannes lions 2016 -150 Slides Plus
Cannes lions 2016 -150 Slides PlusCannes lions 2016 -150 Slides Plus
Cannes lions 2016 -150 Slides Plus
 
Ad Blocking: A Consumer Right.
Ad Blocking: A Consumer Right. Ad Blocking: A Consumer Right.
Ad Blocking: A Consumer Right.
 
Digital expert class
Digital expert classDigital expert class
Digital expert class
 
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...
QConSP 2014 SambaTech Analytics: Arquiteturas e tecnologias por trás da análi...
 
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans
Health Insurance CO-OPs: Consumer Operated and Oriented Health Plans
 
2011 Flash Games Market Survey
2011 Flash Games Market Survey2011 Flash Games Market Survey
2011 Flash Games Market Survey
 
Trabajo de investigacion
Trabajo de investigacionTrabajo de investigacion
Trabajo de investigacion
 
Why Can't I Be Happy?
Why Can't I Be Happy?Why Can't I Be Happy?
Why Can't I Be Happy?
 
Understanding intent data raab
Understanding intent data raabUnderstanding intent data raab
Understanding intent data raab
 
Security scam
Security scamSecurity scam
Security scam
 
Pitching Like a Boss - Silicon Valley Comes to the Baltics 2014
Pitching Like a Boss  - Silicon Valley Comes to the Baltics 2014Pitching Like a Boss  - Silicon Valley Comes to the Baltics 2014
Pitching Like a Boss - Silicon Valley Comes to the Baltics 2014
 
When God is Silent | Elijah on Mt. Horeb
When God is Silent | Elijah on Mt. HorebWhen God is Silent | Elijah on Mt. Horeb
When God is Silent | Elijah on Mt. Horeb
 
Talent Base ja Nitor Creations: Pragmatic Agile
Talent Base ja Nitor Creations: Pragmatic AgileTalent Base ja Nitor Creations: Pragmatic Agile
Talent Base ja Nitor Creations: Pragmatic Agile
 
Lead gen, sales & budget model sample
Lead gen, sales & budget model sampleLead gen, sales & budget model sample
Lead gen, sales & budget model sample
 
Starting With Strengths: The Stories We Build #edfling
Starting With Strengths: The Stories We Build #edflingStarting With Strengths: The Stories We Build #edfling
Starting With Strengths: The Stories We Build #edfling
 
Biomeiler nach Jean Pain
Biomeiler nach Jean PainBiomeiler nach Jean Pain
Biomeiler nach Jean Pain
 
Relentless Mobile Threats to Avoid
Relentless Mobile Threats to AvoidRelentless Mobile Threats to Avoid
Relentless Mobile Threats to Avoid
 

Similar to Characterization of Chemical Libraries Using Scaffolds and Network Models

2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 TutorialAlexander Pico
 
Cytoscape Network Visualization and Analysis
Cytoscape Network Visualization and AnalysisCytoscape Network Visualization and Analysis
Cytoscape Network Visualization and Analysisbdemchak
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Spark Summit
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 TutorialAlexander Pico
 
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...Institute of Information Systems (HES-SO)
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learningsun peiyuan
 
Network Visualization and Analysis with Cytoscape
Network Visualization and Analysis with CytoscapeNetwork Visualization and Analysis with Cytoscape
Network Visualization and Analysis with CytoscapeAlexander Pico
 
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​EdwinJacob5
 
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networks
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor NetworksA Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networks
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networksiosrjce
 
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...IJEEE
 
Database Models, Client-Server Architecture, Distributed Database and Classif...
Database Models, Client-Server Architecture, Distributed Database and Classif...Database Models, Client-Server Architecture, Distributed Database and Classif...
Database Models, Client-Server Architecture, Distributed Database and Classif...Rubal Sagwal
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsYasin Memari
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platformsSyed Zaid Irshad
 
6.1-Cassandra.ppt
6.1-Cassandra.ppt6.1-Cassandra.ppt
6.1-Cassandra.pptDanBarcan2
 
algoritma klastering.pdf
algoritma klastering.pdfalgoritma klastering.pdf
algoritma klastering.pdfbintis1
 

Similar to Characterization of Chemical Libraries Using Scaffolds and Network Models (20)

2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial2016 Cytoscape 3.3 Tutorial
2016 Cytoscape 3.3 Tutorial
 
Cytoscape Network Visualization and Analysis
Cytoscape Network Visualization and AnalysisCytoscape Network Visualization and Analysis
Cytoscape Network Visualization and Analysis
 
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
Deduplication and Author-Disambiguation of Streaming Records via Supervised M...
 
2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial2015 Cytoscape 3.2 Tutorial
2015 Cytoscape 3.2 Tutorial
 
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
MUSYOP: Towards a Query Optimization for Heterogeneous Distributed Database S...
 
NoSql
NoSqlNoSql
NoSql
 
network mining and representation learning
network mining and representation learningnetwork mining and representation learning
network mining and representation learning
 
Network Visualization and Analysis with Cytoscape
Network Visualization and Analysis with CytoscapeNetwork Visualization and Analysis with Cytoscape
Network Visualization and Analysis with Cytoscape
 
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​
RELATIONAL MODEL OF DATABASES AND OTHER CONCEPTS OF DATABASES​
 
G010633439
G010633439G010633439
G010633439
 
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networks
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor NetworksA Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networks
A Review of Atypical Hierarchical Routing Protocols for Wireless Sensor Networks
 
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...
Chain Based Wireless Sensor Network Routing Using Hybrid Optimization (HBO An...
 
Database Models, Client-Server Architecture, Distributed Database and Classif...
Database Models, Client-Server Architecture, Distributed Database and Classif...Database Models, Client-Server Architecture, Distributed Database and Classif...
Database Models, Client-Server Architecture, Distributed Database and Classif...
 
Challenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data GenomicsChallenges and Opportunities of Big Data Genomics
Challenges and Opportunities of Big Data Genomics
 
Physical organization of parallel platforms
Physical organization of parallel platformsPhysical organization of parallel platforms
Physical organization of parallel platforms
 
6.1-Cassandra.ppt
6.1-Cassandra.ppt6.1-Cassandra.ppt
6.1-Cassandra.ppt
 
Cassandra
CassandraCassandra
Cassandra
 
6.1-Cassandra.ppt
6.1-Cassandra.ppt6.1-Cassandra.ppt
6.1-Cassandra.ppt
 
Computer network
Computer networkComputer network
Computer network
 
algoritma klastering.pdf
algoritma klastering.pdfalgoritma klastering.pdf
algoritma klastering.pdf
 

More from Rajarshi Guha

Pharos: A Torch to Use in Your Journey in the Dark Genome
Pharos: A Torch to Use in Your Journey in the Dark GenomePharos: A Torch to Use in Your Journey in the Dark Genome
Pharos: A Torch to Use in Your Journey in the Dark GenomeRajarshi Guha
 
Pharos: Putting targets in context
Pharos: Putting targets in contextPharos: Putting targets in context
Pharos: Putting targets in contextRajarshi Guha
 
Pharos – A Torch to Use in Your Journey In the Dark Genome
Pharos – A Torch to Use in Your Journey In the Dark GenomePharos – A Torch to Use in Your Journey In the Dark Genome
Pharos – A Torch to Use in Your Journey In the Dark GenomeRajarshi Guha
 
Pharos - Face of the KMC
Pharos - Face of the KMCPharos - Face of the KMC
Pharos - Face of the KMCRajarshi Guha
 
Enhancing Prioritization & Discovery of Novel Combinations using an HTS Platform
Enhancing Prioritization & Discovery of Novel Combinations using an HTS PlatformEnhancing Prioritization & Discovery of Novel Combinations using an HTS Platform
Enhancing Prioritization & Discovery of Novel Combinations using an HTS PlatformRajarshi Guha
 
What can your library do for you?
What can your library do for you?What can your library do for you?
What can your library do for you?Rajarshi Guha
 
So I have an SD File … What do I do next?
So I have an SD File … What do I do next?So I have an SD File … What do I do next?
So I have an SD File … What do I do next?Rajarshi Guha
 
From Data to Action : Bridging Chemistry and Biology with Informatics at NCATS
From Data to Action: Bridging Chemistry and Biology with Informatics at NCATSFrom Data to Action: Bridging Chemistry and Biology with Informatics at NCATS
From Data to Action : Bridging Chemistry and Biology with Informatics at NCATSRajarshi Guha
 
Robots, Small Molecules & R
Robots, Small Molecules & RRobots, Small Molecules & R
Robots, Small Molecules & RRajarshi Guha
 
Fingerprinting Chemical Structures
Fingerprinting Chemical StructuresFingerprinting Chemical Structures
Fingerprinting Chemical StructuresRajarshi Guha
 
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D...
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D...Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D...
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D...Rajarshi Guha
 
When the whole is better than the parts
When the whole is better than the partsWhen the whole is better than the parts
When the whole is better than the partsRajarshi Guha
 
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D ...
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D ...Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D ...
Exploring Compound Combinations in High Throughput Settings: Going Beyond 1D ...Rajarshi Guha
 
Pushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the PipesPushing Chemical Biology Through the Pipes
Pushing Chemical Biology Through the PipesRajarshi Guha
 
Characterization and visualization of compound combination responses in a hig...
Characterization and visualization of compound combination responses in a hig...Characterization and visualization of compound combination responses in a hig...
Characterization and visualization of compound combination responses in a hig...Rajarshi Guha
 
The BioAssay Research Database
The BioAssay Research DatabaseThe BioAssay Research Database
The BioAssay Research DatabaseRajarshi Guha
 
Cloudy with a Touch of Cheminformatics
Cloudy with a Touch of CheminformaticsCloudy with a Touch of Cheminformatics
Cloudy with a Touch of CheminformaticsRajarshi Guha
 
Chemical Data Mining: Open Source & Reproducible
Chemical Data Mining: Open Source & ReproducibleChemical Data Mining: Open Source & Reproducible
Chemical Data Mining: Open Source & ReproducibleRajarshi Guha
 
Chemogenomics in the cloud: Is the sky the limit?
Chemogenomics in the cloud: Is the sky the limit?Chemogenomics in the cloud: Is the sky the limit?
Chemogenomics in the cloud: Is the sky the limit?Rajarshi Guha
 
Quantifying Text Sentiment in R
Quantifying Text Sentiment in RQuantifying Text Sentiment in R
Quantifying Text Sentiment in RRajarshi Guha
 

More from Rajarshi Guha (20)

Pharos: A Torch to Use in Your Journey in the Dark Genome
Pharos: A Torch to Use in Your Journey in the Dark GenomePharos: A Torch to Use in Your Journey in the Dark Genome
Pharos: A Torch to Use in Your Journey in the Dark Genome
 
Pharos: Putting targets in context
Pharos: Putting targets in contextPharos: Putting targets in context
Pharos: Putting targets in context
 
Pharos – A Torch to Use in Your Journey In the Dark Genome
Pharos – A Torch to Use in Your Journey In the Dark GenomePharos – A Torch to Use in Your Journey In the Dark Genome
Pharos – A Torch to Use in Your Journey In the Dark Genome
 
Pharos - Face of the KMC
Pharos - Face of the KMCPharos - Face of the KMC
Pharos - Face of the KMC
 
Enhancing Prioritization & Discovery of Novel Combinations using an HTS Platform
Characterization of Chemical Libraries Using Scaffolds and Network Models

  • 1. Characterization of Chemical Libraries Using Scaffolds and Network Models. Dac-Trung Nguyen, Rajarshi Guha. NIH NCATS. ACS National Meeting, Boston 2015
  • 3. Motivations
      • Library comparison usually driven by a need to construct or expand a library
          – Often with constraints on resources
      • Two classes of features to consider
          – Compound-centric (physchem properties, bioactivity, target preferences)
          – Library-centric (diversity, chemical space coverage)
      • Library comparisons generally reduce to
          – Distributions of compound features (univariate)
          – Overlap in some chemical space (multivariate)
  • 4. Comparing Libraries
      • Most comparisons employ a reduced (numerical) representation of the structure
          – Fingerprints, BCUTs, physicochemical descriptors
      • Perform comparisons in the new space
          – PCA, SOM, MDS, GTM, …
      Schamberger et al, DDT, 2011, 16, 636-641; Kireeva et al, Mol. Inf., 2012, 31, 301-312
  • 5. Scaffolds & Networks
      • Scaffolds represent a chemically meaningful reduced representation of the structures
      • Can be challenging to define what a (good) scaffold is
      • A network representation of the collection of structures allows for novel ways to perform library comparisons
          – How fine grained can such comparisons be?
  • 6. Scaffold Network Representations
      • Scaffolds are generated by exhaustive enumeration of SSSR
      • Scaffolds are nodes, connected by directed edges
      • Nodes are labeled by a hash key of the scaffold
      [Figures: example networks for 4 compounds and for 1912 compounds]
  • 7. Scaffold Network Construction
      • A scaffold network is a directed graph
      • Edges denote sub/super-structure relationships between scaffolds
      • Each node in the network represents a unique scaffold
      • Singletons are acyclic molecules
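The construction described on slides 6 and 7 can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the authors' implementation: each molecule is reduced to a hypothetical set of ring labels, every non-empty subset of those rings stands in for a scaffold (a real enumeration would also require the rings to be connected), a truncated SHA-1 digest stands in for the scaffold hash key, and edges are assumed to point from a substructure to a superstructure with one more ring.

```python
import hashlib
from itertools import combinations


def scaffold_key(labels):
    """Hypothetical hash key: truncated SHA-1 of the sorted labels."""
    text = ".".join(sorted(labels))
    return hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]


def enumerate_scaffolds(rings):
    """Yield every non-empty ring subset of one molecule (toy scaffolds)."""
    ordered = sorted(rings)
    for size in range(1, len(ordered) + 1):
        for combo in combinations(ordered, size):
            yield frozenset(combo)


def build_network(molecules):
    """Return (nodes, edges) for a dict of molecule name -> set of ring labels.

    nodes maps hash key -> ring set; edges holds (sub_key, super_key) pairs.
    """
    nodes = {}
    edges = set()
    for name, rings in molecules.items():
        if not rings:
            # Acyclic molecule: kept as an isolated (singleton) node.
            nodes[scaffold_key(["acyclic", name])] = frozenset()
            continue
        scaffolds = list(enumerate_scaffolds(rings))
        for scaffold in scaffolds:
            nodes[scaffold_key(scaffold)] = scaffold
        for child in scaffolds:
            for parent in scaffolds:
                # Edge only between scaffolds that differ by exactly one ring.
                if len(parent) == len(child) + 1 and child < parent:
                    edges.add((scaffold_key(child), scaffold_key(parent)))
    return nodes, edges


molecules = {
    "mol1": {"benzene", "pyridine"},
    "mol2": {"benzene", "pyridine", "furan"},
    "mol3": set(),  # acyclic
}
nodes, edges = build_network(molecules)
print(len(nodes), "nodes,", len(edges), "edges")
```

Because nodes are keyed by hash, the scaffolds shared by mol1 and mol2 collapse into the same nodes, which is what makes each node a unique scaffold.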
  • 8. Datasets
      • CL1420, 31320 compounds
      • CL886, 3552 compounds
      • MIPE, 1920 compounds
      • Natural Products, 5000 compounds
      • LOPAC, 1280 compounds
      Library descriptions shown: “Approved, investigational drugs, constructed for functional diversity”; “Diverse library, designed for enrichment of bioactivity”
      Network sizes shown (in slide order): 1079 nodes, 115287 edges, 69 trees; 2131 nodes, 1843 edges, 129 trees; 15283 nodes, 13622 edges, 729 trees; 5563 nodes, 4832 edges, 239 trees; 23716 nodes, 21468 edges, 750 trees
      Mathews and Guha et al, PNAS, 2014, 111, 11365; Singh et al, JCIM, 2009, 49, 1010
  • 9. Scaffold Network Representations
      • The overall structure of the complete network can characterize the library
      • But distributions of vertex-level network metrics may be informative
      • We can also consider approaches to identify “important” scaffolds
  • 10. Metrics for the Complete Network
      • Examined vertex-level measures of centrality
          – Closeness, betweenness, …
          – High similarity of MIPE & NP and low similarity of LOPAC & NP is surprising (Ertl et al, JCIM, 2008)
      [Plots: density of log10(Betweenness) by library; number of scaffolds vs. log closeness (in-degree) by library (CL1420, CL886, LOPAC, MIPE, NP)]
  • 11. Metrics for the Complete Network
      • Useful to summarize distributions by scalar metrics
      • Path length metrics are not discriminatory due to many short paths
      • Extent of clustering differs but is quite low overall
      [Plot: Centralization, CPL and Transitivity values by library (CL1420, CL886, LOPAC, MIPE, NP)]
  • 12. Comparing Complete Networks
      • Library overlap is characterized by the set of common scaffolds
      • Scaffolds can be ranked (e.g., PageRank)
          – Small fragments have low PR
          – Large frameworks have high PR
          – Interesting scaffolds lie in between?
      • Similar libraries will have common scaffolds with similar PageRank values
      [Diagram: PageRank vector of each library, subset to the common fragments, compared by normalized dot product]
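The comparison on slide 12 (PageRank vectors restricted to the common scaffolds, then a normalized dot product) could look roughly like this. It is a minimal sketch, assuming a plain power-iteration PageRank with a damping factor of 0.85 and edges given as hypothetical (substructure, superstructure) hash pairs; the talk does not state which PageRank variant or parameters were used.

```python
import math


def pagerank(edges, damping=0.85, iterations=100):
    """Power-iteration PageRank for a directed graph given as (src, dst) pairs."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    n = len(nodes)
    rank = {v: 1.0 / n for v in nodes}
    for _ in range(iterations):
        # Rank held by nodes without out-edges is spread uniformly.
        dangling = sum(rank[v] for v in nodes if not out[v])
        new = {v: (1.0 - damping) / n + damping * dangling / n for v in nodes}
        for v in nodes:
            for w in out[v]:
                new[w] += damping * rank[v] / len(out[v])
        rank = new
    return rank


def pagerank_similarity(edges_a, edges_b):
    """Normalized dot product of the two PageRank vectors on shared nodes."""
    pr_a, pr_b = pagerank(edges_a), pagerank(edges_b)
    common = sorted(set(pr_a) & set(pr_b))
    if not common:
        return 0.0  # no common scaffolds
    va = [pr_a[k] for k in common]
    vb = [pr_b[k] for k in common]
    dot = sum(x * y for x, y in zip(va, vb))
    norm = math.sqrt(sum(x * x for x in va)) * math.sqrt(sum(y * y for y in vb))
    return dot / norm


# Hypothetical scaffold hashes; edges point from fragment to larger framework.
lib_a = [("s1", "s2"), ("s2", "s3")]
lib_b = [("s1", "s2"), ("s2", "s4")]
print(round(pagerank_similarity(lib_a, lib_b), 3))
```

With edges pointing from fragments to larger frameworks, rank accumulates on the large frameworks, in line with the low-PR/high-PR observation on the slide. Libraries with no scaffolds in common score 0 under this sketch.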
  • 13. Comparing Complete Networks
      [Similarity matrix]
                CL1420   CL886   LOPAC   MIPE   NP
      CL1420    1        0       0       0      0
      CL886     0        1       0       0      0
      LOPAC     0        0       1       0.2    0.3
      MIPE      0        0       0.2     1      0.3
      NP        0        0       0.3     0.3    1
  • 14. Scaffold Recognition
      • What is a scaffold?
      • Can be addressed through the scaffold network
          – A scaffold is a hub within the scaffold network
      • Provide a practical answer to “What are the missing scaffolds in my library”
      • Examples of unique scaffolds in MIPE but not in NP
  • 16. Reduced Network Representation
      • The complete network can be reduced to a forest of trees
      • Order nodes by out-degree
      • From each node, traverse network until a terminal node is reached
      • Result is a set of spanning trees
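One possible reading of the reduction steps on slide 16 is sketched below. The slide does not specify the traversal, so this assumes a depth-first walk along out-edges, starting from nodes in decreasing out-degree order, in which each node is claimed by the first tree that reaches it. Every node then has at most one parent, so the kept edges form a forest.

```python
def reduce_to_forest(edges):
    """Reduce a directed graph, given as (src, dst) pairs, to a forest.

    Returns the list of kept (parent, child) edges.
    """
    out = {}
    for src, dst in edges:
        out.setdefault(src, []).append(dst)
        out.setdefault(dst, [])
    # Highest out-degree first; ties broken by node name for determinism.
    order = sorted(out, key=lambda v: (-len(out[v]), v))
    visited = set()
    tree_edges = []
    for root in order:
        if root in visited:
            continue
        visited.add(root)
        stack = [root]
        while stack:
            v = stack.pop()
            for w in sorted(out[v]):
                if w not in visited:
                    # First visit claims the node for the current tree.
                    visited.add(w)
                    tree_edges.append((v, w))
                    stack.append(w)
    return tree_edges


# Hypothetical diamond-shaped network: d is reachable via b and via c.
diamond = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]
forest_edges = reduce_to_forest(diamond)
print(forest_edges)
```

In the diamond example one of the two edges into d is dropped, leaving a single spanning tree rooted at a.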
  • 17. Reduced Network Representation
      [Figure: MIPE, 1912 compounds]
  • 18. Network Structure
      • A scaffold forest is characterized by
          – Disconnected components
              • structurally related scaffolds, scaffold diversity
          – Singletons
              • scaffolds with no superstructure
          – Branching within connected components
              • scaffold complexity
  • 19. Forest Size vs Library Size
      • A large library doesn’t imply a large forest
      • Forest size is a function of scaffold diversity
      [Figures: CL1420, 31K combinatorial library; MIPE, 1912 (target) diverse library]
  • 20. Summarizing Forests
      • A key feature is the nature of branching in individual trees
      • Characterized by ID - information theoretic descriptor of branching derived from the distance matrix
      Bonchev & Trinajstić, IJQC, 1978, 14, 293-303
      [Figures: example trees with ID = 978, ID = 90794, ID = 3456, ID = 979252]
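To make the branching descriptor concrete, here is a small sketch of one distance-based information index. The slide does not say which Bonchev-Trinajstić index is meant by ID; this assumes the information on the magnitude of distances, I = W log2 W − Σ g_d · d · log2 d, where W is the sum of all pairwise graph distances and g_d is the number of node pairs at distance d. The edge lists are hypothetical trees.

```python
import math
from collections import Counter, deque


def distance_counts(edges):
    """Count how many unordered node pairs lie at each graph distance (BFS)."""
    adj = {}
    for a, b in edges:
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    counts = Counter()
    for start in adj:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        for d in dist.values():
            if d > 0:
                counts[d] += 1
    # Every pair was counted once from each end.
    return {d: c // 2 for d, c in counts.items()}


def information_on_distances(edges):
    """Assumed index: W*log2(W) - sum over d of g_d * d * log2(d)."""
    g = distance_counts(edges)
    wiener = sum(d * c for d, c in g.items())
    return wiener * math.log2(wiener) - sum(
        c * d * math.log2(d) for d, c in g.items()
    )


path4 = [("a", "b"), ("b", "c"), ("c", "d")]   # unbranched tree
star4 = [("x", "a"), ("x", "b"), ("x", "c")]   # branched tree, same size
print(information_on_distances(path4), information_on_distances(star4))
```

Under this assumed definition the unbranched path scores higher than the star with the same number of nodes, so the value falls as branching increases for a fixed tree size.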
  • 21. Summarizing Forests
      • Distribution of ID distinguishes datasets primarily in the tails
      • Aggregating by mean ID still discriminates well
          – Driven by the tails
      [Plots: density of log10(ID) by library; mean log10(ID) by library (CL1420, CL886, LOPAC, MIPE, NP)]
  • 22. Exploring the Forest
      • The metric also allows us to drill down
          – Select scaffolds of given branching complexity
          – Identify scaffolds of given complexity range across different libraries (equivalent to finding holes in scaffold coverage)
      [Figure: LOPAC, ID = 10214 ≈ MIPE, ID = 10197]
  • 23. Library Comparison via Merging
      • … reduces to comparing networks
      • We compute a graph union and construct new edges between nodes with the same hash
      • How does the network structure of the union differ from the original networks?
      • Can be extended to merge more than two networks
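The merge step on slide 23 can be illustrated as follows. This is a sketch under assumptions: each network is a list of (hash, hash) edges, nodes are tagged with a hypothetical library label so the two copies of a shared scaffold stay distinct, and a "bridge" edge is added wherever the same hash occurs in both libraries.

```python
def merge_networks(edges_a, edges_b, label_a="A", label_b="B"):
    """Graph union of two scaffold networks plus bridges between equal hashes.

    Returns (nodes, edges, bridges); nodes are (library_label, hash) tuples.
    """
    nodes = set()
    edges = set()
    for label, source in ((label_a, edges_a), (label_b, edges_b)):
        for src, dst in source:
            nodes.add((label, src))
            nodes.add((label, dst))
            edges.add(((label, src), (label, dst)))
    keys_a = {key for lab, key in nodes if lab == label_a}
    keys_b = {key for lab, key in nodes if lab == label_b}
    # One bridging edge per scaffold hash present in both libraries.
    bridges = {((label_a, key), (label_b, key)) for key in keys_a & keys_b}
    return nodes, edges, bridges


# Hypothetical hashes: h2 occurs in both libraries.
net_a = [("h1", "h2"), ("h2", "h3")]
net_b = [("h2", "h4")]
merged_nodes, merged_edges, bridges = merge_networks(net_a, net_b)
print(len(merged_nodes), len(merged_edges), sorted(bridges))
```

Keeping the library label on every node is a design choice that makes later questions, such as how mixed a merged tree is, straightforward to answer. Merging more than two networks would follow the same pattern with one label per library.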
  • 24. Source Forests
      • Structurally similar networks
      • 2659 identical nodes
      • Construct union by connecting nodes with identical hash
      [Figures: LOPAC forest; MIPE forest]
  • 25. Merged Network
      • Green edges “bridge” the two networks
      • Trees can now have two types of nodes
      • How can we characterize the
          – Contraction?
          – Degree of mixing?
  • 26. Contraction to Measure Overlap
      • Merging very similar libraries should generate a smaller forest compared to the original forests
      • But this doesn’t really describe how the individual trees become (more) connected
      $C_{\mathrm{norm}} = \dfrac{|F_{12}|}{|F_1| + |F_2|}$, where $F_i = \{G_{1i}, G_{2i}, \ldots, G_{ni}\}$
      [Plot: C_norm for CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]
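A small sketch of the normalized contraction follows, assuming the forest size is the number of trees (connected components) and that identical hashes in the two forests are treated as the same node when merging, which gives the same component count as adding bridge edges. The node and edge lists are hypothetical.

```python
def count_trees(nodes, edges):
    """Number of connected components, via a simple union-find."""
    parent = {n: n for n in nodes}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b in edges:
        parent[find(a)] = find(b)
    return len({find(n) for n in nodes})


def normalized_contraction(nodes1, edges1, nodes2, edges2):
    """C_norm = (trees in merged forest) / (trees in forest 1 + trees in forest 2)."""
    f1 = count_trees(nodes1, edges1)
    f2 = count_trees(nodes2, edges2)
    merged_nodes = set(nodes1) | set(nodes2)
    merged_edges = list(edges1) + list(edges2)
    f12 = count_trees(merged_nodes, merged_edges)
    return f12 / (f1 + f2)


# Hypothetical forests keyed by scaffold hash.
nodes_1 = {"h1", "h2", "h3"}
edges_1 = [("h1", "h2")]            # two trees: {h1, h2} and {h3}
nodes_2 = {"h3", "h4", "h5"}
edges_2 = [("h3", "h4")]            # two trees: {h3, h4} and {h5}
print(normalized_contraction(nodes_1, edges_1, nodes_2, edges_2))
```

Under these assumptions two disjoint forests give a value of 1 and two identical forests give 0.5, so lower values indicate more overlap.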
  • 27. Assortativity to Measure Overlap
      • Quantifies the notion that “like connects to like”
      • Undefined for trees that only have one type of vertex (i.e., only from a single library)
      • The number of trees that are assortative is a global indicator of library similarity
      Newman, Phys. Rev. E., 2003, 026126
      [Plot: % of trees that are assortative vs. not assortative for CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]
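For a merged tree whose nodes carry a library label, the assortativity coefficient for a categorical attribute (following Newman's mixing-matrix definition) can be computed as below. This is a sketch with hypothetical node names and labels; how the talk thresholds a tree as "assortative" is not stated.

```python
from collections import defaultdict


def attribute_assortativity(edges, label):
    """Newman's assortativity for a categorical node attribute.

    edges: undirected (u, v) pairs; label: dict node -> category.
    Returns None when only one category is present (coefficient undefined).
    """
    mix = defaultdict(float)
    for u, v in edges:
        # Count each undirected edge in both directions (symmetric matrix).
        mix[(label[u], label[v])] += 1.0
        mix[(label[v], label[u])] += 1.0
    total = sum(mix.values())
    types = {t for pair in list(mix) for t in pair}
    trace = sum(mix[(t, t)] for t in types) / total
    share = {t: sum(mix[(t, s)] for s in types) / total for t in types}
    expected = sum(share[t] ** 2 for t in types)
    if abs(1.0 - expected) < 1e-12:
        return None
    return (trace - expected) / (1.0 - expected)


# Hypothetical merged tree: two nodes from each library, one bridging edge.
library = {"a1": "LOPAC", "a2": "LOPAC", "b1": "MIPE", "b2": "MIPE"}
tree = [("a1", "a2"), ("a2", "b1"), ("b1", "b2")]
print(attribute_assortativity(tree, library))
```

A tree drawn entirely from one library returns None here, matching the slide's remark that the measure is undefined when only one vertex type is present.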
  • 28. Assortativity to Measure Overlap
      • We then examine the distribution of assortativity across assortative trees
      • Dissimilar libraries have few assortative trees
          – But they have high values of assortativity
      • However, high assortativity doesn’t imply high overlap
      [Plot: density of assortativity for CL886/CL1420, MIPE/CL886, MIPE/LOPAC, MIPE/NP]
  • 29. Assortativity to Measure Overlap
      [Figures: Assortativity = 0.85 (MIPE & NP); Assortativity = 0.95 (CL886 & CL1420)]
  • 30. Overlap via Tree Complexity
      • Similar libraries lead to fewer trees in the merged network, but also denser trees
      • Change in density (branching) across the forest can also measure the extent of overlap
      [Figures: MIPE; LOPAC; Merged]
  • 31. Summarizing via Tree Complexity
      • Distributions of ID before and after merging don’t differ very much, visually
      • However a KS test does discriminate them
      [Plots: density of log10(ID), Individual vs. Merged, for CL886 / CL1420 and MIPE / NP; annotations: D = 0.0173, p = 1; D = 0.0582, p = .0008]
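The D values quoted on slide 31 are two-sample Kolmogorov-Smirnov statistics: the largest gap between the two empirical cumulative distributions. A minimal sketch of that statistic is shown here, with hypothetical log10(ID) values; it computes D only, and the p-value is left to a statistics library.

```python
def ks_statistic(sample1, sample2):
    """Two-sample KS statistic: max |ECDF1(x) - ECDF2(x)| over all x."""
    xs, ys = sorted(sample1), sorted(sample2)
    n, m = len(xs), len(ys)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(xs[i], ys[j])
        # Step both ECDFs past x so tied values are handled together.
        while i < n and xs[i] <= x:
            i += 1
        while j < m and ys[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d


# Hypothetical log10(ID) values before and after merging.
individual = [2.1, 2.4, 3.0, 3.3, 4.2]
merged = [2.2, 2.5, 3.1, 3.9, 4.8]
print(ks_statistic(individual, merged))
```

Identical samples give D = 0 and fully separated samples give D = 1.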
  • 32. Summary
      • Scaffold networks are a relatively objective way to characterize & compare libraries
          – Supports fast comparisons between libraries
      • The approach supports multiplexing information into a single data structure
          – Physchem properties, bioactivities, …
      • “What is a good comparison?” quickly becomes a philosophical question