Application of CurlySMILES to the
encoding of polymer systems
Presented on August 16, 2015, at the 250th ACS National Meeting in Boston
Division: Computers in Chemistry
Session: Accelerated Discovery of Chemical Compounds:
Design New Polymers & Inorganic Materials from
Integration of Polymer Science, Materials Science &
Informatics
Axel Drefahl
axeleratio@gmail.com
Axeleratio, Reno, Nevada
www.axeleratio.com/axel/acs/boston2015/CurlySMILESpolymers.pdf
Copyright © 2015 Axel Drefahl
What is CurlySMILES?
CurlySMILES is:
● a chemical language to capture, process and share
nanostructures, based on molecular constitution,
connectivity and arrangement;
● a line notation system integrating SMILES with atom-
and molecule-anchored annotations, inserted via
curly braces: {…};
● customizable by annotation: encoding of polymers,
complexes, multi-phase systems, ...;
● available as a suite of Python 3 modules, including a
notation parser and unique notation generator.
Overview
● Current state of polymer informatics
● Brief introduction to CurlySMILES
● Encoding of Structural Repeat Units (SRUs)
● Encoding of single-strand polymers
● Encoding of multi-strand polymers
● Encoding of copolymers and miscellaneous
polymer systems
● CurlySMILES software/task-specific integration
● Perspective: virtual polymer chemistry
Cheminformatics (sub)domains
Established informatics
● “Small molecules”
● Crystalline solids
● Peptides, DNAs, ...
Capturing & processing
● Molecular graph
● Unit cell, space group
● Fragment sequence
Capturing & processing
● Struct. repeat unit (SRU)
● Nano-object (sphere,rod,...)
● Variable groups: R,X,Y,Z,...
● Metalevel components
Evolving informatics
● Polymer systems
● Nanomaterials
● Material classes
● Composites & design
Polymer informatics: approaches
and tools
● IUPAC nomenclature & seniority rules (head-tail selection)
● S-group (superatom): SRU with crossing bonds and
brackets (common representation, MDL, MarvinSketch),
● ThermoML with polymer block to specify compounds,
● Polymer Markup Language (PML),
● Polymer Informatics Knowledge System (PIKS) - PolyInfo
database | “walled gardens” of polymer information,
● InChI polymer project (awaits implementation),
● CurlySMILES project: actively designs a human-machine
interface for nanoarchitectures, including polymers, and
develops open-source Python code.
From SMILES to CurlySMILES
SMILES: Simplified Molecular Input Line Entry
System
Published by David Weininger in 1988
doi: 10.1021/ci00057a005
CurlySMILES: Curly-braces enhanced Smart
Material Input Line Entry Specification
Published by Axel Drefahl in 2011
doi: 10.1186/1758-2946-3-1
CurlySMILES Motivation
● Chemical nomenclature and encoding languages typically
employ idealized representations, while minor structural
irregularities and impurities are ignored. CurlySMILES
encoding enables their insertion via annotation, if desired.
● A molecular-graph-derived notation is often taken to represent
molecule and substance interchangeably. CurlySMILES
employs molecule multipliers and allows for phase distinction,
for example, by using state and shape annotations such as lq,
tf, am, cr, np, ...
● Variability of detail: stoichiometric formula notation (SFN)
● Encoding of molecular arrangements: hydrogen-bonded
molecules, complexes, macromolecules and other
nanoassemblies.
Format of curly-enclosed
annotations in CurlySMILES
{AMk1=v1;...;kn=vn}
AM is a one-char or two-char annotation marker; a two-char
AM may be followed by an annotation dictionary, a
semicolon-separated list of key/value (ki/vi) pairs. Keys
are predefined, but extensible by customization ($ prefix).
Example: n-octanethiol-functionalized gold nanoparticle
dispersed in toluene (●-SCH2(CH2)6CH3 in toluene)
S{-|c=[Au]{np}}CCCCCCCC{dpc=Cc1ccccc1}
AMs: -| for surface-attached, dp for dispersed, np for
nanoparticle.
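The annotation format above can be illustrated with a small parser. This is a minimal sketch, not the API of the CurlySMILES Python modules: the function names `split_top_level` and `parse_annotation` are hypothetical, and the rule "bodies of at most two characters are bare markers, longer bodies are a two-char marker plus dictionary" is an assumption derived from the format description on this slide.

```python
def split_top_level(text, sep):
    """Split text on sep, ignoring separators inside nested curly braces."""
    parts, depth, current = [], 0, []
    for ch in text:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == sep and depth == 0:
            parts.append("".join(current))
            current = []
        else:
            current.append(ch)
    parts.append("".join(current))
    return parts


def parse_annotation(body):
    """Parse the text between the outer braces into (marker, dictionary).

    Assumption: only two-char markers carry a dictionary, so a body of
    one or two characters is treated as a bare marker.
    """
    if len(body) <= 2:
        return body, {}
    marker, rest = body[:2], body[2:]
    pairs = {}
    for item in split_top_level(rest, ";"):
        # Split on the first "=" only, so a value may itself be "=".
        key, _, value = item.partition("=")
        pairs[key] = value
    return marker, pairs


print(parse_annotation("-|c=[Au]{np}"))
print(parse_annotation("dpc=Cc1ccccc1"))
```

Splitting on the first "=" only keeps values such as the bond symbol in `b==` intact, and the depth counter keeps nested annotations such as `[Au]{np}` inside their value.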
Annotated Molecular Graph
Example: (Z)-but-1-ene-1,4-diyl substructure
CurlySMILES: C{-}=C{Z}CC{-}
Atom-anchored annotations:
Structural unit annotation (pendent single bond): {-}
Stereodescriptive annotation: {Z}
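Since annotations are anchored to the preceding atom, removing them should leave a plain SMILES skeleton. The following sketch separates the two; `split_notation` is a hypothetical helper written for illustration and is not taken from the CurlySMILES code base. It tracks brace depth so that nested annotations stay inside their parent.

```python
def split_notation(notation):
    """Return (skeleton, annotations) for a CurlySMILES-style string.

    skeleton    : the characters outside all curly braces
    annotations : the text of each top-level {...} group, in order
    """
    skeleton, annotations = [], []
    depth, start = 0, 0
    for i, ch in enumerate(notation):
        if ch == "{":
            if depth == 0:
                start = i + 1  # first character inside the outer brace
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth == 0:
                annotations.append(notation[start:i])
        elif depth == 0:
            skeleton.append(ch)
    if depth != 0:
        raise ValueError("unbalanced curly braces")
    return "".join(skeleton), annotations


print(split_notation("C{-}=C{Z}CC{-}"))
```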
Poly[(Z)-but-1-ene-1,4-diyl]
CurlySMILES: C{-}=C{Z}CC{+n}
Atom-anchored annotations:
Structural unit annotation: {-} at head node
Stereodescriptive annotation: {Z}
Operational annotation: {+n} at tail node
Does CurlySMILES encode
macromolecules or polymers?
Answer: both (user choice). CurlySMILES comes with a rich
annotation dictionary to encode chain length variation and phases.
A macromolecule is a single molecule. The term “polymer” can
mean “macromolecule” or a “substance” composed thereof,
typically with a “degree of polymerization” (DOP) range.
An oligomer or a macromolecule of a specific length is encoded
based on the chain graph, i.e. the SRU graph, using annotation
dictionary key n: {+nn=10} for ten occurrences of the SRU.
A polymer is encoded by leaving out n (generic polymer). The key
dpr may specify a DOP range: {+ndpr=gt250}. AMs such as am
(amorphous) or cr (crystalline) indicate a particular polymer phase.
A polymer system is encoded by additional annotations specifying,
for example, impurities, additives and solvents.
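To make the role of key n concrete, the sketch below expands a fixed-length notation into the backbone SMILES it stands for. It is an illustration under stated assumptions, not the CurlySMILES implementation: `expand_linear` is a hypothetical name, the head annotation {-} must sit on the first atom, the {+n...} annotation must end the string, and annotations must not be nested. End groups are not specified, so the resulting SMILES implies hydrogen caps.

```python
import re


def expand_linear(notation):
    """Expand e.g. O{-}CC{+nn=3} into the repeated backbone OCCOCCOCC."""
    tail = re.search(r"\{\+n([^{}]*)\}$", notation)
    if tail is None:
        raise ValueError("expected a {+n...} annotation at the end")
    pairs = dict(item.split("=", 1) for item in tail.group(1).split(";") if item)
    if "n" not in pairs:
        raise ValueError("no key n: generic polymer, nothing to expand")
    count = int(pairs["n"])
    body = notation[:tail.start()]
    # The head atom (bracket atom, Br, Cl or one letter) must carry {-}.
    if re.match(r"(\[[^\]]+\]|Br|Cl|[A-Za-z])\{-\}", body) is None:
        raise ValueError("expected {-} on the first atom")
    sru = re.sub(r"\{[^{}]*\}", "", body)  # drop remaining annotations
    return sru * count


print(expand_linear("O{-}CC{+nn=3}"))
```

A generic polymer such as O{-}CC{+n} has no key n and is rejected, which mirrors the distinction between a macromolecule of specific length and a polymer made on this slide.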
Tail node annotations to formally
construct polymers
{+n} anchored at tail node of divalent SRU to build single-
strand polymer via head-tail single-bond connection.
{+r} anchored at tail node of SRU to build single-strand
macrocycle (last tail node connects first head node).
{+m} anchored at tail node of non-single-bond or multivalent
SRU to build multibond/multi-strand polymer via
specified head-tail connection using key ich to
provide index of corresponding head node.
{+s} anchored at tail node of the last (right-most) SRU in a
copolymer sequence to provide copolymer details;
for example, a copolymer qualifier via key cpq to
specify an alternating, block or random sequence.
CurlySMILES notations of some
common single-strand homopolymers
Structure-Based Name         Structural Formula   CurlySMILES Notation
Poly(oxymethylene)           -[OCH2]-n            O{-}C{+n}
Poly(iminoethylene)          -[NHCH2CH2]-n        N{-}CC{+n}
Poly(1-hydroxyethylene)      -[CH(OH)CH2]-n       OC{-}C{+n}
Poly(1-cyanoethylene)        -[CH(CN)CH2]-n       N#CC{-}C{+n}
Poly(1,1-difluoroethylene)   -[CF2CH2]-n          FC{-}(F)C{+n}
Poly(1-phenylethylene)       -[CH(Ph)CH2]-n       C{-}(c1ccccc1)C{+n}
Poly(oxy-1,4-phenylene)      -[O-paraPh]-n        O{-}c1ccc{+n}cc1
Poly(methylene)              -[CH2]-n             C{-}{+n}
Poly(difluoromethylene)      -[CF2]-n             FC{-}{+n}F
Polydispersity characterization
With the exception of the dimensionless pdi, units are kg/mol.
Example: {+npMn=89.2} to encode a single-strand
polymer with a number-average molar mass of 89.2 kg/mol
Key   Symbol   Meaning                        ThermoML tag name
pMn   Mn       Number-average molar mass      nNumberAvgMolWt
pMm   Mm       Mass-average molar mass        nWeightAvgMolWt
pMz   Mz       z-Average molar mass           nZAvgMolWt
pMv   Mv       Viscosity-average molar mass   nViscosityAvgMolWt
pMp   Mp       Peak molar mass                nPeakAvgMolWt (?)
pdi   Mm/Mn    Polydispersity index           nPolydispersityIndex
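As a worked illustration of these keys, the sketch below computes Mn, Mm and pdi from a discrete chain-length distribution using the standard definitions Mn = ΣNiMi/ΣNi and Mm = ΣNiMi²/ΣNiMi, then formats them as an annotation. The function names are hypothetical, and combining pMn, pMm and pdi in one {+n...} annotation is an assumption based on the general key/value format, not a confirmed usage.

```python
def molar_mass_averages(counts, masses):
    """Return (Mn, Mm, pdi) for chain counts N_i with molar masses M_i (kg/mol)."""
    total_n = sum(counts)
    total_nm = sum(n * m for n, m in zip(counts, masses))
    total_nm2 = sum(n * m * m for n, m in zip(counts, masses))
    mn = total_nm / total_n      # number-average molar mass
    mm = total_nm2 / total_nm    # mass-average molar mass
    return mn, mm, mm / mn


def polydispersity_annotation(mn, mm):
    """Format a tail-node annotation; the key combination is an assumption."""
    return f"{{+npMn={mn:g};pMm={mm:g};pdi={mm / mn:.3g}}}"


mn, mm, pdi = molar_mass_averages([1, 1], [10.0, 30.0])
print(mn, mm, pdi)
print(polydispersity_annotation(mn, mm))
```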
Anionic homopolymer with
monoatomic cations
Example: poly(sodium 1-carboxylatoethylene)
CurlySMILES: O=C([O-]{+Cc=[Na+]})C{-}C{+n}
The operational annotation marker +C is used to
include [Na+] as counterion to [O-]. [Na+] is part
of the repeat unit.
Homopolymer with terminating
groups at head and tail
Example: poly(ethylene terephthalate) by
esterification of terephthalic acid with ethylene glycol
[H]O{-}CCOC(=O)c1ccc(cc1)C{+ninc=2-15;ich=2}(=O)OCCO
Nodes 2 to 15 are parts of SRU. Node 1 makes the head
terminus and nodes 16-20 belong to tail end group.
Cyclic polymers or oligomers
Example: cyclic poly(silaether)
[Si]{-}(C)(C)[Si](C)(C)O{+rn=24}
Shortcut for a long SMILES notation:
[Si]1(C)(C)[Si](C)(C)O...[Si](C)(C)[Si](C)(C)O1
Such cyclic poly(silaether)s are obtained, for example, as by-products
while making their linear homologs by ring-opening polymerization of
octamethyl-1,4-dioxatetrasilacyclohexane [doi: 10.1021/ma00086a048].
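The shortcut can be made explicit by writing out the long SMILES. The sketch below is an illustration only: `expand_macrocycle` is a hypothetical helper, it takes the annotation-free SRU as plain SMILES, and it assumes the head is the first atom, the tail is the last atom, and ring-closure digit 1 is not already used inside the SRU.

```python
import re


def expand_macrocycle(sru_smiles, n):
    """Repeat the SRU n times and close the ring between last tail and first head."""
    head = re.match(r"\[[^\]]+\]|Br|Cl|[A-Za-z]", sru_smiles)
    if head is None:
        raise ValueError("SRU must start with an atom symbol")
    chain = sru_smiles * n
    first = head.group(0)
    # Ring-closure digit 1 goes right after the first atom and at the very end.
    return first + "1" + chain[len(first):] + "1"


print(expand_macrocycle("[Si](C)(C)[Si](C)(C)O", 2))
```

For n=24, as in the example, the same call produces the full 24-unit ring that the {+rn=24} annotation abbreviates.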
Surface-grafted functional oligomer
Example: polyacrylamide brush grown on
silicon
N{-|c=[Si]}C(=O)c1ccc(cc1)CCC{-}C{+ninc=12-16;ich=12}C(=O)N
Group environment annotation -| for bond to substrate
Growth of such polyacrylamide brushes on a silicon wafer is studied
to understand how to reduce or prevent microbial adhesion on
surfaces by chemical surface modification [doi: 10.1021/la063531v].
Regular double-strand polymers:
chain of formally fused cycloalkane rings
Example: poly(butane-1,4:3,2-tetrayl)
CurlySMILES notation:
C{-}C{+mich=1}C{+mich=4}C{-}
two head nodes: C{-}, two tail nodes C{+m}
For IUPAC nomenclature of this polymer see
A Brief Guide to Polymer Nomenclature.
Regular double-strand polymers:
chain of formally fused heterocycles
Example: poly(2,4-dimethyl-1,3,5-trioxa-2,4-disilapentane-1,5:4,2-tetrayl)
CurlySMILES notation:
O{-}[Si]{+mich=1}(C)O[Si]{+mich=7}(C)O{-}
two head nodes: O{-}, two tail nodes [Si]{+m}
For IUPAC nomenclature of this polymer see page 1573 in
http://old.iupac.org/publications/pac/1993/pdf/6507x1561.pdf.
Double bond between head and tail
Example: poly(piperidine-3,5-diylideneethanediylidene)
CurlySMILES notations:
A: C1{=}CNCC(C1)=CC{+mich=1;b==}
B: C{-}C1CNCC(C1)=C{+n}
Both notations encode correct atom connectivity. In the
IUPAC-compliant notation A, key b specifies = as bond between
tail and head.
For IUPAC nomenclature of this polymer see page 1941 in
Nomenclature of Regular Single-Strand Organic Polymers.
Encoding with copolymer qualifiers
Example: poly(styrene-co-isoprene)
CurlySMILES notation of this example:
C{-}C{+ninc=1-8;ich=1}(c1ccccc1)C{-}C=C(C)C{+ninc=9-13;ich=9}{+scpq=c}
Copolymer qualifiers:
cpq   Qualifier   Meaning
a     alt         alternating
b     block       block
c     co          generic
g     graft       graft
p     per         periodic
r     ran         random
s     stat        statistical
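The qualifier table maps directly onto a lookup. The sketch below turns a cpq code into the {+s...} annotation and into a source-based name of the form poly(A-qualifier-B). The names `CPQ`, `copolymer_annotation` and `source_based_name` are hypothetical. Block and graft copolymers are deliberately left out of the naming function, because IUPAC source-based names for those follow a different pattern (for example polystyrene-block-polyisoprene).

```python
# cpq code -> (qualifier used in names, meaning), copied from the table above
CPQ = {
    "a": ("alt", "alternating"),
    "b": ("block", "block"),
    "c": ("co", "generic"),
    "g": ("graft", "graft"),
    "p": ("per", "periodic"),
    "r": ("ran", "random"),
    "s": ("stat", "statistical"),
}


def copolymer_annotation(cpq):
    """Build the {+s...} annotation for the last tail node of the sequence."""
    if cpq not in CPQ:
        raise ValueError(f"unknown copolymer qualifier: {cpq}")
    return f"{{+scpq={cpq}}}"


def source_based_name(monomers, cpq):
    """Name of the form poly(A-qualifier-B); block and graft are not handled."""
    if cpq in ("b", "g"):
        raise ValueError("block and graft names follow a different pattern")
    qualifier = CPQ[cpq][0]
    return "poly(" + f"-{qualifier}-".join(monomers) + ")"


print(source_based_name(["styrene", "isoprene"], "c"), copolymer_annotation("c"))
```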
Encoding of a terpolymer
Example: poly[methyl-N-(3,4-dimethylphenyl)-N-(4-biphenyl)-N-(4-
phenyloxy)siloxane-co-phenylmethylsiloxane-co-
methylhydrosiloxane]
c1ccccc1[Si]{-}(C)O{+ninc=1-9;ich=7}[SiH]{-}
(C)O{+ninc=10-12;ich=10}[Si]{-}(C)
(Oc2ccc(cc2)N(c3cc(C)c(C)cc3)c4ccc(cc4)-
c5ccccc5)O{+ninc=13-43;ich=13}{+scpq=c}
For more about this terpolymer see 10.1021/ma202041u.
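In copolymer notations like this one, each inc range lists the nodes of one SRU, so for a notation without end groups the ranges should tile the node list without gaps. The sketch below checks that. It is a rough illustration under stated assumptions: nodes are counted as atoms in left-to-right order, only a small set of organic-subset and bracket atoms is recognized, annotations are not nested, and all function names are hypothetical.

```python
import re

# Bracket atoms first, then two-letter halogens, then one-letter symbols.
ATOM = re.compile(r"\[[^\]]+\]|Br|Cl|[BCNOPSFI]|[bcnops]")


def count_nodes(notation):
    """Count atoms after removing (non-nested) annotations."""
    skeleton = re.sub(r"\{[^{}]*\}", "", notation)
    return len(ATOM.findall(skeleton))


def sru_ranges(notation):
    """Collect the (first, last) node indices given by inc=first-last."""
    return [(int(a), int(b)) for a, b in re.findall(r"inc=(\d+)-(\d+)", notation)]


def ranges_cover_all_nodes(notation):
    """True if the inc ranges are consecutive and end at the last node."""
    expected = 1
    for first, last in sru_ranges(notation):
        if first != expected or last < first:
            return False
        expected = last + 1
    return expected - 1 == count_nodes(notation)


terpolymer = (
    "c1ccccc1[Si]{-}(C)O{+ninc=1-9;ich=7}[SiH]{-}"
    "(C)O{+ninc=10-12;ich=10}[Si]{-}(C)"
    "(Oc2ccc(cc2)N(c3cc(C)c(C)cc3)c4ccc(cc4)-"
    "c5ccccc5)O{+ninc=13-43;ich=13}{+scpq=c}"
)
print(count_nodes(terpolymer), sru_ranges(terpolymer))
```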
Nesting of SRUs
Example: unsaturated polyester with α,ω-alkanediyl
bridges
CurlySMILES notation:
C{-}(=O)OC{-}{+ninc=4;ich=4;n=5-9}OC(=O)C(C)=CCC{+ninc=1-13;ich=1}C
Encoding of polymer blends
Example: polystyrene/poly(methyl methacrylate) blend
CurlySMILES notation:
C{-}C{+n}c1ccccc1.C{-}C{+n}(C)C(=O)OC{mx}
Annotation {mx} indicates a compatible or incompatible mixture.
CurlySMILES encoding as a two-phase system (composite):
{/C{-}C{+n}c1ccccc1/C{-}C{+n}(C)C(=O)OC}
Encoding of polymer solutions
Example: poly(1-cyanoethylene) dissolved in
dihydrofuran-2(3H)-one (γ-butyrolactone)
CurlySMILES notation:
C{-}(C#N)C{+n}{dsc=O=C1OCCC1}
Annotation marker ds for dissolved
Key c for CurlySMILES notation with assigned value O=C1OCCC1
Encoding of doped polymers
Example: poly(1,4-phenylene sulfide) doped with
arsenic pentafluoride
CurlySMILES notation:
c1{-}ccc(cc1)S{+n}{IMc=F[As](F)(F)(F)F}
Annotation marker IM for impurity
Key c specifying dopant F[As](F)(F)(F)F
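The blend, solution and doped-polymer encodings all compose existing notations by string assembly. The helpers below reproduce the patterns shown on these slides. They are hypothetical convenience functions written for this sketch, not part of the CurlySMILES modules, and they do no validation of their inputs.

```python
def blend(*polymers):
    """Dot-separated components followed by the mixture annotation {mx}."""
    return ".".join(polymers) + "{mx}"


def composite(*phases):
    """Multi-phase system: {/phase1/phase2/...}."""
    return "{/" + "/".join(phases) + "}"


def solution(solute, solvent):
    """Append {dsc=...}: marker ds (dissolved), key c holding the solvent."""
    return f"{solute}{{dsc={solvent}}}"


def doped(host, dopant):
    """Append {IMc=...}: marker IM (impurity), key c holding the dopant."""
    return f"{host}{{IMc={dopant}}}"


polystyrene = "C{-}C{+n}c1ccccc1"
pmma = "C{-}C{+n}(C)C(=O)OC"
print(blend(polystyrene, pmma))
print(composite(polystyrene, pmma))
print(solution("C{-}(C#N)C{+n}", "O=C1OCCC1"))
print(doped("c1{-}ccc(cc1)S{+n}", "F[As](F)(F)(F)F"))
```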
Encoding of polymer sets
Example: poly[(alkylimino)methyleneimino-1,3-phenylene] with specified alkyl groups
CurlySMILES notation:
N{-}{+Rcc=C{-},CC{-},CCC{-},CC{-}C,CC{-}(C)C}CNc1cccc{+n}c1
Annotation marker +R for alkyl group insertion
Key cc for list of comma-separated CurlySMILES notations; here,
encoding the specified alkyl groups methyl, ethyl, n-propyl,
isopropyl and tert-butyl
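A set notation of this kind can be enumerated into its members. The sketch below splits the cc list and emits one notation per alkyl group, each keeping a single-member +R list. It is an illustration only: `expand_alkyl_set` is a hypothetical name, and whether a one-element cc list is the preferred way to write a single member is an assumption.

```python
def expand_alkyl_set(notation):
    """Return one notation per entry of the {+Rcc=...} list."""
    prefix = "{+Rcc="
    start = notation.find(prefix)
    if start < 0:
        raise ValueError("no {+Rcc=...} annotation found")
    # Find the brace that closes the +R annotation, skipping nested braces.
    depth = 0
    for end in range(start, len(notation)):
        if notation[end] == "{":
            depth += 1
        elif notation[end] == "}":
            depth -= 1
            if depth == 0:
                break
    else:
        raise ValueError("unbalanced curly braces")
    value = notation[start + len(prefix):end]
    # Split on commas that are not inside a nested annotation.
    groups, current, depth = [], [], 0
    for ch in value:
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        if ch == "," and depth == 0:
            groups.append("".join(current))
            current = []
        else:
            current.append(ch)
    groups.append("".join(current))
    return [notation[:start] + prefix + g + "}" + notation[end + 1:] for g in groups]


polymer_set = "N{-}{+Rcc=C{-},CC{-},CCC{-},CC{-}C,CC{-}(C)C}CNc1cccc{+n}c1"
for member in expand_alkyl_set(polymer_set):
    print(member)
```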
CurlySMILES in Python 3
Current iteratively tested implementations
● Modules to parse and analyze molecular-graph-based notations
and their annotations
● CANGEN-based methods for input-to-unique conversion of
notations (regular single-strands)
● Substructure and descriptor generation methods
● Programs to maintain and screen Axeleratio's in-house
bibliography of CurlySMILES-tagged literature, including
nanodevice and polymer publications.
Transformation of a CurlySMILES
notation based on node ranks
Example: poly[(2-propyl-1,3-dioxane-4,6-diyl)methylene]
Entered: C1{-}OC(CCC)OC(C1)C{+n}
Unique: C1{-}CC(OC(CCC)O1)C{+n}
The CH2 ring node ranks lower than the left O node; the CH2 tail node
ranks higher than the right O node.
Uniqueness depending on selection
of head/tail (H/T) pair
O{-}CC{+n} C{-}OC{+n} C{-}CO{+n}
poly(oxyethylene)
Nomenclature-conforming selection of head and tail
nodes is recommended in polymer encoding.
[see examples of unique notations for regular single-strand polymers]
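The three notations for poly(oxyethylene) are rotations of the same backbone. The sketch below generates every head/tail choice for an unbranched SRU and picks the one whose head atom is most senior. It is deliberately simplified: the function names are hypothetical, the seniority list O > S > N > C is a reduced stand-in for the full IUPAC rules, ties are resolved by taking the first candidate, and reversal of chain direction is not considered. This is not the CANGEN-based procedure mentioned earlier.

```python
SENIORITY = ["O", "S", "N", "C"]  # simplified assumption, most senior first


def head_tail_variants(backbone):
    """One notation per choice of head atom along a linear SRU backbone."""
    variants = []
    for i in range(len(backbone)):
        rotated = backbone[i:] + backbone[:i]
        variants.append(rotated[0] + "{-}" + "".join(rotated[1:]) + "{+n}")
    return variants


def preferred_variant(backbone):
    """Pick the rotation whose head atom ranks highest in SENIORITY."""
    variants = head_tail_variants(backbone)
    ranks = [SENIORITY.index(atom) for atom in backbone]
    return variants[ranks.index(min(ranks))]


print(head_tail_variants(["O", "C", "C"]))
print(preferred_variant(["C", "C", "O"]))
```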
Task-specific integration of
CurlySMILES modules
● Interfacing polymer structure (input/output)
Form-to-notation editors
Notation-to-sketch and notation-to-query software
● Pipelining polymer data (data administration)
Automatic ranking and comparison of structure/data pairs
Screening of structured lists and repositories
● Generating virtual libraries
Automatically building lists of polymer notations for QSPR
analysis and identification of optimal-design candidates
Application to polymer data mining:
“nurturing the mine sites”
SRU-based CurlySMILES notations in unique form are
identifiers of macromolecules and polymer systems that
can be employed to
• function as search keys in database applications,
• tag factsheets, notes and bibliographic entries,
• populate spreadsheet cells and XML text nodes,
• index and abstract the polymer literature & patents,
• create ontologies that organize polymer information,
... which can be shared via Semantic Web technologies.
Application to polymer data mining:
search and data extraction
The CurlySMILES language has a rich and extensible
dictionary to encode polymers in diverse contexts and at
various levels of detail.
Notations work both ways: as precise data annotations
and as query formulations for “needle-in-a-haystack”
requests.
Today's polymer knowledge systems are not marked up by
CurlySMILES. But client-server mediation can be
achieved, behind-the-scenes, via CurlySMILES code to
• compact polymer input provided through entry forms,
• expand notations into query language formats.
Application to polymer modeling
CurlySMILES representations of polymer systems contain
detailed structural information to derive macromolecular
descriptors and substructures (groups) as entry points
for property prediction and model development:
• Structure-property relationships (QSPRs, GCMs)
• SRU similarity (kNN and pattern recognition methods)
• MC & MD simulations (flexibility, solution behavior)
• Backbone modeling (polymer stability & degradation)
• Kinetic & ab initio methods (controlled polymerization)
Application to polymer design
Specialty polymers must meet multifaceted
requirements (multi-dimensional property windows).
The virtual design of polymers by permutationally
building (co)polymers (or blends) based on systematically
varied monomer structures often results in large
libraries of structurally related polymers with predictable
properties.
The automatic generation of the polymer structures of
such libraries as compact CurlySMILES notation and the
implementation of predictive methods for the desired
properties will allow virtual high-throughput screening
to initialize the synthesis of potential candidates.
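A toy version of such a library generator is sketched below. It builds substituted poly(ethylene) notations following the pattern of the homopolymer examples (substituent SMILES placed before C{-}C{+n}) and then pairs them into blends with the {mx} annotation. The substituent table, the generated names and the function names are hypothetical choices for illustration. Property prediction is out of scope here.

```python
from itertools import combinations

# Hypothetical substituent table: name fragment -> SMILES prefix
SUBSTITUENTS = {"hydroxy": "O", "cyano": "N#C", "chloro": "Cl"}


def vinyl_polymer(substituent_smiles):
    """Notation for -[CH(X)CH2]-n with X written before the head node."""
    return f"{substituent_smiles}C{{-}}C{{+n}}"


def build_library(substituents):
    """Return (homopolymers, blends) as name -> notation dictionaries."""
    homopolymers = {
        f"poly(1-{name}ethylene)": vinyl_polymer(smiles)
        for name, smiles in substituents.items()
    }
    blends = {
        f"{a}/{b} blend": f"{homopolymers[a]}.{homopolymers[b]}{{mx}}"
        for a, b in combinations(homopolymers, 2)
    }
    return homopolymers, blends


homopolymers, blends = build_library(SUBSTITUENTS)
for name, notation in {**homopolymers, **blends}.items():
    print(name, notation)
```

With k substituents this yields k homopolymers and k(k-1)/2 binary blends, which shows how quickly such libraries grow.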
Summary & Outlook
Done
● SRU annotations to encode
polymers
● Polymer description grammar
● Python implementation
To Do
● Stereochemical descriptions
● Unique notations for nested
polymers
● Conquering polymer space
Topics to be addressed for CurlySMILES applications
● Representation and iterative development of models for
structure/property estimation
● Extension to advanced architectures: dendrimers, 3D polymers
and nanostructure designs combining polymers with carbon
nanotubes and fullerene-based bowls and cages

Más contenido relacionado

La actualidad más candente

Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Anirudh Jayakumar
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2016
Daniel Shank, Data Scientist, Talla at MLconf SF 2016Daniel Shank, Data Scientist, Talla at MLconf SF 2016
Daniel Shank, Data Scientist, Talla at MLconf SF 2016MLconf
 
2D QSAR DESCRIPTORS
2D QSAR DESCRIPTORS2D QSAR DESCRIPTORS
2D QSAR DESCRIPTORSSmita Jain
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer modelsDing Li
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksNguyen Quang
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers Arvind Devaraj
 

La actualidad más candente (7)

Tutorial on Coreference Resolution
Tutorial on Coreference Resolution Tutorial on Coreference Resolution
Tutorial on Coreference Resolution
 
Daniel Shank, Data Scientist, Talla at MLconf SF 2016
Daniel Shank, Data Scientist, Talla at MLconf SF 2016Daniel Shank, Data Scientist, Talla at MLconf SF 2016
Daniel Shank, Data Scientist, Talla at MLconf SF 2016
 
2D QSAR DESCRIPTORS
2D QSAR DESCRIPTORS2D QSAR DESCRIPTORS
2D QSAR DESCRIPTORS
 
Natural language processing and transformer models
Natural language processing and transformer modelsNatural language processing and transformer models
Natural language processing and transformer models
 
Structure determination
Structure determinationStructure determination
Structure determination
 
Sequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural NetworksSequence to Sequence Learning with Neural Networks
Sequence to Sequence Learning with Neural Networks
 
NLP using transformers
NLP using transformers NLP using transformers
NLP using transformers
 

Destacado

Synthesis of composite polymer for industrial application
Synthesis of composite polymer for industrial applicationSynthesis of composite polymer for industrial application
Synthesis of composite polymer for industrial applicationeSAT Journals
 
Material science technology plastics
Material science technology   plasticsMaterial science technology   plastics
Material science technology plasticsHayati Haji Rithuwan
 
3D printing undergrad symposium 2016
3D printing undergrad symposium 20163D printing undergrad symposium 2016
3D printing undergrad symposium 2016Megan Detwiler
 
Anisoprint - 3D printing of continuous fiber reinforced plastics
Anisoprint - 3D printing of continuous fiber reinforced plasticsAnisoprint - 3D printing of continuous fiber reinforced plastics
Anisoprint - 3D printing of continuous fiber reinforced plasticsFedor Antonov
 
Adopt an element
Adopt an elementAdopt an element
Adopt an elementkaonders
 
Physical Properties of Copper
Physical Properties of CopperPhysical Properties of Copper
Physical Properties of CopperKhairul Bashar
 
POLYMER MODIFICATION WITH CARBON NANOTUBES
POLYMER MODIFICATION WITH CARBON NANOTUBESPOLYMER MODIFICATION WITH CARBON NANOTUBES
POLYMER MODIFICATION WITH CARBON NANOTUBESArjun K Gopi
 
Synthesis and characterization of nanocomposites
Synthesis and characterization of nanocompositesSynthesis and characterization of nanocomposites
Synthesis and characterization of nanocompositessowmya sankaran
 
Additive Manufacturing: 3D Printing--Past, Present, and Future
Additive Manufacturing: 3D Printing--Past, Present, and FutureAdditive Manufacturing: 3D Printing--Past, Present, and Future
Additive Manufacturing: 3D Printing--Past, Present, and Future360mnbsu
 
In-situ polymerization
In-situ polymerizationIn-situ polymerization
In-situ polymerizationArjun K Gopi
 
3D 프린팅 연구/리서치. 3D printing research korean version
3D 프린팅 연구/리서치. 3D printing research korean version3D 프린팅 연구/리서치. 3D printing research korean version
3D 프린팅 연구/리서치. 3D printing research korean versionJiwoo Seo
 
PREPARATION OF NANOCOMPOSITES
PREPARATION OF NANOCOMPOSITESPREPARATION OF NANOCOMPOSITES
PREPARATION OF NANOCOMPOSITESArjun K Gopi
 
Additive Manufacturing (2.008x Lecture Slides)
Additive Manufacturing (2.008x Lecture Slides)Additive Manufacturing (2.008x Lecture Slides)
Additive Manufacturing (2.008x Lecture Slides)A. John Hart
 
Properties, prossesing of natural fiber
Properties, prossesing of natural fiberProperties, prossesing of natural fiber
Properties, prossesing of natural fiberNoornabila Syuhada
 
Conducting polymers By Dheeraj Kumar
Conducting polymers By Dheeraj KumarConducting polymers By Dheeraj Kumar
Conducting polymers By Dheeraj KumarDheeraj Anshul
 

Destacado (20)

PATIL_ABHISHEK
PATIL_ABHISHEKPATIL_ABHISHEK
PATIL_ABHISHEK
 
Synthesis of composite polymer for industrial application
Synthesis of composite polymer for industrial applicationSynthesis of composite polymer for industrial application
Synthesis of composite polymer for industrial application
 
Material science technology plastics
Material science technology   plasticsMaterial science technology   plastics
Material science technology plastics
 
3D printing undergrad symposium 2016
3D printing undergrad symposium 20163D printing undergrad symposium 2016
3D printing undergrad symposium 2016
 
Anisoprint - 3D printing of continuous fiber reinforced plastics
Anisoprint - 3D printing of continuous fiber reinforced plasticsAnisoprint - 3D printing of continuous fiber reinforced plastics
Anisoprint - 3D printing of continuous fiber reinforced plastics
 
Adopt an element
Adopt an elementAdopt an element
Adopt an element
 
App request ru
App request ruApp request ru
App request ru
 
Physical Properties of Copper
Physical Properties of CopperPhysical Properties of Copper
Physical Properties of Copper
 
POLYMER MODIFICATION WITH CARBON NANOTUBES
POLYMER MODIFICATION WITH CARBON NANOTUBESPOLYMER MODIFICATION WITH CARBON NANOTUBES
POLYMER MODIFICATION WITH CARBON NANOTUBES
 
Additive manufacturing
Additive manufacturingAdditive manufacturing
Additive manufacturing
 
Synthesis and characterization of nanocomposites
Synthesis and characterization of nanocompositesSynthesis and characterization of nanocomposites
Synthesis and characterization of nanocomposites
 
Additive Manufacturing: 3D Printing--Past, Present, and Future
Additive Manufacturing: 3D Printing--Past, Present, and FutureAdditive Manufacturing: 3D Printing--Past, Present, and Future
Additive Manufacturing: 3D Printing--Past, Present, and Future
 
In-situ polymerization
In-situ polymerizationIn-situ polymerization
In-situ polymerization
 
3D 프린팅 연구/리서치. 3D printing research korean version
3D 프린팅 연구/리서치. 3D printing research korean version3D 프린팅 연구/리서치. 3D printing research korean version
3D 프린팅 연구/리서치. 3D printing research korean version
 
PREPARATION OF NANOCOMPOSITES
PREPARATION OF NANOCOMPOSITESPREPARATION OF NANOCOMPOSITES
PREPARATION OF NANOCOMPOSITES
 
Additive Manufacturing (2.008x Lecture Slides)
Additive Manufacturing (2.008x Lecture Slides)Additive Manufacturing (2.008x Lecture Slides)
Additive Manufacturing (2.008x Lecture Slides)
 
Properties, prossesing of natural fiber
Properties, prossesing of natural fiberProperties, prossesing of natural fiber
Properties, prossesing of natural fiber
 
Conducting polymers By Dheeraj Kumar
Conducting polymers By Dheeraj KumarConducting polymers By Dheeraj Kumar
Conducting polymers By Dheeraj Kumar
 
Plastics
PlasticsPlastics
Plastics
 
Plastics .ppt
Plastics .pptPlastics .ppt
Plastics .ppt
 

Similar a Application of CurlySMILES to the encoding of polymer systems

organicchemistry-151021071456-lva1-l.ppt
organicchemistry-151021071456-lva1-l.pptorganicchemistry-151021071456-lva1-l.ppt
organicchemistry-151021071456-lva1-l.pptMayur Malgear
 
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...openseesdays
 
Les17[1] Writing Executable Statements
Les17[1] Writing Executable StatementsLes17[1] Writing Executable Statements
Les17[1] Writing Executable Statementssiavosh kaviani
 
Md simulations modified
Md simulations modifiedMd simulations modified
Md simulations modifiedshahmeermateen
 
The advantages of thermoelectric power generation
The advantages of thermoelectric power generationThe advantages of thermoelectric power generation
The advantages of thermoelectric power generationAutomotive IQ
 
Automatic Construction of Nanotechnology Ontology
Automatic Construction of Nanotechnology OntologyAutomatic Construction of Nanotechnology Ontology
Automatic Construction of Nanotechnology OntologyAxel Peter MUSTAD
 
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...NextMove Software
 
03 plasma-surface-interaction-snyders-u mons
03 plasma-surface-interaction-snyders-u mons03 plasma-surface-interaction-snyders-u mons
03 plasma-surface-interaction-snyders-u monsSirris
 

Similar a Application of CurlySMILES to the encoding of polymer systems (10)

Oct 2011 ualr
Oct 2011 ualrOct 2011 ualr
Oct 2011 ualr
 
organicchemistry-151021071456-lva1-l.ppt
organicchemistry-151021071456-lva1-l.pptorganicchemistry-151021071456-lva1-l.ppt
organicchemistry-151021071456-lva1-l.ppt
 
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...
Deterioration Modelling of Structural Members Subjected to Cyclic Loading Usi...
 
Les17[1] Writing Executable Statements
Les17[1] Writing Executable StatementsLes17[1] Writing Executable Statements
Les17[1] Writing Executable Statements
 
Md simulations modified
Md simulations modifiedMd simulations modified
Md simulations modified
 
The advantages of thermoelectric power generation
The advantages of thermoelectric power generationThe advantages of thermoelectric power generation
The advantages of thermoelectric power generation
 
Automatic Construction of Nanotechnology Ontology
Automatic Construction of Nanotechnology OntologyAutomatic Construction of Nanotechnology Ontology
Automatic Construction of Nanotechnology Ontology
 
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
Reading and Writing Molecular File Formats for Data Exchange of Small Molecul...
 
Polymer Reaction Technique
Polymer Reaction Technique Polymer Reaction Technique
Polymer Reaction Technique
 
03 plasma-surface-interaction-snyders-u mons
03 plasma-surface-interaction-snyders-u mons03 plasma-surface-interaction-snyders-u mons
03 plasma-surface-interaction-snyders-u mons
 

Último

Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...Unit 1.1 Excite Part 1, class 9, cbse...
Unit 1.1 Excite Part 1, class 9, cbse...aditisharan08
 
Asset Management Software - Infographic
Asset Management Software - InfographicAsset Management Software - Infographic
Asset Management Software - InfographicHr365.us smith
 
Project Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationProject Based Learning (A.I).pptx detail explanation
Project Based Learning (A.I).pptx detail explanationkaushalgiri8080
 
why an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfwhy an Opensea Clone Script might be your perfect match.pdf
why an Opensea Clone Script might be your perfect match.pdfjoe51371421
 
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio Monthly Webinar | Cloud-Native Model Training on Distributed Data
Alluxio Monthly Webinar | Cloud-Native Model Training on Distributed DataAlluxio, Inc.
 
5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdf5 Signs You Need a Fashion PLM Software.pdf
5 Signs You Need a Fashion PLM Software.pdfWave PLM
 
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfThe Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdf
The Essentials of Digital Experience Monitoring_ A Comprehensive Guide.pdfkalichargn70th171
 
Building Real-Time Data Pipelines: Stream & Batch Processing workshop Slide
Building Real-Time Data Pipelines: Stream & Batch Processing workshop SlideBuilding Real-Time Data Pipelines: Stream & Batch Processing workshop Slide

Application of CurlySMILES to the encoding of polymer systems

  • 1. Application of CurlySMILES to the encoding of polymer systems
    Presented on August 16, 2015, at the ACS 250th National Meeting in Boston
    Division: Computers in Chemistry
    Session: Accelerated Discovery of Chemical Compounds: Design New Polymers & Inorganic Materials from Integration of Polymer Science, Materials Science & Informatics
    Axel Drefahl, axeleratio@gmail.com
    Axeleratio, Reno, Nevada
    www.axeleratio.com/axel/acs/boston2015/CurlySMILESpolymers.pdf
  • 2. What is CurlySMILES?
    Copyright © 2015 Axel Drefahl
    CurlySMILES is:
    ● a chemical language to capture, process and share nanostructures, based on molecular constitution, connectivity and arrangement;
    ● a line notation system integrating SMILES with atom- and molecule-anchored annotations, inserted via curly braces: {…};
    ● customizable by annotation: encoding of polymers, complexes, multi-phase systems, ...;
    ● available as a suite of Python 3 modules, including a notation parser and unique notation generator.
  • 3. Overview
    ● Current state of polymer informatics
    ● Brief introduction to CurlySMILES
    ● Encoding of Structural Repeat Units (SRUs)
    ● Encoding of single-strand polymers
    ● Encoding of multi-strand polymers
    ● Encoding of copolymers and miscellaneous polymer systems
    ● CurlySMILES software/task-specific integration
    ● Perspective: virtual polymer chemistry
  • 4. Cheminformatics (sub)domains
    Established informatics | Capturing & processing
    “Small molecules” | Molecular graph
    Crystalline solids | Unit cell, space group
    Peptides, DNAs, ... | Fragment sequence
    Evolving informatics | Capturing & processing
    Polymer systems | Struct. repeat unit (SRU)
    Nanomaterials | Nano-object (sphere, rod, ...)
    Material classes | Variable groups: R, X, Y, Z, ...
    Composites & design | Metalevel components
  • 5. Polymer informatics: approaches and tools
    ● IUPAC nomenclature & seniority rules (head-tail selection),
    ● S-group (superatom): SRU with crossing bonds and brackets (common representation, MDL, MarvinSketch),
    ● ThermoML with polymer block to specify compounds,
    ● Polymer Markup Language (PML),
    ● Polymer Informatics Knowledge System (PIKS) - PolyInfo database | “walled gardens” of polymer information,
    ● InChI polymer project (awaits implementation),
    ● CurlySMILES project: actively designs a human-machine interface for nanoarchitectures, including polymers, and develops open-source Python code.
  • 6. From SMILES to CurlySMILES
    SMILES: Simplified Molecular Input Line Entry System
    Published by David Weininger in 1988, doi: 10.1021/ci00057a005
    CurlySMILES: Curly-braces enhanced Smart Material Input Line Entry Specification
    Published by Axel Drefahl in 2011, doi: 10.1186/1758-2946-3-1
  • 7. CurlySMILES Motivation
    ● Chemical nomenclature and encoding languages typically employ idealized representations, while minor structural irregularities and impurities are ignored. CurlySMILES encoding enables their insertion via annotation, if desired.
    ● A molecular-graph-derived notation is often taken to represent molecule and substance interchangeably. CurlySMILES employs molecule multipliers and allows for phase distinction, for example, by using state and shape annotations such as lq, tf, am, cr, np, ...
    ● Variability of detail: stoichiometric formula notation (SFN)
    ● Encoding of molecular arrangements: hydrogen-bonded molecules, complexes, macromolecules and other nanoassemblies.
  • 8. Format of curly-enclosed annotations in CurlySMILES
    {AMk1=v1;...;kn=vn}
    AM is a one-char or two-char annotation marker; a two-char AM may be followed by an annotation dictionary, a semicolon-separated list of key/value (ki/vi) pairs. Keys are predefined, but extensible by customization ($ prefix).
    Example: n-octanethiol functionalized gold nanoparticle dispersed in toluene (●-SCH2(CH2)6CH3 in toluene)
    S{-|c=[Au]{np}}CCCCCCCC{dpc=Cc1ccccc1}
    AMs: -| for surface-attached, dp for dispersed, np for nanoparticle.
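The annotation format on slide 8 can be illustrated with a small stand-alone parser. This is a minimal sketch and not the CurlySMILES project's own parser: the function name `parse_annotation` is hypothetical, and the rule that any body longer than two characters is a two-char marker followed by a dictionary is an assumption read off the slide's wording.

```python
def parse_annotation(text):
    """Split one curly-enclosed annotation into (marker, dictionary).

    Hypothetical helper for illustration; not part of the CurlySMILES modules.
    """
    if len(text) < 3 or text[0] != "{" or text[-1] != "}":
        raise ValueError(f"not a curly-enclosed annotation: {text!r}")
    body = text[1:-1]
    # Assumption: one- and two-character bodies are bare markers.
    if len(body) <= 2:
        return body, {}
    marker, rest = body[:2], body[2:]

    # Split on semicolons that are not nested inside inner curly braces,
    # because a value may itself be an annotated notation.
    parts, depth, start = [], 0, 0
    for i, ch in enumerate(rest):
        if ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
        elif ch == ";" and depth == 0:
            parts.append(rest[start:i])
            start = i + 1
    parts.append(rest[start:])

    entries = {}
    for part in parts:
        # Only the first "=" separates key and value, so "b==" maps b to "=".
        key, sep, value = part.partition("=")
        if not sep or not key:
            raise ValueError(f"expected key=value, got {part!r}")
        entries[key] = value
    return marker, entries


if __name__ == "__main__":
    for example in ("{-|c=[Au]{np}}", "{dpc=Cc1ccccc1}", "{np}"):
        print(example, "->", parse_annotation(example))
```

Tracking brace depth while splitting keeps a nested value such as `[Au]{np}` intact instead of cutting it at an inner delimiter.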
  • 9. Annotated Molecular Graph
    Example: (Z)-but-1-ene-1,4-diyl substructure
    CurlySMILES: C{-}=C{Z}CC{-}
    Atom-anchored annotations:
    Structural unit annotation (pendent single bond): {-}
    Stereodescriptive annotation: {Z}
  • 10. Poly[(Z)-but-1-ene-1,4-diyl]
    CurlySMILES: C{-}=C{Z}CC{+n}
    Atom-anchored annotations:
    Structural unit annotation: {-} at head node
    Stereodescriptive annotation: {Z}
    Operational annotation: {+n} at tail node
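To make the idea of atom-anchored annotations concrete, the following sketch separates a notation into its brace-free character string and the list of annotations with their anchor positions. It is only an illustration under simple assumptions: `split_annotations` is a hypothetical name, the anchor is reported as a character offset rather than as an atom index, and the brace-free string is not claimed to be a meaningful SMILES on its own.

```python
def split_annotations(notation):
    """Return (bare_text, annotations) for a CurlySMILES-like string.

    annotations is a list of (offset, text) pairs: each annotation directly
    follows bare_text[:offset]. Hypothetical helper for illustration only.
    """
    bare, found = [], []
    depth, start = 0, 0
    for i, ch in enumerate(notation):
        if ch == "{":
            if depth == 0:
                start = i  # a top-level annotation opens here
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                raise ValueError("unbalanced '}' in notation")
            if depth == 0:
                found.append((len(bare), notation[start:i + 1]))
        elif depth == 0:
            bare.append(ch)  # character outside any annotation
    if depth != 0:
        raise ValueError("unbalanced '{' in notation")
    return "".join(bare), found


if __name__ == "__main__":
    print(split_annotations("C{-}=C{Z}CC{+n}"))
```

Nested braces, as in the gold nanoparticle example, are kept inside the enclosing top-level annotation.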
  • 11. Does CurlySMILES encode macromolecules or polymers?
    Answer: both (user choice). CurlySMILES comes with a rich annotation dictionary to encode chain length variation and phases.
    A macromolecule is a single molecule. The term “polymer” can mean “macromolecule” or a “substance” composed thereof, typically with a “degree of polymerization” (DOP) range.
    An oligomer or a macromolecule of a specific length is encoded based on the chain graph, i.e. the SRU graph, using annotation dictionary key n: {+nn=10} for ten occurrences of the SRU.
    A polymer is encoded by leaving out n (generic polymer). The key dpr may specify a DOP range: {+ndpr=gt250}. AMs such as am (amorphous) or cr (crystalline) indicate a particular polymer phase.
    A polymer system is encoded by additional annotations specifying, for example, impurities, additives and solvents.
  • 12. Tail node annotations to formally construct polymers
    {+n} anchored at tail node of divalent SRU to build single-strand polymer via head-tail single-bond connection.
    {+r} anchored at tail node of SRU to build single-strand macrocycle (last tail node connects first head node).
    {+m} anchored at tail node of non-single-bond or multivalent SRU to build multibond/multi-strand polymer via specified head-tail connection using key ich to provide index of corresponding head node.
    {+s} anchored at tail node of the last (right-most) SRU in a copolymer sequence to provide copolymer details; for example, a copolymer qualifier via key cpq to specify an alternating, block or random sequence.
  • 13. CurlySMILES notations of some common single-strand homopolymers
    Structure-Based Name | Structural Formula | CurlySMILES Notation
    Poly(oxymethylene) | -[OCH2]-n | O{-}C{+n}
    Poly(iminoethylene) | -[NHCH2CH2]-n | N{-}CC{+n}
    Poly(1-hydroxyethylene) | -[CH(OH)CH2]-n | OC{-}C{+n}
    Poly(1-cyanoethylene) | -[CH(CN)CH2]-n | N#CC{-}C{+n}
    Poly(1,1-difluoroethylene) | -[CF2CH2]-n | FC{-}(F)C{+n}
    Poly(1-phenylethylene) | -[CH(Ph)CH2]-n | C{-}(c1ccccc1)C{+n}
    Poly(oxy-1,4-phenylene) | -[O-paraPh]-n | O{-}c1ccc{+n}cc1
    Poly(methylene) | -[CH2]-n | C{-}{+n}
    Poly(difluoromethylene) | -[CF2]-n | FC{-}{+n}F
  • 14. Polydispersity characterization
    With the exception of the dimensionless pdi, units are kg/mol.
    Example: {+npMn=89.2} to encode a single-strand polymer with a number-average molar mass of 89.2 kg/mol
    Key | Symbol | Meaning | ThermoML tag name
    pMn | Mn | Number-average molar mass | nNumberAvgMolWt
    pMm | Mm | Mass-average molar mass | nWeightAvgMolWt
    pMz | Mz | z-Average molar mass | nZAvgMolWt
    pMv | Mv | Viscosity-average molar mass | nViscosityAvgMolWt
    pMp | Mp | Peak molar mass | nPeakAvgMolWt (?)
    pdi | Mm/Mn | Polydispersity index | nPolydispersityIndex
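As a worked example of the keys on slide 14, the sketch below derives pdi = Mm/Mn from two molar masses given in kg/mol and writes them into a single {+n...} annotation. The function name is hypothetical, and two points are assumptions rather than documented CurlySMILES behavior: that several of these keys may share one annotation (inferred from the semicolon-separated dictionary format on slide 8) and that numbers are written in Python's general `g` format.

```python
def molar_mass_annotation(mn_kg_per_mol, mm_kg_per_mol):
    """Build a {+n...} annotation carrying pMn, pMm and the derived pdi.

    Hypothetical helper; key combination and number format are assumptions.
    """
    if mn_kg_per_mol <= 0:
        raise ValueError("number-average molar mass must be positive")
    if mm_kg_per_mol < mn_kg_per_mol:
        # The mass average can never be smaller than the number average.
        raise ValueError("mass-average molar mass must be >= number average")
    pdi = mm_kg_per_mol / mn_kg_per_mol  # dimensionless polydispersity index
    return f"{{+npMn={mn_kg_per_mol:g};pMm={mm_kg_per_mol:g};pdi={pdi:g}}}"


if __name__ == "__main__":
    # Poly(oxymethylene) SRU from slide 13 with the tail annotation replaced.
    print("O{-}C" + molar_mass_annotation(89.2, 178.4))
```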
  • 15. Anionic homopolymer with monoatomic cations
    Example: poly(sodium 1-carboxylatoethylene)
    CurlySMILES: O=C([O-]{+Cc=[Na+]})C{-}C{+n}
    The operational annotation marker +C is used to include [Na+] as counterion to [O-]. [Na+] is part of the repeat unit.
  • 16. Homopolymer with terminating groups at head and tail
    Example: poly(ethylene terephthalate) by esterification of terephthalic acid with ethylene glycol
    [H]O{-}CCOC(=O)c1ccc(cc1)C{+ninc=2-15;ich=2}(=O)OCCO
    Nodes 2 to 15 are parts of SRU. Node 1 makes the head terminus and nodes 16-20 belong to tail end group.
  • 17. Cyclic polymers or oligomers
    Example: cyclic poly(silaether)
    [Si]{-}(C)(C)[Si](C)(C)O{+rn=24}
    Shortcut for a long SMILES notation: [Si]1(C)(C)[Si](C)(C)O...[Si](C)(C)[Si](C)(C)O1
    Such cyclic poly(silaether)s are obtained, for example, as by-products while making their linear homologs by ring-opening polymerization of octamethyl-1,4-dioxatetrasilacyclohexane [doi: 10.1021/ma00086a048].
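The shortcut on slide 17 can be spelled out programmatically. The sketch below expands a {+r...} notation with a numeric n into the long ring-closed SMILES string. It is a simplified illustration, not the CurlySMILES implementation: `expand_macrocycle` is a hypothetical name, and the code assumes that the head node is the first atom written, the tail node is the last atom written, head and tail are joined by a single bond, and the SRU contains no ring closures of its own.

```python
import re

ANNOTATION = re.compile(r"\{[^{}]*\}")        # non-nested annotation
TAIL_RING = re.compile(r"\{\+r([^{}]*)\}$")   # trailing {+r...} annotation
BRACKET_ATOM = re.compile(r"\[[^\]]+\]")
FIRST_ATOM = re.compile(r"\[[^\]]+\]|Cl|Br|[BCNOPSFI]|[bcnops]")


def expand_macrocycle(notation, ring_label="1"):
    """Expand e.g. '[Si]{-}(C)(C)[Si](C)(C)O{+rn=2}' into plain SMILES.

    Hypothetical helper covering only the simple case described above.
    """
    tail = TAIL_RING.search(notation)
    if tail is None:
        raise ValueError("expected the notation to end with a {+r...} annotation")
    entries = dict(item.split("=", 1) for item in tail.group(1).split(";") if item)
    if "n" not in entries:
        raise ValueError("key n (number of SRUs) is required for expansion")
    n = int(entries["n"])
    if n < 1:
        raise ValueError("n must be at least 1")

    # Remove the remaining annotations (such as {-}) to get the bare SRU.
    sru = ANNOTATION.sub("", notation[:tail.start()])
    if re.search(r"[\d%]", BRACKET_ATOM.sub("", sru)):
        raise ValueError("SRUs with their own ring closures are not handled")
    head = FIRST_ATOM.match(sru)
    if head is None:
        raise ValueError("could not find the head atom")

    # Open the ring bond right after the head atom, close it after the
    # last tail atom, and place n - 1 further copies of the SRU in between.
    first_unit = head.group(0) + ring_label + sru[head.end():]
    return first_unit + sru * (n - 1) + ring_label


if __name__ == "__main__":
    print(expand_macrocycle("[Si]{-}(C)(C)[Si](C)(C)O{+rn=2}"))
```

With n=24 the result starts with `[Si]1(C)(C)` and ends with `O1`, which matches the elided long form shown on the slide.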
  • 18. Surface-grafted functional oligomer
    Example: polyacrylamide brush grown on silicon
    N{-|c=[Si]}C(=O)c1ccc(cc1)CCC{-}C{+ninc=12-16;ich=12}C(=O)N
    Group environment annotation -| for bond to substrate
    Growth of such polyacrylamide brushes on a silicon wafer is studied to understand how to reduce or prevent microbial adhesion on surfaces by chemical surface modification [doi: 10.1021/la063531v].
  • 19. Regular double-strand polymers: chain of formally fused cycloalkane rings
    Example: poly(butane-1,4:3,2-tetrayl)
    CurlySMILES notation: C{-}C{+mich=1}C{+mich=4}C{-}
    two head nodes: C{-}, two tail nodes: C{+m}
    For IUPAC nomenclature of this polymer see A Brief Guide to Polymer Nomenclature.
  • 20. Regular double-strand polymers: chain of formally fused heterocycles
    Example: poly(2,4-dimethyl-1,3,5-trioxa-2,4-disilapentane-1,5:4,2-tetrayl)
    CurlySMILES notation: O{-}[Si]{+mich=1}(C)O[Si]{+mich=7}(C)O{-}
    two head nodes: O{-}, two tail nodes: [Si]{+m}
    For IUPAC nomenclature of this polymer see page 1573 in http://old.iupac.org/publications/pac/1993/pdf/6507x1561.pdf.
  • 21. Double bond between head and tail
    Example: poly(piperidine-3,5-diylideneethanediylidene)
    CurlySMILES notations:
    A: C1{=}CNCC(C1)=CC{+mich=1;b==}
    B: C{-}C1CNCC(C1)=C{+n}
    Both notations encode correct atom connectivity. In the IUPAC-compliant notation A, key b specifies = as bond between tail and head.
    For IUPAC nomenclature of this polymer see page 1941 in Nomenclature of Regular Single-Strand Organic Polymers.
  • 22. Encoding with copolymer qualifiers
    Example: poly(styrene-co-isoprene)
    CurlySMILES notation of above example: C{-}C{+ninc=1-8;ich=1}(c1ccccc1)C{-}C=C(C)C{+ninc=9-13;ich=9}{+scpq=c}
    Copolymer qualifiers:
    cpq | Qualifier | Meaning
    a | alt | alternating
    b | block | block
    c | co | generic
    g | graft | graft
    p | per | periodic
    r | ran | random
    s | stat | statistical
  • 23. Encoding of a terpolymer
    Example: poly[methyl-N-(3,4-dimethylphenyl)-N-(4-biphenyl)-N-(4-phenyloxy)siloxane-co-phenylmethylsiloxane-co-methylhydrosiloxane]
    c1ccccc1[Si]{-}(C)O{+ninc=1-9;ich=7}[SiH]{-}(C)O{+ninc=10-12;ich=10}[Si]{-}(C)(Oc2ccc(cc2)N(c3cc(C)c(C)cc3)c4ccc(cc4)-c5ccccc5)O{+ninc=13-43;ich=13}{+scpq=c}
    For more about this terpolymer see doi: 10.1021/ma202041u.
  • 24. Nesting of SRUs
    Example: unsaturated polyester with α,ω-alkanediyl bridges
    CurlySMILES notation: C{-}(=O)OC{-}{+ninc=4;ich=4;n=5-9}OC(=O)C(C)=CCC{+ninc=1-13;ich=1}C
  • 25. Encoding of polymer blends
    Example: polystyrene/poly(methyl methacrylate) blend
    CurlySMILES notation: C{-}C{+n}c1ccccc1.C{-}C{+n}(C)C(=O)OC{mx}
    Annotation {mx} indicates a compatible or incompatible mixture.
    CurlySMILES encoding as a two-phase system (composite): {/C{-}C{+n}c1ccccc1/C{-}C{+n}(C)C(=O)OC}
  • 26. Encoding of polymer solutions
    Example: poly(1-cyanoethylene) dissolved in dihydrofuran-2(3H)-one (γ-butyrolactone)
    CurlySMILES notation: C{-}(C#N)C{+n}{dsc=O=C1OCCC1}
    Annotation marker ds for dissolved
    Key c for CurlySMILES notation with assigned value O=C1OCCC1
  • 27. Encoding of doped polymers
    Example: poly(1,4-phenylene sulfide) doped with arsenic pentafluoride
    CurlySMILES notation: c1{-}ccc(cc1)S{+n}{IMc=F[As](F)(F)(F)F}
    Annotation marker IM for impurity
    Key c specifying dopant F[As](F)(F)(F)F
  • 28. Encoding of polymer sets
    Example: poly[(alkylimino)methyleneimino-1,3-phenylene] with specified alkyl groups
    CurlySMILES notation: N{-}{+Rcc=C{-},CC{-},CCC{-},CC{-}C,CC{-}(C)C}CNc1cccc{+n}c1
    Annotation marker +R for alkyl group insertion
    Key cc for list of comma-separated CurlySMILES notations; here, encoding the specified alkyl groups methyl, ethyl, n-propyl, isopropyl and tert-butyl
  • 29. CurlySMILES in Python 3
    Current iteratively tested implementations
    ● Modules to parse and analyze molecular-graph-based notations and their annotations
    ● CANGEN-based methods for input-to-unique conversion of notations (regular single-strands)
    ● Substructure and descriptor generation methods
    ● Programs to maintain and screen Axeleratio's in-house bibliography of CurlySMILES-tagged literature, including nanodevice and polymer publications.
  • 30. Transformation of a CurlySMILES notation based on node ranks
    Example: poly[(2-propyl-1,3-dioxane-4,6-diyl)methylene]
    Entered: C1{-}OC(CCC)OC(C1)C{+n}
    Unique: C1{-}CC(OC(CCC)O1)C{+n}
    The CH2 ring node ranks lower than the left O node; the CH2 tail node ranks higher than the right O node.
  • 31. Uniqueness depending on selection of head/tail (H/T) pair
    Three notations for poly(oxyethylene): O{-}CC{+n}, C{-}OC{+n}, C{-}CO{+n}
    Nomenclature-conforming selection of head and tail nodes is recommended in polymer encoding.
    [see examples of unique notations for regular single-strand polymers]
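The three poly(oxyethylene) notations on slide 31 arise from the different ways of cutting the same repeating chain. The sketch below enumerates these head/tail choices for the simplest case only: an unbranched SRU given as a list of atom symbols joined by single bonds. The function name is hypothetical, and the code does not attempt to pick the nomenclature-conforming or unique form, which depends on seniority and ranking rules not reproduced here.

```python
def head_tail_variants(backbone):
    """List the {-}/{+n} notations obtainable from one unbranched SRU.

    backbone: atom symbols in chain order, e.g. ["O", "C", "C"].
    Hypothetical helper; handles single-bonded, unbranched SRUs only.
    """
    if not backbone:
        raise ValueError("backbone must contain at least one atom")
    variants = set()
    # The chain may be read in either direction ...
    for sequence in (list(backbone), list(reversed(backbone))):
        # ... and cut between any two neighbouring atoms.
        for shift in range(len(sequence)):
            rotated = sequence[shift:] + sequence[:shift]
            head, rest = rotated[0], "".join(rotated[1:])
            variants.add(f"{head}{{-}}{rest}{{+n}}")
    return sorted(variants)


if __name__ == "__main__":
    print(head_tail_variants(["O", "C", "C"]))
```

For a single-atom backbone the same formula yields the `C{-}{+n}` form that slide 13 lists for poly(methylene).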
  • 32. Task-specific integration of CurlySMILES modules
    ● Interfacing polymer structure (input/output)
      Form-to-notation editors
      Notation-to-sketch and notation-to-query software
    ● Pipelining polymer data (data administration)
      Automatic ranking and comparison of structure/data pairs
      Screening of structured lists and repositories
    ● Generating virtual libraries
      Automatically building lists of polymer notations for QSPR analysis and identification of optimal-design candidates
  • 33. Application to polymer data mining: “nurturing the mine sites”
    SRU-based CurlySMILES notations in unique form are identifiers of macromolecules and polymer systems that can be employed to
    • function as search keys in database applications,
    • tag factsheets, notes and bibliographic entries,
    • populate spreadsheet cells and XML text nodes,
    • index and abstract the polymer literature & patents,
    • create ontologies that organize polymer information,
    ... which can be shared via Semantic Web technologies.
  • 34. Application to polymer data mining: search and data extraction
    The CurlySMILES language has a rich and extensible dictionary to encode polymers in diverse contexts and at various levels of detail. Notations work both ways: as precise data annotations and as query formulations for “needle-in-the-haystack” requests.
    Today's polymer knowledge systems are not marked up by CurlySMILES. But client-server mediation can be achieved, behind the scenes, via CurlySMILES code to
    • compact polymer input provided through entry forms,
    • expand notations into query language formats.
  • 35. Application to polymer modeling
    CurlySMILES representations of polymer systems contain detailed structural information to derive macromolecular descriptors and substructures (groups) as entry points for property prediction and model development:
    • Structure-property relationships (QSPRs, GCMs)
    • SRU similarity (kNN and pattern recognition methods)
    • MC & MD simulations (flexibility, solution behavior)
    • Backbone modeling (polymer stability & degradation)
    • Kinetic & ab initio methods (controlled polymerization)
  • 36. Application to polymer design
    Specialty polymers must meet multifaceted requirements (multi-dimensional property windows).
    The virtual design of polymers by permutationally building (co)polymers (or blends) based on systematically varied monomer structures often results in large libraries of structurally related polymers with predictable properties.
    The automatic generation of the polymer structures of such libraries as compact CurlySMILES notations and the implementation of predictive methods for the desired properties will allow virtual high-throughput screening to initialize the synthesis of potential candidates.
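The library-building idea on slide 36 can be sketched with plain string templates. The example below varies the substituent of a 1-substituted polyethylene, following the `C{-}(X)C{+n}` pattern used for poly(1-phenylethylene) on slide 13, and pairs the resulting homopolymers into binary blends using the dot and {mx} pattern from slide 25. Function names and the substituent list are illustrative assumptions; the generated strings are neither validated nor converted to unique form, and no property prediction is attempted.

```python
from itertools import combinations


def vinyl_polymer(substituent):
    """Notation for poly(1-X-ethylene); substituent is a SMILES fragment.

    Hypothetical template following the C{-}(X)C{+n} pattern.
    """
    return f"C{{-}}({substituent})C{{+n}}"


def blend_library(substituents):
    """All binary blends of the homopolymers built from the substituents."""
    polymers = [vinyl_polymer(s) for s in substituents]
    # Each unordered pair appears once; {mx} marks the mixture.
    return [f"{a}.{b}{{mx}}" for a, b in combinations(polymers, 2)]


if __name__ == "__main__":
    # Hydroxy, cyano and phenyl substituents (illustrative choice).
    for notation in blend_library(["O", "C#N", "c1ccccc1"]):
        print(notation)
```

Because the notations are plain strings, such a list could be written directly into spreadsheet cells or used as search keys, in line with the data-mining uses named on slide 33.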
  • 37. Summary & Outlook
    Done
    ● SRU annotations to encode polymers
    ● Polymer description grammar
    ● Python implementation
    To Do
    ● Stereochemical descriptions
    ● Unique notations for nested polymers
    ● Conquering polymer space
    Topics to be addressed for CurlySMILES applications
    ● Representation and iterative development of models for structure/property estimation
    ● Extension to advanced architectures: dendrimers, 3D polymers and nanostructure designs combining polymers with carbon nanotubes and fullerene-based bowls and cages