Towards Lensfield: data management, processing and semantic publication for vernacular e-science

Nick Day, Jim Downing, Lezan Hawizy, Nico Adams and Peter Murray-Rust
Unilever Centre for Molecular Science Informatics, University of Cambridge

This presentation: CC-By-SA Jim Downing
Linked Data

CC-By-SA-NC jmelchio
CC Images from Flickr
Selling Linked Data

Make it transparent
Make it easy

CC-By mrslogic
Selling Linked Data



    Citations
Selling Linked Data


• Visualizations
• Data management
• Automation
Demo
http://code.google.com/p/lensfield/
Lensfield Principles

• Make it easier to do the right thing
• Vernacular
• KISS and Embrace constraints
Constraints

• Work on the desktop without
  infrastructure installation
• Processing tasks could be anything and
  aren’t predictable
Re-use
Jumbo-Converters
• Library of chemistry file format
  converters, semantifiers and enhancers
• Part of the CML Java libraries
• http://sourceforge.net/projects/cml/
Version Control
• Mercurial
 • Excellent support for experimentation
 • Backup to remote machine
 • P2P sharing
• Track script changes with data
• Automatically ignore deterministic intermediates
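
The "automatically ignore deterministic intermediates" idea can be sketched as follows. This is only an illustration, not Lensfield's actual mechanism: the `Step` class, its fields and the `hgignore_text` function are hypothetical names. The only real detail it relies on is Mercurial's `.hgignore` format, where a `syntax: glob` line is followed by one pattern per line.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    """One processing step (hypothetical model, not Lensfield's own)."""
    name: str
    output_glob: str      # where the step writes its results
    deterministic: bool   # True if re-running always gives the same output


def hgignore_text(steps):
    """Build .hgignore content that excludes reproducible outputs.

    Outputs of deterministic steps can be regenerated from tracked
    inputs and scripts, so they need not be versioned.
    """
    lines = ["syntax: glob"]
    lines += sorted({s.output_glob for s in steps if s.deterministic})
    return "\n".join(lines) + "\n"
```

Writing the returned text to `.hgignore` at the repository root would keep the raw data, the scripts and any non-deterministic results under version control while leaving regenerable files out.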
Build metaphor

• Describing state transitions rather than
  process is better for provenance tracking
• Alternative to graphical programming
  languages / workflow packages
  • Hard problems are re-use and
    comprehension
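
A minimal sketch of the build metaphor, assuming a make-like model in which each target is declared as a transition from one source file through one step. The function names and the JSON provenance log are assumptions made for illustration and are not taken from Lensfield itself.

```python
import hashlib
import json
from pathlib import Path


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's bytes."""
    return hashlib.sha256(path.read_bytes()).hexdigest()


def build(source: Path, target: Path, step, log: Path) -> bool:
    """Bring `target` up to date with `source`; return True if `step` ran.

    `step` is any function from input text to output text. The log
    records which input state and which step produced each target.
    """
    records = json.loads(log.read_text()) if log.exists() else {}
    digest = file_digest(source)
    previous = records.get(str(target))
    if target.exists() and previous and previous["input_sha256"] == digest:
        return False  # target already reflects this exact input state
    target.write_text(step(source.read_text()))
    records[str(target)] = {
        "input": str(source),
        "input_sha256": digest,
        "step": step.__name__,
    }
    log.write_text(json.dumps(records, indent=2))
    return True
```

Calling `build` twice on an unchanged source runs the step only once; editing the source triggers a rebuild. Because the log stores the input digest and step name for every target, the provenance of each derived file can be read back without replaying a workflow.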
Clojure
• Strong on concurrency
 • Functional
 • Software Transactional Memory
• Lisp
 • Snapshots, pause and resume,
    continuations
Future Development

• Templated Parameter Sweeps & sensitivity
  analysis
• Design of Experiments
• Multicore performance testing
• Grid processing
http://fascinator.usq.edu.au/
Users
• CLARION project
 • Embargo management and publication of
    Electronic Lab Notebook data.
• OREChem
 • Distributed chemistry eScience using
    Linked Data.
• Computational Chemical engineering
http://code.google.com/p/lensfield/

CC-By-NC ilonameagher

Users
 You? ... to use Lensfield!
Thanks
 Colleagues                       Funds
  Nick Day
  John Aspden
  Lezan Hawizy
  Peter Murray-Rust
Collaboration and Inspiration
  Nico Adams (Dept of Genetics, Cambridge)
  Jerry Winter (Unilever)
  Noel Ruddock (Unilever)
  Markus Kraft, Weerapong Phadungsukanan (Chemical Engineering, Cambridge)
