From Print to the Cloud and Beyond
The Story of a Century Old Company and its Resiliency to Ever-Evolve
Agenda
• CAS Overview
• CAS - In the Beginning… There was Print
• CAS - The Age of Silos
• CAS - IBM Integration. To the Cloud… and Beyond
• Future Considerations
CAS is a division of the American Chemical Society. Copyright 2015 American Chemical Society. All rights reserved.
CAS has been supporting scientists for more than 100 years
• CAS helps scientists around the world benefit from the published work of their colleagues by monitoring, abstracting and indexing the world's chemistry-related literature
CAS helps scientists around the world benefit from the published work of their colleagues
• Since 1907, CAS’s objective has been to find, collect, and organize all publicly disclosed chemistry substance information
• CAplus℠
• CAS REGISTRY℠
• CHEMLIST®
• CIN®
• CAS scientists monitor, abstract and index the world's chemistry-related literature
• Proprietary, standardized indexing in CAS databases ensures consistent, comprehensive search results.
• MARPAT®
• CHEMCATS®
• CASREACT®
[Diagram labels: Source Selection, Document Indexing, Reaction Indexing, Markush Indexing, Authority Processing]
CAS products and services make it faster and easier for scientists to find the information they need for their research
• CAS Registry Numbers® uniquely identify each chemical substance without the ambiguity of multiple naming conventions
• STN® combines industry-leading search and retrieval with unique and comprehensive content
• SciFinder® offers a one-stop shop experience with flexible search and discovery options based on user input and workflow
• Science IP®, the CAS information search service, provides fast, comprehensive and accurate searches of the world’s scientific and technical literature
CAS Registry Number 58-08-2
CAFFEINE!
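The Registry Number on this slide can be checked mechanically. A CAS Registry Number ends in a single check digit: the other digits are weighted 1, 2, 3, … from right to left, summed, and reduced modulo 10. The sketch below applies that rule to caffeine; the function name `is_valid_cas_rn` is illustrative and not part of any CAS product.

```python
import re


def is_valid_cas_rn(rn: str) -> bool:
    """Return True if rn is well formed and its check digit is correct."""
    match = re.fullmatch(r"(\d{2,7})-(\d{2})-(\d)", rn)
    if match is None:
        return False
    digits = match.group(1) + match.group(2)
    check = int(match.group(3))
    # Weight the digits 1, 2, 3, ... starting from the rightmost one.
    total = sum(w * int(d) for w, d in enumerate(reversed(digits), start=1))
    return total % 10 == check


print(is_valid_cas_rn("58-08-2"))  # caffeine -> True
```

For 58-08-2 the weighted sum is 8×1 + 0×2 + 8×3 + 5×4 = 52, and 52 mod 10 = 2, which matches the final digit.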
CAS Timeline
108 Years of Progress (and Counting)
CAS End-To-End Architecture
“In the Beginning… There was Print”
Data Ingestion
Data Transformation
Data Validation
Data Curation
Data Integration
Data Presentation
“CAS Knows Jack”
Jack and Friends Beside Printed Chemical Abstracts
CAS End-To-End Architecture
“The Age of Silos”
• Data Ingestion
• Data Transformation
• Data Validation
• Data Normalization
• Data Persistence

• Data Ingestion
• Data Transformation
• Data Validation
• Data Curation
• Data Integration
• Data Persistence

• Data Transformation
• Data Validation
• Data Integration
• Data Presentation
Silo Challenges
• Multiple Data Ingestion Points
– In some cases, the same data is ingested twice
• Multiple Views of the Data
– Each silo must perform complex transformations to its specific view
– Editorial manufactures normalized data based on a print model
– Product Development wants de-normalized, complete data
– Content Delivery has a mixed view of the data
• Multiple Vocabulary Conventions
– Differing data definitions cause confusion across silos
• No Unified, Authority Data Store
– Each silo has its own copy of the data in its own specific vocabulary
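To make the normalized versus de-normalized distinction concrete, here is a minimal sketch. Every record shape and field name (`documents`, `substances`, `doc_substances`, the sample title) is hypothetical and not a CAS schema; the point is only that a consumer wanting complete records has to join normalized data itself.

```python
# Hypothetical normalized data: each fact is stored once and linked by key.
documents = {1: {"title": "Stimulants in beverages"}}
substances = {"58-08-2": {"name": "caffeine"}}
doc_substances = [(1, "58-08-2")]  # link table: document id -> substance key


def denormalize(doc_id: int) -> dict:
    """Compile one complete, self-contained record for a document."""
    record = dict(documents[doc_id])
    record["substances"] = [
        {"rn": rn, **substances[rn]}
        for linked_doc, rn in doc_substances
        if linked_doc == doc_id
    ]
    return record


print(denormalize(1))
```

When each silo repeats a join like this against its own copy and vocabulary, the transformation logic is duplicated once per silo.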
Editorial Legacy Systems
• Many disparate databases used to store relational data
– Difficult to maintain and support
• Multiple database technologies used
– No unified platform
• Challenges in supporting legacy systems
– Some legacy technologies are no longer supported
– Succession planning for legacy-system support is difficult
– Special IT used so that legacy code would not need to be touched
Content Delivery Systems
• Data was transformed into one common data model to bridge the gap between the Editorial and Product views
– One common schema model was complex and unwieldy
– Common model contained “unnecessary” complexities
– Common model did not align with Product Development’s specifications
Product Development Systems
• Product Development must code for “unnecessary” complexities
• Data not completely de-normalized
– Additional development necessary to compile data
Silo Challenges
By the Numbers
• Thousands of journals ingested per day
– Approximately 1 TB of data per week
• Over 100 other data feeds ingested per day
• Over 1.2 million messages processed per day
– Synced up with product data daily in less than 10 minutes
• Over 6 TB of compiled data created per day
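For a sense of scale, the daily and weekly figures above can be converted to average per-second rates. This is back-of-the-envelope arithmetic that assumes decimal units (1 TB = 10^12 bytes) and an even load across the day, neither of which the slide states; because the slide says "over" and "approximately", the results are rough lower bounds.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB = 10**12  # assumption: decimal terabytes
MB = 10**6

messages_per_second = 1_200_000 / SECONDS_PER_DAY
compiled_mb_per_second = 6 * TB / SECONDS_PER_DAY / MB
ingest_mb_per_second = 1 * TB / (7 * SECONDS_PER_DAY) / MB

print(f"{messages_per_second:.1f} messages/s")         # 13.9 messages/s
print(f"{compiled_mb_per_second:.1f} MB/s compiled")   # 69.4 MB/s compiled
print(f"{ingest_mb_per_second:.2f} MB/s journal ingest")  # 1.65 MB/s journal ingest
```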
What is an Architect to Do?
Unify…Integrate…Simplify
• Unify Data; Processes; Transformations; Data Ingestion
• Integrate Disparate Systems; Services; Applications; and Data Consumers
• Simplify the Architecture!!!
Request For Proposal
Create RFP
• Create technology selection team
• Identify key requirements (based on architecture and tech stack governance)
• Assign weights
• Create RFP document and scorecard spreadsheet
Engage vendors
• Send RFP document to prospective vendors
• Hold clarification meetings with vendor teams
• Vendors send RFP response documents
• Vendors present their solutions and answer questions
Score-driven selection
• Selection team members score vendor solutions
• Aggregate scores
• Select vendor with best aggregate score (judgement required)
• Bake-off if winner is too close to call
Validate selection
• Run proof-of-concept and/or proof-of-technology and/or pilot project as needed
• Negotiate contract
• Adjust as needed
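The score-driven selection step can be sketched as a weighted scorecard. Everything concrete below is hypothetical: the vendor names, the requirement keys, the weights, the scores, and the `too_close` margin that triggers a bake-off. The deck does not say how scores were aggregated; averaging across team members is one plausible choice.

```python
from statistics import mean

# Hypothetical weights per requirement (higher = more important).
weights = {"data_integration": 5, "guaranteed_delivery": 4, "support": 2}

# Hypothetical scores: vendor -> one scorecard per selection team member.
scorecards = {
    "Vendor A": [
        {"data_integration": 4, "guaranteed_delivery": 5, "support": 3},
        {"data_integration": 5, "guaranteed_delivery": 4, "support": 4},
    ],
    "Vendor B": [
        {"data_integration": 3, "guaranteed_delivery": 4, "support": 5},
        {"data_integration": 4, "guaranteed_delivery": 3, "support": 5},
    ],
}


def weighted_total(card: dict) -> int:
    """Weighted sum of one team member's scores for one vendor."""
    return sum(weights[req] * score for req, score in card.items())


def aggregate(cards: list) -> float:
    """Average the weighted totals across team members."""
    return mean(weighted_total(card) for card in cards)


def select(cards_by_vendor: dict, too_close: float = 2.0):
    """Rank vendors; flag a bake-off when the top two are within the margin."""
    ranked = sorted(
        ((aggregate(cards), vendor) for vendor, cards in cards_by_vendor.items()),
        reverse=True,
    )
    (best, winner), (second, runner_up) = ranked[0], ranked[1]
    return winner, runner_up, best - second < too_close


print(select(scorecards))  # ('Vendor A', 'Vendor B', False)
```

The numeric ranking only narrows the field; as the slide notes, judgement is still required before a vendor is chosen.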
Requirements
• Data Integration
• Durable Message Bus with Guaranteed Delivery
• Any-to-Any Connectivity
• Architectural Flexibility
• Excellent Support
• A Proven Solution
Unify…Integrate…Simplify
• Data Curation
• Data Ingestion
• Data Transformation
• Data Validation
• Data Normalization
• Data Integration
• Data Transformation
• Data Validation
• Data Integration
• Data Presentation
• Data Persistence
• Data Flow Orchestration
To the Cloud… and Beyond!
Off-Prem Processing
• Bursting Capabilities
• Data Center Relief
• Co-Location Capabilities
New Mobile Applications
Service Unification
• Service Management
• Service Integration
Questions
Connect with CAS:
Joseph Sapp
Lead Enterprise Application Architect
jsapp@cas.org
www.linkedin.com/in/joesapp

Architecting and Tuning IIB/eXtreme Scale for Maximum Performance and Reliabi...
 
Integrating IBM PureApplication System and IBM UrbanCode Deploy: A GE Capital...
Integrating IBM PureApplication System and IBM UrbanCode Deploy: A GE Capital...Integrating IBM PureApplication System and IBM UrbanCode Deploy: A GE Capital...
Integrating IBM PureApplication System and IBM UrbanCode Deploy: A GE Capital...
 
Broadcast Music Inc. Release Rockstars: Program-Wide DevOps Success with Urba...
Broadcast Music Inc. Release Rockstars: Program-Wide DevOps Success with Urba...Broadcast Music Inc. Release Rockstars: Program-Wide DevOps Success with Urba...
Broadcast Music Inc. Release Rockstars: Program-Wide DevOps Success with Urba...
 
Integrating Salesforce.com and Oracle ERP Using IBM WebSphere Cast Iron
Integrating Salesforce.com and Oracle ERP Using IBM WebSphere Cast IronIntegrating Salesforce.com and Oracle ERP Using IBM WebSphere Cast Iron
Integrating Salesforce.com and Oracle ERP Using IBM WebSphere Cast Iron
 
Customizing the Mobile Connections App
Customizing the Mobile Connections AppCustomizing the Mobile Connections App
Customizing the Mobile Connections App
 
What's New in Smarter Process and C&I
What's New in Smarter Process and C&IWhat's New in Smarter Process and C&I
What's New in Smarter Process and C&I
 
Compose Your Digital Enterprise
Compose Your Digital EnterpriseCompose Your Digital Enterprise
Compose Your Digital Enterprise
 

Recently uploaded

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024The Digital Insurer
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...Fwdays
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticscarlostorres15106
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr BaganFwdays
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Wonjun Hwang
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clashcharlottematthew16
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsSergiu Bodiu
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brandgvaughan
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfAlex Barbosa Coqueiro
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationRidwan Fadjar
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machinePadma Pradeep
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):comworks
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Commit University
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 3652toLead Limited
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsMemoori
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Mattias Andersson
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 

Recently uploaded (20)

My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024My INSURER PTE LTD - Insurtech Innovation Award 2024
My INSURER PTE LTD - Insurtech Innovation Award 2024
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks..."LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
"LLMs for Python Engineers: Advanced Data Analysis and Semantic Kernel",Oleks...
 
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmaticsKotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
Kotlin Multiplatform & Compose Multiplatform - Starter kit for pragmatics
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
Bun (KitWorks Team Study 노별마루 발표 2024.4.22)
 
DMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special EditionDMCC Future of Trade Web3 - Special Edition
DMCC Future of Trade Web3 - Special Edition
 
Powerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time ClashPowerpoint exploring the locations used in television show Time Clash
Powerpoint exploring the locations used in television show Time Clash
 
DevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platformsDevEX - reference for building teams, processes, and platforms
DevEX - reference for building teams, processes, and platforms
 
WordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your BrandWordPress Websites for Engineers: Elevate Your Brand
WordPress Websites for Engineers: Elevate Your Brand
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
Unraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdfUnraveling Multimodality with Large Language Models.pdf
Unraveling Multimodality with Large Language Models.pdf
 
My Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 PresentationMy Hashitalk Indonesia April 2024 Presentation
My Hashitalk Indonesia April 2024 Presentation
 
Install Stable Diffusion in windows machine
Install Stable Diffusion in windows machineInstall Stable Diffusion in windows machine
Install Stable Diffusion in windows machine
 
CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):CloudStudio User manual (basic edition):
CloudStudio User manual (basic edition):
 
Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!Nell’iperspazio con Rocket: il Framework Web di Rust!
Nell’iperspazio con Rocket: il Framework Web di Rust!
 
Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365Ensuring Technical Readiness For Copilot in Microsoft 365
Ensuring Technical Readiness For Copilot in Microsoft 365
 
AI as an Interface for Commercial Buildings
AI as an Interface for Commercial BuildingsAI as an Interface for Commercial Buildings
AI as an Interface for Commercial Buildings
 
Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?Are Multi-Cloud and Serverless Good or Bad?
Are Multi-Cloud and Serverless Good or Bad?
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 

From Print to the Cloud and Beyond: The Story of a Century Old Company and its Resiliency to Ever-Evolve

  • 1. From Print to the Cloud and Beyond: The Story of a Century Old Company and its Resiliency to Ever-Evolve
  • 2. Agenda: CAS Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 3. Agenda: CAS Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 4. CAS has been supporting scientists for more than 100 years. CAS helps scientists around the world benefit from the published work of their colleagues by monitoring, abstracting and indexing the world's chemistry-related literature. Since 1907, CAS’s objective has been to find, collect, and organize all publicly disclosed chemistry substance information.
  • 5. CAS helps scientists around the world benefit from the published work of their colleagues. CAS scientists monitor, abstract and index the world's chemistry-related literature. Proprietary, standardized indexing in CAS databases ensures consistent, comprehensive search results. Diagram labels: Source Selection, Document Indexing, Reaction Indexing, Markush Indexing, Authority Processing. Databases: CAplus℠, CAS REGISTRY℠, CHEMLIST®, CIN®, MARPAT®, CHEMCATS®, CASREACT®.
  • 6. CAS products and services make it faster and easier for scientists to find the information they need for their research. CAS Registry Numbers® uniquely identify each chemical substance without the ambiguity of multiple naming conventions. STN® combines industry-leading search and retrieval with unique and comprehensive content. SciFinder® offers a one-stop shop experience with flexible search and discover options based on user input and workflow. Science IP®, the CAS information search service, provides fast, comprehensive and accurate searches of the world’s scientific and technical literature. CAS Registry Number 58-08-2: CAFFEINE!
  • 7. Agenda: CAS Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 8. CAS Timeline: 108 Years of Progress (and Counting)
  • 9. CAS End-To-End Architecture, “In the Beginning… There was Print”: Data Ingestion, Data Transformation, Data Validation, Data Curation, Data Integration, Data Presentation
  • 10. “CAS Knows Jack”: Jack and Friends Beside Printed Chemical Abstracts
  • 11. Agenda: CAS Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 12. CAS End-To-End Architecture, “The Age of Silos”. Functions shown, in slide order: Data Ingestion, Data Transformation, Data Validation, Data Normalization, Data Persistence; Data Ingestion, Data Transformation, Data Validation, Data Curation, Data Integration, Data Persistence, Data Transformation, Data Validation, Data Integration, Data Presentation
  • 13. Silo Challenges: Multiple Data Ingestion Points (in some cases, the same data is ingested twice). Multiple Views of the Data (each silo must perform complex transformations to its specific view; Editorial manufactures normalized data based on a print model; Product Development wants de-normalized, complete data; Content Delivery has a mixed view of the data). Multiple Vocabulary Conventions (differing data definitions cause confusion across silos). No Unified, Authority Data Store (each silo has its own copy of the data in its own specific vocabulary).
  • 14. Editorial Legacy Systems: Many disparate databases used to store relational data (becomes difficult to maintain and support). Multiple database technologies used (no unified platform). Challenges to support legacy systems (some legacy technologies are no longer supported; succession planning difficult to support legacy systems; special IT used so that legacy code would not need to be touched).
  • 15. Content Delivery Systems: Data was transformed into one common data model to bridge the gap between Editorial and Product View (one common schema model was complex and unwieldy; common model contained “unnecessary” complexities; common model did not align with Product Development’s specifications).
  • 16. Product Development Systems: Product Development must code for “unnecessary” complexities. Data not completely de-normalized (additional development necessary to compile data).
  • 17. Silo Challenges
  • 18. By the Numbers: Thousands of journals ingested per day (approximately 1 TB of data per week). Over 100 other data feeds ingested per day. Over 1.2 million messages processed per day (synced up with product data daily in less than 10 minutes). Over 6 TB of compiled data created per day.
  • 19. What is an Architect to Do?
  • 20. Unify…Integrate…Simplify: Unify Data; Processes; Transformations; Data Ingestion. Integrate Disparate Systems; Services; Applications; and Data Consumers. Simplify the Architecture!!!
  • 21. Request For Proposal. Create RFP: create technology selection team; identify key requirements (based on architecture and tech stack governance); assign weights; create RFP document and scorecard spreadsheet. Engage vendors: send RFP document to prospective vendors; hold clarification meetings with vendor teams; vendors send RFP response documents; vendors present their solutions and answer questions. Score-driven selection: selection team members score vendor solutions; aggregate scores; select vendor with best aggregate score (judgement required); bake-off if winner is too close to call. Validate selection: run proof-of-concept and/or proof-of-technology and/or pilot project as needed; negotiate contract; adjust as needed.
  • 22. Requirements: Data Integration; Durable Message Bus with Guaranteed Delivery; Any-to-Any Connectivity; Architectural Flexibility; Excellent Support; A Proven Solution
  • 23. Agenda: CAS Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 24. Unify…Integrate…Simplify: Data Curation, Data Ingestion, Data Transformation, Data Validation, Data Normalization, Data Integration, Data Transformation, Data Validation, Data Integration, Data Presentation, Data Persistence, Data Flow Orchestration
  • 25. Agenda: Overview; CAS - In the Beginning… There was Print; CAS - The Age of Silos; CAS - IBM Integration. To the Cloud… and Beyond; Future Considerations
  • 26. To the Cloud… and Beyond! Off-Prem Processing: Bursting Capabilities, Data Center Relief, Co-Location Capabilities. New Mobile Applications. Service Unification: Service Management, Service Integration.
  • 27. CAS is a division of the American Chemical Society. Copyright 2015 American Chemical Society. All rights reserved.27
  • 28. Questions
  • 29. Connect with CAS: Joseph Sapp, Lead Enterprise Application Architect, jsapp@cas.org, www.linkedin.com/in/joesapp
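
Slide 6 shows caffeine's CAS Registry Number, 58-08-2. The slides do not describe how the number is built, but the publicly documented check-digit rule makes a handy illustration: the final digit equals the weighted sum of the preceding digits (weights 1, 2, 3, … counted from the right) modulo 10. The sketch below is only an illustration of that rule; the function names are hypothetical and are not taken from any CAS system.

```python
def cas_check_digit(body_digits: str) -> int:
    """Return the check digit for the digits that precede it (hyphens removed).

    Weights start at 1 for the rightmost body digit and grow to the left.
    """
    total = sum(int(d) * w for w, d in enumerate(reversed(body_digits), start=1))
    return total % 10


def is_valid_cas_rn(rn: str) -> bool:
    """Check the format (2-7 digits, 2 digits, 1 digit) and the check digit."""
    parts = rn.split("-")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return False
    first, second, check = parts
    if not (2 <= len(first) <= 7 and len(second) == 2 and len(check) == 1):
        return False
    return cas_check_digit(first + second) == int(check)


if __name__ == "__main__":
    # Caffeine, from slide 6: 8*1 + 0*2 + 8*3 + 5*4 = 52, and 52 % 10 == 2.
    print(is_valid_cas_rn("58-08-2"))
```

A check digit of this kind catches most single-digit typos, which fits the slide's point that Registry Numbers avoid the ambiguity of multiple naming conventions.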

Editor's Notes

  1. How does a century-old company, one that used to consider data integration to be placing the binding on a book, keep up with younger, nimbler companies? You ever-evolve! You must always be adapting, and you MUST change your environment before it changes you! This is the success story of Chemical Abstracts Service, a 108-year-old company that is the world’s authority for curating and classifying chemical information.
  2. Like any success story, CAS’ story has a beginning… it has chapters where challenges were faced… necessary evolutions… and a bright future!
  3. CAS is a division of the American Chemical Society. It was founded in 1907 with two goals. First, American scientists recognized the need to participate globally in scientific information exchange. With today’s pace of research, this need is felt by all scientists around the world. Second (and this goal sounds like it could have been written today!), scientists simply did not have time to read all of the current literature in their field. So the Chemical Abstracts Service was formed and pledged to address these two problems. In 1907, they published the first Chemical Abstracts. It contained 502 abstracts from scientific papers and patents. Today, more than 1.5 million abstracts with scientific indexing are added to the electronic database every year. Our extensive coverage ensures that your scientists are getting the most comprehensive and timely information to support their research efforts. The ACS’ mission to “improve people’s lives through the transforming power of chemistry” is still alive today as CAS’ scientists continue to organize the scientific literature so that scientists around the world can find the research they need and use it to make discoveries that improve all of our lives.
  4. The CAS Product and Content operations division has a mission to select a broad spectrum of chemical and related documents and report new substances, methods of synthesis and novel information contained in those documents. Every step in the editorial process enhances what is contained in the original document. Our scientists enhance titles, add abstracts, apply standardized precise indexing, and extract substances for registration or Markush indexing. CAS creates the CAplus, CAS REGISTRY, CASREACT and MARPAT databases by selecting and analyzing documents that report new or novel chemical findings and reporting these findings through enhanced titles and abstracts, controlled subject index entries, and substance indexing. This controlled indexing enhances the retrieval and understanding of the original research publication. CAS staff also create three additional databases to further support the work of researchers. CHEMCATS allows researchers to find commercially available chemicals, pricing and supplier contact information. CHEMLIST allows scientists to find whether a substance is regulated and by what agency. Chemical Industry Notes provides access to current business news.
  5. You can access the CAS databases by using one of our online services: SciFinder or STN; or by asking Science IP to run a search for you. Both STN and Science IP also access many other technical databases. Your CAS Account Consultant can help you decide which product will best meet your needs.
  6. In 1965, CAS introduced the CAS Registry Number, which is the industry classification standard for chemical substances, structures and biosequences. In 1966, CAS expanded its media to offer microform and magnetic tape. This was only 14 years after IBM introduced its first magnetic tape device, the IBM 7-track. Chemical Abstracts offered an online product back in 1980! That was 2 years before the TCP/IP protocol was created and a decade before the first commercial Internet Service Providers came out. In 1998, CAS began receiving data in electronic format. In 2009, CAS registered its 50 millionth substance! CAS registered 10 million more in a span of just 2 years! In 2013, CAS printed its last hard-copy volume. But when you are a century-old company, there are times when lulls occur in your technologies, and you face the quandary of moving toward newer technologies while still having to maintain and support your older ones. This is the case not only for technologies and products, but also for the architecture itself. Even though we do not print anymore, our architecture is still based on print and must, once again, evolve.
  7. In the Beginning there was Print… There are certain basic functions that CAS performs and has always performed. They are Data Ingestion, Data Transformation, Data Validation and Curation, and Data Integration and Presentation. Back when CAS was print-based, these functions were done quite differently. Data Ingestion used to be dropping journals off at a truck dock. Transformation occurred when the pages were torn out of the journals and handed out for curation. Data validation and curation were done by scientists similar to those who curate today; back then, however, it was done by writing on the journal pages manually. Data Integration and Presentation were completed when the binding was placed on the book. We have come a very long way as a company, and as our industry has evolved, CAS has needed to find more innovative ways of performing these same basic functions.
  8. In the Beginning there was Print… There are certain basic functions that CAS performs and has always performed. They are Data Ingestion, Data Transformation, Data Validation and Curation, and Data Integration and Presentation. Back when CAS was print-based, these functions were done quite differently. Data Ingestion used to be dropping journals off at a truck dock. Transformation occurred when the pages were torn out of the journals and handed out for curation. Data validation and curation were done by scientists similar to those who curate today; back then, however, it was done by writing on the journal pages manually. Data Integration and Presentation were completed when the binding was placed on the book. We have come a very long way as a company, and as our industry has evolved, CAS has needed to find more innovative ways of performing these same basic functions.
  9. As CAS began to evolve from a print world to a digital world, there were moments when the architecture found itself in a transition period. This occurred in order to accommodate both the old world of print and the new world of digital, online content. I like to refer to this as “The Age of Silos”.
  10. During the years of print, the Editorial area of Chemical Abstracts was responsible for everything: from data ingestion, to data curation and integration, and finally presentation of the final print product. As CAS evolved to a digital environment and began to consume more and more content, other areas of the architecture were created to spread responsibilities around. A Content Delivery area was created to build search and display files for end consumers. Later, a Product Processing and Delivery area was created to better serve ever more demanding search and display capabilities. Because of this step-by-step evolution, CAS finds itself in “The Age of Silos”, where each silo of the architecture is responsible not only for its existing responsibilities, but also for the responsibilities of its past legacy. This has caused its share of challenges.
  11. Read from Slides
  12. As I mentioned before, because of its beginnings, each area of the architecture faces its own challenges. During the evolution from print, <read slides>
  13. <read slides>
  14. <read slides>
  15. Accommodating both the legacy and new architectures caused each area to run in its own silo, with its own processes, its own “unnecessary” complexities, and its own architecture. The type of data that CAS produces is highly complex: human genome sequences that, if printed out, might run the length of this wall; chemical names that give relational databases fits. The data is not policy information; it is complicated information by nature. So CAS is familiar with handling necessary complexities. It is the “unnecessary” complexities that we need to avoid.
  16. This is no small problem when you consider the volumes of data that CAS processes.
  17. EVOLVE! This sounds great! But where do you begin?
  18. CAS ran an RFP (Request For Proposal) process, sending the RFP out to four companies. The companies presented their cases, and each was then evaluated using a quantitative scoring approach. When it was all said and done, CAS selected IBM to help drive its solution.
  19. So why was IBM chosen? This brings us to CAS’ latest evolution.
  20. This brings us to our latest evolution.
  21. As I mentioned, this new evolution of the architecture is intended to unify CAS and allow the “tools in the CAS toolbox”, if you will, to perform the tasks that they were intended to perform. DataPower can now be used to unify everything from Data Ingestion, to Data Transformation, to Data Integration. This allows Editorial to produce the high-quality data that makes CAS not only the industry leader but also the industry standard, and to do what it does very well: curate data. IIB and MQ give CAS the flexibility it needs to manage and coordinate data flow orchestration and persistence, while Product Processing and Delivery does what it does very well: process and deliver services! All of this results in a more flexible architecture and higher speed to market.
  22. With newfound flexibility in Data, Process and Service Integration, CAS can now prepare to evolve in an ever-changing world. Some of these new capabilities include off-prem processing, the extension of mobile applications, and service unification. With the new architecture moving into place, CAS has begun to think about other future considerations.
  23. IBM offers a plethora of solutions that may be beneficial to CAS in the future as we are able to expand our views. Other IBM solutions that have been useful for others include: IBM BlueMix (powered by Cloud Foundry), IBM API Manager, IBM PureApp, and IBM Watson.
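
The quantitative scoring approach mentioned in note 18 (weights assigned to requirements, team members scoring each vendor, scores aggregated, and a bake-off if the winner is too close to call) can be sketched roughly as below. This is a minimal illustration only: the slides do not give the actual weights, scores, score scale, or tie threshold, so every number, name, and the `tie_margin` parameter here is a hypothetical placeholder.

```python
from statistics import mean


def weighted_score(weights: dict, scores: dict) -> float:
    """One evaluator's weighted average score for one vendor."""
    total_weight = sum(weights.values())
    return sum(weights[req] * scores[req] for req in weights) / total_weight


def aggregate(weights: dict, scorecards: dict) -> dict:
    """Average the evaluators' weighted scores for each vendor."""
    return {
        vendor: mean(weighted_score(weights, card) for card in cards)
        for vendor, cards in scorecards.items()
    }


def rank(aggregates: dict, tie_margin: float = 0.1):
    """Sort vendors best-first and flag a bake-off when the top two are close."""
    ordered = sorted(aggregates.items(), key=lambda kv: kv[1], reverse=True)
    needs_bake_off = len(ordered) > 1 and ordered[0][1] - ordered[1][1] < tie_margin
    return ordered, needs_bake_off


if __name__ == "__main__":
    # Hypothetical weights for three of the requirements listed on slide 22.
    weights = {"Data Integration": 3, "Durable Message Bus": 2, "Excellent Support": 1}
    # Hypothetical scorecards (1-5 scale), two evaluators per vendor.
    scorecards = {
        "Vendor A": [
            {"Data Integration": 4, "Durable Message Bus": 5, "Excellent Support": 3},
            {"Data Integration": 5, "Durable Message Bus": 4, "Excellent Support": 4},
        ],
        "Vendor B": [
            {"Data Integration": 3, "Durable Message Bus": 4, "Excellent Support": 5},
            {"Data Integration": 3, "Durable Message Bus": 3, "Excellent Support": 4},
        ],
    }
    ordered, needs_bake_off = rank(aggregate(weights, scorecards))
    print(ordered, needs_bake_off)
```

As slide 21 notes, the best aggregate score still calls for judgement, so a calculation like this would inform the selection rather than make it.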