<presentation/>
<presenter>Matt Turner</presenter>
<title>Chief Technologist, Media Solutions</title>
Slide 2 Copyright © 2013 MarkLogic® Corporation. All rights reserved.
<MLUGL>
<intro/>
<talk>
<bit>Mission Impossible</bit>
<story>Wiley</story>
<story>Springer</story>
<story>Mitchell1</story>
<bit>Search and Semantics</bit>
<demo>Old Skool</demo>
</talk>
</MLUGL>
Slide 3
Mission(s) Impossible
Slide 4
<story>http://www.marklogic.com/resources/slides-gearing-up-for-the-content-factory-to-quickly-create-innovate-and-monetize/</story>
Slide 5
Why is it Mission Impossible?
Start Revenue Earning January 2013
• Publish new content from 1 Jan 2013
• Accepted Articles : 20/day; 100/week; 400/month
• Early View Articles: 20/day; 100/week; 400/month
• Issues : 19/month; 77/quarter; 230/year
Give AGU customers access to all licensed content by 1 January 2013
• 21 journals (160,000 articles)
• 33 personal choice products (aka virtual journals) based on AGU index terms
• 743 special sections
• Migrate customers, users, products, licenses, alerts data
Vendors, systems & business processes in Editorial & Production ready to publish 2013 content
• Integration with new editorial system
• Changes to workflow
And it needs to work the way it does on the AGU site, with over 60 enhancements
Slide 6
Key Challenges
• Content with no issue number and no pagination
• Journal with 7 parts, 3 of which have sub-parts!
• Many moving parts within Wiley - 17 systems to check
• Content completeness and quality (and external vendor)
• Unknown unknowns - coping with changing and emerging requirements throughout the development phase
Challenges to overcome
• 4 months left!
Slide 7
Content-Driven Functionality – Special Section Search
Examples:
• “Coastal Ocean Observatories”
• “The 11 March 2011 Tohoku-Oki Earthquake and Tsunami”
Slide 8
How MarkLogic Helped - S/W Development
Search Service
• As a search engine, MarkLogic needs no manual or additional re-indexing after new content is loaded. Everything is done on the fly, which saves time and effort
• Enabled reuse: only some enhancements to the search service were needed for AGU
Saved Searches
• The search service processes requests as XML, so it is easy to save a whole search and reuse it, either for alerts or to reload the saved search
Index Terms
• Reused the vocabulary service to handle the hierarchy of index terms. This was especially valuable for faceting on index terms; any sub-structure of the index terms can be fetched easily
Faceting
• MarkLogic supports faceting, so nothing special was needed beyond adding the proper configuration according to the AGU specification
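The saved-search idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `search-request`, `terms`, and `facet` element names and both helper functions are hypothetical, not the XML format of Wiley's search service.

```python
import xml.etree.ElementTree as ET


def build_search_request(terms, facets):
    """Serialize a search request as an XML string (hypothetical format)."""
    root = ET.Element("search-request")
    ET.SubElement(root, "terms").text = terms
    # Sort facets so the same search always serializes the same way.
    for name, value in sorted(facets.items()):
        facet = ET.SubElement(root, "facet", name=name)
        facet.text = value
    return ET.tostring(root, encoding="unicode")


def load_search_request(xml_text):
    """Rebuild the search terms and facet selections from saved XML."""
    root = ET.fromstring(xml_text)
    terms = root.findtext("terms", default="")
    facets = {f.get("name"): f.text for f in root.findall("facet")}
    return terms, facets


# Save the whole search once, then reload it later (for an alert, say).
saved = build_search_request("tsunami", {"index-term": "oceanography"})
terms, facets = load_search_request(saved)
print(terms, facets)
```

Because the request is already a document, saving a search amounts to storing that document; an alert or a "load saved search" action simply replays it.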
Slide 9
What Variations/Non-Standard Practices Were Introduced
• New licensing model (e.g. multi choice product for personal subscribers)
• Create Special Sections as another slice of content view
• New workflow for handling daily society data updates via feeds
• Changing content workflow for legacy vs current content
• Improvements to content (not just conversion)
• Start development before requirements were clear
• Complete testing before we had all the content
• Cannot complete certain types of testing
• Break some rules
Recipe for Disaster?
Slide 10
Conclusion
• Mission Impossible? Choose not to accept
• Mission Impossible? Deal with it – that’s life, but you may not succeed
• Mission Impossible? New organizational capability
• Embrace the challenge, but put your best, most experienced people on it
• Be brave enough to break the rules when required
• People over process
• Enabling technologies like MarkLogic
Develop a new capability to handle the unexpected and the unknown
Slide 11
<story>http://www.marklogic.com/resources/betting-the-company-how-springer-successfully-insourced-its-flagship-content-platform/</story>
Slide 12
Betting the Company | 4/6/2013 | 18
Growth in electronic sales
[Chart: Total Online vs. Total Print, 0% to 100%, 2007 to 2012 (with a “Bud” label); data labels 66 and 33]
Slide 13
So... Springer decided to build its own platform
Slide 14
36 man-years of effort to reproduce
36 man-years: the time an independent software auditor estimated it would take to reproduce the existing code base
Slide 15
A risky move?
MetaPress code base
Slide 16
Oh, and have it ready in 11 months
Slide 17
Where we were in April 2011
• People
  • 1 Executive Champion
  • 1 Product Owner
  • 1 Dir. of Dev
  • 1 Tech Lead
  • 2 Developers
  • 1 BA
  • 0 QA
  • 0 DevOps
  • 0 UX/design/front-end
  • 0 architect
• Hardware/Software/Data
  • 0 databases
  • 0 servers
  • 0 documents
7 staff*
*3 managers – who don’t count
Jan-Erik de Boer, EVP of IT
Brian Bishop, Product Owner
Georg Nold, Dir. of Development
Slide 18
Slide 19
Where we are today
• 1 Executive champion
• 1 Product Owner
• 1 Dir. of Dev
• 2 Tech Leads
• 16 Developers
• 2 Dev Ops
• 4 BAs
• 6 QAs
• 2 UX
• 2 Design/Front-end
• 1 Architect
• 16 servers
• 2 live environments
• 1 database
• 12 pairing stations
• 2 Build Agents
• 2 dashboard machines
• 5.7 million documents
• 60 million PNGs
• 11TB of data
31 staff
Slide 20
New platform release schedule
[Timeline: Jan through Dec, with 49 release markers]
Slide 21
Slide 22
[Diagram labels: MarkLogic cluster; RESTful APIs; realtime.springer.com; citations.springer.com; iPhone apps]
Slide 23
Goals are prioritized (top to bottom) and stories are prioritized (left to right)
Velocity is measured every week, allowing us to accurately forecast when a certain level of work can be completed
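The velocity-based forecast can be sketched as simple arithmetic: divide the remaining work by the average weekly velocity and round up. The function name and the sample numbers are illustrative assumptions, not figures from the Springer project.

```python
import math


def forecast_weeks(weekly_velocities, remaining_points):
    """Estimate the weeks needed to finish the remaining work.

    Uses the plain average of past weekly velocities; a real team might
    prefer a rolling window or a range rather than a single number.
    """
    if not weekly_velocities:
        raise ValueError("need at least one week of velocity data")
    average = sum(weekly_velocities) / len(weekly_velocities)
    if average <= 0:
        raise ValueError("average velocity must be positive")
    return math.ceil(remaining_points / average)


# Three measured weeks average 20 points per week; 130 points remain.
weeks = forecast_weeks([18, 22, 20], remaining_points=130)
print(weeks)  # 7
```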
Slide 24
MarkLogic IS agile
Slide 25
MarkLogic agility
• Schema-less means we can use our complex XML content as-is
• E.g. Different attributes for books, journals, chapters, articles, protocols, etc.
• You can decide later if you need to add indexes at very little cost
• You don’t have to know everything up front
• Ingestion is relatively pain-free
• You are free to come up with features without worrying about back-end
• Modifying content via Record Loader makes it easy to manipulate data
• Handles various types of native content
• You don’t even have to use XQuery!
Slide 26
What if you could subscribe to a search query?
Slide 27
Content Entitlements
Storing entitlements as queries means any new content loaded automatically becomes available to authorized users
[Diagram label: 2TB]
<content>
Journal_ID: 0001
ContentType: Article
DatePublished: 4/4/2012
Subject: Mathematics
Author: John Smith
Language: English
Keywords: “k theory”
These are stored as serialized queries:
<material_ID=“001”>
Subject: Engineering
<material_ID=“002”>
Journal_ID: 0001-0099
<material_ID=“003”>
Subject: Engineering
SearchTerm: “carbon nanotube”
DatePublished: 2000-2012
Customers
<customer=“001”>
material_ID: 001
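A minimal sketch of the entitlements-as-queries idea, assuming each material is a predicate over document metadata. The helpers `field_equals`, `field_between`, and `can_access` are hypothetical, and customer "002" is invented for the demonstration; the slide does not show how MarkLogic serializes or evaluates these queries.

```python
from typing import Callable, Dict, List

Document = Dict[str, str]
Query = Callable[[Document], bool]


def field_equals(field: str, value: str) -> Query:
    """Build a query that matches documents whose field equals value."""
    return lambda doc: doc.get(field) == value


def field_between(field: str, low: str, high: str) -> Query:
    """Build a query that matches zero-padded IDs in an inclusive range."""
    return lambda doc: field in doc and low <= doc[field] <= high


# Hypothetical stand-ins for the stored queries shown on the slide.
ENTITLEMENTS: Dict[str, Query] = {
    "001": field_equals("Subject", "Engineering"),
    "002": field_between("Journal_ID", "0001", "0099"),
}

# Customer "001" is from the slide; customer "002" is invented for the demo.
CUSTOMERS: Dict[str, List[str]] = {"001": ["001"], "002": ["002"]}


def can_access(customer_id: str, doc: Document) -> bool:
    """Evaluate the customer's stored queries against the document."""
    materials = CUSTOMERS.get(customer_id, [])
    return any(ENTITLEMENTS[m](doc) for m in materials)


math_article = {"Journal_ID": "0001", "Subject": "Mathematics"}
new_article = {"Journal_ID": "0150", "Subject": "Engineering"}

print(can_access("001", math_article))  # False
print(can_access("001", new_article))   # True
print(can_access("002", math_article))  # True
```

Nothing about `new_article` had to be registered against a customer: it matches the stored query, so access follows automatically, which is the point the slide makes.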
Slide 28
How did it go?
Slide 29
[Chart: Average Page Load Time (sec), Old vs. New; axis 0 to 12]
Slide 30
Weekly visits to SpringerLink (millions, Aug 4, 2012 – Mar 2, 2013)
Source: Google Analytics
[Chart: axis 0 to 5,000,000; series: link.springer.com, SpringerLink.com, Total]
Slide 31
<story>http://www.marklogic.com/resources/the-journey-from-print-to-online/</story>
Slide 32
2011
Slide 33
Mitchell1: Data
• 65 OEM Auto and Part Manufacturers
• Data on every modern car sold in the US
• Repair
• Diagnostics
• Maintenance
• Technical Service Bulletins (TSBs)
• Wiring
• Estimator
Slide 34
What’s in the data store today?
• Articles – 408,892
– 209,987 Narratives
– 103,416 Technical Service Bulletins and Recalls
– 15,179 Maintenance Schedules
• Images – 6,193,647
– 5,924,959 Narrative
– 268,688 Technical Service Bulletins and Recalls
• When it’s all broken down, it becomes roughly 16,000,000 MarkLogic documents
Slide 35
And how do we describe it?
• Preferred Terms
– Tends to be the ASE term
– Used to describe Components (12,261), Diagnostic Trouble Codes (65,525), and Information Types (98)
• Non-Preferred Terms
– Tends to be OEM-specific terminology
– Alternate terms for Components (22,733) and Information Types (757)
– Codes do not have Non-Preferred Terms
• Spatial References
– Because “Replace the window motor” just isn’t precise enough
Slide 36
Mitchell1: Data Then, Data Now
Slide 37
Mitchell1: Data Then, Data Now
Slide 38
Mitchell1: Data Then, Data Now
Slide 39
Mitchell1: Market Reaction
https://www.youtube.com/watch?v=IfM8v-8NY_4&list=UUIOYnh6LBFooV_YxlPVPLvA&index=36
Slide 40
Search . . . and Semantics
Slide 41
One Question . . .
Slide 42
Who’s Smarter?
VS
Slide 43
Do domestic dogs interpret pointing as a command?
Animal Cognition (2012): 1-12, November 09, 2012
By Scheider, Linda; Kaminski, Juliane; Call, Josep; Tomasello, Michael
Slide 44
What if . . .
Slide 45
HOW?
Slide 46
The Basic Idea
Get some triples . . . if you haven’t already
• Grabbed DBPedia
• Dumped in Linked Data Consortium
• Loaded Lehigh
• and NYT’s open data
You are behind!
But what if you could add in documents?
Slide 47
Rich MarkLogic Applications .. Made Richer
Slide 48
Rich MarkLogic Applications .. Made Richer
Name: John Smith
Affiliation: IBM
Timezone: PST
Committer: Hadoop
Slide 49
Semantics Architecture
TRIPLE
XQY XSLT SQL SPARQL
GRAPH
SPARQL
Slide 50
Triple Index
• 3 triple orders
• Cached for performance
• Works seamlessly with other indexes
• Security
• 350 bytes per triple on disk
• 1 billion+ triples per host
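To make "3 triple orders" concrete, here is a toy in-memory index that stores every triple under three orderings. The slide does not say which orderings MarkLogic uses; subject-predicate-object, predicate-object-subject, and object-subject-predicate are an assumption chosen because together they answer any lookup with two known terms. The class and method names are hypothetical.

```python
from collections import defaultdict


class TripleIndex:
    """Toy triple store that keeps each triple in three orderings."""

    def __init__(self):
        self.spo = defaultdict(lambda: defaultdict(set))
        self.pos = defaultdict(lambda: defaultdict(set))
        self.osp = defaultdict(lambda: defaultdict(set))

    def add(self, s, p, o):
        # Every insert writes to all three orderings.
        self.spo[s][p].add(o)
        self.pos[p][o].add(s)
        self.osp[o][s].add(p)

    def objects(self, s, p):
        """Known subject and predicate: answered from the SPO ordering."""
        return set(self.spo.get(s, {}).get(p, set()))

    def subjects(self, p, o):
        """Known predicate and object: answered from the POS ordering."""
        return set(self.pos.get(p, {}).get(o, set()))

    def predicates(self, s, o):
        """Known subject and object: answered from the OSP ordering."""
        return set(self.osp.get(o, {}).get(s, set()))


index = TripleIndex()
index.add("john", "birth-place", "london")
index.add("jane", "birth-place", "london")

print(index.objects("john", "birth-place"))            # {'london'}
print(sorted(index.subjects("birth-place", "london")))  # ['jane', 'john']
print(index.predicates("jane", "london"))              # {'birth-place'}
```

For scale, the slide's figures imply that one billion triples at 350 bytes each come to roughly 350 GB on disk per host.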
Slide 51
SPARQL
• Executed using the triple index
• SPARQL 1.0
• Cost-based optimization
• Join ordering and algorithms
• More in the lightning talks
select * where {
  ?person :birth-place ?place;
          :first-name "John"
}
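The query above is two triple patterns joined on `?person`. The following pure-Python sketch evaluates such a pattern list against a handful of made-up triples. It is a naive nested-loop join in the order written, not the cost-based optimizer the slide mentions, and `match_pattern`, `select`, and the sample data are all hypothetical.

```python
def match_pattern(pattern, triples, binding):
    """Yield bindings that extend `binding` so `pattern` matches a triple."""
    for triple in triples:
        candidate = dict(binding)
        matched = True
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                # A variable must keep one consistent value per solution.
                if candidate.setdefault(term, value) != value:
                    matched = False
                    break
            elif term != value:
                matched = False
                break
        if matched:
            yield candidate


def select(patterns, triples):
    """Join the patterns left to right, like a basic graph pattern."""
    bindings = [{}]
    for pattern in patterns:
        bindings = [
            extended
            for binding in bindings
            for extended in match_pattern(pattern, triples, binding)
        ]
    return bindings


# Made-up data for illustration only.
triples = [
    ("p1", ":first-name", "John"),
    ("p1", ":birth-place", "London"),
    ("p2", ":first-name", "Jane"),
    ("p2", ":birth-place", "Paris"),
]

query = [
    ("?person", ":birth-place", "?place"),
    ("?person", ":first-name", "John"),
]

results = select(query, triples)
print(results)  # [{'?person': 'p1', '?place': 'London'}]
```

Swapping the two patterns gives the same answer but filters on the name first, which is the kind of join-ordering choice an optimizer would make.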
Slide 52
Demo
Slide 53
Slide 54
Old Skool
- Quickie Framework
- Circa 2006ish
- HTML tables -> 1997 style
- ‘action’ controller
- <query/> state -> from the query string
- No sessions
- No CSS
- No Javascript
- No Adaptive Design
- No Facets?
Slide 55
Search
Slide 56
Facets!
Slide 57
Semantics
Slide 58
Just Semantics?
Slide 59
Thank You!

Strategize a Smooth Tenant-to-tenant Migration and Copilot TakeoffStrategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
Strategize a Smooth Tenant-to-tenant Migration and Copilot Takeoff
 
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWEREMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
EMPOWERMENT TECHNOLOGY GRADE 11 QUARTER 2 REVIEWER
 
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
Apidays New York 2024 - Passkeys: Developing APIs to enable passwordless auth...
 
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024Finding Java's Hidden Performance Traps @ DevoxxUK 2024
Finding Java's Hidden Performance Traps @ DevoxxUK 2024
 
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost SavingRepurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
Repurposing LNG terminals for Hydrogen Ammonia: Feasibility and Cost Saving
 

MarkLogic User Group - Best of MLW and Search + Semantics

  • 2. Slide 2 Copyright © 2013 MarkLogic® Corporation. All rights reserved. <MLUGL> <intro/> <talk> <bit>Mission Impossible</bit> <story>Wiley</story> <story>Springer</story> <story>Mitchell1</story> <bit>Search and Semantics</bit> <demo>Old Skool</demo> </talk> </MLUGL>
  • 3. Mission(s) Impossible
  • 4. <story>http://www.marklogic.com/resources/slides-gearing-up-for-the-content-factory-to-quickly-create-innovate-and-monetize/</story>
  • 5. Why is it Mission Impossible?
      Start Revenue Earning January 2013
      • Publish new content from 1 Jan 2013
      • Accepted Articles: 20/day; 100/week; 400/month
      • Early View Articles: 20/day; 100/week; 400/month
      • Issues: 19/month; 77/quarter; 230/year
      Give AGU customers access to all licensed content by 1 January 2013
      • 21 journals (160,000 articles)
      • 33 personal choice products (aka virtual journals) based on AGU index terms
      • 743 special sections
      • Migrate customers, users, products, licenses, alerts data
      Vendors, systems & business processes in Editorial & Production ready to publish 2013 content
      • Integration with new editorial system
      • Changes to workflow
      And… it needs to work the way the AGU site works, with over 60 enhancements
  • 6. Key Challenges
      • Content with no issue number and no pagination
      • Journal with 7 parts, 3 of which have sub-parts!
      • Many moving parts within Wiley: 17 systems to check
      • Content completeness and quality (and external vendor)
      • Unknown unknowns: coping with changing and emerging requirements throughout the development phase
      Challenges to overcome
      • 4 months left!
  • 7. Content-Driven Functionality – Special Section Search
      Examples: “Coastal Ocean Observatories”; “The 11 March 2011 Tohoku-Oki Earthquake and Tsunami”
  • 8. How MarkLogic Helped - S/W Development
      Search Service
      • As a search engine, it doesn't need manual or additional re-indexing after loading new content. Everything is done on the fly, which saves time and effort
      • Enabled reuse: only some enhancements to the search service had to be added for AGU
      Save Searches
      • Because the search service processes requests as XML, it is easy to save a whole search and reuse it, either for alerts or for loading the saved search
      Index Terms
      • Reused the vocabulary service to help with the hierarchy of index terms. This was most valuable for faceting on index terms. Any sub-structure of index terms can easily be fetched
      Faceting
      • MarkLogic supports faceting, so there is no need to do anything special; just add the proper configuration according to the AGU specification
  • 9. What Variations/Non-Standard Practices were introduced
      • New licensing model (e.g. multi choice product for personal subscribers)
      • Create Special Sections as another slice of content view
      • New workflow for handling daily society data updates via feeds
      • Changing content workflow for legacy vs current content
      • Improvements to content (not just conversion)
      • Start development before requirements were clear
      • Complete testing before we had all the content
      • Cannot complete certain types of testing
      • Break some rules
      Recipe for Disaster?
  • 10. Conclusion
      • Mission Impossible? Choose not to accept
      • Mission Impossible? Deal with it – that’s life but may not succeed
      • Mission Impossible? New organizational capability
      • Embrace challenge, but put your best people with experience on it
      • Be brave to break the rules when required
      • People over Process
      • Enabling technologies like MarkLogic
      Develop as a new capability to handle the unexpected and the unknowns
  • 11. <story>http://www.marklogic.com/resources/betting-the-company-how-springer-successfully-insourced-its-flagship-content-platform/</story>
  • 12. Betting the Company | 4/6/2013 | 18. Growth in electronic sales [chart: Total Online vs. Total Print, axis 0% to 100%, 2007 to 2012 Bud; data labels 66 and 33]
  • 13. So... Springer decided to build its own platform
  • 14. 36 man-years of effort to reproduce: how much time an independent software auditor estimated it would take to reproduce the existing code base
  • 15. A risky move? MetaPress code base
  • 16. Oh, and have it ready in 11 months
  • 17. Where we were in April 2011
      People
      • 1 Executive Champion
      • 1 Product Owner
      • 1 Dir. of Dev
      • 1 Tech Lead
      • 2 Developers
      • 1 BA
      • 0 QA
      • 0 DevOps
      • 0 UX/design/front-end
      • 0 architect
      Hardware/Software/Data
      • 0 databases
      • 0 servers
      • 0 documents
      7 staff* (*3 managers – who don’t count)
      Jan-Erik de Boer (EVP of IT); Brian Bishop (Product Owner); Georg Nold (Dir. of Development)
  • 19. Where we are today
      • 1 Executive champion
      • 1 Product Owner
      • 1 Dir. of Dev
      • 2 Tech Leads
      • 16 Developers
      • 2 Dev Ops
      • 4 BAs
      • 6 QAs
      • 2 UX
      • 2 Design/Front-end
      • 1 Architect
      • 16 servers
      • 2 live environments
      • 1 database
      • 12 pairing stations
      • 2 Build Agents
      • 2 dashboard machines
      • 5.7 million documents
      • 60 million PNGs
      • 11TB of data
      31 staff
  • 20. New platform release schedule [timeline: Jan to Dec, with “Release” markers repeated across the months]
  • 22. [Diagram labels: MarkLogic cluster; RESTful APIs; realtime.springer.com; citations.springer.com; iPhone apps]
  • 23. Goals are prioritized (top to bottom) and stories are prioritized (left to right). Velocity is measured every week, allowing us to accurately forecast when a certain level of work can be completed.
  • 24. MarkLogic IS agile
  • 25. MarkLogic agility
      • Schema-less means we can use our complex XML content as-is
        – E.g. different attributes for books, journals, chapters, articles, protocols, etc.
      • You can decide later if you need to add indexes at very little cost
      • You don’t have to know everything up front
      • Ingestion is relatively pain-free
      • You are free to come up with features without worrying about back-end
      • Modifying content via Record Loader makes it easy to manipulate data
      • Handles various types of native content
      • You don’t even have to use XQuery!
  • 26. What if you could subscribe to a search query?
  • 27. Storing entitlements as queries means any new content loaded automatically becomes available to authorized users
      [Diagram labels: Content, Entitlements, Customers, 2TB]
      <content>: Journal_ID: 0001; ContentType: Article; DatePublished: 4/4/2012; Subject: Mathematics; Author: John Smith; Language: English; Keywords: “k theory”
      <material_ID=“001”>: Subject: Engineering
      <material_ID=“002”>: Journal_ID: 0001-0099
      <material_ID=“003”>: Subject: Engineering; SearchTerm: “carbon nanotube”; DatePublished: 2000-2012
      <customer=“001”>: material_ID: 001
      These are stored as serialized queries
  • 28. How did it go?
  • 29. Average Page Load Time (sec) [bar chart: Old vs. New, axis 0 to 12]
  • 30. Weekly visits to SpringerLink (millions, Aug 4, 2012 – Mar 2, 2013). Source: Google Analytics [chart series: link.springer.com, SpringerLink.com, Total; axis 0 to 5,000,000]
  • 31. <story>http://www.marklogic.com/resources/the-journey-from-print-to-online/</story>
  • 32. 2011
  • 33. Mitchell1: Data
      • 65 OEM Auto and Part Manufacturers
      • Data on every modern car sold in US
      • Repair
      • Diagnostics
      • Maintenance
      • Technical Service Bulletins (TSBs)
      • Wiring
      • Estimator
  • 34. What’s in the data store today?
      • Articles – 408,892
        – 209,987 Narratives
        – 103,416 Technical Service Bulletins and Recalls
        – 15,179 Maintenance Schedules
      • Images – 6,193,647
        – 5,924,959 Narrative
        – 268,688 Technical Service Bulletins and Recalls
      • When it’s all broken down, it becomes roughly 16,000,000 MarkLogic Documents
  • 35. And how do we describe it?
      • Preferred Terms
        – Tends to be the ASE term
        – Used to describe Components (12,261), Diagnostic Trouble Codes (65,525), and Information Types (98)
      • Non-Preferred Terms
        – Tends to be OEM specific terminology
        – Alternate terms for Components (22,733) and Information Types (757)
        – Codes do not have Non-Preferred Terms
      • Spatial References
        – Because “Replace the window motor” just isn’t precise enough
  • 36. Mitchell1: Data Then, Data Now
  • 37. Mitchell1: Data Then, Data Now
  • 38. Mitchell1: Data Then, Data Now
  • 39. Mitchell1: Market Reaction https://www.youtube.com/watch?v=IfM8v-8NY_4&list=UUIOYnh6LBFooV_YxlPVPLvA&index=36
  • 40. Search . . . and Semantics
  • 41. One Question . . .
  • 42. Who’s Smarter? VS
  • 43. Do domestic dogs interpret pointing as a command? Animal Cognition (2012): 1-12, November 09, 2012. By Scheider, Linda; Kaminski, Juliane; Call, Josep; Tomasello, Michael
  • 44. What if . . .
  • 45. HOW?
  • 46. The Basic Idea
      Get some triples . . . if you haven’t already
      • Grabbed DBPedia
      • Dumped in Linked Data Consortium
      • Loaded Lehigh
      • and NYT’s open data
      You are behind!
      But what if you could add in documents?
  • 47. Rich MarkLogic Applications .. Made Richer
  • 48. Rich MarkLogic Applications .. Made Richer
      Name: John Smith; Affiliation: IBM; Timezone: PST; Committer: Hadoop
  • 49. Semantics Architecture [diagram labels: TRIPLE; XQY; XSLT; SQL; SPARQL; GRAPH SPARQL]
  • 50. Triple Index
      • 3 triple orders
      • Cached for performance
      • Works seamlessly with other indexes
      • Security
      • 350 bytes per triple on disk
      • 1 billion+ triples per host
  • 51. SPARQL
      • Executed using the triple index
      • SPARQL 1.0
      • Cost-based optimization
      • Join ordering and algorithms
      • More in the lightning talks
      Example query: select * where { ?person :birth-place ?place; :first-name "John" }
  • 52. Demo
  • 54. Old Skool
      - Quickie Framework
      - Circa 2006ish
      - HTML tables -> 1997 style
      - ‘action’ controller
      - <query/> state -> from the query string
      - No sessions
      - No CSS
      - No JavaScript
      - No Adaptive Design
      - No Facets?
  • 55. Search
  • 56. Facets!
  • 57. Semantics
  • 58. Just Semantics?
  • 59. Thank You!
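
The entitlement idea from slide 27 (store each entitlement as a query, so newly loaded content becomes available without touching customer records) can be sketched in a few lines. This is a minimal illustration in plain Python, not MarkLogic code: the `Entitlement` class, the `can_access` helper, and the dictionary field names are assumptions invented for the example, loosely mirroring the sample values on the slide.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Entitlement:
    """A stored query; a document is covered only if every set criterion matches."""
    subject: Optional[str] = None
    journal_range: Optional[Tuple[int, int]] = None  # inclusive journal ID range
    search_term: Optional[str] = None                # must appear in the keywords
    year_range: Optional[Tuple[int, int]] = None     # inclusive publication years

    def matches(self, doc: dict) -> bool:
        if self.subject is not None and doc["subject"] != self.subject:
            return False
        if self.journal_range is not None:
            low, high = self.journal_range
            if not low <= doc["journal_id"] <= high:
                return False
        if self.search_term is not None and self.search_term not in doc["keywords"]:
            return False
        if self.year_range is not None:
            first, last = self.year_range
            if not first <= doc["year"] <= last:
                return False
        return True


# Hypothetical entitlements, mirroring material_ID 001-003 on the slide.
ENTITLEMENTS = {
    "001": Entitlement(subject="Engineering"),
    "002": Entitlement(journal_range=(1, 99)),
    "003": Entitlement(subject="Engineering", search_term="carbon nanotube",
                       year_range=(2000, 2012)),
}

# Hypothetical customer record: customer 001 holds material 001.
CUSTOMERS = {"001": ["001"]}


def can_access(customer_id: str, doc: dict) -> bool:
    """Evaluate the customer's stored queries against a document at request time."""
    materials = CUSTOMERS.get(customer_id, [])
    return any(ENTITLEMENTS[m].matches(doc) for m in materials)


# The slide's sample article, plus a hypothetical article loaded later.
math_article = {"journal_id": 1, "subject": "Mathematics", "year": 2012,
                "keywords": ["k theory"]}
new_article = {"journal_id": 250, "subject": "Engineering", "year": 2013,
               "keywords": ["carbon nanotube"]}

print(can_access("001", math_article))
print(can_access("001", new_article))
```

In this sketch the Engineering article loaded later is accessible to customer 001 even though the customer record was never updated, because access is decided by evaluating the stored query rather than by a per-document grant list.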

Editor's notes

  1. << JBG: Data Now slide needs to be replaced. A slide at the end of this presentation contains an appropriate image. >>
  2. << JBG: Data Now slide needs to be replaced. A slide at the end of this presentation contains an appropriate image. >>
  3. << JBG: Data Now slide needs to be replaced. A slide at the end of this presentation contains an appropriate image. >>
  4. Run it past Michaline and Dave Gorbet. Include fulltext index in exposition.
  5. Not all of the index has to be in memory. Roles and permissions. Check sizing. See a SPARQL query. Spend a bit more time on this slide.