Mark Logic Information Analysis Trends Webinar
1. Unlock Content™
Exploring Rising Trends in Information Analysis
Norman Walsh, Principal Technologist
John Kreisa, Director of Industry Solutions
5 August 2009
Rev 3.5.3
Copyright © 2009 Mark Logic Corporation. All rights reserved. Slide 1
2. Topics
Introduction to Mark Logic
Information warehousing trends
Customer examples
Demonstrations
Q&A
3. How We Help Our Customers
Mark Logic accelerates the creation of information applications
Integrate information from multiple sources
Repurpose content into multiple products
Enable dynamic custom views
Deliver information through any channel or device
Search-and-discover previously unknown information
4. Who Uses Mark Logic?
Magazine Publishing Education Government
Software / Services Legal Tax Financial Enterprise
Aggregation STM
6. Quotable Quote
“Burton Group's Methodology for Overcoming Data Silos (called
'MODS'), used in conjunction with the new XQuery-based development
stack, promises to be an ideal approach for bridging application silos
and for building content-centric applications. ... MODS and the
XQuery development stack offer a unique opportunity for enterprise IT
professionals to simplify IT applications and bridge the walls of
application silos while reinforcing sound data management practices.”
Data Management Strategies Overview, Burton Group
7. Quotable Quote
“In a collection of recent papers, we predicted the end of 'one size fits all' as
a commercial relational DBMS paradigm. These papers presented reasons
and experimental evidence that showed that the major RDBMS vendors
can be outperformed by 1-2 orders of magnitude by specialized engines
in essentially any vertical market of significant size …
Assuming that specialized engines dominate these markets over time, the
current relational DBMSs are merely 25 year old legacy code lines that
should be retired in favor of a collection of 'from scratch' specialized
engines.”
Michael Stonebraker, HPTS, 2007
8. Look inside MarkLogic Server
Industry's leading XML server
Store: High-availability, transactional storage of information
Search: Full-text, structure and geospatial search
Analyze: Enrich and analyze for insight into your content
Deliver: Render to any format, push alerts to end users
9. MarkLogic Server features
DBMS Features | Search Features
XQuery 1.0 | Alerting (“profiling”)
Transaction semantics / zero latency | Faceted navigation
Geospatial indexing | Schema agnosticism
Non-blocking read consistency | Full-text extensions to XQuery
Triggers | Integrated XML and text search
Point-in-time queries | Fielded search
Backup and recovery | Programmable relevance
Scriptable administration | Entity extraction / enrichment
High-availability | Advanced linguistics
Clustered, shared-nothing architecture | Thesaurus and taxonomy support
SVM classifier + rule-based classifier
Scaling architecture
10. Fully Leverage Information
Documentation Book Report
Policies & procedures Assessment Item Support case
11. A Fundamentally Different Approach
New content infrastructure enables new applications
 | Data | Text | Content
Data model | Table | Flat text | XML
Interface | SQL | Proprietary API | XQuery
Delivers | Records | Links | XML
Platform | Relational DBMS | Search engine | XML server
12. MarkLogic Selected Differentiators
Schema-agnostic data storage
One schema, multiple schemas, no schema
Fully composable universal index
Full text, XML, geospatial, … all managed through XQuery
Flexible content processing pipelines
Efficiently process unpredictable and unannounced data
Element-level access/isolation for analysis and retrieval
Billions of documents, terabytes of data, sub-second response
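As a rough illustration of schema-agnostic storage with a composable index, the sketch below indexes every word and every (element, word) pair of arbitrary XML without consulting a schema, then answers a query by intersecting posting sets. It is a toy in Python, not MarkLogic's implementation; the names `UniversalIndex`, `add`, and `search`, and the sample documents, are hypothetical.

```python
import re
import xml.etree.ElementTree as ET
from collections import defaultdict


class UniversalIndex:
    """Toy inverted index over words and (element, word) pairs (hypothetical)."""

    def __init__(self):
        self.postings = defaultdict(set)  # index key -> set of document ids

    def add(self, doc_id, xml_text):
        # No schema is consulted: whatever elements appear get indexed.
        for elem in ET.fromstring(xml_text).iter():
            for word in re.findall(r"\w+", (elem.text or "").lower()):
                self.postings[("word", word)].add(doc_id)
                self.postings[("elem-word", elem.tag, word)].add(doc_id)

    def search(self, *keys):
        # Composability: a query is an intersection of posting sets.
        sets = [self.postings.get(k, set()) for k in keys]
        return set.intersection(*sets) if sets else set()


index = UniversalIndex()
index.add("a", "<report><title>De-icing procedures</title><body>winter</body></report>")
index.add("b", "<memo><subject>Winter schedule</subject></memo>")

print(index.search(("word", "winter")))                    # full-text only
print(index.search(("elem-word", "title", "procedures")))  # text scoped to an element
```

The two documents use unrelated element names, yet both are searchable immediately, which is the point of the "one schema, multiple schemas, no schema" bullet.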
13. Mark Logic Standards Focus
XML, XQuery, XML Schema, HTTP / HTTPS…
Web services APIs (HTTP)
To and from the server with any HTTP web service
SSL
REST (including JSON) and SOAP
W3C committee participation
XML Query Working Group
XML Schema Working Group
XML Core Working Group (co-chair)
XML Processing Model Working Group (chair)
OASIS
DocBook Technical Committee (chair)
RELAX NG Technical Committee
14. Topics
Introduction to Mark Logic
Information warehousing trends
Customer examples
Demonstrations
15. Centralized Information Warehouse
Human Enrichment
Internal documents
Applications
Partner content
Internet
Analytics
Alerting
The Web
Delivery
Transformation & Machine Enrichment
RDBMS
16. Application Integration Architecture
Integration available at multiple levels
Application level
Infrastructure level
Enables drill through and drill down with link back
MarkLogic provides content analytics to Data Warehouse and vice versa
Integration ensures seamless user experience
[Diagram labels: Application layer, RDBMS]
17. Simplified Application Architectures
Traditional
Separate tier for presentation and business logic
Multiple access languages
Causes impedance mismatch
New
Single tier written in XQuery
Integrated presentation and business logic
Simplified and efficient application architecture
Direct-to-browser communications
[Diagram labels. Traditional: Presentation Logic, Business Logic, Application Server, Information logic, Storage. New: Presentation Logic, Business logic, Information logic, MarkLogic.]
18. Enhance Content Through Enrichment
Named entities
People
Places
Names
Geocoding
Company
Currency
Weapons
Chemicals
Etc.
“Facts”
Relationship
Subject
Object
Relationship
Event
Change
Time
Location
Classification
Sentiment
19. Content Enrichment – Example
Highlighted entity types: Location, Dates, Organizations, Money, Names, Titles, Drugs
KABUL, Afghanistan, May 21 (AP) -- Profits from Afghanistan's thriving poppy fields are increasingly flowing to Taliban fighters, leading U.S. and NATO officials to conclude that the counterinsurgency mission must now include stepped-up anti-drug efforts.
This year's heroin-producing poppy crop will at least match last year's record haul and could exceed it by up to 20 percent, officials say, meaning more money to fuel the Taliban's violent insurgency.
"It's wrong to say that you can do one thing and not the other," Ronald Neumann, who recently stepped down as U.S. ambassador to Afghanistan, said of the link between anti-drug and anti-terrorism efforts. "You have to deal with both at the same time."
Afghanistan accounts for more than 90 percent of the world's heroin supply, and a significant portion of the profits from the $3.1 billion trade is thought to flow to Taliban fighters, who tax and protect poppy farmers and drug runners.
Drug control has not been part of the official mandate of international forces in Afghanistan. But there is a growing push for NATO's International Security Assistance Force, or ISAF, to play a more active role in sharing intelligence and detecting drug convoys and heroin labs, said Daan Everts, NATO's senior civilian official in Afghanistan.
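The highlighting in this example can be approximated by wrapping each recognized mention in an element named after its entity type. The sketch below is a minimal Python illustration using a hand-made lookup table; the `GAZETTEER` contents, the `enrich` function, and the element names are assumptions for illustration, and a production enrichment step would rely on trained extractors rather than a fixed list.

```python
import re
from xml.sax.saxutils import escape

# Tiny hand-made gazetteer (hypothetical); real systems use trained extractors.
GAZETTEER = {
    "Afghanistan": "location",
    "KABUL": "location",
    "Taliban": "organization",
    "NATO": "organization",
    "Ronald Neumann": "name",
    "$3.1 billion": "money",
}


def enrich(text):
    """Wrap each known entity mention in an element named after its type."""
    # Longest phrases first so multi-word entities win over shorter ones.
    phrases = sorted(GAZETTEER, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(p) for p in phrases))
    out, pos = [], 0
    for m in pattern.finditer(text):
        out.append(escape(text[pos:m.start()]))  # plain text before the match
        kind = GAZETTEER[m.group()]
        out.append(f"<{kind}>{escape(m.group())}</{kind}>")
        pos = m.end()
    out.append(escape(text[pos:]))  # trailing plain text
    return "".join(out)


print(enrich("KABUL, Afghanistan, May 21 (AP) -- Profits flow to Taliban fighters"))
```

Once the entities are marked up inline, they can be searched and counted like any other element in the content.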
20. Exploitation of Location Information
Where is the new when
Search using Geospatial markup in the data
Geospatial types e.g. GML, KML, GeoRSS/Simple
Search based on proximity or area
Push content to me when I approach
Enabled by high-speed geospatial indexing
Real-time query and analytics
Combine geospatial with full-text and structured search
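A proximity search combined with a text constraint can be sketched as follows. This is a minimal Python illustration that scans a small list linearly and uses the haversine formula for distance; it does not show a geospatial index, and the function names and sample documents are made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def search_nearby(docs, lat, lon, radius_km, term):
    """Return ids of docs within radius_km of (lat, lon) whose text contains term."""
    return [
        d["id"]
        for d in docs
        if term.lower() in d["text"].lower()
        and haversine_km(lat, lon, d["lat"], d["lon"]) <= radius_km
    ]


# Hypothetical sample documents carrying a point location.
docs = [
    {"id": "kabul", "lat": 34.53, "lon": 69.17, "text": "Poppy crop report"},
    {"id": "kandahar", "lat": 31.61, "lon": 65.71, "text": "Poppy convoy sighting"},
    {"id": "london", "lat": 51.51, "lon": -0.13, "text": "Poppy appeal"},
]
print(search_nearby(docs, 34.5, 69.2, 100, "poppy"))
```

All three sample documents match the text term, so the radius alone decides which ones are returned; widening it to 600 km also admits the Kandahar document.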
21. Information-based Alerting
Let the content find the user
Relevant data finds the user
Standing queries avoid repetitive searching
More efficient use of time and resource
Alerts can have many actions
Send email, IM, call a web service, …
Run additional queries…
Update the database…
[Illustration: SMS Alert: New data matching your search]
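The standing-query idea can be sketched by reversing the usual search direction: queries are stored, and each new document is matched against them on arrival. The Python below is a simplified illustration under that assumption; `AlertRegistry`, `subscribe`, `ingest`, and `send_message` are hypothetical names, and term matching is plain whitespace tokenization.

```python
class AlertRegistry:
    """Stores standing queries and fires actions when new content matches."""

    def __init__(self):
        self.rules = []  # list of (user, required_terms, action)

    def subscribe(self, user, terms, action):
        self.rules.append((user, {t.lower() for t in terms}, action))

    def ingest(self, doc_text):
        """Match one new document against every standing query."""
        words = set(doc_text.lower().split())
        fired = []
        for user, terms, action in self.rules:
            if terms <= words:  # all query terms occur in the document
                action(user, doc_text)
                fired.append(user)
        return fired


outbox = []  # stand-in for email or SMS delivery


def send_message(user, doc_text):
    outbox.append(f"To {user}: new data matching your search: {doc_text}")


alerts = AlertRegistry()
alerts.subscribe("analyst1", ["poppy", "taliban"], send_message)
alerts.subscribe("analyst2", ["de-icing"], send_message)

print(alerts.ingest("Taliban profits from poppy fields rise"))
print(outbox)
```

The action is an ordinary callable, so the same mechanism could call a web service, run a follow-up query, or update stored content instead of sending a message.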
22. Topics
Introduction to Mark Logic
Information warehousing trends
Customer examples
Demonstrations
23. Morgan Markets Research Portal
Vision for the project: manage and deliver timely research to 80,000 users worldwide
Use of RIXML drives granular management and delivery of information
Metadata added using a variety of tools
Use alerting to deliver content to both the portal and email
24. Advanced Search and Navigation
Advanced search
Faceted navigation
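Faceted navigation amounts to counting metadata values over the current result set and letting the user narrow by one of them. A minimal Python sketch follows; the field names and sample research reports are hypothetical and are not taken from the portal shown on this slide.

```python
from collections import Counter


def facet_counts(docs, facet):
    """Count how many documents carry each value of one metadata field."""
    return Counter(d[facet] for d in docs if facet in d)


def refine(docs, **selected):
    """Keep only documents matching every selected facet value."""
    return [d for d in docs if all(d.get(k) == v for k, v in selected.items())]


# Hypothetical research reports with two facets each.
reports = [
    {"title": "Q2 outlook", "region": "EMEA", "asset": "Equity"},
    {"title": "Rates weekly", "region": "EMEA", "asset": "Fixed income"},
    {"title": "Tech sector", "region": "Americas", "asset": "Equity"},
]

print(facet_counts(reports, "region"))   # counts shown next to each facet value
narrowed = refine(reports, region="EMEA")
print(facet_counts(narrowed, "asset"))   # counts recomputed after narrowing
```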
25. Customer Case Study: DoD
Large information-oriented defense organization
Builds and maintains an enterprise catalog
Single central repository of content metadata in DDMS
Provides information services to applications
Enhance and leverage existing enterprise catalog
Vastly underused resource
Challenge
Take full advantage of DDMS
Improve ingestion (Atom feeds, etc.)
Deliver UI/Web service duality
Help users leverage schemas and metadata extensions
26. DoD Solution
Enterprise catalog built on MarkLogic Server
Replace existing RDBMS infrastructure with MarkLogic Server
100x faster ingestion of content
Provide geospatial services for information in KML format
Provide UI/web services duality
Benefits
Now able to reach original vision for enterprise catalog project
Fully leverage DDMS metadata (geospatial, temporal etc.)
Reduce application complexity and increase services
27. DoD Enterprise Catalog Architecture
Fat client application 3rd party client Direct enterprise catalog client
Query services 3rd party web application
Ingestion services
Query services
Ingestion services
Saturn synchronizer Web services Web UI
Enterprise catalog business logic
Enterprise services
28. JetBlue: Dynamic Assembly and Reuse
Replace an existing method for document assembly and updates
Major issues existed
System was manual-oriented
Flight Operations Manual (FOM)
Station Operations Manual (SOM)
Flight Attendant Manual (FAM)
Etc…
Document system was manually maintained, published and accessed
They were at risk of losing their operations license
[Callout: 456 pages!!]
29. Information Needed in Many Docs
FAA
Winter Operations
1.0 De-Icing
1.1 Responsibility
1.2 Authority
1.3 Procedures
Section 1.3 Section 2.4 Section 5.2.1
Station Operations Manual (SOM) Flight Operations Manual (FOM) Ground Operations Manual (GOM)
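The reuse pattern on this slide, one Winter Operations component appearing at a different section number in each manual, can be sketched as a shared component store plus per-manual placements. The Python below is an illustrative assumption: the data layout, the `assemble` function, and the subsection numbering scheme (placement number plus a running index) are not from the deck.

```python
# One shared component, stored once (layout is hypothetical).
components = {
    "winter-ops": {
        "title": "Winter Operations",
        "parts": ["Responsibility", "Authority", "Procedures"],
    },
}

# Each manual only records where the shared component is placed.
placements = {
    "SOM": [("1.3", "winter-ops")],
    "FOM": [("2.4", "winter-ops")],
    "GOM": [("5.2.1", "winter-ops")],
}


def assemble(manual):
    """Render every placed component with manual-specific section numbers."""
    lines = []
    for section, comp_id in placements[manual]:
        comp = components[comp_id]
        lines.append(f"{section} {comp['title']}")
        for i, part in enumerate(comp["parts"], start=1):
            lines.append(f"{section}.{i} {part}")
    return lines


for name in placements:
    print(name, assemble(name))
```

Because each manual holds a reference rather than a copy, an edit to the shared component shows up in all three manuals the next time they are assembled.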
30. Technical Architecture
Content Creation & Assembly
Author workstations
Word + Add-in
SharePoint UI
SharePoint
Dynamic Content Delivery
Workstations
Document viewer web application
Laptops & handhelds
Synchronization Services
34. Topics
Introduction to Mark Logic
Information warehousing trends
Customer examples
Demonstrations
36. ----- Appendix -----
Supplemental material follows
37. MarkLogic Server Positioning
Structure \ Queries | Predefined | Ad hoc
Ad hoc | Search Engines | MarkLogic Server
Predefined | IMS, IDMS | Relational Databases
38. Mark Logic Generic Solutions
We accelerate the creation of information products
Content repurposing
Using the same content in multiple products
Content integration
Building products with content from different sources
Content delivery
Delivering content to multiple output formats and devices
Custom publishing
User-driven creation of unique information products
Search and discovery
Finding previously unknown information
Content analytics
39. Search Engine Differentiation
XML Server | Search Engine
Searches text | Searches text
Indexes XML | “Filters” files
Read/write (think: Web 2.0) | Read-only
Transaction consistency | Index latency
Standard query language | Proprietary APIs
Pushes queries* to data | Moves data to queries
Top-to-bottom XML | Impedance mismatches
Designed as DBMS platform | Designed to help find files
Participant in the DBMS specialization trend | Participant in the eaten alive by Google Appliance trend
* In the sense of complex queries