Open Source Search for the Enterprise
Charlie Hull
Managing Director, Flax
3rd November 2010
OVUM Briefing, Search Across the Enterprise
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch
Who are Flax?
Search engine specialists with decades of experience
Developers, innovators and strategists
Based in Cambridge, UK
Technology agnostic – but open source exponents
Recently selected as UK Authorized Partner by Lucid Imagination
Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge
Recently asked to present at British Computer Society and Lucene Revolution conferences
What is open source?
“Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia)
Myths about open source
It's the work of amateur developers
If I use open source, I have to open up my software/servers/network to all and sundry
Open source software isn't reliable or scalable
It's free
It's unsupported
Open source search software
Xapian:
- The successor to Muscat
- Bayesian probabilistic ranking
- C/C++ with language bindings
- Highly accurate & scalable
Apache Lucene / Solr:
- Flexible licensing
- Vector space model
- Java and other languages
- Well known and supported
And more...
Apache Lucene and Solr are trademarks of The Apache Software Foundation
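To make the "vector space model" bullet concrete, here is a minimal sketch of TF-IDF weighting with cosine similarity. It illustrates the general idea only and is not the scoring formula of Lucene, Solr or any other engine; the function names and the whitespace tokenizer are assumptions made for this example.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """Build one sparse TF-IDF vector (term -> weight) per document."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: the number of documents each term appears in.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        tf = Counter(tokens)
        vectors.append({t: tf[t] * (1.0 + math.log(n_docs / df[t])) for t in tf})
    return vectors, df, n_docs

def cosine(a, b):
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank(query, docs):
    """Return indices of matching documents, best match first."""
    vectors, df, n_docs = tf_idf_vectors(docs)
    q_tf = Counter(query.lower().split())
    # Query terms that occur in no document carry no weight.
    q_vec = {t: q_tf[t] * (1.0 + math.log(n_docs / df[t])) for t in q_tf if t in df}
    scores = [(cosine(q_vec, v), i) for i, v in enumerate(vectors)]
    return [i for score, i in sorted(scores, key=lambda s: (-s[0], s[1])) if score > 0]
```

Documents and queries become weighted term vectors, and a shorter document that matches the same query terms ranks higher because its vector points more closely in the query's direction.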
Some examples
http://www.nla-clipshare.com
Newspaper Licensing Agency – NLA Clipshare
20 million newspaper stories
6500 users
Content from every major newspaper (and most regionals)
Used by journalists, clippings agencies, media monitors
Replacing internal systems at major newspapers
One of very few ways to search content from all the papers within hours of publication
Some examples
Financial Times – press cuttings
Web Service for easy integration
XML source data
Faceted search
Area filters (whole article, body, headline, byline or any combination)
Synonyms, spelling suggestions
Built from scratch in a fortnight
Designed as a prototype, scaled to production use without significant change
http://presscuttings.ft.com
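As a rough illustration of how area filters and faceted search fit together, the sketch below restricts a term match to chosen fields and then counts the values of another field across the results. The article records, field names and function names are hypothetical and do not describe the Financial Times system.

```python
from collections import Counter

# Hypothetical sample records; a real index would hold these fields per article.
ARTICLES = [
    {"headline": "Bank raises rates", "byline": "A. Smith",
     "body": "The central bank acted on inflation.", "section": "Economy"},
    {"headline": "Retail sales fall", "byline": "B. Jones",
     "body": "One high street bank warned of weaker lending.", "section": "Companies"},
    {"headline": "Bank holiday travel chaos", "byline": "A. Smith",
     "body": "Rail services were disrupted.", "section": "UK"},
]

def search(articles, term, areas=("headline", "byline", "body")):
    """Return articles containing the term in any of the selected areas."""
    term = term.lower()
    return [a for a in articles if any(term in a[area].lower() for area in areas)]

def facet_counts(results, field):
    """Count how many results fall under each value of the given field."""
    return Counter(a[field] for a in results)
```

For example, `facet_counts(search(ARTICLES, "bank", areas=("headline",)), "byline")` counts headline-only matches per author, which is the kind of figure a facet sidebar would display.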
Some examples
Durrants Ltd. Media monitoring platform
Thousands of client search profiles
Hundreds of thousands of articles per day
Complex publication hierarchy
Established pipeline
Solution
Flexible query language allows for OCR errors, punctuation, fuzzy matching, weighting
Supports features of previous engine
Scalable master-slave architecture
Accuracy improved in some cases from 95% rejected to 95% accepted
Hardware budget 15% of previous system
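One common way to tolerate OCR errors is to accept terms within a small edit distance of the query term. The sketch below shows that idea with a plain Levenshtein distance; it is an assumption about the general technique, not the query language built for Durrants, and the function names are made up for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum single-character edits turning a into b."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def fuzzy_contains(text, term, max_edits=1):
    """True if any word in text is within max_edits of term, ignoring case and edge punctuation."""
    term = term.lower()
    tokens = (tok.strip(".,;:!?'\"()") for tok in text.lower().split())
    return any(edit_distance(tok, term) <= max_edits for tok in tokens)
```

With `max_edits=1`, a scanned "c1imate" still matches a profile searching for "climate", while unrelated words do not.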
Some examples
(Unnamed multinational radio suppliers)
Intranet search
12 million documents
Multiple formats – Office, PDF, HTML...
User and group-based security (LDAP)
Faceted search
Users can 'tag' interesting documents – for example to identify a 'reference' version
Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale
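User and group-based security is often implemented by storing an access list with each document and trimming results against the searcher's identity. The sketch below assumes the user's groups have already been looked up in the directory (for example via LDAP) and shows only the filtering step; the `acl` field, the `user:`/`group:` prefixes and the sample data are hypothetical.

```python
def visible_results(results, user, user_groups):
    """Keep only documents whose access list names the user or one of their groups."""
    allowed = {f"user:{user}"} | {f"group:{g}" for g in user_groups}
    return [doc for doc in results if allowed & set(doc["acl"])]

# Hypothetical search results, each carrying its own access control list.
RESULTS = [
    {"id": 1, "acl": ["group:engineering"]},
    {"id": 2, "acl": ["group:hr"]},
    {"id": 3, "acl": ["user:alice"]},
]
```

A production system would more likely push this check into the query as a filter, so that hidden documents never affect result counts or facets.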
A look at Lucene & Solr
Among the top 15 open source projects
Installations at over 4,000 companies
Downloads have grown nearly 10x over the past three years
Over 7,000 downloads a day
Lucid Imagination:
USA based
Employs 9 out of 15 top Lucene committers
Offers training, consulting and up to 24x7 support
Developing value-add software
Flax are UK partners & resellers
Lucid Works Enterprise
Who are Lucid working with?
Some Lucene & Solr numbers
LinkedIn – 30 million users
Internet Archive – a billion indexed pages
Salesforce.com – 8 terabytes of searchable data
Twitter – a billion queries a day
Why open source search?
Flexible, extendable
Powerful & scalable
Lower cost, especially when planning for growth
Commercial support available as necessary
- Freedom to innovate
Looking to the future
More and more content including social media
Multiple delivery platforms
Search-powered applications
Cloud computing
More use of entity extraction & sentiment analysis
Search no longer a bolt-on, but a platform for innovation
Open source no longer an outsider, but the obvious choice
Thank you!
Any questions?
charlie@flax.co.uk
www.flax.co.uk/blog
+44 (0) 8700 118334
Twitter: @FlaxSearch

Elevate Developer Efficiency & build GenAI Application with Amazon Q​
 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
MS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectorsMS Copilot expands with MS Graph connectors
MS Copilot expands with MS Graph connectors
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 

Flax ovum search-across_the_enterprise

  • 1. Open Source Search for the Enterprise. Charlie Hull, Managing Director, Flax. 3rd November 2010. OVUM Briefing, Search Across the Enterprise. charlie@flax.co.uk, www.flax.co.uk/blog, +44 (0) 8700 118334, Twitter: @FlaxSearch
  • 2. Who are Flax? Search engine specialists with decades of experience. Developers, innovators and strategists. Based in Cambridge, UK. Technology agnostic – but open source exponents. Recently selected as UK Authorized Partner by Lucid Imagination. Customers include Mydeco, NLA, Durrants Ltd, Financial Times, MediaMiser, MySkreen, Accenture, University of Cambridge. Recently asked to present at British Computer Society and Lucene Revolution conferences.
  • 3-4. What is open source? “Open-source software (OSS) is computer software that is available in source code form for which the source code and certain other rights normally reserved for copyright holders are provided under a software license that permits users to study, change, and improve the software. […] Some open source software is available within the public domain” (Wikipedia)
  • 5-9. Myths about open source: It's the work of amateur developers. If I use open source, I have to open up my software/servers/network to all and sundry. Open source software isn't reliable or scalable. It's free. It's unsupported.
  • 10-12. Open source search software. Apache Lucene and Solr: flexible licensing; vector space model; Java and other languages; well known and supported. A second engine, not named in the transcript: the successor to Muscat; Bayesian probabilistic ranking; C/C++ with language bindings; highly accurate & scalable. And more.... Apache Lucene and Solr are trademarks of The Apache Software Foundation.
  • 13-14. Some examples: Newspaper Licensing Agency – NLA Clipshare (http://www.nla-clipshare.com). 20 million newspaper stories. 6500 users. Content from every major newspaper (and most regionals). Used by journalists, clippings agencies, media monitors. Replacing internal systems at major newspapers. One of very few ways to search content from all the papers within hours of publication.
  • 18-19. Some examples: Financial Times – press cuttings (http://presscuttings.ft.com). Web Service for easy integration. XML source data. Faceted search. Area filters (whole article, body, headline, byline or any combination). Synonyms, spelling suggestions. Built from scratch in a fortnight. Designed as a prototype, scaled to production use without significant change.
  • 21-22. Some examples: Durrants Ltd. Media monitoring platform. Thousands of client search profiles. Hundreds of thousands of articles per day. Complex publication hierarchy. Established pipeline. Solution: Flexible query language allows for OCR errors, punctuation, fuzzy matching and weighting. Supports features of previous engine. Scalable master-slave architecture. Accuracy improved in some cases from 95% rejected to 95% accepted. Hardware budget 15% of previous system.
  • 23-24. Some examples: (Unnamed multinational radio suppliers) Intranet search. 12 million documents. Multiple formats – Office, PDF, HTML... User and group-based security (LDAP). Faceted search. Users can 'tag' interesting documents – for example to identify a 'reference' version. Open source chosen because of significant cost advantage – commercial solutions uneconomic at this scale.
  • 25-27. A look at Lucene & Solr: Among the top 15 open source projects. Installations at over 4,000 companies. Downloads have grown nearly 10x over the past three years. Over 7,000 downloads a day. Lucid Imagination: USA based. Employs 9 out of 15 top Lucene committers. Offers training, consulting and up to 24x7 support. Developing value-add software. Flax are UK partners & resellers.
  • 29. Who are Lucid working with?
  • 30. Some Lucene & Solr numbers: LinkedIn – 30 million users. Internet Archive – a billion indexed pages. Salesforce.com – 8 terabytes of searchable data. Twitter – a billion queries a day.
  • 31-35. Why open source search? Flexible, extendable. Powerful & scalable. Lower cost, especially when planning for growth. Commercial support available as necessary. Freedom to innovate.
  • 36-43. Looking to the future: More and more content including social media. Multiple delivery platforms. Search-powered applications. Cloud computing. More use of entity extraction & sentiment analysis. Search no longer a bolt-on, but a platform for innovation. Open source no longer an outsider, but the obvious choice.
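
The Financial Times example mentions faceted search and area filters (headline, byline, body) without showing what a request looks like. As a rough illustration only, the sketch below assembles such a request for Solr's select handler in Python. The parameters `q`, `wt`, `facet` and `facet.field` are standard Solr request parameters, but the field names (`headline`, `byline`, `body`, `section`) and the helper `build_select_params` are hypothetical, and the slides do not say which engine or schema the Financial Times service actually uses.

```python
from urllib.parse import urlencode

# Hypothetical field names; the slides do not describe the real schema.
AREAS = {"headline", "byline", "body"}


def build_select_params(terms, areas, facet_fields):
    """Build Solr-style select parameters restricted to the chosen areas."""
    unknown = set(areas) - AREAS
    if unknown:
        raise ValueError(f"unknown areas: {sorted(unknown)}")
    # Search each chosen area for the terms and OR the clauses together.
    clauses = [f"{area}:({terms})" for area in areas]
    params = [("q", " OR ".join(clauses)), ("wt", "json"), ("facet", "true")]
    # One facet.field parameter per field whose value counts we want back.
    params += [("facet.field", field) for field in facet_fields]
    return params


params = build_select_params("interest rates", ["headline", "byline"], ["section"])
query_string = urlencode(params)
print(query_string)
```

The resulting query string would be appended to a select URL on whatever Solr core holds the articles. Restricting the search to "any combination" of areas then amounts to changing the `areas` list.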