Business Intelligence
Semantic Search in SQL Server 2012
• Semantic search seeks to improve search
accuracy by understanding searcher intent
and the contextual meaning of terms as they
appear in the searchable dataspace.
What is Semantic Search
• Built on top of Full-Text Search
• Requires a predefined external database (the Semantic Language Statistics database)
• That database must be attached to the SQL
Server instance
• Semantic Search must then be registered to use
that database
Semantic Search in SQL Server 2012
• Available in all commercial editions of SQL Server
2012
• Also available in SQL Server 2012 Express with Advanced
Services
Supported in SQL Server Editions
Semantic Search Installation 1/3
Semantic Search Installation 2/3
Semantic Search Installation 3/3
-- Do not use the sp_attach_db stored procedure;
-- it is deprecated
CREATE DATABASE SemanticsDB
ON (FILENAME = N'C:\Program Files\Microsoft Semantic Language Database\semanticsDB.mdf')
LOG ON (FILENAME = N'C:\Program Files\Microsoft Semantic Language Database\semanticsdb_log.ldf')
FOR ATTACH;
GO
Attach Semantics DB
-- Register the Semantic Language Statistics database
-- (required only once)
EXEC sp_fulltext_semantic_register_language_statistics_db @dbname = N'SemanticsDB';
GO
Register Semantics DB
-- Verify that the registration succeeded
SELECT * FROM sys.fulltext_semantic_language_statistics_database;
GO
Verify Registration
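The attach, register, and verify steps above can be folded into one re-runnable check. This is a minimal sketch, assuming the statistics database was attached under the name SemanticsDB as in the earlier slide; it registers the database only when nothing is registered yet.

```sql
-- Sketch only: assumes the database was attached as SemanticsDB (name from the slides).

-- Returns 1 when the Full-Text and Semantic Search components are installed.
SELECT SERVERPROPERTY('IsFullTextInstalled') AS is_fulltext_installed;
GO

-- Register only if no semantic language statistics database is registered yet.
IF NOT EXISTS (SELECT 1 FROM sys.fulltext_semantic_language_statistics_database)
BEGIN
    EXEC sp_fulltext_semantic_register_language_statistics_db
        @dbname = N'SemanticsDB';
END;
GO

-- Show which database is registered and when it was registered.
SELECT DB_NAME(database_id) AS database_name,
       register_date,
       registered_by
FROM sys.fulltext_semantic_language_statistics_database;
GO
```

Guarding the registration with IF NOT EXISTS lets the same script be rerun without attempting a second registration.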
-- Check available languages for statistical semantic extraction
SELECT * FROM sys.fulltext_semantic_languages;
GO
Supported Languages
Demo
How to Enable On Table
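The demo slide carries no script, so here is a hedged sketch of what enabling semantic search on a table might look like. It assumes the dbo.Documents FileTable and the FullTextCatalog catalog name used in the later slides. PK_Documents is a hypothetical name for the unique index on path_locator, and LANGUAGE 1033 (English) is an assumption too.

```sql
-- Sketch only. PK_Documents is a hypothetical index name; look up the real one with:
--   SELECT name FROM sys.indexes
--   WHERE object_id = OBJECT_ID(N'dbo.Documents') AND is_primary_key = 1;

-- Catalog name matches the one queried in the later slides.
CREATE FULLTEXT CATALOG FullTextCatalog AS DEFAULT;
GO

-- STATISTICAL_SEMANTICS on a column creates the key phrase and
-- document similarity indexes in addition to the full-text index.
CREATE FULLTEXT INDEX ON dbo.Documents
(
    name        LANGUAGE 1033 STATISTICAL_SEMANTICS,
    file_stream TYPE COLUMN file_type LANGUAGE 1033 STATISTICAL_SEMANTICS
)
KEY INDEX PK_Documents          -- hypothetical: unique index on path_locator
ON FullTextCatalog
WITH (CHANGE_TRACKING = AUTO, STOPLIST = SYSTEM);
GO

-- Alternative: add semantic indexing to a column that is already full-text indexed.
-- ALTER FULLTEXT INDEX ON dbo.Documents
--     ALTER COLUMN file_stream ADD STATISTICAL_SEMANTICS;
```

The key index should be the one on path_locator, because the queries in the later slides join on D.path_locator = K.document_key.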
-- Reload filters (iFilter) and restart fulltext
-- host process if needed
EXEC sp_fulltext_service 'load_os_resources', 1;
EXEC sp_fulltext_service 'restart_all_fdhosts';
GO
Restart Processes
Full-Text Search
• Supports character-based and binary columns:
1. char
2. varchar
3. nchar
4. nvarchar
5. text
6. ntext
7. image
8. xml
9. varbinary(max)
10. FILESTREAM
Full-Text Queries Specifics
• Full-text queries are not case-sensitive: searching for
"Aluminum" or "aluminum" returns the same results
• Transact-SQL predicates:
– CONTAINS
– FREETEXT
• Transact-SQL functions:
– CONTAINSTABLE
– FREETEXTTABLE
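The difference between the predicates and the functions listed above is easiest to see side by side. This is a small sketch, assuming the dbo.Documents FileTable from the later slides with a full-text index keyed on path_locator; the search terms are illustrative only.

```sql
-- Predicate form: used in WHERE, returns matching rows without a rank.
SELECT D.name
FROM dbo.Documents AS D
WHERE FREETEXT(file_stream, N'aluminum');
GO

-- Function form: returns a table with [KEY] and [RANK] columns,
-- which is joined back to the base table on the full-text key.
SELECT D.name, FT.[RANK]
FROM CONTAINSTABLE(dbo.Documents, file_stream, N'"aluminum"') AS FT
INNER JOIN dbo.Documents AS D
    ON D.path_locator = FT.[KEY]   -- assumes the full-text key is path_locator
ORDER BY FT.[RANK] DESC;
GO
```

Use the predicates when a yes/no match is enough, and the table functions when results need to be ordered by relevance.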
SELECT * FROM sys.fulltext_document_types;
File types supported by iFilters
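To check a single file type rather than scan the whole list, the same catalog view can be filtered. In this sketch the .docx extension is only an example value.

```sql
-- Sketch: check whether an iFilter is registered for one extension.
-- N'.docx' is an example; substitute the extension you need.
SELECT document_type, path, manufacturer
FROM sys.fulltext_document_types
WHERE document_type = N'.docx';
GO
```

An empty result suggests that no filter is installed for that extension, so documents of that type would not be indexed by content.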
Three table-valued (rowset) functions:
• SemanticKeyPhraseTable - returns the statistically
significant phrases in each document
• SemanticSimilarityTable - returns documents or
rows that are similar or related, based on the key
phrases in each document
• SemanticSimilarityDetailsTable - returns the key
phrases that explain why two documents were
identified as similar
Semantic Search Functions
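As a first taste of these functions, SemanticKeyPhraseTable accepts an optional third argument that limits the output to one document. This is a sketch, assuming the dbo.Documents FileTable from the later slides; source.docx is a placeholder file name.

```sql
-- Sketch: top key phrases for a single document.
-- N'source.docx' is a placeholder file name.
DECLARE @DocID HIERARCHYID =
    (SELECT path_locator FROM dbo.Documents WHERE name = N'source.docx');

SELECT TOP (10) K.keyphrase, K.score
FROM SEMANTICKEYPHRASETABLE(dbo.Documents, file_stream, @DocID) AS K
ORDER BY K.score DESC;
GO
```

Passing the source key avoids scanning key phrases for the whole table when only one document is of interest.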
-- Select the Full-Text Catalog item count
SELECT FULLTEXTCATALOGPROPERTY('FullTextCatalog', 'ItemCount');
GO
Full-Text Catalog Items Count
-- check Population progress
SELECT fulltextcatalogproperty('FullTextCatalog', 'populatestatus');
GO
• 0 = Idle
• 1 = Full population in progress
• 2 = Paused
• 3 = Throttled
• 4 = Recovering
• 5 = Shutdown
• 6 = Incremental population in progress
• 7 = Building index
• 8 = Disk is full. Paused.
• 9 = Change tracking
Full-Text Catalog Population Status
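The numeric status is easier to read when translated inline. The sketch below maps the codes from the list above to their labels and assumes the catalog is named FullTextCatalog, as in the slides.

```sql
-- Sketch: translate the PopulateStatus code into a readable label.
-- FULLTEXTCATALOGPROPERTY returns NULL on error or for an invalid catalog name,
-- which falls through to the ELSE branch.
SELECT CASE FULLTEXTCATALOGPROPERTY('FullTextCatalog', 'PopulateStatus')
           WHEN 0 THEN 'Idle'
           WHEN 1 THEN 'Full population in progress'
           WHEN 2 THEN 'Paused'
           WHEN 3 THEN 'Throttled'
           WHEN 4 THEN 'Recovering'
           WHEN 5 THEN 'Shutdown'
           WHEN 6 THEN 'Incremental population in progress'
           WHEN 7 THEN 'Building index'
           WHEN 8 THEN 'Disk is full. Paused.'
           WHEN 9 THEN 'Change tracking'
           ELSE 'Unknown (catalog not found?)'
       END AS populate_status;
GO
```

Semantic queries may return partial results until the population finishes and the status returns to Idle.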
-- Get all key phrases in the entire corpus
SELECT
K.score, K.keyphrase, COUNT(D.stream_id) AS Occurrences
FROM SemanticKeyPhraseTable
(dbo.Documents, (name, file_stream)) AS K
INNER JOIN dbo.Documents AS D
ON D.path_locator = K.document_key
GROUP BY K.score, K.keyphrase
ORDER BY K.score DESC, K.keyphrase ASC;
GO
Get all Key Phrases
-- Find documents by key phrase ('sql' in the example below)
SELECT
K.score, K.keyphrase,
D.stream_id, D.name, D.file_type, D.cached_file_size,
D.creation_time, D.last_write_time, D.last_access_time
FROM dbo.Documents D
INNER JOIN semantickeyphrasetable (
dbo.Documents,
(name, file_stream)
) AS K
ON D.path_locator = K.document_key
WHERE K.keyphrase = N'sql'
ORDER BY K.score DESC;
Find Documents by Key phrase
-- find similar documents
DECLARE @Title NVARCHAR(1000) = N'Gurevich Vladimir.docx';
DECLARE @DocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @Title);
SELECT
@Title AS source_title, D.name AS matched_title,
D.stream_id, K.score
FROM SemanticSimilarityTable(dbo.Documents, *, @DocID) AS K
INNER JOIN dbo.Documents AS D
ON D.path_locator = K.matched_document_key
ORDER BY K.score DESC;
GO
Find Similar Documents
-- Find the key phrases that make two documents match
DECLARE @SourceTitle NVARCHAR(1000) = N'source.docx';
DECLARE @MatchedTitle NVARCHAR(1000) = N'target.docx';
DECLARE @SourceDocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @SourceTitle);
DECLARE @MatchedDocID HIERARCHYID =
(SELECT path_locator FROM dbo.Documents WHERE name = @MatchedTitle);
SELECT
K.keyphrase, K.score, @SourceTitle AS source_title, @MatchedTitle AS matched_title
FROM SemanticSimilarityDetailsTable(dbo.Documents, file_stream, @SourceDocID,
file_stream, @MatchedDocID) AS K
ORDER BY K.score DESC;
GO
Why 2 Documents Are Similar
• The generic NEAR operator is deprecated in SQL Server 2012
• The custom NEAR (custom proximity term) is a new operator, not an
extension of the existing NEAR operator
• It lets you specify two optional requirements that could not
be specified previously:
1. The maximum gap between the search terms
2. The order of the search terms - for example, "John" must appear
before "Smith"
• Stopwords (noise words) are included in the gap count.
CONTAINSTABLE(Documents, Content, 'NEAR((John, Smith), 4, TRUE)');
Full-Text Search NEAR Operator 1/2
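The CONTAINSTABLE call on this slide is only a fragment. A complete ranked query built around it might look like the sketch below, which swaps in the dbo.Documents FileTable and file_stream column from the other slides and assumes the full-text key is path_locator.

```sql
-- Sketch: "John" followed by "Smith" with at most 4 non-search terms between them.
SELECT D.name, FT.[RANK]
FROM CONTAINSTABLE(dbo.Documents, file_stream,
                   N'NEAR((John, Smith), 4, TRUE)') AS FT
INNER JOIN dbo.Documents AS D
    ON D.path_locator = FT.[KEY]   -- assumes the full-text key is path_locator
ORDER BY FT.[RANK] DESC;
GO

-- Same terms, any order, no explicit distance limit (MAX).
SELECT D.name
FROM dbo.Documents AS D
WHERE CONTAINS(file_stream, N'NEAR((John, Smith), MAX, FALSE)');
GO
```

The third argument controls ordering: TRUE requires the terms in the order listed, FALSE accepts either order.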
-- Get documents that contain the keywords "sql" and "server" near each other
SELECT D.name,
file_stream.GetFileNamespacePath() AS relative_path
FROM dbo.Documents D
WHERE CONTAINS(file_stream, 'NEAR(("sql", "server"), 1, FALSE)');
GO
Full-Text Search NEAR Operator 2/2
-- Get documents that contain the keywords "sql" and "server" near each other
SELECT D.name,
file_stream.GetFileNamespacePath() AS relative_path
FROM dbo.Documents D
WHERE CONTAINS
(file_stream, 'NEAR(("sql", "server"), 1, FALSE)');
GO
Full-Text Search in Documents
• The Full-Text Catalog depends on the language selected
Problems
