1. © Concept Searching 2018
Using Metadata-Driven Taxonomies
to Solve Business Problems
www.conceptsearching.com
marketing@conceptsearching.com
Twitter @conceptsearch
Michael Paye
Chief Technology Officer
Concept Searching
mikep@conceptsearching.com
2. © Concept Searching 2018
Michael Paye, Chief Technology Officer at Concept Searching,
has been the driving force behind many of the company's recent
innovations, including the SharePoint Add-in and hybrid search
products. He has a wealth of experience across the Microsoft
platform and related technologies, and oversees all product
development.
3. © Concept Searching 2018
Agenda
• Who we are
• What problems do we solve?
• Demo – Auto-classification
• Demo – Taxonomy creation
4. © Concept Searching 2018
• Company founded in 2002
• Product launched in 2003
• Focus on management of structured and unstructured information
• Profitable, debt free
• Technology Platform
• Delivered as a web service
• Automatic concept identification, content tagging, auto-classification,
taxonomy management
• Only statistical vendor that can extract conceptual metadata
• 9 years KMWorld ‘100 Companies that Matter in Knowledge Management’
• 9 years KMWorld ‘Trend Setting Product’
• Authority to Operate enterprise wide US Air Force, NETCON US Army,
and Canadian SLSA
• Client base: Fortune 500/1000 organizations in Healthcare,
Financial Services, Manufacturing, Energy, Professional Services,
Pharmaceutical, Public Sector and DoD
• Microsoft Gold Certification in Application Development
• Member of SharePoint PAC and TAP programs
• Suitable for all versions of SharePoint on-premises and SharePoint Online,
including the latest vNext dedicated platform and the government cloud
The Global Leader in
Managed Metadata Solutions
5. © Concept Searching 2018
“Over 80% of business decisions are made using unstructured data.”
IDC
6. © Concept Searching 2018
• 91% use manual metadata tagging
• Free-for-all mode
• Drop down lists
• 15% maintain a homegrown manual taxonomy
• 77% have no rhyme or reason for managing content
Organizational Information Chaos
• Unstructured data is growing at the rate of 62% per year (IDG)
• By 2022, 93% of all data in the digital universe will be unstructured (IDG)
• Data volume is set to grow 800% over the next five years, and 80% of it
will reside as unstructured data (Gartner)
What’s the Problem?
7. © Concept Searching 2018
“74% of organizations continue to depend on individuals to manually
comply with legal, regulatory, and record management requirements.
Given the projected growth and the inability of employees to
manually manage information, organizations need to start
automating the tasks associated with classifying, managing, and
disposing of information assets.”
Council for Information Auto-Classification (CIAC)
“It is simply not realistic to expect broad sets of employees to
navigate extensive classification options while referring to a records
schedule that may weigh in at more than 100 pages.”
Forrester Research/ARMA International Survey
What’s the Real Problem?
8. © Concept Searching 2018
What are the typical challenges?
Electronic information is growing at a rate of 30% to 60% per year –
electronic records typically constitute 90% of an organization’s records
• Users
• Training – rarely done, and when it is, a minimalist approach is taken
• Policy enforcement
• Reality – biggest stumbling block is often the end user
• Impact
• Metadata can be subjective, erroneous, or non-existent
• Impacts productivity
• Increases organizational risk
• End user classification is 20% accurate, versus automated classification,
which is 80%-90% accurate (sometimes higher) if tuned and managed
Typical Enforcement Challenges for Business Users
9. © Concept Searching 2018
The Core Technology and Why We Are Different
It’s all about metadata
• Unique IP: compound term processing
• Identifies multi-word terms that form
a complex entity
• Ambiguity inherent in single words
is eliminated
• Works in any language, regardless of
grammar or linguistic style
• Generates non-subjective metadata
based on an understanding of
conceptual meaning
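To make the idea of multi-word terms concrete, here is a minimal Python sketch. It is not Concept Searching's compound term processing, which is proprietary and not described in these slides. It is a generic, well-known statistical technique (pointwise mutual information over adjacent word pairs), and the function names, thresholds, and sample sentences are all assumptions made for illustration.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def compound_terms(docs, min_count=2, min_pmi=1.0):
    """Return two-word terms whose words co-occur more often than chance.

    Hypothetical helper: scores each adjacent word pair with pointwise
    mutual information (PMI) and keeps pairs seen at least `min_count`
    times with a PMI of at least `min_pmi`.
    """
    unigrams, bigrams = Counter(), Counter()
    for doc in docs:
        tokens = tokenize(doc)
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    total_uni = sum(unigrams.values())
    total_bi = sum(bigrams.values())
    scored = {}
    for (first, second), count in bigrams.items():
        if count < min_count:
            continue
        p_pair = count / total_bi
        p_first = unigrams[first] / total_uni
        p_second = unigrams[second] / total_uni
        pmi = math.log2(p_pair / (p_first * p_second))
        if pmi >= min_pmi:
            scored[f"{first} {second}"] = round(pmi, 2)
    return scored


# Illustrative sample text, not taken from the presentation.
sample_docs = [
    "The records management policy covers records management training.",
    "Records management is part of information governance.",
    "Information governance needs management support.",
]
terms = compound_terms(sample_docs)
```

On this sample, the word "management" on its own is ambiguous, while the pairs "records management" and "information governance" stand out as terms. Because the method only counts tokens, it does not depend on the grammar of any particular language.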
10. © Concept Searching 2018
“The metadata infrastructure provides the critical glue that binds the
information infrastructure to the underlying IT infrastructure.
Sound information governance practices would take advantage of the
metadata infrastructure to ensure that content and data are managed
consistently and adhere to written policies, across on-premise and
cloud based environments.”
IDC Digital Universe Study
The Advantages
• Ability to develop a single repository of organizationally relevant metadata,
to be made available to any application that requires the use of metadata
• Elimination of costs and errors associated with end user tagging
• Normalization of content across functional and geographic boundaries,
to remove ambiguity in vocabulary
• Metadata managed and changed in one place
• Ability to rapidly implement workflows, to apply policy consistently across
diverse repositories and applications
• Provide flexibility to quickly make changes to the repository for regulatory
compliance, where changes are immediately available for use by applications
Metadata
11. © Concept Searching 2018
Auto-classification Systems – What Do They Do?
Document
Preparation
• Split into language
blocks (paragraphs,
headings), formatting,
layout
Parsing
• Entity extraction
• NLP – parts of
speech, phrases
• Terms, variants
Weighting
• Frequency
• Location in text,
phrase
• Proximity
• Combination
• Format of text
Classification
• If threshold reached
• Can influence search
results
This is where rules
versus statistics
come into play…
Not all classification solutions are created equal
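The four stages on this slide (preparation, parsing, weighting, classification) can be sketched in a few lines of Python. This is a deliberately simplified illustration under stated assumptions, not the product's implementation: it handles single-word clues only, uses just two of the weighting signals listed (frequency and location in text), and the names `score_document`, `classify`, `title_boost`, and the sample taxonomy are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Parsing stage (simplified): lowercase alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def score_document(title, body, clues, title_boost=3.0):
    """Weighting stage: frequency of each clue, boosted when it is in the title.

    `clues` maps a single-word clue to its weight (an assumed format).
    """
    title_counts = Counter(tokenize(title))
    body_counts = Counter(tokenize(body))
    score = 0.0
    for term, weight in clues.items():
        score += weight * (title_boost * title_counts[term] + body_counts[term])
    return score


def classify(title, body, taxonomy, threshold=5.0):
    """Classification stage: keep every node whose score reaches the threshold."""
    results = {}
    for node, clues in taxonomy.items():
        score = score_document(title, body, clues)
        if score >= threshold:
            results[node] = score
    return results


# Hypothetical two-node taxonomy and document.
taxonomy = {
    "Contracts": {"contract": 2.0, "supplier": 1.0},
    "HR": {"salary": 2.0, "employee": 1.0},
}
result = classify(
    "Supplier contract renewal",
    "The contract with the supplier expires in June. "
    "The employee handbook is unaffected.",
    taxonomy,
)
```

Here the document reaches the threshold for "Contracts" but not for "HR", because a single mention of "employee" in the body carries too little weight. Raising or lowering `threshold` is the knob that trades missed documents against false matches.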
12. © Concept Searching 2018
Statistical
• Avoids the inherent problems in linguistic
solutions
• By tokenizing text and then working on a
mathematical representation, most of the
difference between languages is eliminated
• Any unique vocabulary used by different
vertical sectors can be accommodated
automatically
• Even differences in grammatical style (for
example, a news article versus a white paper)
do not greatly affect accuracy when using
statistical techniques
• Able to balance precision and recall
• Performance can be easily modified through a
taxonomy
Statistical Auto-classification
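The point about balancing precision and recall can be illustrated with a small Python sketch. The scores, document names, and the `precision_recall` helper are assumptions for illustration; precision and recall themselves are the standard definitions (correct matches divided by everything returned, and correct matches divided by everything that should have been returned).

```python
def precision_recall(scores, relevant, threshold):
    """Compute precision and recall for one category at a given threshold.

    scores:   mapping of document id -> classification score (assumed 0..1)
    relevant: set of document ids that truly belong to the category
    """
    predicted = {doc for doc, score in scores.items() if score >= threshold}
    true_positives = len(predicted & relevant)
    precision = true_positives / len(predicted) if predicted else 1.0
    recall = true_positives / len(relevant) if relevant else 1.0
    return precision, recall


# Hypothetical scores for four documents; "c" does not belong to the category.
scores = {"a": 0.9, "b": 0.7, "c": 0.4, "d": 0.2}
relevant = {"a", "b", "d"}

strict = precision_recall(scores, relevant, threshold=0.5)
loose = precision_recall(scores, relevant, threshold=0.1)
```

With the strict threshold every returned document is correct but one relevant document is missed; with the loose threshold nothing is missed but one wrong document is returned. Tuning a classifier means choosing where on that trade-off each category should sit.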
14. © Concept Searching 2018
“There is a debilitating disconnect between the proliferation of electronic
information and the constant need to quickly and accurately find all of the
information and expertise that is essential for work every day. From top to
bottom, enterprises have failed to take seriously the high cost of being
grossly inadequate at finding information, data, documents, experts.
Instead they have settled for low performance, low-return techniques to…
sort of handle Search.”
Julie Hunt, Search Consultant
15. © Concept Searching 2018
Taxonomies
• Hierarchical representation of entities of
interest in an organization
• Primary tool to provide structure to
unstructured content
• Front end and/or back end functionality
• Actualized through metadata
• Business taxonomies
• Tend to be less rigid and constrained
• Usability – minimize clicks
• Content driven
• Allows flexibility and redundancy
• Provides a single methodology for
classification (categorization)
• Allows for entity extraction
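As a rough picture of what "hierarchical representation" and "actualized through metadata" mean in practice, the sketch below models a taxonomy as a tree and derives a path for each node that could be stored as a metadata value. The `TaxonomyNode` class, its fields, and the sample nodes are hypothetical and do not reflect the data model of any particular product.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonomyNode:
    """One node in a taxonomy tree (illustrative structure only)."""
    name: str
    clues: dict = field(default_factory=dict)     # clue term -> weight
    children: list = field(default_factory=list)  # child TaxonomyNode objects

    def add(self, child):
        """Attach a child node and return it, so calls can be chained."""
        self.children.append(child)
        return child

    def paths(self, prefix=()):
        """Yield the full path of this node and of every descendant."""
        here = prefix + (self.name,)
        yield "/".join(here)
        for child in self.children:
            yield from child.paths(here)


# Hypothetical business taxonomy.
root = TaxonomyNode("Corporate")
legal = root.add(TaxonomyNode("Legal"))
legal.add(TaxonomyNode("Contracts", clues={"supplier agreement": 2.0}))
root.add(TaxonomyNode("HR"))

all_paths = list(root.paths())
```

A document classified to the "Contracts" node would carry the value `Corporate/Legal/Contracts` as metadata, which any downstream application (search, records management, migration) can then use without knowing how the classification was made.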
16. © Concept Searching 2018
One of a Kind
Unique to conceptTaxonomyManager
• Compound term processing technology that
identifies ‘concepts in context’
• Automatic intelligent metadata generation
as content is created or ingested
• Automatic taxonomy node clue suggestion
• Dynamic screen updating to immediately
see the impact of changes in the taxonomy
• Document movement feedback to see
cause and effect of changes without reindexing
17. © Concept Searching 2018
Why Concept Searching?
• Delivers ‘concept’ based searching – no training required
• Automatic, real-time identification and protection of privacy or
organizationally-defined confidential information
• Content optimization – cleanup of file shares or any content repository,
eliminate ROT, identify undeclared records and potential security breaches
• Intelligent migration – performed after content optimization, migrates
content to a defined structure (taxonomy)
• Automatic identification and routing of documents of record
• Mergers and acquisitions
• Secure collaboration at the content level
• Text analytics
• Knowledge management
• Research
• Metadata available to any application that uses it to process information
18. © Concept Searching 2018
Setting up a taxonomy node, suggesting clues, real-time feedback, weighting
19. © Concept Searching 2018
Next Webinar in Metadata-Driven World Series
Going Meta in SharePoint – Tricks of the Trade
Wednesday, February 14, 2018
Join Robert Piddocke, our Vice President of Channel and Business
Development, and author of SharePoint Search books, to learn how
‘going meta’ helps transcend typical metadata use, and how to realize
the potential of intelligent content in context.
Robert discusses SharePoint functionality and what needs to be put in
place to deploy a metadata-driven enterprise and build a framework for
the future, and how metadata can be used to automate and drive
business processes, and proactively manage content.
Read more and register in the Upcoming Webinars area of our website.
20. © Concept Searching 2018
Thank You
www.conceptsearching.com
marketing@conceptsearching.com
Twitter @conceptsearch
Michael Paye
Chief Technology Officer
Concept Searching
mikep@conceptsearching.com
22. © Concept Searching 2018
Solution Components *Basic* What business problem are you trying to solve?
• Semantic metadata generation, automated classification, taxonomy management
• Stay away from lengthy implementations, required use of vendor consultants,
stand-alone applications, use of new languages
• If it’s based on linguistics, can you easily integrate your corpus of content?
• Look for a flexible solution that addresses more than ‘just’ search and is extendable
to other applications, such as records management, privacy/security, migration,
secure collaboration, eDiscovery, text analytics
• Don’t be swayed by bells and whistles – this is a framework, not an application
• Accomplished through the development of an enterprise metadata repository
• Workflow rules to enforce policies across the enterprise
• Evaluate solution and richness of function as well as ease of use
• Who will maintain? IT or business users?
• How easy for subject-matter experts to contribute?
• Identify how the taxonomy tool integrates with other applications and how it
reduces time and effort, yet delivers high quality
• Calculate the ROI for each application
Evaluating Solutions
24. © Concept Searching 2018
A Few Questions to Ask to Get You Started
• How often should a repository be indexed for new
content?
• Does the system need to perform in real time?
• Should old content be re-evaluated to determine whether it
belongs in a different category?
• How are classification errors resolved?
• Should the user have the ability to override the
classification assignment?
• Who should manage the system – IT or business?
Or both?
• How long should deployment and ongoing management
take?
• Can end user involvement be eliminated?
• How does the system handle vocabulary and/or language
ambiguities?
26. © Concept Searching 2018
Clarification of the Following Slides
• Some of the following slides are based on a SharePoint environment, but would
also be applicable in a non-SharePoint environment. The information is a few
years old, and the figures calculated by the authors should be restated in
today’s currency. The slides still have informational value; they are meant to
point you in the right direction and are not all-inclusive.
• For example, assuming the statistics are accurate and 80% of breaches are caused
internally, what would it cost to rectify a breach after it occurred? Consider
different scenarios: Was the breach only internal? Did it impact customers? Did
it impact your market brand? Will it impact stock value (yes!)? How would you fix
it with a tool, and how without one? Was it introduced through a partner (think
Target breach)? Then assume that you had a tool that would identify potential
breaches in real time, before they occur.
• Typically, the ROI will be significant. Not to worry, it’s probably right on the
money.
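The breach scenario above can be turned into a back-of-the-envelope calculation. The Python sketch below uses the standard ROI formula, (benefit minus cost) divided by cost, together with a simple expected-loss estimate. Every number in it is a placeholder chosen for illustration, not a statistic from this presentation or from the referenced white paper, and the function names are hypothetical.

```python
def roi(total_benefit, total_cost):
    """Return ROI as a fraction: (benefit - cost) / cost."""
    if total_cost <= 0:
        raise ValueError("total_cost must be positive")
    return (total_benefit - total_cost) / total_cost


def expected_breach_savings(breach_cost, annual_probability, risk_reduction):
    """Estimate the yearly loss avoided by a tool that lowers breach risk.

    breach_cost:        estimated cost of rectifying one breach
    annual_probability: estimated chance of a breach in a year (0..1)
    risk_reduction:     share of that risk the tool is assumed to remove (0..1)
    """
    return breach_cost * annual_probability * risk_reduction


# Placeholder inputs: replace with your own estimates for each scenario.
savings = expected_breach_savings(
    breach_cost=1_000_000, annual_probability=0.1, risk_reduction=0.5
)
breach_roi = roi(total_benefit=savings, total_cost=40_000)
```

Running the same calculation for each scenario (internal only, customer impact, brand impact, partner-introduced) and for each application gives a per-application ROI that can then be compared or summed.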
27. © Concept Searching 2018
The primary research collected in
this referenced white paper
illustrated:
• There is a broad range of benefits
spanning the categories of IT,
process, and business impact, all
of which have moderate to high
levels of business impact
• The leading ROI value drivers are
related to processes and effective
decision making
• The value of process and IT
drivers are manifested via their
business impact
Calculating ROI
• IT benefits
• Process improvements
• Business benefits
• End user benefits
ROI impact of key value drivers
Average of 58 responses; 10 = highest impact
• Shorter process cycle time – 8.02
• Simplified data entry/access – 8.00
• Automation of manual activities – 7.97
• Better decision making – 7.93
• Reduction in errors/rework – 7.29
• Faster decision making – 7.28
• Lower cost from self-service – 6.95
• Reduction in phone/email/mail – 6.71
• Simplified reporting – 6.67
• Productivity from improved search – 6.67
• Lower cost for extending LOB systems – 6.45
• Centralized access control – 6.17
• Reduced coding to deploy solutions – 6.12
• Less training for LOB systems – 6.10
Pique Solutions – Connected Value: The ROI Benefits of Business Critical SharePoint
ROI Objectives
28. © Concept Searching 2018
• Three areas
• Business impact
• IT impact
• Process impact
• Most enterprises will see
highest ROI from process
impact
• Addition – End user costs
Real-World Business Drivers
30. © Concept Searching 2018
ROI – Real-World Savings
Pique Solutions
The Business Solutions
• Search
• Records Management
• Content Optimization
• Intelligent Migration
• Data Security/Confidentiality
• eDiscovery/Litigation Support,
FOIA
• Information Governance
• Text Analytics
• GDPR
• Business Social Networking
• Secure Collaboration
• Content Lifecycle Management
• Metadata Management
• Research
• Knowledge Management
• Mergers and Acquisitions
• Your Challenge…
31. © Concept Searching 2018
Need Help?
If you need help calculating ROI or would like additional statistics,
please contact us at marketing@conceptsearching.com