Intelligent Document Processing in Healthcare. Choosing the Right Solutions.

Choosing the Right Document
Processing Solution for
Healthcare Organizations
Presented by:
Iskandar Sitdikov, ML Solutions Architect @ Provectus
Stepan Pushkarev, CTO @ Provectus

Webinar Objectives
1. Provide an overview of the market for document processing solutions
2. Outline critical factors for choosing the right document processing solution
for your healthcare use case
3. Strategize on whether you should buy a ready-made solution
or build a custom solution of your own
4. Get qualified for the Provectus IDP Solution Discovery Program

Agenda
1. Introduction
2. Healthcare use cases
3. Document processing in 60 seconds
4. Solutions map, advantages, and problems
5. Evaluation

Introductions
Iskandar Sitdikov
ML Solutions Architect
Provectus
Stepan Pushkarev
Chief Technology Officer
Provectus

AI-first Consultancy & Solutions Provider
500 employees and
growing
Established in 2010
HQ in Palo Alto
Offices in North America,
LATAM, and Europe
Machine Learning DevOps
Big Data Analytics
We are obsessed with leveraging cloud, data, and AI to reimagine the way
businesses operate, compete, and deliver customer value

Our Clients
Innovative Tech Vendors
Seeking niche expertise to
differentiate and win the market
Midsize to Large Enterprises
Seeking to accelerate innovation,
achieve operational excellence

Healthcare Use Cases
Document processing 101

Use cases:
Clinical notes, medical records,
insurance medical claims, clinical
studies, medical imaging reports, lab
reports, and transfers. Administrative
overhead to process data from these
types of documents is huge.
Main benefits:
Operational speed and cost reduction. In
our practice, we see 2-8x cost reduction
compared to a fully manual process and
30%+ savings in comparison to legacy
OCR solutions.

Clinician notes
Claims
Transfer summaries
Medical imaging reports
Lab reports
Medical record
Clinical studies

The general goal is to locate the main entities in
a document (paragraphs, forms, tables, etc.)
and then recognize the text within them
(segmentation and OCR).
The two problems can be solved separately or
with end-to-end networks.
IDP / CV

Context search on data from OCR + segmentation
Forms and tables greatly impact overall performance. Data extraction from forms is largely solved (thanks to their
straightforward key-value structure). Tables are still a pain point for all data extractors. For unstructured text,
deep networks are currently the solution. For example, BERT is good at finding key-value (question / answer) pairs
in context.
IDP / Data Extraction
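
Because forms follow a key-value structure, extraction from OCR'd form lines can be sketched in a few lines. This is a minimal illustration only: the sample text, the `extract_key_values` helper, and the colon-delimited layout are assumptions made for the example, not part of any vendor API.

```python
import re

# Hypothetical OCR output for a simple form; real OCR text is noisier.
OCR_TEXT = """Patient Name: Jane Doe
Date of Birth: 1980-02-14
Policy Number : AB-12345
Free-text line without a delimiter
"""

# Assumed layout: "<key> : <value>" on a single line.
KEY_VALUE = re.compile(r"^\s*([^:]+?)\s*:\s*(.+?)\s*$")


def extract_key_values(text: str) -> dict[str, str]:
    """Return key-value pairs from lines that match the assumed layout."""
    pairs = {}
    for line in text.splitlines():
        match = KEY_VALUE.match(line)
        if match:
            pairs[match.group(1)] = match.group(2)
    return pairs


fields = extract_key_values(OCR_TEXT)
print(fields)
```

Lines without a delimiter are skipped, which is where tables and unstructured text would need a different approach.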

Evaluation of the document
processing model is an
ongoing task.
Results with a low-confidence
score and missing information
are forwarded to human experts.
Samples of successfully extracted
information are also forwarded to
human experts for evaluation.
IDP / Evaluation and Monitoring
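
The routing described on this slide can be sketched as follows. The threshold, the audit sample rate, the required field names, and the `Extraction` type are all hypothetical values chosen for illustration; a real deployment would tune them per use case.

```python
import random
from dataclasses import dataclass


@dataclass
class Extraction:
    doc_id: str
    fields: dict        # extracted field name -> value (None if missing)
    confidence: float   # model confidence in [0, 1]


# Hypothetical settings; tune them for your own documents.
CONFIDENCE_THRESHOLD = 0.85
AUDIT_SAMPLE_RATE = 0.05
REQUIRED_FIELDS = ("patient_name", "claim_id")


def needs_review(result: Extraction) -> bool:
    """Low confidence or missing required information goes to a human."""
    missing = any(result.fields.get(name) is None for name in REQUIRED_FIELDS)
    return missing or result.confidence < CONFIDENCE_THRESHOLD


def route(results, rng=None):
    """Split results into review, audit-sample, and auto-accepted queues."""
    rng = rng or random.Random()
    queues = {"review": [], "audit": [], "accepted": []}
    for result in results:
        if needs_review(result):
            queues["review"].append(result.doc_id)
        elif rng.random() < AUDIT_SAMPLE_RATE:
            # A sample of successful extractions is also checked by experts.
            queues["audit"].append(result.doc_id)
        else:
            queues["accepted"].append(result.doc_id)
    return queues
```

The audit queue is what keeps evaluation ongoing: it provides expert-verified labels for results the model was confident about.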

Data lake + Ontology specifications
Fast Healthcare Interoperability Resources (FHIR)
is a standard describing data formats and
elements and an application programming
interface for exchanging electronic health records.
The standard was created by the Health Level
Seven International healthcare standards
organization.
IDP / Storage
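
As a rough illustration of what storing extracted data in FHIR form might look like, the sketch below builds a minimal Observation resource as a plain dictionary. The `lab_result_to_fhir` helper, the patient identifier, and the sample values are hypothetical, and the output is not validated against the FHIR schema.

```python
import json


def lab_result_to_fhir(patient_id: str, loinc_code: str, display: str,
                       value: float, unit: str) -> dict:
    """Build a minimal FHIR Observation resource as a plain dict."""
    return {
        "resourceType": "Observation",
        "status": "final",
        "code": {
            "coding": [{
                "system": "http://loinc.org",
                "code": loinc_code,
                "display": display,
            }]
        },
        "subject": {"reference": f"Patient/{patient_id}"},
        "valueQuantity": {
            "value": value,
            "unit": unit,
            "system": "http://unitsofmeasure.org",
            "code": unit,
        },
    }


# Hypothetical values extracted from a lab report.
observation = lab_result_to_fhir(
    "example-123", "718-7", "Hemoglobin [Mass/volume] in Blood", 13.5, "g/dL"
)
print(json.dumps(observation, indent=2))
```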

Storage
Hospitals
Providers
Pharmaceutical
companies
Patients
Labs
Health plans

Automation encapsulates all processes mentioned above
and unites them into one single product, featuring:
● Document capture
● Model lifecycle
○ Labeling
○ (Re)Training
○ Evaluation
○ Monitoring
● Human-in-the-loop
● Integrations
● System monitoring
IDP / Automation

IDP is more than just OCR. To solve the problem in-house, you need
to take care of data capture, data ingestion, preprocessing, OCR, data
extraction, evaluation, and further integrations with destination systems.
Bottlenecks: Tables and unstructured text
IDP / Takeaways

Solutions Landscape
Market Overview

Documents are everywhere... and solutions for document processing are everywhere, too!
Competitive Landscape

Major technology platforms offer general-
purpose technology components for
document processing, such as:
● Amazon Textract + Comprehend
● Google Document AI
● Microsoft Azure Form Recognizer
Solutions: Cloud Vendors
Pros:
● Cloud infrastructure and integration
● Long lifespan and support
● Constant development
Cons:
● General purpose, i.e., they require
additional work to extract the necessary
information and integrate with current
workflows

These are emerging use case-focused vendors
that offer solutions using AI-native platforms to
tackle the most demanding automation
challenges. They can handle more complex
documents with a greater variability. As a result,
they often deliver a better business impact than
obsolete technologies. Since they are free from
legacy technical debt, it is easier for them to
build next-gen, future-oriented solutions.
Solutions: Startups
Pros:
● Modern tech
● Constant development
● More focused applications
● Support — For a new independent player, support is
one of the highest priorities to gain customer loyalty
Cons:
● Only a few startups in this market can survive
competition with big vendors
● Challenging to customize
● May not align with your cloud strategy
● Support — On the other hand, new startups might
struggle with support

Legacy vendors that typically build IDP
solutions on top of legacy platforms.
Niche vendors that focus on limited
types of documents and use cases. You
might find hidden gems here!
Vendors that restructure your document
workflow by introducing standard types of
documents, which are easy to
process.
Solutions: Other Vendors
Pros:
● Wide variety of integrations
● Niche use cases
● Large portfolio of clients
Cons:
● In some cases, they rely on outdated,
less performant technologies
● May require restructuring your document flow

System Integrators may offer IDP
as part of their portfolio of
solutions. Their IDP offering may
be a solution from another IDP
vendor or developed in-house.
Solutions: System Integrators

Vendors Evaluation Methodology

What to Choose?
Now, you have all the information about
possible go-to solutions in your market
segment. What’s next?
You need to compare every solution
fairly to choose the one that best fits
your use case.
Deep evaluation is key to making the right
decision.

Data
● EDA (exploratory data analysis) — Knowing your
data is the key to success
● Sample data based on EDA
● Use this data as the evaluation dataset for
measuring performance of solutions on the
market / in the segment
Composite Index
● F1, Accuracy, Recall, etc.
● Robustness
● Key-value extraction
● Table data
● Language, character recognition, spelling,
handwritten text
Provectus Evaluation Methodology
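
One way to combine per-dimension scores into a single index is a weighted average. The sketch below is an assumption about how such an index could be computed, not the actual Provectus formula; the dimension names, weights, and sample scores are hypothetical.

```python
def f1_score(expected: dict, predicted: dict) -> float:
    """F1 over exact-match key-value pairs."""
    true_positives = sum(1 for k, v in predicted.items() if expected.get(k) == v)
    if true_positives == 0:
        return 0.0
    precision = true_positives / len(predicted)
    recall = true_positives / len(expected)
    return 2 * precision * recall / (precision + recall)


def composite_index(scores: dict, weights: dict) -> float:
    """Weighted average of per-dimension scores, each in [0, 1]."""
    total_weight = sum(weights.values())
    return sum(scores[name] * weights[name] for name in weights) / total_weight


# Hypothetical dimensions and weights.
weights = {"key_value": 0.4, "tables": 0.3, "text": 0.2, "robustness": 0.1}

# Ground truth vs. one provider's output on a sample document.
expected = {"patient_name": "Jane Doe", "claim_id": "A1", "dob": "1980-02-14"}
predicted = {"patient_name": "Jane Doe", "claim_id": "A7"}

scores = {
    "key_value": f1_score(expected, predicted),
    "tables": 0.5,       # placeholder scores for the other dimensions
    "text": 0.9,
    "robustness": 0.8,
}
index = composite_index(scores, weights)
print(f"Composite index: {index:.2f}")
```

Running the same computation for every provider on the same evaluation dataset gives directly comparable scores.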

Evaluation / Composite Index
Name        Score
Provider 1  0.64
Provider 2  0.81
Provider 3  0.78
Composite Index
Dimensions

Evaluation / Text Index
Text index

Evaluation / Robustness Index
Spacing index
Noise index

TCO and Case Study: Client Under NDA
General TCO structure:
● Infrastructure (data pipelines, storage, control panel)
● CV, NLP, Human-in-the-loop
● R&D costs (if building in house)
● Support
TCO targets for end-to-end solution:
~20-30 cents per document for simple use cases and 50+
cents for more complex documents
Result:
The cost of processing one document was reduced from 24
to 11 cents, since the right OCR/CV vendor was selected (it
saved almost 10 cents per document). Also, serverless
architecture was leveraged to reduce infrastructure costs.
OCR/CV solutions performance vs. cost:
For a given use case, the most expensive
solution delivered the worst result. The
second-best result came from the vendor
with the second-cheapest solution.
Performance vs. price
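
The case study figures can be turned into a quick savings estimate. The per-document costs come from the slide above; the monthly document volume is a hypothetical number used only to illustrate scale.

```python
# Per-document costs reported in the case study (in cents).
COST_BEFORE_CENTS = 24
COST_AFTER_CENTS = 11

# Hypothetical monthly volume, used only to illustrate scale.
DOCS_PER_MONTH = 100_000

saving_per_doc = COST_BEFORE_CENTS - COST_AFTER_CENTS
reduction_pct = 100 * saving_per_doc / COST_BEFORE_CENTS
monthly_saving_usd = saving_per_doc * DOCS_PER_MONTH / 100

print(f"Saving per document: {saving_per_doc} cents ({reduction_pct:.0f}%)")
print(f"Monthly saving at {DOCS_PER_MONTH:,} docs: ${monthly_saving_usd:,.0f}")
```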

Buy vs. Customize vs. Build
Cloud OCR + extraction APIs
vs. Custom model
With a high volume of documents, it's
worth investing in a custom model built
in-house to reduce the cost of extra services (e.g.,
form and table APIs) in the long run.
On average, the break-even point for a custom IDP
extraction model vs. APIs comes around month 8
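
A break-even estimate of this kind can be sketched as a one-off build cost divided by the monthly saving. The `break_even_month` helper and all dollar figures below are hypothetical; they are chosen so the example lands on month 8 and do not come from the webinar.

```python
import math


def break_even_month(build_cost: float, custom_monthly: float,
                     api_monthly: float):
    """First month in which cumulative custom-model cost is no higher
    than cumulative API cost, or None if that never happens."""
    monthly_saving = api_monthly - custom_monthly
    if monthly_saving <= 0:
        return None  # the custom model never pays off
    return math.ceil(build_cost / monthly_saving)


# Hypothetical figures in USD.
month = break_even_month(
    build_cost=80_000,      # one-off R&D cost of the custom model
    custom_monthly=5_000,   # running cost of the custom model
    api_monthly=15_000,     # cost of form and table APIs at the same volume
)
print(f"Break-even in month {month}")
```

Since API costs scale with volume, higher document volumes increase the monthly saving and pull the break-even point earlier.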

Takeaways
1. Ecosystem matters: Data integration with built-in industry-specific connectors, data
pipelines, OCR, NLP, security, storage, and a human-in-the-loop workflow. All these
elements should be integrated with each other for optimal performance.
2. Use an unbiased benchmarking framework to evaluate the real performance of different
providers, based on your use case and datasets.
3. Work with Provectus to reduce your document processing costs
a. By 2-8x compared to manual workflows
b. By 30%+ compared to legacy OCR solutions
c. By 10%+ compared to modern cloud solutions

Getting Started:
Unbiased Evaluation for IDP
by Provectus

Commitments & Deliverables
Helping businesses choose the right document processing solution for their healthcare
use cases. A fully funded engagement for qualified customers.
IDP Solution Discovery Program. Unbiased!
Schedule a 30 min. pre-assessment session here:
IDP Solution Discovery Program
You provide:
1. Business use cases overview
2. Access to datasets
3. Commitment to support
the engagement
We deliver:
1. Solutions evaluation report
based on your unique data
2. Solution architecture
3. TCO estimate

125 University Avenue
Suite 295, Palo Alto
California, 94301
provectus.com
Questions, details?
We would be happy to answer!
