Choosing the Right Document
Processing Solution for
Healthcare Organizations
Presented by:
Iskandar Sitdikov, ML Solutions Architect @ Provectus
Stepan Pushkarev, CTO @ Provectus
Andy Schuetz, PhD, Sr. Solutions Architect @ AWS
Webinar Objectives
1. Provide an overview of the market for document processing solutions
2. Outline critical factors for choosing the right document processing solution
for your healthcare use case
3. Strategize on whether you should look for a ready-made solution to purchase,
or to build a custom solution of your own
4. Get qualified for the Provectus IDP Solution Discovery Program
1. Introduction
2. Healthcare use cases
3. Document processing in 60 seconds
4. Solutions map, advantages and problems
5. Evaluation
Agenda
Introductions
Iskandar Sitdikov
ML Solutions Architect,
Provectus
Andy Schuetz, PhD
Sr. Solutions Architect,
Healthcare and Life Science,
AWS
Stepan Pushkarev
Chief Technology
Officer, Provectus
AI-first Consultancy & Solutions Provider
500 employees and
growing
Established in 2010
HQ in Palo Alto
Offices in North
America, LATAM, and
Europe
Machine Learning DevOps
Big Data Analytics
We are obsessed with leveraging cloud, data, and AI to reimagine the way
businesses operate, compete, and deliver customer value
Our Clients
Innovative Tech Vendors (ISV & DNB)
Seeking niche expertise to differentiate
and win the market
Midsize to Large Enterprises
Seeking to accelerate innovation, achieve
operational excellence
Healthcare Use Cases
Document processing 101
Use cases: Clinical notes, medical
records, insurance medical claims,
clinical studies, medical imaging
reports, lab reports, and transfers.
Administrative overhead to process
data from these types of documents is
huge.
Main benefits: Operational speed and
cost reduction. In our practice, we see
2-8x+ reduction in costs compared to a
fully manual process and 30%+ savings
in comparison to legacy OCR solutions.
Healthcare Use Cases
The general goal is to locate the main entities
in the document (paragraphs, forms, tables,
etc.) and then correctly recognize the
text written in them (segmentation and
OCR).
Both problems can be solved
separately or with end-to-end networks.
IDP / CV
Context search on data from OCR + segmentation
Forms and tables greatly impact overall performance. Data extraction from forms is a solved problem (thanks to their
straightforward key-value structure). Tables are still a pain point for all data extractors. For unstructured text,
deep networks are the current solution. For example, BERT is good at finding key-value (question / answer) pairs
in context.
IDP / Data Extraction
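To illustrate why the key-value structure of forms makes extraction comparatively easy, here is a minimal sketch in Python. It assumes the OCR step has already produced one string per detected line; the pattern, the function name `extract_key_values`, and the sample lines are all hypothetical and not taken from any vendor's API.

```python
import re

# Assumption: form fields appear as "Key: value" on a single OCR line.
KEY_VALUE_PATTERN = re.compile(r"^\s*([^:]{1,40}?)\s*:\s*(.+?)\s*$")


def extract_key_values(ocr_lines):
    """Return a dict of normalized key -> value for 'Key: value' lines."""
    pairs = {}
    for line in ocr_lines:
        match = KEY_VALUE_PATTERN.match(line)
        if match:
            key, value = match.groups()
            pairs[key.strip().lower()] = value
    return pairs


# Hypothetical OCR output for a simple intake form.
sample_lines = [
    "Patient Name: Jane Doe",
    "Date of Birth: 01/02/1960",
    "Free text that is not a form field",
    "Policy No : ABC-123",
]
fields = extract_key_values(sample_lines)
```

A rule this simple breaks down on tables and free text, which is where the learned models mentioned above come in.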
Evaluating the document
processing model is an
ongoing task.
Results with a low-confidence
score and missing information
are forwarded to human experts.
Samples of successfully extracted
information are also forwarded to
human experts for evaluation.
IDP / Evaluation and Monitoring
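The routing rule on this slide can be sketched as follows. This is an illustration under assumptions: the threshold of 0.85, the audit rate of 5%, the record layout, and the name `route_for_review` are hypothetical choices, not values from the webinar.

```python
import random


def route_for_review(results, threshold=0.85, audit_rate=0.05, seed=0):
    """Split extraction results into human-review and auto-accepted sets.

    Low-confidence results and results with missing fields always go to
    review; a random sample of the remaining results is added for auditing.
    """
    rng = random.Random(seed)  # seeded so the audit sample is reproducible
    review, accepted = [], []
    for result in results:
        missing = any(value is None for value in result["fields"].values())
        if missing or result["confidence"] < threshold:
            review.append((result["doc_id"], "low_confidence_or_missing"))
        elif rng.random() < audit_rate:
            review.append((result["doc_id"], "quality_audit"))
        else:
            accepted.append(result["doc_id"])
    return review, accepted


# Hypothetical extraction results.
sample = [
    {"doc_id": "a", "confidence": 0.97, "fields": {"name": "Jane Doe"}},
    {"doc_id": "b", "confidence": 0.42, "fields": {"name": "J0hn"}},
    {"doc_id": "c", "confidence": 0.99, "fields": {"name": None}},
]
review, accepted = route_for_review(sample, audit_rate=0.0)
```

Auditing a sample of confident results gives human reviewers a way to catch errors that the confidence score alone would miss.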
Data lake + Ontology specifications
Fast Healthcare Interoperability Resources (FHIR)
is a standard describing data formats and
elements and an application programming
interface for exchanging electronic health
records. The standard was created by the Health
Level Seven International healthcare standards
organization.
IDP / Storage
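For readers unfamiliar with FHIR, the sketch below builds two minimal FHIR-style resources as plain Python dictionaries and serializes them to JSON. The helper names and all patient values are hypothetical, the resources carry only a handful of fields, and nothing here validates against the FHIR specification.

```python
import json


def make_patient(patient_id, family, given, gender, birth_date):
    """Build a minimal FHIR-style Patient resource (illustrative subset)."""
    return {
        "resourceType": "Patient",
        "id": patient_id,
        "name": [{"family": family, "given": [given]}],
        "gender": gender,
        "birthDate": birth_date,  # FHIR dates use the YYYY-MM-DD form
    }


def make_condition(patient_id, icd10_code, display):
    """Build a minimal FHIR-style Condition linked to a patient."""
    return {
        "resourceType": "Condition",
        "subject": {"reference": f"Patient/{patient_id}"},
        "code": {
            "coding": [
                {
                    "system": "http://hl7.org/fhir/sid/icd-10-cm",
                    "code": icd10_code,
                    "display": display,
                }
            ]
        },
    }


# Hypothetical sample data, not a real patient.
patient = make_patient("example-1", "Doe", "Jane", "female", "1958-05-17")
condition = make_condition(
    "example-1", "E11.65", "Type 2 diabetes mellitus with hyperglycemia"
)
payload = json.dumps([patient, condition], indent=2)
```

Storing extracted entities in this shape is what lets downstream systems exchange them without custom mapping.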
Automation encapsulates all processes mentioned above
and unites them into one single product, featuring:
● Document capture
● Model lifecycle
○ Labeling
○ (Re)Training
○ Evaluation
○ Monitoring
● Human-in-the-loop
● Integrations
● System monitoring
IDP / Automation
IDP is more than just OCR. To resolve the problem in-house, you need
to take care of data capture, data ingestion, preprocessing, OCR, data
extraction, evaluation, and further integrations to destination
systems.
Bottleneck: Tables and unstructured text
IDP / Takeaways
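The chain of stages listed above can be pictured as a sequence of functions, each taking and returning a document record. The sketch below is a toy: every stage is a stand-in (the "OCR" step simply passes text through), and all names are hypothetical.

```python
def capture(doc):
    # Stand-in for document capture: read the raw payload.
    return {**doc, "raw": doc["source"].strip()}


def preprocess(doc):
    # Stand-in for preprocessing: normalize whitespace.
    return {**doc, "clean": " ".join(doc["raw"].split())}


def ocr(doc):
    # Stand-in for OCR: the toy input is already text.
    return {**doc, "text": doc["clean"]}


def extract(doc):
    # Stand-in for data extraction: split one "key: value" pair.
    key, _, value = doc["text"].partition(":")
    return {**doc, "fields": {key.strip(): value.strip()}}


def evaluate(doc):
    # Flag records with empty values for human review.
    empty = any(not value for value in doc["fields"].values())
    return {**doc, "needs_review": empty}


def run_pipeline(doc, stages):
    """Apply each stage in order and return the final record."""
    for stage in stages:
        doc = stage(doc)
    return doc


result = run_pipeline(
    {"source": "  Diagnosis:   hypothyroidism \n"},
    [capture, preprocess, ocr, extract, evaluate],
)
```

Keeping each stage behind the same interface makes it possible to swap one vendor's OCR or extraction component for another without touching the rest.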
Solutions Landscape
Market Overview
Documents are everywhere... and solutions for document processing are everywhere, too!
Competitive Landscape
Major technology platforms offer general-
purpose technology components for
document processing, such as:
● Amazon Textract + Comprehend
● Google Document AI
● Microsoft Azure Form Recognizer
Solutions: Cloud Vendors
Pros:
● Cloud infrastructure and integration
● Long lifespan and support
● Constant development
Cons:
● General purpose, i.e., they require
additional work to extract the necessary
information and integrate with
current workflows
This is a “younger” group of up-and-coming
vendors who have built solutions using AI-
native platforms to tackle the most demanding
automation challenges. Generally, they can
handle documents that are more complex or
have greater variation. As a result, they often
can deliver a greater business impact than
older technologies. Since they are free from
legacy technical debt, it is easier for them to
build next-gen, future-oriented solutions.
Solutions: Startups
Pros:
● Modern tech
● Constant development
● More focused applications
● Support — For a new independent player, support is
one of the highest priorities to gain customer
loyalty
Cons:
● Only a few startups in this market can survive
competition with big vendors
● Challenging to customize
● May not align with your cloud strategy
● Support — On the other hand, new startups might
struggle with support
Legacy vendors typically build IDP
solutions on top of legacy platforms.
Niche vendors that are focused on limited
types of documents and use cases. You
might find hidden gems here!
Vendors that restructure your documents
workflow by introducing standard types of
documents, which are really easy to
process.
Solutions: Other Vendors
Pros:
● Wide variety of integrations
● Niche use cases
● Large portfolio of clients
Cons:
● In some cases, they rely on outdated, less
performant technologies
● Document flow restructure
System Integrators may offer IDP
as part of their portfolio of
solutions. Their IDP offering may
be a solution from another IDP
vendor or developed in-house.
Solutions: System Integrators
© 2021, Amazon Web Services, Inc. or its Affiliates.
The AWS ML stack
Broadest and most complete set of machine learning capabilities
ML FRAMEWORKS
& INFRASTRUCTURE
TensorFlow, PyTorch,
Apache MXNet
Deep learning
AMIs & containers
GPUs Inferentia Elastic inference FPGA
AI SERVICES
Vision
Rekognition
Speech
Polly
Transcribe
Chatbots
Lex
Contact centers
Contact Lens
Connect Voice ID
Code + DevOps
CodeGuru
DevOps Guru
Text
Comprehend
Translate
Textract
Business tools
Personalize
Forecast
Fraud Detector
Lookout for Metrics
Search
Kendra
Industrial
Panorama Appliance and
SDK, Monitron, Lookout for
Equipment, Lookout for Vision
Healthcare
HealthLake
Comprehend Medical
Transcribe Medical
Label
data
Data
collection prep
Store
features
Detect bias
and explain
predictions
Visualize in
notebooks
Pick
algorithm
Manage
& monitor
Train
models faster
Deploy in
production
Tune
parameters
Manage edge
devices
SAGEMAKER STUDIO IDE
CI/CD
AMAZON
SAGEMAKER
Purpose-built, HIPAA-eligible services
Amazon
Comprehend
Medical
Amazon
Transcribe Medical
Amazon
HealthLake
(now in preview)
NEW
Digital transformation in healthcare
Today’s clinical models: use <30 data points
Future ML models: use 200K–300K data points
Clinician notes
Claims
Lab reports
Medical record
Clinical studies
Transfer summaries
Medical imaging reports
Quickly and easily import medical records including clinical notes, lab reports,
and more
Amazon
HealthLake
A HIPAA-eligible service that
enables healthcare providers,
health insurance companies,
and pharmaceutical companies
to store, transform, query, and
analyze health data at petabyte
scale
IMPORT
Understand relationships in the data with integrated analytics and ML capabilities
ANALYZE
Powerful query and search capabilities to ask questions of the data
QUERY & SEARCH
Tag and index unstructured data using specialized ML models
TRANSFORM
Stored in the AWS Cloud in a secure, compliant, and auditable way
STORE
Making sense
of health data
See how you can benefit
from Amazon HealthLake
Automatically understand and extract meaningful medical information from raw,
disparate data. Revolutionize a process that was traditionally manual, error-
prone, and costly
Transform data seamlessly
Presents medical events in chronological order so that you can look at
trends over time. Unlock novel insights with ML models to find patterns and
identify anomalies
Identify trends and make predictions
Create a complete view of each patient’s medical history and structure it in the
Fast Healthcare Interoperability Resources (FHIR) standard format to facilitate
the exchange of information across multiple applications
Support interoperability
Abbreviations, and time stamps
patient is a 62-year-old female with a history of Type 2 diabetes mellitus with insulin use (1/14/2019). Admitted ER 1/11/2020 for an
elevated BP with no previous history of HTN. 3/28/2020 she was admitted for hypothyroidism and prescribed metformin (GLUCOPHAGE)
1000 mg take daily by mouth in evening. Follow up clinic visit (9/20/2020) with A1C results of...
Medical conditions
Transform Data Seamlessly
ICD-10
RxNorm
“METFORMIN (GLUCOPHAGE) 1,000 mg tablet E 1 AND 1/2 TABLETS
BY MOUTH IN THE MORNING... “
“Type 2 diabetes mellitus with hyperglycemia,
without long-term current use of insulin (HCC)…”
Assign medical codes to text
Access all the information
on a patient
Support Interoperability
Interoperability
Pharmaceutical
companies
Patients
Hospitals
Health plans
Providers
Labs
How it Works
Vendors Evaluation Methodology
What to Choose?
Now, you have all the information about
possible go-to solutions in your market
segment. What’s next?
You need to fairly compare each and every
solution to choose one that fits and aligns
with your use case the most.
Deep evaluation is key to making the right
decision.
Data
● EDA (exploratory data analysis) — Knowing your
data is the key to success
● Sample data based on EDA
● Use this data as the evaluation dataset for
measuring performance of solutions on the
market / in the segment
Metrics
● F1, Accuracy, Recall, etc.
● Key, value extraction
● Table data
● Language, character recognition, spelling,
handwritten text
Provectus Evaluation Methodology
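As an illustration of the key-value metrics listed above, the sketch below scores one document's extracted fields against a hand-labeled reference. It assumes exact-match scoring on (key, value) pairs; the function name `precision_recall_f1` and the sample fields are hypothetical, and a real benchmark would likely add fuzzy matching and per-field breakdowns.

```python
def precision_recall_f1(predicted, expected):
    """Score extracted fields against labeled fields using exact matches."""
    predicted_items = set(predicted.items())
    expected_items = set(expected.items())
    true_positives = len(predicted_items & expected_items)
    precision = true_positives / len(predicted_items) if predicted_items else 0.0
    recall = true_positives / len(expected_items) if expected_items else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical labeled fields and one provider's output for the same document.
expected = {"name": "Jane Doe", "dob": "1960-01-02", "policy": "ABC-123", "payer": "Acme"}
predicted = {"name": "Jane Doe", "dob": "1960-01-02", "policy": "ABC-128"}

precision, recall, f1 = precision_recall_f1(predicted, expected)
```

Here two of the three predicted fields are correct and two of the four expected fields are found, so precision is 2/3 and recall is 1/2.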
Evaluation / Composite Index
Name Score
Provider 1 0.64
Provider 2 0.81
Provider 3 0.78
Composite Index
Evaluation / Text Index
Evaluation / Robustness Index
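The slides do not say how the composite index is derived from the text and robustness indices. One plausible reading, shown here purely as an assumption, is a weighted average of the sub-indices. The weights, provider names, and scores below are hypothetical and unrelated to the figures in the table above.

```python
def composite_index(sub_scores, weights):
    """Weighted average of sub-index scores (assumed aggregation rule)."""
    total_weight = sum(weights.values())
    return sum(sub_scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical weights and sub-index scores.
weights = {"text": 0.6, "robustness": 0.4}
providers = {
    "Provider A": {"text": 0.90, "robustness": 0.70},
    "Provider B": {"text": 0.80, "robustness": 0.90},
}

scores = {name: composite_index(sub, weights) for name, sub in providers.items()}
ranking = sorted(scores, key=scores.get, reverse=True)
```

With these inputs the stronger robustness score outweighs the weaker text score, which shows how the choice of weights can change the ranking.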
TCO and Case Study: Under NDA Client
General TCO structure:
● Infrastructure (data pipelines, storage, control panel)
● CV, NLP, Human-in-the-loop
● R&D costs (if building in house)
● Support
TCO targets for end-to-end solution: ~20-30 cents per
document for simple use cases and 50+ cents for specific
“complex” documents
Result: The cost of processing one document was reduced
from 24 to 11 cents, since the right OCR/CV vendor was
selected (it saved almost 10 cents per document). Also,
serverless architecture was leveraged to reduce
infrastructure costs.
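The per-document figures in the case study translate into totals as sketched below. The 24 and 11 cent costs come from the slide; the annual volume of one million documents and the function name `annual_savings_usd` are hypothetical, chosen only to make the arithmetic concrete.

```python
def annual_savings_usd(cost_before_cents, cost_after_cents, docs_per_year):
    """Yearly savings in dollars for a given per-document cost reduction."""
    return (cost_before_cents - cost_after_cents) * docs_per_year / 100


cost_before = 24  # cents per document, from the case study
cost_after = 11   # cents per document, from the case study

saving_per_doc = cost_before - cost_after          # 13 cents
relative_reduction = saving_per_doc / cost_before  # about 54%

# Assumption: one million documents per year (not stated in the source).
savings = annual_savings_usd(cost_before, cost_after, 1_000_000)
```

At that assumed volume, a 13 cent reduction per document amounts to 130,000 dollars per year.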
OCR/CV solution performance vs. cost: For a
given use case, the most expensive solution
delivered the worst result. The second-best
result was demonstrated by the vendor with
the second-cheapest solution.
Takeaways
1. Ecosystem matters: Data integration with built-in industry-specific connectors, data pipelines,
OCR, NLP, security, storage, and a human-in-the-loop workflow. All these elements should be
integrated with each other for optimal performance.
2. Use an unbiased benchmarking framework to evaluate the real performance
of different providers, based on your use case and datasets.
3. Work with Provectus to reduce your document processing costs
a. By 2-8x compared to manual workflows
b. By 30%+ compared to legacy OCR solutions
c. By 10%+ compared to modern cloud solutions
125 University Avenue
Suite 295, Palo Alto
California, 94301
provectus.com
Questions, details?
We would be happy to answer!

Choosing the Right Document Processing Solution for Healthcare Organizations

  • 1. Choosing the Right Document Processing Solution for Healthcare Organizations Presented by: Iskandar Sitdikov, ML Solutions Architect @ Provectus Stepan Pushkarev, CTO @ Provectus Andy Schuetz, PhD, Sr. Solutions Architect @ AWS
  • 2. Webinar Objectives 1. Provide an overview of the market for document processing solutions 2. Outline critical factors for choosing the right document processing solution for your healthcare use case 3. Strategize on whether you should look for a ready-made solution to purchase, or build a custom solution of your own 4. Get qualified for the Provectus IDP Solution Discovery Program
  • 3. 1. Introduction 2. Healthcare use cases 3. Document processing in 60 seconds 4. Solutions map, advantages and problems 5. Evaluation Agenda
  • 4. Introductions Iskandar Sitdikov ML Solutions Architect, Provectus Andy Schuetz, PhD Sr. Solutions Architect, Healthcare and Life Science, AWS Stepan Pushkarev Chief Technology Officer, Provectus
  • 5. AI-first Consultancy & Solutions Provider 500 employees and growing Established in 2010 HQ in Palo Alto Offices in North America, LATAM, and Europe Machine Learning DevOps Big Data Analytics We are obsessed with leveraging cloud, data, and AI to reimagine the way businesses operate, compete, and deliver customer value
  • 6. Our Clients Innovative Tech Vendors (ISV & DNB) Seeking niche expertise to differentiate and win the market Midsize to Large Enterprises Seeking to accelerate innovation and achieve operational excellence
  • 8. Use cases: Clinical notes, medical records, insurance medical claims, clinical studies, medical imaging reports, lab reports, and transfers. Administrative overhead to process data from these types of documents is huge. Main benefits: Operational speed and cost reduction. In our practice, we see 2-8x+ reduction in costs compared to a fully manual process and 30%+ savings in comparison to legacy OCR solutions. Healthcare Use Cases
  • 9. General goal is to spot main entities in the document (paragraphs, forms, tables, etc.) and then successfully identify written text in them (segmentation and OCR). Both problems can be resolved separately or using end-to-end networks. IDP / CV
  • 10. Context search on data from OCR + segmentation Forms and tables greatly impact overall performance. Data extraction from forms is resolved (due to a straightforward key-value structure). Tables are still a pain point for all data extractors. For unstructured texts, deep networks are a solution at this point. Ex: BERT — good for finding key-value (question / answer) pairs in context. IDP / Data Extraction
  • 11. Evaluation of the document processing model is a task in progress. Results with a low-confidence score and missing information are forwarded to human experts. Samples of successfully extracted information are also forwarded to human experts for evaluation. IDP / Evaluation and Monitoring
  • 12. Data lake + Ontology specifications Fast Healthcare Interoperability Resources (FHIR) is a standard describing data formats and elements and an application programming interface for exchanging electronic health records. The standard was created by the Health Level Seven International healthcare standards organization. IDP / Storage
  • 13. Automation encapsulates all processes mentioned above and unites them into one single product, featuring: ● Document capture ● Model lifecycle ○ Labeling ○ (Re)Training ○ Evaluation ○ Monitoring ● Human-in-the-loop ● Integrations ● System monitoring IDP / Automation
  • 14. IDP is more than just OCR. To resolve the problem in-house, you need to take care of data capture, data ingestion, preprocessing, OCR, data extraction, evaluation, and further integrations to destination systems. Bottleneck: Tables and unstructured text IDP / Takeaways
  • 16. Documents are everywhere... and solutions for document processing are everywhere, too! Competitive Landscape
  • 17. Major technology platforms offer general-purpose technology components for document processing, such as: ● Amazon Textract + Comprehend ● Google Document AI ● Microsoft Azure Form Recognizer Solutions: Cloud Vendors Pros: ● Cloud infrastructure and integration ● Long lifespan and support ● Constant development Cons: ● General purpose, i.e., they require additional work to extract the necessary information and integrate with current workflows
  • 18. This is a “younger” group of up-and-coming vendors who have built solutions using AI-native platforms to tackle the most demanding automation challenges. Generally, they can handle documents that are more complex or have greater variation. As a result, they often can deliver a greater business impact than older technologies. Since they are free from legacy technical debt, it is easier for them to build next-gen, future-oriented solutions. Solutions: Startups Pros: ● Modern tech ● Constant development ● More focused applications ● Support: for a new independent player, support is one of the highest priorities to gain customer loyalty Cons: ● Only a few startups in this market can survive competition with big vendors ● Challenging to customize ● May not align with your cloud strategy ● Support: on the other hand, new startups might struggle with support
  • 19. Legacy vendors typically build IDP solutions on top of legacy platforms. Niche vendors focus on limited types of documents and use cases. You might find hidden gems here! Other vendors restructure your document workflow by introducing standard types of documents, which are really easy to process. Solutions: Other Vendors Pros: ● Wide variety of integrations ● Niche use cases ● Large portfolio of clients Cons: ● In some cases, they rely on outdated, less performant technologies ● Document flow restructure
  • 20. System Integrators may offer IDP as part of their portfolio of solutions. Their IDP offering may be a solution from another IDP vendor or developed in-house. Solutions: System Integrators
  • 21. © 2021, Amazon Web Services, Inc. or its Affiliates. The AWS ML stack Broadest and most complete set of machine learning capabilities ML FRAMEWORKS & INFRASTRUCTURE TensorFlow, PyTorch, Apache MXNet Deep learning AMIs & containers GPUs Inferentia Elastic inference FPGA AI SERVICES Vision Rekognition Speech Polly Transcribe Chatbots Lex Contact centers Contact Lens Connect Voice ID Code + DevOps CodeGuru DevOps Guru Text Comprehend Translate Textract Business tools Personalize Forecast Fraud Detector Lookout for Metrics Search Kendra Industrial Panorama Appliance and SDK, Monitron, Lookout for Equipment, Lookout for Vision Healthcare HealthLake Comprehend Medical Transcribe Medical Label data Data collection prep Store features Detect bias and explain predictions Visualize in notebooks Pick algorithm Manage & monitor Train models faster Deploy in production Tune parameters Manage edge devices SAGEMAKER STUDIO IDE CI/CD AMAZON SAGEMAKER
  • 22. Purpose-built, HIPAA-eligible services: Amazon Comprehend Medical, Amazon Transcribe Medical, Amazon HealthLake (new, now in preview)
  • 23. Digital transformation in healthcare Today’s clinical models use fewer than 30 data points. Future ML models use 200K–300K data points. Data sources: clinician notes, claims, lab reports, medical records, clinical studies, transfer summaries, medical imaging reports
  • 24. Amazon HealthLake A HIPAA-eligible service that enables healthcare providers, health insurance companies, and pharmaceutical companies to store, transform, query, and analyze health data at petabyte scale. IMPORT: Quickly and easily import medical records, including clinical notes, lab reports, and more. ANALYZE: Understand relationships in the data with integrated analytics and ML capabilities. QUERY & SEARCH: Powerful query and search capabilities to ask questions of the data. TRANSFORM: Tag and index unstructured data using specialized ML models. STORE: Stored in the AWS Cloud in a secure, compliant, and auditable way.
  • 25. Making sense of health data See how you can benefit from Amazon HealthLake. Transform data seamlessly: Automatically understand and extract meaningful medical information from raw, disparate data. Revolutionize a process that was traditionally manual, error-prone, and costly. Identify trends and make predictions: Presents medical events in chronological order so that you can look at trends over time. Unlock novel insights with ML models to find patterns and identify anomalies. Support interoperability: Create a complete view of each patient’s medical history and structure it in the Fast Healthcare Interoperability Resources (FHIR) standard format to facilitate the exchange of information across multiple applications.
  • 26. Abbreviations, and time stamps patient is a 62-year-old female with a history of Type 2 diabetes mellitus with insulin use (1/14/2019). Admitted ER 1/11/2020 for an elevated BP with no previous history of HTN. 3/28/2020 she was admitted for hypothyroidism and prescribed metformin (GLUCOPHAGE) 1000 mg take daily by mouth in evening. Follow up clinic visit (9/20/2020) with A1C results of... Medical conditions Transform Data Seamlessly
  • 27. ICD-10 RxNorm “METFORMIN (GLUCOPHAGE) 1,000 mg tablet E 1 AND 1/2 TABLETS BY MOUTH IN THE MORNING... “ “Type 2 diabetes mellitus with hyperglycemia, without long-term current use of insulin (HCC)…” Assign medical codes to text
  • 28. Access all the information on a patient
  • 29. Support Interoperability Interoperability Pharmaceutical companies Patients Hospitals Health plans Providers Labs
  • 30. How it Works
  • 32. What to Choose? Now, you have all the information about possible go-to solutions in your market segment. What’s next? You need to fairly compare each and every solution to choose one that fits and aligns with your use case the most. Deep evaluation is key to making the right decision.
  • 33. Data ● EDA (exploratory data analysis) — Knowing your data is the key to success ● Sample data based on EDA ● Use this data as the evaluation dataset for measuring performance of solutions on the market / in the segment Metrics ● F1, Accuracy, Recall, etc. ● Key, value extraction ● Table data ● Language, character recognition, spelling, handwritten text Provectus Evaluation Methodology
  • 34. Evaluation / Composite Index Name Score Provider 1 0.64 Provider 2 0.81 Provider 3 0.78 Composite Index
  • 37. TCO and Case Study: Under NDA Client General TCO structure: ● Infrastructure (data pipelines, storage, control panel) ● CV, NLP, Human-in-the-loop ● R&D costs (if building in house) ● Support TCO targets for end-to-end solution: ~20-30 cents per document for simple use cases and 50+ cents for specific “complex” documents Result: The cost of processing one document was reduced from 24 to 11 cents. Selecting the right OCR/CV vendor saved almost 10 cents per document, and a serverless architecture was leveraged to reduce infrastructure costs. OCR/CV solutions performance vs. cost: For the given use case, the most expensive solution delivered the worst result. The second-best result was demonstrated by the vendor with the second-cheapest solution.
  • 38. Takeaways 1. Ecosystem matters: Data integration with built-in industry-specific connectors, data pipelines, OCR, NLP, security, storage, and a human-in-the-loop workflow. All these elements should be integrated with each other for optimal performance. 2. Use an unbiased benchmarking framework for evaluating the real performance of different providers, based on your use case and datasets. 3. Work with Provectus to reduce your Document Processing costs a. By 2-8x compared to manual workflows b. By 30%+ compared to legacy OCR solutions c. By 10%+ compared to modern cloud solutions
  • 39. 125 University Avenue Suite 295, Palo Alto California, 94301 provectus.com Questions, details? We would be happy to answer!
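
The evaluation slides (33 and 34) mention F1 and recall for key-value extraction and a per-provider composite index, but do not say how either is computed. The following is a minimal sketch of one possible approach, not the Provectus methodology: exact-match scoring of extracted fields, followed by a weighted average of several metric scores. The field names, sample values, metric names, and weights are all hypothetical and chosen only for illustration.

```python
def precision_recall_f1(predicted, expected):
    """Score key-value extraction for one document.

    A field counts as correct only if the predicted value matches the
    expected value exactly (an assumption; fuzzy matching is also common).
    """
    correct = sum(1 for key, value in predicted.items() if expected.get(key) == value)
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(expected) if expected else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def composite_index(metric_scores, weights):
    """Weighted average of per-metric scores; weights are normalized here."""
    total_weight = sum(weights.values())
    return sum(metric_scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical ground truth and one provider's output for a single document.
expected = {
    "patient_name": "Jane Doe",
    "visit_date": "2020-09-20",
    "medication": "metformin",
    "dose_mg": "1000",
}
predicted = {
    "patient_name": "Jane Doe",    # correct
    "visit_date": "2020-09-20",    # correct
    "dose_mg": "100",              # wrong value; "medication" is missing entirely
}

precision, recall, f1 = precision_recall_f1(predicted, expected)

# Hypothetical weights and hypothetical table/OCR scores for the same provider.
weights = {"key_value_f1": 0.5, "table_f1": 0.3, "ocr_accuracy": 0.2}
scores = {"key_value_f1": f1, "table_f1": 0.70, "ocr_accuracy": 0.95}
index = composite_index(scores, weights)

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} index={index:.2f}")
```

In practice the per-document scores would be averaged over the evaluation dataset sampled during EDA, and the weights would reflect how much tables, forms, and free text matter for the use case at hand.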