iData Sciences
Data Capture, Analytics & Optimization
Enabling customers to harvest & analyze data for actionable intelligence
Redefining Decision Intelligence
iData Sciences is an Information Technology Company
Providing Software Solutions to Customers
Accelerated through the use of Innovative Technology
in the area of Data Capture & Image Recognition
Headquartered in NJ, USA
Offshore Delivery Center in Chennai, India
www.idatasciences.com 2
Company Overview
To help Customers Improve Process Efficiency through the use of
innovative Software Solutions
To help Customers Improve Productivity in the area of Data
Capture by
Reducing Average Handling Time
Reducing Average Cost of Handling and
Improving Accuracy of Capture
To be regarded amongst the most Innovative Providers of
Imaging & Recognition Technology to global enterprises
Vision
Data is Everywhere!
Invoices, Bank & Financial Statements, Legal Documents,
Regulatory Filings, Bank Checks, Application Forms
Prescriptions, Medical Bills, Insurance Claims
Retail Coupons, Product Catalogues, Travel Vouchers
Structured; Unstructured; Textual; Non-Textual
Capturing & Analyzing this data
provides Organizations with Valuable Insights
leading to Actionable Intelligence
Context
Data Capture Processes
Limited automation specific only to textual data capture
Limited scalability due to high dependency on manual human labour
Time-consuming, with lower data accuracy
Inability to extract data from large files
Increased total cost of ownership (TCO)
Data Recognition
Non-textual data capture is still nascent and is predominantly manual
Difficulty detecting and recognizing data in videos
Large and complex volume of data
Tagging of images/video is highly manual and labor-intensive
Current Challenges
Typical Data Flow
Sourcing
FTP
Email
Web Crawlers
Data Feeds
Extraction
Text
Images
Processing
Normalization
Validation
Tagging
Consumption
Business
Applications
Analytics
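The four stages above (Sourcing, Extraction, Processing, Consumption) can be sketched as a simple chain of functions. This is only an illustrative sketch, not the iData engine: the Record type, the function names, and the keyword-based tagging are all hypothetical stand-ins.

```python
from dataclasses import dataclass, field


@dataclass
class Record:
    """Hypothetical container for one captured item."""
    source: str                               # e.g. "ftp", "email", "crawler", "feed"
    text: str                                 # extracted text payload
    tags: list = field(default_factory=list)  # filled in by the tagging step
    valid: bool = False                       # set by the validation step


def extract(source, raw):
    # Extraction: wrap the raw payload (text only in this sketch).
    return Record(source=source, text=raw)


def normalize(rec):
    # Normalization: collapse whitespace and upper-case the text.
    rec.text = " ".join(rec.text.split()).upper()
    return rec


def validate(rec):
    # Validation: here, simply require non-empty text.
    rec.valid = bool(rec.text)
    return rec


def tag(rec, keywords):
    # Tagging: attach every keyword that occurs in the text.
    rec.tags = [k for k in keywords if k in rec.text]
    return rec


def run_pipeline(source, raw, keywords):
    rec = extract(source, raw)
    for step in (normalize, validate):
        rec = step(rec)
    return tag(rec, keywords)


# Consumption: the resulting record is handed to a business application or analytics.
result = run_pipeline("email", "  invoice   no 1234 ", ["INVOICE", "COUPON"])
print(result)
```

Keeping each stage as a separate function means a new source or processing rule can be added without touching the other stages.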
Unlike OCR, which captures only text from pre-defined zones, iData
widens the definition of data to capture both text and images,
independent of zones
Human-like visual cognition in extraction of data
Self-learning engine (expert system)
Can work on both structured & unstructured data, printed & handwritten
Making it a powerful alternative to existing OCR technology
Can work on static images and video
Analytics based on disruptive pattern-based techniques (Content-Based Object Retrieval)
Highly scalable & performant
How Are We Different?
Data Capture
• Exceptions Listing
• Formatting Rules
• Dependency Rule
• Look Up from DB
• Duplication Check
• Reconciliation
• Build Audit Trail
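Several of the validation steps listed above (formatting rules, dependency rule, database look-up, duplication check, exceptions listing, audit trail) can be illustrated with a small sketch. Everything here is an assumption made for illustration: the field names, the date format, the "tax needs amount" dependency, and the in-memory KNOWN_VENDORS set that stands in for a database. Reconciliation is not shown.

```python
import re

KNOWN_VENDORS = {"ACME"}  # hypothetical stand-in for a database look-up table


def validate_batch(records):
    """Return (exceptions listing, audit trail) for a list of dict records."""
    seen = set()
    exceptions = []  # (record index, list of problems)
    audit = []       # (record index, outcome) for every record processed
    for i, rec in enumerate(records):
        problems = []
        # Formatting rule: date must look like YYYY-MM-DD (assumed format).
        if not re.fullmatch(r"\d{4}-\d{2}-\d{2}", rec.get("date", "")):
            problems.append("format:date")
        # Dependency rule: a tax value requires an amount (assumed rule).
        if rec.get("tax") and not rec.get("amount"):
            problems.append("dependency:tax-needs-amount")
        # Look-up: vendor must exist in the reference table.
        if rec.get("vendor") not in KNOWN_VENDORS:
            problems.append("lookup:vendor")
        # Duplication check on (vendor, invoice number).
        key = (rec.get("vendor"), rec.get("invoice_no"))
        if key in seen:
            problems.append("duplicate")
        seen.add(key)
        audit.append((i, "rejected" if problems else "ok"))
        if problems:
            exceptions.append((i, problems))
    return exceptions, audit


good = {"invoice_no": "1", "vendor": "ACME", "date": "2020-01-31",
        "amount": "100", "tax": "5"}
bad = {"invoice_no": "2", "vendor": "Unknown", "date": "31/01/2020", "tax": "5"}
exceptions, audit = validate_batch([good, dict(good), bad])
print(exceptions)
print(audit)
```

Records that pass every rule go on to export; the rest appear in the exceptions listing for manual review.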
Export: XML, Excel, CSV, Text, Database
Porting:
ERP: SAP, MS Dynamics
Database: Oracle, MS SQL, MySQL
Self Learning: Builds Knowledge Base
Pattern Recognition
Business Rules
Knowledge Base
Images: Forms, Invoices,
Documents, Other Images
Image Repository
Porting to Business Process
Extract Validate
Extract: Data, Objects & Features
Specific Business Rules Defined at the Time
of Extraction
Extraction from Structured and Unstructured documents/forms
Typewritten, Printed, (*Multi-Lingual data)
Image Objects
Handwritten capital letters in boxes (non-cursive) from Structured forms
Editable Clean Data
Digital Image Repository
Physical Document Scanned
Actionable Intelligence to Enterprises
Predictive Analytics
Trend Analysis and Inflection Points
Risk Analysis
Real-time & highly performant Analytics
Advanced Technology in Image Analytics using disruptive pattern-based techniques
Data Analytics
Facial Recognition
Object Recognition
Logo / Brand Recognition
Static & Automated Tagging
Voice-based identification and elastic search
Recognition Expert System
One Framework, Multiple use-cases
Logo detection
Image classification based on objects in Image
Recognize and analyze items in photos and video
Recognize a face from an image or a video and bring all correlated
information from the reference database.
Identify part numbers from visual characteristics
Identify and count inventory on display
Recognize patterns and distinguish objects
Recognize objects and quantify their characteristics
Find Objects of interest in a scene
Recognize the content in video and photos to generate metadata
and taxonomy automatically
Achieves compression ratios of 8:1 without compromising visual quality
Accepts multiple formats (DCM, J2K, JPEG, PNG, PDF, TIFF, AVI, WMV, MPEG2, MPEG4, H.264, etc.)
Input attributes and metadata are retained in the output – format, scale, dimension, resolution, quality, etc.
Automated image editing features – resize, crop, smooth, content filtering, metadata edit
Options to choose Lossy / Lossless optimization
ROLLBACK option to regenerate original un-optimized file
Image & Video Optimization
iData Engine
Optimization, Format Standardization, Quality Checks, Data Extraction
Image / Video with LARGE file size
DCM, J2K, JPEG, PNG, PDF, TIFF, AVI, WMV, MPEG2, MPEG4, H.264
Image / Video with SMALL file size
DCM, J2K, JPEG, PNG, PDF, TIFF, AVI, WMV, MPEG2, MPEG4, H.264
• No visual quality loss
• No metadata loss
• Data capture for analysis
• Credit Card Application / Loan Origination / Know Your Customer (KYC) Modules – Retrieve handwritten or printed data from Application Forms, including Name, Address, Gender, Employment Details, etc., and validate it based on business rules.
• Legal Documents Module – Retrieves data fields from a Legal Contract, including Title, Legal Entities, Execution Date, etc.
• Bank & Financial Statements Module – Retrieves Customer Name, Account Number, Financial Institution Name, Address, Ledger Line Item Details, etc.
Banking
Finance
Insurance
• Drawing Identification Module – Automatically identifies the type of plan / drawing and recognizes the discipline (Electrical, Architectural, Plumbing, etc.)
• Page Title Finder Module – Automatically detects and recognizes the title and page reference
• Factor & Shape Finder – Identifies and recognizes details of Architects and Engineers; recognizes the shape and verifies compliance
• Spec Finder – Captures all specifications automatically
Engineering Data Analytics
• Image/Video Tagging – Automates the tagging of images & video with all relevant data, thereby enriching search capabilities.
• Product Identification – Creates a non-intrusive ‘shoppable’ experience embedded into video assets. iData captures the pixel coordinates of every object and brand identified to create a ‘clickable’ mode.
• Archiving / Search – Analyze video, make correlations among videos, translate all on-screen text, search for specific things or people, compare, reference, and develop a knowledge base.
Media
iData Solution Accelerators
• Digital Content Creation Module – Automates the creation of device-independent digital content (EPUB) from image PDFs. Built-in mapping for automated transformation of XML/DHTML/HTML to various e-book formats.
• Image Tagging Module – For automatic tagging of images.
• Digital Rights Management Module – Builds fingerprints (fractal prints) into the image/document to protect original content.
• Elastic Search Module – Powerful elastic text search module to search content in non-linear mode.
Publishing
• Medical Image Optimization Module – Reduces the file size of medical images without any loss of metadata or visual quality.
• Prescription Processing Module – Retrieves relevant information from a prescription, including Doctor's Information, Code, Patient Information, Drug, Dosage, etc.
• Explanation of Benefits Module – Retrieves all relevant benefits information.
Healthcare
• Invoice & Order Processing Module – Retrieves data fields such as Invoice Number, Date, PO Number, Vendor Name, Invoice Amount, Tax, etc. and validates them based on business rules.
• Forms Processing Module – Retrieves relevant data fields from any printed or handwritten form, whether structured or unstructured, and validates them based on business rules.
• Automatic Feature Extraction Module – Retrieves data from Advertisement Flyers and Coupons: Fact Coding, UPC Coding & Product Coding.
Business Process Outsourcing
Domain Profile
A world-leading, US-based Retail Marketing Research and Consumer Behaviour Intelligence Company
Voluminous processing of unstructured retail image coupons for data, image feature extraction and data
analytics
Challenges
Transmitting approx. 18,000 images every day from the US to India (one frame = 30 images at 5.94 MB per frame) on 40 MBPS bandwidth; processing and rendering large image files at high volumes.
Process involved manual cropping of images and data entry (Fact coding and UPC coding) into a specified file format with data fields in a database
Low per-person productivity: approx. 240 images per 8 hours.
Solution & Results
With MapOptis optimisation, productivity rose to 40,000 images/hour
Extraction productivity rose to 5,000 images per two hours
Process went from 100 image analysts producing 18,000 images/day to 15 QA resources producing 100,000 images in 8 hours.
Team size reduced to only 15 resources, performing Quality Analysis & Checks
Storage requirements reduced by 60%
Accuracy levels consistently over 95% on auto mode.
Operational and manpower cost savings of 60%
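The before/after team figures quoted above can be turned into per-person throughput with a few lines of arithmetic. This is a rough sketch that assumes "images/day" and "8 hours" both refer to one working shift; the variable names are illustrative.

```python
# Figures as stated on the slide.
before_team, before_images = 100, 18_000   # image analysts, images per day
after_team, after_images = 15, 100_000     # QA resources, images per 8 hours

before_rate = before_images / before_team  # images per person per shift
after_rate = after_images / after_team
gain = after_rate / before_rate            # per-person throughput multiple
headcount_cut = round((1 - after_team / before_team) * 100)  # percent

print(round(before_rate), round(after_rate), round(gain, 1), headcount_cut)
```

On these assumptions, per-person throughput goes from about 180 to about 6,667 images per shift, roughly a 37-fold increase, with 85% fewer people.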
Case Study 1
Coupons & Flyers Extraction
Domain Profile
A leading US-based invoice processing company that works with the world’s foremost Retail Chains
Voluminous processing of structured and unstructured invoices, with data being captured manually
Challenges
Handling 4,500 scanned invoice pages with multiple line items.
Double Data Entry for accuracy.
Productivity – 90 images/8 hours
Team size of 120
Solution & Results
MapOptis achieved a throughput of 4,500 images per two hours, and provided unlimited volume scaling and an immediate increase in volume of business.
Team size reduced to only 5 resources, performing Quality Analysis & Checks
Accuracy levels consistently at 99.95%
60% savings in operational and manpower costs
Case Study 2
Invoice Data Extraction
Domain Profile
A leading Mobile Telecommunications Service Provider
Voluminous processing of hand written structured application forms.
Data was manually captured
Challenges
To automatically and accurately extract different types of data, without loss of image quality or data accuracy and with less manual processing and fewer dependencies
Solution & Results
MapOptis extracted, optimized and stored all types of form data, such as text, signature, photo & attachments, with 95% extraction and validation accuracy that progressively improved to 98%
Improved turnaround time by an unprecedented 80%
Reduced overall operational costs by 65%
Case Study 3
Handwritten Forms
MapOptis takes 10 seconds
Extracted data with photo and signature can be ported to any database
Manual data entry of a handwritten form takes approximately 4 minutes per form
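The two timings above imply a speed-up that is easy to work out. The sketch below assumes both timings are per form and that a shift is 8 hours; both are assumptions, as the slide does not state them explicitly.

```python
manual_seconds = 4 * 60   # approx. 4 minutes per form, manual data entry
auto_seconds = 10         # MapOptis, assumed to be per form

speedup = manual_seconds / auto_seconds        # how many times faster
shift_seconds = 8 * 3600                       # assumed 8-hour shift
manual_forms = shift_seconds // manual_seconds # forms per shift, one operator
auto_forms = shift_seconds // auto_seconds     # forms per shift, one MapOptis stream

print(speedup, manual_forms, auto_forms)
```

That is a 24-fold reduction in handling time per form: about 120 forms per shift for one operator against about 2,880 for one automated stream.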
Case Study 5
Medical Image Optimization
• Dimension, format, resolution and quality are retained in the output file
• The sample image is viewed using a third-party image viewer (Santa Fe)
Original Image
Image Type: DCM
Image Size: 7.19 MB
Optimized Image
Image Type: DCM
Image Size: 1.83 MB
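The compression ratio and size reduction for this sample follow directly from the two file sizes. A quick worked calculation, assuming both sizes are in megabytes:

```python
original_mb = 7.19    # original DCM file size from the slide
optimized_mb = 1.83   # optimized DCM file size from the slide

ratio = original_mb / optimized_mb                   # compression ratio, N:1
reduction_pct = (1 - optimized_mb / original_mb) * 100

print(f"{ratio:.2f}:1, {reduction_pct:.1f}% smaller")
```

For this particular image the ratio is about 3.93:1, a reduction of roughly 74.5%. That is below the 8:1 figure quoted elsewhere in the deck, which suggests the achievable ratio varies with the image.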