Everyone knows what a load file is, but do you know what a load file is made of? Together, we'll discuss the anatomy of different load file formats commonly used with Ipro software.
Then, with the surgical precision of a toddler playing their first game of Operation!, we'll practice editing some load files.
Private and Confidential – Copyright 2019
DATs, LFPs, and OPTs – Oh My!
What to Expect –
• 5 hands-on exercises using a Text Editor, CMD, and various techniques to
manipulate data
• Class discussion and Q&A between each scenario
Do you have what it takes?
Familiarity with Text Editors
Case Administrator
The DAT
Document Level
Native Path, Text, and Metadata
Document Breaks Indicated
Encoding: ASCII / Unicode
Delimiters:
Field: ASCII 20 (non-printing)
Quote: þ (ASCII 254)
Newline: ® (ASCII 174)
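As a rough illustration of how these delimiters fit together, the Python sketch below splits one DAT line into its fields. It assumes the common Concordance-style defaults (field = ASCII 20, quote = þ, newline = ®); the function name and the sample values are made up for this example.

```python
FIELD = "\x14"      # ASCII 20, assumed default field delimiter
QUOTE = "\u00fe"    # þ wraps every field value
NEWLINE = "\u00ae"  # ® stands in for line breaks inside a field

def parse_dat_line(line):
    """Split one DAT record into a list of field values."""
    values = []
    for raw in line.rstrip("\r\n").split(FIELD):
        # Strip the surrounding quote characters, if both are present.
        if len(raw) >= 2 and raw.startswith(QUOTE) and raw.endswith(QUOTE):
            raw = raw[1:-1]
        values.append(raw.replace(NEWLINE, "\n"))
    return values

# Hypothetical record: BegDoc, EndDoc, and a two-line text value.
sample = FIELD.join(
    QUOTE + value + QUOTE
    for value in ["ECL0000001", "ECL0000006", "Line one" + NEWLINE + "Line two"]
)
print(parse_dat_line(sample))
```

A real DAT usually starts with a header row in the same layout, so the first parsed line can serve as the list of field names.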
DAT File Preview
The OPT
Page Level with Single Page Images
Image-based
Document Breaks Indicated
Encoding: ASCII
Commonly requested Image Load File
Comma Delimited
Components of an OPT
Format: Imagekey,Volume,ImagePath,DocMarker,BoxMarker,FolderMarker,PageCount
Example: ECL_00000001,ECL0001,ECL0001Image0001ECL_00000001.tif,Y,,,6
(Not used in Ipro Desktop or Ipro Cloud)
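The seven-field layout can be read with a few lines of Python. This is only a sketch: the names `OptLine`, `parse_opt_line`, and `summarize_opt` are invented here, the second sample line is hypothetical, and the simple comma split assumes image paths never contain commas.

```python
from collections import namedtuple

OptLine = namedtuple(
    "OptLine",
    "image_key volume image_path doc_marker box_marker folder_marker page_count",
)

def parse_opt_line(line):
    """Split one OPT line into its seven comma-delimited fields."""
    parts = line.rstrip("\r\n").split(",")
    if len(parts) != 7:
        raise ValueError(f"expected 7 fields, got {len(parts)}")
    return OptLine(*parts)

def summarize_opt(lines):
    """Return (page_lines, documents): one line per page, 'Y' marks a new document."""
    records = [parse_opt_line(line) for line in lines if line.strip()]
    documents = sum(1 for record in records if record.doc_marker.upper() == "Y")
    return len(records), documents

opt_sample = [
    "ECL_00000001,ECL0001,ECL0001Image0001ECL_00000001.tif,Y,,,6",
    "ECL_00000002,ECL0001,ECL0001Image0001ECL_00000002.tif,,,,",  # hypothetical second page
]
print(summarize_opt(opt_sample))
```

The two numbers it returns are the OPT line count and OPT doc count used when checking a production.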
The LFP
Page Level Lines; Single or Multi Page
Can load Images, Tags, Annotations and
Redactions, Natives, and Text
Document Breaks Indicated
Encoding: ASCII
Original Ipro Image Load File
LFP Basic Commands
Command | Description
AN | Redactions, Embedded Text, Highlights, Markups
FT | Full-Text search highlights
IM | Image/page reference
IS | Page-level tags
OF | Native file
OI | OCR Text in load file
OT | Path to text file on disk
VN | Volume location
Components of an LFP
Format: IM,ImageKey,Boundary,Page,@VOL;RootFolder\SubFolder;Filename.tif;FileType
Example: IM,ECL_00000001,D,1,@;Eclipse Smoke Test DataECL0001Image0001;ECL_00000001.tif;2
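A minimal Python sketch of taking an IM line apart, following the layout shown for this slide. The function name and dictionary keys are invented for illustration, and reading `D` as the flag for a document break is an assumption based on the Boundary label.

```python
def parse_im_line(line):
    """Split an LFP IM line into named parts."""
    command, image_key, boundary, page, location = line.rstrip("\r\n").split(",", 4)
    if command != "IM":
        raise ValueError("not an IM line")
    # The last comma-delimited field is itself semicolon-delimited.
    volume, folder, filename, file_type = location.split(";")
    return {
        "image_key": image_key,
        "boundary": boundary,          # 'D' in the sample line; assumed to flag a document break
        "page": int(page),
        "volume": volume.lstrip("@"),  # empty when no volume is used
        "folder": folder,
        "filename": filename,
        "file_type": file_type,
    }

im_sample = "IM,ECL_00000001,D,1,@;Eclipse Smoke Test DataECL0001Image0001;ECL_00000001.tif;2"
print(parse_im_line(im_sample))
```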
The DLF
Document and Page Level; Single and Multi Page
Can load Metadata, Natives, Text, Images, Tags, and
Annotations
Document Breaks Indicated
Encoding: ASCII / UTF8 / UTF16
Primarily used to move data between Ipro products
An “All-In-One” load file
Delimiters: ASCII<24>, ASCII<25>
DLF Basic Commands
Command | Description
ANOT | Embedded Text, Highlights, Markups
DBFD | Database field definition
DBRC | Database record (new doc, metadata)
PAGE | Loads a page
TNBD | Document level tag
RDAC | Redaction
Components of a DLF
DBFD<24>BegDoc<24>EndDoc
Database Field Definition<24>Field1Header<24>Field2Header
DBRC<24>ECL0000001<24>ECL0000006
Database Record<24>FieldValue<24>FieldValue
Components of a DLF
PAGE<24>ECL0000001<24>Eclipse Smoke Test DataECL0000001.tif<24>1<24>0<24>
PAGE<24>ImageKey<24>ImagePath<24>PageNumber<24>Rotation<24>StartOfDocument*
* = Start Of Document optional.
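To make the DBFD/DBRC pairing concrete, here is a small Python sketch that turns DLF lines into one dictionary per record. It assumes `<24>` in the slides stands for the single ASCII 24 character; the function name is invented for the example, and other commands (PAGE, ANOT, and so on) are simply skipped.

```python
DELIM = chr(24)  # assumed meaning of the <24> placeholder

def read_dlf_records(lines):
    """Pair each DBRC row with the most recent DBFD header row."""
    headers, records = [], []
    for line in lines:
        command, *values = line.rstrip("\r\n").split(DELIM)
        if command == "DBFD":
            headers = values
        elif command == "DBRC":
            records.append(dict(zip(headers, values)))
    return records

dlf_sample = [
    DELIM.join(["DBFD", "BegDoc", "EndDoc"]),
    DELIM.join(["DBRC", "ECL0000001", "ECL0000006"]),
]
print(read_dlf_records(dlf_sample))
```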
Custom Delimited Files
Document Level
Metadata, Text, Native Paths
Document Breaks Indicated
Encoding: ASCII / UTF8 / UTF16
Common Extensions: TXT, LST, CSV, ASCII
Commonly used to overlay coded data
Delimiters: Can be almost anything
Metadata Load Files
• Always include FIELD HEADERS
• BEGDOC should be the first column
• Use UNC PATHS when loading natives
• Use DELIMITERS not commonly found in your data
• Leverage the GRID VIEW PREVIEW during import
• Make sure your DATES are formatted as expected
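Some of these checks are easy to script before an import. The sketch below is one possible approach, not a feature of any review tool: the field names `NativePath` and `DateSent`, the date format, and the sample rows are all assumptions chosen for illustration.

```python
from datetime import datetime

def check_metadata_rows(header, rows, native_field="NativePath",
                        date_field="DateSent", date_format="%m/%d/%Y"):
    """Return a list of problems found in already-parsed load file rows."""
    problems = []
    if not header or header[0].upper() != "BEGDOC":
        problems.append("BEGDOC is not the first column")
    for number, row in enumerate(rows, start=1):
        record = dict(zip(header, row))
        native = record.get(native_field, "")
        if native and not native.startswith("\\\\"):
            problems.append(f"row {number}: native path is not a UNC path")
        date = record.get(date_field, "")
        if date:
            try:
                datetime.strptime(date, date_format)
            except ValueError:
                problems.append(f"row {number}: unexpected date format {date!r}")
    return problems

meta_header = ["BEGDOC", "NativePath", "DateSent"]
meta_rows = [
    ["ECL0000001", r"\\server\share\ECL0000001.msg", "05/01/2019"],
    ["ECL0000007", r"E:\natives\ECL0000007.msg", "2019-05-01"],
]
for problem in check_metadata_rows(meta_header, meta_rows):
    print(problem)
```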
Image Load Files
• Avoid Volumes (LFP)
• Use Relative Paths when producing to
another party
• Validate your Image Options when importing
Load Files Overall
• Use Advanced text editors (Notepad++, TextPad,
UltraEdit)
• Split files into manageable sizes
• Keep production requirements in mind
Benefits and Drawbacks
Format | Benefit | Drawback
CSV | Simple, easy to create and modify | Common delimiters; can’t use commas/quotes in data
OPT | Easy to understand document bounds | Must be page-level (even if images are multipage)
LFP | Versatile; load images, annotations, natives, and tags | Volumes can be tricky if used
DLF | Versatile, unique delimiters | Difficult to create manually due to unique delimiters
DAT | Most common format; widely accepted | Nothing, everyone loves a nice DAT
Rules
• 3 Meme templates
• Write your captions on scratch paper
• Write your name down!
• We will read the captions anonymously
• You get to vote for your favorites!
Scenario: Production QC
You receive a production volume with a DAT, OPT and LFP.
It also contains images, text, and native files. Your job is to
validate the production volume and make sure it is “Good
to Go”.
What is your process?
The production is located here: Desktop WorkshopDay 2 - Wednesday -
May 1stDATs, LFPs and OPTs - Oh MyILUVLDFLZ001
Scenario: Production QC
Applications Needed:
Windows File Explorer + Right Click > Properties
Notepad++ or another file editor
___ = OPT Line Count
___ = OPT Doc Count
___ = LFP Line Count
___ = LFP Doc Count
___ = DAT Doc Count
___ = DAT Native Path Count
___ = DAT Text Path Count
___ = Files in TEXT folder
___ = Files in NATIVES
___ = Files in IMAGES
Scenario: Production QC
118 = OPT Line Count
24 = OPT Doc Count
118 = LFP Line Count
24 = LFP Doc Count
25 = DAT Doc Count
0 = DAT Native Path Count
25 = DAT Text Path Count
25 = Files in TEXT folder
25 = Files in NATIVES
122 = Files in IMAGES
What did you find wrong with the production?
Scenario: Production QC
What did you find wrong with the production?
• OPT line count ≠ # of files in IMAGES folder
• OPT doc count ≠ DAT doc count or # of text or natives
• No Native Path in DAT
• No ending delimiter in the DAT
• Anything else wrong?
Let’s fix this production!!!
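Once the counts are collected, the comparison itself can be scripted. The Python sketch below uses the numbers from the slide; which counts ought to match is an assumption here (one OPT line per image file, one text file and one native per document), and the function name is made up.

```python
def find_count_mismatches(counts):
    """Return readable descriptions of count pairs that should match but do not."""
    pairs = [
        ("OPT Line Count", "Files in IMAGES"),
        ("OPT Line Count", "LFP Line Count"),
        ("OPT Doc Count", "LFP Doc Count"),
        ("OPT Doc Count", "DAT Doc Count"),
        ("DAT Native Path Count", "Files in NATIVES"),
        ("DAT Text Path Count", "Files in TEXT folder"),
    ]
    return [
        f"{a} ({counts[a]}) != {b} ({counts[b]})"
        for a, b in pairs
        if counts[a] != counts[b]
    ]

slide_counts = {
    "OPT Line Count": 118,
    "OPT Doc Count": 24,
    "LFP Line Count": 118,
    "LFP Doc Count": 24,
    "DAT Doc Count": 25,
    "DAT Native Path Count": 0,
    "DAT Text Path Count": 25,
    "Files in TEXT folder": 25,
    "Files in NATIVES": 25,
    "Files in IMAGES": 122,
}
for problem in find_count_mismatches(slide_counts):
    print(problem)
```

Counts only show that something is off. Finding which image keys are missing still means comparing the keys in the load files against the file names on disk.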
Scenario: Incorrect File Paths
The OPT file paths are incorrect (i.e. the path is relative and incomplete).
How do you replace the path prior to import into review?
If this issue were in our DAT file paths, what precautions should you take?
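One way to script the OPT fix is to touch only the ImagePath field instead of running a blind find and replace over the whole file. This is a sketch under assumptions: the root path, the function name, and the sample relative path are hypothetical, and the comma split assumes no commas inside paths.

```python
def prefix_opt_paths(lines, root):
    """Prepend root to the ImagePath (third) field of each OPT line."""
    fixed = []
    for line in lines:
        if not line.strip():
            continue  # skip blank lines
        parts = line.rstrip("\r\n").split(",")
        parts[2] = root.rstrip("\\") + "\\" + parts[2].lstrip("\\")
        fixed.append(",".join(parts))
    return fixed

relative_opt = ["ECL_00000001,ECL0001,IMAGES\\0001\\ECL_00000001.tif,Y,,,6"]
print(prefix_opt_paths(relative_opt, r"\\server\share\VOL001"))
```

Limiting the change to one field is the same precaution that matters for a DAT, where the text being replaced may also occur in metadata values.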
Scenario: No Load File? No Problem
It appears you received a bunch of image files for one volume (002), but no load files.
You need to load the images, and decide to create an LFP file.
Hint: the first line will look something like this:
IM,ImageKey,D,1,FilePath;Filename.tif;2
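A hedged Python sketch of generating such lines from a list of image paths. It copies the hint's layout literally, including the page value 1 and file type 2, which are assumptions to verify against a known-good LFP. Document breaks cannot be recovered from file names alone, so the optional `doc_starts` argument (a name invented here) lets you supply them; without it every image is treated as its own document.

```python
import ntpath  # Windows-style path handling, regardless of where the script runs

def build_lfp_lines(image_paths, doc_starts=None):
    """Build one IM line per image, following the hint's layout."""
    lines = []
    for path in sorted(image_paths):
        folder, filename = ntpath.split(path)
        image_key = ntpath.splitext(filename)[0]
        is_start = doc_starts is None or image_key in doc_starts
        boundary = "D" if is_start else ""
        lines.append(f"IM,{image_key},{boundary},1,{folder};{filename};2")
    return lines

# Hypothetical file names for volume 002.
images = [r"002\IMAGES\ECL_00000002.tif", r"002\IMAGES\ECL_00000001.tif"]
for line in build_lfp_lines(images, doc_starts={"ECL_00000001"}):
    print(line)
```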
Scenario: Regular Expressions
If you didn’t initially notice, the DAT is missing something. Every text
field should begin and end with a specific delimiter. This DAT file does
not contain ending delimiters for the last field or beginning delimiters for
the first field.
How can you fix this via find and replace?
^: Start of Line
$: End of Line
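The same anchors work outside a text editor. The Python sketch below adds a þ at the start and end of every non-empty line that lacks one. It assumes þ is the quote character and that the text uses plain `\n` line endings (Python's default text mode converts Windows line endings on read); the function name is invented.

```python
import re

QUOTE = "\u00fe"  # þ

def add_missing_quotes(text):
    """Add the quote character at line start and line end where it is missing."""
    # ^ matches at the start of each line; insert þ unless one is already there.
    text = re.sub(rf"^(?!{QUOTE})(?=.)", QUOTE, text, flags=re.MULTILINE)
    # $ matches at the end of each line; insert þ unless one is already there.
    text = re.sub(rf"(?<=.)(?<!{QUOTE})$", QUOTE, text, flags=re.MULTILINE)
    return text

broken = "ECL0000001\u00fe\x14\u00feECL0000006\n\u00feA\u00fe\x14\u00feB\u00fe\n"
print(add_missing_quotes(broken))
```

In an editor with regular expression search, the equivalent is two passes: find `^` and replace with þ, then find `$` and replace with þ.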
Scenario: No NativePath, No Problem!
Our production contains native files but the DAT does not contain the
native path in order to load the natives. The native files are named after
their BEGDOC.
You need to create a load file with the BegDoc and Native Path fields.
How do you create the load file?
Scenario: No NativePath, No Problem!
Applications Needed:
• Command Prompt
• Notepad++ or another file editor
Step 1: Open Command Prompt and navigate to location of native files
Step 2: Directory Listing is SO BAD!
Step 3: Text Editors and Regular Expressions are
Load Files
What do they do? What do they look like?
Why do I need them? When do I need them?
What if I don’t have one?
Probably the most commonly used/requested load file
OPT tells the application the path to find the images.
The purpose of this session is to provide a deeper understanding of some of the most common industry load files. We will focus on breaking down the syntax and use cases of different types of load files. We will walk attendees through specific scenarios to demonstrate best practices when manipulating or QCing the load files in their deliverables.
Including field headers makes mapping fields much easier – you don’t have to guess which fields you’re mapping
Begdoc first helps you understand what document you’re looking at
Uncommon delimiters = lower chance of messing up the load
Grid view lets you see how your data will line up
Volume translation can get messy, and only one volume can be configured at a time during import when using find/replace
Partial paths avoid exposing your file structure to the opposing party
Don’t accidentally delete all your annotations!
Advanced text editors allow you to use regular expressions
Manageable load files (<200 MB or even smaller) are easier to reload when edits are needed
With all this in mind, don’t forget your requirements. Don’t provide a CSV when they’re asking for a DAT
Begdoc first – records are easy to identify if you need to read the load file for any reason (troubleshooting, etc.)
UNC paths – who likes mapped drives anyway? They save you from having to reload or modify paths after the fact
You receive a production volume. The volume contains DAT and LFP load files. You are asked to make sure “all of the data is there”. How do you go about this?
Attendees will use counts and image key numbers to compare against the files on disk for Natives, Images, and Text.
You are given an OPT load file, but the file paths are NOT usable (i.e. the path is relative and incomplete). How do you replace the path prior to import into review?
Can be useful for certain scenarios where find and replace in review software cannot accomplish this.
We had a load file mapped to E: as the drive letter. Someone did a mass replace on just “E:” and wound up replacing every metadata instance where “E:” appeared; i.e., almost every email reply subject line (“RE: ”) now had our drive path in it.
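A column-scoped replace avoids that accident. The sketch below rewrites a leading drive letter only in one named DAT field and leaves every other field alone. It assumes ASCII 20 and þ as delimiters and a header row on the first line; the function name, field names, and sample paths are hypothetical.

```python
FIELD = "\x14"    # assumed ASCII 20 field delimiter
QUOTE = "\u00fe"  # þ

def replace_in_column(lines, column, old, new):
    """Replace a leading `old` with `new` in one named column only."""
    header = [name.strip(QUOTE) for name in lines[0].rstrip("\r\n").split(FIELD)]
    index = header.index(column)
    out = [lines[0].rstrip("\r\n")]
    for line in lines[1:]:
        fields = line.rstrip("\r\n").split(FIELD)
        value = fields[index].strip(QUOTE)
        if value.startswith(old):
            value = new + value[len(old):]
        fields[index] = QUOTE + value + QUOTE
        out.append(FIELD.join(fields))
    return out

def dat_row(values):
    """Helper for the example: wrap values in quotes and join with the field delimiter."""
    return FIELD.join(QUOTE + value + QUOTE for value in values)

dat_lines = [
    dat_row(["Subject", "NativePath"]),
    dat_row(["RE: Budget", r"E:\natives\ECL0000001.msg"]),
]
for line in replace_in_column(dat_lines, "NativePath", "E:", r"\\server\share"):
    print(line)
```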
Delimiters at end of lines – Regular Expressions to fix:
You are given a DAT load file to load into review. However, the load file doesn’t read properly in your review software and you receive errors when trying to import. Upon reviewing the DAT, you find the delimiters are missing at the end of the line. How can you fix this with a single find & replace?
Search & Replace using a basic regular expression.
You are given a deliverable with load files and natives and text, but the load file doesn’t contain the nativepath.
The native files are named after their BEGDOC. You have loaded the available metadata and text but your native files are still not loaded.
You need to create a CSV load file with BegDoc, NativePath fields. How do you create a load file for the files on disk?
Use Command Prompt’s DIR command with the output redirected to a file (dir /s /o /b /a-d > file.txt) to get a list of file paths, then edit it to add BegDoc.
This uses regular expressions and column select.
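The editing step can also be scripted. The Python sketch below takes the lines of a bare directory listing and writes a two-column CSV, deriving BegDoc from each file name without its extension. The function name and the sample UNC paths are hypothetical, and it relies on the natives being named after their BEGDOC as described above.

```python
import csv
import io
import ntpath  # Windows-style path handling

def build_native_loadfile(listing_lines):
    """Turn full file paths into BegDoc,NativePath CSV text."""
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["BegDoc", "NativePath"])
    for path in listing_lines:
        path = path.strip()
        if not path:
            continue  # skip blank lines in the listing
        begdoc = ntpath.splitext(ntpath.basename(path))[0]
        writer.writerow([begdoc, path])
    return out.getvalue()

listing = [
    r"\\server\share\NATIVES\ECL0000001.msg",
    r"\\server\share\NATIVES\ECL0000007.xlsx",
]
print(build_native_loadfile(listing))
```

Running DIR from a UNC location, rather than a mapped drive, keeps the resulting NativePath values usable without a later path replace.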