An introduction to some concepts of best practices with digital images to assist aquatic biologists in their analyses. Given at BIO in February 2012 to accompany the DFO Technical Report 2962. http://waves-vagues.dfo-mpo.gc.ca/waves-vagues/search-recherche/display-afficher/344780
Managing image data for aquatic sciences - the best practices presentation
1. Claude Nozères
Science Branch, Québec Region
Fisheries and Oceans Canada
Maurice Lamontagne Institute
claudenozeres@gmail.com
2. Overview
1. introduction: the guide (Tech. Rep. 2962)
2. image data: what is it about?
3. captures: preparations
4. metadata: why all the bother?
5. workflows: recipes for work
6. exports: archives & publishing
7. trends: comments on new tech.
8. questions: findings on the tour so far
afternoon: software demos & discussions
3. 1. Introduction: background
Personal experiences – taking digital photos of aquatic life since 2001
needed to document prey samples for marine mammals, and film wasn’t doing a good job
became aware of mixed information among users
○ frustrations were common when using either consumer or industrial tools
○ by sharing experiences, our work may become easier, and better quality image data is produced
4. Introduction: guide & tour
DFO’s National Image Data Management (NIDM) Working Group
fall 2010: began a ‘best practices guide’ to assist employees with their imaging work
mid-Dec. 2011: published the first full version: Nozères. Tech. Rep. 2962, now online (WAVES)
Jan-Feb. 2012: tour of regions to introduce guide
○ the hope is that each site will then do a follow-up, with advanced workshops, for their needs
5. Objectives: this talk
1. to introduce a sample of common, but perhaps misunderstood concepts in image data
2. to learn about your experiences with gear and software so we can share this with others in DFO
note: will also try to include latest information, not in the guide
Headline: happy marine biologist
Keywords: scene, joke, smurf
Location: Belle-Isle
Category: personal
6. 2. Image data – basic types
still image data (photo)
huge availability in consumer devices
well-established for industry & science but finicky?
moving image data (video)*
often consumer-oriented (family videos)
industrial applications: pricey and finicky?
information (metadata)
2008-08-07 12:02:09...
data about the image data
*note: video is not discussed in this brief introduction – see guide for information
7. Why talk of files as ‘data’?
just another pretty picture, or an aquatic species observation?
information
clearly visible subjects
Keywords: harbour seal, rock
Location: Sainte-Luce
Date: Sept. 8, 2009
8. Image data: perceptions
‘If really so useful, we should all be doing it!’
may end up generating stacks of fuzzy, dateless, unknown files = frustration
‘I have enough science data to deal with’
images as data may not be taken seriously
‘I don’t have time for more requirements!’
learning about image data may be viewed as a time-waster instead of a work-saver
9. 3. Capturing image data
camera settings
format (file type)
quality (lossy compression)
size (dimensions....)
special topic: geotagging (GPS data)
10. Camera settings: file formats
JPG (8-bit) is the default, or the only option, for many cameras
good, but ‘baked’ (limited for image editing)
RAW (10, 12, 14-bit) for advanced cameras
requires post-processing with RAW software
sometimes capture both: ‘RAW+JPG’
○ view JPG right away, store RAW for later edits
TIF is occasionally available (8 or 16-bit)
microscope & tethered cameras, scanners
good choice for image analysis (16-bit)
‘baked’ like JPG: harder to correct for whiteness (white balance)
11. Camera settings: ‘quality’ (for JPG)
Lossy compression: how much detail is to be discarded in JPG?
Select quality: ‘basic, good, fine, v.fine’ = low to high quality
Lossless compression: no data loss, no need to set ‘quality’ (RAW, TIF)
12. Settings: channels & bits
Channels
Grayscale has 1 channel (black)
RGB (for screen) has 3: Red, Green, Blue
CMYK (for print) has 4: Cyan, Magenta, Yellow, blacK
Bits: levels, or gradation ‘steps’ (in each channel)
1-bit = 2^1 = 2 values, on/off, black or white (like a fax)
8-bit = 2^8 = 256 values for tones (gray or colour images)
10, 12, 14, 16-bit = 1,024 to 65,536 tone levels
note: most monitors only display in 8-bit
even if you can’t see it, the data is there for analysis
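The bit-depth arithmetic on this slide can be checked in a few lines. This is only a sketch; `tone_levels` is a hypothetical helper name, not something from the guide.

```python
def tone_levels(bits: int) -> int:
    """Number of distinct tone values one channel can hold at a given bit depth."""
    return 2 ** bits

# Print the levels per channel for the bit depths mentioned on the slide.
for bits in (1, 8, 10, 12, 14, 16):
    print(f"{bits:>2}-bit: {tone_levels(bits):>6} levels per channel")
```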
14. Why more bits matter
high-bit RAW & TIFF files have more tones
important in image analysis for feature (subject) discrimination, like plankton in a water sample
○ 16-bit grayscale may be preferred over 8-bit colour
the extra information enables powerful software editing (recover detail in light and dark areas)
○ JPG 8-bit can also be edited, but less dramatically
○ TIF at 8-bit has the same limits (16-bit allows more)
○ RAW is >8-bit (e.g., 10-14)
Note: colour scanners may refer to 24-bit or 48-bit (3x8 or 3x16)
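One way to see why extra bits help when recovering dark areas is to count how many distinct values a smooth shadow gradient can use. The sketch below assumes a gradient covering only the darkest 1/16 of the tonal range; the function name and that range are illustrative choices, not from the guide.

```python
def distinct_codes(bits: int, low: float = 0.0, high: float = 1 / 16,
                   samples: int = 10_000) -> int:
    """Count the distinct integer codes used by a smooth ramp between
    low and high (both on a 0..1 brightness scale) at a given bit depth."""
    max_code = 2 ** bits - 1
    codes = set()
    for i in range(samples):
        value = low + (high - low) * i / (samples - 1)
        codes.add(round(value * max_code))
    return len(codes)

print("8-bit shadow steps: ", distinct_codes(8))
print("16-bit shadow steps:", distinct_codes(16))
```

With so few steps available at 8-bit, brightening the shadows stretches them into visible bands, while the 16-bit version still has thousands of steps to work with.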
15. Settings: white balance
auto-white balance may be accurate, but
sometimes better when set to conditions:
sunny, cloudy, shade, incandescent, fluorescent
JPG & TIF are ‘processed’ files with their ‘whiteness’ (white balance) set at capture
like a ‘Polaroid’ instant photo: limited edits
RAW has metadata suggesting the setting, but it is not fixed: can redo after capture
similar to a film negative: ‘reprocess it’
16. Camera settings: white balance
default capture: RAW file under fluorescent lights
corrected file for white
Background should be white – clicked on it with a correction tool and white balance was adjusted
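The 'click on something that should be white' correction can be sketched as a per-channel scaling. This is a simplified illustration only: the sample values are made up, the function names are hypothetical, and real RAW software works on the sensor data in a more involved way.

```python
def white_balance_gains(reference_rgb):
    """Per-channel gains that make a sampled 'should be white' patch neutral.
    Assumes all three channel values are non-zero."""
    r, g, b = reference_rgb
    target = max(r, g, b)  # scale the weaker channels up to the strongest one
    return (target / r, target / g, target / b)

def apply_gains(pixel, gains, max_value=255):
    """Apply the gains to one RGB pixel, clipping at the 8-bit maximum."""
    return tuple(min(max_value, round(c * k)) for c, k in zip(pixel, gains))

# Hypothetical sample: a white background that came out greenish.
clicked_patch = (200, 220, 180)
gains = white_balance_gains(clicked_patch)
print(apply_gains(clicked_patch, gains))
```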
18. Camera settings – size
Image resolution (figure labels): web; 2 MP (1600x1200); 5 MP (2600x1900)
web or small: good for onscreen viewing
large (2 to 5 MP): good for regular prints
full-size (usually about 8 to 16 MP): archives
note: RAW is usually a full-size capture
Why choose less than ‘full-size’?
digital zoom (like cropping) sometimes handy
situations when a large image is a burden
○ documenting labels, geotagging, emailing
○ caution: set back to full-size afterwards
19. Size: pixels vs. files
Settings for size (or resolution) are about image dimensions (how many megapixels, MP), not the computer file size in kilobytes or megabytes (KB, MB)
example: blank test image, 2 MP
1600 (across) x 1200 (high) pixels = 2 MP
but file size will vary by format & compression
JPG with high compression = small file (68 KB)
TIFF with no compression = large file (5800 KB)
○ TIF with lossless compression of this image = small file (70 KB)
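The difference between pixel dimensions and file size can be reproduced roughly for the blank test image. The sketch assumes 8-bit RGB and uses Python's zlib as a stand-in lossless compressor. It is not the compressor a TIFF file would use, so the compressed size will not match the slide's figure.

```python
import zlib

width, height = 1600, 1200
channels, bytes_per_channel = 3, 1            # assumption: 8-bit RGB

megapixels = width * height / 1_000_000       # image dimensions
raw_bytes = width * height * channels * bytes_per_channel  # uncompressed pixel data

blank = bytes([255]) * raw_bytes              # all-white image, pixel data only
packed = zlib.compress(blank)                 # lossless compression
assert zlib.decompress(packed) == blank       # round trip: nothing was lost

print(f"{megapixels:.2f} MP")
print(f"uncompressed: {raw_bytes / 1000:.0f} KB")
print(f"lossless:     {len(packed) / 1000:.1f} KB")
```

A blank image compresses extremely well; a detailed photo of the same dimensions would shrink far less.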
20. Size: dimensions vs. density
on the computer: resizing is increasing or decreasing the number of pixels (dimensions)
800x600: smaller (fewer pixels); 1600x1200: original; 3600x2400: upsized (more pixels)
but sometimes we say we ‘resize’ for print
really just setting pixel density (dots per inch: dpi)
image size (number of pixels) has not changed
smaller dots: 300 dpi (print viewing)
larger dots: 72 dpi (screen viewing)
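Setting the density only changes how large the same pixels are printed. A minimal sketch of that arithmetic follows; `print_size_inches` is a hypothetical helper name.

```python
def print_size_inches(width_px: int, height_px: int, dpi: float):
    """Physical size of an image at a given pixel density; pixel count is unchanged."""
    return (width_px / dpi, height_px / dpi)

# Compare the same 1600x1200 image at print and screen densities.
for dpi in (300, 72):
    w, h = print_size_inches(1600, 1200, dpi)
    print(f"1600x1200 at {dpi} dpi: {w:.1f} x {h:.1f} inches")
```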
21. Of sensor sizes & megapixels
Sensor size: physical dimensions (mm)
SLR cameras have large sensors
compact cameras have tiny image sensors
Photosites: density of sites on the sensor
two cameras may have the same resolution, but the 12 megapixels of the SLR are over a much wider area (the larger sensor) than the 12 megapixels on a small-sensor compact
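To put numbers on the comparison, the approximate spacing of photosites can be estimated from sensor area and pixel count. The sensor dimensions below are assumed typical values (an APS-C SLR sensor of about 23.6 x 15.7 mm and a 1/2.3-inch compact sensor of about 6.17 x 4.55 mm). They are not taken from the guide.

```python
from math import sqrt

def photosite_pitch_um(sensor_w_mm: float, sensor_h_mm: float, megapixels: float) -> float:
    """Approximate photosite spacing in micrometres, assuming square sites
    that tile the whole sensor."""
    area_mm2 = sensor_w_mm * sensor_h_mm
    return sqrt(area_mm2 / (megapixels * 1_000_000)) * 1000

slr = photosite_pitch_um(23.6, 15.7, 12)       # assumed APS-C dimensions
compact = photosite_pitch_um(6.17, 4.55, 12)   # assumed 1/2.3-inch dimensions
print(f"SLR:     {slr:.2f} um per photosite")
print(f"compact: {compact:.2f} um per photosite")
```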
22. Sensor sizes
[Figure: sensor sizes compared. Medium format & full-frame 35mm (20-80 MP, 12-24 MP): niche markets ($$), slow development. Smaller sensors (12-24 MP, 5-16 MP): most common, versatile, extremely competitive, intense development; new Canon G1X, new Nikon 1.]
23. Sensor sizes
35 mm & Medium format (& larger) are useful in aerial surveys (e.g., marine mammals)
extreme level of fine, clean detail & tones
great for distances; macro work is trickier, bulky
most biology work is done with compacts or smaller (APS) SLRs (simpler, easier to use)
compacts for macro work: many can do 0-10 cm
getting ‘pretty good’ results: use software processing to beat physical limits, reduce noise
○ not ‘fakery’ but sometimes undesired (see example later)
26. Capture: Geolocation
some cameras have internal GPS to embed coordinates & the date in the correct time zone
mostly in still cameras, but also some video (rare)
○ note: smartphones geotag both photos & videos
other cameras can have their images tagged with external data using, for example:
1) geotagged image at same location (e.g. smartphone)
2) GPS track and timestamp of image
○ note: image file must have correct clock time
○ tip: take a photo of the time on a GPS screen, then examine that photo’s capture time info to determine the correction/adjustment for the camera clock
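The clock tip amounts to one subtraction. In the sketch below the times are invented for illustration and the function names are hypothetical. Time zones are ignored, so both values are assumed to be in the same zone.

```python
from datetime import datetime, timedelta

def clock_offset(gps_time_shown: datetime, photo_capture_time: datetime) -> timedelta:
    """Correction to add to camera timestamps so they match GPS time."""
    return gps_time_shown - photo_capture_time

def corrected(capture_time: datetime, offset: timedelta) -> datetime:
    """Apply the correction to any other photo from the same camera."""
    return capture_time + offset

# Hypothetical example: the GPS screen in the photo reads 14:05:10,
# but the camera stamped that photo 14:02:40 (camera is 2 min 30 s slow).
offset = clock_offset(datetime(2012, 2, 1, 14, 5, 10), datetime(2012, 2, 1, 14, 2, 40))
print(offset)
print(corrected(datetime(2012, 2, 1, 15, 0, 0), offset))
```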
27. Geolocation – image tagging
Smartphone map (shows AIS)
Smartphone photo (tagged with GPS)
Camera with telephoto lens
(but no GPS)
28. Geolocation – image tagging
Smartphone photo (tagged with GPS)
Keywords: ship, transport
Location: Sainte-Flavie
Category: personal
load into the geotagging software the tagged photo with untagged photos taken from the same location
SLR zoom photo (geotagged w/phone image)
29. Geolocation – GPS track sync
record a GPS track log on an external device while taking camera images
later, download images and the GPS track into geotagging software
the capture time of the photo will be used to determine its position at that time on the GPS track (‘sync’)
embeds the coordinates into image file
NOTE: this is an example of image data information (metadata), and not about image quality
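The 'sync' step can be sketched as a lookup of the photo's capture time along the track, interpolating between the two nearest track points. This is an illustration of the idea under stated assumptions, not how any particular geotagging program works. The track points and the function name are made up.

```python
from bisect import bisect_left
from datetime import datetime

def position_at(track, when):
    """track: list of (datetime, lat, lon) tuples sorted by time.
    Returns (lat, lon) linearly interpolated at 'when', or None if
    the time falls outside the track."""
    if not track or when < track[0][0] or when > track[-1][0]:
        return None
    times = [t for t, _, _ in track]
    i = bisect_left(times, when)
    if times[i] == when:
        return track[i][1], track[i][2]
    (t0, lat0, lon0), (t1, lat1, lon1) = track[i - 1], track[i]
    f = (when - t0) / (t1 - t0)  # fraction of the way between the two points
    return lat0 + f * (lat1 - lat0), lon0 + f * (lon1 - lon0)

# Hypothetical track: two points one minute apart.
track = [
    (datetime(2012, 2, 1, 12, 0, 0), 48.0000, -68.0000),
    (datetime(2012, 2, 1, 12, 1, 0), 48.0010, -68.0020),
]
print(position_at(track, datetime(2012, 2, 1, 12, 0, 30)))
```

The result is only as good as the camera clock, which is why the capture time must be correct before syncing.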
30. 4. Image (file) metadata tags
why the fuss over metadata?
we may do ‘tagging’ in order to be able to locate, use, and credit the image files using the tags
where is the image metadata?
camera files have well-known, standard places to store this special text information
other image data, or non-standard information, may be entered in catalog files in a database system
do I need to manually add all these tags?
some are automatically included by the camera, such as date, time, camera model (and GPS, if available)
31. Metadata tags: suggestions
Common fields for tagging images:
Filename: unique name (e.g., date-####.JPG)
Title: name for photo (but often for ID #)
Headline: short phrase about content
Description: more info about content
Keywords: species name, subject
Location: place or station name
Creator: photographer’s name
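The field list above can be turned into a small consistency check. The sketch assumes a plain dictionary as the tag record and an eight-digit date in the filename. Both helper names are hypothetical, and no metadata is written to any file here.

```python
from datetime import date

# Field names follow the suggested list on the slide.
SUGGESTED_FIELDS = ("Filename", "Title", "Headline", "Description",
                    "Keywords", "Location", "Creator")

def image_filename(capture_date: date, sequence: int, ext: str = "JPG") -> str:
    """Build a 'date-####' style name from the capture date and a counter."""
    return f"{capture_date:%Y%m%d}-{sequence:04d}.{ext}"

def missing_fields(tags: dict) -> list:
    """List the suggested fields that are absent or empty in a tag record."""
    return [f for f in SUGGESTED_FIELDS if not tags.get(f)]

tags = {
    "Filename": image_filename(date(2011, 10, 14), 1387),
    "Headline": "Arctic isopods",
    "Keywords": ["Saduria sabini"],
    "Creator": "Claude Nozères",
}
print(tags["Filename"])
print("still to fill in:", missing_fields(tags))
```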
32. Tag example
Filename: 20111014_IMG_1387.JPG (useful, but not often done)
Title (catalog no.): 9682
Headline (quick description): Arctic isopods
Description/Caption (text on paper label): Hand-collected Mesidotea sabini from Causeway at low tide, held in an aquarium for one day
Keywords: Saduria sabini
Location: Frobisher Bay site 9
Creator: Claude Nozères
33. Good: added metadata tags can be as you like
Bad: added metadata tags can be as you like
Try to follow examples of others, e.g. IPTC, MWG, the DAM book (some rules exist, but most are open-ended)
Example:
Creator: unknown
Posted on blogs since 2010
Was able to find it using the
visible text in a Google search
How would you tag this image?
Title? Caption? Keyword?
Make sure your metadata makes sense to users
36. Metadata: retaining & reading
Older or simpler software may be unaware of metadata
may strip away camera metadata (capture date, etc.)
Not all image browsing software plays fair
Apple, Microsoft, and Google are all competing to make easy-to-use, popular tools
sometimes do hidden & proprietary processing ‘for your benefit’ (automatically), which may be to the detriment of ‘industry-standard’ metadata tags
recent examples: face recognition (all), geotagging (Windows Live), stripping of current tags (IPTC) with retired fields (Apple Aperture, iPhoto)
37. Metadata: summary for use
basic fields are easily read by most
advanced fields may be handy in projects
custom fields are available, but make sure
your users are aware of their existence
key lessons:
1) adopt a style and be consistent
2) let your users know what to expect
3) be vigilant for software behavior
38. 5. Image data workflows
can we do editing and tagging without worrying about how it works?
people want ‘recipes’, or workflows
see guide no. 2962 for some examples
image data protocol examples
○ case studies for different work scenarios in aquatic sciences
image data software examples
○ practical examples using software tools
39. Guide workflows: for discussion
the guide is not a fixed set of rules
rather it is a list of suggestions from recent work
which ways of working may be easier (& better) than others?
source: XKCD
42. Quote overheard yesterday*
“How do you love Photoshop? Like someone
loves their wife,...or their cousin...or?”
“I love Photoshop like people love their kids –
no way to get rid of it, so I have to love it”
*Macworld Podcast – Less than Perfect: App Design
43. Image data work: a tale of 2 tools
Adobe Photoshop (PS)....20+ years
classic tool for editing and....everything!
○ most folks only use it for a few tasks
Adobe Lightroom (LR)....5+ years
revolutionary workflow tool, now matured
○ ‘95%’ of my photo work is now done inside LR
○ extra functions available with shareware plugins
Newsflash! Jan. 2012 – LR Public Beta 4:
video editing, geotagging, photobooks
44. Image data work: managing tools
Browsers: ‘find’ your images on a workstation
Windows Explorer (default – very limited)
Google Picasa (easy, basic, free)
Photoshop Bridge (full browser & metadata editor)
Photoshop Elements Organizer (new: object searching)
Cataloger & image editor
Adobe Lightroom (workstation, not network use)
Catalogers
Phase One Media Pro (workstation; free catalog reader)
Daminion, Canto Cumulus (network/server)
demonstrations this afternoon (bring your laptop)
45. 6. Exporting: final work stages
After capturing, tagging, editing images,
we want to:
store the originals & edits (archiving)
distribute copies (publishing)
46. Exporting: archives
Ideally, this is about final edits in best quality with metadata tags that are stored securely in multiple locations and media
This is an area that NIDM is working on: how to consolidate and preserve.
Large projects are likely good, but smaller ones may need advice
3-2-1 approach is recommended (Krogh)
have 3 copies (original & 2 backups in rotation)
store on 2 kinds of media (hard drive, DVD)
keep 1 off-site (not all stored in the same place)
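The 3-2-1 rule is easy to state as a check on an inventory of copies. The record structure below (a list of dictionaries with `media` and `offsite` keys) is an assumption made for this sketch, as are the location names.

```python
def meets_3_2_1(copies) -> bool:
    """copies: list of dicts with 'media' (str) and 'offsite' (bool) keys.
    True when there are at least 3 copies, on at least 2 kinds of media,
    with at least 1 copy kept off-site."""
    return (
        len(copies) >= 3
        and len({c["media"] for c in copies}) >= 2
        and any(c["offsite"] for c in copies)
    )

# Hypothetical inventory for one project.
inventory = [
    {"where": "workstation", "media": "hard drive", "offsite": False},
    {"where": "lab shelf", "media": "DVD", "offsite": False},
    {"where": "other building", "media": "hard drive", "offsite": True},
]
print(meets_3_2_1(inventory))
```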
47. Exporting: galleries & print
may send re-sized versions:
800 pixel 72 dpi JPG is fine for web galleries, and especially for email
more pixels, at 150-300 dpi, for print (the density is important for clear prints)
for public viewing on web, review the file metadata & edit if desired
location, names, comments may be seen
edit in DAM (Bridge, MediaPro, LR)
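The target dimensions for a web copy can be worked out before exporting. A minimal sketch, assuming '800 pixel' refers to the long edge; `fit_long_edge` is a hypothetical helper and does no actual image processing.

```python
def fit_long_edge(width: int, height: int, long_edge: int = 800):
    """Scale dimensions so the longer side equals long_edge, keeping the
    aspect ratio. Images already small enough are left unchanged."""
    scale = long_edge / max(width, height)
    if scale >= 1:
        return width, height  # never upsize for web export
    return round(width * scale), round(height * scale)

print(fit_long_edge(1600, 1200))
print(fit_long_edge(1200, 1600))
print(fit_long_edge(640, 480))
```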
48. Publishing – web
CaRMS (Canadian Register
of Marine Species)
- online taxonomic resource with editors
- also has a user-added image gallery
- see Kennedy et al. Tech. Report
- note: camera metadata is visible
[screenshot labels: metadata added on website; camera metadata]
49. Publishing – web
DFO has several image gallery projects
Coast Guard, SLGO, CaRMS, CMB, others?
Groups may join a large, existing gallery
Flickr is very popular and handles some metadata
used by EOL, BHL, GBIF
50. 7. Trends: new camera types
before, chose either a digicam or an SLR
small device & average images, or big rig & great images
there was a demand for quality and compactness at the same time
‘mirrorless interchangeable lens’: MILC
Panasonic, Olympus, Sony, Nikon, Pentax
2011: new disruptive trends in compacts
‘retro-style’: Olympus Pen, Fujifilm X100, X10...
‘ultra-modern’ camera phones: iPhone 4S
2012: light field (Lytro) – ‘refocus anytime’
52. New tech: changing the game
editing software
new camera types
high-sensitivity sensors (lowlight)
solid-state memory (‘flash’)
cheap hard drives
network storage (‘cloud’, e.g., Dropbox)
tablets & tactile displays (iPad, Cintiq)
not just fashion: new types may lead to better image data and much improved workflow (easier & faster)
53. New tech: science benefits
lowlight sensors: reduce need to carry lights
fewer noisy, blurry (slow shutter) shots
compacts: easier to carry & use
capture events more often in the field
SSD: insensitive to ship vibration, magnets
use on underwater towsleds, aerial surveys
large drives: save all, do backups
don’t bother to delete or waste $$ time reviewing
cloud services: share files with colleagues
don’t burden email with huge attachments
tablets: field guides, rapid data entry & review
54. Newer is not always better
Late 2011 DPReview test: indoors w/flash photo
Pentax had long line of WP cameras, but recent models not good indoors
Sony & Panasonic are new entrants, but are giving much better files
[test image labels: clean detail; mushy when indoors; clean detail]
55. Teleost Aug. 2011
new Pentax Optio when used indoors: mushy photo, hard to identify
Canon Powershot: clean detail, easier to identify small organisms
56. Resources – websites
The Luminous Landscape – practical opinions
The DAM Book forum – “real DAM answers”
dpBestflow.org – best practices & workflows
JISC Digital Media – advice & examples
Digital Photography Review (dpreview.com)
WHOI HabCam – underwater photo
SERPENT project – underwater video
CaRMS Photogallery – species images
57. Resources – books
The DAM Book, 2nd edition, Krogh
Photoshop CS5 and Lightroom 3: A Photographer’s Handbook, Laskevitch
Adobe Photoshop Lightroom 3: The Missing FAQ, Bampton
Photographic Multishot Techniques, Steinhoff & Steinhoff
The VueScan Bible, Steinhoff
On Digital Photography, Johnson
58. Resources – documents (PDF)
GBIF Community Site: Best Practices Manuals
Federal Agencies Digitization Guidelines Initiative (FADGI), Still Image Working Group
Metadata Working Group (MWG)
IPTC Image Metadata Handbook
Establishing best practices for marine biological data, Seeley et al. 2008, COWRIE
CaRMS photogallery user guide, Kennedy et al. 2011. DFO Tech. Rep. 2933
60. Obj. 2: learning – Sault Ste. Marie
Otolith microscopy w/Image Pro (5 MP)
good file naming, 3-2-1 storage; might try tagging
Scanning historical slides of activities (size?..)
all notes are entered in filename – need to rethink this
Underwater video for lamprey control (volume?)
proprietary DVR: take video feed over RCA & capture
Underwater dam inspection using a 2 m pole
want live view & record; suggest using 2 dif. cameras
Photo folder on local server (8 GB)
do temp. catalog to browse, then do perm. catalog
61. Obj. 2: learning – Nanaimo
Otoliths: want to overlay 2 images, & dots
need 3rd party tools (Photoshop, ImageJ)
Import prior analyses (keywords into LR)
LR plugin (Syncomatic: based on filenames)
Reading catalog without full software
LR: not usually. ExMedia/Media Pro: Yes
Can we use alternative ingestion (import) tools?
Yes, Photo Mechanic, Ingestamatic may be useful for high volume, batch file entry (e.g., marine mammal surveys)
Easy way to get started using tools like LR?
Various resources – our guide is an example, but we still need a forum or other place to post experiences and tips
62. Obj. 2: learning – Burlington
Q’s
does DFO have a site licence for this software?
how to distribute a catalog on the network?
can I use custom annotation fields in a catalog?
what kind of scanner to archive histology slides?
Flowcam produces a composite of plankton shots in sample: how to manage?
Nikon imaging microscope produces custom files – how to manage?
How to transfer hierarchical folder names into annotation fields?
...and more!
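On the folder-name question, one possible approach is to split each image's path below the archive root into a list of candidate keywords. This sketch is not an answer given in the talk. It only produces the list, the example path is invented, and getting the keywords into a catalog depends on the software used.

```python
from pathlib import PurePosixPath

def keywords_from_path(image_path: str, root: str) -> list:
    """Return the folder names between the archive root and the image file,
    outermost first, as candidate keywords."""
    relative = PurePosixPath(image_path).relative_to(PurePosixPath(root))
    return list(relative.parent.parts)

# Hypothetical folder layout: year / survey / sample type.
print(keywords_from_path("/photos/2011/Teleost/benthos/IMG_1387.JPG", "/photos"))
```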
63. Obj. 2: learning – St. Andrews
geomatics & video lab: of screens & mice
had a quality monitor, but not great for viewing charts or when using a mouse and keyboard to trace habitats at same time as viewing images
solution: use the right display for different work:
1) HDTV for video (1920x1080 pixels)
2) 27in NEC for photos (2560x1600 pixels)
3) 24in tactile display (Wacom Cintiq) for tracing habitat classifications (more efficient)
64. Obj. 2: learning – St. Andrews
reusing legacy & custom equipment
big HDV camcorder, with $30K UW housing
○ don’t want to buy a new camera & $$$ housing
solution: HDMI video out to flash memory cards
result: instant digital video (no tape playback to import), and higher quality (original video capture, not compressed to fit HDV tape)
65. Obj. 2: learning – St. John’s
need a place to obtain and learn more
want workshops, website forums....CMB?...
exchange files with remote fishery observers
receive and send feedback on species ID images
cloud computing seen as a solution (Dropbox)
have to enable software updates
older software versions (>3 yrs.) are not aware of current metadata and image file standards
want access to image files for regional guides
other regions may do ID books, want to do it here
Editor's notes
compare to octopus and isopod images: as long as you know what to look for and where to look for info, the search will succeed