TraitCapture: NextGen Monitoring and Visualization from seed to ecosystem
1. TraitCapture: NextGen Software and
Hardware for Scaling from Seeds to Traits
to Ecosystems
Tim Brown, Research Fellow, Borevitz Lab
ARC Centre for Plant Energy Biology, Australian National University
Chuong Nguyen, Joel Granados, Kevin D. Murray, Riyan
Cheng, Cristopher Brack, Justin Borevitz
2. Terraforming
“To alter the environment of a planet to make it capable of
supporting terrestrial life forms.”
We are currently unterraforming the earth at an exceptionally fast rate
To meet the challenges of the coming century we need to restore and
re-engineer the environment to support >7 billion people for the next
100 years in the face of climate change while maintaining biodiversity
and ecosystem services
These ecological challenges are too hard to be solved
with existing data and methods
3. Genotype x Environment = Phenotype
The degree to which we can measure all three components
is the degree to which we can understand plant and
ecosystem function
FIELDLAB
4. Outline: Phenomics challenges
• Lab:
• Measure phenotypes with high precision across large natural
populations in varied growth environments
• Identify the genetic basis of traits of interest
• Identify novel, cryptic traits
• Field:
• Monitor phenotype and environment at high precision across scales
from plant to ecosystem to identify natural variation on the landscape
Conservation: Ecosystem stability / plasticity (how should we spend
limited conservation $$)
Restoration: Using existing plasticity and population genetic variation
to select seeds for building “climate ready” populations and assisted
migration (reforestation, etc.)
5. Outline: General challenges
(1) Processing and managing big data
• We used to be primarily limited by data collection (hardware)
• Now we are increasingly limited by data processing and curation (software)
• We need “Excel” for big data
6. And how do you do science if you can’t even download your data?
7. Outline: General challenges
(2) Optimizing the knowledge discovery network
• Data sharing, open access and open source are of major importance
for solving research problems:
• Research dollars are poorly spent when they produce closed
data and firewalled journal articles, yet we all aspire to publish
our best work in journals that refuse access to the public.
• We have serious problems to solve in this decade: This is a network
optimization problem
• Open source matters! – The rate of knowledge discovery is
determined by how efficiently we can share data, tools and new
knowledge.
8. Lab vs field phenotyping
Lab: High precision measurement and control but low realism
youtu.be/d3vUwCbpDk0
9. Lab vs field phenotyping
Field: Realistic environment but low precision measurements
In the field we have real environments but the complexity (and bad lighting!) reduces
our ability to measure things with precision
youtu.be/gFnXXT1d_7s
11. Lab phenotyping
Normal lab growth conditions aren’t very “natural”
Kulheim, Agren, and Jansson 2002
Real World
Growth Chamber
12. Growth cabinets with dynamic “semi-realistic” environmental &
lighting conditions
• Grow plants in simulated regional/seasonal conditions & simulate climate
• Control chamber light intensity, spectra (8/10-bands), Temp/Humidity @ 5min
intervals
• Expose “cryptic” phenotypes
• Repeat environmental conditions
• Between studies and collaborators
• Simulate live field site climate
Lab Solution: SpectralPhenoClimatron (SPC)
Spectral response of Heliospectra LEDs. (L4A s20: 10-band)
13. TraitCapture: Open-source phenotyping pipeline
• Phenotype 2,000 plants (7 Conviron chambers) in real-time
• 14 DSLRs (2/chamber), controlled by Raspberry Pi computers
• 4-12 JPG + RAW images/hr during daylight
• Automated analysis pipeline: phenotype data from 150,000 pot
images a day
• Automated Phenotypes
• Area
• Diurnal movement
• Color (RGB, Gcc, etc)
• Perimeter, Roundness
• Compactness, Eccentricity
• Upcoming:
• Leaf Count
• Leaf tracking
• Leaf length/width/petiole
• Machine learning
Brown, Tim B., et al. (2014). Current Opinion in Plant Biology 18: 73-79.
[Figure panels: Original, Corrected and Segmented pot images; GWAS of Area]
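The automated phenotypes listed above (area, perimeter, roundness, eccentricity, Gcc) can be sketched from a segmented pot image as follows. This is a minimal NumPy illustration, not the TraitCapture pipeline code: the function names are hypothetical, roundness is assumed to mean 4πA/P², eccentricity is taken from the covariance of pixel coordinates, and Gcc is assumed to be G/(R+G+B) averaged over the plant pixels.

```python
import numpy as np


def shape_traits(mask):
    """Area, perimeter, roundness and eccentricity of a boolean plant mask.

    Assumed definitions (the real pipeline may differ):
      roundness    = 4 * pi * area / perimeter**2
      eccentricity = sqrt(1 - minor/major) from the pixel covariance eigenvalues
    """
    mask = np.asarray(mask, dtype=bool)
    area = int(mask.sum())
    if area == 0:
        raise ValueError("mask contains no plant pixels")

    # Perimeter: count pixel edges that separate plant from background.
    padded = np.pad(mask, 1, constant_values=False)
    perimeter = int(
        (padded[1:, :] != padded[:-1, :]).sum()
        + (padded[:, 1:] != padded[:, :-1]).sum()
    )
    roundness = 4.0 * np.pi * area / perimeter ** 2

    # Eccentricity from the spread of plant pixel coordinates.
    eccentricity = 0.0
    if area >= 2:
        rows, cols = np.nonzero(mask)
        cov = np.cov(np.vstack([rows, cols]))
        minor, major = np.sort(np.linalg.eigvalsh(cov))
        if major > 0:
            eccentricity = float(np.sqrt(max(0.0, 1.0 - minor / major)))

    return {
        "area": area,
        "perimeter": perimeter,
        "roundness": float(roundness),
        "eccentricity": eccentricity,
    }


def green_chromatic_coordinate(rgb, mask):
    """Gcc = G / (R + G + B), using channel sums over the masked pixels."""
    pixels = np.asarray(rgb, dtype=float)[np.asarray(mask, dtype=bool)]
    totals = pixels.sum(axis=0)  # summed R, G, B over the plant
    return float(totals[1] / totals.sum())
```

Areas and perimeters here are in pixels; converting to mm² needs a per-camera scale, which is not covered in this sketch.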
15. The current resolution of field ecology is very limited
• Low spatial & time resolution data
• Limited sensors; don’t capture local spatial variation
• Sampling is often manual and subjective
• Observations not interoperable or proprietary; little or no data sharing
• Sample resolution is “Forest” or “field” not Tree or Plant
• Very little data from 20th-century ecology is available for reuse
The lab is not the real world
16. The challenge – “Measure everything all the time”
How do we go from doing the science at
the scale of one point per forest to
multilayer data cubes for every tree or
leaf?
17. Tech revolutions are driving data revolutions
• Computation
• Small fast and cheap (Raspberry Pi) and Huge fast and cheap (cloud)
• Unlimited storage
• Unlimited processing
• All comes down to pipelines and data management
• Many of the actual computational problems are “solved” or could be with reasonable effort.
• Network
• Ubiquitous internet is huge
• Lab is now in the field (i.e. cloud computational resources available remotely)
• Field is in the lab via AR/VR and 3D
• Mobile computing – your phone is a supercomputer
• 1.5-2x the network bandwidth of MODIS
• The computing power of a supercomputer from 20 years ago
• 4000x the RAM of the Space Shuttle
• 3D
• 3D reconstruction from static and moving cameras
• LiDAR and LightField
• Robotics – automated monitoring and field sampling; Drones/UAVs
• Machine learning / Deep learning / AI – processing huge datasets
18. Huge data crunching isn’t impossible
• Google didn’t exist 17 yrs ago and now it indexes 30
trillion web pages (and 500hrs of new video per minute)
• 1.8 billion (mostly geolocated) images are uploaded to
social media every day (2014; was 500m in 2013) [1]
• Consider: 75% of cars may be self-driving by 2040 [2] –
continuously imaging, laser scanning and 3D modelling
their immediate environment: 6.2 billion miles [3] of
roadside environments in US, imaged in 3D daily!
• Google Street View already has imaged 5 million miles of
roads in 3D
We need this level of resolution (and Google-like tools) for
ecological knowledge
[1] Meeker, 2013, 2014
19. Cloud computing and automation can do amazing things…
2015 Paper: Time-lapse mining from internet photos
• Mined 86 million public geolocated online photos (Flickr, Picasa)
• Clustered 120K different landmarks
• Computed 755K 3D reconstructions.
• 10,728 time-lapses from 2942 landmarks, that contain more than
300 images
• Including a 3-D time-lapse reconstruction of the retreat of the
Briksdalsbreen Glacier in Norway from 9,400 images over a 10-year
time-span.
Martin-Brualla R, Gallup D, and Seitz SM. 2015.
Time-lapse mining from internet photos. ACM
Trans Graph 34: 62.
26. Low cost sequencing lets us genotype every individual tree and identify genetic loci that correlate
with observed phenotypic differences between trees.
We can do this for all trees at the arboretum within view of the camera.
Fall Color change shows differing rates of fall senescence in trees
Late fall
Brown, T.B., et al. (2012). High-resolution, time-lapse imaging for ecosystem-scale phenotyping
in the field. In: High Throughput Phenotyping in Plants. Methods in Molecular Biology.
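One way to turn the fall color change above into a per-tree number is to find the day on which a tree's greenness (Gcc) has dropped halfway from its seasonal peak to its autumn minimum. The sketch below is an illustration under that assumption; the function name, the 50% threshold and the linear interpolation are choices made here, not the method of the cited paper.

```python
import numpy as np


def senescence_day(days, gcc, fraction=0.5):
    """Day at which Gcc first falls to `fraction` of its peak-to-trough drop.

    days : sample times (e.g. day of year), increasing
    gcc  : greenness index for one tree at those times
    Returns None if the series never declines after its peak.
    """
    days = np.asarray(days, dtype=float)
    gcc = np.asarray(gcc, dtype=float)

    peak = int(np.argmax(gcc))
    trough = peak + int(np.argmin(gcc[peak:]))
    threshold = gcc[trough] + fraction * (gcc[peak] - gcc[trough])

    for i in range(peak + 1, trough + 1):
        if gcc[i] <= threshold:
            # Interpolate linearly between the samples either side of the crossing.
            drop = gcc[i - 1] - gcc[i]
            weight = (gcc[i - 1] - threshold) / drop if drop > 0 else 0.0
            return float(days[i - 1] + weight * (days[i] - days[i - 1]))
    return None


if __name__ == "__main__":
    doy = [250, 260, 270, 280, 290]
    early_tree = [0.40, 0.38, 0.34, 0.32, 0.32]
    late_tree = [0.40, 0.40, 0.38, 0.34, 0.32]
    print("early:", senescence_day(doy, early_tree))
    print("late: ", senescence_day(doy, late_tree))
```

Differences in this date between trees are the kind of phenotype that could then be tested against genotype; real series would need smoothing first to cope with lighting noise.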
27. Gigavision hardware evolution
• 2009
• Custom-built system with robotic servos, DSLRs,
hand wired with mini PC, 0.5 deg accuracy
• 40 minutes / panorama (1.5 gigapixels)
• Jan 2016
• Off-the-shelf Axis PTZ camera (Q6128) with $40 Raspberry Pi
computer running Python code
• 4K resolution PTZ with 700 degree/sec rotation
and 0.2 deg accuracy sensor
• 2 gigapixel panorama in < 5 min
• SMS/Slack alerts if system offline
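To give a feel for what the Pi has to do, the sketch below plans a pan/tilt grid for a panorama and estimates how many frames a 2 gigapixel image needs. It is a hypothetical illustration, not the Gigavision code: the function names, the 25% overlap and the assumption that "4K" means 3840 x 2160 pixels are all choices made here.

```python
import math


def plan_panorama(pan_range_deg, tilt_range_deg, hfov_deg, vfov_deg, overlap=0.25):
    """Return (pan, tilt) frame centres covering the requested angular range.

    Frames overlap by `overlap` (a fraction of the field of view) so they can
    be stitched; rows are visited in a serpentine order to limit camera travel.
    """
    step_pan = hfov_deg * (1.0 - overlap)
    step_tilt = vfov_deg * (1.0 - overlap)
    cols = max(1, math.ceil((pan_range_deg - hfov_deg) / step_pan) + 1)
    rows = max(1, math.ceil((tilt_range_deg - vfov_deg) / step_tilt) + 1)

    positions = []
    for r in range(rows):
        col_order = range(cols) if r % 2 == 0 else reversed(range(cols))
        for c in col_order:
            positions.append((hfov_deg / 2 + c * step_pan,
                              vfov_deg / 2 + r * step_tilt))
    return positions


def min_frames(total_pixels, frame_width, frame_height):
    """Lower bound on frame count, ignoring overlap between frames."""
    return math.ceil(total_pixels / (frame_width * frame_height))


if __name__ == "__main__":
    frames = min_frames(2_000_000_000, 3840, 2160)
    print(frames, "frames minimum for 2 gigapixels")
    print(round(300 / frames, 2), "seconds per frame to finish within 5 minutes")
```

With overlap the real frame count is higher, so the per-frame time budget (move, settle, focus, capture) is tighter than this lower bound suggests.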
28. Visualization and analysis (future)
• Current challenge is in visualizing and processing the data
• NGINX image server – stream unlimited resolution images to
any device
• Cloud-backed processing and stitching (university super
computer resources or Amazon cloud)
• Machine learning to detect individuals and phenotypes
• Visualization tools (same as for pot images) – output growth
curves for thousands of trees
• Gigapan viewer demo
• http://bit.ly/gv-tif1 (downsizing a 500MP tif on the fly)
• Player demo
• Old version: Gigavision.org
• New (beta): http://bit.ly/gigavisionV1
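Viewers of this kind usually avoid sending the full image by serving it from a pyramid of progressively halved copies cut into tiles. The sketch below shows the bookkeeping only; it is a generic illustration with hypothetical function names and a 256-pixel tile size chosen here, and it is not a description of how the NGINX image server above is implemented.

```python
import math


def pyramid_levels(width, height, tile=256):
    """List (width, height, tile_count) per level, full resolution first.

    Each level halves the previous one until the image fits in one tile.
    """
    levels = []
    w, h = width, height
    while True:
        levels.append((w, h, math.ceil(w / tile) * math.ceil(h / tile)))
        if w <= tile and h <= tile:
            break
        w, h = max(1, math.ceil(w / 2)), max(1, math.ceil(h / 2))
    return levels


def level_for_viewport(levels, viewport_width):
    """Index of the coarsest level that is still at least as wide as the viewport."""
    for i in reversed(range(len(levels))):
        if levels[i][0] >= viewport_width:
            return i
    return 0  # viewport is wider than the full image: use full resolution


if __name__ == "__main__":
    levels = pyramid_levels(40_000, 12_500)  # a 500 MP image, dimensions assumed
    print(len(levels), "levels;", levels[0][2], "tiles at full resolution")
    print("level for a 1920 px wide screen:", level_for_viewport(levels, 1920))
```

The point of the design is that a phone and a desktop both fetch only the few tiles that cover their screen at the zoom they are viewing.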
29. Gigavision pros and cons
• Pros
• Turn-key always-on automated monitoring
• Monitor huge areas (if you have a tower or a good hill)
• High resolution time-series of everything in your field site
(including ephemerals)
• Cons
• Data transfer issues if you don’t have good internet
• May be overkill if a DSLR image or phenocam provides sufficient
resolution (e.g. tree-level phenology)
• Data extraction pipeline still in beta
• Best hardware solution requires lots of power (30-50watts)
30. UAVs (drones) for monitoring
• $2-4K airframe (DJI, Aeronavics, senseFly) + 10-20MP digital
camera (~500g – 5KG payload)
• Processing software ($700 - 2,000 USD: Pix4D; Agisoft)
• 3D models of field site (cm resolutions)
• Orthorectified image and map layers
• LAS / point cloud data
• Automated pipeline:
• Tree Height; Volume, foliage density (?)
• RGB color
• GPS location
• DEM of site
• Typically RGB
• Other layers:
• NDVI (MicaSense)
• Hyperspectral
• Thermal
View 3D model online: http://traitcapture.org/pointclouds
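Two of the raster layers named in the UAV outputs above reduce to simple per-pixel arithmetic once the imagery is orthorectified. The sketch below assumes co-registered NumPy arrays; the function names are hypothetical, NDVI is the standard (NIR - Red) / (NIR + Red), and canopy height is assumed to be a surface model minus a bare-ground terrain model.

```python
import numpy as np


def ndvi(nir, red):
    """Normalised difference vegetation index; NaN where NIR + Red is zero."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


def canopy_height(surface_model, terrain_model):
    """Height above ground per pixel, clipped so noise cannot go below zero."""
    diff = np.asarray(surface_model, dtype=float) - np.asarray(terrain_model, dtype=float)
    return np.clip(diff, 0.0, None)
```

Both functions assume the input layers share the same grid; resampling and alignment are the harder part in practice and are left out here.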
31. Software outputs: DEM and point cloud data
• Processing script for tree data (Python):
• GPS, Height, 3D volume, top-down area, RGB phenology data
• Straight to Google Maps online
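The per-tree outputs listed above can be approximated from a tree's points alone. The sketch below is a hypothetical illustration, not the lab's processing script: it assumes the points for one tree have already been separated from the cloud, that coordinates are in metres, and that counting occupied grid cells and voxels is an acceptable estimate of top-down area and volume.

```python
import numpy as np


def tree_metrics(points, ground_z=0.0, cell=0.25):
    """Height, top-down area, voxel volume and xy centroid for one tree.

    points   : (N, 3) array of x, y, z coordinates in metres
    ground_z : ground elevation under the tree (e.g. sampled from the DEM)
    cell     : grid / voxel edge length in metres (an assumed resolution)
    """
    pts = np.asarray(points, dtype=float)
    height = float(pts[:, 2].max() - ground_z)

    # Top-down area: number of occupied xy grid cells times the cell area.
    xy_cells = np.unique(np.floor(pts[:, :2] / cell).astype(int), axis=0)
    area = len(xy_cells) * cell ** 2

    # Volume: number of occupied voxels times the voxel volume.
    voxels = np.unique(np.floor(pts / cell).astype(int), axis=0)
    volume = len(voxels) * cell ** 3

    centroid = pts[:, :2].mean(axis=0)
    return {
        "height_m": height,
        "area_m2": float(area),
        "volume_m3": float(volume),
        "centroid_xy": (float(centroid[0]), float(centroid[1])),
    }
```

The results depend on the chosen cell size and on point density, so values are only comparable between flights processed with the same settings.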
32. 3D Point clouds online: http://Phenocam.org.au
Up next, re-sort 3D tree data by provenance, size, etc
33. Drones – Pros and cons
• Pros
• Nadir view
• Wide coverage (km’s)
• Larger airframes can carry big payload (~5kg for larger
airframes) for advanced imaging (thermal, hyperspec, etc)
• Time-series point clouds and 3D models of field site
• Outputs can match conventional satellite data for comparison
• Cons
• Requires operator and site visit (can’t fly itself yet)
• Limited time-series and weather dependent
• Regulations and cost (for site visits)
• Processing pipelines not fully turn-key
(and not that cheap)
34. Ultra-high resolution ground-based laser
• DWEL (CSIRO); Zebedee (handheld; $25K LiDAR)
• Multiband LiDAR with full point returns
• DWEL: ~30 million points in a 50 m² area (vs 5-10 pts/m² for typical airborne)
Data: Michael.Schaefer@csiro.au
36. VR and AR
• Virtual Reality (VR) and Augmented Reality (AR) will
radically change how we interact with our data
• VR (Oculus, VIVE, Morpheus) allows you to immerse
people in imaginary space
• AR (MS HoloLens, Magic Leap) allows you to add virtual
content to the real world
Search: “Magic Leap Wired”
37. Augmented / Mixed Reality
• Add holograms to the existing world that can be seen by
anyone
• Microsoft HoloLens
• Magic Leap
• Estimated value 2015: $500 million USD; 2016: $2.4 billion
[Images: Minecraft on the HoloLens; Magic Leap promo image]
40. This is just the beginning
Atari 2600 “Adventure” circa 1980
“Skyrim” circa 2011
We are at the “ATARI” stage in VR
In 10 years, VR/AR will be
indistinguishable from reality.
What will you do with this tool?
41. Important things to consider for monitoring
• Pick the right tool for the job
• What do you really need to measure?
• What is the lowest time and visual resolution you can get
away with?
• How often does it happen? (minute or monthly resolution?)
• How many pixels do you need to detect it?
• For new tech – how do you ground-truth?
• Phenotyping hardware is just sampling some stuff from the
world – the trick is understanding how what the sensor sees
relates to a signal of biological importance (and when it
doesn’t)
• This seems obvious but important to think about when playing
with shiny new toys
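The "how many pixels do you need?" and "how often does it happen?" questions above can be answered roughly before buying hardware. The sketch below uses simple pinhole geometry; the function names and the example numbers are illustrative assumptions, not recommendations.

```python
import math


def pixels_on_target(object_size_m, distance_m, hfov_deg, image_width_px):
    """Approximate number of pixels spanning an object seen face-on.

    Uses the width of the scene at the given distance divided by the image
    width to get metres per pixel, then divides the object size by that.
    """
    footprint_m = 2.0 * distance_m * math.tan(math.radians(hfov_deg) / 2.0)
    metres_per_pixel = footprint_m / image_width_px
    return object_size_m / metres_per_pixel


def samples_per_event(event_duration_h, sample_interval_h):
    """How many samples fall inside an event of the given duration."""
    return event_duration_h / sample_interval_h


if __name__ == "__main__":
    # Example: a 5 cm leaf, 200 m away, 30 degree lens, 6000 px wide image.
    print(round(pixels_on_target(0.05, 200, 30, 6000), 2), "px across the leaf")
    # Example: a two-week flowering event sampled once a day.
    print(samples_per_event(14 * 24, 24), "images during the event")
```

If the first number is below a few pixels, or the second below a handful of samples, the sensor is unlikely to resolve the trait no matter how the images are processed.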
Brown, Tim B et al, (2016) Using phenocams to monitor our changing Earth:
towards a global phenocam network. Frontiers in Ecology and the Environment. Vol
14, Issue 2 (March 2016).
42. Example “NextGen” Field site:
National Arboretum Phenomic & Environmental Sensor Array
National Arboretum, Canberra, Australia
ANU Major Equipment Grant, 2014; ANU MEC 2016
Collaboration with:
• Cris Brack and Albert Van Dijk (ANU Fenner school); Borevitz Lab
43. National Arboretum Phenomic & Environmental Sensor Array
• Ideal location
• 5km from ANU (64 Mbps wifi) and near many research institutions
• Forest is only ~4 yrs old
• Chance to monitor it from birth into the future!
• Great site for testing experimental monitoring systems prior to
more remote deployments
44. National Arboretum Sensor Array
• 20-node Wireless mesh sensor network (10min sample interval)
• Temp, Humidity
• Sunlight (PAR)
• Soil Temp and moisture @ 20cm depth
• µm-resolution dendrometers on 20 trees
• Campbell weather stations (baseline data for verification)
• Two Gigapixel timelapse cameras:
• Leaf/growth phenology for > 1,000 trees
• LIDAR: DWEL / Zebedee
• UAV overflights (bi-weekly/monthly)
• Georectified image layers
• High resolution DEM
• 3D point cloud of site in time-series
• Sequence tree genomes
Environment
Phenotype
Genetics
46. “The missing heritability is on your hard drive”
• The challenge is no longer to gather the data, the challenge is how we do science with the data
once we have it
• A sample is no longer a data point
• Gigavision – Hourly time-series of every tree is just pixels not “data” until you quantify something
• Example: Soil Moisture
• 5min intervals @ 20 locations, 6 months of data
• The spatial variation is what is interesting... Artifact or signal?
Soil Moisture @ 20 sensor locations
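A first step toward the "artifact or signal?" question is to separate what all sensors share (rain and drying across the site) from what is persistent to each location. The sketch below is one simple way to do that under stated assumptions: readings arrive as a time-by-sensor array, the network median stands in for the shared signal, and the function name is hypothetical. It cannot say whether a persistent offset is calibration error or real soil variation; that still needs ground truth.

```python
import numpy as np


def split_spatial_variation(readings):
    """Decompose sensor readings into shared, persistent and residual parts.

    readings : (n_times, n_sensors) array, NaN for missing samples
    Returns a dict with
      shared            : network median at each time step
      offset            : each sensor's mean departure from the network median
      residual          : what remains after removing both
      persistent_share  : fraction of spatial variance explained by the offsets
    """
    r = np.asarray(readings, dtype=float)
    shared = np.nanmedian(r, axis=1, keepdims=True)
    anomaly = r - shared                      # departure from the network
    offset = np.nanmean(anomaly, axis=0)      # constant per-sensor part
    residual = anomaly - offset               # time-varying local part

    total_var = np.nanvar(anomaly)
    share = float(np.nanvar(offset) / total_var) if total_var > 0 else float("nan")
    return {
        "shared": shared[:, 0],
        "offset": offset,
        "residual": residual,
        "persistent_share": share,
    }
```

A sensor with a large offset but a small residual tracks the site faithfully from a different baseline; a sensor with a large residual is behaving differently over time and is the one to inspect first.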
47. EcoVR: Virtual 3D Ecosystems Project
GIS for 3D “time-series data”
• Goal:
• Use modern gaming software to explore new methods for
visualizing time-series environmental data
• Historic and real-time data layers integrated into persistent 3D
model of the national arboretum in the Unreal gaming engine
• Collaboration with
• ANU Computer Science Dept. TechLauncher students
• Stuart Ramsden, ANU VISlab
49. Thanks and Contacts
Justin Borevitz – Lab Leader Lab web page: http://borevitzlab.anu.edu.au
• Funding:
• Arboretum ANU Major Equipment Grant
• ARC Centre of Excellence in Plant Energy Biology | ARC Linkage 2014
• Arboretum
• http://bit.ly/PESA2014
• Cris Brack, Albert VanDijk, Justin Borevitz (PESA Project PI’s)
• UAV data: Darrell Burkey, ProUAV
• 3D site modelling:
• Pix4D.com / Zac Hatfield Dodds / ANUVR team
• Dendrometers & site infrastructure
• Darius Culvenor: Environmental Sensing Systems
• Mesh sensors: EnviroStatus, Alberta, CA
• ANUVR Team
• Zena Wolba; Alex Jansons; Isobel Stobo; David Wai [2015/16 Team]
• Yuhao Lui, Zhuoqi Qui, Abishek Kookana, Andrew Kock, Thomas Urwin [2016/7 Team]
• TraitCapture:
• Chuong Nguyen; Joel Granados; Kevin Murray; Gareth Dunstone; Jiri Fajkus
• Pip Wilson; Keng Rugrat; Borevitz Lab
• Gareth Dunstone; Jack Adamson; Jordan Braiuka
• Contact me:
• tim.brown@anu.edu.au
• http://bit.ly/Tim_ANU
Code: http://github.com/borevitzlab
50. Links to open
• Gigapan demo
• https://traitcapture.org/test-gigapan?ARB-GV-HILL-1/ARB-GV-HILL-1.tif
• https://traitcapture.org/test-gigapan?ARB-GV-HILL-1/ARB-GV-HILL-1-april10.tif
• Black mountain: http://gigapan.com/gigapans/154507
• Player demo
• https://traitcapture.org/timestreams/by-id/577c7868f7f5660be205ffd0
• Map
• https://www.google.com/maps/d/u/0/edit?mid=1CYARFsRGTvszPKqCaiBW-tib3nQ
• Plant timestream
• https://traitcapture.org/timestreams/by-id/57722b4cf7f566640959c908