This document summarizes TULIPP, an EU-funded project to develop ubiquitous low-power image processing platforms. The project brings together 8 partners over 3 years to work on hardware, operating systems, tools and use cases for medical imaging, automotive and drone applications. It aims to define a reference platform using Xilinx SoCs, develop a real-time low-power image processing OS, and build a toolchain called STHEM to support the hardware and OS. The medical and automotive use cases will integrate and validate the platform, reducing the radiation dose in surgical X-ray imaging and adding pedestrian detection to cars. An advisory board and ecosystem are being developed to guide the project and promote adoption.
TULIPP H2020 Project presentation @ FPGA Network: Implementing Machine Vision with FPGA and SoC Platforms
1. This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 688403
www.tulipp.eu
TULIPP
Place :
Date :
By: Flemming CHRISTENSEN, Sundance, UK
Towards Ubiquitous Low-Power Image Processing Platforms
2. 8x Partners – 20+ Engineers – 3 years
• Thales: Coordinator & Medical use case
• Sundance: Hardware
• Hipperos: Operating system
• Synective Labs: ADAS use case
• Efficient Innovation: Management
• Fraunhofer IOSB: UAV use case
• Ruhr Universität Bochum: FPGA tools
• NTNU: Performance tools
3. How will we do it?
• WP7: Management, Coordination
• LABEL: Marketing, Ecosystem and Pre-normalisation
• WP6: IP protection, Dissemination, Communication, Advisory Board and Exploitation preparation
• WP1: Reference platform definition (Interfaces & implementation Rules)
• Instantiations
• WP2: Hardware
• WP4: Programming Toolchain
• WP3: Runtime, API, Libraries & OS
• WP5: Use case description, integration and platform validation (feedback)
8. Low-Power Image Processing RTOS
OS needs:
- high reliability
- low power
- hard real-time
- high performance
This kind of RTOS is not (yet) available today: current GPOSes (e.g. Linux) and RTOSes each lack one or more of the required features or the required performance.
Specific image processing needs:
- support for the hardware accelerators
- the libraries needed for image processing
9. STHEM: The TULIPP Tool-chain
Support uTilities for Heterogeneous Embedded image processing (STHEM)
Status:
• Xilinx SDSoC has been extended to support the current platform
• Support for HIPPEROS OS is underway
Insights:
• Significant effort has been invested into the development of vendor tools
• STHEM fills the productivity gaps between existing tools
Development and Mapping:
• Supports development for all platform components
• Maps source files of the application to the appropriate tool chain
• Retrieves OS configuration from the developer
Runner:
• Boots the OS with the selected configuration (if needed due to changed configuration)
• Updates files (binaries, bitfiles, etc.)
• Initialises the reconfigurable logic (if needed)
• Starts the application with the requested instrumentation
Analyser:
• Analyses performance results and presents findings to the developer
10. Medical imaging use case
[Diagram: image processing chain before and after TULIPP]
Before TULIPP: THALES flat panel detector → raw image → THALES Processing Unit → GigE-Vision + Msg → customer system (UI, CI / ICS)
After TULIPP: THALES flat panel detector with a Nano Processing Unit inside the detector (based on an SoC-based small-form-factor board) → GigE-Vision + Msg → customer system (UI, CI / ICS)
Other diagram label: TDLP
11. Medical imaging use case
• Real-Time X-Ray imaging for surgery
• Reduce radiation dose by 75%
• Add noise removal processing with critical real-time constraints
13. Unmanned Aerial Vehicle (UAV) use case
• Performs real-time stereo depth estimation for obstacle / collision avoidance on a UAV, i.e. detects obstacles in the direction of flight
• Based on dual cameras
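The slide does not say how depth is computed from the two cameras. As a minimal sketch, assuming a rectified, calibrated stereo pair, the standard pinhole relation Z = f · B / d converts a disparity d (pixels) into a depth Z (metres), given a focal length f (pixels) and a camera baseline B (metres). The function names, parameters and the safety-distance check below are hypothetical illustrations, not part of the TULIPP implementation.

```c
#include <stddef.h>

/* Depth in metres from a stereo disparity, assuming rectified cameras.
 * Hypothetical helper: returns -1.0 when the disparity is not positive,
 * i.e. when no valid match was found for that pixel. */
double depth_from_disparity(double focal_px, double baseline_m,
                            double disparity_px)
{
    if (disparity_px <= 0.0) {
        return -1.0;
    }
    return focal_px * baseline_m / disparity_px;
}

/* Counts the pixels of a disparity map whose depth is closer than
 * min_safe_m. A collision-avoidance layer could use such a count to
 * decide whether something lies in the direction of flight
 * (illustrative assumption only). */
size_t count_close_pixels(const float *disparity, size_t n,
                          double focal_px, double baseline_m,
                          double min_safe_m)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        double z = depth_from_disparity(focal_px, baseline_m, disparity[i]);
        if (z > 0.0 && z < min_safe_m) {
            count++;
        }
    }
    return count;
}
```

For example, with f = 700 px and B = 0.12 m, a disparity of 42 px corresponds to a depth of about 2 m. Larger disparities mean closer objects, which is why nearby obstacles are the easiest to resolve with a short baseline.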
14. Advisory Board and Ecosystem
[Diagram: Advisory Board (WP6), Reference Platform (WP1) and Ecosystem]
Diagram labels: TULIPP guide, implementation and demos; ask for review / advice
Roles in the project:
• Provide information about standards
• Give feedback on the approach
• Early adopters
19. Guidelines
Advice: Exploit both vectorization and multithreading for high performance on multicore processors with vector units, such as the ARM Cortex A9. On these architectures, utilizing all hardware execution resources is key to achieving high performance [2] [4, 5].
Recommended implementation method: Use OpenMP. OpenMP is a widely supported parallel programming API that lets programmers express vectorization and multithreading concisely using compiler directives. Programmers need not specify scheduling and synchronization operations in code; these are handled transparently by the OpenMP runtime system. See the official OpenMP examples [6] for more detail on exploiting vectorization and multithreading simultaneously.
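A minimal sketch of this guideline, assuming a compiler with OpenMP 4.0 or later (e.g. GCC with -fopenmp): the combined `parallel for simd` directive splits the loop across threads and asks the compiler to vectorize each thread's chunk. The two functions are hypothetical pixel operations chosen for illustration; they are not taken from the TULIPP code base. Without OpenMP enabled, the directives are ignored and the code runs sequentially with the same results.

```c
#include <stddef.h>

/* Applies out[i] = gain * in[i] + offset to every pixel.
 * Iterations are independent, so the loop can be both multithreaded
 * and vectorized with a single combined directive. */
void scale_offset(const float *in, float *out, size_t n,
                  float gain, float offset)
{
    #pragma omp parallel for simd
    for (size_t i = 0; i < n; i++) {
        out[i] = gain * in[i] + offset;
    }
}

/* Mean pixel intensity. The reduction clause gives each thread and
 * vector lane a private partial sum and combines them at the end, so
 * no explicit synchronization appears in the code. */
double mean_intensity(const float *px, size_t n)
{
    if (n == 0) {
        return 0.0;
    }
    double sum = 0.0;
    #pragma omp parallel for simd reduction(+:sum)
    for (size_t i = 0; i < n; i++) {
        sum += px[i];
    }
    return sum / (double)n;
}
```

Scheduling is left to the OpenMP runtime here; a `schedule` clause could be added if the per-iteration cost were uneven, which is not the case for these loops.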