Slides used in the webinar TileDB hosted with participation from Spire Maritime, describing the use and accessibility of massive time series maritime data on TileDB Cloud.
AIS data management and time series analytics on TileDB Cloud (Webinar, Feb 3, 2022)
1. TileDB webinars
February 3, 2022
AIS data management
& time-series analytics on
TileDB Cloud
Founder & CEO of TileDB, Inc.
Dr. Stavros Papadopoulos
2. Deep roots at the intersection of HPC, databases and data science
Traction with telecoms, pharmas, hospitals and other scientific organizations
45+ members with expertise across all applications and domains
Who we are
TileDB was spun out from MIT and Intel Labs in 2017
WHERE IT ALL STARTED
Raised over $20M; we are very well capitalized
INVESTORS
3. Data Economics
Consumption
How tools compute
on the data, and where
the computation
happens
Distribution
Who has access to the
data, by what means,
and how it is
monetized
Production
What format the
data is produced in
and where it is
stored
4. The Problem | Data Economics is Flawed
Distribution (secure sharing) is an afterthought
Data produced in inefficient formats
All data management
solutions focus here
Consumption
How tools compute
on the data, and
where the
computation happens
5. Data in some
custom format
.las
.cog
.csv
The Problem
very high TCO
Storage in some cloud
bucket or marketplace
Org #N:
Download + Wrangle +
Build analytics infra
Org #1:
Download + Wrangle +
Build analytics infra
burden at data vendor
for extra services
6. Enter TileDB
Secure governance & collaboration
Scalable, serverless compute
Data & code sharing & monetization
Pay-as-you-go, consumer pays
Extreme interoperability
No infra hassles
Universal data
management platform
Data in a universal,
analysis-ready format
User / group #1:
any tool, any scale
User / group #N:
any tool, any scale
no wrangling
7. The Secret Sauce | The Data Model
Dense array
Store everything as dense or sparse multi-dimensional arrays
Sparse array
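The dense/sparse distinction can be illustrated with a minimal sketch. It is plain Python, not the TileDB API; the helper names `slice_dense` and `slice_sparse` and the sample values are hypothetical, chosen only to show that a dense array stores every cell positionally while a sparse array stores only non-empty cells together with their coordinates.

```python
# Conceptual sketch only: plain Python, not the TileDB API.

# Dense array: every cell in the domain holds a value, so cells can be
# stored positionally without explicit coordinates.
dense = [
    [1, 2, 3],
    [4, 5, 6],
]

# Sparse array: most cells are empty, so only the non-empty cells are
# stored, each keyed by its explicit (row, col) coordinates.
sparse = {
    (0, 2): 7,
    (5, 1): 9,
}


def slice_dense(array, rows, cols):
    """Return the sub-array for half-open row and column ranges."""
    return [row[cols[0]:cols[1]] for row in array[rows[0]:rows[1]]]


def slice_sparse(cells, rows, cols):
    """Return the non-empty cells whose coordinates fall in the ranges."""
    return {
        (r, c): v
        for (r, c), v in cells.items()
        if rows[0] <= r < rows[1] and cols[0] <= c < cols[1]
    }


dense_result = slice_dense(dense, (0, 2), (1, 3))
sparse_result = slice_sparse(sparse, (0, 6), (0, 2))
```

In both cases a query is a slice over coordinate ranges; the difference is whether empty cells occupy storage.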
9. The Secret Sauce | The Data Model
What can be modeled as an array
LiDAR (3D sparse)
SAR (2D or 3D dense)
Population genomics (3D sparse)
Single-cell genomics (2D dense or sparse)
Biomedical imaging (2D or 3D dense)
Even flat files!!! (1D dense)
Time series (ND dense or sparse)
Weather (2D or 3D dense)
Graphs (2D sparse)
Video (3D dense)
Key-values (1D or ND sparse)
Tables (1D dense or ND sparse)
10. TileDB Cloud
❏ Access control and logging
❏ Serverless SQL, UDFs, task graphs
❏ Jupyter notebooks and dashboards
Unified data management
and easy serverless compute
at global scale
How we built a Universal Database
Efficient APIs & Tool Integrations via Zero-Copy Techniques
TileDB Embedded
Open-source interoperable
storage with a universal
open-spec array format
❏ Parallel IO, rapid reads & writes
❏ Columnar, cloud-optimized
❏ Data versioning & time traveling
11. Superior
performance
Built in C++
Fully-parallelized
Columnar format
Multiple compressors
R-trees for sparse arrays
TileDB Embedded
Open source: https://github.com/TileDB-Inc/TileDB
Rapid updates
& data versioning
Immutable writes
Lock-free
Parallel reader / writer model
Time traveling
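How immutable writes make time traveling possible can be sketched roughly as follows. This is a conceptual illustration in plain Python, not the TileDB API or its storage format; the class `VersionedArray` and its methods are hypothetical. The assumption is that each write is kept as a separate timestamped batch that is never modified, and a read at a given time merges only the batches written up to that time.

```python
# Conceptual sketch only: plain Python, not the TileDB API.

class VersionedArray:
    """Toy array whose writes are immutable, timestamped batches."""

    def __init__(self):
        # List of (timestamp, {coordinates: value}) batches.
        self._batches = []

    def write(self, timestamp, cells):
        # Append a new batch; earlier batches are never changed in place.
        self._batches.append((timestamp, dict(cells)))

    def read(self, at=None):
        # Merge batches in timestamp order so later writes win per cell.
        # With `at` set, batches written after that time are ignored.
        result = {}
        for ts, cells in sorted(self._batches, key=lambda b: b[0]):
            if at is not None and ts > at:
                break
            result.update(cells)
        return result


arr = VersionedArray()
arr.write(1, {(0, 0): "a"})
arr.write(2, {(0, 0): "b", (1, 1): "c"})

as_of_1 = arr.read(at=1)   # state of the array as of timestamp 1
latest = arr.read()        # latest state, all batches merged
```

Because existing batches are never rewritten, writers only append and readers can assemble a consistent view for any timestamp, which is one way to understand the lock-free, parallel reader/writer model named on the slide.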
14. TileDB Cloud
Works as SaaS: https://cloud.tiledb.com
Works on premises
Currently on AWS, soon on any cloud
Built to work anywhere
Slicing, SQL, UDFs, task graphs
It is completely serverless
On-demand JupyterHub instances
Can launch Jupyter notebooks
Compute sent to the data
It is geo-aware
Authentication, compliance, etc.
It is secure
15. TileDB Cloud
Full marketplace (via Stripe)
Everything is monetizable
Access control inside and outside your
organization
Make any data and code public
Discover any public data and code
(central catalog)
Everything is shareable at global scale
Jupyter notebooks
UDFs and task graphs
ML models
Everything is an array!
Dashboards (e.g., R Shiny apps)
All types of data (even flat files)
Full auditability (data, code, any action)
Everything is logged
16. AIS capabilities on TileDB Cloud
Data is analysis-ready,
no more CSV downloads
A built-in marketplace,
no infrastructure costs
Time-series analysis,
at extreme scale
Fusion of AIS data with
other sources (e.g., SAR)
Numerous APIs and tool
integrations
Visualization with popular
tools and dashboards
20. The Evolution of Spire Maritime’s Data Services
The Early Years (<2013)
• AIS Messages delivered via proxy/SFTP in raw NMEA
or CSV formats
• Customer 100% responsible for data storage,
synthesis of position and static messages,
indexing, manipulation, etc.
2013
• Geospatial Web Services (GWS) Introduced
• Easy to query vessel-based information
• Removes complications associated with real-time
synthesis of position and static messages
• Key fields indexed to provide rapid query responses
• Data delivered in industry-standard schema for
easier storage and manipulation
2021
• Hosted Data Platform Introduced (TileDB)
• Maintains all the benefits of historical GWS content but removes
the complexity and lowers the cost that customers incur
to store and compute against the data
• Enables immediate access to interrogate Spire Maritime’s historical
data using complex queries that would typically require a fully
configured database to run
• Spire Maritime’s AIS data loaded daily into the TileDB platform
21. Hosted Data Platform Use Cases
Customers who
believe they are
spending too much
money on storage and
compute time based
on their Spire
Maritime data
subscription
Customers who only
want to ask
questions of the
data
• Don’t need or want
to store archive
data locally
• Focus on answering
real world
questions starting
from the moment
access to the
platform is
granted
Customers who lack
the skill set to
create the databases
needed to
interrogate the data
in a fast and
efficient way