What happens when you put 1 billion
points into Postgis Topology?
Foss4g 2015, Como 16/07/2015
Norwegian Institute of Bioeconomy Research
WWW.NIBIO.NO
(On 1 July 2015, Skogoglandskap was merged into NIBIO together with two other institutes.)
Lars Aksel Opsahl (Lars.Opsahl@nibio.no), developer.
Is this possible?
7/18/15 1 billion points in Postgis Topology
Move 1 billion points into postgis/topology
The answer is YES!
How long does it take to add 1 billion points? 15-16 hours.
Is it possible to edit this topo layer? Yes.
Does an edit take a long time? One second or more.
The rest of the slides go into detail about how we solved this and why Topology is a good alternative for our case.
In this presentation we will focus on:
WHAT type of data we test on.
WHY we use Postgis Topology for this layer.
HOW we use Postgis Topology.
HOW we fill this Postgis Topology layer with data.
HOW we plan to update this Topology layer.
AR5 is a high resolution land resource map that covers all of Norway.
● The map describes land resources based on land type, site index, tree species and ground conditions.
● Stored as simple features, it is 8 million polygons with a total of 1 billion points.
AR5 used in gardskart.nibio.no
AR5 used in kilden.nibio.no
Why use Postgis Topology for AR5.
View map changes
What you see / The history of the map:
Added by aeb, 10/01/2011
Added by lop, 16/06/2015
Rollback a user map update
[Figure panels: initial map; user adds a new line and surface attribute; the new map; moderator deletes the new line]
No overlaps or gaps when the map is edited
[Figure panels: initial map; user adds a new line and surface attribute; the new map]
This new line will not cause any overlap or gap with the existing surface.
Old lines will keep their history and original points (2 new points).
HOW TO ILLUSTRATE
A good picture may say more than any text, but for some people an SQL fragment may say more than any text or picture.
When you see SQL fragments, I will explain the meaning. You can actually think of them as pictures.
HOW we use Postgis Topology.
Database structure for border (lines/edges)
-- Table that holds the topo objects for lines
CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje(
id serial PRIMARY KEY not null );
SELECT topology.AddTopoGeometryColumn('topo_ar5_sysdata', 'topo_ar5', 'ar5_topo_linje', 'geo',
'LINESTRING') As new_layer_id;
-- Create a new table that holds the attributes for edges
CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje_attr(
id serial PRIMARY KEY not null,
-- Could be a foreign key to topo_ar5_sysdata.edge_data, but since that table is
-- updated outside our control we cannot use a foreign key here
edge_id int not null,
objtype_kode smallint not null CONSTRAINT objtype_kode_1_2_m1
CHECK (objtype_kode in (1,2,-1)),
aravgrtype smallint not null,
-- Contains felles egenskaper (common properties) from AR5
felles_egenskaper topo_ar5.sosi_felles_egenskaper,
-- Temporary data; will be deleted after the data is added
sl_sdeid int
);
Why store attributes in separate table for lines ?
● We want to be sure that any edge can have only one attribute value.
● After a discussion with Sandro Santilli we will look at other ways to do this: my update code becomes complicated, and many of the same tests are already done in the Topology package by Sandro Santilli. The way I have solved this now needs to be redesigned.
Database structure surface
CREATE UNLOGGED TABLE topo_ar5.ar5_topo_flate(
id serial PRIMARY KEY not null,
artype int4 CONSTRAINT artype_between_0_100 CHECK (artype > 0 and artype < 100),
arskogbon int4 CONSTRAINT arskogbon_between_0_100 CHECK (arskogbon > 0 and arskogbon < 100),
artreslag int4 CONSTRAINT artreslag_between_0_100 CHECK (artreslag > 0 and artreslag < 100),
argrunnf int4 CONSTRAINT argrunnf_between_0_100 CHECK (argrunnf > 0 and argrunnf < 100),
-- Contains felles egenskaper (common properties) from AR5
felles_egenskaper topo_ar5.sosi_felles_egenskaper,
-- Simple feature copy of the surface, used for performance
simple_geo geometry(MultiPolygon,4258) NULL
);
-- Add a TopoGeometry column that references the polygon surface
SELECT topology.AddTopoGeometryColumn('topo_ar5_sysdata', 'topo_ar5', 'ar5_topo_flate', 'geo',
'POLYGON') As new_layer_id;
HOW we fill this Postgis Topology layer with data.
● Content balanced grid.
● Parallelize with GNU parallel and the grid cells.
● All code is wrapped in PL/pgSQL functions.
● We use simple feature lines and surface representation points when we create the Postgis Topology.
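A minimal Python sketch of the "one worker per grid cell" idea, under stated assumptions. The talk uses GNU parallel and PL/pgSQL functions; here Python's ThreadPoolExecutor stands in for GNU parallel, and process_cell is a hypothetical placeholder for the database call that handles one cell. It only shows that every cell is handed out exactly once, so two workers never get the same cell.

```python
from concurrent.futures import ThreadPoolExecutor


def process_cell(cell_id, lines_by_cell):
    """Hypothetical stand-in for the PL/pgSQL call that handles one grid cell."""
    lines = lines_by_cell.get(cell_id, [])
    return cell_id, len(lines)


def run_cells(cell_ids, lines_by_cell, workers):
    """Process every cell exactly once, with at most `workers` cells in flight."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda cid: process_cell(cid, lines_by_cell), cell_ids)
        return dict(results)
```

With run_cells([1, 2, 3], lines_by_cell, workers=2), each cell id appears once in the result regardless of how many workers are used.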
-- Core grid creation code; the && operator is used to increase index use
sql := 'SELECT count(*) FROM ' || table_name || ' WHERE ' || geo_column_name || ' && ' ||
'ST_MakeEnvelope(' || x_min || ',' || y_min || ',' || x_max || ',' || y_max || ',' || source_srid || ')';
EXECUTE sql INTO num_rows_table;
IF num_rows_table < max_rows
THEN
-- Below the limit: keep this cell as it is
sectors[0] := grid_geom;
ELSE
-- Too many rows: split the cell into four quadrants and recurse
x_delta := (x_max - x_min)/2; y_delta := (y_max - y_min)/2;
x_center := x_min + x_delta; y_center := y_min + y_delta;
sectors[0] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
ST_MakeEnvelope(x_min,y_min,x_center,y_center, ST_SRID(grid_geom)), min_distance, max_rows);
sectors[1] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
ST_MakeEnvelope(x_center,y_min,x_max,y_center, ST_SRID(grid_geom)), min_distance, max_rows);
sectors[2] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
ST_MakeEnvelope(x_min,y_center,x_center,y_max, ST_SRID(grid_geom)), min_distance, max_rows);
sectors[3] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
ST_MakeEnvelope(x_center,y_center,x_max,y_max, ST_SRID(grid_geom)), min_distance, max_rows);
END IF;
Create content balanced grid for AR5 in Norway
-- Create a grid with at most around 4000 lines in each cell
SL_make_content_based_balanced_grid01(ARRAY['org_ar5.ar5_linje geo'],4000)
Too big, split in 4
Below limit, OK to use
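The recursive split can be sketched outside the database as well. This is a minimal Python sketch, not the PL/pgSQL code from the talk: Box, balanced_grid and min_size are names made up for the illustration, features are reduced to bounding boxes, and the overlap test mimics what the && operator does with bounding boxes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def intersects(self, other):
        """Bounding-box overlap test, similar in spirit to the && operator."""
        return not (other.x_min > self.x_max or other.x_max < self.x_min or
                    other.y_min > self.y_max or other.y_max < self.y_min)


def balanced_grid(cell, boxes, max_rows, min_size=1e-9):
    """Return leaf cells that each overlap fewer than max_rows boxes."""
    hits = [b for b in boxes if cell.intersects(b)]
    width = cell.x_max - cell.x_min
    height = cell.y_max - cell.y_min
    # Below the limit (or too small to split further): keep the cell.
    if len(hits) < max_rows or min(width, height) / 2 < min_size:
        return [cell]
    # Too big: split in 4 quadrants and recurse into each one.
    xc = cell.x_min + width / 2
    yc = cell.y_min + height / 2
    quadrants = [
        Box(cell.x_min, cell.y_min, xc, yc),
        Box(xc, cell.y_min, cell.x_max, yc),
        Box(cell.x_min, yc, xc, cell.y_max),
        Box(xc, yc, cell.x_max, cell.y_max),
    ]
    cells = []
    for quadrant in quadrants:
        cells.extend(balanced_grid(quadrant, hits, max_rows, min_size))
    return cells
```

Dense areas end up with many small cells and sparse areas with a few large ones, so each cell represents a similar amount of work.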
Content balanced grid for AR5 in Norway
Linestring and surface distribution for the grid used.
● Covered by a single cell (does not touch any cell border lines)
   ● Single cell edges: 18988984
   ● Single cell surfaces: 7093814
● Crosses/touches cell border lines
   ● Multi cell edges: 635048
   ● Multi cell surfaces: 534455
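As a quick check of what these counts mean, the share of features that cross or touch a cell border can be computed directly from the numbers on the slide. The variable and function names below are only for this illustration.

```python
# Counts taken from the slide above.
single_cell = {"edges": 18_988_984, "surfaces": 7_093_814}
multi_cell = {"edges": 635_048, "surfaces": 534_455}


def multi_cell_share(kind):
    """Fraction of all features of this kind that cross or touch a cell border."""
    total = single_cell[kind] + multi_cell[kind]
    return multi_cell[kind] / total


for kind in ("edges", "surfaces"):
    print(f"{kind}: {multi_cell_share(kind):.1%} cross or touch a cell border")
```

Roughly 3.2 % of the edges and 7.0 % of the surfaces need the merged-cell handling; the rest can be processed cell by cell.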
4 different operation types
● A: Process lines covered by single cells.
● B: Merge cells to include lines that cross cell borders (then do the same as in A for the lines found).
● C: Process surfaces covered by single cells.
● D: Merge cells to include surfaces that cross cell borders (then do the same as in C for the surfaces found).
A: Only process data covered by each cell
WAIT TO PROCESS: LINE NOT COVERED BY SINGLE CELL
START TO PROCESS: LINE COVERED BY SINGLE CELL
B: Merge cells to include lines that cross cell borders.
OK TO PROCESS NOW: LINE COVERED BY SET OF MERGED CELLS
DON'T PROCESS: DON'T TOUCH ANY ORIGINAL BORDERS
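The split between operation A and operation B can be sketched like this. It is a simplified Python illustration that works on bounding boxes (x_min, y_min, x_max, y_max) instead of real geometries; strictly_inside and split_by_cell are hypothetical helper names, not functions from the talk.

```python
def strictly_inside(cell, box):
    """True if box lies inside cell without touching the cell border."""
    cx0, cy0, cx1, cy1 = cell
    bx0, by0, bx1, by1 = box
    return cx0 < bx0 and cy0 < by0 and bx1 < cx1 and by1 < cy1


def split_by_cell(features, cells):
    """Group feature ids per covering cell; collect the rest as multi-cell."""
    single = {i: [] for i in range(len(cells))}
    multi = []
    for fid, box in features.items():
        owner = next(
            (i for i, cell in enumerate(cells) if strictly_inside(cell, box)),
            None,
        )
        if owner is None:
            multi.append(fid)  # crosses or touches a border: wait for operation B
        else:
            single[owner].append(fid)  # safe to process in operation A
    return single, multi
```

Features in the single groups can be processed per cell in parallel; the multi list is handled afterwards on merged cells.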
Process lines covered by single cells : 1. create topo.
SELECT
topology.toTopoGeom(geo, 'topo_ar5_sysdata', 1, 0.0000000001) as geo,
sl_sdeid
FROM (
select arl.sl_sdeid, arl.geo from org_ar5.ar5_linje arl
where
cell_geo_in && arl.geo and
ST_Contains(cell_geo_in, arl.geo) and
arl.objType not in ('KantUtsnitt') and
NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where
arl.sl_sdeid=f.sl_sdeid)
) AS a
Callouts: Create the topo object. Extreme performance. Snap-to value. Used to find attributes.
Merge cells and collect cell borders
-- merge cells
( SELECT
ST_union(cell.geo) as cell_union
FROM topo_ar5.cell_ad as cell
WHERE cell.id >= cell_min_in and
cell.id < (stop_cell_id)
) AS r2
-- get cell borders
FROM (
SELECT
(ST_Dump(grid_lines)).geom AS grid_line
FROM (
SELECT
ST_Collect(ST_ExteriorRing(cell.geo)) as grid_lines
FROM topo_ar5.cell_ad as cell
WHERE cell.id >= cell_min_in and
cell.id < (stop_cell_id)
) AS r
) AS r,
Use merged cells and cell borders to find new lines
....
WHERE ST_intersects(r.grid_line, arl.geo) AND
NOT EXISTS ( select edge_id from topo_ar5_sysdata.edge_data where
ST_Intersects(geom, arl.geo) and ST_Intersects(geom, r.grid_line) ) AND
arl.objType not in ('KantUtsnitt') AND
NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where arl.sl_sdeid=f.sl_sdeid)
...
WHERE ST_Contains(r2.cell_union, arl.geo) AND
NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where arl.sl_sdeid=f.sl_sdeid)
Callout: covered by merged cell.
Process lines covered by single cells : 2. add attributes
SELECT
distinct ON (edge_id) edge_id,
topo_ar5.ar5_omkod_objtype_2_kode(b.objtype) as objtype_kode,
aravgrtype,b.datafangstdato,
ARRAY[b.informasjon] as informasjon,
(b.maalemetode,b.noyaktighet,b.synbarhet)::topo_ar5.sosi_kvalitet as
kvalitet ,b.opphav,b.verifiseringsdato,
(b.registreringsversjon,4.5)::topo_ar5.sosi_registreringsversjon as
registreringsversjon,
b.sl_sdeid
FROM (
select r.element_id as edge_id , arl.*
FROM relation_ids_added ra, topo_ar5_sysdata.relation r ,
org_ar5.ar5_linje arl
WHERE
ra.topogeo_id = r.topogeo_id and ra.layer_id = r.layer_id and
arl.sl_sdeid = ra.sl_sdeid
) AS b
Callouts: Map by id. Add attributes using user defined types.
Process surfaces covered by single cells: 1 add topo
INSERT INTO topo_ar5.ar5_topo_flate (geo)
SELECT topology.CreateTopoGeom('topo_ar5_sysdata',3,2,topoelementarray ) as geo
from
( select distinct ST_GetFaceGeometry('topo_ar5_sysdata',l.face_id) as geo,
topology.TopoElementArray_Agg(ARRAY[l.face_id,3]) as topoelementarray,
ST_union(l.mbr) as union_face
From topo_ar5_sysdata.face as l, topo_ar5.cell_ad cell
where cell.id = cell_nr_in and
ST_Contains(cell.geo,l.mbr) and
NOT EXISTS (select re.element_id from topo_ar5_sysdata.relation re
where re.layer_id = 2 and re.element_id = l.face_id )
group by l.face_id
) as r1,
topo_ar5.cell_ad cell
where cell.id = cell_nr_in and
ST_Contains(cell.geo, ST_Boundary(r1.union_face));
Callouts: Build the surface geometry. Find surfaces inside the current cell. Create the surface topo geometry.
Process surfaces covered by single cells: 2. update simple geo
update topo_ar5.ar5_topo_flate AS f
set
simple_geo = geo::geometry
from arf_id as ft
where f.id = ft.id_temp;
Callout: just a cast from the topo geometry.
Process surfaces covered by single cells : 2. update attributes
-- update the rest of the attributes
update topo_ar5.ar5_topo_flate as f SET (artype, arskogbon,
artreslag,argrunnf,felles_egenskaper) =
(c.artype,c.arskogbon,c.artreslag,c.argrunnf,
(datafangstdato,informasjon,null, kvalitet,null,opphav,null,
registreringsversjon,verifiseringsdato)::topo_ar5.sosi_felles_egenskaper )
FROM ( SELECT
b.artype ,b.arskogbon,b.artreslag,b.argrunnf,
b.id_temp,b.datafangstdato, ARRAY[b.informasjon] as informasjon,
(b.maalemetode,b.noyaktighet,b.synbarhet)::topo_ar5.sosi_kvalitet as kvalitet ,
b.opphav, b.verifiseringsdato,
(b.registreringsversjon,'4.5')::topo_ar5.sosi_registreringsversjon as registreringsversjon
FROM
( select p.*, ft.id_temp from org_ar5.ar5_punkt as p,arf_id as ft,
topo_ar5.ar5_topo_flate as f2
where f2.id = ft.id_temp and ST_Covers(f2.simple_geo,p.geo)
) as b
) AS c where f.id = c.id_temp;
Callout: find data by using the representation point.
Performance test of the migration process
(16 dual-core CPUs and SSD disks)
1 parallel thread
function_create_topo_ar5.sh vroom2 1 13000 200
15 parallel threads
function_create_topo_ar5.sh vroom2 15 13000 200
20 parallel threads
function_create_topo_ar5.sh vroom2 20 13000 200
Decreasing processing time
when increasing number of parallel threads
Number of threads Total runtime in hours
1 108
15 16
20 18
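The runtimes above translate into speedup and parallel efficiency as follows. This is plain arithmetic on the numbers in the table; the function names are only for this illustration.

```python
# Total runtime in hours per number of threads, from the table above.
runtime_hours = {1: 108, 15: 16, 20: 18}


def speedup(threads):
    """How many times faster than the single-threaded run."""
    return runtime_hours[1] / runtime_hours[threads]


def efficiency(threads):
    """Speedup per thread; 1.0 would be perfect scaling."""
    return speedup(threads) / threads


for threads in (15, 20):
    print(threads, "threads: speedup", speedup(threads),
          "efficiency", round(efficiency(threads), 2))
```

The speedup is 6.75 with 15 threads and 6.0 with 20 threads, so going from 15 to 20 threads made the run slower on this machine.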
Average operations per second for the 4 different operation types with different numbers of threads.
Number of threads | A: Single cell linestrings | B: Multi cell linestrings | C: Single cell surfaces | D: Multi cell surfaces
1 | 91 | 9 | 305 | 5
15 | 1043 | 48 | 972 | 21
20 | 814 | 48 | 934 | 27
Average operations per second at every hour when
running single threaded.
[Chart: average operations per second (y-axis, 0 to 500) for each hour of the run (x-axis), labeled by operation type in the order A, B, C, D.]
Average operations per second at every hour when
running 15 parallel threads.
[Chart: average operations per second (y-axis, 0 to 1800) for each hour of the run (x-axis), labeled by operation type in the order A, B, C, D.]
Summary: converting AR5 to Postgis Topology
● Content balanced grid and parallel threads.
● Two parallel threads cannot work in the same area.
● Function-based index on topo_ar5.get_relation_id(geo TopoGeometry) and indexes on the relation table.
● Heavy use of the && operator.
● 16 hours of processing time is OK since this is a one-time operation.
● ValidateTopology('topo_ar5_sysdata') shows no errors.
HOW to update the Postgis Topology layer.
● Draw a line and set attribute values
● Use stored procedures
● Use one single transaction
● Roll back if there are any errors
● Java backend with JSON API
● Simple test client using this API
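The "one single transaction, roll back on any error" rule can be illustrated with a small sketch. It uses Python's built-in sqlite3 module as a stand-in for PostgreSQL, and the tables lines and line_attr are simplified, made-up versions of the real ones; only the CHECK on objtype_kode is borrowed from the slides.

```python
import sqlite3


def open_demo_db():
    """Create an in-memory database with two simplified, hypothetical tables."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE lines (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE line_attr ("
        " line_id INTEGER NOT NULL,"
        " objtype_kode INTEGER NOT NULL CHECK (objtype_kode IN (1, 2, -1)))"
    )
    return conn


def add_line(conn, line_id, objtype_kode):
    """Insert a line and its attribute as one unit: both rows or neither."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute("INSERT INTO lines (id) VALUES (?)", (line_id,))
            conn.execute(
                "INSERT INTO line_attr (line_id, objtype_kode) VALUES (?, ?)",
                (line_id, objtype_kode),
            )
        return True
    except sqlite3.IntegrityError:
        return False
```

If the attribute value is rejected, the line inserted just before it is rolled back as well, so the map never ends up with a line that has no valid attribute.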
Two comments about update
1) Jostein, head of AR5: “Don't delete old lines, it's nice to know the history behind changes.”
2) Ingvild, my boss: “Why do I have to move old lines around with many hundreds of points? Why can't I just give you a new simple line that just shows the difference?”
Edit Topology data with surface data
Draw a polygon
Split a polygon
Update surface attributes
Extend a polygon
Edit Topology : Split a polygon - Original map
Edit Topology : Split a polygon - Input : point, line, attribute values
Edit Topology : What happens when you have a split surface operation.
The Java backend calls: apply_line_on_topo_flate( geo_in geometry,
p_in geometry, artype_in int, arskogbon_in int,
artreslag_in int, argrunnf_in int)
And the following happens:
- Adjust the input line to the current data, taking into account that equal surfaces must stay equal
- Compute the area to be updated
- Take a copy of the unchanged data
- Take a copy of the data that may change
- Clear data from the line attribute table
- Clear data from the topo surface layer and delete the rows to be changed
- Add the adjusted line with topology.toTopoGeom
- Update the line attribute table
- Create new surfaces with the new attribute values
- Create old surfaces with the old values
- Check that the unchanged area is still the same
Edit Topology : Timing issues when you have a split surface operation.
The Java backend calls this function:
topo_ar5.apply_line_on_topo_flate( geo_in geometry, p_in geometry,
artype_in int, arskogbon_in int, artreslag_in int, argrunnf_in int)
Small operations that include few changes take about 1000 ms, but bigger operations may take minutes.
http://trac.osgeo.org/postgis/ticket/2083
Edit Topology : Split a polygon - New map
Edit Topology : Extend a polygon
Edit Topology : Extend a polygon.
The Java backend calls this function:
apply_line_on_topo_flate( geo_in geometry, p_in geometry,
artype_in int, arskogbon_in int, artreslag_in int,
argrunnf_in int)
Here p_in = (0.0) means not set.
Edit Topology : Draw a new polygon
Edit Topology : Draw a new polygon.
The Java backend calls this function: apply_polygon_on_topo_flate(
geo_in geometry, artype_in int, arskogbon_in int,
artreslag_in int, argrunnf_in int
)
Further plans this year
● Add many new layers to Postgis Topology this fall and adjust the Topology model to new requirements.
● Create a client that uses the JSON API to update topology layers.
● Extend the update API with more functionality.
● Work more on performance, topology usage and the update client for AR5.
Postgis Topology is a great tool: you can add one billion points, and it is possible to update the data afterwards.
Thanks to everybody that has contributed to
Postgis Topology and other open source tools.
Questions ?
Thanks for your attention.

Take control of your SAP testing with UiPath Test Suite
 
"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko"Debugging python applications inside k8s environment", Andrii Soldatenko
"Debugging python applications inside k8s environment", Andrii Soldatenko
 
Unleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding ClubUnleash Your Potential - Namagunga Girls Coding Club
Unleash Your Potential - Namagunga Girls Coding Club
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Dev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio WebDev Dives: Streamline document processing with UiPath Studio Web
Dev Dives: Streamline document processing with UiPath Studio Web
 

What happens when you add 1 billion points to PostGIS Topology

  • 1. What happens when you put 1 billion points into Postgis Topology? Foss4g 2015, Como 16/07/2015
  • 2. Norwegian Institute of Bioeconomy Research, WWW.NIBIO.NO (from 1 July 2015, Skogoglandskap was merged into NIBIO together with 2 other institutes). Lars Aksel Opsahl (Lars.Opsahl@nibio.no), developer.
  • 3. Is this possible? Move 1 billion points into postgis/topology. The answer is YES! (Slide footer: 7/18/15, 1 billion points in Postgis Topology.)
  • 4. How long does it take to add 1 billion points? 15-16 hours. Is it possible to edit this topo layer? Yes. Does an edit take a long time? 1 second or more. The rest of the slides go into detail about how we solved this and why Topology is a good alternative for our case.
  • 5. In this presentation we will focus on: WHAT type of data we test on. WHY we use Postgis Topology for this layer. HOW we use Postgis Topology. HOW we fill this Postgis Topology layer with data. HOW we plan to update this Topology layer.
  • 6. AR5 is a high resolution land resource map that covers all of Norway. ● The map describes land resources based on land type, site index, tree species and ground conditions. ● Stored as simple features, it is 8 million polygons with a total of 1 billion points.
  • 7. AR5 used in gardskart.nibio.no
  • 8. AR5 used in kilden.nibio.no
  • 9. Why use Postgis Topology for AR5.
  • 10. View map changes. What you see. What is the history of the map: added by aeb 10/01/2011; added by lop 16/06/2015.
  • 11. Rollback a user map update. Initial map. User adds a new line and surface attribute. Moderator deletes the new line. The new map.
  • 12. No overlaps or gaps when the map is edited. Initial map. User adds a new line and surface attribute. The new map. This new line will not cause any overlap or gap with the existing surface. Old lines will keep their history and original points (2 new points).
  • 13. HOW TO ILLUSTRATE: A good picture may say more than any text, but for some people an SQL fragment may say more than any text or picture. When you see SQL fragments, I will explain the meaning. You can actually think of this as a picture. Example fragment:
        CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje(
          id serial PRIMARY KEY not null
        );
        SELECT topology.AddTopoGeometryColumn('topo_ar5_sysdata', 'topo_ar5', 'ar5_topo_linje', 'geo', 'LINESTRING') As new_layer_id;
        -- create a new table for linestring attributes
        CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje_attr(
          id serial PRIMARY KEY not null,
          -- could be a foreign key to topo_ar5_sysdata.edge_data, but since that
          -- table is updated outside our control we cannot use a foreign key here
          edge_id int not null,
          objtype_kode smallint not null CONSTRAINT objtype_kode_1_2_m1 CHECK (objtype_kode in (1,2,-1)),
          aravgrtype smallint not null,
          -- contains felles egenskaper (common properties) from AR5
          felles_egenskaper topo_ar5.sosi_felles_egenskaper,
          -- temporary data, will be deleted after the data is added
          sl_sdeid int
        );
  • 14. HOW we use Postgis Topology.
  • 15. Database structure for borders (lines/edges). The first table holds the topo object for lines; the second holds attributes for edges.
        CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje(
          id serial PRIMARY KEY not null
        );
        SELECT topology.AddTopoGeometryColumn('topo_ar5_sysdata', 'topo_ar5', 'ar5_topo_linje', 'geo', 'LINESTRING') As new_layer_id;
        -- create a new table for linestring attributes
        CREATE UNLOGGED TABLE topo_ar5.ar5_topo_linje_attr(
          id serial PRIMARY KEY not null,
          -- could be a foreign key to topo_ar5_sysdata.edge_data, but since that
          -- table is updated outside our control we cannot use a foreign key here
          edge_id int not null,
          objtype_kode smallint not null CONSTRAINT objtype_kode_1_2_m1 CHECK (objtype_kode in (1,2,-1)),
          aravgrtype smallint not null,
          -- contains felles egenskaper (common properties) from AR5
          felles_egenskaper topo_ar5.sosi_felles_egenskaper,
          -- temporary data, will be deleted after the data is added
          sl_sdeid int
        );
  • 16. Why store attributes in a separate table for lines? ● We want to be sure that any edge can have only one attribute value. ● After a discussion with Sandro Santilli we will look at other ways to do this: my update code becomes complicated, and many of the same tests are already done in the Topology package by Sandro Santilli. The way I have solved this now needs to be redesigned.
  • 17. Database structure for surfaces. The simple_geo column is used for performance; the topo geometry is added by the last statement.
        CREATE UNLOGGED TABLE topo_ar5.ar5_topo_flate(
          id serial PRIMARY KEY not null,
          artype int4 CONSTRAINT artype_between_0_100 CHECK (artype > 0 and artype < 100),
          arskogbon int4 CONSTRAINT arskogbon_between_0_100 CHECK (arskogbon > 0 and arskogbon < 100),
          artreslag int4 CONSTRAINT artreslag_between_0_100 CHECK (artreslag > 0 and artreslag < 100),
          argrunnf int4 CONSTRAINT argrunnf_between_0_100 CHECK (argrunnf > 0 and argrunnf < 100),
          -- contains felles egenskaper (common properties) from AR5
          felles_egenskaper topo_ar5.sosi_felles_egenskaper,
          simple_geo geometry(MultiPolygon,4258) NULL
        );
        -- add a topogeometry column that references the polygon surface
        SELECT topology.AddTopoGeometryColumn('topo_ar5_sysdata', 'topo_ar5', 'ar5_topo_flate', 'geo', 'POLYGON') As new_layer_id;
  • 18. HOW we fill this Postgis Topology layer with data. ● Content balanced grid. ● Parallelize with GNU parallel and the grid cells. ● All code is wrapped in PL/pgSQL functions. ● We use simple feature lines and surface representation points when we create the Postgis Topology.
  • 19. Create content balanced grid for AR5 in Norway.
        -- Create a grid with around max 4000 lines in each cell
        SL_make_content_based_balanced_grid01(ARRAY['org_ar5.ar5_linje geo'],4000))

        -- Core create-grid code; we use the && operator to increase index use
        sql := 'SELECT count(*) FROM ' || table_name || ' WHERE ' || geo_column_name || ' && '
          || 'ST_MakeEnvelope(' || x_min || ',' || y_min || ',' || x_max || ',' || y_max || ',' || source_srid || ')';
        EXECUTE sql INTO num_rows_table_tmp ;
        IF num_rows_table < max_rows THEN
          -- below limit, OK to use
          sectors[0] := grid_geom;
        ELSE
          -- too big, split in 4
          x_delta := (x_max - x_min)/2;
          y_delta := (y_max - y_min)/2;
          x_center := x_min + x_delta;
          y_center := y_min + y_delta;
          sectors[0] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
            ST_MakeEnvelope(x_min,y_min,x_center,y_center, ST_SRID(grid_geom)), min_distance, max_rows);
          sectors[1] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
            ST_MakeEnvelope(x_center,y_min,x_max,y_center, ST_SRID(grid_geom)), min_distance, max_rows);
          sectors[2] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
            ST_MakeEnvelope(x_min,y_center,x_center,y_max, ST_SRID(grid_geom)), min_distance, max_rows);
          sectors[3] := func_grid.SL_make_contert_based_grid(table_name_column_name_array,
            ST_MakeEnvelope(x_center,y_center,x_max,y_max, ST_SRID(grid_geom)), min_distance, max_rows);
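The recursive split on this slide can be tried outside the database. The following is a minimal Python sketch of the same idea, not the author's PL/pgSQL: it works on plain points instead of line geometries, `count_in_box` stands in for the `&&` count query, and the `min_size` stop condition is our assumption about what a parameter like `min_distance` might do.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def count_in_box(points: List[Point], box: Box) -> int:
    """Count points in the box (stand-in for the bounding-box count query).

    The box is half-open on its max sides, so a point lying on a shared
    border is counted in exactly one cell.
    """
    x_min, y_min, x_max, y_max = box
    return sum(1 for x, y in points if x_min <= x < x_max and y_min <= y < y_max)


def balanced_grid(points: List[Point], box: Box, max_rows: int, min_size: float) -> List[Box]:
    """Split the box into 4 quadrants until each cell holds fewer than max_rows points.

    min_size (hypothetical) stops the recursion before cells get too small.
    """
    x_min, y_min, x_max, y_max = box
    too_small = (x_max - x_min) / 2 < min_size or (y_max - y_min) / 2 < min_size
    if count_in_box(points, box) < max_rows or too_small:
        return [box]  # below limit, OK to use
    # too big, split in 4
    x_center = x_min + (x_max - x_min) / 2
    y_center = y_min + (y_max - y_min) / 2
    quadrants = [
        (x_min, y_min, x_center, y_center),
        (x_center, y_min, x_max, y_center),
        (x_min, y_center, x_center, y_max),
        (x_center, y_center, x_max, y_max),
    ]
    cells: List[Box] = []
    for quadrant in quadrants:
        cells.extend(balanced_grid(points, quadrant, max_rows, min_size))
    return cells


if __name__ == "__main__":
    sample = [(1, 1), (1, 2), (2, 1), (2, 2), (9, 9)]
    for cell in balanced_grid(sample, (0, 0, 16, 16), max_rows=3, min_size=1):
        print(cell, count_in_box(sample, cell))
```

Dense areas end up with many small cells and empty areas with a few large ones, which is what keeps the work per cell roughly even.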
  • 20. Content balanced grid for AR5 in Norway
  • 21. Content balanced grid for AR5 in Norway
  • 22. Linestring and surface distribution for the grid used. ● Covered by a single cell (does not touch any cell border lines): single cell edges: 18988984; single cell surfaces: 7093814. ● Crosses/touches cell border lines: multi cell edges: 635048; multi cell surfaces: 534455.
  • 23. 4 different operation types. ● A: Process lines covered by single cells. ● B: Merge cells to include lines that cross cell borders (then do the same as in A for the lines found). ● C: Process surfaces covered by single cells. ● D: Merge cells to include surfaces that cross cell borders (then do the same as in C for the surfaces found).
  • 24. A: Only process data covered by each cell. WAIT TO PROCESS: LINE NOT COVERED BY SINGLE CELL. START TO PROCESS: LINE COVERED BY SINGLE CELL.
  • 25. B: Merge cells to include lines that cross cell borders. OK TO PROCESS NOW: LINE COVERED BY SET OF MERGED CELLS. DON'T PROCESS: DON'T TOUCH ANY ORIGINAL BORDERS.
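The split between operation A and operation B comes down to one test per feature: is it strictly inside a single cell, or does it cross or touch a cell border? A minimal Python sketch of that classification follows. It uses bounding boxes as a simplification of the real geometry test, and the names `strictly_inside` and `classify` are ours.

```python
from typing import Dict, List, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def strictly_inside(cell: Box, bbox: Box) -> bool:
    """True if bbox lies inside the cell without touching any cell border."""
    return (cell[0] < bbox[0] and cell[1] < bbox[1]
            and bbox[2] < cell[2] and bbox[3] < cell[3])


def classify(features: Dict[str, Box], cells: List[Box]):
    """Return (single, deferred).

    single maps a cell index to the feature ids that one worker can process
    alone (operation A). deferred lists the features that cross or touch a
    cell border and must wait until cells are merged (operation B).
    """
    single: Dict[int, List[str]] = {}
    deferred: List[str] = []
    for feature_id, bbox in features.items():
        for index, cell in enumerate(cells):
            if strictly_inside(cell, bbox):
                single.setdefault(index, []).append(feature_id)
                break
        else:
            deferred.append(feature_id)
    return single, deferred


if __name__ == "__main__":
    grid = [(0, 0, 10, 10), (10, 0, 20, 10)]
    lines = {
        "a": (1, 1, 4, 4),     # inside cell 0
        "b": (8, 2, 12, 5),    # crosses the border between the cells
        "c": (11, 1, 19, 9),   # inside cell 1
        "d": (0, 3, 2, 4),     # touches the outer border of cell 0
    }
    print(classify(lines, grid))
```

Because features in `single` never share a border with another cell, they can be handed to separate workers without two workers editing the same area.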
  • 26. Process lines covered by single cells: 1. create topo. Callouts: create the topo object; extreme performance; snap-to value; used to find attributes.
        SELECT topology.toTopoGeom(geo, 'topo_ar5_sysdata', 1, 0.0000000001) as geo, sl_sdeid
        FROM (
          select arl.sl_sdeid, arl.geo
          from org_ar5.ar5_linje arl
          where cell_geo_in && arl.geo
            and ST_Contains(cell_geo_in, arl.geo)
            and arl.objType not in ('KantUtsnitt')
            and NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where arl.sl_sdeid=f.sl_sdeid)
        ) AS a
  • 27. Merge cells and collect cell borders.
        -- merge cells
        ( SELECT ST_union(cell.geo) as cell_union
          FROM topo_ar5.cell_ad as cell
          WHERE cell.id >= cell_min_in and cell.id < (stop_cell_id)
        ) AS r2
        -- get cell borders
        FROM (
          SELECT (ST_Dump(grid_lines)).geom AS grid_line
          FROM (
            SELECT ST_Collect(ST_ExteriorRing(cell.geo)) as grid_lines
            FROM topo_ar5.cell_ad as cell
            WHERE cell.id >= cell_min_in and cell.id < (stop_cell_id)
          ) AS r
        ) AS r,
  • 28. Use merged cells and cell borders to find new lines.
        ....
        WHERE ST_intersects(r.grid_line, arl.geo)
          AND NOT EXISTS (
            select edge_id from topo_ar5_sysdata.edge_data
            where ST_Intersects(geom, arl.geo) and ST_Intersects(geom, r.grid_line)
          )
          AND arl.objType not in ('KantUtsnitt')
          AND NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where arl.sl_sdeid=f.sl_sdeid)
        ...
        -- covered by merged cell
        WHERE ST_Contains(r2.cell_union, arl.geo)
          AND NOT EXISTS ( select sl_sdeid from topo_ar5.added_edges f where arl.sl_sdeid=f.sl_sdeid)
  • 29. Process lines covered by single cells: 2. add attributes. Callouts: map by id; add attributes using user defined types.
        SELECT distinct ON (edge_id) edge_id,
          topo_ar5.ar5_omkod_objtype_2_kode(b.objtype) as objtype_kode,
          aravgrtype, b.datafangstdato,
          ARRAY[b.informasjon] as informasjon,
          (b.maalemetode,b.noyaktighet,b.synbarhet)::topo_ar5.sosi_kvalitet as kvalitet,
          b.opphav, b.verifiseringsdato,
          (b.registreringsversjon,4.5)::topo_ar5.sosi_registreringsversjon as registreringsversjon,
          b.sl_sdeid
        FROM (
          select r.element_id as edge_id, arl.*
          FROM relation_ids_added ra, topo_ar5_sysdata.relation r, org_ar5.ar5_linje arl
          WHERE ra.topogeo_id = r.topogeo_id
            and ra.layer_id = r.layer_id
            and arl.sl_sdeid = ra.sl_sdeid
        ) AS b
  • 30. Process surfaces covered by single cells: 1. add topo. Callouts: build surface created; find surfaces inside current cell; create surface topo geo.
        INSERT INTO topo_ar5.ar5_topo_flate (geo)
        SELECT topology.CreateTopoGeom('topo_ar5_sysdata',3,2,topoelementarray ) as geo
        from (
          select distinct ST_GetFaceGeometry('topo_ar5_sysdata',l.face_id) as geo,
            topology.TopoElementArray_Agg(ARRAY[l.face_id,3]) as topoelementarray,
            ST_union(l.mbr) as union_face
          From topo_ar5_sysdata.face as l, topo_ar5.cell_ad cell
          where cell.id = cell_nr_in
            and ST_Contains(cell.geo,l.mbr)
            and NOT EXISTS (
              select re.element_id from topo_ar5_sysdata.relation re
              where re.layer_id = 2 and re.element_id = l.face_id
            )
          group by l.face_id
        ) as r1, topo_ar5.cell_ad cell
        where cell.id = cell_nr_in
          and ST_Contains(cell.geo, ST_Boundary(r1.union_face));
  • 31. Process surfaces covered by single cells: 2. update simple geo. Just cast from the topo geometry.
        update topo_ar5.ar5_topo_flate AS f
        set simple_geo = geo::geometry
        from arf_id as ft
        where f.id = ft.id_temp;
  • 32. Process surfaces covered by single cells: 2. update attributes. Find the data by using the representation point.
        -- update the rest of the attributes
        update topo_ar5.ar5_topo_flate as f
        SET (artype, arskogbon, artreslag,argrunnf,felles_egenskaper) =
          (c.artype,c.arskogbon,c.artreslag,c.argrunnf,
           (datafangstdato,informasjon,null, kvalitet,null,opphav,null,
            registreringsversjon,verifiseringsdato)::topo_ar5.sosi_felles_egenskaper )
        FROM (
          SELECT b.artype, b.arskogbon, b.artreslag, b.argrunnf, b.id_temp, b.datafangstdato,
            ARRAY[b.informasjon] as informasjon,
            (b.maalemetode,b.noyaktighet,b.synbarhet)::topo_ar5.sosi_kvalitet as kvalitet,
            b.opphav, b.verifiseringsdato,
            (b.registreringsversjon,'4.5')::topo_ar5.sosi_registreringsversjon as registreringsversjon
          FROM (
            select p.*, ft.id_temp
            from org_ar5.ar5_punkt as p, arf_id as ft, topo_ar5.ar5_topo_flate as f2
            where f2.id = ft.id_temp and ST_Covers(f2.simple_geo,p.geo)
          ) as b
        ) AS c
        where f.id = c.id_temp;
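The attribute update relies on each surface having a representation point that carries its attributes: whichever new face covers the point inherits them. A minimal Python sketch of that lookup is shown here, with a plain ray-casting test standing in for `ST_Covers`. The polygons, the `artype` values and the function names are made up for illustration.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def point_in_polygon(pt: Point, ring: List[Point]) -> bool:
    """Ray-casting test: count how many ring edges a ray going right from pt crosses."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through pt can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def attach_attributes(surfaces: Dict[int, List[Point]],
                      rep_points: List[Tuple[Point, dict]]) -> Dict[int, dict]:
    """Give each surface the attributes of the first representation point it contains."""
    result: Dict[int, dict] = {}
    for surface_id, ring in surfaces.items():
        for pt, attrs in rep_points:
            if point_in_polygon(pt, ring):
                result[surface_id] = attrs
                break
    return result


if __name__ == "__main__":
    faces = {
        1: [(0, 0), (4, 0), (4, 4), (0, 4)],
        2: [(4, 0), (8, 0), (8, 4), (4, 4)],
    }
    points = [((2, 2), {"artype": 30}), ((6, 1), {"artype": 21})]
    print(attach_attributes(faces, points))
```

The two squares share the edge from (4, 0) to (4, 4), as neighbouring faces in a topology would, and each point falls in exactly one of them.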
  • 33. Test performance for the migration process (16 dual core CPUs and SSD disks).
        1 parallel thread:   function_create_topo_ar5.sh vroom2 1 13000 200
        15 parallel threads: function_create_topo_ar5.sh vroom2 15 13000 200
        20 parallel threads: function_create_topo_ar5.sh vroom2 20 13000 200
  • 34. Decreasing processing time when increasing number of parallel threads.
        Number of threads | Total runtime in hours
        1                 | 108
        15                | 16
        20                | 18
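The runtimes on this slide translate directly into speedup and parallel efficiency. A small Python sketch of that arithmetic, using only the three runtimes from the slide (the function name `speedup_table` is ours):

```python
from typing import Dict, List, Tuple

# Total runtime in hours per number of parallel threads, as reported on the slide.
RUNTIME_HOURS: Dict[int, float] = {1: 108, 15: 16, 20: 18}


def speedup_table(runtimes: Dict[int, float]) -> List[Tuple[int, float, float]]:
    """Return (threads, speedup, efficiency) rows relative to the 1-thread run.

    speedup    = single-thread runtime / runtime with N threads
    efficiency = speedup / N
    """
    base = runtimes[1]
    rows = []
    for threads in sorted(runtimes):
        speedup = base / runtimes[threads]
        rows.append((threads, speedup, speedup / threads))
    return rows


if __name__ == "__main__":
    for threads, speedup, efficiency in speedup_table(RUNTIME_HOURS):
        print(f"{threads:>2} threads: speedup {speedup:.2f}x, efficiency {efficiency:.0%}")
```

With these figures, 15 threads give a speedup of 108 / 16 = 6.75 and 20 threads give 108 / 18 = 6.0, so going from 15 to 20 threads made the total run slower rather than faster.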
  • 35. Average operations per second for the 4 different operation types with different numbers of threads.
        Number of threads | A: Single cell linestrings | B: Multi cell linestrings | C: Single cell surfaces | D: Multi cell surfaces
        1                 | 91                         | 9                         | 305                     | 5
        15                | 1043                       | 48                        | 972                     | 21
        20                | 814                        | 48                        | 934                     | 27
  • 36. Average operations per second at every hour when running single threaded. [Chart: x-axis shows hours and operation type (A, B, C, D); y-axis shows operations per second, 0 to 500.]
  • 37. Average operations per second at every hour when running 15 parallel threads. [Chart: x-axis shows hours and operation type (A, B, C, D); y-axis shows operations per second, 0 to 1800.]
  • 38. Summary: converting AR5 to Postgis Topology. ● Content balanced grid and parallel threads. ● Two parallel threads cannot work in the same area. ● Function based index topo_ar5.get_relation_id(geo TopoGeometry) and indexes on the relation table. ● Heavy use of the && operator. ● 16 hours of processing time is OK since this is a one-time operation. ● ValidateTopology('topo_ar5_sysdata') shows no errors.
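The rule that two parallel threads cannot work in the same area is what the grid is for: each worker gets one whole cell, and cells do not overlap. The talk uses GNU parallel and PL/pgSQL functions for this. The following Python sketch only illustrates the scheduling idea with a thread pool, and `process_cell` is a hypothetical stand-in for the per-cell database call.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Dict, List, Tuple


def process_cell(cell_id: int, features_by_cell: Dict[int, List[str]]) -> Tuple[int, int]:
    """Hypothetical per-cell job: 'process' every feature that belongs to this cell.

    Here it only counts them; in the real pipeline this would be one call to a
    database function that builds topology for the cell.
    """
    return cell_id, len(features_by_cell[cell_id])


def run_parallel(features_by_cell: Dict[int, List[str]], workers: int) -> Dict[int, int]:
    """Hand out one cell per task, so no two workers ever touch the same area."""
    cell_ids = sorted(features_by_cell)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda cell_id: process_cell(cell_id, features_by_cell), cell_ids)
        return dict(results)


if __name__ == "__main__":
    work = {1: ["a", "b"], 2: ["c"], 3: []}
    print(run_parallel(work, workers=2))
```

Features that cross cell borders are left out of this first pass and handled later on merged cells, as in operations B and D.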
  • 39. HOW to update the Postgis Topology layer. ● Draw a line and set attribute values ● Use stored procedures ● Use one single transaction ● Rollback if any errors ● Java backend with JSON API ● Simple test client using this API
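The "one single transaction, rollback if any errors" rule can be shown in a few lines. The sketch below uses Python's built-in `sqlite3` module as a stand-in for PostgreSQL, with two made-up tables; the CHECK constraint mirrors the `artype > 0 and artype < 100` rule from the surface table. It is an illustration of the pattern, not the Java backend or the stored procedures from the talk.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Create an in-memory database with a line table and a surface table (both hypothetical)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE line (id INTEGER PRIMARY KEY)")
    conn.execute(
        "CREATE TABLE surface ("
        " line_id INTEGER NOT NULL,"
        " artype INTEGER NOT NULL CHECK (artype > 0 AND artype < 100))"
    )
    return conn


def apply_edit(conn: sqlite3.Connection, line_id: int, artype: int) -> bool:
    """Add a line and its surface attribute in one transaction.

    Using the connection as a context manager commits if both statements
    succeed and rolls back both if either one fails.
    """
    try:
        with conn:
            conn.execute("INSERT INTO line (id) VALUES (?)", (line_id,))
            conn.execute(
                "INSERT INTO surface (line_id, artype) VALUES (?, ?)",
                (line_id, artype),
            )
        return True
    except sqlite3.IntegrityError:
        return False


if __name__ == "__main__":
    db = make_db()
    print(apply_edit(db, 1, 30))    # valid attribute value
    print(apply_edit(db, 2, 150))   # violates the CHECK constraint
    print(db.execute("SELECT count(*) FROM line").fetchone()[0])
```

The second edit fails on the attribute value, and the line inserted just before it is rolled back as well, so the map never ends up with a line that has no valid surface attribute.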
  • 40. Two comments about updates. 1) Jostein, head of AR5: “Don't delete old lines, it's nice to know the history behind changes”. 2) Ingvild, my boss: “Why do I have to move old lines around with many hundreds of points, why can't I just give you a new simple line that just shows the difference?”
  • 41. Edit Topology data with surface data. Draw a polygon. Split a polygon. Update surface attributes. Extend a polygon.
  • 42. Edit Topology: Split a polygon - Original map
  • 43. Edit Topology: Split a polygon - Input: point, line, attribute values
  • 44. Edit Topology: What happens when you have a split surface operation. The Java backend calls: apply_line_on_topo_flate( geo_in geometry, p_in geometry, artype_in int, arskogbon_in int, artreslag_in int, argrunnf_in int). And the following happens:
        - Adjust the input line to the current data, taking into account that equal surfaces must stay equal
        - Compute the area to be updated
        - Take a copy of the unchanged data
        - Take a copy of the data that may change
        - Clear data from the line attribute table
        - Clear data from the topo surface layer and delete the rows to be changed
        - Add the adjusted line with topology.toTopoGeom
        - Update the line attribute table
        - Create new surfaces with the new attribute values
        - Create old surfaces with the old values
        - Check that the unchanged area is still the same
  • 45. Edit Topology: Timing issues when you have a split surface operation. The Java backend calls this function: topo_ar5.apply_line_on_topo_flate( geo_in geometry, p_in geometry, artype_in int, arskogbon_in int, artreslag_in int, argrunnf_in int). Small operations that include few changes take about 1000 ms, but bigger operations may take minutes. See http://trac.osgeo.org/postgis/ticket/2083
  • 46. Edit Topology: Split a polygon - New map
  • 47. Edit Topology: Extend a polygon
  • 48. Edit Topology: Extend a polygon
  • 49. Edit Topology: Extend a polygon. The Java backend calls this function: apply_line_on_topo_flate( geo_in geometry, p_in geometry, artype_in int, arskogbon_in int, artreslag_in int, argrunnf_in int), where p_in (0.0) means not set.
  • 50. Edit Topology: Extend a polygon
  • 51. Edit Topology: Draw a new polygon
  • 52. Edit Topology: Draw a new polygon
  • 53. Edit Topology: Draw a new polygon. The Java backend calls this function: apply_polygon_on_topo_flate( geo_in geometry, artype_in int, arskogbon_in int, artreslag_in int, argrunnf_in int )
  • 54. Edit Topology: Draw a new polygon
  • 55. Further plans this year. ● Add many new layers to Postgis Topology this fall and adjust the Topology model to new requirements. ● Create a client that uses the JSON API to update topology layers. ● Extend the update API with more functionality. ● We have to work more on performance, topology usage and the update client for AR5.
  • 56. Postgis Topology is a great tool: you can add one billion points, and it is possible to update it afterwards. Thanks to everybody who has contributed to Postgis Topology and other open source tools. Questions?
  • 57. Thanks for your attention.