OpenSpending.org
                           Current State




Stefan Urbanek
stefan.urbanek@gmail.com                   June 2011
@Stiivi
Data
Data Store
Collections
[Diagram: collections grouped under "data" and "metadata": dataset, entry, entity, classifier, dimension]
Dimension types
type         stored as
classifier   collection
entity       collection
string       scalar field
float        scalar field
date         hash/dictionary

no implicit hierarchy
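
The table above can be read as a lookup from dimension type to storage kind. A minimal sketch, assuming nothing about the real OpenSpending code: `STORAGE_BY_TYPE` and `storage_for` are hypothetical names introduced only for illustration.

```python
# Hypothetical lookup mirroring the "Dimension types" slide.
# None of these names come from the OpenSpending code base.
STORAGE_BY_TYPE = {
    "classifier": "collection",
    "entity": "collection",
    "string": "scalar field",
    "float": "scalar field",
    "date": "hash/dictionary",
}


def storage_for(dimension_type):
    """Return the storage kind listed on the slide for a dimension type."""
    try:
        return STORAGE_BY_TYPE[dimension_type]
    except KeyError:
        raise ValueError("unknown dimension type: %r" % (dimension_type,))


if __name__ == "__main__":
    for name in sorted(STORAGE_BY_TYPE):
        print(name, "->", storage_for(name))
```

Because there is no implicit hierarchy, the lookup is flat: a type says how a value is stored, not how it rolls up.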
Time

■   time is a range (from, to)
■   compound object
■   dataset specifies date for analytics
■   in cubes it is split into year + month + day
■   date hierarchy is hardwired in app logic
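
A minimal sketch of the points above, assuming a time value is a dictionary with `from` and `to` dates. The function names and the `use` parameter are hypothetical, not taken from the application.

```python
from datetime import date


def split_date(d):
    """Split a date into the year/month/day levels used in cubes."""
    return {"year": d.year, "month": d.month, "day": d.day}


def analytic_date(time_range, use="from"):
    """Pick the end of the range that a dataset designates for analytics.

    `use` stands in for the dataset-level setting mentioned on the slide;
    how the real application stores that setting is not shown here.
    """
    if use not in ("from", "to"):
        raise ValueError("use must be 'from' or 'to'")
    return split_date(time_range[use])


if __name__ == "__main__":
    # Illustrative compound time object: a range with two dates.
    entry_time = {"from": date(2010, 4, 1), "to": date(2011, 3, 31)}
    print(analytic_date(entry_time, use="from"))
```

The fixed year, month, day split in `split_date` is the kind of hierarchy the slide describes as hardwired in app logic.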
amount
the only measure
Extraction, Transformation, Loading
OpenSpending.org ETL – overview

[Diagram, components as labelled:
  "raw" data resource → datapkg (ckan-datapkg) → locally cached resource
  CSV file + mapping → paster csvimport
  wdmmg-ext (extraction modules): loading scripts → paster load; "cofog still running from here"
  wdmmg (openspending pylons application) → pylons managed data store with ORM]
[Diagram: CSV file columns (classification, entity, date from/to, amnt) are mapped into an entry]
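
The CSV-to-entry step can be sketched as follows. This is an illustration only: the column names follow the diagram labels (classification, entity, from, to, amnt), while `row_to_entry`, `load_entries`, and the shape of the resulting entry are assumptions, not the actual csvimport mapping format.

```python
import csv
import io


def row_to_entry(row):
    """Turn one CSV row into an entry-like dictionary (hypothetical shape)."""
    return {
        "classifier": row["classification"],
        "entity": row["entity"],
        # Time is kept as a compound object with both ends of the range.
        "time": {"from": row["from"], "to": row["to"]},
        # amount is the single measure, so it is converted to a number.
        "amount": float(row["amnt"]),
    }


def load_entries(csv_text):
    """Parse CSV text with a header row and map every row to an entry."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row_to_entry(row) for row in reader]


if __name__ == "__main__":
    # Invented sample data for the example.
    sample = (
        "classification,entity,from,to,amnt\n"
        "04.5,Department A,2010-01-01,2010-12-31,1200.50\n"
    )
    for entry in load_entries(sample):
        print(entry)
```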
Aggregations
dataset* ≈ cube

*still as dimension

roll-up is aggregated “on the fly”
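
A minimal sketch of an on-the-fly roll-up, assuming entries are flat dictionaries with an `amount` measure. `roll_up` is a hypothetical helper, not the application's aggregation code: totals are computed at query time rather than read from precomputed aggregates.

```python
from collections import defaultdict


def roll_up(entries, dimension):
    """Sum the amount measure per value of one dimension, at query time."""
    totals = defaultdict(float)
    for entry in entries:
        totals[entry[dimension]] += entry["amount"]
    return dict(totals)


if __name__ == "__main__":
    # Invented sample entries for the example.
    entries = [
        {"year": 2010, "entity": "A", "amount": 100.0},
        {"year": 2010, "entity": "B", "amount": 200.0},
        {"year": 2011, "entity": "A", "amount": 50.0},
    ]
    print(roll_up(entries, "year"))
    print(roll_up(entries, "entity"))
```

Each call scans all entries, so the cost grows with the size of the dataset. That is the trade-off of aggregating on the fly.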
