In memory OLAP engine
Samuel Pelletier	

Kaviju inc.	

samuel@kaviju.com
OLAP ?
• An acronym for OnLine Analytical Processing.	

• In simple words, a system to query a multidimensional data set
and get answers fast for interactive reports.

• A well-known implementation is an Excel Pivot Table.
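To make the pivot table idea concrete, here is a minimal sketch of a two-dimensional aggregation in plain Java. The Sale class, its region and product fields, and the sample figures are hypothetical and exist only for illustration; they are not part of the engine presented in these slides.

```java
import java.util.LinkedHashMap;
import java.util.Map;

final class PivotSketch {
    // Sum the measure for every (region, product) cell, like a small pivot table.
    static Map<String, Map<String, Double>> pivot(Sale[] sales) {
        Map<String, Map<String, Double>> cells = new LinkedHashMap<>();
        for (Sale sale : sales) {
            cells.computeIfAbsent(sale.region, key -> new LinkedHashMap<>())
                 .merge(sale.product, sale.amount, Double::sum);
        }
        return cells;
    }

    public static void main(String[] args) {
        Sale[] sales = {
            new Sale("East", "Widget", 10.0),
            new Sale("West", "Widget", 5.0),
            new Sale("East", "Gadget", 2.5),
            new Sale("East", "Widget", 4.0),
        };
        System.out.println(pivot(sales)); // prints {East={Widget=14.0, Gadget=2.5}, West={Widget=5.0}}
    }
}

// Hypothetical fact type: two dimensions (region, product) and one measure (amount).
final class Sale {
    final String region;
    final String product;
    final double amount;

    Sale(String region, String product, double amount) {
        this.region = region;
        this.product = product;
        this.amount = amount;
    }
}
```

Each cell of the result is the sum of one measure for one combination of dimension values, which is what a pivot table displays.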
Why build something new
• I wanted something fast and memory efficient for simple queries with
millions of facts.

• SQL queries do not work well for millions of facts with multiple
dimensions, especially with a large number of rows.

• There are specialized tools for OLAP from Microsoft, Oracle and
others but they are large and expensive, too much for my needs.	

• Generic, cheap toolkits are not memory efficient; this is the cost of
their simplicity.

• I wanted a simple solution to deploy with minimal dependencies.
Memory usage and time to
retrieve 1 000 000 invoice lines
• Fetching EOs uses 1.2 GB of RAM in 13-19 s.

• Fetching raw rows uses 750 MB of RAM in 5-8 s.

• Fetching as POJOs with JDBC uses 130 MB in 4.0 s.

• Reading from file as POJOs uses 130 MB in 1.4 s.	

• For 7 M rows, EOs would require 8.4 GB for gazillions of small
objects (bad for the GC).
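The figures above can be turned into a per-row cost and projected to other row counts. The sketch below assumes that memory grows linearly with the number of rows and treats 1 MB as 1,000,000 bytes; the class and method names are made up for the example.

```java
final class MemoryEstimate {
    // Bytes per row, given the total megabytes measured for a number of rows.
    static double bytesPerRow(double totalMegabytes, long rows) {
        return totalMegabytes * 1_000_000.0 / rows;
    }

    // Projected megabytes for another row count, assuming linear growth.
    static double projectedMegabytes(double bytesPerRow, long rows) {
        return bytesPerRow * rows / 1_000_000.0;
    }

    public static void main(String[] args) {
        long measuredRows = 1_000_000L;
        double eoBytes = bytesPerRow(1200.0, measuredRows);  // EOs: 1.2 GB measured
        double pojoBytes = bytesPerRow(130.0, measuredRows); // POJOs: 130 MB measured
        System.out.printf("EO:   %.0f bytes/row, %.0f MB for 7 M rows%n",
                eoBytes, projectedMegabytes(eoBytes, 7_000_000L));
        System.out.printf("POJO: %.0f bytes/row, %.0f MB for 7 M rows%n",
                pojoBytes, projectedMegabytes(pojoBytes, 7_000_000L));
    }
}
```

Under these assumptions an EO costs about 1200 bytes per row (8400 MB for 7 M rows) while a POJO costs about 130 bytes per row (910 MB for 7 M rows).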
Time to compute sum of sales for
1 000 000 invoice lines
• 2.1 s for "select sum(sales)..." in FrontBase with table in RAM.	

• 0.5 s for @sum.sales on EOs.	

• 0.12 s for @sum.sales on raw rows.	

• 0.5 s for @sum.sales on POJOs.	

• 0.009 s for a loop with direct attribute access on POJOs.
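The last two measurements differ only in how the attribute is read. The following sketch contrasts a loop with direct field access against access by name. Java reflection is used here purely as a stand-in for key-path access such as @sum.sales; it is not how WebObjects implements key-value coding, and the Line class is hypothetical. Timings will vary by machine and JVM.

```java
import java.lang.reflect.Field;

final class SumSketch {
    // Direct field access: no lookup by name, no boxing.
    static double sumDirect(Line[] lines) {
        double total = 0.0;
        for (Line line : lines) {
            total += line.sales;
        }
        return total;
    }

    // Access by field name through reflection; every read boxes the float.
    static double sumByName(Line[] lines, String fieldName) {
        try {
            Field field = Line.class.getDeclaredField(fieldName);
            field.setAccessible(true);
            double total = 0.0;
            for (Line line : lines) {
                total += ((Number) field.get(line)).doubleValue();
            }
            return total;
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Cannot read field " + fieldName, e);
        }
    }

    public static void main(String[] args) {
        Line[] lines = new Line[1_000_000];
        for (int i = 0; i < lines.length; i++) {
            lines[i] = new Line(1.5f);
        }
        long t0 = System.nanoTime();
        double direct = sumDirect(lines);
        long t1 = System.nanoTime();
        double byName = sumByName(lines, "sales");
        long t2 = System.nanoTime();
        System.out.println("direct:  " + direct + " in " + (t1 - t0) / 1_000_000.0 + " ms");
        System.out.println("by name: " + byName + " in " + (t2 - t1) / 1_000_000.0 + " ms");
    }
}

// Hypothetical fact with a single measure.
final class Line {
    final float sales;

    Line(float sales) {
        this.sales = sales;
    }
}
```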
Some concepts
• Facts are the elements being analyzed. An example is invoice
lines.

• Facts contain measures like quantities, prices or amounts.

• Facts are linked to dimensions used to filter and aggregate
them. For invoice lines, we have product, invoice, date, etc.	

• Dimensions are often part of a hierarchy, for example, products
are in a product category, dates are in a month and in a week.
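As a sketch of these concepts, the example below models a fact with one measure, a product dimension and its parent category, then aggregates the same facts at both levels of the hierarchy. The Item and Fact classes and the sample data are hypothetical; the engine's own classes appear in the following slides.

```java
import java.util.Map;
import java.util.TreeMap;

final class HierarchySketch {
    // Aggregate the sales measure at the product level.
    static Map<String, Double> salesByProduct(Fact[] facts) {
        Map<String, Double> totals = new TreeMap<>();
        for (Fact fact : facts) {
            totals.merge(fact.product.name, fact.sales, Double::sum);
        }
        return totals;
    }

    // Roll the same measure up one level, to the product category.
    static Map<String, Double> salesByCategory(Fact[] facts) {
        Map<String, Double> totals = new TreeMap<>();
        for (Fact fact : facts) {
            totals.merge(fact.product.category, fact.sales, Double::sum);
        }
        return totals;
    }

    public static void main(String[] args) {
        Item apple = new Item("Apple", "Fruit");
        Item pear = new Item("Pear", "Fruit");
        Item leek = new Item("Leek", "Vegetable");
        Fact[] facts = {
            new Fact(apple, 3.0), new Fact(pear, 2.0), new Fact(leek, 4.0), new Fact(apple, 1.0),
        };
        System.out.println(salesByProduct(facts));  // prints {Apple=4.0, Leek=4.0, Pear=2.0}
        System.out.println(salesByCategory(facts)); // prints {Fruit=6.0, Vegetable=4.0}
    }
}

// Hypothetical dimension entity: a product and the category it belongs to.
final class Item {
    final String name;
    final String category;

    Item(String name, String category) {
        this.name = name;
        this.category = category;
    }
}

// Hypothetical fact: linked to one product, carrying one measure.
final class Fact {
    final Item product;
    final double sales;

    Fact(Item product, double sales) {
        this.product = product;
        this.sales = sales;
    }
}
```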
Sample Invoice dimension hierarchy
[Diagram: dimension hierarchy for the Invoice Line fact. Dimensions shown: Invoice, Date, Month, Week, Ship to, Sold to, Client type, Product, Salesman, SalesManager.]
Measures: Shipped Qty, Sales, Profits
Steps to implement an engine
• Create the Engine class.	

• Create required classes to model the dimension hierarchy. 	

• Create the Value class for your facts.

• Create the Group class that will compute summarized results.	

• Create the dimensions definition classes.
Engine class
• The Engine class extends OlapEngine with the Group and Value types.

    public class SalesEngine extends OlapEngine<GroupEntry,Value>

• Create the objects required for the data model and lookup table
used to load the facts.	

• Load the facts into Value objects.

• Create and register the dimensions.
Create required model objects
public class Product {
    public final int code;
    public final String name;
    public final ProductCategory category;

    public Product(int code, String name, ProductCategory category) {
        super();
        this.code = code;
        this.name = name;
        this.category = category;
    }
}

    private void loadProducts() {
        productsByCode = new HashMap<Integer, Product>();

        WOResourceManager resourceManager = ERXApplication.application().resourceManager();
        String fileName = "olapData/products.txt";
        try (InputStream fileData = resourceManager.inputStreamForResourceNamed(fileName, null, null)) {
            InputStreamReader fileReader = new InputStreamReader(fileData, "utf-8");
            BufferedReader reader = new BufferedReader(fileReader);
            String line;
            while ((line = reader.readLine()) != null) {
                // Columns are tab-separated; the -1 limit keeps trailing empty columns.
                String[] cols = line.split("\t", -1);
                Product product = new Product(Integer.parseInt(cols[0]), cols[0], categoryWithID(cols[1]));

                productsByCode.put(product.code, product);
            }
        }
        ...
    }
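The loader splits each line on tabs with a limit of -1. A short sketch of why that limit matters: without it, String.split drops trailing empty strings, so a record whose last columns are empty would yield a shorter array and later cols[n] lookups could fail. The sample record is made up.

```java
import java.util.Arrays;

final class SplitSketch {
    // Split one tab-separated record, keeping trailing empty columns.
    static String[] columns(String line) {
        return line.split("\t", -1);
    }

    public static void main(String[] args) {
        String line = "42\tWidget\t\t"; // the last two columns are empty
        System.out.println(line.split("\t").length); // prints 2
        System.out.println(columns(line).length);    // prints 4
        System.out.println(Arrays.toString(columns(line)));
    }
}
```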
Load the facts and create dimensions
    private void loadInvoiceLines() {
        ...
        loadProductCategories();
        loadProducts();

        InvoiceDimension invoiceDim = new InvoiceDimension(this);
        SalesmanDimension salesmanDim = new SalesmanDimension(this);
            while ((line = reader.readLine()) != null) {
                // Columns are tab-separated; the -1 limit keeps trailing empty columns.
                String[] cols = line.split("\t", -1);

                InvoiceLine invoiceLine = new InvoiceLine(valueIndex++, Short.parseShort(cols[1]));
                invoiceLine.shippedQty = Integer.parseInt(cols[6]);
                invoiceLine.sales = Float.parseFloat(cols[7]);
                invoiceLine.profits = Float.parseFloat(cols[8]);
                lines.add(invoiceLine);

                invoiceDim.addLine(invoiceLine, cols[0], cols);

                invoiceLine.salesmanNumber = Integer.parseInt(cols[12]);
                salesmanDim.addIndexEntry(invoiceLine.salesmanNumber, invoiceLine);
                ...
            }
        }
        addDimension(productDimension);
        addDimension(productDimension.createProductCategoryDimension());
        ...
        lines.trimToSize();
        setValues(lines);
    }
Value and GroupEntry classes
• The Value class contains your basic facts (invoice lines, for example).

    public class InvoiceLine extends OlapValue<Sales>

• GroupEntry is used to compute summarized results.

    public class Sales extends GroupEntry<InvoiceLine>

• These are tightly coupled: a GroupEntry represents a computed
result for an array of Values; metrics are found in both classes.
Value Class
public class InvoiceLine extends OlapValue<Sales> {
    public Invoice invoice;
    public final short lineNumber;

    public Product product;

    public int shippedQty;
    public float sales;
    public float profits;

    public int salesmanNumber;
    public int salesManagerNumber;

    public InvoiceLine(int valueIndex, short lineNumber) {
        super(valueIndex);
        this.lineNumber = lineNumber;
    }
}
GroupEntry class
public class Sales extends GroupEntry<InvoiceLine> {
    private int shippedQty;
    private double sales = 0.0;
    private double profits = 0.0;

    public Sales(GroupEntryKey<Sales, InvoiceLine> key) {
        super(key);
    }

    @Override
    public void addEntry(InvoiceLine entry) {
        shippedQty += entry.shippedQty;
        sales += entry.sales;
        profits += entry.profits;
    }

    @Override
    public void optimizeMemoryUsage() {
    }

    // The accessor's signature was lost on the slide; the name sales() is assumed.
    public double sales() {
        return sales;
    }

    ...
}
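The Sales class keeps its totals in double fields even though each InvoiceLine stores float measures. A likely reason, though the slides do not state it, is that summing a million floats into a float accumulator loses precision. The sketch below illustrates the effect with made-up data; the class and method names are hypothetical.

```java
import java.util.Arrays;

final class AccumulatorSketch {
    // Accumulate in a float: rounding errors grow as the total gets large.
    static float sumAsFloat(float[] values) {
        float total = 0f;
        for (float value : values) {
            total += value;
        }
        return total;
    }

    // Accumulate the same floats in a double, as the Sales class does.
    static double sumAsDouble(float[] values) {
        double total = 0.0;
        for (float value : values) {
            total += value;
        }
        return total;
    }

    public static void main(String[] args) {
        float[] sales = new float[1_000_000];
        Arrays.fill(sales, 0.1f);
        System.out.println("float accumulator:  " + sumAsFloat(sales));
        System.out.println("double accumulator: " + sumAsDouble(sales));
    }
}
```

The double total stays very close to 100 000, while the float total drifts visibly away from it. Storing the measures as float still halves the per-fact memory for those fields.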
Dimensions classes
• Dimensions implement the engine indexes and key extraction for
result aggregation.	

• Dimensions are usually linked to another class representing an
entity like Invoice, Client, Product or ProductCategory.

• Entities are value-object POJOs for optimal speed and memory
usage. You may add a method to get the corresponding EO.

• Dimensions are either leaf (a group of facts) or group (a group of
leaf entries).
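One way to picture the two kinds of dimension is as two levels of index. The sketch below is only a guess at the general shape, using plain collections; it is not the engine's actual implementation, and every name in it is hypothetical. A leaf index maps a key to the positions of its facts, and a group lookup gathers the facts of all leaf keys that belong to the group.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class IndexSketch {
    // Leaf index: dimension key -> positions of the facts that carry that key.
    static Map<Integer, List<Integer>> leafIndex(int[] productCodeOfFact) {
        Map<Integer, List<Integer>> index = new HashMap<>();
        for (int i = 0; i < productCodeOfFact.length; i++) {
            index.computeIfAbsent(productCodeOfFact[i], key -> new ArrayList<>()).add(i);
        }
        return index;
    }

    // Group lookup: collect the facts of every leaf key that belongs to the group.
    static List<Integer> factsForGroup(List<Integer> leafKeys, Map<Integer, List<Integer>> leafIndex) {
        List<Integer> facts = new ArrayList<>();
        for (Integer leafKey : leafKeys) {
            facts.addAll(leafIndex.getOrDefault(leafKey, new ArrayList<>()));
        }
        return facts;
    }

    public static void main(String[] args) {
        int[] productCodeOfFact = {100, 200, 100, 300}; // product code of facts 0..3
        Map<Integer, List<Integer>> byProduct = leafIndex(productCodeOfFact);

        List<Integer> productsInCategory = new ArrayList<>();
        productsInCategory.add(100);
        productsInCategory.add(300);

        System.out.println(factsForGroup(productsInCategory, byProduct)); // prints [0, 2, 3]
    }
}
```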
Product dimension class
public class ProductDimension extends OlapLeafDimension<Sales,Integer,InvoiceLine> {

    public ProductDimension(OlapEngine<Sales, InvoiceLine> engine) {
        super(engine, "productCode");
    }

    @Override
    public Integer getKeyForEntry(InvoiceLine entry) {
        return entry.product.code;
    }

    @Override
    public Integer getKeyForString(String keyString) {
        return Integer.valueOf(keyString);
    }

    public ProductCategoryDimension createProductCategoryDimension() {
        long startTime = System.currentTimeMillis();
        ProductCategoryDimension dimension = new ProductCategoryDimension(engine, this);

        // Index each product (leaf key) under its category (group key).
        for (Product product : salesEngine().products()) {
            dimension.addIndexEntry(product.category.categoryID, product.code);
        }
        long fetchTime = System.currentTimeMillis() - startTime;
        engine.logMessage("createProductCategoryDimension completed in " + fetchTime + "ms.");
        return dimension;
    }

    private SalesEngine salesEngine() {
        return (SalesEngine) engine;
    }
}
Product category dimension class
public class ProductCategoryDimension extends OlapGroupDimension<Sales,Integer,InvoiceLine,ProductDimension,Integer> {

    public ProductCategoryDimension(OlapEngine<Sales, InvoiceLine> engine, ProductDimension childDimension) {
        super(engine, "productCategoryCode", childDimension);
    }

    @Override
    public Integer getKeyForEntry(InvoiceLine entry) {
        return entry.product.category.categoryID;
    }

    @Override
    public Integer getKeyForString(String keyString) {
        return Integer.valueOf(keyString);
    }
}
Initialize and use in an app
• The engine can be used from multiple threads once loaded.

• I usually create a singleton for the engine; it can also be in your
app class.	

Use in an application
    public Application() {
        ...
        SalesEngine.createEngine();
    }

In the component that uses the engine:

    public OlapNavigator(WOContext context) {
        super(context);
        ...
        engine = SalesEngine.sharedEngine();
        if (engine == null) {
            // The engine may be null if it has not completed its loading...
        }
    }

    public void someFetchMethod() {
        OlapResult<Sales, InvoiceLine> result = engine.resultForRequest(query);

        rows = new NSArray<Sales>(result.getGroups());

        // Sort, or put inside an ERXDisplayGroup...
    }

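The createEngine() and sharedEngine() calls above suggest a singleton that is loaded in the background and stays null until it is ready. The slides do not show how SalesEngine implements this, so the following is only a minimal sketch of one possible approach, with a hypothetical EngineHolder class and placeholder data standing in for the real loading work.

```java
final class EngineHolder {
    // volatile: the loader thread publishes the instance, request threads read it.
    private static volatile EngineHolder shared;

    private final int[] values;

    private EngineHolder(int[] values) {
        this.values = values;
    }

    // Start loading in the background and return the loader thread.
    static Thread createEngine() {
        Thread loader = new Thread(() -> {
            int[] loaded = {1, 2, 3};          // placeholder for the real loading work
            shared = new EngineHolder(loaded); // publish only once fully loaded
        }, "engine-loader");
        loader.setDaemon(true);
        loader.start();
        return loader;
    }

    // Returns null until loading has completed.
    static EngineHolder sharedEngine() {
        return shared;
    }

    int valueCount() {
        return values.length;
    }

    public static void main(String[] args) throws InterruptedException {
        Thread loader = createEngine();
        loader.join(); // a real component would check for null instead of waiting
        System.out.println(sharedEngine().valueCount()); // prints 3
    }
}
```

Because the instance is assigned only after loading has finished and is never modified afterwards, readers see either null or a complete engine.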
Demo app
Java and memory
• To keep the garbage collector happy, it is better to have a
maximum heap at least 2-3 times the real usage.	

• GC can kill your app's performance if memory is starved. With the
default settings, it may even kill your server by using multiple cores
for long periods, at least in Java 1.5 and 1.6.

• Java 1.7 contains a new collector, which is probably better.
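A quick way to check the 2-3 times rule in a running app is to compare the maximum heap with the heap in use, using the standard Runtime methods. This is a rough sketch: the HeapCheck class and the 2.0 threshold are illustrative, and the "used" figure includes garbage that has not been collected yet, so it overstates real usage.

```java
final class HeapCheck {
    // Ratio of the maximum heap to the heap currently in use.
    static double headroomRatio(long maxBytes, long usedBytes) {
        return (double) maxBytes / usedBytes;
    }

    public static void main(String[] args) {
        Runtime runtime = Runtime.getRuntime();
        long used = runtime.totalMemory() - runtime.freeMemory();
        long max = runtime.maxMemory(); // upper bound, typically set with -Xmx
        double ratio = headroomRatio(max, used);
        System.out.printf("used=%d MB, max=%d MB, ratio=%.1f%n", used >> 20, max >> 20, ratio);
        if (ratio < 2.0) {
            System.out.println("Heap headroom is below 2x; consider raising -Xmx.");
        }
    }
}
```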
Q&A
Samuel Pelletier	

samuel@kaviju.com
