IOSR Journal of Computer Engineering (IOSR-JCE)
e-ISSN: 2278-0661, p-ISSN: 2278-8727, Volume 9, Issue 3 (Mar. - Apr. 2013), PP 01-05
www.iosrjournals.org

    Performance Improvement Techniques for Customized Data
                         Warehouse
                            Md. Al Mamun and Md. Humayun Kabir
Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka-1342, Bangladesh.

Abstract: In this paper, we present performance improvement techniques for data retrieval from customized
data warehouses, aimed at efficient querying and Online Analytical Processing (OLAP) through efficient
database and memory management. Database management techniques such as indexing and partitioning
play a vital role in efficient memory management. We compare the data retrieval time of a particular query on
a relational database and on a data warehouse database, with and without indexing. We show that
applying these database management techniques results in faster query execution by reducing data
retrieval time. This improvement may increase the efficiency of OLAP operations, which results in better
data warehouse performance.
Keywords - Data Warehouse, Indexing, OLAP, Partitioning, Querying.

                                               I.     Introduction
            Data retrieval from a relational database requires high access time when it stores millions of records,
a problem that can be mitigated using indexing [1]. Test data can be generated to populate test databases for
testing SQL queries [2]. Database management techniques such as partitioning [3] and indexing [1, 4, 5] can be
employed for faster data access. Customized database application software may be developed using these
techniques for faster data processing. When the number of records in a relational database becomes
very high, the query processing time becomes very long [6]. Data warehouses [7-9], which store consolidated
historic data, can be constructed from relational databases. A data warehouse (DW) database stores a much
smaller number of records than the relational database from which it is built [6]. This reduction in the number
of records results in a shorter data retrieval time [6]. Moreover, this retrieval time can be further
reduced using different indexing techniques.
          In this paper, we study different techniques to improve the performance of data retrieval from a
relational database as well as a DW [8, 9] database for different sizes of data. We use bitmap indexing
(BMI) [10] and data partitioning techniques to improve the performance of data retrieval from the data warehouse
database [6]. We also show that, for similar queries, data retrieval time for a data warehouse with bitmap
indexing [10] is much lower than that of a relational database. This suggests that a data warehouse with
bitmap indexing is more suitable for an enterprise for intelligent and faster decision making [3, 6]. The retrieval
time rises with the increase in data size.
            The paper is organized as follows. Section II presents the system architecture, Section III describes the
data retrieval time measurement technique, Section IV compares data retrieval time for an RDBMS
with and without indexing, Section V discusses data retrieval time with partitioning, Section VI compares
data retrieval time for different data sources with variable data sizes, and Section VII concludes.

                                        II.         System Architecture
           This section presents the architecture for constructing customized data warehouses from RDBMS
tables using the data definition language (DDL) of SQL. The data warehouse is populated with data from external
sources, e.g. a relational database system, in consolidated form using aggregation operations. We have created
dimension and fact tables from the RDBMS table schemas. Queries are executed on both the RDBMS tables and the DW
dimension and fact tables, with and without indexing, using Java programs. The data retrieval times are measured
and compared. Fig 1 represents the system architecture.
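
The construction step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it uses Python with the standard-library sqlite3 module in place of Oracle and Java, and the table and column names (student, dim_session, fact_result) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Raw RDBMS table (hypothetical schema): one row per student.
conn.execute(
    "CREATE TABLE student (id INTEGER PRIMARY KEY, session TEXT, "
    "exam_year INTEGER, cgpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [
        (1, "2002-2003", 2006, 4.0),
        (2, "2002-2003", 2006, 4.0),
        (3, "2002-2003", 2006, 3.5),
        (4, "2003-2004", 2007, 4.0),
    ],
)

# DDL for one dimension table and one fact table derived from the RDB schema.
conn.execute(
    "CREATE TABLE dim_session (session_id INTEGER PRIMARY KEY, "
    "session TEXT UNIQUE, exam_year INTEGER)"
)
conn.execute(
    "CREATE TABLE fact_result (session_id INTEGER REFERENCES dim_session(session_id), "
    "cgpa REAL, num_students INTEGER)"
)

# Populate the warehouse in consolidated form using aggregation.
conn.execute(
    "INSERT INTO dim_session (session, exam_year) "
    "SELECT DISTINCT session, exam_year FROM student ORDER BY session"
)
conn.execute(
    "INSERT INTO fact_result (session_id, cgpa, num_students) "
    "SELECT d.session_id, s.cgpa, COUNT(*) "
    "FROM student s JOIN dim_session d ON d.session = s.session "
    "GROUP BY d.session_id, s.cgpa"
)

fact_rows = conn.execute(
    "SELECT d.session, f.cgpa, f.num_students "
    "FROM fact_result f JOIN dim_session d USING (session_id) "
    "ORDER BY d.session, f.cgpa"
).fetchall()
raw_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(raw_count, "raw rows ->", len(fact_rows), "fact rows")
```

The fact table holds one row per (session, CGPA) group rather than one row per student, which is why the warehouse stores far fewer records than the source tables.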

                                        III.        Data Retrieval Time
      We have designed an algorithm for determining data retrieval time for the relational database system and the DW
database system [6]. We use the function DBTime() to determine the data retrieval time of executing a
particular query, $t_r = t_{end} - t_{start}$, in milliseconds (ms).
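
A timing function of this kind might look like the sketch below. The paper does not list the body of DBTime(), so the function name db_time, its signature, and the use of Python with sqlite3 are assumptions made for illustration only.

```python
import sqlite3
import time


def db_time(conn, query, params=()):
    """Run `query` once and return (rows, t_r), with t_r = t_end - t_start in ms."""
    t_start = time.perf_counter()
    rows = conn.execute(query, params).fetchall()  # fetch so retrieval is included
    t_end = time.perf_counter()
    return rows, (t_end - t_start) * 1000.0


# Hypothetical test table: every tenth student has CGPA 4.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER, cgpa REAL)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?)",
    [(i, 4.0 if i % 10 == 0 else 3.0) for i in range(1000)],
)

rows, t_r = db_time(conn, "SELECT id FROM student WHERE cgpa = ?", (4.0,))
print(len(rows), "rows retrieved in", round(t_r, 3), "ms")
```

Fetching all rows inside the timed region means the measurement covers retrieval and not only query submission. A single run is noisy, so repeating the call and averaging is a reasonable refinement.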





[Figure placeholder: block diagram. Data sources feed the RDBMS; DDL on the RDB schema produces the dimension
and fact tables, which are populated to form the data warehouse (data cube). A Java program issues SQL queries;
query execution takes the RDB and the populated DW as input, and a visual function presents the RDB output
and DW output as text/graphical output.]
                                            Figure 1: System Architecture

Retrieval time varies with the primary memory and processing speed of the computer on which the software
system is executed. The experiments were run on a laptop with a 2.53 GHz Core i3 processor, 2 GB of RAM and a
500 GB hard disk under Windows OS.

    IV.     Comparison Of Data Retrieval Time For Relational Databases With And Without
                                          Indexing
           We have applied indexing to student records stored in database relations using the Oracle RDBMS. The
developed performance improvement system generates a very large number of random student records with
varying CGPAs to populate the RDBMS tables. The developed prototype first determines the data retrieval time
of a particular query, without indexing, on RDBMS tables of different data sizes, as
shown in Table 1 [6].
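
A generator for such random student records could be sketched as below. The session labels are taken from Table 3; the record layout, the CGPA range of 2.0 to 4.0 and the uniform distribution are assumptions, since the paper does not specify them.

```python
import random

# Session labels as they appear in Table 3.
SESSIONS = [
    "2001-2002", "2002-2003", "2003-2004", "2004-2005",
    "2005-2006", "2006-2007", "2007-2008",
]


def generate_students(n, seed=0):
    """Return n records (student_id, session, cgpa) with randomly varied CGPAs."""
    rng = random.Random(seed)  # seeded so a test run can be repeated
    records = []
    for student_id in range(1, n + 1):
        session = rng.choice(SESSIONS)
        cgpa = round(rng.uniform(2.0, 4.0), 2)  # assumed CGPA range
        records.append((student_id, session, cgpa))
    return records


students = generate_students(1000)
print(students[:3])
```

Seeding the generator keeps the data sets identical across the indexed and non-indexed runs, so differences in retrieval time are not caused by different data.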

   Table 1 Data retrieval time without indexing and with indexing on tables of different data sizes.

    Data size                       Retrieval Time (ms)
    (number of records)    Without indexing    With indexing
    100000                 3324                2819
    200000                 10671               5457
    300000                 47727               35340

The database relations are then indexed and the same query is executed on the indexed relations. The
query execution times are recorded as shown in Table 1. Fig 2 plots query execution time without and with
indexing on different relations of variable data sizes. We notice that the data retrieval time for non-indexed
relations is more than that of indexed relations for a particular data size.




              Figure 2: Comparison of data retrieval time without indexing and with indexing for RDBMS.
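
The with/without-index comparison can be reproduced in miniature with the sketch below. It swaps in SQLite's ordinary B-tree index (via Python's sqlite3 module) for the Oracle setup used in the paper, so absolute timings will not match Table 1; the table name, index name and data distribution are hypothetical.

```python
import random
import sqlite3
import time

QUERY = "SELECT COUNT(*) FROM student WHERE cgpa = 4.0"


def timed_count(conn):
    """Execute QUERY once and return (count, elapsed milliseconds)."""
    start = time.perf_counter()
    (count,) = conn.execute(QUERY).fetchone()
    return count, (time.perf_counter() - start) * 1000.0


def query_plan(conn):
    """Return SQLite's plan description for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " ".join(row[-1] for row in rows)


rng = random.Random(0)
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, cgpa REAL)")
conn.executemany(
    "INSERT INTO student (cgpa) VALUES (?)",
    ((rng.choice([3.0, 3.5, 4.0]),) for _ in range(100_000)),
)

# First run: no index, so the engine must scan the whole table.
count_plain, ms_plain = timed_count(conn)
plan_plain = query_plan(conn)

# Second run: same query after indexing the filtered column.
conn.execute("CREATE INDEX idx_student_cgpa ON student (cgpa)")
count_indexed, ms_indexed = timed_count(conn)
plan_indexed = query_plan(conn)

print("without index:", round(ms_plain, 3), "ms |", plan_plain)
print("with index:   ", round(ms_indexed, 3), "ms |", plan_indexed)
```

Both runs must return the same count; only the access path changes. The query plan is printed alongside the timing because a single timing on a small in-memory table can be dominated by noise.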

                            V.     Data Retrieval Time Using Data Partitioning
         Data partitioning can speed up data processing during data retrieval. When physical memory is small,
a large volume of data can be partitioned into smaller segments that are loaded into primary
memory one at a time. This allows an application program to access data larger than the main memory size
successfully. However, if the number of partitions is large, data access
may take longer because switching between partitions adds overhead [3]. By increasing the physical memory size,
partitioning can be avoided or the number of partitions reduced, resulting in faster data access. Partitioning
combined with indexing may reduce data retrieval time even further.
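
The idea of loading one partition at a time can be illustrated with the sketch below, which processes a result set in fixed-size chunks so that at most one chunk is held in memory. This is an application-level illustration under assumed names (count_in_partitions, partition_size), not the partitioning mechanism used in the paper.

```python
import sqlite3


def count_in_partitions(conn, cgpa, partition_size):
    """Count rows with the given CGPA, holding at most partition_size rows in memory.

    Returns (matching_rows, partitions_loaded).
    """
    cursor = conn.execute("SELECT cgpa FROM student ORDER BY id")
    total = 0
    partitions = 0
    while True:
        chunk = cursor.fetchmany(partition_size)  # load the next partition
        if not chunk:
            break
        partitions += 1
        total += sum(1 for (value,) in chunk if value == cgpa)
    return total, partitions


# Hypothetical data: every fourth student has CGPA 4.0.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, cgpa REAL)")
conn.executemany(
    "INSERT INTO student (cgpa) VALUES (?)",
    [(4.0 if i % 4 == 0 else 3.0,) for i in range(1000)],
)

total, partitions = count_in_partitions(conn, 4.0, partition_size=300)
print(total, "matches found using", partitions, "partitions")
```

A smaller partition_size lowers peak memory use but raises the number of partitions loaded, which is the switching overhead the text refers to.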

                VI.      Comparison Of Data Retrieval Time For Different Data Sizes
     Consider the data retrieval time shown in Table 2, which is the time required by a query to select, out of
100000 student records, the 209 students of session 2002-2003 who obtained CGPA 4.0. We have defined and
executed queries to retrieve the records of Table 3 from the RDBMS and from the DW without and with bitmap indexing.

 Table 2 Data retrieval time (in ms) for RDBMS, Data Warehouse without indexing, and Data Warehouse with
                                                    BMI
                     RDBMS (indexed)      DW (non-indexed)      DW with BMI
                     access time          access time           access time
                     92                   41                    28

[Bar chart: Retrieval Time for RDBMS and DW. x-axis: Data Sources (RDBMS, Data Warehouse, Data Warehouse with
BMI); y-axis: Data Retrieval Time.]


Figure 3: Retrieval time for indexed RDBMS, Data Warehouse without indexing, and Data Warehouse with BMI for
selecting 209 students of a session.
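
To make the bitmap indexing idea concrete, the sketch below builds one bitmap per distinct column value and answers a two-condition query (session and CGPA) with a bitwise AND. It is a toy in-memory model using Python integers as bit vectors, with made-up rows; it is not the database's own bitmap index implementation.

```python
from collections import defaultdict


def build_bitmap_index(rows, column):
    """Map each distinct value of `column` to an int whose bit i is set
    when row i holds that value."""
    index = defaultdict(int)
    for position, row in enumerate(rows):
        index[row[column]] |= 1 << position
    return index


# Hypothetical rows; a real table would have one bit per stored record.
rows = [
    {"session": "2002-2003", "cgpa": 4.0},
    {"session": "2002-2003", "cgpa": 3.5},
    {"session": "2003-2004", "cgpa": 4.0},
    {"session": "2002-2003", "cgpa": 4.0},
]

by_session = build_bitmap_index(rows, "session")
by_cgpa = build_bitmap_index(rows, "cgpa")

# session = '2002-2003' AND cgpa = 4.0 becomes a single bitwise AND.
match = by_session["2002-2003"] & by_cgpa[4.0]
matching_positions = [i for i in range(len(rows)) if (match >> i) & 1]
match_count = bin(match).count("1")
print(matching_positions, match_count)
```

Columns with few distinct values, such as session or CGPA grade, need only a few bitmaps, which is the case where bitmap indexes are generally considered a good fit.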
   Consider the queries that retrieve the session, final exam year, CGPA, and number of students who
obtained CGPA 4.0, as shown in Table 3. The queries retrieve and count student records stored in the
RDBMS and the DW. Table 4 presents the data access time in ms for the indexed RDBMS, the non-indexed DW and the DW
with bitmap indexing.

                      TABLE 3 Query Output for students of all sessions with CGPA 4.0
                           Session      4th Year Final Exam    CGPA    No. of Students
                           2001-2002    2005                   4       503
                           2002-2003    2006                   4       209
                           2003-2004    2007                   4       236
                           2004-2005    2008                   4       265
                           2005-2006    2009                   4       241
                           2006-2007    2010                   4       239
                           2007-2008    2011                   4       261

                   TABLE 4 Data retrieval time (in ms) for students of all sessions with CGPA 4.0
                           Indexed RDBMS    Data Warehouse    Data Warehouse with BMI
                           555              223               207

    Fig 4 plots the data retrieval time shown in Table 4 for different data sources to retrieve the query output
shown in Table 3.





[Bar chart: Comparison of Data Retrieval Time. x-axis: Data Sources (RDBMS, DW, DW with BMI); y-axis:
Retrieval Time.]


             Figure 4: Comparison of data retrieval time for different data sources without and with indexing

     Queries on the RDBMS database require the most time, as it stores a large number of raw data
records. Table 5 presents the data retrieval time of a query on RDBMS tables of various sizes, containing
up to 4 million records, and on the corresponding records in the DW. We have measured the
execution time of the query on an indexed RDBMS database, a non-indexed data warehouse
database and a DW with bitmap indexing. Data retrieval from the data warehouse with bitmap
indexing requires less time than from the data warehouse without indexing at every data size.

TABLE 5 Retrieval time for CGPA 4.0 students of all sessions from indexed data tables of different sizes
stored in RDBMS, DW without and with bitmap indexing
                              No. of RDBMS              Data Retrieval Time (ms)
                              Records          Indexed RDBMS    DW       DW with BMI
                               100000            555              223      207
                               300000           8964              538      502
                               500000          13000              769      435
                              1000000          16259             8569     7520
                              2000000          42546            17927    17222
                              4000000          87028            53057    38213
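
The relative advantage of the DW at each data size can be read off Table 5 by dividing the RDBMS time by the DW time. The short sketch below does this arithmetic on the published figures; the function name speedups is only illustrative.

```python
# (records, indexed RDBMS ms, DW ms, DW with BMI ms), copied from Table 5.
TABLE_5 = [
    (100_000, 555, 223, 207),
    (300_000, 8964, 538, 502),
    (500_000, 13000, 769, 435),
    (1_000_000, 16259, 8569, 7520),
    (2_000_000, 42546, 17927, 17222),
    (4_000_000, 87028, 53057, 38213),
]


def speedups(table):
    """Return (records, RDBMS/DW ratio, RDBMS/DW-with-BMI ratio) per row."""
    return [
        (records, round(rdbms / dw, 2), round(rdbms / bmi, 2))
        for records, rdbms, dw, bmi in table
    ]


ratios = speedups(TABLE_5)
for records, vs_dw, vs_bmi in ratios:
    print(f"{records:>8} records: DW {vs_dw}x, DW with BMI {vs_bmi}x faster than RDBMS")
```

Every ratio is above 1, and the ratio for the DW with BMI is at least as large as for the plain DW at each size, but the ratios vary considerably between rows, so the measurements do not support a single constant speedup factor.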

Table 6 lists the DW data size that corresponds to each RDBMS data size in Table 5. Tables 7 and 8, both derived
from Table 5, then present the data retrieval times for the different data sizes of the RDBMS database and the
DW database separately.

                            TABLE 6 Corresponding Data Sizes of DW database for RDBMS database

                                                      Data Size of RDBMS            Data size in DW
                                                            100000                        1954
                                                            300000                        5879
                                                            500000                        9880
                                                            1000000                      19986
                                                            2000000                      40060
                                                            4000000                      79803
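
The degree of consolidation implied by Table 6 can be checked with a few lines of arithmetic. The sketch below computes the DW size as a percentage of the RDBMS size for each row; the helper name dw_share_percent is illustrative.

```python
# (RDBMS records, DW records), copied from Table 6.
TABLE_6 = [
    (100_000, 1954),
    (300_000, 5879),
    (500_000, 9880),
    (1_000_000, 19986),
    (2_000_000, 40060),
    (4_000_000, 79803),
]


def dw_share_percent(rdbms_rows, dw_rows):
    """DW record count as a percentage of the RDBMS record count."""
    return 100.0 * dw_rows / rdbms_rows


shares = [round(dw_share_percent(r, d), 2) for r, d in TABLE_6]
print(shares)
```

In every row the DW holds roughly 2% of the RDBMS record count, i.e. about one DW record per 50 source records, and this proportion stays nearly constant as the data grows.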

                          TABLE 7 Retrieval Time for different Data Sizes of RDBMS
                                      RDBMS         RDBMS Retrieval
                                     Data Size             Time
                                       100000                555
                                       300000               8964
                                       500000              13000
                                      1000000              16259
                                      2000000              42546
                                      4000000              87028




                                   Figure 5: Comparison of data retrieval time for different data Sizes shown in Table 7.

                                       TABLE 8 Retrieval Time for Data Warehouses of different Data Sizes.

                                       Data size of DW    DW Retrieval Time    DW Retrieval Time with BMI
                                        1954                223                  207
                                        5879                538                  502
                                        9880                769                  435
                                       19986               8569                 7520
                                       40060              17927                17222
                                       79803              53057                38213

[Line chart: DW Retrieval Time vs Data Size. x-axis: Data Size (1954 to 79803); y-axis: Retrieval Time;
series: DW Retrieval Time, DW Retrieval Time with BMI.]


Figure 6: Comparison of data retrieval time for different data Sizes of Data Warehouse with and without BMI shown in
Table 8.
Fig. 5 and Fig. 6 show that data retrieval time increases significantly with data size for both the RDBMS and
the DW.

                                                                                         VII.        Conclusion
           We have observed that data retrieval time from tables stored in the RDBMS database rises steeply
with data size. The data retrieval time for the data warehouse, with or without bitmap indexing, also
increases with data size but stays below that of the RDBMS at every size measured (Table 5). We have shown
that data retrieval time for a data warehouse is much lower than that of a relational database for similar
queries. This suggests that a data warehouse is more suitable for an enterprise for intelligent and faster data
access in decision making, and that an OLAP system built on a DW database is more suitable than one built on a
relational database system for intelligent and efficient decision making with reporting or data analysis.

                                                                                         Acknowledgements
 We gratefully acknowledge the valuable comments of the faculty members who were present at the thesis
examination board.

                                                                                             References
[1]  M. Barrena, C. Pachon and E. Jurado, JISBD2007-04: Neighbors search in holey multidimensional spaces, IEEE Latin America
     Transactions, Vol. 6, No. 4, Aug. 2008, pages 332-338.
[2]  M. J. Suarez-Cabal, C. de la Riva and J. Tuya, JISBD04-Populating Test Databases for Testing SQL Queries, IEEE Latin America
     Transactions, Vol. 8, No. 2, April 2010, pages 164-171.
[3]  Mafruz Zaman Ashrafi, David Taniar and Kate Smith, ODAM: An Optimized Distributed Association Rule Mining Algorithm, Monash
     University, IEEE Distributed Systems Online, IEEE Computer Society, Vol. 5, No. 3, March 2004.
[4]  Bernd Reiner and Karl Hahn, Optimized Management of Large-Scale Data Sets Stored on Tertiary Storage System, IEEE Distributed
     Systems Online, IEEE Computer Society, Vol. 5, No. 5, May 2004.
[5]  S. Repp, A. Gross and C. Meinel, Browsing within Lecture Videos based on the chain index of speech transcription, IEEE
     Transactions on Learning Technologies, Vol. 1, Issue 3, 2008, pages 145-156.
[6]  M. Al Mamun, Data Warehouse Performance Analysis for Online Analytical Processing, A Thesis Draft submitted for predefense of
     MS, Department of Computer Science and Engineering, Jahangirnagar University, Savar, Dhaka-1342.
[7]  V. Nebot and R. Berlanga, JISBD02-Populating Data Warehouses with Semantic Data, IEEE Latin America Transactions, Vol. 8,
     No. 2, April 2010, pages 150-157.
[8]  J.-N. Mazon and J. Trujillo, JISBD2007-02: Model-driven reverse engineering for data warehouse design, IEEE Latin America
     Transactions, Vol. 6, No. 4, Aug. 2008, pages 317-323.
[9]  E. Soler, J. Trujillo, E. Fernandez-Medina and M. Piattini, JISBD2007-07: An extension of the relational metamodel of CWM to
     represent secure data warehouse at the logical level, IEEE Latin America Transactions, Vol. 6, No. 4, Aug. 2008, pages 355-362.
[10] Morteza Zaker, Somnuk Phon-Amnuaisuk and Su-Cheng Haw, An Adequate Design for Large Datawarehouse systems: Bitmap
     indexes versus B-tree index, International Journal of Computers and Communications, Vol. 2, Issue 2, 2008.

Perplexity of Index Models over Evolving Linked Data
 
Performance Comparison of Various Filters and Wavelet Transform for Image De-...
Performance Comparison of Various Filters and Wavelet Transform for Image De-...Performance Comparison of Various Filters and Wavelet Transform for Image De-...
Performance Comparison of Various Filters and Wavelet Transform for Image De-...
 
Privacy and Security in Online Examination Systems
Privacy and Security in Online Examination SystemsPrivacy and Security in Online Examination Systems
Privacy and Security in Online Examination Systems
 
G01054350
G01054350G01054350
G01054350
 
Optimization of Mining Association Rule from XML Documents
Optimization of Mining Association Rule from XML DocumentsOptimization of Mining Association Rule from XML Documents
Optimization of Mining Association Rule from XML Documents
 
D0432026
D0432026D0432026
D0432026
 
J0555562
J0555562J0555562
J0555562
 
Présentation de la formation à l'édition en France
Présentation de la formation à l'édition en FrancePrésentation de la formation à l'édition en France
Présentation de la formation à l'édition en France
 
Yo y mi mascota
Yo y mi mascotaYo y mi mascota
Yo y mi mascota
 
Leadership and the Climate Change Challenge (Work in Progress)
Leadership and the Climate Change Challenge (Work in Progress)Leadership and the Climate Change Challenge (Work in Progress)
Leadership and the Climate Change Challenge (Work in Progress)
 
M0566672
M0566672M0566672
M0566672
 
