SlideShare una empresa de Scribd logo
1 de 7
Descargar para leer sin conexión
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
DOI : 10.5121/ijcsity.2013.1302 09
Mine Blood Donors Information through
Improved K-Means Clustering
Bondu Venkateswarlu1
and Prof G.S.V.Prasad Raju2
1
Department of Computer Science and Systems Engineering, Andhra University,
Visakhapatnam-530003,India.
iambondu@yahoo.com.
2 School of Distance Education, Andhra university, Visakhapatnam-530003,India.
gsvpraju2011@yahoo.com
ABSTRACT
The number of accidents and health diseases which are increasing at an alarming rate
are resulting in a huge increase in the demand for blood. There is a necessity for the
organized analysis of the blood donor database or blood banks repositories. Clustering analysis is one of
the data mining applications and K-means clustering algorithm is the fundamental algorithm for modern
clustering techniques. K-means clustering algorithm is traditional approach and iterative algorithm. At
every iteration, it attempts to find the distance from the centroid of each cluster to each and every data
point. This paper gives the improvement to the original k-means algorithm by improving the initial
centroids with distribution of data. Results and discussions show that improved K-means algorithm
produces accurate clusters in less computation time to find the donors information.
KEYWORDS
Clustering, means, Euclidean distance, centroid, datasets
I INTRODUCTION
Now a day, due to large number of accidents and health problems, it is dire need to find the blood
donor information instantly. Since the blood bank repositories are rapidly growing, it is becoming
increasingly difficult to extract the information required by using the conventional database
techniques. So, there is a need to come up with a solution using clustering for analysis and obtain
the information of the blood donors [1].
Clustering analysis [2] is one of the major data analysis methods, which help to identify natural
grouping of data objects from the dataset. Clustering is unsupervised classification and is a
process of partitioning a set of data objects into a set of meaning full sub classes. This can be
done by applying the various similarity and distance measures to the algorithm.
For all geometrical problems, the standard distance metric used is the Euclidean distance, which
is just the direct distance between any two points and can be easily measured with the help of just
a ruler, be it in two-dimensional space or three-dimensional space. It is a default distance measure
[3] for the k-means clustering algorithm.
K-means Clustering Algorithm is a Partitioning algorithm; it is effective in producing for many
practical applications [4, 5, 6, and 7] like Pattern recognition, Document classification, Image
processing, Economic science. However, while calculating the initial cluster centroids, the K-
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
10
means algorithm is vulnerable to errors. Because of this the initial cluster centroids are produced
arbitrarily, hence disabling the algorithm from providing the expected results. The accuracy of the
initial centroid determines the efficiency of the original k-means algorithm. This paper attempts to
improve the accuracy of the k-means algorithm by increasing the accuracy in the calculation of
the initial centroids.
II K-Means Clustering Algorithm
The K-means algorithm assigns each point to the cluster whose centroid (Centre) is in the nearest
proximity (centroid is the average of all the points in the cluster). The centroid’s coordinates are
determined by calculating the arithmetic mean of all the points in the cluster separately for each
dimension. The biggest asset of this algorithm is its speed, which allow for it to be run on large
databases. Also the simplicity of this algorithm makes it all the more easy to use. But this
algorithm has its own share of shortcomings associated with it. The inability of this algorithm to
yield the same results with each run is a major disadvantage and this occurs as the derived
clusters are not independent of the initial random assignments. Also, this algorithm considerably
reduces the intra-cluster variance, but does not take measures to ensure the global minimum of the
variance of the result.
There are two separate phases in the k-means algorithm- wherein the first phase involves the
identification of k centroids, given that we know the number of clusters (K) beforehand thereby
arriving at one centroid per cluster. Initially K points which are likely to be in different clusters
have to be selected, which are then made the centroids of their respective clusters. The Euclidean
distance is then calculated for each data point from each of the cluster centroid. Compare the
values; find the closest centroid for the data point. Then bind the centroid and the data point
which results in the completion of the first phase and then an early grouping is done.
At this stage, the new centroids have to be determined by calculating the mean value for each
cluster. After we get k new centroids, then a new binding is to be created between the same data
points previously used and the nearest new centroid, thus generating a loop. Because of this loop,
the k centroids are prone to a change in their positions in a step by step manner. This process is
repeated until convergence criteria is meat means the centroids of clusters are do not move
anymore. The pseudo Code for the k-means Clustering algorithm.
Algorithm : The k-means clustering algorithm
Input:
D = {d1, d2,......, dn} //set of n data items.
k // Number of desired clusters
Output:
A set of k clusters.
Steps:
1. Arbitrarily choose k data-items from D as initial centroids;
2. Repeat
2.1 Assign each data item di to the cluster which has the closest centroid;
2.2 Calculate the new mean of each cluster;
Until convergence criterion is met.
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
11
III LITERATURE SURVEY
In a web based information system for blood donation available on www. Pavel Berkhin[11]
performed extensive research in the field of data mining experiments and organized analysis of
the blood bank repositories which is helpful to the health professionals for a better management
of the blood bank facility.
Arun K.Pujari et al. [12] have worked to improve the performance of blood donation information
analysis. In this paper, improved k-means clustering is adopted that improves the performance for
determining the blood donors information based on the required blood group and location where
needed.
Several attempts are made by the researchers to improve efficiency of the k-means clustering
[8,9,10]. The variants of the k-means clustering algorithm, are K-modes and K-medoids. These
algorithms replace the means with the modes and medoids. The K-modes algorithm handles the
categorical data.
These also give better performance based on the way we choose the initial modes and methods.
For clustering the data, the k-means and the k-modes are integrated by the k-prototypes algorithm.
Both the numeric and categorical attributes are taken into account to define the dissimilarity
measure.
Fahim A.M. et. al.[8] and Yuan F et.al. [10] worked for improving the performance of the k-
means by finding the initial centroids. The centroids obtained this by these methods provide
consistent and accurate clusters for the given datasets.
Abdul Nazeer K A et.al [13] brought forward an efficient method for assigning the data-points to
the clusters. The original k-means algorithm it computes the distances between the data points
from all the centroids in every iteration, thus making this algorithm computationally very
expensive. Another approach given by Fahim makes use of two distance functions for this
purpose- one similar to the k-means algorithm and another one based on a heuristics to reduce the
number of distance calculations. However as was the case in the original k-means algorithm, this
algorithm too determines the initial centroids randomly, thus compromising the accuracy of the
resultant clusters.
Koteswararao[2] proposes improved k-means algorithm using a O(n logn) heuristic method for
finding the initial centroids. In this method, the initial centroids are generated in accordance with
the distribution of the data instead of generating them randomly.
IV PROPOSED METHODOLOGY
Find the blood donors information and satisfy the need by these steps:
Collect data from the blood bank.
Apply the improved K-means Algorithm to classify the number of blood donors through the
blood group and location where it needed.
Extract the Mail ids from the resultant cluster which satisfies the criteria of blood group and
location. Forward the message to the donors.
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
12
In this methodology, the improved clustering algorithm, deals with the multi dimensional data
values. Each data point di may contain multiple attributes such as di1, di2,…..dim, where m is the
number of attributes or columns in each data value. In such cases we first determine the column
with maximum range [7], where range is the difference between the maximum and the minimum
element in the column.
Then determine the initial centroids [10] as range of the Column is divided by the number of
clusters and sum with the minimum value of the column. i.e., assume we have the two
dimensional dataset, then the initial centroid are
Cx=MAXx-MINxK+1+clusterid*MINx Cy=MAXy-MINyK+1+clusterid*MINy
After determining the initial centroids, the data points are distributed to the each cluster as
specified by the Abdul Nazeer [13]. That is the data points are divided into k-equal partitions.
For each iteration, calculate the Euclidean distance between the data point and each centroid.
For all geometrical problems, the standard distance metric used is the Euclidean distance, which
is just the direct distance between any two points and can be easily measured with the help of just
a ruler, be it in two-dimensional space or three-dimensional space. It is a default distance measure
[3] for the k-means clustering algorithm. Clustering problems and clustering text widely use of
Euclidean distance, which is a true metric as it satisfies all the above stated four conditions. Also
the Euclidean metric is the default distance metric which is used in the k-means algorithm.
The distance measured by the Euclidean metric is also known by
‘as-the-crow-flies’ distance. This distance from a point X (X1, X2,
etc.) to a point Y (Y1, Y2, etc.) is:
The square root of the sum of the squares of the differences between the corresponding
values of the two data points is necessary for finding the Euclidean distance.
Find new centroid by calculate the mean value of distance of the all data points of that cluster.
Each data point di is assigned to the cluster having the closest centroid. The distance between the
data points and the centroid is measured by using the Euclidean distance. Improved K-means
Clustering Algorithm is outlined as below
Algorithm 2: K-means Clustering Algorithm with improved initial centroids.
Input:
D = {d1, d2,......,dn} // set of n data items.
K // Number of desired clusters.
Output:
A set of k clusters.
Steps:
Calculate the initial centroids as the formula give above and set the cluster with that
centroid.
Repeat
2.1 Initially assign the each data point to the cluster
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
13
2.2 Update the centroid value by calculate the mean of that cluster
Until all data points are assigned to any one of the clusters.
3. Repeat
3.1 Assign each data item di to the cluster which has the closest centroid;
3.2 Calculate new mean of each cluster;
Until convergence criterion is met.
This algorithm gives the better performance and accurate results within less time compared to the
literature [10,12].
V RESULTS AND DISCUSSIONS
For testing accuracy and efficiency of original K-means and improved K-means clustering
algorithm, various datasets are used. Data sets are varying in size. The same data sets are given to
the original K-means and improved K-means algorithm.
Both algorithms require the inputs as number of clusters, and the required blood group and
location where it needed. The original k-means algorithm takes the initial centroids as the
arbitrarily choosing random values; the improved k-means algorithm takes the initial centroid in
more meaning full way by calculated using formulas.
The efficiency of this algorithm can be analyzed by the accuracy of the clusters obtained from the
experiments performed on the blood bank dataset which satisfy the given criterion (here blood
group). The time taken and the percentage accuracy for each experiment are computed and
tabulated. Performances of the algorithms for the different data sets are tabulated in Table I. The
results produced from different algorithms are compared in figures 1 to 2.
Figure1 describes the accuracy of the algorithms, y-axis shows the accuracy. Figure 2 describes
the time taken by the algorithm, y-axis shows the time taken by the algorithm.
The performance of the improvised algorithm as measured against the original one is tabulated in
the Table 1. It shows the results for different data sets.
Table 1:PERFORMANCE COMPARISION OF THE ALGORITHMS FOR DIFFERENT
DATASETS BY ACCURACY
Datasets Original K-means Improved K-menas
>10000Records 79.2 83.4
>5000Records 86.5 90.2
>1000Records 96 96.5
Figure 1
Table 2 shows the time which the improvised K-Means takes to obtain the data from different
datasets as measured against the Original K-Means.
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
14
Table 2:PERFORMANCE COMPARISION OF THE ALGORITHMS FOR DIFFERENT
DATASETS BY TIME
Datasets Original K-means Improved K-menas
>10000Records 40 28
>5000Records 30 17
>1000Records 7 3
Figure 2
VI CONCLUSION
This K-means algorithm has many real time applications, but its performance cannot be
guaranteed as it takes the initial centroids randomly. Also the computational complexity of the
original k-means algorithm is alarmingly high considering the need to reassign the data points
many number of times, as they are reassigned every time the loop runs. So that we are taking the
initial centroids in a meaning full way and assign data points to the clusters in a distributed
manner. These results in better accuracy compared to the classic K-means Algorithm. Results and
Discussions show that the improved initial centroids with k-means clustering algorithm give the
accurate and efficient results. So that we fulfill the need of the many people by forward the mails
to the blood donors.
A limitation of the improved k-means algorithm is , it still takes the input as number of clusters
and take the location in a static way. For future work for this paper is place the GPRS positioning
system and take the number of clusters by the size of the dataset.
REFERENCES
[1] Jaipur National University, Jaipur, “Predicting the Number of Blood Donors through their Age and
Blood Group by using Data Mining Tool”
[2] N. Koteswara Rao, G. Sridhar Reddy “ Discovery of Preliminary Centroids Using Improved K-
Means Clustering Algorithm”
[3] K. RAJENDRA PRASAD and Dr. P.GOVINDA RAJULU, “A survey on clustering technique for
datasets using efficient graph structutes” International Journal of Engineering Science and
Technology Vol. 2 (7), 2010, 2707-2714.
[4] Jiawei Han M. KAMBER, Data Mining Concepts and Techniques, Morgan Kaufmann Publishers, An
Imprint of Elsevier,
[5] Margaret H. Dunham, Data Mining- Introductory and Advanced Concepts, Pearson Education, 2006.
[6] McQueen J, “Some methods for classification and analysis of multivariate observations,” Proc. 5th
Berkeley Symp. Math. Statist. Prob., (1):281–297, 1967.
[7] Pang-Ning Tan, Michael Steinback and Vipin Kumar, Introduction to Data Mining, Pearson
Education, 2007. Transactions on Information Theory, 28(2): 129-136.
[8] Fahim A.M, Salem A. M, Torkey A and Ramadan M. A, “An Efficient enhanced k-means clustering
algorithm,“ Journal of Zhejiang University, 10(7):1626–1633, 2006.
International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013
15
[9] Huang Z, “Extensions to the k-means algorithm for clustering large data sets with categorical values,”
Data Mining and Knowledge Discovery, (2):283–304, 1998.
[10] Yuan F, Meng Z. H, Zhang H. X and Dong C. R, “A New Algorithm to Get the Initial Centroids,”
Proc. of the 3rd International Conference on Machine Learning and Cybernetics, pages 26–29, August
2004.
[11] Wen-ChanLee and Bor-Wen Cheng; ‘An Intelligent system for improving performance of blood
donation’, Journal of Quality, Vol. 18, Issue No. II, 2011.
[12] Arun.K.Pujari(2001): Data Mining Techniques, Universities Press
[13] Abdul Nazeer K A, Sebastian M P, “Improving the Accuracy and Efficiency of the k-means
Clustering Algorithm,“ Proceedings of the International Conference on Data Mining and Knowledge
Engineering, London, UK, 2009.
[14] Nidhi Singh and Divakar Singh, “Performance Evaluation of K-Means and Heirarichal Clustering in
Terms of Accuracy and Running Time” (IJCSIT) International Journal of Computer Science and
Information Technologies, Vol. 3 (3) , 2012,4119-4121.
Authors
1Venkateswarlu Bondu received the Master’s Degree in Computer Science and Systems Engineering from
Andhra University College of Engineering, pursuing Ph.D in Department of Computer Science and
Systems Engineering Andhra University,Visakhapatnam. His research area is Data Mining and Data
Warehousing.
2Prof G.S.V.Prasad Raju Professor in Computer Science ,in School of Distance Education, Andhra
University,Visakhapatnam. His area of specialization is E-Commerce, Internet Technologies.

Más contenido relacionado

La actualidad más candente

The improved k means with particle swarm optimization
The improved k means with particle swarm optimizationThe improved k means with particle swarm optimization
The improved k means with particle swarm optimizationAlexander Decker
 
Analysis and implementation of modified k medoids
Analysis and implementation of modified k medoidsAnalysis and implementation of modified k medoids
Analysis and implementation of modified k medoidseSAT Publishing House
 
Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2IAEME Publication
 
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSSCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSijdkp
 
K-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log DataK-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log Dataidescitation
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1bPRAWEEN KUMAR
 
Improved probabilistic distance based locality preserving projections method ...
Improved probabilistic distance based locality preserving projections method ...Improved probabilistic distance based locality preserving projections method ...
Improved probabilistic distance based locality preserving projections method ...IJECEIAES
 
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSEditor IJCATR
 
Pillar k means
Pillar k meansPillar k means
Pillar k meansswathi b
 
A comparative study of three validities computation methods for multimodel ap...
A comparative study of three validities computation methods for multimodel ap...A comparative study of three validities computation methods for multimodel ap...
A comparative study of three validities computation methods for multimodel ap...IJECEIAES
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)IJERD Editor
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyijpla
 
Survey on Unsupervised Learning in Datamining
Survey on Unsupervised Learning in DataminingSurvey on Unsupervised Learning in Datamining
Survey on Unsupervised Learning in DataminingIOSR Journals
 

La actualidad más candente (20)

The improved k means with particle swarm optimization
The improved k means with particle swarm optimizationThe improved k means with particle swarm optimization
The improved k means with particle swarm optimization
 
Analysis and implementation of modified k medoids
Analysis and implementation of modified k medoidsAnalysis and implementation of modified k medoids
Analysis and implementation of modified k medoids
 
Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2Dynamic approach to k means clustering algorithm-2
Dynamic approach to k means clustering algorithm-2
 
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMSSCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS
SCAF – AN EFFECTIVE APPROACH TO CLASSIFY SUBSPACE CLUSTERING ALGORITHMS
 
K-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log DataK-means Clustering Method for the Analysis of Log Data
K-means Clustering Method for the Analysis of Log Data
 
84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b84cc04ff77007e457df6aa2b814d2346bf1b
84cc04ff77007e457df6aa2b814d2346bf1b
 
A046010107
A046010107A046010107
A046010107
 
Improved probabilistic distance based locality preserving projections method ...
Improved probabilistic distance based locality preserving projections method ...Improved probabilistic distance based locality preserving projections method ...
Improved probabilistic distance based locality preserving projections method ...
 
Noura2
Noura2Noura2
Noura2
 
K means report
K means reportK means report
K means report
 
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETSA HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
A HYBRID MODEL FOR MINING MULTI DIMENSIONAL DATA SETS
 
Pillar k means
Pillar k meansPillar k means
Pillar k means
 
A comparative study of three validities computation methods for multimodel ap...
A comparative study of three validities computation methods for multimodel ap...A comparative study of three validities computation methods for multimodel ap...
A comparative study of three validities computation methods for multimodel ap...
 
Saif_CCECE2007_full_paper_submitted
Saif_CCECE2007_full_paper_submittedSaif_CCECE2007_full_paper_submitted
Saif_CCECE2007_full_paper_submitted
 
Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)Welcome to International Journal of Engineering Research and Development (IJERD)
Welcome to International Journal of Engineering Research and Development (IJERD)
 
Lx3520322036
Lx3520322036Lx3520322036
Lx3520322036
 
Master's Thesis Presentation
Master's Thesis PresentationMaster's Thesis Presentation
Master's Thesis Presentation
 
An improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracyAn improvement in k mean clustering algorithm using better time and accuracy
An improvement in k mean clustering algorithm using better time and accuracy
 
Az36311316
Az36311316Az36311316
Az36311316
 
Survey on Unsupervised Learning in Datamining
Survey on Unsupervised Learning in DataminingSurvey on Unsupervised Learning in Datamining
Survey on Unsupervised Learning in Datamining
 

Destacado

Feature extraction based retrieval of
Feature extraction based retrieval ofFeature extraction based retrieval of
Feature extraction based retrieval ofijcsity
 
Social Media in Government
Social Media in GovernmentSocial Media in Government
Social Media in GovernmentJon Parks
 
37번 여의주 예방논문1 발표
37번 여의주 예방논문1 발표37번 여의주 예방논문1 발표
37번 여의주 예방논문1 발표Benedict Choi
 
Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...ijcsity
 
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...ijcsity
 
Introducing ViewToo v.2.0
Introducing ViewToo v.2.0Introducing ViewToo v.2.0
Introducing ViewToo v.2.0Jiransoft
 
An intro to google ad words keyword planner tool
An intro to google ad words keyword planner toolAn intro to google ad words keyword planner tool
An intro to google ad words keyword planner toolJon Parks
 
Cancer recurrence prediction using
Cancer recurrence prediction usingCancer recurrence prediction using
Cancer recurrence prediction usingijcsity
 
How to write an effective e-mail copy
How to write an effective e-mail copyHow to write an effective e-mail copy
How to write an effective e-mail copyHisham Nabawi
 
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...Novel text categorization by amalgamation of augmented k nearest neighbourhoo...
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...ijcsity
 
Office box user guide_v3.0
Office box user guide_v3.0Office box user guide_v3.0
Office box user guide_v3.0Jiransoft
 
Google Plus and Your Mobile Recruiting Strategy
Google Plus and Your Mobile Recruiting StrategyGoogle Plus and Your Mobile Recruiting Strategy
Google Plus and Your Mobile Recruiting StrategyJon Parks
 
34번 서누리 예방논문1 발표
34번 서누리 예방논문1 발표34번 서누리 예방논문1 발표
34번 서누리 예방논문1 발표Benedict Choi
 
Office box user_guide_v3.0
Office box user_guide_v3.0Office box user_guide_v3.0
Office box user_guide_v3.0Jiransoft
 
Performance evaluation of clustering algorithms
Performance evaluation of clustering algorithmsPerformance evaluation of clustering algorithms
Performance evaluation of clustering algorithmsijcsity
 
International Journal of Computational Science and Information Technology (I...
 International Journal of Computational Science and Information Technology (I... International Journal of Computational Science and Information Technology (I...
International Journal of Computational Science and Information Technology (I...ijcsity
 
Introducing OfficeBox (Oct.2013)
Introducing OfficeBox (Oct.2013)Introducing OfficeBox (Oct.2013)
Introducing OfficeBox (Oct.2013)Jiransoft
 

Destacado (20)

Feature extraction based retrieval of
Feature extraction based retrieval ofFeature extraction based retrieval of
Feature extraction based retrieval of
 
Social Media in Government
Social Media in GovernmentSocial Media in Government
Social Media in Government
 
Ejercicio
EjercicioEjercicio
Ejercicio
 
37번 여의주 예방논문1 발표
37번 여의주 예방논문1 발표37번 여의주 예방논문1 발표
37번 여의주 예방논문1 발표
 
Affascinante
AffascinanteAffascinante
Affascinante
 
Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...Great model a model for the automatic generation of semantic relations betwee...
Great model a model for the automatic generation of semantic relations betwee...
 
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...
AN ENERGY EFFICIENT COUNTERMEASURE AGAINST MULTIPLE ATTACKS OF THE FALSE DATA...
 
Introducing ViewToo v.2.0
Introducing ViewToo v.2.0Introducing ViewToo v.2.0
Introducing ViewToo v.2.0
 
An intro to google ad words keyword planner tool
An intro to google ad words keyword planner toolAn intro to google ad words keyword planner tool
An intro to google ad words keyword planner tool
 
Cancer recurrence prediction using
Cancer recurrence prediction usingCancer recurrence prediction using
Cancer recurrence prediction using
 
How to write an effective e-mail copy
How to write an effective e-mail copyHow to write an effective e-mail copy
How to write an effective e-mail copy
 
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...Novel text categorization by amalgamation of augmented k nearest neighbourhoo...
Novel text categorization by amalgamation of augmented k nearest neighbourhoo...
 
Office box user guide_v3.0
Office box user guide_v3.0Office box user guide_v3.0
Office box user guide_v3.0
 
Google Plus and Your Mobile Recruiting Strategy
Google Plus and Your Mobile Recruiting StrategyGoogle Plus and Your Mobile Recruiting Strategy
Google Plus and Your Mobile Recruiting Strategy
 
34번 서누리 예방논문1 발표
34번 서누리 예방논문1 발표34번 서누리 예방논문1 발표
34번 서누리 예방논문1 발표
 
Office box user_guide_v3.0
Office box user_guide_v3.0Office box user_guide_v3.0
Office box user_guide_v3.0
 
Performance evaluation of clustering algorithms
Performance evaluation of clustering algorithmsPerformance evaluation of clustering algorithms
Performance evaluation of clustering algorithms
 
International Journal of Computational Science and Information Technology (I...
 International Journal of Computational Science and Information Technology (I... International Journal of Computational Science and Information Technology (I...
International Journal of Computational Science and Information Technology (I...
 
Affascinante
AffascinanteAffascinante
Affascinante
 
Introducing OfficeBox (Oct.2013)
Introducing OfficeBox (Oct.2013)Introducing OfficeBox (Oct.2013)
Introducing OfficeBox (Oct.2013)
 

Similar a Mine Blood Donors Information through Improved K-Means Clustering

Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringIJCSIS Research Publications
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)theijes
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningNatasha Grant
 
Comparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data AnalysisComparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data AnalysisIOSR Journals
 
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...IRJET Journal
 
New Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmNew Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmEditor IJCATR
 
Particle Swarm Optimization based K-Prototype Clustering Algorithm
Particle Swarm Optimization based K-Prototype Clustering Algorithm Particle Swarm Optimization based K-Prototype Clustering Algorithm
Particle Swarm Optimization based K-Prototype Clustering Algorithm iosrjce
 
Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringIJERD Editor
 
Max stable set problem to found the initial centroids in clustering problem
Max stable set problem to found the initial centroids in clustering problemMax stable set problem to found the initial centroids in clustering problem
Max stable set problem to found the initial centroids in clustering problemnooriasukmaningtyas
 
8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithmLaura Petrosanu
 
Unsupervised Machine Learning Algorithm K-means-Clustering.pptx
Unsupervised Machine Learning Algorithm K-means-Clustering.pptxUnsupervised Machine Learning Algorithm K-means-Clustering.pptx
Unsupervised Machine Learning Algorithm K-means-Clustering.pptxAnupama Kate
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Editor IJARCET
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Editor IJARCET
 
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...IRJET Journal
 
Unsupervised learning Algorithms and Assumptions
Unsupervised learning Algorithms and AssumptionsUnsupervised learning Algorithms and Assumptions
Unsupervised learning Algorithms and Assumptionsrefedey275
 

Similar a Mine Blood Donors Information through Improved K-Means Clustering (20)

Premeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means ClusteringPremeditated Initial Points for K-Means Clustering
Premeditated Initial Points for K-Means Clustering
 
The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)The International Journal of Engineering and Science (The IJES)
The International Journal of Engineering and Science (The IJES)
 
A Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data MiningA Comparative Study Of Various Clustering Algorithms In Data Mining
A Comparative Study Of Various Clustering Algorithms In Data Mining
 
Comparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data AnalysisComparison Between Clustering Algorithms for Microarray Data Analysis
Comparison Between Clustering Algorithms for Microarray Data Analysis
 
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCAREK-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
K-MEANS AND D-STREAM ALGORITHM IN HEALTHCARE
 
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
Machine Learning Algorithms for Image Classification of Hand Digits and Face ...
 
New Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids AlgorithmNew Approach for K-mean and K-medoids Algorithm
New Approach for K-mean and K-medoids Algorithm
 
Particle Swarm Optimization based K-Prototype Clustering Algorithm
Particle Swarm Optimization based K-Prototype Clustering Algorithm Particle Swarm Optimization based K-Prototype Clustering Algorithm
Particle Swarm Optimization based K-Prototype Clustering Algorithm
 
I017235662
I017235662I017235662
I017235662
 
Ensemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes ClusteringEnsemble based Distributed K-Modes Clustering
Ensemble based Distributed K-Modes Clustering
 
Neural nw k means
Neural nw k meansNeural nw k means
Neural nw k means
 
Data clustering
Data clustering Data clustering
Data clustering
 
Max stable set problem to found the initial centroids in clustering problem
Max stable set problem to found the initial centroids in clustering problemMax stable set problem to found the initial centroids in clustering problem
Max stable set problem to found the initial centroids in clustering problem
 
8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm8.clustering algorithm.k means.em algorithm
8.clustering algorithm.k means.em algorithm
 
Unsupervised Machine Learning Algorithm K-means-Clustering.pptx
Unsupervised Machine Learning Algorithm K-means-Clustering.pptxUnsupervised Machine Learning Algorithm K-means-Clustering.pptx
Unsupervised Machine Learning Algorithm K-means-Clustering.pptx
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147
 
Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147Volume 2-issue-6-2143-2147
Volume 2-issue-6-2143-2147
 
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...
K Mean and Fuzzy Clustering Algorithm Predicated Brain Tumor Segmentation And...
 
Unsupervised learning Algorithms and Assumptions
Unsupervised learning Algorithms and AssumptionsUnsupervised learning Algorithms and Assumptions
Unsupervised learning Algorithms and Assumptions
 
40120130406009
4012013040600940120130406009
40120130406009
 

Más de ijcsity

International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
NETWORK MEDIA ATTENTION AND GREEN TECHNOLOGY INNOVATION
NETWORK MEDIA ATTENTION AND  GREEN TECHNOLOGY INNOVATION NETWORK MEDIA ATTENTION AND  GREEN TECHNOLOGY INNOVATION
NETWORK MEDIA ATTENTION AND GREEN TECHNOLOGY INNOVATION ijcsity
 
4th International Conference on Artificial Intelligence and Machine Learning ...
4th International Conference on Artificial Intelligence and Machine Learning ...4th International Conference on Artificial Intelligence and Machine Learning ...
4th International Conference on Artificial Intelligence and Machine Learning ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
5th International Conference on Machine Learning & Applications (CMLA 2023)
5th International Conference on Machine Learning & Applications (CMLA 2023)5th International Conference on Machine Learning & Applications (CMLA 2023)
5th International Conference on Machine Learning & Applications (CMLA 2023)ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
8th International Conference on Networks, Communications, Wireless and Mobile...
8th International Conference on Networks, Communications, Wireless and Mobile...8th International Conference on Networks, Communications, Wireless and Mobile...
8th International Conference on Networks, Communications, Wireless and Mobile...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
10th International Conference on Computer Science and Information Technology ...
10th International Conference on Computer Science and Information Technology ...10th International Conference on Computer Science and Information Technology ...
10th International Conference on Computer Science and Information Technology ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Conference on Speech and NLP (SPNLP 2023)
International Conference on Speech and NLP (SPNLP 2023) International Conference on Speech and NLP (SPNLP 2023)
International Conference on Speech and NLP (SPNLP 2023) ijcsity
 
SPNLP CFP - 27 mar.pdf
SPNLP CFP - 27 mar.pdfSPNLP CFP - 27 mar.pdf
SPNLP CFP - 27 mar.pdfijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...ijcsity
 

Más de ijcsity (20)

International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
NETWORK MEDIA ATTENTION AND GREEN TECHNOLOGY INNOVATION
NETWORK MEDIA ATTENTION AND  GREEN TECHNOLOGY INNOVATION NETWORK MEDIA ATTENTION AND  GREEN TECHNOLOGY INNOVATION
NETWORK MEDIA ATTENTION AND GREEN TECHNOLOGY INNOVATION
 
4th International Conference on Artificial Intelligence and Machine Learning ...
4th International Conference on Artificial Intelligence and Machine Learning ...4th International Conference on Artificial Intelligence and Machine Learning ...
4th International Conference on Artificial Intelligence and Machine Learning ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
5th International Conference on Machine Learning & Applications (CMLA 2023)
5th International Conference on Machine Learning & Applications (CMLA 2023)5th International Conference on Machine Learning & Applications (CMLA 2023)
5th International Conference on Machine Learning & Applications (CMLA 2023)
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
8th International Conference on Networks, Communications, Wireless and Mobile...
8th International Conference on Networks, Communications, Wireless and Mobile...8th International Conference on Networks, Communications, Wireless and Mobile...
8th International Conference on Networks, Communications, Wireless and Mobile...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
10th International Conference on Computer Science and Information Technology ...
10th International Conference on Computer Science and Information Technology ...10th International Conference on Computer Science and Information Technology ...
10th International Conference on Computer Science and Information Technology ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Conference on Speech and NLP (SPNLP 2023)
International Conference on Speech and NLP (SPNLP 2023) International Conference on Speech and NLP (SPNLP 2023)
International Conference on Speech and NLP (SPNLP 2023)
 
SPNLP CFP - 27 mar.pdf
SPNLP CFP - 27 mar.pdfSPNLP CFP - 27 mar.pdf
SPNLP CFP - 27 mar.pdf
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 
International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...International Journal of Computational Science and Information Technology (IJ...
International Journal of Computational Science and Information Technology (IJ...
 

Último

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerThousandEyes
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationMichael W. Hawkins
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processorsdebabhi2
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)Gabriella Davis
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountPuma Security, LLC
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking MenDelhi Call girls
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Paola De la Torre
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsRoshan Dwivedi
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024Rafal Los
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j
 

Último (20)

How to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected WorkerHow to Troubleshoot Apps for the Modern Connected Worker
How to Troubleshoot Apps for the Modern Connected Worker
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
Exploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone ProcessorsExploring the Future Potential of AI-Enabled Smartphone Processors
Exploring the Future Potential of AI-Enabled Smartphone Processors
 
A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)A Domino Admins Adventures (Engage 2024)
A Domino Admins Adventures (Engage 2024)
 
Breaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path MountBreaking the Kubernetes Kill Chain: Host Path Mount
Breaking the Kubernetes Kill Chain: Host Path Mount
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101Salesforce Community Group Quito, Salesforce 101
Salesforce Community Group Quito, Salesforce 101
 
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live StreamsTop 5 Benefits OF Using Muvi Live Paywall For Live Streams
Top 5 Benefits OF Using Muvi Live Paywall For Live Streams
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
Neo4j - How KGs are shaping the future of Generative AI at AWS Summit London ...
 

Mine Blood Donors Information through Improved K-Means Clustering

  • 1. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 DOI : 10.5121/ijcsity.2013.1302 09 Mine Blood Donors Information through Improved K-Means Clustering Bondu Venkateswarlu1 and Prof G.S.V.Prasad Raju2 1 Department of Computer Science and Systems Engineering, Andhra University, Visakhapatnam-530003,India. iambondu@yahoo.com. 2 School of Distance Education, Andhra university, Visakhapatnam-530003,India. gsvpraju2011@yahoo.com ABSTRACT The number of accidents and health diseases which are increasing at an alarming rate are resulting in a huge increase in the demand for blood. There is a necessity for the organized analysis of the blood donor database or blood banks repositories. Clustering analysis is one of the data mining applications and K-means clustering algorithm is the fundamental algorithm for modern clustering techniques. K-means clustering algorithm is traditional approach and iterative algorithm. At every iteration, it attempts to find the distance from the centroid of each cluster to each and every data point. This paper gives the improvement to the original k-means algorithm by improving the initial centroids with distribution of data. Results and discussions show that improved K-means algorithm produces accurate clusters in less computation time to find the donors information. KEYWORDS Clustering, means, Euclidean distance, centroid, datasets I INTRODUCTION Now a day, due to large number of accidents and health problems, it is dire need to find the blood donor information instantly. Since the blood bank repositories are rapidly growing, it is becoming increasingly difficult to extract the information required by using the conventional database techniques. So, there is a need to come up with a solution using clustering for analysis and obtain the information of the blood donors [1]. Clustering analysis [2] is one of the major data analysis methods, which help to identify natural grouping of data objects from the dataset. Clustering is unsupervised classification and is a process of partitioning a set of data objects into a set of meaning full sub classes. This can be done by applying the various similarity and distance measures to the algorithm. For all geometrical problems, the standard distance metric used is the Euclidean distance, which is just the direct distance between any two points and can be easily measured with the help of just a ruler, be it in two-dimensional space or three-dimensional space. It is a default distance measure [3] for the k-means clustering algorithm. K-means Clustering Algorithm is a Partitioning algorithm; it is effective in producing for many practical applications [4, 5, 6, and 7] like Pattern recognition, Document classification, Image processing, Economic science. However, while calculating the initial cluster centroids, the K-
  • 2. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 10 means algorithm is vulnerable to errors. Because of this the initial cluster centroids are produced arbitrarily, hence disabling the algorithm from providing the expected results. The accuracy of the initial centroid determines the efficiency of the original k-means algorithm. This paper attempts to improve the accuracy of the k-means algorithm by increasing the accuracy in the calculation of the initial centroids. II K-Means Clustering Algorithm The K-means algorithm assigns each point to the cluster whose centroid (Centre) is in the nearest proximity (centroid is the average of all the points in the cluster). The centroid’s coordinates are determined by calculating the arithmetic mean of all the points in the cluster separately for each dimension. The biggest asset of this algorithm is its speed, which allow for it to be run on large databases. Also the simplicity of this algorithm makes it all the more easy to use. But this algorithm has its own share of shortcomings associated with it. The inability of this algorithm to yield the same results with each run is a major disadvantage and this occurs as the derived clusters are not independent of the initial random assignments. Also, this algorithm considerably reduces the intra-cluster variance, but does not take measures to ensure the global minimum of the variance of the result. There are two separate phases in the k-means algorithm- wherein the first phase involves the identification of k centroids, given that we know the number of clusters (K) beforehand thereby arriving at one centroid per cluster. Initially K points which are likely to be in different clusters have to be selected, which are then made the centroids of their respective clusters. The Euclidean distance is then calculated for each data point from each of the cluster centroid. Compare the values; find the closest centroid for the data point. Then bind the centroid and the data point which results in the completion of the first phase and then an early grouping is done. At this stage, the new centroids have to be determined by calculating the mean value for each cluster. After we get k new centroids, then a new binding is to be created between the same data points previously used and the nearest new centroid, thus generating a loop. Because of this loop, the k centroids are prone to a change in their positions in a step by step manner. This process is repeated until convergence criteria is meat means the centroids of clusters are do not move anymore. The pseudo Code for the k-means Clustering algorithm. Algorithm : The k-means clustering algorithm Input: D = {d1, d2,......, dn} //set of n data items. k // Number of desired clusters Output: A set of k clusters. Steps: 1. Arbitrarily choose k data-items from D as initial centroids; 2. Repeat 2.1 Assign each data item di to the cluster which has the closest centroid; 2.2 Calculate the new mean of each cluster; Until convergence criterion is met.
  • 3. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 11 III LITERATURE SURVEY In a web based information system for blood donation available on www. Pavel Berkhin[11] performed extensive research in the field of data mining experiments and organized analysis of the blood bank repositories which is helpful to the health professionals for a better management of the blood bank facility. Arun K.Pujari et al. [12] have worked to improve the performance of blood donation information analysis. In this paper, improved k-means clustering is adopted that improves the performance for determining the blood donors information based on the required blood group and location where needed. Several attempts are made by the researchers to improve efficiency of the k-means clustering [8,9,10]. The variants of the k-means clustering algorithm, are K-modes and K-medoids. These algorithms replace the means with the modes and medoids. The K-modes algorithm handles the categorical data. These also give better performance based on the way we choose the initial modes and methods. For clustering the data, the k-means and the k-modes are integrated by the k-prototypes algorithm. Both the numeric and categorical attributes are taken into account to define the dissimilarity measure. Fahim A.M. et. al.[8] and Yuan F et.al. [10] worked for improving the performance of the k- means by finding the initial centroids. The centroids obtained this by these methods provide consistent and accurate clusters for the given datasets. Abdul Nazeer K A et.al [13] brought forward an efficient method for assigning the data-points to the clusters. The original k-means algorithm it computes the distances between the data points from all the centroids in every iteration, thus making this algorithm computationally very expensive. Another approach given by Fahim makes use of two distance functions for this purpose- one similar to the k-means algorithm and another one based on a heuristics to reduce the number of distance calculations. However as was the case in the original k-means algorithm, this algorithm too determines the initial centroids randomly, thus compromising the accuracy of the resultant clusters. Koteswararao[2] proposes improved k-means algorithm using a O(n logn) heuristic method for finding the initial centroids. In this method, the initial centroids are generated in accordance with the distribution of the data instead of generating them randomly. IV PROPOSED METHODOLOGY Find the blood donors information and satisfy the need by these steps: Collect data from the blood bank. Apply the improved K-means Algorithm to classify the number of blood donors through the blood group and location where it needed. Extract the Mail ids from the resultant cluster which satisfies the criteria of blood group and location. Forward the message to the donors.
  • 4. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 12 In this methodology, the improved clustering algorithm, deals with the multi dimensional data values. Each data point di may contain multiple attributes such as di1, di2,…..dim, where m is the number of attributes or columns in each data value. In such cases we first determine the column with maximum range [7], where range is the difference between the maximum and the minimum element in the column. Then determine the initial centroids [10] as range of the Column is divided by the number of clusters and sum with the minimum value of the column. i.e., assume we have the two dimensional dataset, then the initial centroid are Cx=MAXx-MINxK+1+clusterid*MINx Cy=MAXy-MINyK+1+clusterid*MINy After determining the initial centroids, the data points are distributed to the each cluster as specified by the Abdul Nazeer [13]. That is the data points are divided into k-equal partitions. For each iteration, calculate the Euclidean distance between the data point and each centroid. For all geometrical problems, the standard distance metric used is the Euclidean distance, which is just the direct distance between any two points and can be easily measured with the help of just a ruler, be it in two-dimensional space or three-dimensional space. It is a default distance measure [3] for the k-means clustering algorithm. Clustering problems and clustering text widely use of Euclidean distance, which is a true metric as it satisfies all the above stated four conditions. Also the Euclidean metric is the default distance metric which is used in the k-means algorithm. The distance measured by the Euclidean metric is also known by ‘as-the-crow-flies’ distance. This distance from a point X (X1, X2, etc.) to a point Y (Y1, Y2, etc.) is: The square root of the sum of the squares of the differences between the corresponding values of the two data points is necessary for finding the Euclidean distance. Find new centroid by calculate the mean value of distance of the all data points of that cluster. Each data point di is assigned to the cluster having the closest centroid. The distance between the data points and the centroid is measured by using the Euclidean distance. Improved K-means Clustering Algorithm is outlined as below Algorithm 2: K-means Clustering Algorithm with improved initial centroids. Input: D = {d1, d2,......,dn} // set of n data items. K // Number of desired clusters. Output: A set of k clusters. Steps: Calculate the initial centroids as the formula give above and set the cluster with that centroid. Repeat 2.1 Initially assign the each data point to the cluster
  • 5. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 13 2.2 Update the centroid value by calculate the mean of that cluster Until all data points are assigned to any one of the clusters. 3. Repeat 3.1 Assign each data item di to the cluster which has the closest centroid; 3.2 Calculate new mean of each cluster; Until convergence criterion is met. This algorithm gives the better performance and accurate results within less time compared to the literature [10,12]. V RESULTS AND DISCUSSIONS For testing accuracy and efficiency of original K-means and improved K-means clustering algorithm, various datasets are used. Data sets are varying in size. The same data sets are given to the original K-means and improved K-means algorithm. Both algorithms require the inputs as number of clusters, and the required blood group and location where it needed. The original k-means algorithm takes the initial centroids as the arbitrarily choosing random values; the improved k-means algorithm takes the initial centroid in more meaning full way by calculated using formulas. The efficiency of this algorithm can be analyzed by the accuracy of the clusters obtained from the experiments performed on the blood bank dataset which satisfy the given criterion (here blood group). The time taken and the percentage accuracy for each experiment are computed and tabulated. Performances of the algorithms for the different data sets are tabulated in Table I. The results produced from different algorithms are compared in figures 1 to 2. Figure1 describes the accuracy of the algorithms, y-axis shows the accuracy. Figure 2 describes the time taken by the algorithm, y-axis shows the time taken by the algorithm. The performance of the improvised algorithm as measured against the original one is tabulated in the Table 1. It shows the results for different data sets. Table 1:PERFORMANCE COMPARISION OF THE ALGORITHMS FOR DIFFERENT DATASETS BY ACCURACY Datasets Original K-means Improved K-menas >10000Records 79.2 83.4 >5000Records 86.5 90.2 >1000Records 96 96.5 Figure 1 Table 2 shows the time which the improvised K-Means takes to obtain the data from different datasets as measured against the Original K-Means.
  • 6. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 14 Table 2:PERFORMANCE COMPARISION OF THE ALGORITHMS FOR DIFFERENT DATASETS BY TIME Datasets Original K-means Improved K-menas >10000Records 40 28 >5000Records 30 17 >1000Records 7 3 Figure 2 VI CONCLUSION This K-means algorithm has many real time applications, but its performance cannot be guaranteed as it takes the initial centroids randomly. Also the computational complexity of the original k-means algorithm is alarmingly high considering the need to reassign the data points many number of times, as they are reassigned every time the loop runs. So that we are taking the initial centroids in a meaning full way and assign data points to the clusters in a distributed manner. These results in better accuracy compared to the classic K-means Algorithm. Results and Discussions show that the improved initial centroids with k-means clustering algorithm give the accurate and efficient results. So that we fulfill the need of the many people by forward the mails to the blood donors. A limitation of the improved k-means algorithm is , it still takes the input as number of clusters and take the location in a static way. For future work for this paper is place the GPRS positioning system and take the number of clusters by the size of the dataset. REFERENCES [1] Jaipur National University, Jaipur, “Predicting the Number of Blood Donors through their Age and Blood Group by using Data Mining Tool” [2] N. Koteswara Rao, G. Sridhar Reddy “ Discovery of Preliminary Centroids Using Improved K- Means Clustering Algorithm” [3] K. RAJENDRA PRASAD and Dr. P.GOVINDA RAJULU, “A survey on clustering technique for datasets using efficient graph structutes” International Journal of Engineering Science and Technology Vol. 2 (7), 2010, 2707-2714. [4] Jiawei Han M. KAMBER, Data Mining Concepts and Techniques, Morgan Kaufmann Publishers, An Imprint of Elsevier, [5] Margaret H. Dunham, Data Mining- Introductory and Advanced Concepts, Pearson Education, 2006. [6] McQueen J, “Some methods for classification and analysis of multivariate observations,” Proc. 5th Berkeley Symp. Math. Statist. Prob., (1):281–297, 1967. [7] Pang-Ning Tan, Michael Steinback and Vipin Kumar, Introduction to Data Mining, Pearson Education, 2007. Transactions on Information Theory, 28(2): 129-136. [8] Fahim A.M, Salem A. M, Torkey A and Ramadan M. A, “An Efficient enhanced k-means clustering algorithm,“ Journal of Zhejiang University, 10(7):1626–1633, 2006.
  • 7. International Journal of Computational Science and Information Technology (IJCSITY) Vol.1, No.3, August 2013 15 [9] Huang Z, “Extensions to the k-means algorithm for clustering large data sets with categorical values,” Data Mining and Knowledge Discovery, (2):283–304, 1998. [10] Yuan F, Meng Z. H, Zhang H. X and Dong C. R, “A New Algorithm to Get the Initial Centroids,” Proc. of the 3rd International Conference on Machine Learning and Cybernetics, pages 26–29, August 2004. [11] Wen-ChanLee and Bor-Wen Cheng; ‘An Intelligent system for improving performance of blood donation’, Journal of Quality, Vol. 18, Issue No. II, 2011. [12] Arun.K.Pujari(2001): Data Mining Techniques, Universities Press [13] Abdul Nazeer K A, Sebastian M P, “Improving the Accuracy and Efficiency of the k-means Clustering Algorithm,“ Proceedings of the International Conference on Data Mining and Knowledge Engineering, London, UK, 2009. [14] Nidhi Singh and Divakar Singh, “Performance Evaluation of K-Means and Heirarichal Clustering in Terms of Accuracy and Running Time” (IJCSIT) International Journal of Computer Science and Information Technologies, Vol. 3 (3) , 2012,4119-4121. Authors 1Venkateswarlu Bondu received the Master’s Degree in Computer Science and Systems Engineering from Andhra University College of Engineering, pursuing Ph.D in Department of Computer Science and Systems Engineering Andhra University,Visakhapatnam. His research area is Data Mining and Data Warehousing. 2Prof G.S.V.Prasad Raju Professor in Computer Science ,in School of Distance Education, Andhra University,Visakhapatnam. His area of specialization is E-Commerce, Internet Technologies.