Intelligent Heart Attack Prediction System Using Big Data

Abstract: In today’s world heart attack is the very common. So diagnosing patients correctly on timely basis is the most challenging task. Treatment of heart disease is quite high and not affordable by most of the patients. The main objective is to develop a prototype Intelligent Heart Attack Prediction System using Big data and data mining modelling techniques. This system can discover and extract hidden knowledge (patterns and relationships) associated with heart disease from a historical heart disease database. It can answer complex queries for diagnosing heart disease and thus assist healthcare practitioners to make intelligent clinical decisions which traditional decision support systems cannot. By providing effective treatments, it also helps to reduce treatment costs. The healthcare industry collects huge amounts of big data which, unfortunately, are not “mined”. Remedies can be provided by using advanced data mining techniques. Medical diagnosis is regarded as an important yet complicated task that needs to be executed accurately and efficiently. The automation of this system would be extremely advantageous. Therefore, a medical diagnosis system like heart attack prediction system would probably be exceedingly beneficial.

ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org
Page | 73
Paper Publications
Intelligent Heart Attack Prediction System
Using Big Data
1
Prajakta Ghadge, 2
Vrushali Girme, 3
Kajal Kokane, 4
Prajakta Deshmukh
Savatribai Pune University
Abstract: In today’s world heart attack is the very common. So diagnosing patients correctly on timely basis is the
most challenging task. Treatment of heart disease is quite high and not affordable by most of the patients. The
main objective is to develop a prototype Intelligent Heart Attack Prediction System using Big data and data
mining modelling techniques. This system can discover and extract hidden knowledge (patterns and relationships)
associated with heart disease from a historical heart disease database. It can answer complex queries for
diagnosing heart disease and thus assist healthcare practitioners to make intelligent clinical decisions which
traditional decision support systems cannot.
By providing effective treatments, it also helps to reduce treatment costs. The healthcare industry collects huge
amounts of big data which, unfortunately, are not “mined”. Remedies can be provided by using advanced data
mining techniques. Medical diagnosis is regarded as an important yet complicated task that needs to be executed
accurately and efficiently. The automation of this system would be extremely advantageous.
Therefore, a medical diagnosis system like heart attack prediction system would probably be exceedingly
beneficial.
Keywords: Big data; Data Mining; Hadoop; Healthcare; Knowledge-Discovery; Risk Prediction.
I. INTRODUCTION
The heart is vital part of our body. Life is completely dependent on efficient working of heart. If functioning of heart is
not proper, it will affect the other parts of body human such as brain, kidney etc. Now a days heart attack has become a
major cause of uncertain death world wide. Earlier only middle aged and old people were prone to heart attack, but due to
today’s unhealthy lifestyle it is affecting youngsters too. Heart attack prediction is a complex task for medical
practitioners. So heart attack prediction system would prove to be beneficial by using data mining techniques.
Data mining has already established as a novel field for exploring knowledge from hidden patterns in the big datasets.
“Data Mining” is a non-trivial extraction of implicit, previously unknown and potentially useful information from given
data. In short, it is a process to analyze the data from different perspective and gather the knowledge from it. This
discovered knowledge can be used in different application domains for example in healthcare industry (Heart attack
prediction). So we will use data mining techniques along with big data and machine learning to develop software to assist
doctors and other people in making decision of heart attack in early stages.
II. LITERATURE SURVEY
Investigation is carried out that focuses on diagnosis of heart disease. Different data mining techniques have been applied
for diagnosis & achieved different probabilities for several methods.
An Intelligent Heart Attack Prediction System is developed using data mining techniques. Neural Network, Naïve Bayes
and Decision Tree proposed by Sellappan Palaniappan.
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org
Page | 74
Paper Publications
Each method has its own strength to get appropriate outputs. To setup this system hidden patterns and relationship
between them is used. It is expandable, web based and user friendly system.
Multi parametric feature with linear and nonlinear characteristics of Heart Rate Variability (HRV) a novel technique was
developed by HeonGyu Lee. To achieve this, he has used several classifiers i.e. Bayesian Classifiers, Classification based
on Multiple Association Rules (CMAR), Decision Tree and Support Vector Machine (SVM).
The identification problem of constrained association rules for heart attack prediction was studied by Carlos Ordonez.
The final dataset contains records of patients having heart disease. Three constraints to decrease the number of patterns
are as follows:
1. The attributes should appear only on one side of the rule.
2. Separation of attributes into groups i.e. uninteresting groups.
3. In a rule, the number of attributes should be limited. The result will be separated into two groups of rules, the presence
or absence of heart disease.
In heart attack prediction, Blood Pressure and Sugar with the help of neural networks was proposed by Niti Guru. The
dataset has 13 attributes in each record. The supervised network i.e. Neural Network along with back propagation
algorithm is used for training and testing of data.
Akhil Jabbarr proposed an efficient associative classification algorithm by using genetic approach for heart attack
prediction. The main motive of using genetic algorithm in the discovery of high level prediction rules is that the
discovered rules have high predictive accuracy and are highly comprehensible with great interestingness values.
Kiyong Noh, for the extraction of multi parametric features by assessing Heart Rate Variability (HRV) from ECG, Data
preprocessing and heart disease pattern used classification method. The dataset consisting of 670 people was distributed
into two groups viz, normal people and patients with heart disease.
III. TECHNOLOGIES
Big Data:
Big data is a broad term for datasets, it is so large or complex that traditional data processing applications are insufficient.
Challenges of big data include analysis, capture, data collection, search, sharing, storage, transfer, visualization, and
information privacy. The term simply refers to the use of predictive analysis or other certain advanced methods to extract
valuable information from data, and often refer to a particular size of dataset. Accurate analysis in big data may result in
more confident decision making. And better decisions can provide us with greater operational efficiency, reduced risk
and cost reduction.
Data mining:
Data mining is the analysis step of Knowledge Discovery in Databases or KDD. It is an interdisciplinary subfield of
computer science. This is the process of discovering patterns in large datasets ("big data") involving methods at the
intersection of artificial intelligence, machine learning, database systems and statistics. The final goal of the data mining
process is to extract relevant information from a dataset and transform it into an understandable format for further use.
Hadoop and Mahout:
Apache Hadoop is an open-source software framework written in Java for distributed processing and distributed storage
of huge datasets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a
significant assumption that hardware failures of individual machines, or racks of machines are common place and should
be handled automatically in s/w by the framework. The core of Apache Hadoop consists of a storage part Hadoop
Distributed File System (HDFS) and a processing part Map Reduce. Hadoop files are split into large number of blocks
and are distributed among the nodes in the cluster. For data processing, Hadoop Map Reduce transfers packaged code for
fnodes for parallel processing, based on the given data each node needs to be processed. This approach takes advantage of
data locality nodes, manipulating the data that they have on hand to allow the data to be processed more efficiently and
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org
Page | 75
Paper Publications
faster than it would be in a more conventional supercomputer architecture that depends on a parallel file system where
data and computation are connected through high-speed networking.
Apache Software Foundation produced Apache Mahout project for free implementations of distributed or scalable
machine learning algorithms which primarily focus on the areas of collaborative filtering, classification and clustering.
Apache Hadoop platform is used by many applications. It also provides Java libraries for common math operations which
focus on linear algebra and statistics and primitive Java collections.
Naive Bayes:
Naive Bayes classification is based on Bayes theorem. This classification algorithm uses conditional independence, means
it assumes that an attribute value on a given class is independent on values of other attributes.
The Bayes theorem is as follows:
Let,
A={a1, a2, …....an} be a set of n attributes.
In Bayesian theorem, A is considered as evidence and H be some hypothesis mean value, the data of A is a subset of class
C.
We have to evaluate P (H|A), the probability that the hypothesis H holds given evidence i.e. data sample A.
According to Bayesian theorem the P (H|A) is given as,
P (H|A) = P (A| H) P (H) / P (A)
IV. ARCHITECTURE
System architecture is described below:
Fig 1. Architecture
 This system consists of two datasets. First is original big dataset and the other one is updated dataset.
 HDFS: It is a Java-based file system which provides a user with reliable and scalable data storage.
Name node: It is the centre piece of an HDFS. It keeps the record of all files in the file system, and tracks location of the
file in a cluster. The name node cannot store the file data by itself.
Data node: Data Node stores data in the Hadoop File System. A functional file system consists of more than one Data
Node, with data replication.
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org
Page | 76
Paper Publications
 Hbase: It is used when one needs random, real time read or write access to Big Data. The goal of hbase is to host
very large tables i.e. billions of rows X millions of columns.
V. RESULTS
Data set:
Clinical records have accumulated large quantity of data about patients and their medical diagnosis reports. The term
cardiovascular disease involves the diverse diseases that affect the heart. Record set with medical attributes was obtained
from the Cleveland Heart Disease database which is available on web. With the assistance of the dataset, the patterns
necessary for heart attack prediction are extracted.
Input attributes:
1. Age in Year.
2. Sex- (value 1: Male; value 0: Female).
3. Thal - (value 3: normal; value 6: fixed defect; value 7: reversible defect).
4. CA – number of major vessels colored by fluoroscopy (value 0-3).
5. Old peak – ST depression induced by exercise.
6. Exang - exercise induced angina (value 1: yes; value 0: no).
7. Chest Pain Type -(value 1:typical type1 angina, value 2: typical type 2 angina, value 3:non-angina pain; value 4:
asymptomatic).
8. Restecg – resting electrographic results (value 0: normal; value 1: having ST-T wave abnormality; value 2: showing
probable or definite left ventricular hypertrophy).
9. Serum Cholestrol (mg/dl).
10 .Fasting Blood Sugar- (value 0: <120 mg/dl:value 1: >120 mg/dl; ).
11. Trest Blood Pressure (mm Hg on admission to the hospital).
12. Slope – the slope of the peak exercise ST segment (value 1: unsloping; value 2: flat; value 3: down sloping).
13. Thalach – maximum heart rate achieved.
14. Heart Disease Present - 0:No 1: Yes.
Fig 2. Training time for Mahout with different number of nodes
ISSN 2350-1022
International Journal of Recent Research in Mathematics Computer Science and Information Technology
Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org
Page | 77
Paper Publications
VI. CONCLUSION
Here we have studied heart attack prediction system using big data solution. Our proposed solution gives big data
infrastructure for both predictive modeling and information extraction. We study the effectiveness of our proposed system
with an experiment set, consisting of scalability and quality. Future scope of our system aims at giving big data
infrastructure for our designed risk calculation tools, for designing more sophisticated prediction models and feature
extraction techniques and extending our proposed system to predict other clinical risks.
REFERENCES
[1] Big Data Solutions for Predicting Risk-of-Readmission for Congestive Heart Failure Patients by Kiyana Zolfaghar,
Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin Institute of Technology, CWDS, UW Tacoma.
IEEE 2013.
[2] Prediction of Heart Disease using Classification Algorithms by Hlaudi Daniel Masethe, Mosima Anna Masethe,
USA
[3] Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction by Jyoti Soni, Ujma
Ansari, Dipesh Sharma International Journal of Computer Applications
[4] Heart Disease Prediction System using Associative Classification and Genetic Algorithm by M.Akhil jabbar ,
Dr.Priti Chandrab, Dr.B.L Deekshatuluc, Research Scholar, JNTU Hyderabad, A.P INDIA
[5] Heart Disease Prediction System using Naïve Bayes and Jelinek-mercer smoothing by Ms.Rupali R.Patil Asst.
Professor, Jawaharlal Nehru College of Engineering.
[6] Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques Chaitrali S.
Dangare Sulabha S. Apte, PhD.

Recomendados

Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey por
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive SurveyPrognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Survey
Prognosis of Cardiac Disease using Data Mining Techniques A Comprehensive Surveyijtsrd
50 vistas5 diapositivas
Heart Disease Prediction Using Data Mining Techniques por
Heart Disease Prediction Using Data Mining TechniquesHeart Disease Prediction Using Data Mining Techniques
Heart Disease Prediction Using Data Mining TechniquesIJRES Journal
6K vistas4 diapositivas
PSO-An Intellectual Technique for Feature Reduction on Heart Malady Anticipat... por
PSO-An Intellectual Technique for Feature Reduction on Heart Malady Anticipat...PSO-An Intellectual Technique for Feature Reduction on Heart Malady Anticipat...
PSO-An Intellectual Technique for Feature Reduction on Heart Malady Anticipat...Sivagowry Shathesh
1.2K vistas12 diapositivas
A Survey on Heart Disease Prediction Techniques por
A Survey on Heart Disease Prediction TechniquesA Survey on Heart Disease Prediction Techniques
A Survey on Heart Disease Prediction Techniquesijtsrd
73 vistas4 diapositivas
Heart disease prediction por
Heart disease predictionHeart disease prediction
Heart disease predictionAriful Haque
738 vistas16 diapositivas
Psdot 14 using data mining techniques in heart por
Psdot 14 using data mining techniques in heartPsdot 14 using data mining techniques in heart
Psdot 14 using data mining techniques in heartZTech Proje
4.7K vistas5 diapositivas

Más contenido relacionado

La actualidad más candente

Intelligent data analysis for medicinal diagnosis por
Intelligent data analysis for medicinal diagnosisIntelligent data analysis for medicinal diagnosis
Intelligent data analysis for medicinal diagnosisIRJET Journal
34 vistas14 diapositivas
Prediction of heart disease using classification mining technique on spark por
Prediction of heart disease using classification mining technique on sparkPrediction of heart disease using classification mining technique on spark
Prediction of heart disease using classification mining technique on sparkdbpublications
32 vistas5 diapositivas
IRJET- A Literature Review on Heart and Alzheimer Disease Prediction por
IRJET- A Literature Review on Heart and Alzheimer Disease PredictionIRJET- A Literature Review on Heart and Alzheimer Disease Prediction
IRJET- A Literature Review on Heart and Alzheimer Disease PredictionIRJET Journal
137 vistas4 diapositivas
Data mining techniques on heart failure diagnosis por
Data mining techniques on heart failure diagnosisData mining techniques on heart failure diagnosis
Data mining techniques on heart failure diagnosisSteve Iduye
404 vistas38 diapositivas
50120130406032 por
5012013040603250120130406032
50120130406032IAEME Publication
304 vistas6 diapositivas
H0342044046 por
H0342044046H0342044046
H0342044046inventionjournals
167 vistas3 diapositivas

La actualidad más candente(20)

Intelligent data analysis for medicinal diagnosis por IRJET Journal
Intelligent data analysis for medicinal diagnosisIntelligent data analysis for medicinal diagnosis
Intelligent data analysis for medicinal diagnosis
IRJET Journal34 vistas
Prediction of heart disease using classification mining technique on spark por dbpublications
Prediction of heart disease using classification mining technique on sparkPrediction of heart disease using classification mining technique on spark
Prediction of heart disease using classification mining technique on spark
dbpublications32 vistas
IRJET- A Literature Review on Heart and Alzheimer Disease Prediction por IRJET Journal
IRJET- A Literature Review on Heart and Alzheimer Disease PredictionIRJET- A Literature Review on Heart and Alzheimer Disease Prediction
IRJET- A Literature Review on Heart and Alzheimer Disease Prediction
IRJET Journal137 vistas
Data mining techniques on heart failure diagnosis por Steve Iduye
Data mining techniques on heart failure diagnosisData mining techniques on heart failure diagnosis
Data mining techniques on heart failure diagnosis
Steve Iduye404 vistas
Health care analytics por Gaurav Dubey
Health care analyticsHealth care analytics
Health care analytics
Gaurav Dubey223 vistas
Detection of heart diseases by data mining por Abheepsa Pattnaik
Detection of heart diseases by data miningDetection of heart diseases by data mining
Detection of heart diseases by data mining
Abheepsa Pattnaik10.7K vistas
A comparative study of cn2 rule and svm algorithm por Alexander Decker
A comparative study of cn2 rule and svm algorithmA comparative study of cn2 rule and svm algorithm
A comparative study of cn2 rule and svm algorithm
Alexander Decker1.1K vistas
A Heart Disease Prediction Model using Logistic Regression por ijtsrd
A Heart Disease Prediction Model using Logistic RegressionA Heart Disease Prediction Model using Logistic Regression
A Heart Disease Prediction Model using Logistic Regression
ijtsrd660 vistas
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H... por IRJET Journal
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET- Develop Futuristic Prediction Regarding Details of Health System for H...
IRJET Journal60 vistas
Chronic Kidney Disease Prediction por Rajandeep Gill
Chronic Kidney Disease PredictionChronic Kidney Disease Prediction
Chronic Kidney Disease Prediction
Rajandeep Gill5.2K vistas
Disease Prediction And Doctor Appointment system por KOYELMAJUMDAR1
Disease Prediction And Doctor Appointment  systemDisease Prediction And Doctor Appointment  system
Disease Prediction And Doctor Appointment system
KOYELMAJUMDAR11.7K vistas
Hybrid Technique for Associative Classification of Heart Diseases por Jagdeep Singh Malhi
Hybrid Technique for Associative Classification of Heart DiseasesHybrid Technique for Associative Classification of Heart Diseases
Hybrid Technique for Associative Classification of Heart Diseases
Jagdeep Singh Malhi2.2K vistas
SMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan Phd por Healthcare consultant
SMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan PhdSMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan Phd
SMART HEALTH PREDICTION USING DATA MINING by Dr.Mahboob Khan Phd
Healthcare consultant 10.3K vistas
Survey on data mining techniques in heart disease prediction por Sivagowry Shathesh
Survey on data mining techniques in heart disease predictionSurvey on data mining techniques in heart disease prediction
Survey on data mining techniques in heart disease prediction
Sivagowry Shathesh13.3K vistas
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase por ijtsrd
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBaseA Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
A Heart Disease Prediction Model using Logistic Regression By Cleveland DataBase
ijtsrd2K vistas

Destacado

Energy Aware Computing in Sensor Networks por
Energy Aware Computing in Sensor NetworksEnergy Aware Computing in Sensor Networks
Energy Aware Computing in Sensor Networkspaperpublications3
44 vistas5 diapositivas
Folder Security Using Graphical Password Authentication Scheme por
Folder Security Using Graphical Password Authentication SchemeFolder Security Using Graphical Password Authentication Scheme
Folder Security Using Graphical Password Authentication Schemepaperpublications3
34 vistas8 diapositivas
Fuzzy α^m-Separation Axioms por
Fuzzy α^m-Separation AxiomsFuzzy α^m-Separation Axioms
Fuzzy α^m-Separation Axiomspaperpublications3
29 vistas12 diapositivas
Smart Password por
Smart PasswordSmart Password
Smart Passwordpaperpublications3
28 vistas6 diapositivas
Diabetics Online por
Diabetics OnlineDiabetics Online
Diabetics Onlinepaperpublications3
28 vistas5 diapositivas
Tree Based Proactive Source Routing Protocol for MANETs por
Tree Based Proactive Source Routing Protocol for MANETsTree Based Proactive Source Routing Protocol for MANETs
Tree Based Proactive Source Routing Protocol for MANETspaperpublications3
37 vistas4 diapositivas

Destacado(20)

Folder Security Using Graphical Password Authentication Scheme por paperpublications3
Folder Security Using Graphical Password Authentication SchemeFolder Security Using Graphical Password Authentication Scheme
Folder Security Using Graphical Password Authentication Scheme
paperpublications334 vistas
Tree Based Proactive Source Routing Protocol for MANETs por paperpublications3
Tree Based Proactive Source Routing Protocol for MANETsTree Based Proactive Source Routing Protocol for MANETs
Tree Based Proactive Source Routing Protocol for MANETs
paperpublications337 vistas
Analysis of Fungus in Plant Using Image Processing Techniques por paperpublications3
Analysis of Fungus in Plant Using Image Processing TechniquesAnalysis of Fungus in Plant Using Image Processing Techniques
Analysis of Fungus in Plant Using Image Processing Techniques
paperpublications352 vistas
Using Bandwidth Aggregation to Improve the Performance of Video Quality- Adap... por paperpublications3
Using Bandwidth Aggregation to Improve the Performance of Video Quality- Adap...Using Bandwidth Aggregation to Improve the Performance of Video Quality- Adap...
Using Bandwidth Aggregation to Improve the Performance of Video Quality- Adap...
paperpublications358 vistas
Fingerprint Feature Extraction, Identification and Authentication: A Review por paperpublications3
Fingerprint Feature Extraction, Identification and Authentication: A ReviewFingerprint Feature Extraction, Identification and Authentication: A Review
Fingerprint Feature Extraction, Identification and Authentication: A Review
paperpublications342 vistas
Efficient File Sharing Scheme in Mobile Adhoc Network por paperpublications3
Efficient File Sharing Scheme in Mobile Adhoc NetworkEfficient File Sharing Scheme in Mobile Adhoc Network
Efficient File Sharing Scheme in Mobile Adhoc Network
paperpublications328 vistas
Efficient and Optimal Routing Scheme for Wireless Sensor Networks por paperpublications3
Efficient and Optimal Routing Scheme for Wireless Sensor NetworksEfficient and Optimal Routing Scheme for Wireless Sensor Networks
Efficient and Optimal Routing Scheme for Wireless Sensor Networks
paperpublications332 vistas
Supervised Multi Attribute Gene Manipulation For Cancer por paperpublications3
Supervised Multi Attribute Gene Manipulation For CancerSupervised Multi Attribute Gene Manipulation For Cancer
Supervised Multi Attribute Gene Manipulation For Cancer
paperpublications346 vistas
Privacy Preserving Data Leak Detection for Sensitive Data por paperpublications3
Privacy Preserving Data Leak Detection for Sensitive DataPrivacy Preserving Data Leak Detection for Sensitive Data
Privacy Preserving Data Leak Detection for Sensitive Data
paperpublications335 vistas
Sign Language Recognition with Gesture Analysis por paperpublications3
Sign Language Recognition with Gesture AnalysisSign Language Recognition with Gesture Analysis
Sign Language Recognition with Gesture Analysis
paperpublications334 vistas
Comparative Study of Data Mining Classification Algorithms in Heart Disease P... por paperpublications3
Comparative Study of Data Mining Classification Algorithms in Heart Disease P...Comparative Study of Data Mining Classification Algorithms in Heart Disease P...
Comparative Study of Data Mining Classification Algorithms in Heart Disease P...
paperpublications335 vistas
Avoiding Anonymous Users in Multiple Social Media Networks (SMN) por paperpublications3
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
Avoiding Anonymous Users in Multiple Social Media Networks (SMN)
paperpublications347 vistas
Multimedia Cloud Computing To Save Smartphone Energy por paperpublications3
Multimedia Cloud Computing To Save Smartphone EnergyMultimedia Cloud Computing To Save Smartphone Energy
Multimedia Cloud Computing To Save Smartphone Energy
paperpublications328 vistas

Similar a Intelligent Heart Attack Prediction System Using Big Data

HEALTH PREDICTION ANALYSIS USING DATA MINING por
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MININGAshish Salve
840 vistas6 diapositivas
Comparing Data Mining Techniques used for Heart Disease Prediction por
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease PredictionIRJET Journal
151 vistas7 diapositivas
HEALTH PREDICTION ANALYSIS USING DATA MINING por
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MININGAshish Salve
756 vistas34 diapositivas
IRJET-Survey on Data Mining Techniques for Disease Prediction por
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET Journal
86 vistas7 diapositivas
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION por
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONMULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONIJDKP
37 vistas7 diapositivas
Prediction of Heart Disease using Machine Learning Algorithms: A Survey por
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Surveyrahulmonikasharma
5.5K vistas6 diapositivas

Similar a Intelligent Heart Attack Prediction System Using Big Data(20)

HEALTH PREDICTION ANALYSIS USING DATA MINING por Ashish Salve
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve840 vistas
Comparing Data Mining Techniques used for Heart Disease Prediction por IRJET Journal
Comparing Data Mining Techniques used for Heart Disease PredictionComparing Data Mining Techniques used for Heart Disease Prediction
Comparing Data Mining Techniques used for Heart Disease Prediction
IRJET Journal151 vistas
HEALTH PREDICTION ANALYSIS USING DATA MINING por Ashish Salve
HEALTH PREDICTION ANALYSIS USING DATA  MININGHEALTH PREDICTION ANALYSIS USING DATA  MINING
HEALTH PREDICTION ANALYSIS USING DATA MINING
Ashish Salve756 vistas
IRJET-Survey on Data Mining Techniques for Disease Prediction por IRJET Journal
IRJET-Survey on Data Mining Techniques for Disease PredictionIRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET-Survey on Data Mining Techniques for Disease Prediction
IRJET Journal86 vistas
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION por IJDKP
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTIONMULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
MULTI MODEL DATA MINING APPROACH FOR HEART FAILURE PREDICTION
IJDKP37 vistas
Prediction of Heart Disease using Machine Learning Algorithms: A Survey por rahulmonikasharma
Prediction of Heart Disease using Machine Learning Algorithms: A SurveyPrediction of Heart Disease using Machine Learning Algorithms: A Survey
Prediction of Heart Disease using Machine Learning Algorithms: A Survey
rahulmonikasharma5.5K vistas
Analysis on Data Mining Techniques for Heart Disease Dataset por IRJET Journal
Analysis on Data Mining Techniques for Heart Disease DatasetAnalysis on Data Mining Techniques for Heart Disease Dataset
Analysis on Data Mining Techniques for Heart Disease Dataset
IRJET Journal38 vistas
IRJET- Analyse Big Data Electronic Health Records Database using Hadoop Cluster por IRJET Journal
IRJET- Analyse Big Data Electronic Health Records Database using Hadoop ClusterIRJET- Analyse Big Data Electronic Health Records Database using Hadoop Cluster
IRJET- Analyse Big Data Electronic Health Records Database using Hadoop Cluster
IRJET Journal248 vistas
IRJET- Heart Disease Prediction and Recommendation por IRJET Journal
IRJET-  	  Heart Disease Prediction and RecommendationIRJET-  	  Heart Disease Prediction and Recommendation
IRJET- Heart Disease Prediction and Recommendation
IRJET Journal128 vistas
Paper id 212014112 por IJRAT
Paper id 212014112Paper id 212014112
Paper id 212014112
IJRAT261 vistas
prediction of heart disease using machine learning algorithms por INFOGAIN PUBLICATION
prediction of heart disease using machine learning algorithmsprediction of heart disease using machine learning algorithms
prediction of heart disease using machine learning algorithms
Disease prediction in big data healthcare using extended convolutional neural... por IJAAS Team
Disease prediction in big data healthcare using extended convolutional neural...Disease prediction in big data healthcare using extended convolutional neural...
Disease prediction in big data healthcare using extended convolutional neural...
IJAAS Team69 vistas
Ijarcet vol-2-issue-4-1393-1397 por Editor IJARCET
Ijarcet vol-2-issue-4-1393-1397Ijarcet vol-2-issue-4-1393-1397
Ijarcet vol-2-issue-4-1393-1397
Editor IJARCET313 vistas
DESIGN AND IMPLEMENTATION OF CARDIAC DISEASE USING NAIVE BAYES TECHNIQUE por IRJET Journal
DESIGN AND IMPLEMENTATION OF CARDIAC DISEASE USING NAIVE BAYES TECHNIQUEDESIGN AND IMPLEMENTATION OF CARDIAC DISEASE USING NAIVE BAYES TECHNIQUE
DESIGN AND IMPLEMENTATION OF CARDIAC DISEASE USING NAIVE BAYES TECHNIQUE
IRJET Journal9 vistas
Early Identification of Diseases Based on Responsible Attribute using Data Mi... por IRJET Journal
Early Identification of Diseases Based on Responsible Attribute using Data Mi...Early Identification of Diseases Based on Responsible Attribute using Data Mi...
Early Identification of Diseases Based on Responsible Attribute using Data Mi...
IRJET Journal64 vistas
Health Care Application using Machine Learning and Deep Learning por IRJET Journal
Health Care Application using Machine Learning and Deep LearningHealth Care Application using Machine Learning and Deep Learning
Health Care Application using Machine Learning and Deep Learning
IRJET Journal11 vistas
Smart Health Prediction Report por Arhind Gautam
Smart Health Prediction ReportSmart Health Prediction Report
Smart Health Prediction Report
Arhind Gautam5.6K vistas
Seminar report SMART HEALTH PREDICTION por Arhind Gautam
Seminar report SMART HEALTH PREDICTIONSeminar report SMART HEALTH PREDICTION
Seminar report SMART HEALTH PREDICTION
Arhind Gautam295 vistas

Último

START Newsletter 3 por
START Newsletter 3START Newsletter 3
START Newsletter 3Start Project
6 vistas25 diapositivas
Web Dev Session 1.pptx por
Web Dev Session 1.pptxWeb Dev Session 1.pptx
Web Dev Session 1.pptxVedVekhande
13 vistas22 diapositivas
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for Growth por
BCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for GrowthBCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for Growth
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for GrowthInnomantra
10 vistas4 diapositivas
Design of machine elements-UNIT 3.pptx por
Design of machine elements-UNIT 3.pptxDesign of machine elements-UNIT 3.pptx
Design of machine elements-UNIT 3.pptxgopinathcreddy
34 vistas31 diapositivas
Searching in Data Structure por
Searching in Data StructureSearching in Data Structure
Searching in Data Structureraghavbirla63
14 vistas8 diapositivas
802.11 Computer Networks por
802.11 Computer Networks802.11 Computer Networks
802.11 Computer NetworksTusharChoudhary72015
13 vistas33 diapositivas

Último(20)

Web Dev Session 1.pptx por VedVekhande
Web Dev Session 1.pptxWeb Dev Session 1.pptx
Web Dev Session 1.pptx
VedVekhande13 vistas
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for Growth por Innomantra
BCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for GrowthBCIC - Manufacturing Conclave -  Technology-Driven Manufacturing for Growth
BCIC - Manufacturing Conclave - Technology-Driven Manufacturing for Growth
Innomantra 10 vistas
Design of machine elements-UNIT 3.pptx por gopinathcreddy
Design of machine elements-UNIT 3.pptxDesign of machine elements-UNIT 3.pptx
Design of machine elements-UNIT 3.pptx
gopinathcreddy34 vistas
Searching in Data Structure por raghavbirla63
Searching in Data StructureSearching in Data Structure
Searching in Data Structure
raghavbirla6314 vistas
fakenews_DBDA_Mar23.pptx por deepmitra8
fakenews_DBDA_Mar23.pptxfakenews_DBDA_Mar23.pptx
fakenews_DBDA_Mar23.pptx
deepmitra816 vistas
MongoDB.pdf por ArthyR3
MongoDB.pdfMongoDB.pdf
MongoDB.pdf
ArthyR349 vistas
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf por AlhamduKure
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdfASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf
ASSIGNMENTS ON FUZZY LOGIC IN TRAFFIC FLOW.pdf
AlhamduKure6 vistas
GDSC Mikroskil Members Onboarding 2023.pdf por gdscmikroskil
GDSC Mikroskil Members Onboarding 2023.pdfGDSC Mikroskil Members Onboarding 2023.pdf
GDSC Mikroskil Members Onboarding 2023.pdf
gdscmikroskil59 vistas
SUMIT SQL PROJECT SUPERSTORE 1.pptx por Sumit Jadhav
SUMIT SQL PROJECT SUPERSTORE 1.pptxSUMIT SQL PROJECT SUPERSTORE 1.pptx
SUMIT SQL PROJECT SUPERSTORE 1.pptx
Sumit Jadhav 22 vistas
_MAKRIADI-FOTEINI_diploma thesis.pptx por fotinimakriadi
_MAKRIADI-FOTEINI_diploma thesis.pptx_MAKRIADI-FOTEINI_diploma thesis.pptx
_MAKRIADI-FOTEINI_diploma thesis.pptx
fotinimakriadi10 vistas
Proposal Presentation.pptx por keytonallamon
Proposal Presentation.pptxProposal Presentation.pptx
Proposal Presentation.pptx
keytonallamon63 vistas
Ansari: Practical experiences with an LLM-based Islamic Assistant por M Waleed Kadous
Ansari: Practical experiences with an LLM-based Islamic AssistantAnsari: Practical experiences with an LLM-based Islamic Assistant
Ansari: Practical experiences with an LLM-based Islamic Assistant
M Waleed Kadous7 vistas
Design_Discover_Develop_Campaign.pptx por ShivanshSeth6
Design_Discover_Develop_Campaign.pptxDesign_Discover_Develop_Campaign.pptx
Design_Discover_Develop_Campaign.pptx
ShivanshSeth645 vistas
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc... por csegroupvn
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
Design of Structures and Foundations for Vibrating Machines, Arya-ONeill-Pinc...
csegroupvn6 vistas
REACTJS.pdf por ArthyR3
REACTJS.pdfREACTJS.pdf
REACTJS.pdf
ArthyR335 vistas

Intelligent Heart Attack Prediction System Using Big Data

  • 1. ISSN 2350-1022 International Journal of Recent Research in Mathematics Computer Science and Information Technology Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org Page | 73 Paper Publications Intelligent Heart Attack Prediction System Using Big Data 1 Prajakta Ghadge, 2 Vrushali Girme, 3 Kajal Kokane, 4 Prajakta Deshmukh Savatribai Pune University Abstract: In today’s world heart attack is the very common. So diagnosing patients correctly on timely basis is the most challenging task. Treatment of heart disease is quite high and not affordable by most of the patients. The main objective is to develop a prototype Intelligent Heart Attack Prediction System using Big data and data mining modelling techniques. This system can discover and extract hidden knowledge (patterns and relationships) associated with heart disease from a historical heart disease database. It can answer complex queries for diagnosing heart disease and thus assist healthcare practitioners to make intelligent clinical decisions which traditional decision support systems cannot. By providing effective treatments, it also helps to reduce treatment costs. The healthcare industry collects huge amounts of big data which, unfortunately, are not “mined”. Remedies can be provided by using advanced data mining techniques. Medical diagnosis is regarded as an important yet complicated task that needs to be executed accurately and efficiently. The automation of this system would be extremely advantageous. Therefore, a medical diagnosis system like heart attack prediction system would probably be exceedingly beneficial. Keywords: Big data; Data Mining; Hadoop; Healthcare; Knowledge-Discovery; Risk Prediction. I. INTRODUCTION The heart is vital part of our body. Life is completely dependent on efficient working of heart. If functioning of heart is not proper, it will affect the other parts of body human such as brain, kidney etc. Now a days heart attack has become a major cause of uncertain death world wide. Earlier only middle aged and old people were prone to heart attack, but due to today’s unhealthy lifestyle it is affecting youngsters too. Heart attack prediction is a complex task for medical practitioners. So heart attack prediction system would prove to be beneficial by using data mining techniques. Data mining has already established as a novel field for exploring knowledge from hidden patterns in the big datasets. “Data Mining” is a non-trivial extraction of implicit, previously unknown and potentially useful information from given data. In short, it is a process to analyze the data from different perspective and gather the knowledge from it. This discovered knowledge can be used in different application domains for example in healthcare industry (Heart attack prediction). So we will use data mining techniques along with big data and machine learning to develop software to assist doctors and other people in making decision of heart attack in early stages. II. LITERATURE SURVEY Investigation is carried out that focuses on diagnosis of heart disease. Different data mining techniques have been applied for diagnosis & achieved different probabilities for several methods. An Intelligent Heart Attack Prediction System is developed using data mining techniques. Neural Network, Naïve Bayes and Decision Tree proposed by Sellappan Palaniappan.
  • 2. ISSN 2350-1022 International Journal of Recent Research in Mathematics Computer Science and Information Technology Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org Page | 74 Paper Publications Each method has its own strength to get appropriate outputs. To setup this system hidden patterns and relationship between them is used. It is expandable, web based and user friendly system. Multi parametric feature with linear and nonlinear characteristics of Heart Rate Variability (HRV) a novel technique was developed by HeonGyu Lee. To achieve this, he has used several classifiers i.e. Bayesian Classifiers, Classification based on Multiple Association Rules (CMAR), Decision Tree and Support Vector Machine (SVM). The identification problem of constrained association rules for heart attack prediction was studied by Carlos Ordonez. The final dataset contains records of patients having heart disease. Three constraints to decrease the number of patterns are as follows: 1. The attributes should appear only on one side of the rule. 2. Separation of attributes into groups i.e. uninteresting groups. 3. In a rule, the number of attributes should be limited. The result will be separated into two groups of rules, the presence or absence of heart disease. In heart attack prediction, Blood Pressure and Sugar with the help of neural networks was proposed by Niti Guru. The dataset has 13 attributes in each record. The supervised network i.e. Neural Network along with back propagation algorithm is used for training and testing of data. Akhil Jabbarr proposed an efficient associative classification algorithm by using genetic approach for heart attack prediction. The main motive of using genetic algorithm in the discovery of high level prediction rules is that the discovered rules have high predictive accuracy and are highly comprehensible with great interestingness values. Kiyong Noh, for the extraction of multi parametric features by assessing Heart Rate Variability (HRV) from ECG, Data preprocessing and heart disease pattern used classification method. The dataset consisting of 670 people was distributed into two groups viz, normal people and patients with heart disease. III. TECHNOLOGIES Big Data: Big data is a broad term for datasets, it is so large or complex that traditional data processing applications are insufficient. Challenges of big data include analysis, capture, data collection, search, sharing, storage, transfer, visualization, and information privacy. The term simply refers to the use of predictive analysis or other certain advanced methods to extract valuable information from data, and often refer to a particular size of dataset. Accurate analysis in big data may result in more confident decision making. And better decisions can provide us with greater operational efficiency, reduced risk and cost reduction. Data mining: Data mining is the analysis step of Knowledge Discovery in Databases or KDD. It is an interdisciplinary subfield of computer science. This is the process of discovering patterns in large datasets ("big data") involving methods at the intersection of artificial intelligence, machine learning, database systems and statistics. The final goal of the data mining process is to extract relevant information from a dataset and transform it into an understandable format for further use. Hadoop and Mahout: Apache Hadoop is an open-source software framework written in Java for distributed processing and distributed storage of huge datasets on computer clusters built from commodity hardware. All the modules in Hadoop are designed with a significant assumption that hardware failures of individual machines, or racks of machines are common place and should be handled automatically in s/w by the framework. The core of Apache Hadoop consists of a storage part Hadoop Distributed File System (HDFS) and a processing part Map Reduce. Hadoop files are split into large number of blocks and are distributed among the nodes in the cluster. For data processing, Hadoop Map Reduce transfers packaged code for fnodes for parallel processing, based on the given data each node needs to be processed. This approach takes advantage of data locality nodes, manipulating the data that they have on hand to allow the data to be processed more efficiently and
  • 3. ISSN 2350-1022 International Journal of Recent Research in Mathematics Computer Science and Information Technology Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org Page | 75 Paper Publications faster than it would be in a more conventional supercomputer architecture that depends on a parallel file system where data and computation are connected through high-speed networking. Apache Software Foundation produced Apache Mahout project for free implementations of distributed or scalable machine learning algorithms which primarily focus on the areas of collaborative filtering, classification and clustering. Apache Hadoop platform is used by many applications. It also provides Java libraries for common math operations which focus on linear algebra and statistics and primitive Java collections. Naive Bayes: Naive Bayes classification is based on Bayes theorem. This classification algorithm uses conditional independence, means it assumes that an attribute value on a given class is independent on values of other attributes. The Bayes theorem is as follows: Let, A={a1, a2, …....an} be a set of n attributes. In Bayesian theorem, A is considered as evidence and H be some hypothesis mean value, the data of A is a subset of class C. We have to evaluate P (H|A), the probability that the hypothesis H holds given evidence i.e. data sample A. According to Bayesian theorem the P (H|A) is given as, P (H|A) = P (A| H) P (H) / P (A) IV. ARCHITECTURE System architecture is described below: Fig 1. Architecture  This system consists of two datasets. First is original big dataset and the other one is updated dataset.  HDFS: It is a Java-based file system which provides a user with reliable and scalable data storage. Name node: It is the centre piece of an HDFS. It keeps the record of all files in the file system, and tracks location of the file in a cluster. The name node cannot store the file data by itself. Data node: Data Node stores data in the Hadoop File System. A functional file system consists of more than one Data Node, with data replication.
  • 4. ISSN 2350-1022 International Journal of Recent Research in Mathematics Computer Science and Information Technology Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org Page | 76 Paper Publications  Hbase: It is used when one needs random, real time read or write access to Big Data. The goal of hbase is to host very large tables i.e. billions of rows X millions of columns. V. RESULTS Data set: Clinical records have accumulated large quantity of data about patients and their medical diagnosis reports. The term cardiovascular disease involves the diverse diseases that affect the heart. Record set with medical attributes was obtained from the Cleveland Heart Disease database which is available on web. With the assistance of the dataset, the patterns necessary for heart attack prediction are extracted. Input attributes: 1. Age in Year. 2. Sex- (value 1: Male; value 0: Female). 3. Thal - (value 3: normal; value 6: fixed defect; value 7: reversible defect). 4. CA – number of major vessels colored by fluoroscopy (value 0-3). 5. Old peak – ST depression induced by exercise. 6. Exang - exercise induced angina (value 1: yes; value 0: no). 7. Chest Pain Type -(value 1:typical type1 angina, value 2: typical type 2 angina, value 3:non-angina pain; value 4: asymptomatic). 8. Restecg – resting electrographic results (value 0: normal; value 1: having ST-T wave abnormality; value 2: showing probable or definite left ventricular hypertrophy). 9. Serum Cholestrol (mg/dl). 10 .Fasting Blood Sugar- (value 0: <120 mg/dl:value 1: >120 mg/dl; ). 11. Trest Blood Pressure (mm Hg on admission to the hospital). 12. Slope – the slope of the peak exercise ST segment (value 1: unsloping; value 2: flat; value 3: down sloping). 13. Thalach – maximum heart rate achieved. 14. Heart Disease Present - 0:No 1: Yes. Fig 2. Training time for Mahout with different number of nodes
  • 5. ISSN 2350-1022 International Journal of Recent Research in Mathematics Computer Science and Information Technology Vol. 2, Issue 2, pp: (73-77), Month: October 2015 – March 2016, Available at: www.paperpublications.org Page | 77 Paper Publications VI. CONCLUSION Here we have studied heart attack prediction system using big data solution. Our proposed solution gives big data infrastructure for both predictive modeling and information extraction. We study the effectiveness of our proposed system with an experiment set, consisting of scalability and quality. Future scope of our system aims at giving big data infrastructure for our designed risk calculation tools, for designing more sophisticated prediction models and feature extraction techniques and extending our proposed system to predict other clinical risks. REFERENCES [1] Big Data Solutions for Predicting Risk-of-Readmission for Congestive Heart Failure Patients by Kiyana Zolfaghar, Naren Meadem, Ankur Teredesai, Senjuti Basu Roy, Si-Chi Chin Institute of Technology, CWDS, UW Tacoma. IEEE 2013. [2] Prediction of Heart Disease using Classification Algorithms by Hlaudi Daniel Masethe, Mosima Anna Masethe, USA [3] Predictive Data Mining for Medical Diagnosis: An Overview of Heart Disease Prediction by Jyoti Soni, Ujma Ansari, Dipesh Sharma International Journal of Computer Applications [4] Heart Disease Prediction System using Associative Classification and Genetic Algorithm by M.Akhil jabbar , Dr.Priti Chandrab, Dr.B.L Deekshatuluc, Research Scholar, JNTU Hyderabad, A.P INDIA [5] Heart Disease Prediction System using Naïve Bayes and Jelinek-mercer smoothing by Ms.Rupali R.Patil Asst. Professor, Jawaharlal Nehru College of Engineering. [6] Improved Study of Heart Disease Prediction System using Data Mining Classification Techniques Chaitrali S. Dangare Sulabha S. Apte, PhD.