Introduction
Heart disease is a term covering any disorder of the heart. It has become a major concern: studies show that the number of deaths due to heart disease in India has increased significantly over the past few decades, and it is now the leading cause of death in the country.
One study shows that from 1990 to 2016 the death rate due to heart disease in India increased by around 34 per cent, from 155.7 to 209.1 deaths per one lakh (100,000) population.
Preventing heart disease has therefore become essential. Good data-driven systems for predicting heart disease can improve the entire research and prevention process, helping more people live healthy lives. This is where machine learning comes into play: a model trained on patient data can predict the presence of heart disease with reasonable accuracy.
Problem Description :
A dataset is formed from selected information about 920 individuals. The task is to predict, from the information given for each individual, whether that individual suffers from heart disease.
Dataset :
The heart disease dataset consists of patient data from Cleveland, Hungary, Long Beach and Switzerland. The combined dataset consists of 14 features and 916 samples, with many missing values. The features are:
- Age : displays the age of the individual.
- Sex : displays the sex of the individual using the following format : 1 = male, 0 = female.
- Chest-pain type : displays the type of chest pain experienced by the individual using the following format : 1 = typical angina, 2 = atypical angina, 3 = non-anginal pain, 4 = asymptomatic
- Resting Blood Pressure : displays the resting blood pressure value of an individual in mmHg (unit)
- Serum Cholesterol : displays the serum cholesterol in mg/dl (unit)
- Fasting Blood Sugar : compares the fasting blood sugar value of an individual with 120 mg/dl. If fasting blood sugar > 120 mg/dl then : 1 (true) else : 0 (false)
- Resting ECG : 0 = normal, 1 = having ST-T wave abnormality, 2 = left ventricular hypertrophy
- Max heart rate achieved : displays the max heart rate achieved by an individual.
- Exercise induced angina : 1 = yes, 0 = no
- ST depression induced by exercise relative to rest : displays the value, which is an integer or float.
- Peak exercise ST segment : 1 = upsloping, 2 = flat, 3 = downsloping
- Number of major vessels (0-3) colored by fluoroscopy : displays the value as an integer or float.
- Thal : displays the thalassemia : 3 = normal, 6 = fixed defect, 7 = reversible defect
- Diagnosis of heart disease : displays whether the individual is suffering from heart disease or not : 0 = absence, 1-4 = presence.
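The encoding above, together with the missing values in the data, can be illustrated with a minimal sketch. It is not the project's actual preprocessing code. The short column names, the use of "?" as the missing-value marker, and the two sample records are assumptions made for illustration only.

```python
import csv
import io

# Hypothetical short names for the 14 features listed above.
COLUMNS = ["age", "sex", "cp", "trestbps", "chol", "fbs", "restecg",
           "thalach", "exang", "oldpeak", "slope", "ca", "thal", "num"]

MISSING = "?"  # assumption: missing values are written as "?"


def parse_row(fields):
    """Convert one raw record to a dict of floats, using None for missing values."""
    if len(fields) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} fields, got {len(fields)}")
    return {name: (None if value.strip() == MISSING else float(value))
            for name, value in zip(COLUMNS, fields)}


def to_binary_label(diagnosis):
    """Collapse the 0-4 diagnosis into 0 (absence) or 1 (presence)."""
    if diagnosis is None:
        raise ValueError("diagnosis is missing")
    return 0 if diagnosis == 0 else 1


# Two made-up records (not taken from the dataset).
SAMPLE = (
    "60,1,4,140,250,0,2,150,1,1.5,2,1,7,2\n"
    "45,0,2,120,?,0,0,?,0,0.0,1,?,?,0\n"
)

rows = [parse_row(fields) for fields in csv.reader(io.StringIO(SAMPLE))]
labels = [to_binary_label(row["num"]) for row in rows]

# Count missing values per feature to see which columns need imputation.
missing_counts = {name: sum(row[name] is None for row in rows) for name in COLUMNS}
```

Counting missing values per feature before training shows which columns would need to be imputed or dropped.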
Running the web app
Locally
- Install requirements
pip install -r requirements.txt
- Run the Flask web app
python main_file.py
Models used and accuracy
A random forest classifier achieves an average multi-class classification accuracy of 56-60% (183 test samples). Its average binary classification accuracy (heart disease or no heart disease) is 75-80%.
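The gap between the two figures is expected: once the diagnosis is collapsed to "disease" or "no disease", confusing one disease level with another no longer counts as an error, so binary accuracy can never be lower than multi-class accuracy on the same predictions. The sketch below shows this with made-up labels. It is not the project's evaluation code, and the helper names are hypothetical.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that exactly match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("label lists must have the same length")
    if not y_true:
        raise ValueError("label lists must not be empty")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def binarize(labels):
    """Map diagnosis 0 to 0 (absence) and 1-4 to 1 (presence)."""
    return [0 if label == 0 else 1 for label in labels]


# Made-up labels for illustration only (not model output).
y_true = [0, 1, 2, 3, 0, 4, 1, 0]
y_pred = [0, 2, 2, 1, 0, 3, 0, 1]

multi_acc = accuracy(y_true, y_pred)                       # exact class must match
binary_acc = accuracy(binarize(y_true), binarize(y_pred))  # only presence must match

print(f"multi-class accuracy: {multi_acc:.3f}")
print(f"binary accuracy:      {binary_acc:.3f}")
```

In this toy example three of the eight predictions match exactly, while six of the eight agree on presence or absence.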