This project aims to predict the likelihood of heart disease based on key health indicators such as age, cholesterol level, blood pressure, and other medical attributes.
The goal is to assist in early diagnosis and prevention by applying machine learning techniques to analyze patient data.
Heart disease is a leading global health issue. Early prediction through data-driven models can assist in medical diagnosis and preventive care.
This project uses supervised learning algorithms to analyze patient data and predict the risk of heart disease.
🚀 Try it here: Clinical Risk Classification System App
- Logistic Regression
- Decision Tree
- Naive Bayes
- Support Vector Machine (SVM)
- K-Nearest Neighbors (KNN)
Each model was trained and compared using metrics such as accuracy and F1-score.
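The comparison described above could be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the data is a synthetic stand-in generated with `make_classification`, the helper name `compare_models` is hypothetical, and every model uses scikit-learn defaults rather than tuned hyperparameters.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

# Assumption: synthetic data with 11 features and a binary target,
# standing in for the real heart disease dataset.
X, y = make_classification(
    n_samples=600, n_features=11, n_informative=6, random_state=42
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42
)

# Scale-sensitive models are wrapped in a pipeline with a scaler.
models = {
    "Logistic Regression": make_pipeline(
        StandardScaler(), LogisticRegression(max_iter=1000)
    ),
    "Decision Tree": DecisionTreeClassifier(random_state=42),
    "Naive Bayes": GaussianNB(),
    "SVM": make_pipeline(StandardScaler(), SVC()),
    "KNN": make_pipeline(StandardScaler(), KNeighborsClassifier()),
}


def compare_models(models, X_train, X_test, y_train, y_test):
    """Fit each model and return its accuracy and F1-score on the test split."""
    results = {}
    for name, model in models.items():
        model.fit(X_train, y_train)
        predictions = model.predict(X_test)
        results[name] = {
            "accuracy": accuracy_score(y_test, predictions),
            "f1": f1_score(y_test, predictions),
        }
    return results


results = compare_models(models, X_train, X_test, y_train, y_test)

# Print the models from highest to lowest accuracy.
for name, scores in sorted(
    results.items(), key=lambda item: item[1]["accuracy"], reverse=True
):
    print(f"{name:<20} accuracy={scores['accuracy']:.2f}  f1={scores['f1']:.2f}")
```

Scores on synthetic data will not match the results reported for the real dataset; the sketch only shows the shape of the comparison loop.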
- Data preprocessing (handling missing values, normalization, encoding)
- Exploratory Data Analysis (EDA) with visualizations
- Feature selection and correlation analysis
- Model training, evaluation, and hyperparameter tuning
- Model comparison and performance reporting
- Interactive web interface built with Streamlit
- Real-time prediction from user input
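As an illustration of the preprocessing step listed above (missing values, encoding, normalization), here is a minimal sketch with pandas and scikit-learn. The tiny DataFrame and its column names (`Age`, `Sex`, `Cholesterol`, `HeartDisease`) are assumptions made for the example, and median imputation is one possible choice, not necessarily what the notebook does.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Assumption: a tiny made-up sample; column names are illustrative only.
df = pd.DataFrame(
    {
        "Age": [40, 49, 37, 54],
        "Sex": ["M", "F", "M", "M"],
        "Cholesterol": [289.0, None, 283.0, 195.0],
        "HeartDisease": [0, 1, 0, 1],
    }
)

# 1. Handle missing values: fill missing cholesterol with the column median.
df["Cholesterol"] = df["Cholesterol"].fillna(df["Cholesterol"].median())

# 2. Encode categorical columns as 0/1 indicator columns.
X = pd.get_dummies(
    df.drop(columns="HeartDisease"), columns=["Sex"], drop_first=True, dtype=int
)
y = df["HeartDisease"]

# 3. Normalize numeric columns to zero mean and unit variance.
numeric_cols = ["Age", "Cholesterol"]
scaler = StandardScaler()
X[numeric_cols] = scaler.fit_transform(X[numeric_cols])

print(X)
```

In a real workflow the scaler would normally be fitted on the training split only and then reused on the test split and on user input.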
The dataset contains various medical attributes such as:
- Age
- Sex
- Chest Pain Type
- Resting Blood Pressure
- Cholesterol
- Fasting Blood Sugar
- Resting ECG
- Maximum Heart Rate
- Exercise-Induced Angina
- ST Depression (oldpeak)
- ST Slope
- Heart Disease (target label)
Dataset Source: UCI Machine Learning Repository - Heart Disease Dataset
- Programming Language: Python
- Libraries: Pandas, NumPy, Matplotlib, Seaborn, Scikit-learn, Streamlit, Joblib
- Tools: Jupyter Notebook
- IDE: VS Code
| Model | Accuracy | F1-Score |
|---|---|---|
| Logistic Regression | 0.87 | 0.88 |
| KNN | 0.86 | 0.88 |
| SVM | 0.85 | 0.87 |
| Naive Bayes | 0.85 | 0.86 |
| Decision Tree | 0.79 | 0.81 |
Logistic Regression achieved the best overall performance, with 87% accuracy and an F1-score of 0.88 (tied with KNN on F1-score).
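For readers unfamiliar with the two metrics, the sketch below computes accuracy and F1-score from confusion-matrix counts. The counts are made up for illustration and are not taken from this project's results; the function names are hypothetical.

```python
def accuracy(tp, fp, fn, tn):
    """Fraction of all predictions that are correct."""
    return (tp + tn) / (tp + fp + fn + tn)


def f1_score_from_counts(tp, fp, fn):
    """Harmonic mean of precision and recall for the positive class."""
    if tp == 0:
        return 0.0  # no true positives: define F1 as 0 to avoid dividing by zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Assumption: hypothetical counts for 100 test patients.
tp, fp, fn, tn = 40, 10, 5, 45

print(f"accuracy = {accuracy(tp, fp, fn, tn):.2f}")
print(f"f1-score = {f1_score_from_counts(tp, fp, fn):.2f}")
```

Accuracy counts both classes equally, while F1-score ignores true negatives and focuses on how well the positive (disease) class is detected, which is why the two columns can differ.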
Project structure:

    Heart-Disease-Prediction/
    │
    ├── DataSet/
    │   └── heart.csv
    ├── venv/
    ├── app.py
    ├── heart_disease_prediction.ipynb
    ├── columns.pkl
    ├── LR_Heart.pkl
    ├── Scaler.pkl
    ├── README.md
    └── requirements.txt
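The file names `LR_Heart.pkl`, `Scaler.pkl`, and `columns.pkl` suggest a saved model, a fitted scaler, and the list of training columns. Assuming that is the case, the sketch below shows how such artifacts might be written with Joblib and reloaded to score a single user input. The training data, the feature names, and the `predict_risk` helper are all hypothetical, and the files are written to a temporary directory rather than the project folder.

```python
import tempfile
from pathlib import Path

import joblib
import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

# Assumption: a tiny made-up training set with illustrative column names.
train = pd.DataFrame(
    {
        "Age": [40, 49, 37, 54, 61, 58],
        "Sex": ["M", "F", "M", "M", "F", "M"],
        "MaxHR": [172, 156, 98, 122, 110, 130],
    }
)
target = [0, 0, 1, 0, 1, 1]

# Encode, scale, and fit, remembering the exact column order used in training.
X = pd.get_dummies(train, columns=["Sex"], dtype=int)
columns = list(X.columns)
scaler = StandardScaler().fit(X)
model = LogisticRegression().fit(scaler.transform(X), target)

# Persist the three artifacts (temporary directory used for the example).
artifact_dir = Path(tempfile.mkdtemp())
joblib.dump(model, artifact_dir / "LR_Heart.pkl")
joblib.dump(scaler, artifact_dir / "Scaler.pkl")
joblib.dump(columns, artifact_dir / "columns.pkl")


def predict_risk(user_input, artifact_dir):
    """Load the saved artifacts and return the predicted disease probability."""
    model = joblib.load(artifact_dir / "LR_Heart.pkl")
    scaler = joblib.load(artifact_dir / "Scaler.pkl")
    columns = joblib.load(artifact_dir / "columns.pkl")

    # Encode the single row, then align it to the training columns;
    # indicator columns absent from this row are filled with 0.
    row = pd.get_dummies(pd.DataFrame([user_input]), dtype=int)
    row = row.reindex(columns=columns, fill_value=0)
    return float(model.predict_proba(scaler.transform(row))[0, 1])


risk = predict_risk({"Age": 45, "Sex": "F", "MaxHR": 150}, artifact_dir)
print(f"Predicted probability of heart disease: {risk:.2f}")
```

Storing the column list alongside the model is what lets a one-row input be aligned with the features the model was trained on.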
- Clone this repository:

      git clone https://github.com/yourusername/Heart-Disease-Prediction.git
      cd Heart-Disease-Prediction

- Create a virtual environment (activate it before installing dependencies):

      python -m venv venv

- Install dependencies:

      pip install -r requirements.txt

- Run the Jupyter Notebook:

  After installing the dependencies in the virtual environment, open heart_disease_prediction.ipynb, select the virtual environment as the kernel, and run all the cells.
- Integrate deep learning models for improved accuracy
- Connect to a database for storing patient predictions
- Improve UI with health insights and visualization dashboards
Muhammad Talha
Final-year Computer Science student at UET Lahore