This project analyzes student academic performance to understand how factors such as gender, parental education, lunch type, and test preparation influence scores in Mathematics, Reading, and Writing.
The analysis focuses on data cleaning, exploratory data analysis (EDA), and visualization using Python.
Student-Performance-Analysis/
│
├── main.py                                 # Main analysis script
├── Expanded_data_with_more_features.csv    # Dataset
├── README.md                               # Project documentation
└── images/                                 # Generated visualizations
    ├── gender_distribution.png
    ├── parent_education_heatmap.png
    └── math_score_boxplot.png
- Python Programming
- Data Analysis using Pandas & NumPy
- Exploratory Data Analysis (EDA)
- Data Visualization using Matplotlib & Seaborn
- Handling Missing Values
- Insight Generation from Visual Patterns
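As an illustration of the missing-value step, here is a minimal sketch. It is not the code in `main.py`: the rows are made up, the column names (`Gender`, `ParentEduc`, `MathScore`) are assumptions about the dataset, and the fill policy (a placeholder label for categories, the median for scores) is just one reasonable choice.

```python
import numpy as np
import pandas as pd

# Made-up rows for illustration; the column names are assumptions,
# not confirmed against the project's CSV file.
df = pd.DataFrame({
    "Gender": ["female", "male", None, "female"],
    "ParentEduc": ["bachelor's degree", None, "high school", "some college"],
    "MathScore": [71.0, np.nan, 87.0, 45.0],
})

# Count missing values per column before deciding how to treat them.
missing_counts = df.isna().sum()

# One possible policy: fill categorical gaps with a placeholder label
# and numeric gaps with the column median.
cleaned = df.copy()
for col in ["Gender", "ParentEduc"]:
    cleaned[col] = cleaned[col].fillna("unknown")
cleaned["MathScore"] = cleaned["MathScore"].fillna(cleaned["MathScore"].median())

print(missing_counts)
print(cleaned)
```

Dropping incomplete rows with `df.dropna()` is the simpler alternative when only a small share of rows is affected.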
Visualizes the count of male and female students to understand dataset composition.
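A count chart like this could be produced roughly as follows. This is a sketch with made-up data and an assumed `Gender` column, drawn with plain Matplotlib rather than whatever `main.py` actually uses.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up sample; "Gender" is an assumed column name.
df = pd.DataFrame({"Gender": ["female", "male", "female", "female", "male"]})

# Number of students in each category, largest first.
gender_counts = df["Gender"].value_counts()

fig, ax = plt.subplots(figsize=(5, 4))
ax.bar(list(gender_counts.index), list(gender_counts.values),
       color=["tab:orange", "tab:blue"])
ax.set_xlabel("Gender")
ax.set_ylabel("Number of students")
ax.set_title("Gender distribution")

print(gender_counts)
```

Calling `fig.savefig(...)` with a path of your choice writes the chart to disk.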
A heatmap highlighting the relationship between parental education levels and student performance across subjects.
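One way to build such a heatmap is to average each subject score per parental education level and colour the resulting table. The sketch below assumes column names `ParentEduc`, `MathScore`, `ReadingScore`, and `WritingScore`, uses made-up rows, and draws with Matplotlib's `imshow`; the project's own script may aggregate or plot differently.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up rows; all column names are assumptions.
df = pd.DataFrame({
    "ParentEduc": ["high school", "high school",
                   "bachelor's degree", "bachelor's degree"],
    "MathScore": [60, 70, 80, 90],
    "ReadingScore": [65, 75, 85, 95],
    "WritingScore": [62, 72, 82, 92],
})
score_cols = ["MathScore", "ReadingScore", "WritingScore"]

# Mean score per subject for each parental education level.
mean_scores = df.groupby("ParentEduc")[score_cols].mean()

fig, ax = plt.subplots(figsize=(6, 3))
im = ax.imshow(mean_scores.to_numpy(), cmap="viridis", aspect="auto")
ax.set_xticks(range(len(score_cols)))
ax.set_xticklabels(score_cols)
ax.set_yticks(range(len(mean_scores.index)))
ax.set_yticklabels(list(mean_scores.index))

# Write the mean value inside each cell.
for i in range(mean_scores.shape[0]):
    for j in range(mean_scores.shape[1]):
        ax.text(j, i, f"{mean_scores.iloc[i, j]:.1f}",
                ha="center", va="center", color="white")

fig.colorbar(im, ax=ax, label="Mean score")
print(mean_scores)
```

With Seaborn installed, `sns.heatmap(mean_scores, annot=True)` gives a comparable figure in one call.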
A boxplot displaying score distribution, spread, and outliers in Mathematics.
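A boxplot of this kind can be sketched as below. The scores are made up, and Matplotlib's default whiskers (1.5 times the interquartile range) decide which points are drawn as outliers; the actual script may use Seaborn or other settings.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up Mathematics scores with one unusually low value.
math_scores = pd.Series([55, 60, 62, 65, 68, 70, 72, 75, 78, 5], name="MathScore")

fig, ax = plt.subplots(figsize=(4, 5))
# The box spans Q1 to Q3, the line inside it is the median, and points
# beyond the whiskers are plotted individually as outliers ("fliers").
box = ax.boxplot(math_scores.to_numpy())
ax.set_xticklabels(["MathScore"])
ax.set_ylabel("Score")
ax.set_title("Math score distribution")

# Values Matplotlib classified as outliers.
flier_values = list(box["fliers"][0].get_ydata())
print(flier_values)
```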
- Parental education level shows a noticeable impact on student performance.
- Scores vary clearly across demographic groups.
- Presence of outliers in Math scores highlights performance disparities.
- Visualization helps in understanding score distribution patterns.
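The outlier observation can also be checked numerically. The sketch below applies the common 1.5 × IQR rule to made-up scores; whether the project uses this rule or a different threshold is an assumption.

```python
import pandas as pd

# Made-up Mathematics scores for illustration.
scores = pd.Series([12, 48, 52, 55, 58, 61, 63, 66, 70, 74, 99], name="MathScore")

# Quartiles and interquartile range.
q1 = scores.quantile(0.25)
q3 = scores.quantile(0.75)
iqr = q3 - q1

# Fences: anything outside [lower, upper] is flagged as an outlier.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = scores[(scores < lower) | (scores > upper)]

print(f"Q1={q1}, Q3={q3}, IQR={iqr}")
print(outliers.tolist())
```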
- Install the required libraries: Pandas, NumPy, Matplotlib, and Seaborn.
- Ensure the dataset is in the same directory as `main.py`.
- Run `main.py` with Python.
- Build a machine learning model to predict student scores
- Deploy the analysis using a Streamlit dashboard
- Add correlation matrix and pair plots
- Perform feature engineering for deeper insights
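The correlation matrix and pair plot items could start from something like the sketch below. It is not part of the current project: the data is made up, the score column names are assumptions, and pandas' built-in `scatter_matrix` stands in for a Seaborn pair plot.

```python
import matplotlib
matplotlib.use("Agg")  # render off-screen so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd
from pandas.plotting import scatter_matrix

# Made-up scores; column names are assumptions.
df = pd.DataFrame({
    "MathScore": [40, 55, 60, 72, 85, 90],
    "ReadingScore": [45, 50, 66, 70, 88, 86],
    "WritingScore": [42, 52, 64, 69, 90, 84],
})
score_cols = ["MathScore", "ReadingScore", "WritingScore"]

# Pearson correlation between every pair of score columns.
corr = df[score_cols].corr()
print(corr.round(2))

# Grid of pairwise scatter plots with histograms on the diagonal.
axes = scatter_matrix(df[score_cols], figsize=(6, 6))
```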
Karan
BS in Data Science & Applications, IIT Madras
This project demonstrates beginner-to-intermediate data analysis skills suitable for internships and entry-level data analytics roles.