This project presents a detailed exploratory analysis of the StudentsPerformance.csv dataset.
The goal is to uncover insights into how different factors such as gender, parental education, lunch type, and test preparation affect students' academic performance in Math, Reading, and Writing.
- File Name: `StudentsPerformance.csv`
- Records: 1000 students
- Features: 8 columns
- Data Types: Categorical + Numerical
- Columns:
  - `gender`
  - `race/ethnicity`
  - `parental level of education`
  - `lunch` (standard or free/reduced)
  - `test preparation course` (completed or not)
  - `math score`
  - `reading score`
  - `writing score`
- Python 3 (Google Colab)
- Pandas
- Seaborn
- Matplotlib
- FPDF / MS Word (for report generation)
- Loaded the dataset using `pandas.read_csv()`
- Cleaned column names for consistency
- Checked for missing values and duplicates
- Viewed top records using `.head()`
- Inspected data types using `.info()`
- Generated summary statistics with `.describe()`
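
The loading and inspection steps above could look roughly like the sketch below. The helper names `clean_column_names` and `quality_report` are hypothetical, and the snake_case renaming is only one assumption about how the column names were cleaned; the notebook may do it differently.

```python
import pandas as pd


def clean_column_names(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy with lower-case, underscore-separated column names.

    Assumption: "consistent" names means snake_case, e.g.
    "race/ethnicity" -> "race_ethnicity", "math score" -> "math_score".
    """
    out = df.copy()
    out.columns = (
        out.columns.str.strip()
        .str.lower()
        .str.replace(r"[^0-9a-z]+", "_", regex=True)
        .str.strip("_")
    )
    return out


def quality_report(df: pd.DataFrame) -> dict:
    """Count missing cells and fully duplicated rows."""
    return {
        "missing_values": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Example usage (assumes the CSV sits in the working directory):
# df = clean_column_names(pd.read_csv("StudentsPerformance.csv"))
# print(quality_report(df))
# print(df.head())      # first five records
# df.info()             # column dtypes and non-null counts
# print(df.describe())  # summary statistics for the numeric scores
```

Returning a copy from `clean_column_names` leaves the raw frame untouched, so the original column names stay available for comparison.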
- Countplots for categorical columns like gender, lunch, etc.
- Histograms and boxplots for scores
- Outlier detection and distribution checks
- Correlation heatmap to identify relationships
- Pairplot to explore multivariate score patterns
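
As a rough illustration of the visual checks listed above, the sketch below computes the score correlations, flags outliers, and draws four of the plots in one figure. It assumes the original column names with spaces (not renamed ones). The helper names `score_correlations`, `iqr_outliers`, and `plot_overview` are hypothetical, and the 1.5 × IQR rule is an assumed choice for outlier detection, not necessarily the one used in the notebook.

```python
import pandas as pd

# Assumption: the score columns keep their original names from the CSV.
SCORE_COLS = ["math score", "reading score", "writing score"]


def score_correlations(df: pd.DataFrame) -> pd.DataFrame:
    """Pearson correlation matrix of the three score columns."""
    return df[SCORE_COLS].corr()


def iqr_outliers(scores: pd.Series) -> pd.Series:
    """Values more than 1.5 * IQR beyond the quartiles (boxplot whisker rule)."""
    q1, q3 = scores.quantile(0.25), scores.quantile(0.75)
    iqr = q3 - q1
    return scores[(scores < q1 - 1.5 * iqr) | (scores > q3 + 1.5 * iqr)]


def plot_overview(df: pd.DataFrame):
    """Draw a countplot, histogram, boxplot and correlation heatmap."""
    # Imported here so the numeric helpers above work without plotting libraries.
    import matplotlib.pyplot as plt
    import seaborn as sns

    fig, axes = plt.subplots(2, 2, figsize=(12, 9))
    sns.countplot(data=df, x="gender", ax=axes[0, 0])
    sns.histplot(data=df, x="math score", kde=True, ax=axes[0, 1])
    sns.boxplot(data=df[SCORE_COLS], ax=axes[1, 0])
    sns.heatmap(score_correlations(df), annot=True, cmap="coolwarm", ax=axes[1, 1])
    fig.tight_layout()
    return fig


# Example usage (assumes the CSV sits in the working directory):
# df = pd.read_csv("StudentsPerformance.csv")
# plot_overview(df)
# The pairplot builds its own figure, so it is called separately:
# import seaborn as sns; sns.pairplot(df, vars=SCORE_COLS, hue="gender")
```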
- Grouped average scores by:
  - Gender
  - Race/Ethnicity
  - Test Preparation Course
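
The grouped averages can be sketched with a single `groupby` call. The helper name `mean_scores_by` is hypothetical, and the column names are assumed to be the original ones from the CSV.

```python
import pandas as pd

# Assumption: the score columns keep their original names from the CSV.
SCORE_COLS = ["math score", "reading score", "writing score"]


def mean_scores_by(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Average of each score column for every category in `column`."""
    return df.groupby(column)[SCORE_COLS].mean().round(2)


# Example usage (assumes the CSV sits in the working directory):
# df = pd.read_csv("StudentsPerformance.csv")
# for col in ["gender", "race/ethnicity", "test preparation course"]:
#     print(mean_scores_by(df, col), end="\n\n")
```

Keeping the grouping column as a parameter lets the same helper cover lunch type or parental education without further changes.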
- 🧠 Reading & Writing scores are highly correlated
- 👩🎓 Female students score higher in Reading and Writing
- 👨🎓 Male students slightly outperform in Math
- 📚 Students who completed test prep scored significantly higher
- 🍽️ Students with standard lunch performed better
- 🏅 Group E students had the best overall performance
- `EDA_Student Performance.ipynb` – Full Python analysis notebook
- `Exploratory Data Analysis Report.pdf` – Final summary report in PDF format
- `StudentsPerformance.csv` – Original dataset used for analysis
- `TASK 5 DA.pdf` – Screenshot-based visual report submission
- `README.md` – Project overview and instructions
This EDA highlights the impact of demographics and preparation on academic performance.
The findings can help educators design targeted academic interventions and provide data-driven support to students.
✍️ Somya Sinha
📅 Task 5 – Internship Project (April 2025)
🔗 LinkedIn Profile
💬 Feel free to connect, discuss insights, or suggest improvements!