Skip to content

Exploratory Data Analysis (EDA) using Python (Pandas, Matplotlib, Seaborn).

Notifications You must be signed in to change notification settings

sinhasomya100/Task-5

Repository files navigation

📊 Exploratory Data Analysis (EDA) – Student Performance Dataset

This project presents a detailed exploratory analysis of the StudentsPerformance.csv dataset.
The goal is to uncover insights into how different factors such as gender, parental education, lunch type, and test preparation affect students' academic performance in Math, Reading, and Writing.


📁 About the Dataset

  • File Name: StudentsPerformance.csv
  • Records: 1000 students
  • Features: 8 columns
  • Data Types: Categorical + Numerical

🎯 Features Included:

  • gender
  • race/ethnicity
  • parental level of education
  • lunch (standard or free/reduced)
  • test preparation course (completed or not)
  • math score
  • reading score
  • writing score

🛠️ Tools & Libraries Used

  • Python 3 (Google Colab)
  • Pandas
  • Seaborn
  • Matplotlib
  • FPDF / MS Word (for report generation)

🧪 EDA Steps Followed

✅ Step 1: Data Loading & Cleaning

  • Loaded the dataset using pandas.read_csv()
  • Cleaned column names for consistency
  • Checked for missing values and duplicates

✅ Step 2: Dataset Overview

  • Viewed top records using .head()
  • Inspected data types using .info()
  • Generated summary statistics with .describe()

✅ Step 3: Univariate Analysis

  • Countplots for categorical columns like gender, lunch, etc.
  • Histograms and boxplots for scores
  • Outlier and distribution detection

✅ Step 4: Correlation & Pairplot

  • Correlation heatmap to identify relationships
  • Pairplot to explore multivariate score patterns

✅ Step 5: Bivariate Analysis

  • Grouped average scores by:
    • Gender
    • Race/Ethnicity
    • Test Preparation Course

📌 Key Insights

  • 🧠 Reading & Writing scores are highly correlated
  • 👩‍🎓 Female students score higher in Reading and Writing
  • 👨‍🎓 Male students slightly outperform in Math
  • 📚 Students who completed test prep scored significantly higher
  • 🍽️ Students with standard lunch performed better
  • 🏅 Group E students had the best overall performance

📄 Files Included

  • EDA_Student Performance.ipynb – Full Python analysis notebook
  • Exploratory Data Analysis Report.pdf – Final summary report in PDF format
  • StudentsPerformance.csv – Original dataset used for analysis
  • TASK 5 DA.pdf – Screenshot-based visual report submission
  • README.md – Project overview and instructions


📌 Conclusion

This EDA highlights the impact of demographics and preparation on academic performance.
The findings can help educators design targeted academic interventions and provide data-driven support to students.


📬 Author

✍️ Somya Sinha
📅 Task 5 – Internship Project (April 2025)
🔗 LinkedIn Profile
💬 Feel free to connect, discuss insights, or suggest improvements!

About

Exploratory Data Analysis (EDA) using Python (Pandas, Matplotlib, Seaborn).

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published