In this project, I take on the role of a Data Analyst at the Census Bureau, which collects census data and turns it into visualizations and insights. I will first clean and organize the data. The first visualization will be a scatterplot of average income in each state against the proportion of women in that state. From there, I will make histograms and bar graphs based on race data grouped by state.
I will use glob to combine the files, regex to clean and replace values in columns, fillna to handle missing data, drop_duplicates to remove duplicate rows, and matplotlib to plot the cleaned data.
The data has been provided and organized in 10 files named states[0-9].
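As a minimal sketch of the combining step, the files can be matched with a glob pattern and concatenated into one DataFrame. The `load_states` helper and the `.csv` extension are assumptions made for illustration; adjust the pattern to match the actual file names.

```python
import glob
import os

import pandas as pd


def load_states(directory="."):
    """Read every file matching states[0-9].csv in `directory` into one DataFrame."""
    # Assumption: the files carry a .csv extension; change the pattern if not.
    pattern = os.path.join(directory, "states[0-9].csv")
    paths = sorted(glob.glob(pattern))
    if not paths:
        raise FileNotFoundError(f"no files match {pattern}")
    frames = [pd.read_csv(path) for path in paths]
    # ignore_index=True gives the combined frame a fresh 0..n-1 index.
    return pd.concat(frames, ignore_index=True)
```

Calling `load_states()` from the directory that holds the files returns a single frame with the rows of all matched files stacked in file-name order.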
The columns are:
- State: the name of the state
- TotalPop: total population of the state as a whole number
- Hispanic: percentage of population with indicated race
- White: percentage of population with indicated race
- Black: percentage of population with indicated race
- Native: percentage of population with indicated race
- Asian: percentage of population with indicated race
- Pacific: percentage of population with indicated race
- Income: average income of the state in dollar formatting
- GenderPop: number of population based on gender, expressed as numberM_numberF
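The Income and GenderPop columns need the most cleaning before they can be plotted. The sketch below shows one way to do it with regex, fillna, and drop_duplicates. The `clean_census` helper, the new `Men` and `Women` column names, and the choice to estimate a missing women count as `TotalPop - Men` are all assumptions for illustration, not something the data description prescribes.

```python
import pandas as pd


def clean_census(df):
    """Return a cleaned copy with numeric Income, Men, and Women columns."""
    out = df.copy()
    # Strip "$" and "," so a value like "$43,296.36" becomes 43296.36.
    out["Income"] = pd.to_numeric(
        out["Income"].str.replace(r"[\$,]", "", regex=True), errors="coerce"
    )
    # Split "numberM_numberF" into its two halves, then drop the letters.
    parts = out["GenderPop"].str.split("_", expand=True)
    out["Men"] = pd.to_numeric(
        parts[0].str.replace(r"\D", "", regex=True), errors="coerce"
    )
    out["Women"] = pd.to_numeric(
        parts[1].str.replace(r"\D", "", regex=True), errors="coerce"
    )
    # Assumption: a missing women count is estimated as TotalPop - Men.
    out["Women"] = out["Women"].fillna(out["TotalPop"] - out["Men"])
    # Remove rows that are exact duplicates and renumber the index.
    return out.drop_duplicates().reset_index(drop=True)
```

With the counts in numeric form, the proportion of women for the scatterplot can be computed as `Women / TotalPop`.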