This repository accompanies my Master's thesis on machine learning algorithms for classification in Python. It contains four data sets that I use to test several classification algorithms.
All of the data sets are available from the UC Irvine Machine Learning Repository in CSV format, together with their descriptions. The data sets used in this project are:
- Bank Marketing Data Set: https://archive.ics.uci.edu/ml/datasets/Bank+Marketing
- Car Evaluation Data Set: https://archive.ics.uci.edu/ml/datasets/Car+Evaluation
- Default of credit card clients Data Set: https://archive.ics.uci.edu/ml/datasets/default+of+credit+card+clients
- Auto MPG Data Set: https://archive.ics.uci.edu/ml/datasets/Auto+MPG
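
As a rough illustration of how one of these data sets might be loaded with pandas, here is a minimal sketch for the Car Evaluation data. It assumes the raw file has no header row, so the column names are supplied by hand from the data set description. The helper `load_car_evaluation` and the inline `SAMPLE_CSV` rows are hypothetical, written only to show the file's format; in practice you would pass the path of the downloaded file instead.

```python
import io

import pandas as pd

# Column names from the data set description (assumption: the raw file has no header row).
CAR_COLUMNS = ["buying", "maint", "doors", "persons", "lug_boot", "safety", "class"]

# Illustrative rows in the file's format, NOT a faithful excerpt of the real data.
SAMPLE_CSV = """\
vhigh,vhigh,2,2,small,low,unacc
vhigh,vhigh,2,2,small,med,unacc
vhigh,vhigh,2,2,small,high,unacc
low,low,5more,more,big,high,vgood
"""


def load_car_evaluation(source):
    """Read Car Evaluation data from a file path or a file-like object (hypothetical helper)."""
    # dtype=str keeps values such as "2" and "5more" as categorical labels.
    return pd.read_csv(source, header=None, names=CAR_COLUMNS, dtype=str)


df = load_car_evaluation(io.StringIO(SAMPLE_CSV))

# Separate the features from the target column before handing them to a classifier.
X = df.drop(columns="class")
y = df["class"]

print(df.shape)
print(y.value_counts())
```

All attributes in this data set are categorical, so they would still need to be encoded (for example with one-hot or ordinal encoding) before being passed to most scikit-learn classifiers.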
Use the pip package manager to install the project's dependencies:
- pandas
- pandas-profiling
- scikit-learn
- matplotlib
- jupyterlab
- seaborn
Alternatively, install them all at once from the requirements.txt file in this repository:

    pip install -r requirements.txt

To launch Jupyter, open a command prompt and type:

    jupyter notebook

Once the browser opens, open an .ipynb file and run it. You can also add your own code or edit the existing code as you see fit.
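
Before running a notebook, it can help to confirm that the dependencies listed above are actually installed. The following is a minimal sketch using only the Python standard library (Python 3.8 or later). The `installed_versions` helper and the `REQUIRED` list are hypothetical names introduced here; the package names are the ones listed in this README.

```python
from importlib.metadata import PackageNotFoundError, version

# Dependencies as listed in this README (distribution names as used by pip).
REQUIRED = [
    "pandas",
    "pandas-profiling",
    "scikit-learn",
    "matplotlib",
    "jupyterlab",
    "seaborn",
]


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


status = installed_versions(REQUIRED)
for name, ver in status.items():
    print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Any package reported as missing can be installed with pip as described above.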
Pull requests are welcome.