Jupyter notebook for the Kaggle "Predict HIV Progression" competition. This is a practice project to demonstrate my Python and machine learning skills. I completed this project without looking at any of the solutions submitted by other competitors.

claudiofr/predict_hiv_progression

predict_hiv_progression

Jupyter notebook for the Kaggle "Predict HIV Progression" competition. This is a practice project to demonstrate my Python and machine learning skills.

In this project I build a model to predict a patient's short-term disease progression from the nucleotide sequences of two enzymes and two additional scalar patient parameters. I use the dataset from an old Kaggle competition, available here: https://www.kaggle.com/c/hivprogression/data

The solution to the competition is available on the site, but I did not examine it or any of the submitted notebooks. I developed the solution presented here independently.

I first do some preliminary exploration of the data. This is followed by a number of feature engineering steps that transform the data.
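As an illustration of the kind of transformation that can turn a nucleotide sequence into numeric features, here is a minimal sketch that counts overlapping k-mers. It is not taken from the notebook: the function name `kmer_counts`, the choice of k-mer counting, and the example sequence are all assumptions made for this example.

```python
from collections import Counter
from itertools import product


def kmer_counts(sequence, k=3, alphabet="ACGT"):
    """Return a fixed-length list of overlapping k-mer counts.

    Hypothetical helper for illustration only. The output has one entry
    per possible k-mer over `alphabet`, in lexicographic order. K-mers
    that contain any other symbol are simply not counted.
    """
    vocabulary = ["".join(p) for p in product(alphabet, repeat=k)]
    counts = Counter(sequence[i:i + k] for i in range(len(sequence) - k + 1))
    return [counts.get(kmer, 0) for kmer in vocabulary]


# Made-up sequence, not from the competition data.
features = kmer_counts("ACGTACGT", k=2)
print(len(features), sum(features))
```

Every sequence maps to a vector of the same length regardless of how long the sequence is, which makes the result easy to combine with the scalar patient parameters in a single feature matrix.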

I develop a model and experiment with various parameters. I evaluate the parameter combinations by cross-validation on held-out folds with the StratifiedKFold class, using only the training set. For each of the best combinations I then run predictions on the test set to get an accuracy score.
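The evaluation loop described above can be sketched roughly as follows. This is an assumption-laden example, not the notebook's code: the data is synthetic, `LogisticRegression` and its `C` grid are placeholders for whatever model and parameters the notebook actually uses, and only `StratifiedKFold` comes from the description.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score

# Synthetic stand-in for the training set (assumption: binary labels).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(100, 5))
y_train = (X_train[:, 0] + 0.5 * rng.normal(size=100) > 0).astype(int)

# Stratified folds keep the class ratio roughly equal in every fold.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# Score each candidate parameter value on the training set only.
scores = {}
for C in (0.01, 0.1, 1.0, 10.0):
    model = LogisticRegression(C=C, max_iter=1000)
    fold_scores = cross_val_score(model, X_train, y_train, cv=cv, scoring="accuracy")
    scores[C] = fold_scores.mean()

best_C = max(scores, key=scores.get)
print(best_C, scores[best_C])
```

The test set stays untouched until a best combination has been chosen. At that point the model is refit on the full training set and scored once on the test data.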

The code is organized into a series of classes and functions that can be used to perform the various data transformation and model testing tasks required.
