In this project, user ratings from a subset of Amazon reviews are used to train several recommendation systems, comparing a range of algorithms and tuning them with hyperparameter optimization. The models return a list of recommended items based on the reviewer's previous actions.
The Movies_and_TV ratings data was retrieved from Recommender Systems and Personalization Datasets. It consists of (item, user, rating, timestamp) tuples. Only a subset of the data was used in the analysis because of computational constraints caused by sparsity.
Models were trained using:
- The `surprise` library
- A ranking model using `tensorflow-recommenders`
- Collaborative filtering models using `tensorflow` and the `scipy` library
- A popularity recommender model
For the initial training of the models using `surprise`:
- The default parameters of `BaselineOnly`, `KNNBaseline`, `KNNBasic`, `KNNWithMeans`, `KNNWithZScore`, `CoClustering`, `SVD`, `SVDpp`, `NMF` and `NormalPredictor` were evaluated using the `cross_validate` method with 3-fold cross validation to determine which algorithms yielded the lowest `RMSE`.
- Then hyperparameter optimization was performed to find the best parameters for the four model types with the lowest `RMSE` (`SVDpp`, `SVD`, `BaselineOnly` and `KNNBaseline`).
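The two-step procedure above (compare algorithms by cross-validated RMSE, then tune the best ones) can be illustrated without the `surprise` library. The following is a minimal sketch in plain Python, not the project's code: the synthetic ratings, the two toy predictors, the helper `cross_validated_rmse` and the `reg` grid are all assumptions made for illustration.

```python
import random
from statistics import mean

# Synthetic (user, item, rating) triples; these are NOT the project's data.
random.seed(0)
data = [(u, i, float(min(5, max(1, 3 + u % 3 - i % 2 + random.choice([-1, 0, 1])))))
        for u in range(12) for i in range(8) if random.random() < 0.6]

def global_mean_model(train):
    """Toy predictor: always predict the mean training rating."""
    mu = mean(r for _, _, r in train)
    return lambda u, i: mu

def baseline_model(train, reg=5.0):
    """Toy bias predictor mu + b_u + b_i, with biases shrunk by `reg`."""
    mu = mean(r for _, _, r in train)
    by_item, by_user = {}, {}
    for u, i, r in train:
        by_item.setdefault(i, []).append(r - mu)
    b_i = {i: sum(v) / (reg + len(v)) for i, v in by_item.items()}
    for u, i, r in train:
        by_user.setdefault(u, []).append(r - mu - b_i[i])
    b_u = {u: sum(v) / (reg + len(v)) for u, v in by_user.items()}
    # Unseen users or items fall back to a bias of zero.
    return lambda u, i: mu + b_u.get(u, 0.0) + b_i.get(i, 0.0)

def rmse(model, test):
    return (sum((model(u, i) - r) ** 2 for u, i, r in test) / len(test)) ** 0.5

def cross_validated_rmse(fit, rows, n_folds=3, **params):
    """Mean RMSE over n_folds held-out folds."""
    rows = rows[:]
    random.Random(42).shuffle(rows)
    folds = [rows[k::n_folds] for k in range(n_folds)]
    scores = []
    for k in range(n_folds):
        train = [row for j, fold in enumerate(folds) if j != k for row in fold]
        scores.append(rmse(fit(train, **params), folds[k]))
    return mean(scores)

# Step 1: compare algorithms using their default parameters.
results = {
    "global_mean": cross_validated_rmse(global_mean_model, data),
    "baseline": cross_validated_rmse(baseline_model, data),
}
best_algo = min(results, key=results.get)

# Step 2: grid search over one hyperparameter of the bias predictor.
grid = {reg: cross_validated_rmse(baseline_model, data, reg=reg)
        for reg in (1.0, 5.0, 20.0)}
best_reg = min(grid, key=grid.get)
```

With the real library, the same roles would presumably be played by its own cross-validation and grid-search utilities; the sketch only shows the selection logic.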
For the construction of the ranking recommender model:
- The data was loaded in a `tensorflow` dataset and partitioned into train/test sets.
- A vocabulary was generated to map the feature values to embedding vectors.
- The model used an `embedding_dimension=64` and contained multiple stacked dense layers with `activation='relu'`.
- Raw features were used as input to `compute_loss` and `MeanSquaredError` was used for the loss metric.
- The model was fit for 20 `epochs` and evaluated.
- The ranking model was tested by computing predictions for items and ranking by the predictions made.
- The model was saved and exported for serving purposes in TensorFlow Lite.
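The structure described above (vocabulary lookup, embeddings, stacked ReLU layers, a mean squared error loss, and ranking by predicted score) can be sketched in NumPy. This is an untrained forward pass for illustration only, not the `tensorflow-recommenders` model itself: the IDs, the layer sizes and the helper names `predict`, `mean_squared_error` and `rank_items` are assumptions; only the embedding dimension of 64 comes from the text.

```python
import numpy as np

rng = np.random.default_rng(0)
EMBEDDING_DIMENSION = 64  # value quoted in the text

# Hypothetical IDs; in the project these would come from the ratings data.
user_ids = ["u1", "u2", "u3"]
item_ids = ["i1", "i2", "i3", "i4"]

# Vocabulary: map each raw feature value to a row of an embedding table.
user_vocab = {u: idx for idx, u in enumerate(user_ids)}
item_vocab = {i: idx for idx, i in enumerate(item_ids)}
user_emb = rng.normal(scale=0.1, size=(len(user_vocab), EMBEDDING_DIMENSION))
item_emb = rng.normal(scale=0.1, size=(len(item_vocab), EMBEDDING_DIMENSION))

# Stacked dense layers (sizes are assumptions): two ReLU layers, then a linear output.
layer_sizes = [2 * EMBEDDING_DIMENSION, 32, 16, 1]
weights = [rng.normal(scale=0.1, size=(m, n))
           for m, n in zip(layer_sizes[:-1], layer_sizes[1:])]
biases = [np.zeros(n) for n in layer_sizes[1:]]

def predict(users, items):
    """Predicted rating for each (user, item) pair, using random weights."""
    x = np.concatenate(
        [user_emb[[user_vocab[u] for u in users]],
         item_emb[[item_vocab[i] for i in items]]],
        axis=1,
    )
    for w, b in zip(weights[:-1], biases[:-1]):
        x = np.maximum(x @ w + b, 0.0)  # ReLU activation
    return (x @ weights[-1] + biases[-1]).ravel()

def mean_squared_error(y_true, y_pred):
    return float(np.mean((np.asarray(y_true) - y_pred) ** 2))

def rank_items(user, candidates):
    """Order candidate items by predicted rating, highest first."""
    scores = predict([user] * len(candidates), candidates)
    return [candidates[j] for j in np.argsort(-scores)]

loss = mean_squared_error([5.0, 3.0], predict(["u1", "u2"], ["i1", "i3"]))
ranked = rank_items("u1", item_ids)
```

Training would adjust the embedding tables and layer weights to reduce `loss`; here the weights stay random, so the resulting order in `ranked` carries no meaning beyond showing the mechanics.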
For the training of the SVD based models using SciPy:
- A rating matrix with items and reviewers was constructed.
- The model parameters `U, sigma, Vt` were tested in `randomized_svd` and `TruncatedSVD`.
- A diagonal matrix was constructed in `SVD`.
- Then the ratings were predicted and `RMSE` was calculated.
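The steps above can be sketched as a truncated SVD reconstruction. The example below is an assumption-laden illustration: it uses `numpy.linalg.svd` as a stand-in for `randomized_svd` and `TruncatedSVD`, a made-up 4x3 rating matrix, and treats zeros as missing ratings when computing RMSE. The helper names `predict_ratings` and `rmse` are hypothetical.

```python
import numpy as np

# Hypothetical reviewer x item rating matrix; 0 marks a missing rating.
R = np.array([
    [5.0, 3.0, 0.0],
    [4.0, 0.0, 1.0],
    [1.0, 1.0, 5.0],
    [0.0, 1.0, 4.0],
])
observed = R > 0  # evaluate only where a rating exists

def predict_ratings(matrix, k):
    """Rank-k reconstruction U_k @ diag(sigma_k) @ Vt_k of the zero-filled matrix."""
    U, sigma, Vt = np.linalg.svd(matrix, full_matrices=False)
    sigma_k = np.diag(sigma[:k])  # diagonal matrix of the k largest singular values
    return U[:, :k] @ sigma_k @ Vt[:k, :]

def rmse(actual, predicted, mask):
    """Root mean squared error over the masked (observed) entries."""
    return float(np.sqrt(np.mean((actual[mask] - predicted[mask]) ** 2)))

rmse_by_rank = {k: rmse(R, predict_ratings(R, k), observed) for k in (1, 2, 3)}
```

Keeping all three singular values reproduces the matrix exactly, so the RMSE for `k=3` is zero up to floating point error. A smaller `k` trades reconstruction accuracy for a lower-dimensional model.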
For the construction of the popularity recommender model:
- A recommendation score was created by counting the reviewers for each unique item.
- The items were sorted by this score and a recommendation rank was assigned based on the ordering.
- Predictions were then calculated for various reviewers.
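The popularity steps above can be sketched with the standard library alone. The tuples below are made up, and the helper names `popularity_ranking` and `recommend` are hypothetical; ties are broken alphabetically by item, which is an assumption not stated in the text.

```python
# Hypothetical (item, user, rating, timestamp) tuples; the values are made up.
ratings = [
    ("item_a", "user_1", 5.0, 1),
    ("item_a", "user_2", 4.0, 2),
    ("item_a", "user_3", 3.0, 3),
    ("item_b", "user_1", 2.0, 4),
    ("item_b", "user_2", 5.0, 5),
    ("item_c", "user_3", 4.0, 6),
]

def popularity_ranking(rows):
    """Return [(item, score, rank)], where score is the number of distinct reviewers."""
    reviewers = {}
    for item, user, _rating, _timestamp in rows:
        reviewers.setdefault(item, set()).add(user)
    scores = {item: len(users) for item, users in reviewers.items()}
    # Sort by descending score; break ties by item name (assumption).
    ordered = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [(item, score, rank)
            for rank, (item, score) in enumerate(ordered, start=1)]

def recommend(rows, user, top_n=2):
    """Top-n items by popularity; `user` is unused because the list is not personalized."""
    return [item for item, _score, _rank in popularity_ranking(rows)[:top_n]]

ranking = popularity_ranking(ratings)
top_items = recommend(ratings, "user_1")
```

Because the score depends only on item counts, every reviewer receives the same list in this sketch, which is the usual behavior of a popularity baseline.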