An interactive Streamlit app for exploring Amazon product data, clustering it, and generating recommendations.
Includes preprocessing, KMeans clustering, visualizations (boxplots, t-SNE), and a data-driven suggestion system for sellers.
    amazon_insights_app/
    ├── dataset/
    │   ├── amazon.csv              # Original raw dataset
    │   ├── amazon_clean.csv        # Cleaned dataset
    │   └── amazon_clustered.csv    # Clustered dataset
    ├── models/
    │   └── kmeans_model.pkl        # Trained KMeans model
    ├── notebooks/
    │   ├── preprocessing.ipynb     # Data cleaning and transformation
    │   └── cluster.ipynb           # Clustering analysis and t-SNE
    ├── pages/
    │   ├── 1_Descrizione.py        # Data overview and stats
    │   ├── 2_Clustering.py         # Cluster visualizations
    │   └── 3_Raccomandazioni.py    # Interactive recommendation system
    ├── app.py                      # Main Streamlit app entry point
    └── requirements.txt
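The project layout separates the notebooks that train the model from the app pages that use it, with `models/kmeans_model.pkl` as the handoff. The following is a minimal sketch of how such a file could be written and reloaded. It assumes the model is serialized with the standard `pickle` module (the repository may use `joblib` or another approach), and the `save_model` and `load_model` helpers and the toy feature matrix are hypothetical.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.cluster import KMeans

# Toy feature matrix (price, rating, discount %); it stands in for the real features.
features = np.array([
    [199.0, 4.0, 60.0],
    [249.0, 4.1, 55.0],
    [1499.0, 4.4, 25.0],
    [1799.0, 4.5, 20.0],
])


def save_model(model, path):
    """Hypothetical helper: pickle a fitted model, creating parent folders if needed."""
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("wb") as fh:
        pickle.dump(model, fh)


def load_model(path):
    """Hypothetical helper: load a pickled model back from disk."""
    with path.open("rb") as fh:
        return pickle.load(fh)


# A temporary directory keeps the sketch from touching a real models/ folder.
with tempfile.TemporaryDirectory() as tmp:
    model_path = Path(tmp) / "models" / "kmeans_model.pkl"
    trained = KMeans(n_clusters=2, n_init=10, random_state=0).fit(features)
    save_model(trained, model_path)
    restored = load_model(model_path)

# The restored model should assign the same cluster labels as the original.
same_labels = bool((restored.predict(features) == trained.predict(features)).all())
print(same_labels)
```

Only unpickle files you trust, since `pickle` can execute arbitrary code on load.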
The dataset contains real Amazon product information:
- Product name, price (actual/discounted), discount percentage
- Ratings and number of reviews
- Product category and description
- Image and purchase links
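Fields such as prices and discount percentages often arrive as formatted strings and need converting before analysis. The sketch below shows one way the cleaning step might handle this. It assumes raw values look like `"₹1,099"` or `"64%"`, which is a guess about the raw file, not something stated here, and both helper functions are hypothetical.

```python
import re


def parse_number(raw):
    """Extract a float from a formatted string; return None if no valid number is found."""
    if raw is None:
        return None
    # Drop everything except digits and the decimal point (currency symbols, commas, %).
    cleaned = re.sub(r"[^0-9.]", "", str(raw))
    try:
        return float(cleaned)
    except ValueError:
        # Empty strings and malformed values such as "1.2.3" end up here.
        return None


def discount_percentage(actual_price, discounted_price):
    """Recompute the discount as a percentage of the actual price."""
    if actual_price is None or discounted_price is None or actual_price <= 0:
        return None
    return round(100.0 * (actual_price - discounted_price) / actual_price, 1)


actual = parse_number("₹1,000")
discounted = parse_number("₹640")
print(actual, discounted, discount_percentage(actual, discounted))
```

Recomputing the discount from the two prices gives a simple consistency check against the discount column supplied in the data.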
- Data Overview: Preview data, inspect basic stats, plot price distributions.
- Clustering: Group products by pricing, rating, and discount using KMeans.
- Visualizations: Boxplots, category counts, and t-SNE plots for dimensionality reduction.
- Recommendation Engine: Suggest the best category and an example product based on user input.
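The clustering and recommendation features above can be sketched together. This is an illustration under stated assumptions, not the app's actual logic: the column names, the toy data, the choice of two clusters, and the rule "pick the category with the highest mean rating in the nearest cluster" are all hypothetical. The scikit-learn calls (`StandardScaler`, `KMeans`) are standard.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Toy data standing in for the cleaned dataset; column names are assumptions.
products = pd.DataFrame({
    "product_name": ["Cable A", "Cable B", "Charger A", "Charger B",
                     "Headphone A", "Headphone B", "Speaker A", "Speaker B"],
    "category": ["Cables", "Cables", "Chargers", "Chargers",
                 "Headphones", "Headphones", "Speakers", "Speakers"],
    "discounted_price": [199.0, 249.0, 299.0, 349.0, 1499.0, 1799.0, 1999.0, 2199.0],
    "rating": [4.0, 4.1, 4.3, 4.2, 4.4, 4.5, 4.6, 4.7],
    "discount_percentage": [60.0, 55.0, 50.0, 58.0, 25.0, 20.0, 30.0, 22.0],
})
FEATURES = ["discounted_price", "rating", "discount_percentage"]

# Standardize so that price (hundreds to thousands) does not dominate rating (0 to 5).
scaler = StandardScaler()
scaled = scaler.fit_transform(products[FEATURES])

# Two clusters is an arbitrary choice for this toy data.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
products["cluster"] = kmeans.fit_predict(scaled)


def recommend(price, rating, discount):
    """Return (category, example product) from the cluster nearest to the input."""
    query = pd.DataFrame([[price, rating, discount]], columns=FEATURES)
    cluster = int(kmeans.predict(scaler.transform(query))[0])
    members = products[products["cluster"] == cluster]
    # Hypothetical rule: the best category is the one with the highest mean rating.
    best_category = members.groupby("category")["rating"].mean().idxmax()
    in_category = members[members["category"] == best_category]
    example = in_category.sort_values("rating", ascending=False).iloc[0]
    return best_category, example["product_name"]


print(recommend(250.0, 4.1, 55.0))
```

Scaling before KMeans matters because the algorithm uses Euclidean distance; without it, the price column would drive nearly all of the cluster assignments.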
- Clone the repo:

      git clone https://github.com/your_username/amazon_insights_app.git
      cd amazon_insights_app

- Install dependencies:

      pip install -r requirements.txt

- Run the app:

      streamlit run app.py
Built With
- Python
- Streamlit: for building the interactive UI
- Pandas & NumPy: for data wrangling
- Scikit-Learn: for clustering and preprocessing
- Matplotlib & Seaborn: for beautiful charts
Author
Developed with ❤️ by Dante Trabassi. Feel free to open an issue or contribute via pull request!

License
MIT