OpenAI Embeddings for Needs Clustering

This repository contains a Jupyter notebook that loads need statements from a CSV file, computes an embedding for each entry using OpenAI's API, and recommends related entries based on their k-nearest neighbors (kNN) in embedding space.

Installation

To use this notebook, you'll need to install the following Python libraries:

openai [Note: You'll need your API key. See Quickstart.]
python-dotenv
pandas
numpy
tenacity
pickle [Note: Part of the Python standard library. No installation needed.]
tiktoken
nomic [Note: Nomic requires an account. See Quickstart.]

You can install the third-party packages using pip:

pip install openai python-dotenv pandas numpy tenacity tiktoken nomic

Environment Variables

You'll need to set the following environment variables:

OPENAI_API_KEY: Your OpenAI API key.

You can do this by creating a .env file in the root directory of this project and adding the following line:

OPENAI_API_KEY=your_openai_api_key

Replace your_openai_api_key with your actual OpenAI API key.
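
As a rough sketch of how the key might then be read inside the notebook: the snippet below loads the .env file with python-dotenv and fails early with a clear message if the variable is missing. The helper name get_openai_api_key is hypothetical and is not taken from the notebook, and the guarded import is only there so the snippet still runs when the key is exported in the shell instead.

```python
import os

try:
    # python-dotenv copies KEY=value pairs from a .env file into os.environ.
    from dotenv import load_dotenv
    load_dotenv()  # does nothing if no .env file is found
except ImportError:
    pass  # fall back to variables already set in the shell


def get_openai_api_key(env=os.environ):
    """Hypothetical helper: return OPENAI_API_KEY or raise if it is missing."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError("OPENAI_API_KEY is not set; add it to .env or export it.")
    return key
```

Calling get_openai_api_key() once near the top of the notebook surfaces a missing key before any API request is attempted.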

Data Format

The notebook expects a CSV file named Problem_Intake_CurrentVers_TEST.csv in the source_data directory.

The CSV file should have the following columns:

date: The date of the data entry.
need: The need statement.
contact: The contact information.
dept: The department associated with the need statement.
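
A minimal sketch of loading a file in this format with pandas and checking that the four columns are present. The function name load_needs, the choice to drop rows with an empty need statement, and the sample rows are all assumptions made for illustration; the notebook may handle these details differently.

```python
import io

import pandas as pd

REQUIRED_COLUMNS = ["date", "need", "contact", "dept"]


def load_needs(source):
    """Hypothetical helper: read the needs CSV and check the expected columns."""
    df = pd.read_csv(source)
    missing = [c for c in REQUIRED_COLUMNS if c not in df.columns]
    if missing:
        raise ValueError(f"CSV is missing required columns: {missing}")
    # Assumption: rows without a need statement cannot be embedded, so drop them.
    return df.dropna(subset=["need"]).reset_index(drop=True)


# Made-up rows standing in for the real file; the second row has no need statement.
sample = io.StringIO(
    "date,need,contact,dept\n"
    "2023-01-05,Faster lab result turnaround,a@example.org,Pathology\n"
    "2023-01-09,,b@example.org,Nursing\n"
)
needs_df = load_needs(sample)
```

With the real data, the same function would be called as load_needs("source_data/Problem_Intake_CurrentVers_TEST.csv").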

Usage

To use the notebook, open Needs_embeddings.ipynb in your Jupyter environment and run the cells sequentially. The notebook will:

Fetch data from the CSV file.
Compute the embeddings for each need statement using the OpenAI API.
Provide recommendations based on the closest matches found through these embeddings.
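
The recommendation step can be illustrated with a small nearest-neighbor lookup over embedding vectors. This is a sketch under assumptions, not the notebook's code: it uses cosine similarity as the distance measure, the function name nearest_neighbors is hypothetical, and short toy vectors stand in for the embeddings that would come back from the OpenAI API.

```python
import numpy as np


def nearest_neighbors(embeddings, query_index, k=3):
    """Return indices of the k rows most similar to row `query_index` (cosine similarity)."""
    matrix = np.asarray(embeddings, dtype=float)
    # Scale each row to unit length so a dot product equals cosine similarity.
    # Assumes no row is the zero vector.
    unit = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    similarities = unit @ unit[query_index]
    similarities[query_index] = -np.inf  # never recommend the query itself
    k = min(k, len(matrix) - 1)
    order = np.argsort(-similarities, kind="stable")  # most similar first
    return order[:k].tolist()


# Toy 3-dimensional vectors; real embeddings have many more dimensions.
toy_embeddings = [
    [1.0, 0.0, 0.0],
    [0.9, 0.1, 0.0],
    [0.5, 0.5, 0.0],
    [0.0, 0.0, 1.0],
]
neighbors = nearest_neighbors(toy_embeddings, query_index=0, k=2)
```

Here neighbors holds the row indices of the two entries closest to the first one, which could then be used to look up the matching need statements in the loaded data.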

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.

Please make sure to update the tests as appropriate.

License

MIT
