CHATDB FOR LARGE-SCALE ENTERTAINMENT DATASETS
Final capstone project for the Foundations of Data Management course (DSCI 551) at USC Viterbi. A full-stack AI chatbot enabling natural language querying and modification of 20M+ records across PostgreSQL, MySQL, and MongoDB. Built with Python, Spark, LangChain, and GPT-3.5; implements secure APIs with authentication and access control for production-like environments.
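At its core, the app turns a user's question into a database query with LangChain and GPT-3.5. The snippet below is a minimal sketch of that pattern, not the project's exact code (which lives in code/app.py); the connection string and question are placeholders.

from langchain_openai import ChatOpenAI
from langchain_community.utilities import SQLDatabase
from langchain.chains import create_sql_query_chain

# Placeholder URI; substitute your own MySQL credentials
db = SQLDatabase.from_uri("mysql+pymysql://root:<password>@localhost/project551")
llm = ChatOpenAI(model="gpt-3.5-turbo", temperature=0)

chain = create_sql_query_chain(llm, db)  # the LLM drafts SQL from the schema and question
sql = chain.invoke({"question": "Which 10 titles have the highest rating?"})
print(db.run(sql))  # execute the generated query against the database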
To access our full dataset, please visit this Drive Link. The original dataset can be found on Kaggle.
pip install -r requirements.txt
brew install mysql
brew services start mysql
pip install streamlit
pip install langchain langchain_community langchain_openai
pip install pymysql
- Log in to MySQL
mysql -u root -p
- Enter your password when prompted
- Create the database
CREATE DATABASE project551;
- Use the created database
USE project551;
- Upload the dataset (an optional Python sanity check is sketched below)
source dataset.sql
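To confirm the import worked, you can run a quick check from Python with pymysql (installed above). This is only a sketch; the password is a placeholder for your own credentials.

import pymysql

# Placeholder credentials; use the ones you set for MySQL
conn = pymysql.connect(host="localhost", user="root", password="<password>", database="project551")
with conn.cursor() as cur:
    cur.execute("SHOW TABLES;")
    print(cur.fetchall())  # should list the tables created by dataset.sql
conn.close()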
- Log in to PostgreSQL
psql -U postgres
- Create the database
CREATE DATABASE project551;
- Connect to the created database
\c project551
- Upload the dataset (an optional Python sanity check is sketched below)
\i dataset_psql.sql
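As with MySQL, you can verify the load from Python. The sketch below uses LangChain's SQLDatabase wrapper (the same one the app builds on) and assumes a Postgres driver such as psycopg2-binary is installed; the credentials are placeholders.

from langchain_community.utilities import SQLDatabase

# Placeholder URI; requires a driver, e.g. pip install psycopg2-binary
db = SQLDatabase.from_uri("postgresql+psycopg2://postgres:<password>@localhost:5432/project551")
print(db.get_usable_table_names())  # should list the tables created by dataset_psql.sql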
- Place your JSON file
Save the JSON file you want to upload to MongoDB in the CHATDB-FOR-LARGE-SCALE-ENTERTAINMENT-DATASETS folder
- Edit the script with your MongoDB credentials
In the mongo_db.py file, find the init_database function and call it with your MongoDB username, password, and appName
- Update the file path and collection name
In the upload_data_to_mongo function, update the file path to match the location of your JSON file, as well as the collection name
- Run the upload
Call the upload_data_to_mongo function to upload the collections to the MongoDB database (a pymongo equivalent is sketched below)
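The helpers' exact signatures are defined in code/mongo_db.py; the sketch below shows the equivalent steps with pymongo directly. The connection string, database, collection, and file names are all placeholders.

import json
from pymongo import MongoClient

# Placeholder connection string; fill in your cluster details and appName
client = MongoClient("mongodb+srv://<username>:<password>@<cluster>.mongodb.net/?appName=<appName>")
collection = client["<database_name>"]["<collection_name>"]

# Load the JSON file placed in the project folder and insert its documents
with open("<your_file>.json") as f:
    docs = json.load(f)
collection.insert_many(docs if isinstance(docs, list) else [docs])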
- Launch the NLI for MySQL/PostgreSQL
streamlit run app.py
- Launch the NLI for MongoDB
streamlit run mongo_db.py
|--requirements.txt
|--README.md
|--code/
|   |--app.py            # Used to create the NLI and generate queries for MySQL/PostgreSQL
|   |--mongo_db.py       # Used to create the NLI and generate queries for MongoDB
|   |--dataset.sql       # Used to upload data to MySQL
|   |--dataset_psql.sql  # Used to upload data to PostgreSQL
|--reports/
|   |--Draft- Group Proposal.pdf
|   |--Mid Progress Report.pdf
|   |--551_ Group Proposal Final.pdf
|   |--CHATDB_Final_Report.pdf
For privacy and security reasons, we have not included the API keys used in this GitHub repository.
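You will therefore need to supply your own OpenAI key to run the chatbot. If the code relies on langchain_openai's default behavior, exporting it as an environment variable before launching Streamlit is enough (check app.py and mongo_db.py for the exact mechanism used):

export OPENAI_API_KEY="sk-..."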