How it Works
The project follows a modern vector search architecture:
- Data Source (`data.json`): A simple JSON file acts as the database, holding the Q&A pairs with stable IDs.
- Embedding Generation (OpenAI): When indexing, the text content (`Q: {question} A: {answer}`) is sent to the OpenAI API and converted into a numerical representation (a 1,536-dimension vector embedding).
- Vector Database (Vertex AI): The generated embeddings and their corresponding IDs are stored in a Vertex AI Matching Engine index, a database highly optimized for finding the "nearest" vectors to a given query vector.
- Indexing Script (`add_to_index.py`): A utility that performs a batch update. It reads `data.json`, generates embeddings for all entries, and uploads them to Vertex AI to build or update the index (see the indexing sketch below).
- API Server (`main.py`): A FastAPI server that exposes a `/query` endpoint. It takes a user's question, generates an embedding for it, queries Vertex AI for the IDs of the most similar Q&A pairs, and then retrieves the full Q&A content from `data.json` to return to the user (see the query sketch after the diagram).
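The indexing flow can be summarized in a short sketch. This is not the actual `add_to_index.py`, just a minimal illustration assuming the official OpenAI Python client and the `google-cloud-aiplatform` SDK; the embedding model, project, region, index resource name, and the `id`/`question`/`answer` field names are placeholders, and a purely batch-updated index would instead be rebuilt from an embeddings file staged in Cloud Storage rather than upserted directly.

```python
import json

from google.cloud import aiplatform
from google.cloud.aiplatform_v1.types import IndexDatapoint
from openai import OpenAI

openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment


def embed(text: str) -> list[float]:
    """Return a 1,536-dimension embedding for the given text."""
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",  # assumed model; any 1,536-dim model fits
        input=text,
    )
    return response.data[0].embedding


def main() -> None:
    # Placeholder project, region, and index resource name.
    aiplatform.init(project="my-gcp-project", location="us-central1")
    index = aiplatform.MatchingEngineIndex("INDEX_RESOURCE_NAME")

    with open("data.json", encoding="utf-8") as f:
        entries = json.load(f)

    # One datapoint per Q&A pair, keyed by its stable ID.
    datapoints = [
        IndexDatapoint(
            datapoint_id=str(entry["id"]),
            feature_vector=embed(f"Q: {entry['question']} A: {entry['answer']}"),
        )
        for entry in entries
    ]

    # Upsert the vectors into the index (requires an index with streaming updates enabled).
    index.upsert_datapoints(datapoints=datapoints)


if __name__ == "__main__":
    main()
```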
+-----------------+ +----------------------+ +-----------------+
| | | | | |
| data.json <------> main.py <------> User |
| (Q&A Content) | | (FastAPI Server) | | (via /docs) |
| | | | | |
+-------+---------+ +-----------+----------+ +-----------------+
^ |
| (Lookup by ID) | (Query with embedding)
| v
+-------+--------------------------+-------------------------+
| |
| Google Cloud Vertex AI |
| (Vector Search Index) |
| |
+------------------------------------------------------------+
^
| (Batch update with embeddings)
|
+-------+---------+
| |
| add_to_index.py |
| (Indexing Tool) |
| |
+-----------------+
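For the query path, here is a hedged sketch of what the `/query` endpoint in `main.py` does, again assuming the OpenAI client and `google-cloud-aiplatform`. The endpoint resource name, deployed index ID, and JSON field names are placeholders, and the real route may accept a POST body rather than a query parameter.

```python
import json

from fastapi import FastAPI
from google.cloud import aiplatform
from openai import OpenAI

app = FastAPI()
openai_client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Placeholder project, region, and index endpoint resource name.
aiplatform.init(project="my-gcp-project", location="us-central1")
endpoint = aiplatform.MatchingEngineIndexEndpoint("ENDPOINT_RESOURCE_NAME")

# Load the Q&A content once so results can be looked up by ID.
with open("data.json", encoding="utf-8") as f:
    QA_BY_ID = {str(entry["id"]): entry for entry in json.load(f)}


def embed(text: str) -> list[float]:
    response = openai_client.embeddings.create(
        model="text-embedding-3-small",  # must match the model used at indexing time
        input=text,
    )
    return response.data[0].embedding


@app.get("/query")
def query(question: str, top_k: int = 3) -> list[dict]:
    # 1. Embed the user's question.
    vector = embed(question)

    # 2. Ask the Vector Search index for the nearest neighbors.
    matches = endpoint.find_neighbors(
        deployed_index_id="DEPLOYED_INDEX_ID",  # placeholder
        queries=[vector],
        num_neighbors=top_k,
    )

    # 3. Map the returned IDs back to the full Q&A pairs from data.json.
    return [QA_BY_ID[neighbor.id] for neighbor in matches[0]]
```

Started with `uvicorn main:app --reload`, the endpoint can be exercised interactively from FastAPI's built-in `/docs` page, which is the user path shown in the diagram above.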