How it Works

The project follows a typical vector search architecture with five components:

  1. Data Source (data.json): A simple JSON file acts as the database, holding the Q&A pairs with stable IDs.
  2. Embedding Generation (OpenAI): During indexing, each entry is formatted as Q: {question} A: {answer} and sent to the OpenAI API, which returns a numerical representation of the text (a 1,536-dimension embedding vector).
  3. Vector Database (Vertex AI): The generated embeddings and their corresponding IDs are stored in a Vertex AI Vector Search (formerly Matching Engine) index. The service is optimized for nearest-neighbor search: given a query vector, it returns the IDs of the closest stored vectors.
  4. Indexing Script (add_to_index.py): This script is a utility for performing a batch update. It reads data.json, generates embeddings for all entries, and uploads them to Vertex AI to build or update the index.
  5. API Server (main.py): A FastAPI server that exposes a /query endpoint. It takes a user's question, generates an embedding for it, queries Vertex AI to get the IDs of the most similar Q&A pairs, and then retrieves the full Q&A content from data.json to return to the user.

+-----------------+      +----------------------+      +-----------------+
|                 |      |                      |      |                 |
|    data.json    <------>       main.py        <------>      User       |
|  (Q&A Content)  |      |   (FastAPI Server)   |      |   (via /docs)   |
|                 |      |                      |      |                 |
+-------+---------+      +-----------+----------+      +-----------------+
        ^                            |
        | (Lookup by ID)             | (Query with embedding)
        |                            v
+-------+----------------------------+-----------------------+
|                                                            |
|                   Google Cloud Vertex AI                   |
|                   (Vector Search Index)                    |
|                                                            |
+------------------------------------------------------------+
        ^
        | (Batch update with embeddings)
        |
+-------+---------+
|                 |
| add_to_index.py |
| (Indexing Tool) |
|                 |
+-----------------+
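
The flow above can be tried locally without any cloud credentials. The following is a minimal sketch, not the project's actual code: `DATA` is a made-up stand-in for data.json, `toy_embed` is a hypothetical word-count substitute for the OpenAI embedding call, and a plain dictionary with brute-force cosine similarity stands in for the Vertex AI index. Only the overall shape (format as Q: ... A: ..., embed, store by ID, search, look up content by ID) follows the description above.

```python
import math
from collections import Counter

# Hypothetical stand-in for data.json: Q&A pairs with stable IDs.
DATA = [
    {"id": "q1", "question": "How do I reset my password?",
     "answer": "Use the reset link on the login page."},
    {"id": "q2", "question": "What payment methods are accepted?",
     "answer": "Credit cards and bank transfers."},
]


def embedding_text(entry):
    """Format an entry as described in the embedding step: Q: ... A: ..."""
    return f"Q: {entry['question']} A: {entry['answer']}"


def toy_embed(text):
    """Assumed stand-in for the OpenAI call: a sparse word-count vector."""
    words = (w.strip("?.,!:") for w in text.lower().split())
    return Counter(w for w in words if w)


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(count * b[word] for word, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def build_index(entries):
    """Stand-in for the batch indexing step: map each ID to its embedding."""
    return {e["id"]: toy_embed(embedding_text(e)) for e in entries}


def query(index, entries, question, top_k=1):
    """Stand-in for the /query handler: embed, rank IDs, look up content."""
    q_vec = toy_embed(question)
    ranked = sorted(
        index.items(),
        key=lambda item: cosine(q_vec, item[1]),
        reverse=True,
    )
    by_id = {e["id"]: e for e in entries}
    # The index only returns IDs; the full Q&A content comes from the data.
    return [by_id[entry_id] for entry_id, _ in ranked[:top_k]]


index = build_index(DATA)
results = query(index, DATA, "How can I reset my password?")
print(results[0]["id"])  # prints q1
```

In this sketch the index holds only IDs and vectors, mirroring the split in the diagram: the search step returns IDs, and the answer text is always read back from the Q&A data. Swapping `toy_embed` for a real embedding call and `build_index`/`query` for real index operations would not change that structure.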