This project provides a RESTful API for embedding and querying codebases. It enables users to download a GitHub repository, process the codebase, store it in a vector database, and query it for relevant information.
- Embed Codebase:
  - Downloads a GitHub repository using a URL.
  - Processes and chunks the codebase.
  - Stores the processed data in a vector database.
- Query Codebase:
  - Accepts a natural language query.
  - Searches the vector database for the most relevant code chunks.
  - Returns the results for further processing.
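
The chunking and relevance-search steps listed above can be illustrated with a small sketch. This README does not say how the project chunks files or how the vector database ranks results, so everything here is an assumption: the names `chunkByLines`, `cosineSimilarity`, and `topK` are hypothetical, line-based windows with overlap are just one common chunking choice, and cosine similarity is one common ranking metric.

```typescript
// Hypothetical sketch: these names and parameters are illustrative only and
// are not taken from the project's source.

interface Chunk {
  startLine: number; // 1-based, inclusive
  endLine: number;   // 1-based, inclusive
  text: string;
}

// Split a file into windows of `maxLines` lines that overlap by `overlap` lines.
function chunkByLines(source: string, maxLines = 40, overlap = 5): Chunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new RangeError("require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split("\n");
  const chunks: Chunk[] = [];
  const step = maxLines - overlap;
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // this window reached the end of the file
  }
  return chunks;
}

// Cosine similarity between two equal-length vectors (0 if either is all zeros).
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

// Return the indices of the k vectors most similar to the query, best first.
function topK(query: number[], vectors: number[][], k: number): number[] {
  return vectors
    .map((v, index) => ({ index, score: cosineSimilarity(query, v) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((entry) => entry.index);
}
```

In a real deployment the vector database performs the similarity search itself; the `topK` function only shows the idea on in-memory vectors.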
- POST /api/embed-codebase
  - Description: Downloads and processes a GitHub repository, embedding its code into a vector database.
  - Request Body: { "repoUrl": "https://github.com/user/repository.git" }
  - Response: { "message": "Repository downloaded and code collected." }
- POST /api/query
  - Description: Queries the vector database with a natural language query.
  - Request Body: { "query": "What does the codebase do?", "folderName": "user_repository_timestamp" }
  - Response: { "response": [ /* relevant code chunks */ ] }
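
The request bodies above can be written down as TypeScript types. The example `folderName` value, `user_repository_timestamp`, suggests the folder name combines the repository owner, the repository name, and a timestamp, but this README does not confirm the exact format. The sketch below is therefore an assumption: `folderNameFor` is a hypothetical helper, and the real server may build the name differently.

```typescript
// Request body shapes, taken from the endpoint examples above.
interface EmbedRequest {
  repoUrl: string;
}

interface QueryRequest {
  query: string;
  folderName: string;
}

// Assumption: folder names follow "<owner>_<repo>_<timestamp>". This is
// inferred only from the example value "user_repository_timestamp".
function folderNameFor(repoUrl: string, timestamp: number = Date.now()): string {
  const { pathname } = new URL(repoUrl);
  const parts = pathname.split("/").filter((p) => p.length > 0);
  if (parts.length < 2) {
    throw new Error(
      `expected a URL like https://github.com/<owner>/<repo>.git, got ${repoUrl}`
    );
  }
  const owner = parts[0];
  const repo = parts[1].replace(/\.git$/, ""); // drop a trailing ".git"
  return `${owner}_${repo}_${timestamp}`;
}

// Example: build both request bodies for the same repository.
const embedBody: EmbedRequest = {
  repoUrl: "https://github.com/user/repository.git",
};
const queryBody: QueryRequest = {
  query: "What does the codebase do?",
  folderName: folderNameFor(embedBody.repoUrl, 1700000000000),
};
```

If the server generates the timestamp itself, a client cannot reconstruct the folder name this way and would need to obtain it from the server instead.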
- Clone the Repository:
  git clone https://github.com/your-repo/codebase-embedding-api.git
  cd codebase-embedding-api
- Install Dependencies:
  npm install
- Build the Project:
  npm run build
- Start Server (TypeScript):
  node dist/server.js
- Start the server:
  npm run start
- Use an API client (like Postman or cURL) to interact with the endpoints.
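
As an alternative to Postman or cURL, the endpoints can be called from a short script using the `fetch` function built into Node.js v18+. This is a minimal sketch: the base URL `http://localhost:3000` is an assumption (this README does not state the port), and `jsonPost`, `postJson`, and `main` are hypothetical helper names.

```typescript
// Assumption: the server listens on http://localhost:3000; adjust as needed.
const BASE_URL = "http://localhost:3000";

// Build the fetch options for a JSON POST request.
function jsonPost(body: unknown): {
  method: string;
  headers: Record<string, string>;
  body: string;
} {
  return {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(body),
  };
}

// POST a JSON body to an endpoint and return the parsed JSON response.
async function postJson<T>(path: string, body: unknown): Promise<T> {
  const res = await fetch(`${BASE_URL}${path}`, jsonPost(body));
  if (!res.ok) {
    throw new Error(`${path} failed with HTTP ${res.status}`);
  }
  return (await res.json()) as T;
}

// Embed a repository, then query it.
async function main(): Promise<void> {
  const embed = await postJson<{ message: string }>("/api/embed-codebase", {
    repoUrl: "https://github.com/user/repository.git",
  });
  console.log(embed.message);

  const result = await postJson<{ response: unknown[] }>("/api/query", {
    query: "What does the codebase do?",
    folderName: "user_repository_timestamp", // placeholder value from the endpoint example
  });
  console.log(result.response);
}
```

Call `main()` once the server is running, and replace the placeholder `folderName` with the folder actually created for the embedded repository.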
- Node.js (v18+)
- NPM
- TypeScript
src/
├── server.ts # Main server file
├── utils.ts # Utility functions
├── code_service.ts # Code processing logic
├── chroma_db.ts # Vector database functions
cloned_codebases/ # Directory for downloaded repositories