A lightweight RAG (retrieval-augmented generation) service for embedding and querying codebases. It lets users download GitHub repositories, process their code, store it in a vector database, and query it in natural language to retrieve relevant code snippets. It is intended as a building block for code search and analysis tools.

hashir-ayaz/codebase-rag

Codebase Embedding and Querying API

This project provides a RESTful API for embedding and querying codebases. It enables users to download a GitHub repository, process the codebase, store it in a vector database, and query it for relevant information.


Features

  • Embed Codebase:

    • Downloads a GitHub repository using a URL.
    • Processes and chunks the codebase.
    • Stores the processed data in a vector database.
  • Query Codebase:

    • Accepts a natural language query.
    • Searches the vector database for the most relevant code chunks.
    • Returns the results for further processing.
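The README does not say how the codebase is chunked before embedding. As a minimal sketch of one common approach, assuming fixed-size line windows with a small overlap, the chunking step could look like the following. The names `CodeChunk` and `chunkByLines` and the default sizes are hypothetical and are not taken from this project.

```typescript
// Hypothetical shape of one chunk; not taken from the project source.
interface CodeChunk {
  filePath: string;
  startLine: number; // 1-based, inclusive
  endLine: number; // 1-based, inclusive
  text: string;
}

// Split a source file into overlapping windows of at most `maxLines` lines.
// Consecutive windows share `overlap` lines so that code spanning a window
// boundary still appears intact in at least one chunk.
function chunkByLines(
  filePath: string,
  source: string,
  maxLines = 40, // assumed default, tune per embedding model
  overlap = 5 // assumed default
): CodeChunk[] {
  if (maxLines <= 0 || overlap < 0 || overlap >= maxLines) {
    throw new RangeError("require maxLines > 0 and 0 <= overlap < maxLines");
  }
  const lines = source.split(/\r?\n/);
  const chunks: CodeChunk[] = [];
  const step = maxLines - overlap;
  for (let start = 0; start < lines.length; start += step) {
    const end = Math.min(start + maxLines, lines.length);
    chunks.push({
      filePath,
      startLine: start + 1,
      endLine: end,
      text: lines.slice(start, end).join("\n"),
    });
    if (end === lines.length) break; // last window reached the end of the file
  }
  return chunks;
}
```

Each chunk keeps its file path and line range so that a query result can point back to the exact location in the repository.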

API Endpoints

1. Embed Codebase

  • POST /api/embed-codebase
  • Description: Downloads and processes a GitHub repository, embedding its code into a vector database.
  • Request Body:
    {
      "repoUrl": "https://github.com/user/repository.git"
    }
  • Response:
    {
      "message": "Repository downloaded and code collected."
    }
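A minimal client sketch for this endpoint is shown below. The base URL is an assumption, since the README does not state which host or port the server listens on, and the names `FetchLike`, `buildEmbedRequest`, and `embedCodebase` are hypothetical helpers. The fetch function is passed in as a parameter so the request-building logic can be checked without a running server.

```typescript
// Minimal structural type for a fetch-compatible function (hypothetical name).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

interface JsonPostRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Build the POST /api/embed-codebase request described above.
function buildEmbedRequest(baseUrl: string, repoUrl: string): JsonPostRequest {
  return {
    url: `${baseUrl}/api/embed-codebase`,
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ repoUrl }),
  };
}

// Send the request and return the parsed body, e.g. { message: "..." }.
async function embedCodebase(
  fetchFn: FetchLike,
  baseUrl: string, // assumption: e.g. "http://localhost:3000"
  repoUrl: string
): Promise<{ message: string }> {
  const req = buildEmbedRequest(baseUrl, repoUrl);
  const res = await fetchFn(req.url, {
    method: req.method,
    headers: req.headers,
    body: req.body,
  });
  if (!res.ok) {
    throw new Error(`Embed request failed with status ${res.status}`);
  }
  return (await res.json()) as { message: string };
}
```

On Node.js 18 or later, the built-in global `fetch` can be passed as `fetchFn`, for example `embedCodebase(fetch, "http://localhost:3000", "https://github.com/user/repository.git")`.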

2. Query Codebase

  • POST /api/query
  • Description: Queries the vector database with a natural language query.
  • Request Body:
    {
      "query": "What does the codebase do?",
      "folderName": "user_repository_timestamp"
    }
  • Response:
    {
      "response": [
        /* relevant code chunks */
      ]
    }
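The query call can be sketched the same way. This assumes, based only on the placeholder value above, that `folderName` identifies a previously embedded repository. The README does not document the shape of the returned chunks, so the sketch leaves them typed as `unknown`. The names `QueryFetch`, `buildQueryRequest`, and `queryCodebase` are hypothetical.

```typescript
// Minimal structural type for a fetch-compatible function (hypothetical name).
type QueryFetch = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string }
) => Promise<{ ok: boolean; status: number; json(): Promise<unknown> }>;

// Build the URL and JSON body for POST /api/query.
function buildQueryRequest(
  baseUrl: string,
  query: string,
  folderName: string
): { url: string; body: string } {
  return {
    url: `${baseUrl}/api/query`,
    body: JSON.stringify({ query, folderName }),
  };
}

// Send a natural language query and return the "response" array as-is.
async function queryCodebase(
  fetchFn: QueryFetch,
  baseUrl: string, // assumption: host and port are not stated in the README
  query: string,
  folderName: string
): Promise<unknown[]> {
  const req = buildQueryRequest(baseUrl, query, folderName);
  const res = await fetchFn(req.url, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: req.body,
  });
  if (!res.ok) {
    throw new Error(`Query request failed with status ${res.status}`);
  }
  const data = (await res.json()) as { response?: unknown };
  // Guard against an unexpected payload instead of assuming the array exists.
  return Array.isArray(data.response) ? data.response : [];
}
```

Returning an empty array for a malformed payload keeps the caller's handling simple. A stricter client could throw instead.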

Installation

  1. Clone the Repository:

    git clone https://github.com/hashir-ayaz/codebase-rag.git
    cd codebase-rag
  2. Install Dependencies:

    npm install
  3. Build the Project:

    npm run build
  4. Start the Server (runs the compiled output):

    node dist/server.js

Usage

  1. Start the server:
    npm run start
  2. Use an API client (like Postman or cURL) to interact with the endpoints.

Requirements

  • Node.js (v18+)
  • npm
  • TypeScript

Folder Structure

src/
├── server.ts             # Main server file
├── utils.ts              # Utility functions
├── code_service.ts       # Code processing logic
└── chroma_db.ts          # Vector database functions
cloned_codebases/         # Directory for downloaded repositories
