This document explains how to use Ollama with semindex to leverage GPU-accelerated language models for enhanced code understanding and generation.
Install and run Ollama on your system:
- Visit https://ollama.ai to download and install Ollama
- Make sure Ollama is running (typically accessible at http://localhost:11434)
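One quick way to confirm the server is up is to hit Ollama's `GET /api/tags` endpoint, which lists the models you have pulled. The helper below is a minimal sketch (the function name `ollama_is_running` is ours, not part of semindex or Ollama):

```python
import json
import urllib.error
import urllib.request

def ollama_is_running(base_url="http://localhost:11434", timeout=2):
    """Return True if an Ollama server responds at base_url.

    Uses Ollama's GET /api/tags endpoint, which lists locally pulled models.
    """
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            models = json.load(resp).get("models", [])
            print(f"Ollama is up; {len(models)} model(s) pulled")
            return True
    except (urllib.error.URLError, OSError):
        print("Ollama does not appear to be running")
        return False

if __name__ == "__main__":
    ollama_is_running()
```

If this prints that Ollama is not running, start the Ollama app (or `ollama serve`) before continuing.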
Pull a model you'd like to use:

```
ollama pull llama3

# or for code-specific models:
ollama pull codellama:7b
ollama pull deepseek-coder:6.7b
```
To perform a query and get an AI-generated response using Ollama:

```
semindex query "Explain how authentication works in this codebase" --ollama
```

To use a specific Ollama model:

```
semindex query "How do I add a new user?" --ollama --ollama-model codellama:7b
```

You can combine Ollama with other features:
```
semindex query "Generate a test for the user authentication function" \
  --ollama \
  --ollama-model codellama:7b \
  --top-k 5 \
  --hybrid \
  --include-docs
```

You can set these environment variables to customize Ollama behavior:
- `SEMINDEX_OLLAMA_MODEL`: Default model to use (default: `llama3`)
- `SEMINDEX_OLLAMA_BASE_URL`: Base URL for the Ollama API (default: `http://localhost:11434`)
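The lookup order is the usual one for environment-driven configuration: use the variable if it is set, otherwise fall back to the documented default. A small sketch of that behavior (the `ollama_settings` helper is illustrative, not semindex's actual code):

```python
import os

def ollama_settings():
    """Resolve Ollama settings from the documented environment variables,
    falling back to the documented defaults when a variable is unset."""
    return {
        "model": os.environ.get("SEMINDEX_OLLAMA_MODEL", "llama3"),
        "base_url": os.environ.get("SEMINDEX_OLLAMA_BASE_URL", "http://localhost:11434"),
    }
```

For example, `export SEMINDEX_OLLAMA_MODEL=codellama:7b` would make `codellama:7b` the default model for subsequent queries.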
Example queries:

```
semindex query "Explain the user management system" --ollama
semindex query "Show me how password hashing is implemented" --ollama
semindex query "Generate documentation for the API endpoints" --ollama
semindex query "How can I optimize the database queries in this module?" --ollama
```

Without Ollama (just retrieves relevant code snippets):

```
semindex query "How does user authentication work?"
```

With Ollama (retrieves code + generates an AI explanation):

```
semindex query "How does user authentication work?" --ollama
```

The Ollama-enhanced query not only shows relevant code snippets but also provides an AI-generated explanation of how the authentication system works, making it more similar to tools like Claude Code or GitHub Copilot.
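Under the hood this is the standard retrieval-augmented pattern: retrieved snippets are packed into a prompt, which is sent to Ollama's `/api/generate` endpoint. The sketch below illustrates that flow; the prompt wording and helper names (`build_prompt`, `ask_ollama`) are ours, and semindex's internal prompt may differ:

```python
import json
import urllib.request

def build_prompt(question, snippets):
    """Combine retrieved code snippets and the user's question into one prompt."""
    context = "\n\n".join(snippets)
    return (
        "You are answering a question about a codebase.\n"
        f"Relevant code:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )

def ask_ollama(prompt, model="llama3", base_url="http://localhost:11434"):
    """POST the prompt to Ollama's /api/generate endpoint and return the text."""
    req = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps({"model": model, "prompt": prompt, "stream": False}).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["response"]
```

With Ollama running locally, `ask_ollama(build_prompt("How does login work?", snippets))` would return a generated explanation grounded in the retrieved code.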
- Use code-specific models like `codellama:7b` or `deepseek-coder:6.7b` for better code understanding
- Adjust `--top-k` to control how many code snippets are provided as context (default is 10)
- Use `--max-tokens` to control the length of the Ollama response
- Consider using `--hybrid` and `--include-docs` for more comprehensive context