A Kotlin library for interacting with various Large Language Model (LLM) providers through a unified interface. Supports OpenAI and Google Gemini with a simple, consumer-friendly API.
- Unified Interface: Single API for multiple LLM providers (OpenAI, Gemini)
- Security First: No hardcoded API keys - consumers must provide their own keys explicitly
- Explicit HTTP Client: Consumers must provide and pass their own HttpClient instance (no library-provided HTTP client)
- Type-Safe: Kotlin serialization for structured request/response handling
- Production Ready: Built-in timeouts, connection pooling, and error handling
- Factory Pattern: Instantiate once with API keys and HttpClient; create provider clients without repeating secrets
- Extensible: Clean architecture for adding new providers
- Streaming: Stream partial responses via Kotlin Flow; design details in docs/STREAMING_API_DESIGN.md
repositories {
mavenCentral()
}
dependencies {
implementation("io.github.researchforyounow:llm-clients:0.7.5")
}

<dependency>
<groupId>io.github.researchforyounow</groupId>
<artifactId>llm-clients</artifactId>
<version>0.7.5</version>
</dependency>

SECURITY NOTICE: This library requires you to provide your own API keys explicitly. No hardcoded API keys are included for security reasons.
- OpenAI API Key: Get from https://platform.openai.com/api-keys
- Gemini API Key: Get from https://makersuite.google.com/app/apikey
// Call once at application startup with your API keys
fun initializeLibrary(): LlmClientFactory {
val httpClient = ExampleHttpClient.createRecommendedHttpClient()
return LlmClientFactory(
httpClient = httpClient,
openAiApiKey = System.getenv("OPENAI_API_KEY") ?: "your-openai-api-key-here",
geminiApiKey = System.getenv("GEMINI_API_KEY") ?: "your-gemini-api-key-here"
)
}

Note: This library does not use keys.properties. Configure API keys via environment variables or your secrets manager. The factory holds your keys and shared HttpClient so callers only supply model options.
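
The Quick Start snippet falls back to a placeholder string when a variable is unset, which defers the failure to the first API call. A minimal sketch of a fail-fast alternative follows; `requireSecret` is a hypothetical helper written for illustration and is not part of this library.

```kotlin
// Hypothetical helper (not a library function): read a required secret or fail fast.
// The env parameter defaults to the process environment and can be replaced in tests.
fun requireSecret(name: String, env: Map<String, String> = System.getenv()): String =
    env[name]
        ?.trim()
        // Treat blank values the same as missing ones.
        ?.takeIf { it.isNotEmpty() }
        ?: error("Missing required environment variable: $name")
```

Under this assumption, `requireSecret("OPENAI_API_KEY")` could replace the `System.getenv(...) ?: "..."` expression, so a misconfigured deployment stops at startup with an IllegalStateException instead of sending a placeholder key.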
val factory = initializeLibrary()
val openAiClient = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_2024_08_06,
temperature = 0.3
)
)
val geminiClient = factory.createGeminiClient(
config = GeminiConfig(
model = GeminiModel.GEMINI_1_5_FLASH_LATEST
)
)
// Make a structured request
val structuredResult = openAiClient.generate(
request = GenerationRequest.of(
"What is the capital of Japan?",
"You are a helpful assistant."
),
responseType = MyDataClass::class.java
)
// Or get plain text easily
val textResult = openAiClient.generateText(
GenerationRequest.of("Tell me a short joke about Kotlin")
)
structuredResult.fold(
onSuccess = { response -> println("Success: $response") },
onFailure = { error -> println("Error: ${error.message}") }
)

For a comprehensive, production-ready example showing:
- One-time library initialization
- Multiple client configurations
- Error handling and best practices
- Real-world usage patterns

OpenAI:
- GPT-4o (default: gpt-4o-2024-08-06)
- GPT-4.1 family
- Search preview models (gpt-4o-search-preview, gpt-4o-mini-search-preview)
- Custom models via configuration

Gemini:
- Gemini 1.5 Flash (default: gemini-1.5-flash-latest)
- Gemini 1.5 Pro
- Custom models via configuration

val client = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_2024_08_06, // Optional
temperature = 0.7, // Optional (0.0-2.0)
maxTokens = 2000 // Optional
),
)

val client = factory.createGeminiClient(
config = GeminiConfig(
model = GeminiModel.GEMINI_1_5_FLASH_LATEST // Optional
),
)

Both OpenAiConfig and GeminiConfig accept an optional retryPolicy parameter.
Retries use exponential backoff with jitter and are only applied when an idempotencyKey
is supplied in the GenerationRequest.
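
To make the backoff behavior concrete, here is a minimal sketch of how such delays might be computed. `backoffDelayMillis` and `shouldRetry` are hypothetical helpers, and the doubling formula is an assumption; the library's RetryPolicy may compute delays differently.

```kotlin
import kotlin.random.Random

// Hypothetical helper: delay before a given retry attempt (1-based).
// Assumes the base delay doubles per attempt; the library's formula may differ.
fun backoffDelayMillis(
    attempt: Int,
    initialDelayMillis: Long,
    jitterMillis: Long,
    random: Random = Random.Default,
): Long {
    require(attempt >= 1) { "attempt must be >= 1" }
    require(initialDelayMillis >= 0 && jitterMillis >= 0) { "delays must be non-negative" }
    // Double the base delay on each attempt; cap the exponent to keep delays bounded.
    val exponential = initialDelayMillis shl minOf(attempt - 1, 20)
    // Add a random offset in [0, jitterMillis] so concurrent clients do not retry in lockstep.
    val jitter = random.nextLong(jitterMillis + 1)
    return exponential + jitter
}

// Hypothetical gate mirroring the rule above: retry only when an idempotency key is present.
fun shouldRetry(idempotencyKey: String?, attempt: Int, maxAttempts: Int): Boolean =
    idempotencyKey != null && attempt < maxAttempts
```

Under these assumptions, initialDelayMillis = 200 and jitterMillis = 100 give base delays of 200, 400, and 800 ms for attempts 1 to 3, each with up to 100 ms of added jitter. The library's own configuration looks like this: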
val policy = RetryPolicy(maxAttempts = 3, initialDelayMillis = 200, jitterMillis = 100)
val client = factory.createOpenAiClient(
config = OpenAiConfig(retryPolicy = policy)
)

@Serializable
data class StructuredResponse(
val answer: String,
val confidence: Double,
val reasoning: String
)
val result = client.generate(
request = GenerationRequest.of(
"Explain photosynthesis",
"Respond in JSON format with answer, confidence, and reasoning fields."
),
responseType = StructuredResponse::class.java
)

// Different clients for different use cases
val creativeClient = factory.createOpenAiClient(
config = OpenAiConfig(temperature = 0.9)
)
val factualClient = factory.createOpenAiClient(
config = OpenAiConfig(temperature = 0.1)
)

Both OpenAI and Gemini can report token usage information. Supply a usageSink
in the configuration to observe normalized metrics:
val client = factory.createOpenAiClient(
config = OpenAiConfig(
usageSink = { usage ->
println("prompt=${usage.promptTokens} completion=${usage.completionTokens}")
}
)
)

The LlmUsage model normalizes provider-specific fields (e.g., OpenAI
prompt_tokens/completion_tokens). Providers that don't return usage simply
never invoke the sink.
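
A sink that only prints is rarely enough in practice. The following is a minimal sketch of a thread-safe accumulator that a usageSink lambda could forward to. `UsageSample` is a stand-in for the library's LlmUsage (only the two fields used in the example above are assumed, and their Int type is a guess), and `UsageTotals` is a hypothetical class.

```kotlin
import java.util.concurrent.atomic.AtomicLong

// Stand-in for the library's LlmUsage; field types are an assumption.
data class UsageSample(val promptTokens: Int, val completionTokens: Int)

// Hypothetical accumulator; safe to call from multiple threads.
class UsageTotals {
    private val prompt = AtomicLong()
    private val completion = AtomicLong()

    // Add one response's token counts to the running totals.
    fun record(sample: UsageSample) {
        prompt.addAndGet(sample.promptTokens.toLong())
        completion.addAndGet(sample.completionTokens.toLong())
    }

    val promptTokens: Long get() = prompt.get()
    val completionTokens: Long get() = completion.get()
    val totalTokens: Long get() = promptTokens + completionTokens
}
```

A sink could then map each reported usage into a `UsageSample` and call `record`, leaving reporting or budgeting to whatever reads the totals.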
Search-preview models use the Chat Completions API with web_search_options. Set
enableWebSearch = true to enable web search, and omit sampling parameters, which are ignored for these models.
@Serializable
data class WebSearchResult(
val answer: String,
val sources: List<String>,
)
val client = factory.createOpenAiClient(
config = OpenAiConfig(
modelName = Models.GPT_4O_MINI_SEARCH_PREVIEW,
responseFormat = ResponseFormat.JSON_SCHEMA,
jsonSchemaName = "web_search_result",
jsonSchema = """
{
"type": "object",
"properties": {
"answer": { "type": "string" },
"sources": { "type": "array", "items": { "type": "string" } }
},
"required": ["answer", "sources"]
}
""".trimIndent(),
enableWebSearch = true,
),
)

This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
This library exposes a small, stable public API surface centered on the LlmClient interfaces and the request/response models shown in the examples. Implementation details live in core-api.
| Provider | Generate (sync) | Streaming (Flow) | Images | Structured JSON parsing | Retry policy | Usage metrics (LlmUsageSink) | Error mapping (LlmError) | Notes |
|---|---|---|---|---|---|---|---|---|
| OpenAI | Yes | Yes | Yes | Yes (native response_format supported) | Yes (exponential backoff + jitter) | Yes | Yes | Default model: gpt-4o-2024-08-06 |
| Google Gemini | Yes | Yes | No | Yes (prompt-guided parsing) | Yes (exponential backoff + jitter) | Yes | Yes | Default model: gemini-1.5-flash-latest |
- Generate images using OpenAI Images API via OpenAiClient.generateImage.
- Returns either URLs or base64-encoded JSON depending on ImageResponseFormat.
- Models and sizes are constrained by OpenAI (e.g., DALL·E 3 supports 1024x1024, 1024x1792, 1792x1024 with n=1).
- See full example: examples/src/main/kotlin/examples/OpenAiImageExample.kt
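
Because the size and count constraints above are enforced by the provider, it can help to check them before sending a request. The following is a minimal sketch under that assumption; `validateDallE3Request` is a hypothetical helper, not a library function, and it encodes only the DALL·E 3 limits listed above.

```kotlin
// Sizes listed above for DALL·E 3.
val DALL_E_3_SIZES = setOf("1024x1024", "1024x1792", "1792x1024")

// Hypothetical pre-flight check: returns a list of problems, empty when the request looks valid.
fun validateDallE3Request(size: String, n: Int): List<String> = buildList {
    if (size !in DALL_E_3_SIZES) add("Unsupported size '$size'; expected one of $DALL_E_3_SIZES")
    if (n != 1) add("DALL·E 3 accepts n=1, got n=$n")
}
```

Returning a list rather than throwing lets the caller report every problem at once.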
Example:
val client = factory.createOpenAiClient(OpenAiConfig.defaultConfig())
val imgReq = ImageGenerationRequest(
prompt = "A watercolor painting of a mountain at sunrise",
n = 1,
size = "1024x1024",
responseFormat = ImageResponseFormat.URL,
model = OpenAiImageModel.DALL_E_3
)
val result = client.generateImage(imgReq)
result.onSuccess { images ->
images.forEach { println(it.url ?: "[base64 image]") }
}

The LlmClientFactory injects API keys from your environment or secrets manager (see Quick Start). You configure per-provider behavior via config objects when creating clients.

OpenAiConfig:

| Parameter | Type | Default | Notes |
|---|---|---|---|
| modelName | String | gpt-4o-2024-08-06 | Any OpenAI model id. Helpers in Models. |
| temperature | Double | 0.28 | Range 0.0..2.0. |
| maxTokens | Int | 4000 | Must be > 0. |
| topP | Double | 1.0 | Range 0.0..1.0. |
| frequencyPenalty | Double | 0.0 | Range -2.0..2.0. |
| presencePenalty | Double | 0.0 | Range -2.0..2.0. |
| stopSequences | List<String> | [] | Up to 4 sequences. |
| seed | Int? | null | Optional deterministic seed. |
| responseFormat | ResponseFormat | JSON_OBJECT | TEXT, JSON_OBJECT, or JSON_SCHEMA. |
| jsonSchema | String? | null | JSON schema object; wrapper with name is auto-added. |
| jsonSchemaName | String | response | Required by OpenAI for JSON_SCHEMA. |
| user | String? | null | Optional user identifier for OpenAI. |
| logitBias | Map<String, Double> | {} | Up to 300 entries, values -100..100. |
| stream | Boolean | false | If true, enables streaming on request. |
| enableWebSearch | Boolean | false | Adds web_search_options and omits sampling params. |
| apiUrl | String | https://api.openai.com/v1/chat/completions | Base URL for Chat Completions. |
| retryPolicy | RetryPolicy | NO_RETRY | Exponential backoff with jitter when set. |
| usageSink | LlmUsageSink? | null | Callback for normalized usage metrics. |
| apiKey | String | "" | Injected by LlmClientFactory (e.g., OPENAI_API_KEY). |
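
The ranges in the table above can be expressed as simple checks. This is a minimal sketch, not the library's validation logic: `SamplingOptions` is a hypothetical stand-in covering a subset of the parameters, with defaults copied from the table.

```kotlin
// Hypothetical subset of the options in the table; not the library's OpenAiConfig.
data class SamplingOptions(
    val temperature: Double = 0.28,
    val maxTokens: Int = 4000,
    val topP: Double = 1.0,
    val stopSequences: List<String> = emptyList(),
)

// Returns one message per violated constraint; an empty list means the options are in range.
fun validateSampling(options: SamplingOptions): List<String> = buildList {
    if (options.temperature !in 0.0..2.0) add("temperature must be in 0.0..2.0")
    if (options.maxTokens <= 0) add("maxTokens must be > 0")
    if (options.topP !in 0.0..1.0) add("topP must be in 0.0..1.0")
    if (options.stopSequences.size > 4) add("at most 4 stop sequences are allowed")
}
```

Whether the library rejects out-of-range values at construction time or leaves that to the provider is not stated here, so treat this as an optional guard in calling code.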

GeminiConfig:

| Parameter | Type | Default | Notes |
|---|---|---|---|
| model | GeminiModel | GeminiModel.GEMINI_1_5_FLASH_LATEST | Use enum. API model id available as model.modelName. |
| temperature | Double | 0.7 | Range 0.0..2.0. |
| topK | Int | 40 | Must be > 0. |
| topP | Double | 0.95 | Range 0.0..1.0. |
| maxOutputTokens | Int | 2048 | Must be > 0. |
| candidateCount | Int | 1 | Number of candidates to return. |
| stopSequences | List<String> | [] | Optional stop sequences. |
| apiUrl | String | auto | When empty, computed as https://generativelanguage.googleapis.com/v1beta/models/{model.modelName}:generateContent. |
| retryPolicy | RetryPolicy | NO_RETRY | Exponential backoff with jitter when set. |
| usageSink | LlmUsageSink? | null | Callback for normalized usage metrics. |
| apiKey | String | "" | Injected by LlmClientFactory (e.g., GEMINI_API_KEY). |
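
The apiUrl rule in the table (use the configured value, otherwise derive the URL from the model id) can be sketched as follows. `resolveGeminiUrl` is a hypothetical helper for illustration; the library's internal logic may differ.

```kotlin
// Hypothetical helper mirroring the apiUrl rule above:
// return the override when set, otherwise build the default generateContent URL.
fun resolveGeminiUrl(modelName: String, apiUrl: String = ""): String =
    apiUrl.ifEmpty {
        "https://generativelanguage.googleapis.com/v1beta/models/$modelName:generateContent"
    }
```

Passing a non-empty apiUrl is the route for pointing a client at a proxy or a test server.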
- OPENAI_API_KEY: OpenAI secret used by LlmClientFactory.
- GEMINI_API_KEY: Gemini secret used by LlmClientFactory.
See examples in examples/ for usage, including streaming and structured responses.
val audioFile = AudioFile(
bytes = File("/path/to/audio.mp3").readBytes(),
fileName = "audio.mp3",
contentType = "audio/mpeg",
)
val request = AudioTranscriptionRequest(
file = audioFile,
model = "gpt-4o-transcribe",
responseFormat = AudioResponseFormat.TEXT,
)
val result = openAiClient.transcribe(request).getOrThrow()
println(result.text)

val request = AudioTranscriptionRequest(
file = audioFile,
model = "gpt-4o-mini-transcribe",
responseFormat = AudioResponseFormat.TEXT,
stream = true,
)
openAiClient.streamTranscription(request).collect { event ->
println(event.type)
}

val request = RealtimeTranscriptionSessionRequest(
transcriptionModel = "gpt-4o-mini-transcribe",
// language defaults to "en"
)
val session = openAiClient.createRealtimeTranscriptionSession(request).getOrThrow()
val connection = openAiClient.openRealtimeTranscriptionConnection(request, session).getOrThrow()
val controller = RealtimeTranscriptionController(connection)
controller.start()
// appendAudio(...) with PCM16 bytes from your mic
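
The snippet above expects PCM16 audio. Here is a minimal sketch of producing such bytes from raw samples, assuming little-endian byte order (common for PCM16, but confirm it against the API documentation). `toPcm16Bytes` and `floatToPcm16` are hypothetical helpers, not library functions.

```kotlin
import java.nio.ByteBuffer
import java.nio.ByteOrder

// Convert 16-bit samples to PCM16 bytes. Little-endian order is an assumption.
fun toPcm16Bytes(samples: ShortArray): ByteArray {
    val buffer = ByteBuffer.allocate(samples.size * 2).order(ByteOrder.LITTLE_ENDIAN)
    samples.forEach { buffer.putShort(it) }
    return buffer.array()
}

// Convert float samples in [-1.0, 1.0] to 16-bit samples, clamping out-of-range values.
fun floatToPcm16(samples: FloatArray): ShortArray =
    ShortArray(samples.size) { i ->
        (samples[i].coerceIn(-1f, 1f) * Short.MAX_VALUE).toInt().toShort()
    }
```

The resulting ByteArray is what would be passed on to the controller in chunks; sample rate and channel count are outside the scope of this sketch.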