Documentation Index
Fetch the complete documentation index at: https://mintlify.com/thinkex-oss/thinkex/llms.txt
Use this file to discover all available pages before exploring further.
ThinkEx uses Google’s Gemini models via the Vercel AI SDK Gateway for intelligent assistance and content processing.
Primary Models
Gemini 2.5 Flash
Model ID: google/gemini-2.5-flash (default)
Google’s latest model optimized for speed and efficiency.
Characteristics:
- Speed: Very fast response times
- Context Window: 1M tokens
- Multimodal: Text, images, video, audio
- Thinking: Dynamic reasoning budget
- Grounding: Google Search integration
Best For:
- General chat conversations
- Quick content analysis
- Real-time assistance
- Web search synthesis
Configuration:
```typescript
const result = await streamText({
  model: gateway("google/gemini-2.5-flash"),
  temperature: 1.0,
  providerOptions: {
    google: {
      thinkingConfig: {
        includeThoughts: true,
      },
    },
  },
});
```
Gemini 2.5 Flash Lite
Model ID: google/gemini-2.5-flash-lite
Lightweight version optimized for simple tasks.
Characteristics:
- Speed: Fastest response times
- Context Window: 1M tokens
- Cost: Most economical
- Multimodal: Text, images, video, audio
Best For:
- File processing and analysis
- Web search queries
- Simple content extraction
- Background processing tasks
Usage in ThinkEx:
```typescript
// Web search tool
const { text } = await generateText({
  model: google('gemini-2.5-flash-lite'),
  tools: {
    googleSearch: google.tools.googleSearch({ mode: 'MODE_UNSPECIFIED' }),
  },
  prompt: `Search for: ${query}`,
});
```

```typescript
// File analysis
const { text } = await generateText({
  model: google("gemini-2.5-flash-lite"),
  messages: [{
    role: "user",
    content: [
      { type: "text", text: "Analyze this file..." },
      { type: "file", data: fileUrl, mediaType: "application/pdf" },
    ],
  }],
});
```
Gemini 3 Flash Preview
Model ID: google/gemini-3-flash-preview
Next-generation Gemini model with enhanced reasoning.
Characteristics:
- Thinking: Explicit thinking levels (minimal, standard, deep)
- Context Window: 1M+ tokens
- Reasoning: Enhanced multi-step reasoning
- Multimodal: Advanced vision and audio understanding
Best For:
- Complex problem solving
- Multi-step reasoning tasks
- Advanced content analysis
- Research and synthesis
Configuration:
```typescript
providerOptions: {
  google: {
    thinkingConfig: {
      includeThoughts: true,
      thinkingLevel: "minimal", // "minimal" | "standard" | "deep"
    },
  },
}
```
Thinking Levels:
- minimal: Quick reasoning for simple tasks
- standard: Balanced reasoning for most tasks
- deep: Extended reasoning for complex problems
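Choosing a level can be done programmatically. The sketch below is illustrative, not part of the ThinkEx API: the `pickThinkingLevel` helper and its task categories are assumptions, but the `thinkingConfig` shape matches the configuration shown above.

```typescript
// Hypothetical helper: map a coarse task type to a Gemini 3 thinking level.
// The task categories are illustrative, not part of the ThinkEx API.
type ThinkingLevel = "minimal" | "standard" | "deep";

function pickThinkingLevel(task: "extraction" | "chat" | "research"): ThinkingLevel {
  switch (task) {
    case "extraction":
      return "minimal"; // quick reasoning for simple tasks
    case "chat":
      return "standard"; // balanced default
    case "research":
      return "deep"; // extended reasoning for complex problems
  }
}

// The chosen level is passed through providerOptions.google.thinkingConfig:
const providerOptions = {
  google: {
    thinkingConfig: {
      includeThoughts: true,
      thinkingLevel: pickThinkingLevel("research"),
    },
  },
};
```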
Model Selection
The chat API accepts a modelId parameter:
```
POST /api/chat
{
  "modelId": "google/gemini-2.5-flash",
  "messages": [...],
  ...
}
```
Auto-Prefixing:
If you provide a model ID without a provider prefix (e.g., gemini-2.5-flash), it’s automatically prefixed with google/:
// These are equivalent:
"gemini-2.5-flash" → "google/gemini-2.5-flash"
Default Model:
If no modelId is specified, the default is google/gemini-2.5-flash.
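The prefixing and default rules above can be sketched as a small resolver. The `resolveModelId` function is a hypothetical illustration, not ThinkEx’s actual implementation:

```typescript
// Hypothetical sketch of the auto-prefixing and default rules described above.
const DEFAULT_MODEL = "google/gemini-2.5-flash";

function resolveModelId(modelId?: string): string {
  if (!modelId) return DEFAULT_MODEL;        // no modelId: use the default
  if (modelId.includes("/")) return modelId; // already has a provider prefix
  return `google/${modelId}`;                // bare IDs get the google/ prefix
}
```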
Model Capabilities
Multimodal Support
All Gemini models support multiple content types:
Text:
```typescript
{ type: "text", text: "Analyze this content..." }
```
Images:
```typescript
{
  type: "file",
  data: imageUrl, // or base64 data URL
  mediaType: "image/jpeg",
  filename: "photo.jpg",
}
```
Videos:
```typescript
{
  type: "file",
  data: "https://youtube.com/watch?v=...",
  mediaType: "video/mp4",
}
```
PDFs:
```typescript
{
  type: "file",
  data: pdfUrl,
  mediaType: "application/pdf",
  filename: "document.pdf",
}
```
Audio:
```typescript
{
  type: "file",
  data: audioUrl,
  mediaType: "audio/mpeg",
  filename: "audio.mp3",
}
```
All models support function calling:
```typescript
tools: {
  createNote: tool({
    description: "Create a note card",
    inputSchema: z.object({
      title: z.string(),
      content: z.string(),
    }),
    execute: async ({ title, content }) => {
      // Implementation
    },
  }),
}
```
Grounding
Gemini models support web grounding:
```typescript
providerOptions: {
  google: {
    grounding: {
      // Google Search integration
    },
  },
}
```
ThinkEx uses an explicit webSearch tool instead of automatic grounding, for better control and source attribution.
Provider Configuration
Google AI Studio
Setup:
- Get API key from Google AI Studio
- Add to environment:
GOOGLE_GENERATIVE_AI_API_KEY=AIza...
Rate Limits:
- Free tier: 15 requests/minute
- Paid tier: Higher limits based on plan
AI Gateway
Optional: Use Vercel AI Gateway for enhanced routing:
AI_GATEWAY_API_KEY=your-gateway-key
Benefits:
- Automatic failover between providers
- Load balancing across models
- Centralized logging and monitoring
- Cost optimization
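The failover benefit can be illustrated with a small wrapper. This is a sketch of the idea only, not the gateway’s actual routing logic; in practice `primary` and `fallback` would close over `generateText` calls with different model IDs:

```typescript
// Illustrative failover sketch: try the primary model, and on any error
// retry once with a fallback. Not the gateway's actual routing logic.
async function generateWithFailover<T>(
  primary: () => Promise<T>,
  fallback: () => Promise<T>,
): Promise<T> {
  try {
    return await primary();
  } catch {
    return await fallback();
  }
}
```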
Web Search
```typescript
// src/lib/ai/tools/web-search.ts
const { text } = await generateText({
  model: google('gemini-2.5-flash-lite'),
  tools: {
    googleSearch: google.tools.googleSearch({ mode: 'MODE_UNSPECIFIED' }),
  },
  prompt: query,
});
```
File Processing
```typescript
// src/lib/ai/tools/process-files.ts
const { text } = await generateText({
  model: google("gemini-2.5-flash-lite"),
  messages: [{
    role: "user",
    content: [
      { type: "text", text: batchPrompt },
      ...fileInfos.map(f => ({
        type: "file",
        data: f.fileUrl,
        mediaType: f.mediaType,
        filename: f.filename,
      })),
    ],
  }],
});
```
URL Processing
```typescript
// src/lib/ai/tools/process-urls.ts
const { text } = await generateText({
  model: google("gemini-2.5-flash"),
  prompt: `Analyze content from: ${url}...`,
});
```
Context Caching
Gemini automatically caches long, repeated context; the usage object reports how many input tokens were served from the cache:
```typescript
onFinish: ({ usage }) => {
  console.log({
    cachedInputTokens: usage?.cachedInputTokens,
    inputTokens: usage?.inputTokens,
  });
}
```
Message Pruning
Reduce token usage by pruning old messages:
```typescript
const prunedMessages = pruneMessages({
  messages: convertedMessages,
  reasoning: "before-last-message",
  toolCalls: "before-last-5-messages",
  emptyMessages: "remove",
});
```
Streaming
Use streaming for better perceived performance:
```typescript
const result = streamText({
  model,
  messages,
  experimental_transform: smoothStream({
    chunking: "word",
    delayInMs: 15,
  }),
});
```
Token Usage Tracking
Per-Step Tracking
```typescript
onStepFinish: (result) => {
  const { usage, finishReason } = result;
  console.log({
    stepType: result.stepType,
    inputTokens: usage?.inputTokens,
    outputTokens: usage?.outputTokens,
    reasoningTokens: usage?.reasoningTokens,
  });
}
```
Final Usage
```typescript
onFinish: ({ usage, finishReason }) => {
  console.log({
    totalTokens: usage?.totalTokens,
    cachedInputTokens: usage?.cachedInputTokens,
    finishReason,
  });
}
```
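Tracked usage can feed a rough cost estimate. The sketch below is hypothetical: the per-million-token rates are placeholders, not Google’s actual pricing, and `estimateCostUSD` is not a ThinkEx function:

```typescript
// Hypothetical cost estimate from a usage object. The per-million-token
// rates are placeholders, not Google's actual pricing.
interface Usage {
  inputTokens?: number;
  outputTokens?: number;
  cachedInputTokens?: number;
}

function estimateCostUSD(
  usage: Usage,
  ratesPerMillion = { input: 0.30, output: 2.50, cachedInput: 0.075 },
): number {
  const m = 1_000_000;
  return (
    ((usage.inputTokens ?? 0) / m) * ratesPerMillion.input +
    ((usage.outputTokens ?? 0) / m) * ratesPerMillion.output +
    ((usage.cachedInputTokens ?? 0) / m) * ratesPerMillion.cachedInput
  );
}
```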
Experimental Features
Claude Support (Experimental)
ThinkEx has experimental support for Anthropic’s Claude:
```typescript
// Special mapping: Claude Sonnet 4.5 → Gemini 3 Flash Preview
if (modelId === "anthropic/claude-sonnet-4.5") {
  modelId = "google/gemini-3-flash-preview";
}
```
Claude support is experimental and not fully tested. Stick with Gemini models for production use.
Cost Optimization
Model Selection Strategy
- Simple tasks → gemini-2.5-flash-lite
  - File analysis
  - Web search
  - Content extraction
- General chat → gemini-2.5-flash
  - User conversations
  - Content generation
  - Tool orchestration
- Complex reasoning → gemini-3-flash-preview
  - Multi-step problems
  - Research synthesis
  - Advanced analysis
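The strategy above amounts to a simple routing function. This is a sketch only; the `selectModel` helper and its task categories are illustrative, not part of ThinkEx:

```typescript
// Sketch of the model selection strategy above; the task categories
// are illustrative, not a ThinkEx API.
type Task = "file-analysis" | "web-search" | "chat" | "research";

function selectModel(task: Task): string {
  switch (task) {
    case "file-analysis":
    case "web-search":
      return "google/gemini-2.5-flash-lite"; // simple, high-volume tasks
    case "chat":
      return "google/gemini-2.5-flash";      // general conversations
    case "research":
      return "google/gemini-3-flash-preview"; // complex reasoning
  }
}
```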
Caching Strategy
- PDFs: Cache OCR results after first extraction
- Messages: Use context caching for long conversations
- Files: Store processed results in database
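The "store processed results" idea can be sketched as a content-hash cache. This is an illustration under assumed names (`processOnce`, an in-memory `Map`); a real deployment would persist results in a database as described above:

```typescript
import { createHash } from "node:crypto";

// Illustrative cache for processed file results, keyed by content hash so
// the same file is never re-processed. A real deployment would persist
// results in a database rather than an in-memory Map.
const processedCache = new Map<string, string>();

function contentKey(fileBytes: Buffer): string {
  return createHash("sha256").update(fileBytes).digest("hex");
}

async function processOnce(
  fileBytes: Buffer,
  process: (bytes: Buffer) => Promise<string>,
): Promise<string> {
  const key = contentKey(fileBytes);
  const cached = processedCache.get(key);
  if (cached !== undefined) return cached; // cache hit: skip the model call
  const result = await process(fileBytes);
  processedCache.set(key, result);
  return result;
}
```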
Error Handling
Rate Limit Errors
```typescript
try {
  const result = await streamText({ model, ... });
} catch (error) {
  if (error.status === 429) {
    // Rate limit exceeded
    // Implement exponential backoff
  }
}
```
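The exponential backoff mentioned above can be sketched as a retry wrapper. The `withBackoff` helper, its retry count, and its delays are illustrative assumptions, not ThinkEx code:

```typescript
// Sketch of exponential backoff for 429 errors; the retry count and
// delays are illustrative, not ThinkEx defaults.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (error: any) {
      // Only retry rate-limit errors, and only up to maxRetries times.
      if (error?.status !== 429 || attempt >= maxRetries) throw error;
      const delay = baseDelayMs * 2 ** attempt; // 500ms, 1s, 2s, ...
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```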
Timeout Protection
```typescript
const result = await streamText({
  model,
  messages,
  stopWhen: stepCountIs(25), // Prevent infinite loops
});
```
Next Steps
AI Overview
Learn about AI architecture and features
AI Tools
Explore available AI tools