
Commit 65e042a

docs: update google provider docs for implicit caching (vercel#6656)
## Background

The Google Gemini 2.5 model family now supports implicit caching.

## Summary

Update docs.

## Tasks

- [x] Formatting issues have been fixed (run `pnpm prettier-fix` in the project root)
1 parent 0235765 commit 65e042a

1 file changed

Lines changed: 48 additions & 6 deletions

File tree

content/providers/01-ai-sdk-providers/15-google-generative-ai.mdx

````diff
@@ -240,7 +240,46 @@ See [File Parts](/docs/foundations/prompts#file-parts) for details on how to use
 
 ### Cached Content
 
-You can use Google Generative AI language models to cache content:
+Google Generative AI supports both explicit and implicit caching to help reduce costs on repetitive content.
+
+#### Implicit Caching
+
+Gemini 2.5 models automatically provide cache cost savings without needing to create an explicit cache. When you send requests that share common prefixes with previous requests, you'll receive a 75% token discount on cached content.
+
+To maximize cache hits with implicit caching:
+
+- Keep content at the beginning of requests consistent
+- Add variable content (like user questions) at the end of prompts
+- Ensure requests meet minimum token requirements:
+  - Gemini 2.5 Flash: 1024 tokens minimum
+  - Gemini 2.5 Pro: 2048 tokens minimum
+
+```ts
+import { google } from '@ai-sdk/google';
+import { generateText } from 'ai';
+
+// Structure prompts with consistent content at the beginning
+const baseContext =
+  'You are a cooking assistant with expertise in Italian cuisine. Here are 1000 lasagna recipes for reference...';
+
+const { text: veggieLasagna } = await generateText({
+  model: google('gemini-2.5-pro'),
+  prompt: `${baseContext}\n\nWrite a vegetarian lasagna recipe for 4 people.`,
+});
+
+// Second request with same prefix - eligible for cache hit
+const { text: meatLasagna, response } = await generateText({
+  model: google('gemini-2.5-pro'),
+  prompt: `${baseContext}\n\nWrite a meat lasagna recipe for 12 people.`,
+});
+
+// Check cached token count in usage metadata
+console.log('Cached tokens:', response.body.usageMetadata);
+```
+
+#### Explicit Caching
+
+For guaranteed cost savings, you can still use explicit caching with Gemini 2.5 and 2.0 models:
 
 ```ts
 import { google } from '@ai-sdk/google';
@@ -251,30 +290,33 @@ const cacheManager = new GoogleAICacheManager(
   process.env.GOOGLE_GENERATIVE_AI_API_KEY,
 );
 
-// As of August 23rd, 2024, these are the only models that support caching
+// Supported models for explicit caching
 type GoogleModelCacheableId =
+  | 'models/gemini-2.5-pro'
+  | 'models/gemini-2.5-flash'
+  | 'models/gemini-2.0-flash'
   | 'models/gemini-1.5-flash-001'
   | 'models/gemini-1.5-pro-001';
 
-const model: GoogleModelCacheableId = 'models/gemini-1.5-pro-001';
+const model: GoogleModelCacheableId = 'models/gemini-2.5-pro';
 
 const { name: cachedContent } = await cacheManager.create({
   model,
   contents: [
     {
       role: 'user',
-      parts: [{ text: '1000 Lasanga Recipes...' }],
+      parts: [{ text: '1000 Lasagna Recipes...' }],
     },
   ],
   ttlSeconds: 60 * 5,
 });
 
-const { text: veggieLasangaRecipe } = await generateText({
+const { text: veggieLasagnaRecipe } = await generateText({
   model: google(model, { cachedContent }),
   prompt: 'Write a vegetarian lasagna recipe for 4 people.',
 });
 
-const { text: meatLasangaRecipe } = await generateText({
+const { text: meatLasagnaRecipe } = await generateText({
   model: google(model, { cachedContent }),
   prompt: 'Write a meat lasagna recipe for 12 people.',
 });
````
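The implicit-caching minimums in this change (1024 tokens for Gemini 2.5 Flash, 2048 for Gemini 2.5 Pro) can be sanity-checked before relying on a cache hit. Below is a minimal sketch, assuming the common rough heuristic of ~4 characters per token; the actual Gemini tokenizer will produce different counts, and the helper names are hypothetical, not part of the SDK:

```typescript
// Minimum prompt sizes for implicit caching, per the docs change above.
const IMPLICIT_CACHE_MINIMUMS: Record<string, number> = {
  'gemini-2.5-flash': 1024,
  'gemini-2.5-pro': 2048,
};

// Rough pre-flight estimate only (~4 characters per token); the real
// tokenizer will differ, so treat near-threshold results as uncertain.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

// Hypothetical helper: checks whether a shared prefix is likely large
// enough for implicit caching on the given model.
function likelyMeetsCacheMinimum(prompt: string, modelId: string): boolean {
  const minimum = IMPLICIT_CACHE_MINIMUMS[modelId];
  if (minimum === undefined) return false; // unknown model: don't assume implicit caching
  return estimateTokens(prompt) >= minimum;
}

// A ~5000-character shared prefix is roughly 1250 estimated tokens:
const prefix = 'x'.repeat(5000);
console.log(likelyMeetsCacheMinimum(prefix, 'gemini-2.5-flash')); // true (1250 >= 1024)
console.log(likelyMeetsCacheMinimum(prefix, 'gemini-2.5-pro')); // false (1250 < 2048)
```

Near the threshold, the safest check is the actual `usageMetadata` returned by the API on a real request, as shown in the diff.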
