The Vercel AI SDK is the fastest way to add AI to a Next.js app. It handles streaming, manages state on the client, supports tool calls, and works with any AI provider — including Claude. This guide covers everything you need to go from zero to a production-quality AI chat app with Next.js 15.
## Why Vercel AI SDK
Before the SDK existed, adding streaming AI responses to Next.js meant manually handling SSE, parsing chunks, and managing loading state client-side. The SDK abstracts all of that:
- `streamText` — streams a response from the model, handling tokens as they arrive
- `useChat` — a React hook that manages message history, input state, and streaming
- Tool calls — define functions Claude can call, execute them server-side, feed results back
- `generateObject` — get structured JSON output that matches a Zod schema
- Provider-agnostic — swap between Claude, OpenAI, or Mistral by changing one import (sketched below)
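That last point is worth seeing concretely. A minimal sketch of the swap, assuming the `@ai-sdk/openai` package is installed:

```ts
import { anthropic } from '@ai-sdk/anthropic'
// import { openai } from '@ai-sdk/openai'

// Everything else (streamText, useChat) stays identical; only this line changes
const model = anthropic('claude-sonnet-4-6')
// const model = openai('gpt-4o')
```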
## Setup
```bash
npm install ai @ai-sdk/anthropic zod
```

Set your key:

```bash
# .env.local
ANTHROPIC_API_KEY=sk-ant-...
```

## Basic Streaming Chat
### The API Route
Create `app/api/chat/route.ts`:
```ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = await streamText({
    model: anthropic('claude-sonnet-4-6'),
    system: 'You are a helpful assistant. Be concise and accurate.',
    messages,
  })

  return result.toDataStreamResponse()
}
```

That's it for the backend. `toDataStreamResponse()` returns the correct headers and body format that `useChat` expects.
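You can smoke-test the route before wiring up any UI. Expect the SDK's framed data-stream chunks rather than plain text:

```bash
curl -N http://localhost:3000/api/chat \
  -H 'Content-Type: application/json' \
  -d '{"messages":[{"role":"user","content":"Hello"}]}'
```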
### The Chat Component
```tsx
// components/Chat.tsx
'use client'

import { useChat } from 'ai/react'
import { useRef, useEffect } from 'react'

export function Chat() {
  const { messages, input, handleInputChange, handleSubmit, isLoading, stop } = useChat({
    api: '/api/chat',
  })

  const bottomRef = useRef<HTMLDivElement>(null)

  // Keep the newest message in view while the stream comes in
  useEffect(() => {
    bottomRef.current?.scrollIntoView({ behavior: 'smooth' })
  }, [messages])

  return (
    <div className="flex flex-col h-screen max-w-2xl mx-auto p-4">
      <div className="flex-1 overflow-y-auto space-y-4 pb-4">
        {messages.map((m) => (
          <div
            key={m.id}
            className={`flex ${m.role === 'user' ? 'justify-end' : 'justify-start'}`}
          >
            <div
              className={`rounded-2xl px-4 py-2 max-w-[80%] ${
                m.role === 'user'
                  ? 'bg-blue-600 text-white'
                  : 'bg-gray-100 text-gray-900'
              }`}
            >
              {m.content}
            </div>
          </div>
        ))}
        {isLoading && (
          <div className="flex justify-start">
            <div className="bg-gray-100 rounded-2xl px-4 py-2 text-gray-500">
              Thinking...
            </div>
          </div>
        )}
        <div ref={bottomRef} />
      </div>
      <form onSubmit={handleSubmit} className="flex gap-2">
        <input
          value={input}
          onChange={handleInputChange}
          placeholder="Type a message..."
          className="flex-1 rounded-xl border px-4 py-2 focus:outline-none focus:ring-2 focus:ring-blue-500"
          disabled={isLoading}
        />
        {isLoading ? (
          <button
            type="button"
            onClick={stop}
            className="rounded-xl bg-red-500 px-4 py-2 text-white"
          >
            Stop
          </button>
        ) : (
          <button
            type="submit"
            disabled={!input.trim()}
            className="rounded-xl bg-blue-600 px-4 py-2 text-white disabled:opacity-50"
          >
            Send
          </button>
        )}
      </form>
    </div>
  )
}
```

Use it in a page:
```tsx
// app/page.tsx
import { Chat } from '@/components/Chat'

export default function Home() {
  return <Chat />
}
```

## Tool Calls
Tool calls are where the SDK really shines. Claude decides when to call your functions, the SDK executes them server-side, and the results are fed back to the model — all within the same streaming response.
### Define Tools
```ts
// app/api/chat/route.ts
import { streamText, tool } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'

export async function POST(req: Request) {
  const { messages } = await req.json()

  const result = await streamText({
    model: anthropic('claude-sonnet-4-6'),
    messages,
    tools: {
      getWeather: tool({
        description: 'Get the current weather for a city',
        parameters: z.object({
          city: z.string().describe('The city name'),
          unit: z.enum(['celsius', 'fahrenheit']).default('celsius'),
        }),
        execute: async ({ city, unit }) => {
          // In production, call a real weather API
          return {
            city,
            temperature: unit === 'celsius' ? 18 : 64,
            condition: 'partly cloudy',
            humidity: 65,
          }
        },
      }),
      searchWeb: tool({
        description: 'Search the web for recent information',
        parameters: z.object({
          query: z.string().describe('The search query'),
        }),
        execute: async ({ query }) => {
          // In production, call a search API (Brave, Tavily, etc.)
          return {
            results: [`Result for: ${query}`],
          }
        },
      }),
    },
    maxSteps: 5, // allow up to 5 tool call rounds
  })

  return result.toDataStreamResponse()
}
```

`maxSteps` lets the model call multiple tools in sequence. Without it, the response stops after the first tool call.
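During development it helps to see what each round actually did. `streamText` accepts an `onStepFinish` callback you can add to the call above for logging (a sketch; the exact callback fields can vary between SDK versions):

```ts
const result = await streamText({
  model: anthropic('claude-sonnet-4-6'),
  messages,
  tools, // the getWeather / searchWeb object from above
  maxSteps: 5,
  // Runs once per step: inspect which tools were called and what they returned
  onStepFinish({ toolCalls, toolResults, finishReason }) {
    console.log(JSON.stringify({ toolCalls, toolResults, finishReason }, null, 2))
  },
})
```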
### Render Tool Calls on the Client
`useChat` exposes tool calls on each message via `toolInvocations`; each invocation has a `state` field that becomes `'result'` once the tool has run:
```tsx
{messages.map((m) => (
  <div key={m.id}>
    {/* Text content */}
    {m.content && <p>{m.content}</p>}

    {/* Tool calls */}
    {m.toolInvocations?.map((tool) => (
      <div key={tool.toolCallId} className="bg-gray-50 rounded-lg p-3 text-sm">
        <p className="font-medium text-gray-500">
          Called: {tool.toolName}
        </p>
        {tool.state === 'result' && (
          <pre className="mt-1 text-gray-700">
            {JSON.stringify(tool.result, null, 2)}
          </pre>
        )}
      </div>
    ))}
  </div>
))}
```
## Structured Output with generateObject

When you need JSON rather than freeform text, use `generateObject`. It validates the output against a Zod schema and throws if the model returns data that doesn't match, so downstream code can trust the shape.
```ts
// app/api/analyze/route.ts
import { generateObject } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'
import { z } from 'zod'

const SentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  score: z.number().min(0).max(1).describe('Confidence score'),
  summary: z.string().describe('One-sentence summary'),
  keywords: z.array(z.string()).describe('Key topics mentioned'),
})

export async function POST(req: Request) {
  const { text } = await req.json()

  const { object } = await generateObject({
    model: anthropic('claude-sonnet-4-6'),
    schema: SentimentSchema,
    prompt: `Analyze the sentiment of this text: "${text}"`,
  })

  return Response.json(object)
}
```

Call it from anywhere:
```ts
const response = await fetch('/api/analyze', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({ text: reviewText }),
})
const analysis = await response.json()
// { sentiment: 'positive', score: 0.87, summary: '...', keywords: [...] }
```
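To get type safety on the client as well, move the schema into a shared module and infer its type. A sketch, assuming a `lib/schemas.ts` module of your own:

```ts
// lib/schemas.ts (hypothetical shared module: both the route and the client import from here)
import { z } from 'zod'

export const SentimentSchema = z.object({
  sentiment: z.enum(['positive', 'negative', 'neutral']),
  score: z.number().min(0).max(1),
  summary: z.string(),
  keywords: z.array(z.string()),
})

export type Sentiment = z.infer<typeof SentimentSchema>
```

The client can then type the parsed response as `Sentiment` instead of `any`.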
## useCompletion — Single-Turn Streaming

For one-shot text generation (not chat), `useCompletion` is simpler than `useChat`:
```tsx
'use client'

import { useCompletion } from 'ai/react'

export function TextImprover() {
  const { completion, input, handleInputChange, handleSubmit, isLoading } = useCompletion({
    api: '/api/improve',
    streamProtocol: 'text', // matches toTextStreamResponse() on the server
  })

  return (
    <div>
      <form onSubmit={handleSubmit}>
        <textarea value={input} onChange={handleInputChange} rows={4} />
        <button type="submit" disabled={isLoading}>
          {isLoading ? 'Improving...' : 'Improve text'}
        </button>
      </form>
      {completion && (
        <div className="mt-4 p-4 bg-green-50 rounded-lg">
          {completion}
        </div>
      )}
    </div>
  )
}
```

The API route:
```ts
// app/api/improve/route.ts
import { streamText } from 'ai'
import { anthropic } from '@ai-sdk/anthropic'

export async function POST(req: Request) {
  const { prompt } = await req.json()

  const result = await streamText({
    model: anthropic('claude-sonnet-4-6'),
    prompt: `Improve the following text, keeping the same meaning but making it clearer and more professional:\n\n${prompt}`,
  })

  return result.toTextStreamResponse()
}
```

Note the difference: `toTextStreamResponse()` returns a plain text stream and pairs with `useCompletion` configured for the text protocol, while `useChat` expects `toDataStreamResponse()`.
## Persisting Chat History
By default, `useChat` resets on page reload. To persist conversations, initialize it with stored messages and save them after each response:
```tsx
'use client'

import { useChat } from 'ai/react'
import { useEffect } from 'react'

const STORAGE_KEY = 'chat-history'

export function PersistentChat() {
  // Guard against SSR: localStorage only exists in the browser
  const storedMessages = typeof window !== 'undefined'
    ? JSON.parse(localStorage.getItem(STORAGE_KEY) || '[]')
    : []

  const { messages, input, handleInputChange, handleSubmit, isLoading } = useChat({
    initialMessages: storedMessages,
  })

  // Save after every update so a reload restores the conversation
  useEffect(() => {
    if (messages.length > 0) {
      localStorage.setItem(STORAGE_KEY, JSON.stringify(messages))
    }
  }, [messages])

  // ... rest of UI
}
```

For multi-user apps, store the history in a database (Postgres, Supabase, etc.) and load it server-side via `initialMessages`.
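A sketch of that server-side version, assuming a `loadMessages` helper backed by your database and a `Chat` component that forwards `initialMessages` to `useChat`:

```tsx
// app/chat/[id]/page.tsx
import { Chat } from '@/components/Chat'
import { loadMessages } from '@/lib/db' // assumption: your own DB query helper

export default async function ChatPage({ params }: { params: Promise<{ id: string }> }) {
  const { id } = await params // Next.js 15: params is a Promise
  const initialMessages = await loadMessages(id)
  return <Chat initialMessages={initialMessages} />
}
```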
## Model Selection
The SDK works with multiple Claude models:
```ts
import { anthropic } from '@ai-sdk/anthropic'

// Fast and cheap — customer service, simple tasks
anthropic('claude-haiku-4-5-20251001')

// Best balance — most use cases
anthropic('claude-sonnet-4-6')

// Maximum capability — complex reasoning, long context
anthropic('claude-opus-4-6')
```

For a chat app where users ask general questions, `claude-sonnet-4-6` is the right default. Switch to `claude-haiku-4-5-20251001` if you're handling high volume and need to reduce costs. See the OpenAI vs Claude API comparison for a full breakdown.
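If one app serves both cases, a small helper keeps the choice in one place. A sketch (the `pickModel` name and tiers are illustrative, not part of the SDK):

```ts
// lib/pick-model.ts: hypothetical helper for per-request model selection
import { anthropic } from '@ai-sdk/anthropic'

export function pickModel(tier: 'fast' | 'default' = 'default') {
  return tier === 'fast'
    ? anthropic('claude-haiku-4-5-20251001') // high volume, cost-sensitive
    : anthropic('claude-sonnet-4-6') // general chat default
}
```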
## Error Handling
Add error handling to both the route and the client:
```ts
// route.ts
export async function POST(req: Request) {
  try {
    const { messages } = await req.json()

    const result = await streamText({
      model: anthropic('claude-sonnet-4-6'),
      messages,
    })

    return result.toDataStreamResponse()
  } catch (error) {
    console.error('AI stream error:', error)
    return new Response('Failed to generate response', { status: 500 })
  }
}
```

```tsx
// component
const { messages, error, reload } = useChat()

{error && (
  <div className="text-red-500">
    Error: {error.message}
    <button onClick={reload}>Retry</button>
  </div>
)}
```
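One caveat: the `try/catch` only covers failures before streaming begins. Once chunks are flowing, errors surface inside the stream, and the SDK masks their details from the client by default. Recent SDK versions accept a `getErrorMessage` option on the response to control what the client sees (check your version's docs):

```ts
return result.toDataStreamResponse({
  // Decide what error text, if any, reaches the client
  getErrorMessage: (error) =>
    error instanceof Error ? error.message : 'Unknown error',
})
```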
## Rate Limiting

In production, protect your AI endpoint from abuse:
```ts
// middleware.ts
import { NextRequest, NextResponse } from 'next/server'

const requestCounts = new Map<string, { count: number; resetTime: number }>()

export function middleware(request: NextRequest) {
  if (request.nextUrl.pathname === '/api/chat') {
    const ip = request.headers.get('x-forwarded-for') ?? 'anonymous'
    const now = Date.now()
    const windowMs = 60 * 1000 // 1 minute
    const maxRequests = 20

    const record = requestCounts.get(ip)
    if (!record || now > record.resetTime) {
      requestCounts.set(ip, { count: 1, resetTime: now + windowMs })
    } else if (record.count >= maxRequests) {
      return new NextResponse('Rate limit exceeded', { status: 429 })
    } else {
      record.count++
    }
  }
  return NextResponse.next()
}
```

This in-memory version resets on every deploy and doesn't share state across instances. For production, use Redis-based rate limiting (Upstash is a good fit with Vercel).
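A sketch using `@upstash/ratelimit` and `@upstash/redis`, assuming the `UPSTASH_REDIS_REST_URL` and `UPSTASH_REDIS_REST_TOKEN` env vars are set:

```ts
// middleware.ts
import { Ratelimit } from '@upstash/ratelimit'
import { Redis } from '@upstash/redis'
import { NextRequest, NextResponse } from 'next/server'

// Sliding window: 20 requests per IP per minute, shared across all instances
const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(20, '1 m'),
})

export async function middleware(request: NextRequest) {
  if (request.nextUrl.pathname === '/api/chat') {
    const ip = request.headers.get('x-forwarded-for') ?? 'anonymous'
    const { success } = await ratelimit.limit(ip)
    if (!success) {
      return new NextResponse('Rate limit exceeded', { status: 429 })
    }
  }
  return NextResponse.next()
}
```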
## Complete App Structure
```
app/
  api/
    chat/route.ts     — streaming chat endpoint
    analyze/route.ts  — generateObject endpoint
  page.tsx            — Chat component
components/
  Chat.tsx            — useChat hook + UI
  Message.tsx         — single message with tool calls
lib/
  ai.ts               — shared model config
```

Shared model config:
```ts
// lib/ai.ts
import { anthropic } from '@ai-sdk/anthropic'

export const defaultModel = anthropic('claude-sonnet-4-6')
export const fastModel = anthropic('claude-haiku-4-5-20251001')
```
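Routes then import from one place, so a model upgrade is a one-file change:

```ts
// Any route handler: swap models by editing lib/ai.ts only
import { streamText } from 'ai'
import { defaultModel } from '@/lib/ai'

export async function POST(req: Request) {
  const { messages } = await req.json()
  const result = await streamText({ model: defaultModel, messages })
  return result.toDataStreamResponse()
}
```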
## What to Build Next

- AI-powered search — combine `generateObject` with a search API to return structured results
- Document Q&A — upload files, extract text, use as context in the prompt (see the RAG system guide for the embedding approach)
- Code assistant — system prompt focused on your codebase, tool call to read files
- Form auto-fill — `generateObject` with your form's Zod schema + user description
The Vercel AI SDK removes almost all boilerplate from AI integration: you focus on the product while it handles the streaming protocol, state management, and provider differences. If you're building a Next.js app with AI features, this is the stack to use in 2026.
For auth on top of your AI app, see Next.js authentication with Auth.js v5. For the full-stack foundation this builds on, see the Next.js 15 TypeScript tutorial. For understanding the AI provider options, read the OpenAI vs Claude API comparison.