AI SDK and OpenAI in Next.js
Vercel AI SDK, OpenAI API, LangChain.js, streaming chat, RAG, embeddings, and generative UI in TypeScript and Next.js.
6 AI SDKs for TypeScript
Vercel AI SDK, OpenAI SDK, LangChain, LlamaIndex, Mastra, and Anthropic SDK — providers, streaming, RSC support, and the best use case for each.
| SDK | Providers | Streaming | RSC | When to use |
|---|---|---|---|---|
| Vercel AI SDK | OpenAI, Anthropic, Google, Mistral | Yes (native) | Yes (streamUI) | Next.js, streaming chat, generative UI |
| OpenAI SDK | OpenAI only | Yes | Manual | OpenAI-specific features, Assistants API |
| LangChain.js | All major + local models | Yes (LCEL) | Harder | Complex chains, agents, RAG |
| LlamaIndex.TS | All major | Limited | Not natively | RAG, document Q&A, retrieval |
| Mastra | OpenAI, Anthropic | Yes | Yes | TypeScript-first agents, workflows |
| Anthropic SDK | Anthropic/Claude only | Yes | Manual | Claude models, long context, vision |
Frequently asked questions
What is the Vercel AI SDK and how do you integrate OpenAI in Next.js?
The Vercel AI SDK is a framework-agnostic TypeScript SDK for building AI applications. It is provider-agnostic (OpenAI, Anthropic, Google, Mistral), streaming-first, and ready for React Server Components and the Edge runtime.

- Core functions: `generateText`, `streamText`, `generateObject`, `streamObject`.
- UI hooks: `useChat`, `useCompletion`, `useAssistant`.
- RSC helpers: `streamUI`, `createStreamableUI`.

`generateText` (server): `const {text} = await generateText({model: openai('gpt-4o'), prompt: 'Explain REST APIs'})`. Useful options: `system: 'You are a TypeScript expert'`, `messages: [{role: 'user', content: 'Question'}]`, `maxTokens: 1000`, `temperature: 0.7`.

`streamText`: `const result = streamText({...})`, then `return result.toTextStreamResponse()`.

Providers: `openai()` from `@ai-sdk/openai` (`npm install @ai-sdk/openai`), `anthropic()` for Claude models, `google()` for Gemini, `mistral()` for Mistral, and `openai.embedding()` for text embeddings.

Route Handler (Next.js): `export async function POST(req: Request) { const {messages} = await req.json(); const result = streamText({model: openai('gpt-4o'), messages}); return result.toDataStreamResponse(); }`.

`useChat` hook (client): `const {messages, input, handleInputChange, handleSubmit} = useChat({api: '/api/chat'})`. Wire `handleSubmit` to the form's `onSubmit`, bind the input with `value={input}` and `onChange={handleInputChange}`, and render `messages.map(m => ...)` by `m.role`. Tokens stream into the UI as they arrive.

Tool calls: `tools: {getWeather: tool({description: '...', parameters: z.object({location: z.string()}), execute: async ({location}) => ({temp: 22})})}`, with `maxToolRoundtrips: 5` to let the model chain several tool invocations.
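The tool-call round trip described above (the model asks for a tool, the server runs the matching `execute`, and the result is fed back) can be illustrated with a plain-TypeScript mock. This is a simplified sketch of the flow, not the AI SDK's internals; the `getWeather` tool and its fixed result are made-up examples.

```typescript
// Simplified mock of the tool-call round trip that streamText performs:
// the model emits a tool call, the server runs the matching execute()
// and returns the result to the model.
type Tool = {
  description: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
};

const tools: Record<string, Tool> = {
  getWeather: {
    description: "Get current weather for a location",
    // Hypothetical: a real tool would call a weather API here.
    execute: async ({ location }) => ({ location, temp: 22 }),
  },
};

// The shape a model's tool call might take in a streamed response.
type ToolCall = { toolName: string; args: Record<string, unknown> };

async function runToolCall(call: ToolCall): Promise<unknown> {
  const tool = tools[call.toolName];
  if (!tool) throw new Error(`Unknown tool: ${call.toolName}`);
  return tool.execute(call.args);
}
```

In the real SDK the `parameters` Zod schema validates `args` before `execute` runs; the mock skips validation to keep the control flow visible.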
OpenAI API — models, embeddings, and the Assistants API?
OpenAI model lineup (2024): GPT-4o (multimodal, fast, affordable), GPT-4o mini (cheap and fast for daily tasks), o1 and o1-mini (reasoning with chain-of-thought), GPT-4 Turbo (128k context), GPT-3.5 Turbo (legacy, cheap).

The `openai` npm package: `new OpenAI({apiKey: process.env.OPENAI_API_KEY})`, then `chat.completions.create({model: 'gpt-4o', messages: [{role: 'system', content: '...'}, {role: 'user', content: '...'}], stream: true})`.

Vision: `content: [{type: 'text', text: 'Describe this image'}, {type: 'image_url', image_url: {url: 'data:image/jpeg;base64,...'}}]`.

Embeddings: `embeddings.create({model: 'text-embedding-3-small', input: 'text to embed'})` returns 1536-dimensional vectors; `text-embedding-3-large` gives higher quality.

Assistants API (beta): persistent threads, built-in file search (managed vector store), and a code interpreter. `openai.beta.assistants.create({model, instructions, tools: [{type: 'file_search'}]})`, then `threads.create()`, `threads.messages.create(threadId, {role: 'user', content})`, and `threads.runs.stream(threadId, {assistant_id})`.

Structured outputs: `response_format: {type: 'json_schema', json_schema: {name: 'user', schema: zodToJsonSchema(UserSchema)}}` guarantees valid JSON matching the schema. The AI SDK equivalent is `generateObject`: `const {object} = await generateObject({model: openai('gpt-4o'), schema: z.object({categories: z.array(z.string())}), prompt: 'Categorize this article'})`.

Rate limits are tier-based: retry with exponential backoff and catch `openai.RateLimitError`. For cost management, count tokens before a request with a tokenizer (tiktoken) and cache repeated prompts (e.g. in Redis).
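The exponential backoff recommended for rate limits can be written as a small generic retry helper. A minimal sketch: the injectable `sleep` parameter exists only to make the helper testable, and in real code you would retry only on rate-limit errors (e.g. `err instanceof OpenAI.RateLimitError`) rather than on every failure.

```typescript
// Retry an async call with exponential backoff: wait 500ms, 1s, 2s, ...
// between attempts, and rethrow once maxRetries is exhausted.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Double the delay on each failed attempt.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Usage: `withBackoff(() => openai.chat.completions.create({...}))` wraps any API call; the defaults give four attempts over roughly 3.5 seconds of waiting.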
LangChain.js and LlamaIndex — frameworks for building LLM applications?
LangChain.js is a modular framework for building LLM chains and agents. Chat models: `ChatOpenAI`, `ChatAnthropic`, `ChatGoogleGenerativeAI`.

Prompt templates: `const prompt = ChatPromptTemplate.fromMessages([['system', 'You are...'], ['human', '{input}']])`. Chain: `const chain = prompt.pipe(llm).pipe(new StringOutputParser())`, then `const result = await chain.invoke({input: 'Question'})`.

LCEL (LangChain Expression Language): `pipe()` composes components, `RunnableParallel` runs branches concurrently, `RunnableLambda` wraps custom logic.

RAG chain: `const retrievalChain = createRetrievalChain({combineDocsChain: createStuffDocumentsChain({llm, prompt}), retriever: vectorStore.asRetriever()})`.

Memory: `ConversationBufferMemory`, `ConversationSummaryMemory`. Agents: `AgentExecutor` with tools, `createOpenAIFunctionsAgent`, or `createReactAgent` (the ReAct pattern). Tools include TavilySearch, DuckDuckGo, Calculator, and PythonREPL.

LlamaIndex.TS is a RAG-focused framework: `Document`, `SimpleNodeParser`, `OpenAIEmbedding`, `VectorStoreIndex.fromDocuments(documents)`, then `const queryEngine = index.asQueryEngine()` and `const {response} = await queryEngine.query('Question')`. `ServiceContext` configures `llm`, `embedModel`, and `chunkSize`. Document loaders: `PDFReader`, `SimpleWebPageReader`.

When to use which: the AI SDK for Next.js, streaming, and simple chat; LangChain for complex chains, agents, and RAG pipelines; LlamaIndex when RAG and document Q&A are the focus. Mastra is a newer TypeScript-first AI framework (2024) covering agents, workflows, and memory. Vercel AI SDK 4.0 adds `generateObject`, `streamObject`, and `experimental_continueSteps`.
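The LCEL `pipe()` idea (each runnable transforms its input, and chains compose left to right) can be re-created in a few lines of plain TypeScript. This is a toy re-implementation to show why `prompt.pipe(llm).pipe(parser)` reads the way it does, not LangChain's actual `Runnable` class; the "llm" here is a fake that echoes its prompt.

```typescript
// Minimal re-creation of the LCEL composition pattern: a Runnable
// transforms In -> Out, and pipe() chains two runnables into one.
class Runnable<In, Out> {
  constructor(private fn: (input: In) => Out) {}
  invoke(input: In): Out {
    return this.fn(input);
  }
  pipe<Next>(next: Runnable<Out, Next>): Runnable<In, Next> {
    return new Runnable<In, Next>((input: In) =>
      next.invoke(this.invoke(input)),
    );
  }
}

// prompt -> llm -> output parser, mirroring prompt.pipe(llm).pipe(parser).
const prompt = new Runnable(
  (vars: { input: string }) => `System: You are helpful.\nHuman: ${vars.input}`,
);
const fakeLlm = new Runnable((p: string) => ({ content: `Echo: ${p}` }));
const parser = new Runnable((msg: { content: string }) => msg.content);

const chain = prompt.pipe(fakeLlm).pipe(parser);
```

`chain.invoke({input: 'Hi'})` runs all three stages in order; the types guarantee each stage's output matches the next stage's input, which is the main appeal of the real LCEL API.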
Building chatbots — useChat, streaming, and conversation persistence?
Chatbot architecture: client (`useChat`) ↔ Route Handler (`streamText`) ↔ LLM API. Message format: `{id, role: 'user' | 'assistant' | 'system' | 'tool', content}`.

`useChat` options: `api` (endpoint URL), `initialMessages` (load from the database), `onResponse`, `onFinish`, and `onError` callbacks, `body` (extra fields sent with the POST), and `headers` (custom headers).

Persistence: in the `onFinish` callback, save to the database — read `const {id, role, content, createdAt} = message` and call `prisma.message.createMany({data: messages})`. To restore a conversation, pass `initialMessages` from a server-side load or a client fetch.

System prompts: personalize via the `useChat` `body: {systemPrompt: '...'}` and read it in the Route Handler. User context: pass a `userId` and verify it server-side (auth). Multi-conversation support: send a `conversationId` in the body, load the history from the database, and attach it to `messages`.

Streaming UI (React Server Components): `import {createAI, streamUI} from 'ai/rsc'`. On the server: `streamUI({model, messages, text: ({content}) => <div>{content}</div>, tools: {getStock: {description, parameters, generate: async function* ({symbol}) { yield <Loading />; return <StockComponent symbol={symbol} price={await getPrice(symbol)} />; }}}})`. Generative UI means returning React components from the stream: a skeleton first, then the real component.

Tool streaming: handle `onToolCall` on the client, read `toolInvocations` from messages, and render custom UI per invocation. Rate-limit the chatbot with `@upstash/ratelimit` per `userId` (Edge Runtime compatible). Moderation: `openai.moderations.create({input})` returns flagged categories; block harmful content before it reaches the model.
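The persistence flow above (save in `onFinish`, reload via `initialMessages`) can be sketched with an in-memory store standing in for Prisma. The `Map` and the function names here are assumptions for illustration; in production `saveMessages` would call `prisma.message.createMany` and `loadInitialMessages` would query by `conversationId`.

```typescript
// In-memory stand-in for database persistence of chat messages:
// onFinish appends under a conversationId, initialMessages reloads them.
type ChatMessage = {
  id: string;
  role: "user" | "assistant";
  content: string;
};

const store = new Map<string, ChatMessage[]>();

// Called from the onFinish callback after a response completes.
function saveMessages(conversationId: string, messages: ChatMessage[]): void {
  const existing = store.get(conversationId) ?? [];
  store.set(conversationId, [...existing, ...messages]);
}

// Feeds useChat's initialMessages when a conversation is reopened.
function loadInitialMessages(conversationId: string): ChatMessage[] {
  return store.get(conversationId) ?? [];
}
```

Keeping the save keyed by `conversationId` is what makes multi-conversation support work: the client only ever sends the id, and the server owns the history.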
AI in web applications — image generation, transcription, and translation?
Image generation: OpenAI DALL-E 3 via `images.generate({model: 'dall-e-3', prompt: '...', size: '1024x1024', quality: 'hd'})`; Replicate via `run('stability-ai/sdxl', {input: {prompt}})`; Fal.ai (faster and cheaper); Stability AI API.

Transcription (speech-to-text): OpenAI Whisper via `audio.transcriptions.create({file: audioFile, model: 'whisper-1', language: 'pl'})`; Whisper run locally; AssemblyAI (speaker diarization, timestamps); Deepgram (real-time streaming).

Text-to-speech: `openai.audio.speech.create({model: 'tts-1', voice: 'nova', input: 'Text'})`; ElevenLabs (voice cloning, emotional delivery); Azure TTS (300+ voices).

Translation: OpenAI GPT-4o (context and nuance), Google Translate API (fast and cheap), DeepL API (European languages, high quality; `deepl-node` npm package).

Document AI: PDF parsing with LlamaParse (LlamaIndex), Azure Document Intelligence, or AWS Textract; structured extraction from invoices and receipts. Vision: GPT-4o is multimodal — analyze product images, OCR, table extraction. Vercel AI SDK image input: `{type: 'image', image: new URL('...') | Buffer | base64}`.

AI cost monitoring: LangSmith (LangChain), Helicone, Langfuse — track tokens, latency, and errors. Caching strategy: a Redis cache for repeated prompts, or a semantic cache where similar questions reuse the same answer (e.g. the GPTCache library). Local models: Ollama (llama3, mistral), Cloudflare Workers AI, llama.cpp in WASM — no API costs.
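The exact-match prompt caching mentioned above is simple to sketch: normalize the prompt, check the cache, and only call the model on a miss. A minimal in-memory sketch, assuming a `Map` in place of Redis; a semantic cache would compare embedding vectors instead of normalized strings, so similar-but-not-identical prompts would also hit.

```typescript
// Exact-match prompt cache: repeated prompts skip the API call entirely.
const cache = new Map<string, string>();

// Normalize whitespace and case so trivially different prompts share a key.
function normalize(prompt: string): string {
  return prompt.trim().toLowerCase().replace(/\s+/g, " ");
}

async function cachedCompletion(
  prompt: string,
  callModel: (p: string) => Promise<string>, // e.g. wraps generateText
): Promise<string> {
  const key = normalize(prompt);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // cache hit: zero token cost
  const result = await callModel(prompt);
  cache.set(key, result);
  return result;
}
```

For production use, swap the `Map` for Redis with a TTL so cached answers expire, and never cache responses that depend on per-user context.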