AI SDK and OpenAI in Next.js
Vercel AI SDK, OpenAI API, LangChain.js, streaming chat, RAG, embeddings, and generative UI in TypeScript and Next.js.
6 AI SDKs for TypeScript
Vercel AI SDK, OpenAI SDK, LangChain, LlamaIndex, Mastra, and Anthropic SDK — providers, streaming, RSC support, and the best use case for each.
| SDK | Providers | Streaming | RSC | When to use |
|---|---|---|---|---|
| Vercel AI SDK | OpenAI, Anthropic, Google, Mistral | Yes (native) | Yes (streamUI) | Next.js, streaming chat, generative UI |
| OpenAI SDK | OpenAI only | Yes | Manual | OpenAI-specific features, Assistants API |
| LangChain.js | All major + local models | Yes (LCEL) | Harder | Complex chains, agents, RAG |
| LlamaIndex.TS | All major | Limited | Not natively | RAG, document Q&A, retrieval |
| Mastra | OpenAI, Anthropic | Yes | Yes | TypeScript-first agents, workflows |
| Anthropic SDK | Anthropic/Claude only | Yes | Manual | Claude models, long context, vision |
Frequently asked questions
What is the Vercel AI SDK and how do you integrate OpenAI in Next.js?
The Vercel AI SDK is a framework-agnostic TypeScript SDK for building AI applications. It is provider-agnostic (OpenAI, Anthropic, Google, Mistral), streaming-first, and ready for React Server Components and the Edge runtime.

- Core functions: `generateText`, `streamText`, `generateObject`, `streamObject`.
- UI hooks: `useChat`, `useCompletion`, `useAssistant`.
- RSC helpers: `streamUI`, `createStreamableUI`.

`generateText` (server): `const {text} = await generateText({model: openai('gpt-4o'), prompt: 'Explain REST APIs'})`. Useful options: `system: 'You are a TypeScript expert'`, `messages: [{role: 'user', content: 'Question'}]`, `maxTokens: 1000`, `temperature: 0.7`.

`streamText`: `const result = streamText({...})`, then `return result.toTextStreamResponse()`.

Providers: `openai()` from `@ai-sdk/openai` (`npm install @ai-sdk/openai`), `anthropic()` for Claude models, `google()` for Gemini, `mistral()` for Mistral, and `openai.embedding()` for text embeddings.

Route Handler (Next.js): `export async function POST(req: Request) { const {messages} = await req.json(); const result = streamText({model: openai('gpt-4o'), messages}); return result.toDataStreamResponse(); }`.

`useChat` hook (client): `const {messages, input, handleInputChange, handleSubmit} = useChat({api: '/api/chat'})`. Wire `handleSubmit` to the form's `onSubmit`, bind the input with `value={input}` and `onChange={handleInputChange}`, and render `messages.map(m => ...)` by `m.role`. Tokens stream into the UI as they arrive.

Tool calls: `tools: {getWeather: tool({description: '...', parameters: z.object({location: z.string()}), execute: async ({location}) => ({temp: 22})})}`, with `maxToolRoundtrips: 5` to let the model chain several tool invocations.
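The tool-call round trip described above (the model asks for a tool, the server runs the matching `execute`, and the result is fed back) can be illustrated with a plain-TypeScript mock. This is a simplified sketch of the flow, not the AI SDK's internals; the `getWeather` tool and its fixed result are made-up examples.

```typescript
// Simplified mock of the tool-call round trip that streamText performs:
// the model emits a tool call, the server runs the matching execute()
// and returns the result to the model.
type Tool = {
  description: string;
  execute: (args: Record<string, unknown>) => Promise<unknown>;
};

const tools: Record<string, Tool> = {
  getWeather: {
    description: "Get current weather for a location",
    // Hypothetical: a real tool would call a weather API here.
    execute: async ({ location }) => ({ location, temp: 22 }),
  },
};

// The shape a model's tool call might take in a streamed response.
type ToolCall = { toolName: string; args: Record<string, unknown> };

async function runToolCall(call: ToolCall): Promise<unknown> {
  const tool = tools[call.toolName];
  if (!tool) throw new Error(`Unknown tool: ${call.toolName}`);
  return tool.execute(call.args);
}
```

In the real SDK the `parameters` Zod schema validates `args` before `execute` runs; the mock skips validation to keep the control flow visible.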
OpenAI API — models, embeddings, and the Assistants API?
OpenAI model lineup (2024): GPT-4o (multimodal, fast, affordable), GPT-4o mini (cheap and fast for daily tasks), o1 and o1-mini (reasoning with chain-of-thought), GPT-4 Turbo (128k context), GPT-3.5 Turbo (legacy, cheap).

The `openai` npm package: `new OpenAI({apiKey: process.env.OPENAI_API_KEY})`, then `chat.completions.create({model: 'gpt-4o', messages: [{role: 'system', content: '...'}, {role: 'user', content: '...'}], stream: true})`.

Vision: `content: [{type: 'text', text: 'Describe this image'}, {type: 'image_url', image_url: {url: 'data:image/jpeg;base64,...'}}]`.

Embeddings: `embeddings.create({model: 'text-embedding-3-small', input: 'text to embed'})` returns 1536-dimensional vectors; `text-embedding-3-large` gives higher quality.

Assistants API (beta): persistent threads, built-in file search (managed vector store), and a code interpreter. `openai.beta.assistants.create({model, instructions, tools: [{type: 'file_search'}]})`, then `threads.create()`, `threads.messages.create(threadId, {role: 'user', content})`, and `threads.runs.stream(threadId, {assistant_id})`.

Structured outputs: `response_format: {type: 'json_schema', json_schema: {name: 'user', schema: zodToJsonSchema(UserSchema)}}` guarantees valid JSON matching the schema. The AI SDK equivalent is `generateObject`: `const {object} = await generateObject({model: openai('gpt-4o'), schema: z.object({categories: z.array(z.string())}), prompt: 'Categorize this article'})`.

Rate limits are tier-based: retry with exponential backoff and catch `openai.RateLimitError`. For cost management, count tokens before a request with a tokenizer (tiktoken) and cache repeated prompts (e.g. in Redis).
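The exponential backoff recommended for rate limits can be written as a small generic retry helper. A minimal sketch: the injectable `sleep` parameter exists only to make the helper testable, and in real code you would retry only on rate-limit errors (e.g. `err instanceof OpenAI.RateLimitError`) rather than on every failure.

```typescript
// Retry an async call with exponential backoff: wait 500ms, 1s, 2s, ...
// between attempts, and rethrow once maxRetries is exhausted.
async function withBackoff<T>(
  fn: () => Promise<T>,
  maxRetries = 3,
  baseDelayMs = 500,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err;
      // Double the delay on each failed attempt.
      await sleep(baseDelayMs * 2 ** attempt);
    }
  }
}
```

Usage: `withBackoff(() => openai.chat.completions.create({...}))` wraps any API call; the defaults give four attempts over roughly 3.5 seconds of waiting.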
LangChain.js and LlamaIndex — frameworks for building LLM applications?
LangChain.js is a modular framework for building LLM chains and agents. Chat models: `ChatOpenAI`, `ChatAnthropic`, `ChatGoogleGenerativeAI`.

Prompt templates: `const prompt = ChatPromptTemplate.fromMessages([['system', 'You are...'], ['human', '{input}']])`. Chain: `const chain = prompt.pipe(llm).pipe(new StringOutputParser())`, then `const result = await chain.invoke({input: 'Question'})`.

LCEL (LangChain Expression Language): `pipe()` composes components, `RunnableParallel` runs branches concurrently, `RunnableLambda` wraps custom logic.

RAG chain: `const retrievalChain = createRetrievalChain({combineDocsChain: createStuffDocumentsChain({llm, prompt}), retriever: vectorStore.asRetriever()})`.

Memory: `ConversationBufferMemory`, `ConversationSummaryMemory`. Agents: `AgentExecutor` with tools, `createOpenAIFunctionsAgent`, or `createReactAgent` (the ReAct pattern). Tools include TavilySearch, DuckDuckGo, Calculator, and PythonREPL.

LlamaIndex.TS is a RAG-focused framework: `Document`, `SimpleNodeParser`, `OpenAIEmbedding`, `VectorStoreIndex.fromDocuments(documents)`, then `const queryEngine = index.asQueryEngine()` and `const {response} = await queryEngine.query('Question')`. `ServiceContext` configures `llm`, `embedModel`, and `chunkSize`. Document loaders: `PDFReader`, `SimpleWebPageReader`.

When to use which: the AI SDK for Next.js, streaming, and simple chat; LangChain for complex chains, agents, and RAG pipelines; LlamaIndex when RAG and document Q&A are the focus. Mastra is a newer TypeScript-first AI framework (2024) covering agents, workflows, and memory. Vercel AI SDK 4.0 adds `generateObject`, `streamObject`, and `experimental_continueSteps`.
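The LCEL `pipe()` idea (each runnable transforms its input, and chains compose left to right) can be re-created in a few lines of plain TypeScript. This is a toy re-implementation to show why `prompt.pipe(llm).pipe(parser)` reads the way it does, not LangChain's actual `Runnable` class; the "llm" here is a fake that echoes its prompt.

```typescript
// Minimal re-creation of the LCEL composition pattern: a Runnable
// transforms In -> Out, and pipe() chains two runnables into one.
class Runnable<In, Out> {
  constructor(private fn: (input: In) => Out) {}
  invoke(input: In): Out {
    return this.fn(input);
  }
  pipe<Next>(next: Runnable<Out, Next>): Runnable<In, Next> {
    return new Runnable<In, Next>((input: In) =>
      next.invoke(this.invoke(input)),
    );
  }
}

// prompt -> llm -> output parser, mirroring prompt.pipe(llm).pipe(parser).
const prompt = new Runnable(
  (vars: { input: string }) => `System: You are helpful.\nHuman: ${vars.input}`,
);
const fakeLlm = new Runnable((p: string) => ({ content: `Echo: ${p}` }));
const parser = new Runnable((msg: { content: string }) => msg.content);

const chain = prompt.pipe(fakeLlm).pipe(parser);
```

`chain.invoke({input: 'Hi'})` runs all three stages in order; the types guarantee each stage's output matches the next stage's input, which is the main appeal of the real LCEL API.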
Building chatbots — useChat, streaming, and conversation persistence?
Chatbot architecture: client (`useChat`) ↔ Route Handler (`streamText`) ↔ LLM API. Message format: `{id, role: 'user' | 'assistant' | 'system' | 'tool', content}`.

`useChat` options: `api` (endpoint URL), `initialMessages` (load from the database), `onResponse`, `onFinish`, and `onError` callbacks, `body` (extra fields sent with the POST), and `headers` (custom headers).

Persistence: in the `onFinish` callback, save to the database — read `const {id, role, content, createdAt} = message` and call `prisma.message.createMany({data: messages})`. To restore a conversation, pass `initialMessages` from a server-side load or a client fetch.

System prompts: personalize via the `useChat` `body: {systemPrompt: '...'}` and read it in the Route Handler. User context: pass a `userId` and verify it server-side (auth). Multi-conversation support: send a `conversationId` in the body, load the history from the database, and attach it to `messages`.

Streaming UI (React Server Components): `import {createAI, streamUI} from 'ai/rsc'`. On the server: `streamUI({model, messages, text: ({content}) => <div>{content}</div>, tools: {getStock: {description, parameters, generate: async function* ({symbol}) { yield <Loading />; return <StockComponent symbol={symbol} price={await getPrice(symbol)} />; }}}})`. Generative UI means returning React components from the stream: a skeleton first, then the real component.

Tool streaming: handle `onToolCall` on the client, read `toolInvocations` from messages, and render custom UI per invocation. Rate-limit the chatbot with `@upstash/ratelimit` per `userId` (Edge Runtime compatible). Moderation: `openai.moderations.create({input})` returns flagged categories; block harmful content before it reaches the model.
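The persistence flow above (save in `onFinish`, reload via `initialMessages`) can be sketched with an in-memory store standing in for Prisma. The `Map` and the function names here are assumptions for illustration; in production `saveMessages` would call `prisma.message.createMany` and `loadInitialMessages` would query by `conversationId`.

```typescript
// In-memory stand-in for database persistence of chat messages:
// onFinish appends under a conversationId, initialMessages reloads them.
type ChatMessage = {
  id: string;
  role: "user" | "assistant";
  content: string;
};

const store = new Map<string, ChatMessage[]>();

// Called from the onFinish callback after a response completes.
function saveMessages(conversationId: string, messages: ChatMessage[]): void {
  const existing = store.get(conversationId) ?? [];
  store.set(conversationId, [...existing, ...messages]);
}

// Feeds useChat's initialMessages when a conversation is reopened.
function loadInitialMessages(conversationId: string): ChatMessage[] {
  return store.get(conversationId) ?? [];
}
```

Keeping the save keyed by `conversationId` is what makes multi-conversation support work: the client only ever sends the id, and the server owns the history.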
AI in web applications — image generation, transcription, and translation?
Image generation: OpenAI DALL-E 3 via `images.generate({model: 'dall-e-3', prompt: '...', size: '1024x1024', quality: 'hd'})`; Replicate via `run('stability-ai/sdxl', {input: {prompt}})`; Fal.ai (faster and cheaper); Stability AI API.

Transcription (speech-to-text): OpenAI Whisper via `audio.transcriptions.create({file: audioFile, model: 'whisper-1', language: 'pl'})`; Whisper run locally; AssemblyAI (speaker diarization, timestamps); Deepgram (real-time streaming).

Text-to-speech: `openai.audio.speech.create({model: 'tts-1', voice: 'nova', input: 'Text'})`; ElevenLabs (voice cloning, emotional delivery); Azure TTS (300+ voices).

Translation: OpenAI GPT-4o (context and nuance), Google Translate API (fast and cheap), DeepL API (European languages, high quality; `deepl-node` npm package).

Document AI: PDF parsing with LlamaParse (LlamaIndex), Azure Document Intelligence, or AWS Textract; structured extraction from invoices and receipts. Vision: GPT-4o is multimodal — analyze product images, OCR, table extraction. Vercel AI SDK image input: `{type: 'image', image: new URL('...') | Buffer | base64}`.

AI cost monitoring: LangSmith (LangChain), Helicone, Langfuse — track tokens, latency, and errors. Caching strategy: a Redis cache for repeated prompts, or a semantic cache where similar questions reuse the same answer (e.g. the GPTCache library). Local models: Ollama (llama3, mistral), Cloudflare Workers AI, llama.cpp in WASM — no API costs.
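The exact-match prompt caching mentioned above is simple to sketch: normalize the prompt, check the cache, and only call the model on a miss. A minimal in-memory sketch, assuming a `Map` in place of Redis; a semantic cache would compare embedding vectors instead of normalized strings, so similar-but-not-identical prompts would also hit.

```typescript
// Exact-match prompt cache: repeated prompts skip the API call entirely.
const cache = new Map<string, string>();

// Normalize whitespace and case so trivially different prompts share a key.
function normalize(prompt: string): string {
  return prompt.trim().toLowerCase().replace(/\s+/g, " ");
}

async function cachedCompletion(
  prompt: string,
  callModel: (p: string) => Promise<string>, // e.g. wraps generateText
): Promise<string> {
  const key = normalize(prompt);
  const hit = cache.get(key);
  if (hit !== undefined) return hit; // cache hit: zero token cost
  const result = await callModel(prompt);
  cache.set(key, result);
  return result;
}
```

For production use, swap the `Map` for Redis with a TTL so cached answers expire, and never cache responses that depend on per-user context.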