Search Engines for Web Applications
Algolia, Typesense, Meilisearch, Elasticsearch, pgvector, and Pinecone: full-text search, faceting, semantic search, and RAG in Next.js.
6 search solutions compared
Algolia, Typesense, Meilisearch, Elasticsearch, pgvector, and Pinecone: hosting model, price, latency, and best use case.
| Solution | Hosting model | Price | Latency | Best for |
|---|---|---|---|---|
| Algolia | Managed SaaS | Free tier + pay per search | 1-50 ms (global) | Fast time-to-value, e-commerce |
| Typesense | Self-hosted / Cloud | Open source / Cloud billed by cluster resources | 1-10 ms (local) | Cost-conscious teams, self-hosting, SQL-like filters |
| Meilisearch | Self-hosted / Cloud | Open source / Cloud plans | 5-20 ms (local) | Easiest DX, developer-friendly |
| Elasticsearch | Self-hosted / Elastic Cloud | Open source (AGPL) / Cloud | Depends on the cluster | Enterprise, analytics, ELK |
| pgvector (PostgreSQL) | Database extension | Free (if you already run Postgres) | Depends on the database | Semantic search, AI RAG, existing Postgres |
| Pinecone | SaaS vector DB | Free starter + paid plans | 1-100 ms | AI/ML, RAG, vector similarity |
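Frequently asked questions
What is Algolia, and why is it faster than traditional full-text search?
Algolia is a hosted search-as-a-service platform launched in 2012. It is API-first, typo-tolerant, and returns results in roughly 1-50 ms.

Why it is fast: the inverted index lives in RAM rather than on disk, the infrastructure is distributed, queries use prefix matching instead of a full-text scan, and faceting is a first-class feature.

Typo tolerance is based on edit distance. By default one typo is allowed in words of at least 4 characters and two typos in words of at least 8. Phonetic similarity and split words are handled as well.

Key concepts. Index: a collection of records (like a table). Record: a JSON object such as `{objectID, title, description, price}`. Attribute: a field in a record. Hit: a record returned by a search. Facet: an attribute used for filtering (category, price, rating).

InstantSearch.js provides ready-made UI components: `SearchBox`, `Hits`, `Pagination`, `RefinementList`. With React InstantSearch, create a client with `liteClient(appId, searchApiKey)` from `algoliasearch/lite`, then wrap `SearchBox` and `Hits` in `<InstantSearch searchClient={client} indexName="products">`. The `useSearchBox` and `useHits` hooks support custom UIs, and `Configure` sets parameters such as `attributesToRetrieve` and `hitsPerPage`.

Relevance: `searchableAttributes` selects which fields are searched, and `customRanking: ['desc(popularity)']` breaks ties with a business metric. The default `ranking` order is typo, geo, words, filters, proximity, attribute, exact, custom.

Faceting: declare fields in `attributesForFaceting`. `facetFilters: [['brand:Apple', 'brand:Samsung']]` matches either brand, because values inside a nested array are ORed. Numeric filters look like `numericFilters: 'price>=100'`. Geo search uses `aroundLatLng` and `aroundRadius`.

Record upload: records are sent in batches (up to 1,000 per request). With the v4 JavaScript client, call `client.initIndex('products').saveObjects(records)`; with v5, call `client.saveObjects({indexName: 'products', objects: records})`.

Pricing: there is a free plan (10K records, 10K requests per month). Beyond that, Algolia becomes expensive at scale.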
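The batching rule for record uploads can be illustrated with a small helper. This is only a sketch: `chunk` and `ProductRecord` are hypothetical names, the 1,000-record batch size is the figure quoted above, and no Algolia client is called here. Each resulting batch would be handed to `saveObjects` in turn.

```typescript
// Hypothetical record shape for illustration; Algolia identifies records by objectID.
interface ProductRecord {
  objectID: string;
  title: string;
  price: number;
}

// Splits `items` into consecutive batches of at most `size` elements.
function chunk<T>(items: readonly T[], size = 1000): T[][] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  const batches: T[][] = [];
  for (let i = 0; i < items.length; i += size) {
    batches.push(items.slice(i, i + size));
  }
  return batches;
}

// 2,500 sample records end up in batches of 1000, 1000, and 500.
const records: ProductRecord[] = Array.from({ length: 2500 }, (_, i) => ({
  objectID: String(i),
  title: `Product ${i}`,
  price: 10 + i,
}));
const batches = chunk(records);
const batchSizes = batches.map((batch) => batch.length);
```

The official JavaScript client already batches inside `saveObjects`, so a helper like this mainly matters when calling the REST API directly or when you want to report progress per batch.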
Typesense: an open-source alternative to Algolia?
Typesense is an open-source search engine written in C++ (co-founded by Jason Bosco). You can self-host it or use Typesense Cloud. Its API resembles Algolia's, indexing (writes) is faster, and it is cheaper at large scale.

Configuration: `new Typesense.Client({nodes: [{host, port, protocol}], apiKey})`.

Schema: `client.collections().create({name: 'products', fields: [{name: 'title', type: 'string'}, {name: 'price', type: 'float'}, {name: 'category', type: 'string', facet: true}], default_sorting_field: 'price'})`.

Document operations: `client.collections('products').documents().create(doc)` for a single document, and `documents().import(docs, {action: 'upsert'})` for bulk writes.

Search: `client.collections('products').documents().search({q: 'laptop', query_by: 'title,description', filter_by: 'price:[100..2000]', facet_by: 'category', sort_by: 'price:asc'})`.

InstantSearch adapter: `typesense-instantsearch-adapter` makes Typesense compatible with the Algolia InstantSearch UI.

Typesense vs Algolia. Typesense: open source, cheaper at scale, self-hostable. Algolia: battle-tested, more features, superior geo search, more expensive. Typesense Cloud is the managed, affordable option with an SLA.

Typesense in Next.js: run the search server-side in `getServerSideProps` or a Route Handler.

Typo tolerance: `num_typos: 2`, `typo_tokens_threshold: 1`. Multi-search: `client.multiSearch.perform({searches: [...]})`. Geosearch: use the `geopoint` field type and filter with `filter_by: 'location:(48.853,2.344,5km)'`.
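Typesense filters are plain strings, so it can help to build them from typed input rather than by hand. The sketch below is an illustration under assumptions: `ProductFilter`, `buildFilterBy`, and `buildSearchParams` are hypothetical helpers, and the `price` and `category` fields follow the example schema above. It only constructs the parameter object; passing it to `documents().search(...)` is left to the caller.

```typescript
// Hypothetical filter input for the example `products` collection.
interface ProductFilter {
  minPrice?: number;
  maxPrice?: number;
  categories?: string[];
}

// Builds a Typesense filter_by string; clauses are joined with && (logical AND).
function buildFilterBy(filter: ProductFilter): string {
  const clauses: string[] = [];
  const { minPrice, maxPrice, categories } = filter;

  if (minPrice !== undefined && maxPrice !== undefined) {
    clauses.push(`price:[${minPrice}..${maxPrice}]`); // inclusive numeric range
  } else if (minPrice !== undefined) {
    clauses.push(`price:>=${minPrice}`);
  } else if (maxPrice !== undefined) {
    clauses.push(`price:<=${maxPrice}`);
  }

  if (categories !== undefined && categories.length > 0) {
    // Backticks keep values that contain commas or spaces intact.
    const values = categories.map((c) => "`" + c + "`").join(",");
    clauses.push(`category:=[${values}]`); // exact match on any listed value
  }

  return clauses.join(" && ");
}

// Assembles the search parameters used in the example query above.
function buildSearchParams(q: string, filter: ProductFilter) {
  return {
    q,
    query_by: "title,description",
    filter_by: buildFilterBy(filter),
    facet_by: "category",
    sort_by: "price:asc",
  };
}

const params = buildSearchParams("laptop", {
  minPrice: 100,
  maxPrice: 2000,
  categories: ["Laptops", "Tablets"],
});
```

Keeping the string construction in one function makes it easier to validate user input before it reaches the search server.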
Meilisearch: a local, open-source search engine?
Meilisearch is an open-source search engine written in Rust (2019). It is developer-friendly, offers instant, typo-tolerant search, and can be self-hosted or run on Meilisearch Cloud. Setup is simpler than with Typesense.

Configuration: `new MeiliSearch({host: 'http://localhost:7700', apiKey})`. Index: `client.index('movies')`. Documents: `index.addDocuments([{id: 1, title: 'Inception', genre: ['sci-fi', 'thriller']}])`.

Search: `index.search('inception', {limit: 10, filter: 'genre = sci-fi', facets: ['genre'], attributesToHighlight: ['title', 'overview']})`. Filters can be combined: `filter: 'price >= 100 AND category = Electronics'`. Any field used in a filter must be listed in `filterableAttributes`. Facet ordering: `index.updateFaceting({sortFacetValuesBy: {'*': 'count'}})`.

Settings: `index.updateSettings({searchableAttributes: ['title', 'description'], displayedAttributes: ['title', 'price', 'id'], filterableAttributes: ['category', 'price'], sortableAttributes: ['price', 'createdAt'], rankingRules: ['words', 'typo', 'proximity', 'attribute', 'sort', 'exactness']})`.

Typo tolerance: by default, words shorter than 5 characters allow no typos, words of 5 to 8 characters allow one, and words of 9 or more allow two.

Geosearch: store coordinates in a `_geo` field (`{lat, lng}`), filter with `_geoRadius(lat, lng, distanceInMeters)`, and sort with `_geoPoint(lat, lng)`.

Meilisearch vs Typesense vs Algolia. Meilisearch: easiest setup, great DX, Rust performance. Typesense: more features (multi-search, vector). Algolia: battle-tested in production, managed SLA.

Vector search: Meilisearch 1.3+ supports embeddings for AI-powered semantic search, configured through `embedders` (OpenAI, HuggingFace, REST).
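The length-based typo budget described above can be sketched with a plain Levenshtein distance. This is a simplified illustration, not Meilisearch's implementation: `allowedTypos`, `levenshtein`, and `matchesWithTypos` are hypothetical helpers, and the thresholds (5 and 9 characters) are the defaults quoted in the text.

```typescript
// Typo budget by query word length: <5 chars -> 0, 5-8 -> 1, 9+ -> 2.
function allowedTypos(word: string): number {
  if (word.length >= 9) return 2;
  if (word.length >= 5) return 1;
  return 0;
}

// Classic dynamic-programming edit distance (insert, delete, substitute),
// keeping only the previous row of the table in memory.
function levenshtein(a: string, b: string): number {
  let prev: number[] = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr: number[] = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr.push(
        Math.min(
          prev[j]! + 1, // deletion
          curr[j - 1]! + 1, // insertion
          prev[j - 1]! + cost, // substitution or match
        ),
      );
    }
    prev = curr;
  }
  return prev[b.length]!;
}

// True when the indexed word is within the typo budget of the query word.
function matchesWithTypos(queryWord: string, indexedWord: string): boolean {
  const q = queryWord.toLowerCase();
  const w = indexedWord.toLowerCase();
  return levenshtein(q, w) <= allowedTypos(q);
}

const tolerantHit = matchesWithTypos("incepton", "inception"); // 8 chars, 1 edit
const strictMiss = matchesWithTypos("cet", "cat"); // 3 chars, no typos allowed
```

Short words get no budget because a single edit turns one common word into another ("cat" into "car"), which would flood results with false positives.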
Elasticsearch vs hosted search: when should you run your own cluster?
Elasticsearch is built on Apache Lucene and is the most powerful option in this comparison. The Elastic product family also includes Kibana (visualization), Logstash (data pipelines), and Beats (data shippers).

Full-text search uses an inverted index with BM25 scoring. Query DSL clauses: `match`, `term`, `range`, `bool`, `nested`. Aggregations: `terms`, `date_histogram`, `avg`, `sum`.

When to choose Elasticsearch: logs and monitoring (ELK Stack), complex analytics, very large volumes (billions of documents), custom scoring, geospatial queries. Drawbacks: operationally complex, expensive as a managed service (Elastic Cloud), and resource-hungry.

OpenSearch: AWS's fork of Elasticsearch 7.10 (2021). It is open source under Apache 2.0, compatible with the ES API, and available as Amazon OpenSearch Service.

Elastic Cloud: managed Elasticsearch. Version 8.x adds vector search and ELSER (semantic search), alongside Elastic Agent. Elasticsearch Serverless arrived in 2024.

Two philosophies. Algolia, Typesense, and Meilisearch are search-focused: a simple operations model with typo tolerance and faceting out of the box. Elasticsearch combines analytics and search: complex but powerful, and you manage scaling yourself.

PostgreSQL options: native full-text search with the `tsvector` and `tsquery` types needs no external service. ParadeDB's `pg_search` extension adds BM25 ranking, and pgvector adds vector similarity. SQLite FTS5 covers small projects.

Which to pick. Algolia: managed, fast time-to-value, paid. Typesense: self-hosted or Cloud, SQL-like filters. Meilisearch: easiest DX, self-hosted. Elasticsearch: enterprise analytics, complex workloads. PostgreSQL FTS: you already run Postgres, small projects.
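To show how the Query DSL clauses and aggregations listed above fit together, here is a sketch that builds a search request body as a plain object. The index fields (`title`, `price`, `category`) and the helper `buildProductQuery` are assumptions for illustration, and `category` is assumed to be mapped as a `keyword` field so that `term` and `terms` work on it. The object would be sent as the JSON body of a `_search` request.

```typescript
// Hypothetical search input for a product index.
interface ProductSearch {
  text: string;
  minPrice: number;
  maxPrice: number;
  category?: string;
}

// Builds a bool query: `must` clauses are scored with BM25,
// `filter` clauses only include or exclude documents and are not scored.
function buildProductQuery(search: ProductSearch) {
  const filter: Record<string, unknown>[] = [
    { range: { price: { gte: search.minPrice, lte: search.maxPrice } } },
  ];
  if (search.category !== undefined) {
    filter.push({ term: { category: search.category } });
  }

  return {
    size: 10,
    query: {
      bool: {
        must: [{ match: { title: search.text } }],
        filter,
      },
    },
    aggs: {
      // Bucket counts per category, usable as facets in the UI.
      by_category: { terms: { field: "category" } },
      // Average price across all matching documents.
      avg_price: { avg: { field: "price" } },
    },
  };
}

const body = buildProductQuery({
  text: "laptop",
  minPrice: 100,
  maxPrice: 2000,
  category: "electronics",
});
const bodyJson = JSON.stringify(body);
```

Putting the price and category conditions in `filter` rather than `must` keeps them out of relevance scoring and lets Elasticsearch cache them.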
AI-powered search: vector search, semantic search, and RAG in 2024?
Traditional search is keyword matching, so 'laptop' does not match 'notebook'. Semantic search understands the meaning of the query. Embeddings are dense vector representations of text, and similarity between them is measured with cosine similarity.

Vector databases: Pinecone, Weaviate, Qdrant, Chroma. With PostgreSQL and pgvector: `SELECT * FROM documents ORDER BY embedding <=> $1 LIMIT 5`, where `<=>` is the cosine distance operator and `$1` is the query embedding (`vector_cosine_ops` is the matching operator class when you create an index).

Hybrid search combines keyword search (BM25) with vector similarity, that is, fuzzy plus semantic matching. Reciprocal Rank Fusion (RRF) merges the two rankings.

Engine support. Typesense: an embedding field enables vector search alongside keyword search. Meilisearch AI search: OpenAI and HuggingFace embedders, semantic re-ranking. Algolia NeuralSearch: Algolia AI embeddings with hybrid search built in.

RAG (Retrieval-Augmented Generation): 1. The user submits a query. 2. Semantic search runs over the documents (vector DB). 3. The retrieved context is added to the LLM prompt. 4. The LLM generates an answer grounded in that context.

Tooling. LlamaIndex: a RAG framework with document loaders (PDF, Markdown, web), node parsers, vector store integrations, and query engines. LangChain: a full AI framework with agents, tools, and memory. Vercel AI SDK: `useChat`, `useCompletion`, `streamText`, `generateText`, the `ai/react` and `ai/rsc` (RSC streaming) entry points, and tool calls. OpenAI Assistants API: file search (built-in vector store), code interpreter, GPT-4 with retrieval.

Implementing RAG in Next.js: Route Handler + OpenAI embeddings + pgvector + `streamText`.
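Cosine similarity and Reciprocal Rank Fusion are both small enough to sketch directly. The code below is an illustration under assumptions: `cosineSimilarity` and `reciprocalRankFusion` are hypothetical helpers, the document IDs are made up, and `k = 60` is a commonly used RRF constant rather than a value from this article. In a real hybrid setup the two input rankings would come from a BM25 query and a vector query.

```typescript
// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i]!;
    const y = b[i]!;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Reciprocal Rank Fusion: each document scores sum(1 / (k + rank)) over all
// rankings it appears in, with rank starting at 1. Higher score ranks first.
function reciprocalRankFusion(
  rankings: readonly (readonly string[])[],
  k = 60,
): string[] {
  const scores = new Map<string, number>();
  for (const ranking of rankings) {
    ranking.forEach((id, index) => {
      const previous = scores.get(id) ?? 0;
      scores.set(id, previous + 1 / (k + index + 1));
    });
  }
  return Array.from(scores.entries())
    .sort((x, y) => y[1] - x[1])
    .map(([id]) => id);
}

// Made-up result lists: one from keyword search, one from vector search.
const keywordRanking = ["doc-a", "doc-b", "doc-c"];
const vectorRanking = ["doc-b", "doc-c", "doc-a"];
const fused = reciprocalRankFusion([keywordRanking, vectorRanking]);
```

RRF only needs rank positions, not raw scores, which is why it works for merging BM25 scores and vector distances that live on different scales.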