diff --git a/docs/codedocs/api-reference/search-cloudflare.md b/docs/codedocs/api-reference/search-cloudflare.md new file mode 100644 index 0000000..35c7763 --- /dev/null +++ b/docs/codedocs/api-reference/search-cloudflare.md @@ -0,0 +1,77 @@ +--- +title: "Search (Cloudflare Workers)" +description: "API reference for the Cloudflare Workers Search client entry point." +--- + +The Cloudflare entry point exports `Search` and `ClientConfig`. It expects credentials to be provided explicitly and sets platform telemetry to `cloudflare` by default. + +**Source**: `src/platforms/cloudflare.ts` + +## Constructor +```ts +new Search(config: ClientConfig) +``` + +### ClientConfig +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| url | `string \| undefined` | — | REST URL for Upstash Search. Required in Cloudflare Workers. | +| token | `string \| undefined` | — | REST token for Upstash Search. Required in Cloudflare Workers. | +| enableTelemetry | `boolean \| undefined` | `true` | When `false`, telemetry headers are not sent. | +| retry | `false \| { retries?: number; backoff?: (retryCount: number) => number }` | default retries/backoff | Controls retry behavior for network errors. | +| cache | `"default" \| "force-cache" \| "no-cache" \| "no-store" \| "only-if-cached" \| "reload" \| false \| undefined` | Fetch default | Controls Fetch API cache behavior. | + +## Static factory +```ts +Search.fromEnv( + env?: { UPSTASH_SEARCH_REST_URL: string; UPSTASH_SEARCH_REST_TOKEN: string }, + config?: Omit +): Search +``` + +This helper mirrors the Node.js API but is often used with Worker `env` bindings instead of `process.env`. + +## Methods +### index +```ts +index(indexName: string): SearchIndex +``` +Creates a `SearchIndex` scoped to the provided index name. + +### listIndexes +```ts +listIndexes(): Promise +``` +Returns a list of index names (namespaces) available in the database. 
+ +### info +```ts +info(): Promise<{ + diskSize: number; + pendingDocumentCount: number; + documentCount: number; + indexes: Record; +}> +``` +Returns storage and document counts for the entire database and per index. + +## Usage example (Cloudflare Workers) +```ts worker.ts +import { Search } from "@upstash/search/cloudflare"; + +export default { + async fetch(request: Request, env: { UPSTASH_SEARCH_REST_URL: string; UPSTASH_SEARCH_REST_TOKEN: string }) { + const client = new Search({ + url: env.UPSTASH_SEARCH_REST_URL, + token: env.UPSTASH_SEARCH_REST_TOKEN, + }); + + const index = client.index<{ text: string }>("notes"); + const results = await index.search({ query: "hello", limit: 3 }); + + return new Response(JSON.stringify(results), { headers: { "Content-Type": "application/json" } }); + }, +}; +``` + +**Related**: [SearchIndex](./search-index), [Types](../types) diff --git a/docs/codedocs/api-reference/search-index.md b/docs/codedocs/api-reference/search-index.md new file mode 100644 index 0000000..dee05ff --- /dev/null +++ b/docs/codedocs/api-reference/search-index.md @@ -0,0 +1,142 @@ +--- +title: "SearchIndex" +description: "API reference for index-level operations like upsert, search, fetch, and range." +--- + +`SearchIndex` is created by calling `Search.index(name)`. It scopes all document operations to a single index (namespace). + +**Source**: `src/search-index.ts` + +## Constructor +```ts +new SearchIndex(httpClient, vectorIndex, indexName) +``` + +You typically do not instantiate this class directly. Use `Search.index()` instead. + +## Methods +### upsert +```ts +upsert( + params: UpsertParameters | UpsertParameters[] +): Promise +``` + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| params | `UpsertParameters` \| `UpsertParameters[]` | — | Single document or array of documents with `id`, `content`, and optional `metadata`. | + +Returns a string status from the REST API. 
+ +```ts index.ts +await index.upsert({ id: "doc-1", content: { title: "Hello" } }); +``` + +### search +```ts +search(params: { + query: string; + limit?: number; + filter?: string | TreeNode; + reranking?: boolean; + semanticWeight?: number; + inputEnrichment?: boolean; + keepOriginalQueryAfterEnrichment?: boolean; +}): Promise> +``` + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| query | string | — | Search query text. | +| limit | number \| undefined | `5` | Maximum number of results. | +| filter | string \| `TreeNode` \| undefined | — | Filter expression or typed filter tree. | +| reranking | boolean \| undefined | `false` | Enable reranking for higher‑quality results. | +| semanticWeight | number \| undefined | `0.75` | Balance between semantic and full‑text relevance (0–1). | +| inputEnrichment | boolean \| undefined | `true` | Enable query enrichment. | +| keepOriginalQueryAfterEnrichment | boolean \| undefined | `false` | Keep original query alongside enriched query. | + +```ts index.ts +const results = await index.search({ + query: "edge runtime", + limit: 5, + reranking: true, + filter: { AND: [{ category: { equals: "docs" } }] }, +}); +``` + +### fetch +```ts +fetch(params: Parameters[0]): Promise<(Document | null)[]> +``` + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| params | `Parameters[0]` | — | Fetch options from `@upstash/vector` (e.g., `{ ids: string[] }`). | + +```ts index.ts +const docs = await index.fetch({ ids: ["doc-1", "doc-2"] }); +``` + +### delete +```ts +delete(params: Parameters[0]): Promise<{ deleted: number }> +``` + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| params | `Parameters[0]` | — | Delete options from `@upstash/vector` (e.g., `{ ids: string[] }`). 
| + +```ts index.ts +await index.delete({ ids: ["doc-1"] }); +``` + +### range +```ts +range(params: { cursor: string; limit: number; prefix?: string }): Promise<{ nextCursor: string; documents: Document[] }> +``` + +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| cursor | string | — | Cursor string for pagination. Use "0" to start. | +| limit | number | — | Max documents to return. | +| prefix | string \| undefined | — | Only return IDs with this prefix. | + +```ts index.ts +const { nextCursor, documents } = await index.range({ cursor: "0", limit: 20, prefix: "doc_" }); +``` + +### reset +```ts +reset(): Promise<{ success: boolean }> +``` + +Clears all documents in the index. + +```ts index.ts +await index.reset(); +``` + +### deleteIndex +```ts +deleteIndex(): Promise<{ success: boolean }> +``` + +Deletes the entire index and all documents. + +```ts index.ts +await index.deleteIndex(); +``` + +### info +```ts +info(): Promise<{ pendingDocumentCount: number; documentCount: number }> +``` + +Returns document counts for the index. + +```ts index.ts +const info = await index.info(); +console.log(info.documentCount); +``` + +**Related**: [Search (Node.js)](./search-nodejs), [Filters](../filters) diff --git a/docs/codedocs/api-reference/search-nodejs.md b/docs/codedocs/api-reference/search-nodejs.md new file mode 100644 index 0000000..f87b6a1 --- /dev/null +++ b/docs/codedocs/api-reference/search-nodejs.md @@ -0,0 +1,72 @@ +--- +title: "Search (Node.js)" +description: "API reference for the Node.js Search client entry point." +--- + +The Node.js entry point exports `Search` and `ClientConfig`. It reads credentials from environment variables by default and injects runtime telemetry headers unless disabled. 
+ +**Source**: `src/platforms/nodejs.ts` + +## Constructor +```ts +new Search(config: ClientConfig) +``` + +### ClientConfig +| Parameter | Type | Default | Description | +|-----------|------|---------|-------------| +| url | `string \| undefined` | — | REST URL for Upstash Search. If omitted, the client checks `NEXT_PUBLIC_UPSTASH_SEARCH_REST_URL` and `UPSTASH_SEARCH_REST_URL`. | +| token | `string \| undefined` | — | REST token for Upstash Search. If omitted, the client checks `NEXT_PUBLIC_UPSTASH_SEARCH_REST_TOKEN` and `UPSTASH_SEARCH_REST_TOKEN`. | +| enableTelemetry | `boolean \| undefined` | `true` | When `false`, telemetry headers are not sent. Disabled automatically if `UPSTASH_DISABLE_TELEMETRY` is set. | +| retry | `false \| { retries?: number; backoff?: (retryCount: number) => number }` | default retries/backoff | Controls retry behavior for network errors. | +| cache | `"default" \| "force-cache" \| "no-cache" \| "no-store" \| "only-if-cached" \| "reload" \| false \| undefined` | `"no-store"` | Controls Fetch API cache behavior. | + +## Static factory +```ts +Search.fromEnv( + env?: { UPSTASH_SEARCH_REST_URL: string; UPSTASH_SEARCH_REST_TOKEN: string }, + config?: Omit +): Search +``` + +Use this when you want to explicitly pass environment variables (useful in serverless frameworks) but still allow retry/cache overrides. + +## Methods +### index +```ts +index(indexName: string): SearchIndex +``` +Creates a `SearchIndex` scoped to the provided index name. + +### listIndexes +```ts +listIndexes(): Promise +``` +Returns a list of index names (namespaces) available in the database. + +### info +```ts +info(): Promise<{ + diskSize: number; + pendingDocumentCount: number; + documentCount: number; + indexes: Record; +}> +``` +Returns storage and document counts for the entire database and per index. 
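The `retry` option in the `ClientConfig` table accepts a `backoff` function of type `(retryCount: number) => number`. As a rough sketch of what a custom function could look like, the snippet below builds a capped exponential backoff. The helper name `cappedBackoff`, the 50 ms base, and the 2 s cap are illustrative assumptions, not SDK defaults or SDK exports.

```typescript
// Hypothetical helper (not part of @upstash/search): returns a function with
// the `(retryCount: number) => number` shape that the `retry.backoff` option expects.
function cappedBackoff(baseMs: number, maxMs: number): (retryCount: number) => number {
  return (retryCount: number): number =>
    // Exponential growth in the retry count, never exceeding maxMs.
    Math.min(maxMs, Math.round(Math.exp(retryCount) * baseMs));
}

// Assumed values for illustration: 50 ms base delay, 2000 ms ceiling.
const backoff = cappedBackoff(50, 2000);

// Delays for the first six attempts (retryCount 0 through 5).
const delays = [0, 1, 2, 3, 4, 5].map(backoff);
console.log(delays);
```

A function like this could then be passed as `retry: { retries: 3, backoff }` when constructing the client. The cap keeps the worst-case wait bounded, which tends to matter in serverless handlers with short execution limits.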
+ +## Usage example +```ts index.ts +import { Search } from "@upstash/search"; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index("movies"); +const stats = await client.info(); +console.log(stats.documentCount); +``` + +**Related**: [SearchIndex](./search-index), [Types](../types) diff --git a/docs/codedocs/architecture.md b/docs/codedocs/architecture.md new file mode 100644 index 0000000..9c0116c --- /dev/null +++ b/docs/codedocs/architecture.md @@ -0,0 +1,49 @@ +--- +title: "Architecture" +description: "How Upstash Search JS is structured internally and how requests flow through the SDK." +--- + +Upstash Search JS is a thin HTTP client built around a few focused modules: a platform-specific `Search` wrapper, a shared `Search` core that composes a `SearchIndex`, a small HTTP client with retry and cache controls, and a filter builder that turns typed filter trees into the string syntax expected by the REST API. + +```mermaid +graph TD + A[platforms/nodejs.ts] -->|extends| B[search.ts] + C[platforms/cloudflare.ts] -->|extends| B[search.ts] + B --> D[search-index.ts] + B --> E[@upstash/vector Index] + D --> F[client/search-client.ts] + D --> G[client/metadata.ts] + F --> H[fetch + REST API] +``` + +**Key Design Decisions** +- **Platform-specific entry points**: `src/platforms/nodejs.ts` and `src/platforms/cloudflare.ts` create a `Search` instance with runtime-appropriate defaults for telemetry and cache. This keeps the core `Search` class (`src/search.ts`) clean and portable, while letting each platform decide how to read credentials and detect runtime details. +- **Composition over duplication**: The core `Search` class constructs a `@upstash/vector` `Index` (`src/search.ts`) and shares the same underlying HTTP client. 
`SearchIndex` (`src/search-index.ts`) receives both the raw HTTP client and the vector index so it can use REST endpoints for search and the Vector SDK for fetch/delete/range/reset APIs. This avoids duplicating REST utilities while still exposing a concise Search‑specific API. +- **Typed filter trees**: The filter system in `src/client/metadata.ts` defines a `TreeNode` type that merges content and metadata fields and enforces mutually exclusive operations at the type level. It then translates that type-safe structure into a single REST filter string via `constructFilterString`. This lets you build complex filters without manually concatenating strings. +- **Retry and cache as first-class config**: `src/client/search-client.ts` defines `RequesterConfig` and `RetryConfig`, then normalizes them into a concrete retry plan. The implementation explicitly sets a default exponential backoff (e.g., `Math.exp(retryCount) * 50`) and keeps cache policy in the request options. + +**How the Pieces Fit Together** +1. **Search instance creation**: You instantiate `Search` from `src/platforms/nodejs.ts` or `src/platforms/cloudflare.ts`. These constructors validate credentials, set telemetry headers, and create an `HttpClient` with retry/cache options. +2. **Index selection**: `Search.index()` from `src/search.ts` creates a `SearchIndex` that is scoped to a namespace (index name). This isolates document operations per index. +3. **Document operations**: `SearchIndex` provides `upsert`, `fetch`, `search`, `range`, `reset`, and `deleteIndex`. Search requests are sent directly via `HttpClient` to REST endpoints (`/search/{index}` or `/upsert-data/{index}`). Fetch/range/delete/reset use the `@upstash/vector` `Index` with a namespace set to the index name. +4. **Filtering**: If you pass a structured filter object to `SearchIndex.search`, it is converted to the REST filter expression by `constructFilterString` (`src/client/metadata.ts`). 
The resulting string is included in the POST body for the search request. + +**Request Lifecycle (Search)** +```mermaid +sequenceDiagram + participant App + participant SearchIndex + participant HttpClient + participant UpstashAPI + + App->>SearchIndex: search({ query, filter, ... }) + SearchIndex->>SearchIndex: validate semanticWeight + SearchIndex->>SearchIndex: construct filter string (optional) + SearchIndex->>HttpClient: request({ path: ["search", index], body }) + HttpClient->>UpstashAPI: POST /search/{index} + UpstashAPI-->>HttpClient: JSON result + HttpClient-->>SearchIndex: result array + SearchIndex-->>App: normalized documents with score +``` + +The result is a compact, predictable SDK surface that stays close to the REST API while still giving you typed documents and helper methods for common index tasks. diff --git a/docs/codedocs/filters.md b/docs/codedocs/filters.md new file mode 100644 index 0000000..82df4e5 --- /dev/null +++ b/docs/codedocs/filters.md @@ -0,0 +1,101 @@ +--- +title: "Filters and Query Trees" +description: "Build safe, expressive filters using TreeNode and understand how they compile into REST syntax." +--- + +Filters let you narrow search results by content fields or metadata fields. Upstash Search JS offers a typed filter tree (`TreeNode`) that compiles into the string expression expected by the REST API. This gives you strong TypeScript guidance while still producing the exact filter syntax the service requires. + +**Why this exists** +Search filters are easy to get wrong when they are built as raw strings—especially when mixing `AND`/`OR`, array operators, and metadata fields. The filter tree in `src/client/metadata.ts` enforces mutually exclusive operations (e.g., you can’t use `equals` and `in` at the same time for a field) and generates the correct string representation. + +**How it relates to other concepts** +- `SearchIndex.search` accepts `filter` as either a string or a `TreeNode` object. 
+- The filter compiler (`constructFilterString`) is called inside `SearchIndex.search` before the REST call is made. +- Metadata fields are merged into the content type and referenced with the `@metadata.<field>` prefix. + +**Internal mechanics** +`TreeNode` is defined in `src/client/metadata.ts` as a recursive type: +- A leaf is a single field with a single operation (`equals`, `glob`, `in`, etc.). +- A tree node can also be `{ AND: TreeNode[] }` or `{ OR: TreeNode[] }`. +- Metadata fields are mapped as `@metadata.<field>` so filters can safely target content and metadata together. + +`constructFilterString` walks the tree recursively and uses an `operationMap` to produce the filter string. Arrays are formatted as `(...)` and string values are wrapped in single quotes. Invalid or missing operations throw an error. + +```mermaid +flowchart TD + A[TreeNode filter] --> B{AND/OR?} + B -->|AND| C[Join children with AND] + B -->|OR| D[Join children with OR] + B -->|Leaf| E[Map operation to operator] + E --> F[Format value] + C --> G[Final filter string] + D --> G[Final filter string] + F --> G[Final filter string] +``` + +**Basic usage** +```ts index.ts +import { Search } from "@upstash/search"; + +type Content = { title: string; category: "classic" | "modern" }; + +type Meta = { year: number }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index<Content, Meta>("movies"); + +const results = await index.search({ + query: "space", + filter: { AND: [{ category: { equals: "classic" } }] }, +}); +``` + +**Advanced usage: mixing metadata, arrays, and nested logic** +```ts index.ts +import { Search } from "@upstash/search"; + +type Content = { title: string; tags: string[] }; + +type Meta = { rating: number; region: string }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index<Content, Meta>("content"); +
+const results = await index.search({ + query: "edge runtime", + filter: { + AND: [ + { tags: { contains: "serverless" } }, + { "@metadata.rating": { greaterThanOrEquals: 4.5 } }, + { + OR: [ + { "@metadata.region": { in: ["us-east", "eu-west"] } }, + { title: { glob: "*edge*" } }, + ], + }, + ], + }, +}); +``` + +`constructFilterString` throws if a filter leaf has no operation or if a value is `undefined`. When building filters from user input, validate and normalize values before you build the tree to prevent runtime errors. + + + +Typed filters make refactoring safer because field names are derived from your content and metadata types. However, they add TypeScript complexity and sometimes require explicit casting when you want to build filters dynamically. Raw strings are more flexible, but they increase the risk of syntax errors and make it harder to catch mistakes early. In teams that value strong typing, the filter tree is usually worth the extra verbosity. + + +Using `@metadata.` keeps the filter namespace explicit and avoids collisions between content fields and metadata fields. The trade‑off is that you must reference metadata fields with a string key, which can be more error‑prone in dynamic code. If you build filters programmatically, consider centralizing metadata field names in constants to avoid typos. This also makes it easier to update field names if your schema evolves. + + +Complex `AND`/`OR` trees map cleanly to the REST syntax, but they can become hard to reason about in large queries. It is often better to build and test smaller filter fragments and then compose them. When you do use deep nesting, ensure you have tests or fixtures that show expected filter strings so regressions are visible. This makes it easier to confirm that the boolean logic still matches your business rules. 
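To make the "fixtures that show expected filter strings" idea concrete, here is a minimal, self-contained sketch of a tree-to-string compiler with one fixture. It is an illustrative re-implementation under stated assumptions, not the SDK's `constructFilterString`: the names `FilterNode`, `compileFilter`, and `operators`, the operator spellings, and the choice to wrap every group in parentheses are all assumptions made for this example.

```typescript
// Illustrative sketch only; NOT the SDK's `constructFilterString`.
type Value = string | number | boolean;
type Leaf = { [field: string]: { [operation: string]: Value | Value[] } };
type FilterNode = { AND: FilterNode[] } | { OR: FilterNode[] } | Leaf;

// Assumed operator spellings; the real REST syntax may differ.
const operators: Record<string, string> = {
  equals: "=",
  greaterThanOrEquals: ">=",
  glob: "GLOB",
  in: "IN",
  contains: "CONTAINS",
};

// Strings are wrapped in single quotes; arrays are formatted as (...).
function formatValue(value: Value | Value[]): string {
  if (Array.isArray(value)) return `(${value.map(formatValue).join(", ")})`;
  return typeof value === "string" ? `'${value}'` : String(value);
}

function compileFilter(node: FilterNode): string {
  const record = node as Record<string, unknown>;
  // Boolean groups: compile each child and join with the group keyword.
  if (Array.isArray(record.AND)) {
    return `(${(record.AND as FilterNode[]).map(compileFilter).join(" AND ")})`;
  }
  if (Array.isArray(record.OR)) {
    return `(${(record.OR as FilterNode[]).map(compileFilter).join(" OR ")})`;
  }
  // Leaves: exactly one field carrying exactly one known operation.
  const fields = Object.entries(node as Leaf);
  if (fields.length !== 1) throw new Error("A leaf must have exactly one field");
  const [field, ops] = fields[0];
  const opEntries = Object.entries(ops);
  if (opEntries.length !== 1) throw new Error(`Field "${field}" needs exactly one operation`);
  const [operation, value] = opEntries[0];
  const operator = operators[operation];
  if (operator === undefined || value === undefined) {
    throw new Error(`Invalid operation "${operation}" on field "${field}"`);
  }
  return `${field} ${operator} ${formatValue(value)}`;
}

// Fixture: a nested filter and the string this sketch produces for it.
const compiled = compileFilter({
  AND: [
    { category: { equals: "docs" } },
    {
      OR: [
        { "@metadata.region": { in: ["us-east", "eu-west"] } },
        { title: { glob: "*edge*" } },
      ],
    },
  ],
});
console.log(compiled);
```

In a real project the fixture would compare against strings produced by the SDK itself; the point of the sketch is only that small, named filter fragments are easy to snapshot and diff when the boolean logic changes.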
+ + diff --git a/docs/codedocs/guides/filtering-playbook.md b/docs/codedocs/guides/filtering-playbook.md new file mode 100644 index 0000000..bfb57e4 --- /dev/null +++ b/docs/codedocs/guides/filtering-playbook.md @@ -0,0 +1,96 @@ +--- +title: "Filtering Playbook" +description: "Practical patterns for building and testing filters in real applications." +--- + +This guide shows how to build filters that match real application needs: attribute filters, metadata constraints, and nested boolean logic. You will use the typed filter tree where possible and fall back to raw strings only when needed. + +**Problem** +Search results are noisy without the ability to filter by category, status, or metadata like region and rating. Developers often end up with stringly‑typed filters that break when fields change. + +**Solution** +Use the `TreeNode` filter structure for most queries, then test the resulting behavior with a small set of seed documents. For complex scenarios, build reusable filter fragments and compose them with `AND`/`OR`. 
+ + + +### Seed a small dataset +```ts index.ts +import { Search } from "@upstash/search"; + +type Content = { title: string; tags: string[]; category: string }; + +type Meta = { rating: number; region: string }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index("catalog"); + +await index.upsert([ + { id: "a", content: { title: "Edge Search", tags: ["serverless"], category: "docs" }, metadata: { rating: 4.9, region: "us-east" } }, + { id: "b", content: { title: "Keyword Indexing", tags: ["fulltext"], category: "blog" }, metadata: { rating: 4.2, region: "eu-west" } }, +]); +``` + + +### Build a structured filter +```ts index.ts +const filter = { + AND: [ + { category: { equals: "docs" } }, + { tags: { contains: "serverless" } }, + { "@metadata.rating": { greaterThanOrEquals: 4.5 } }, + ], +} as const; + +const results = await index.search({ query: "search", filter, limit: 5 }); +``` + + +### Handle OR groups and fall back to strings when needed +```ts index.ts +const results = await index.search({ + query: "indexing", + filter: { + OR: [ + { category: { equals: "docs" } }, + { category: { equals: "blog" } }, + { title: { glob: "*search*" } }, + ], + }, +}); + +// If you need a raw string (for dynamic operators or advanced syntax) +const stringFilter = "category = 'docs' AND title GLOB '*search*'"; +const altResults = await index.search({ query: "search", filter: stringFilter }); +``` + + + +**Complete example: composing reusable filter fragments** +```ts filters.ts +export const docsOnly = { category: { equals: "docs" } } as const; +export const highRating = { "@metadata.rating": { greaterThanOrEquals: 4.5 } } as const; +export const inRegions = (regions: string[]) => ({ "@metadata.region": { in: regions } }) as const; +``` + +```ts index.ts +import { Search } from "@upstash/search"; +import { docsOnly, highRating, inRegions } from "./filters"; + +const 
client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index("catalog"); + +const results = await index.search({ + query: "edge", + filter: { AND: [docsOnly, highRating, inRegions(["us-east", "eu-west"]) ] }, +}); +``` + +This pattern keeps filter logic centralized and makes it easy to evolve fields without touching every query in your codebase. diff --git a/docs/codedocs/guides/reranking-and-enrichment.md b/docs/codedocs/guides/reranking-and-enrichment.md new file mode 100644 index 0000000..5414941 --- /dev/null +++ b/docs/codedocs/guides/reranking-and-enrichment.md @@ -0,0 +1,77 @@ +--- +title: "Reranking and Enrichment" +description: "Tune search quality with reranking, semantic weighting, and input enrichment controls." +--- + +This guide shows how to tune search quality using `reranking`, `semanticWeight`, and `inputEnrichment`. These options let you trade off cost, latency, and relevance depending on the workload. + +**Problem** +Keyword search can miss intent, while purely semantic search can over‑generalize. Teams need a way to blend the two and optionally rerank results for higher relevance, without changing their indexing pipeline. + +**Solution** +Use the search parameters in `SearchIndex.search`. You can enable reranking for high‑value queries, balance semantic and keyword relevance with `semanticWeight`, and control query enrichment for improved intent understanding. Most teams start with a hybrid weight (0.6–0.8), then adjust based on click‑through or downstream conversion metrics. 
+ + + +### Start with a balanced hybrid search +```ts index.ts +const results = await index.search({ + query: "space opera", + limit: 5, + semanticWeight: 0.75, +}); +``` + + +### Enable reranking for premium queries +```ts index.ts +const results = await index.search({ + query: "best sci-fi from the 70s", + limit: 5, + reranking: true, +}); +``` + + +### Control enrichment for deterministic behavior +```ts index.ts +const results = await index.search({ + query: "serverless search", + limit: 5, + inputEnrichment: false, + keepOriginalQueryAfterEnrichment: false, +}); +``` + + + +**Complete example: feature flags for quality vs cost** +```ts search.ts +import { Search } from "@upstash/search"; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index<{ title: string; body: string }>("docs"); + +export async function searchDocs(query: string, opts: { premium?: boolean }) { + return await index.search({ + query, + limit: opts.premium ? 10 : 5, + reranking: opts.premium ? true : false, + semanticWeight: opts.premium ? 0.8 : 0.6, + inputEnrichment: true, + keepOriginalQueryAfterEnrichment: true, + }); +} +``` + +**When to tune these knobs** +- If search feels too literal, increase `semanticWeight` toward 1.0. +- If results feel vague or off‑topic, decrease `semanticWeight` toward 0.5. +- Enable `reranking` for queries where the first few results must be excellent (support, documentation, or high‑value product pages). +- Disable `inputEnrichment` in deterministic workflows where you need exact query behavior for compliance or reproducible tests. + +In this setup, premium users get better relevance at higher cost, while free‑tier users receive a balanced hybrid search with lower latency and cost. 
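Because `semanticWeight` is meant to stay between 0 and 1, it can help to normalize any externally supplied value before it reaches `index.search`. The helper below is a small sketch of that guard; the name `clampSemanticWeight` and the choice of `0.75` as the fallback (mirroring the documented default) are assumptions for illustration, not part of the SDK.

```typescript
// Hypothetical helper (not part of @upstash/search): coerces an untrusted
// value into a number within [0, 1], or returns the fallback if unusable.
function clampSemanticWeight(input: unknown, fallback = 0.75): number {
  // Accept numeric strings such as "0.6" (e.g. from a query parameter).
  const value =
    typeof input === "string" && input.trim() !== "" ? Number(input) : input;
  // Anything that is not a finite number falls back to the default.
  if (typeof value !== "number" || !Number.isFinite(value)) return fallback;
  // Clamp into the valid range.
  return Math.min(1, Math.max(0, value));
}

const weights = {
  tooHigh: clampSemanticWeight(1.4),
  tooLow: clampSemanticWeight(-2),
  fromString: clampSemanticWeight("0.6"),
  missing: clampSemanticWeight(undefined),
  garbage: clampSemanticWeight("abc"),
};
console.log(weights);
```

The result could be passed straight through, for example `semanticWeight: clampSemanticWeight(userInput)`, so that a malformed request degrades to the default blend instead of failing.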
diff --git a/docs/codedocs/guides/serverless-setup.md b/docs/codedocs/guides/serverless-setup.md new file mode 100644 index 0000000..e970db6 --- /dev/null +++ b/docs/codedocs/guides/serverless-setup.md @@ -0,0 +1,89 @@ +--- +title: "Serverless and Edge Setup" +description: "Configure Upstash Search JS in Node.js serverless and Cloudflare Workers with minimal friction." +--- + +This guide shows how to use the SDK in serverless and edge runtimes where long‑lived TCP connections are not ideal. You will configure credentials, create a `Search` client, and run a minimal search request. + +**Problem** +You need a lightweight search client that works in serverless or edge runtimes without sockets, while still supporting retries, cache control, and typed documents. + +**Solution** +Use the platform‑specific entry points and let the SDK construct a connectionless HTTP client. The Node.js entry can load credentials from the environment, while Cloudflare Workers typically pass secrets explicitly. + + + +### Configure environment variables +For Node.js (serverless or traditional), set the REST URL and token: + +```bash +export UPSTASH_SEARCH_REST_URL="" +export UPSTASH_SEARCH_REST_TOKEN="" +``` + +For Vercel/Next.js, use the same variable names or `NEXT_PUBLIC_` variants if you must read them in client code. + + +### Create a client and index +Use the entry point that matches your runtime. 
+ + + +```ts index.ts +import { Search } from "@upstash/search"; + +type Doc = { text: string }; + +const client = Search.fromEnv(); +const index = client.index<Doc>("notes"); +``` + + +```ts worker.ts +import { Search } from "@upstash/search/cloudflare"; + +type Doc = { text: string }; + +const client = new Search({ + url: env.UPSTASH_SEARCH_REST_URL, // `env` is the bindings object passed to your Worker's fetch handler + token: env.UPSTASH_SEARCH_REST_TOKEN, +}); + +const index = client.index<Doc>("notes"); +``` + + + + +### Perform a minimal search +```ts index.ts +await index.upsert({ id: "n1", content: { text: "hello serverless" } }); +const results = await index.search({ query: "hello", limit: 1 }); + +console.log(JSON.stringify(results)); // inside a request handler, return this as the Response body instead +``` + + + +**Complete runnable example (Node.js API route)** +```ts api/search.ts +import { Search } from "@upstash/search"; + +const client = Search.fromEnv(); +const index = client.index<{ text: string }>("notes"); + +export async function handler(req: Request) { + const { q } = await req.json(); + + const results = await index.search({ + query: typeof q === "string" ? q : "", + limit: 5, + }); + + return new Response(JSON.stringify(results), { + headers: { "Content-Type": "application/json" }, + }); +} +``` + +If you call this handler with a JSON body of `{ "q": "hello" }`, you should receive an array of matching documents with scores. diff --git a/docs/codedocs/index.md b/docs/codedocs/index.md new file mode 100644 index 0000000..802fe83 --- /dev/null +++ b/docs/codedocs/index.md @@ -0,0 +1,102 @@ +--- +title: "Getting Started" +description: "Use Upstash Search JS to index documents and run semantic + full‑text search over HTTP from any runtime." +--- + +Upstash Search JS is a connectionless HTTP client that lets you index documents and run AI-powered search (semantic + full‑text) on Upstash Search from Node.js, serverless, and edge runtimes. + +**The Problem** +- Traditional search SDKs assume long‑lived TCP connections that don’t fit serverless or edge runtimes. 
+- Building a useful search experience requires combining semantic search, keyword search, and filtering without re‑implementing infrastructure. +- Multi‑environment apps (Node.js, Cloudflare Workers, Vercel, browser) need consistent APIs with runtime‑safe defaults. +- Index management tasks (upsert, delete, range scans) become tedious without a clear, typed client. + +**The Solution** +Upstash Search JS wraps Upstash Search’s REST API with a small, typed client. You create a `Search` instance, select an index, and use `upsert`, `search`, `fetch`, and `range` without managing TCP connections or custom request signing. + +```ts index.ts +import { Search } from "@upstash/search"; + +type Content = { title: string; genre: string; category: "classic" | "modern" }; +type Metadata = { director: string }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const movies = client.index("movies"); + +await movies.upsert({ + id: "star-wars", + content: { title: "Star Wars", genre: "sci-fi", category: "classic" }, + metadata: { director: "George Lucas" }, +}); + +const results = await movies.search({ query: "space opera", limit: 2, reranking: true }); +console.log(results.map((r) => r.id)); +``` + +**Installation** + + +```bash +npm install @upstash/search +``` + + +```bash +pnpm add @upstash/search +``` + + +```bash +yarn add @upstash/search +``` + + +```bash +bun add @upstash/search +``` + + + +**Quick Start** +The smallest working example uses a single index and a single search. 
+ +```ts index.ts +import { Search } from "@upstash/search"; + +type Note = { text: string }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const notes = client.index("notes"); + +await notes.upsert({ id: "n1", content: { text: "hello world" } }); +const results = await notes.search({ query: "hello", limit: 1 }); + +console.log(results[0].content.text); +``` + +Expected output: + +```txt +hello world +``` + +**Key Features** +- Connectionless HTTP client suitable for serverless and edge runtimes +- Semantic + full‑text search with optional reranking +- Typed documents and metadata, with strongly typed filters +- Index management helpers: upsert, fetch, range, reset, delete +- Runtime‑aware telemetry headers (optional) + + + How modules fit together and the request flow + Understand clients, indexes, and filters + Complete API for Search and SearchIndex + diff --git a/docs/codedocs/indexes.md b/docs/codedocs/indexes.md new file mode 100644 index 0000000..f836623 --- /dev/null +++ b/docs/codedocs/indexes.md @@ -0,0 +1,99 @@ +--- +title: "Indexes and Namespaces" +description: "Learn how SearchIndex scopes data, manages documents, and bridges REST and Vector APIs." +--- + +An index (namespace) is the primary unit of data isolation in Upstash Search JS. The `SearchIndex` class gives you a scoped view of a single index, letting you upsert, search, fetch, range‑scan, and delete documents without passing the index name to every call. + +**Why this exists** +Search databases often contain multiple logical collections (movies, products, support articles). Namespaces let you separate them and operate on each collection independently. `SearchIndex` wraps these operations in a typed, ergonomic interface. + +**How it relates to other concepts** +- You create a `SearchIndex` by calling `Search.index()` from `src/search.ts`. 
+- Search operations (`search`, `upsert`) go through the SDK’s REST client (`src/client/search-client.ts`). +- Fetch, delete, range, and reset operations use the `@upstash/vector` `Index` with the namespace set to the index name. +- Filters (see `TreeNode` in `src/client/metadata.ts`) can be passed to `search` to apply structured constraints. + +**Internal mechanics** +`SearchIndex` is implemented in `src/search-index.ts` and receives three constructor parameters: `httpClient`, `vectorIndex`, and `indexName`. It uses each depending on the operation: +- `upsert` posts to `/upsert-data/{indexName}` using the HTTP client. +- `search` posts to `/search/{indexName}` and maps results into `{ id, content, metadata, score }`. +- `fetch`, `delete`, `range`, `reset`, and `deleteIndex` delegate to `@upstash/vector` with `{ namespace: indexName }`. + +```mermaid +flowchart TD + A["Search.index('movies')"] --> B[SearchIndex] + B -->|upsert/search| C["HttpClient -> REST API"] + B -->|fetch/delete/range/reset| D["@upstash/vector Index"] + D --> E["Namespace = indexName"] +``` + +**Basic usage** +```ts index.ts +import { Search } from "@upstash/search"; + +type Movie = { title: string; genre: string }; + +type Meta = { director: string; year: number }; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const movies = client.index<Movie, Meta>("movies"); + +await movies.upsert([ + { id: "m1", content: { title: "Alien", genre: "sci-fi" }, metadata: { director: "Ridley Scott", year: 1979 } }, + { id: "m2", content: { title: "Arrival", genre: "sci-fi" }, metadata: { director: "Denis Villeneuve", year: 2016 } }, +]); + +const results = await movies.search({ query: "first contact", limit: 2 }); +console.log(results.map((r) => r.id)); +``` + +**Advanced usage: pagination and cleanup** +```ts index.ts +import { Search } from "@upstash/search"; + +type Doc = { text: string }; + +const client = new Search({ + url:
process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const docs = client.index<Doc>("docs"); + +// Range-scan all documents with a prefix +let cursor = "0"; +const all: Doc[] = []; + +do { + const { nextCursor, documents } = await docs.range({ + cursor, + limit: 50, + prefix: "doc_", + }); + + all.push(...documents.map((d) => d.content)); + cursor = nextCursor; + // Assumption: an empty cursor marks the end of the scan, as in `@upstash/vector`; "0" is kept as a guard. +} while (cursor !== "" && cursor !== "0"); + +// Remove the entire index when it is no longer needed +await docs.deleteIndex(); +``` + +`SearchIndex.search` validates `semanticWeight` and throws if it is outside the `0`–`1` range. Validate or clamp user input before passing it to `search` so that a bad value does not surface as an unhandled error in production. + + + +Search requests go through the REST API because they include AI search features and reranking parameters not exposed in the Vector SDK. In contrast, fetch/delete/range/reset operations use `@upstash/vector` because those endpoints are already stable and optimized for vector‑style document operations. This split keeps the SDK surface small and avoids duplicating logic. The trade‑off is that two different request paths are used internally, which can make debugging harder if you expect all operations to flow through the same HTTP client. + + +The index name is injected into every request path or namespace option. This gives strong isolation between datasets but also means that typos create new, empty indexes rather than errors. If you build dynamic index names, consider normalizing them (lowercase, stable prefixes) and validating them before calling `Search.index()`. For multi‑tenant systems, treat the index name as part of your tenancy boundary and avoid using user‑supplied raw input directly. + + +`upsert` treats the document `id` as a unique key; inserting the same ID overwrites the previous document. This is convenient for incremental updates but can hide accidental ID collisions if your ID generation is weak.
For append‑only workflows, bake in a strong unique suffix (like a timestamp or UUID). For replacement workflows, explicitly log the IDs you update so it is clear when a document changes. + + diff --git a/docs/codedocs/search-client.md b/docs/codedocs/search-client.md new file mode 100644 index 0000000..64d53d8 --- /dev/null +++ b/docs/codedocs/search-client.md @@ -0,0 +1,77 @@ +--- +title: "Search Client" +description: "Understand how the platform-specific Search client builds requests, handles retries, and injects telemetry." +--- + +The Search client is the entry point to the SDK. It is responsible for validating credentials, attaching headers, and creating the HTTP requester that every index operation uses. There are two platform wrappers—Node.js and Cloudflare Workers—that both extend the same core class but differ in how they read environment variables and determine telemetry headers. + +**Why this exists** +Upstash Search is accessed over HTTP. The client abstracts away authorization headers, retry logic, and runtime differences so you can focus on indexing and querying. Without it, each function call would need to manually build requests and handle transient network failures. + +**How it relates to other concepts** +- The `Search` client creates a `SearchIndex` for each namespace via `Search.index()` in `src/search.ts`. +- `SearchIndex` uses the client’s `HttpClient` to call `/search` and `/upsert-data` endpoints and the `@upstash/vector` SDK for other index operations. +- The filter system (`TreeNode` + `constructFilterString`) plugs into `SearchIndex.search` to add typed filtering. + +**Internal mechanics** +The platform-specific `Search` class in `src/platforms/nodejs.ts` and `src/platforms/cloudflare.ts`: +- Validates `url` and `token` and throws `UpstashError` if missing. +- Warns if the credentials contain whitespace, which can break auth headers. 
+- Builds telemetry headers using `src/client/telemetry.ts` (Node.js adds runtime details; Cloudflare sets a fixed platform value). +- Constructs an `HttpClient` (`src/client/search-client.ts`) with `baseUrl`, `headers`, `retry`, and `cache` settings. + +`HttpClient` normalizes retry config: if `retry` is `false` it attempts only once; otherwise it defaults to 5 retries with exponential backoff (`Math.exp(retryCount) * 50`). It then POSTs JSON to the REST endpoint and throws an `UpstashError` on non‑OK responses. + +```mermaid +flowchart TD + A[User config] --> B[Search constructor] + B --> C[Validate url/token] + C --> D[Telemetry headers] + D --> E[HttpClient] + E --> F[Search core] + F --> G[SearchIndex] +``` + +**Basic usage** +```ts index.ts +import { Search } from "@upstash/search"; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, +}); + +const index = client.index("movies"); +``` + +**Advanced usage: custom retry + cache** +```ts index.ts +import { Search } from "@upstash/search"; + +const client = new Search({ + url: process.env.UPSTASH_SEARCH_REST_URL!, + token: process.env.UPSTASH_SEARCH_REST_TOKEN!, + retry: { + retries: 3, + backoff: (count) => 100 + count * 200, + }, + cache: "no-store", + enableTelemetry: false, +}); + +const index = client.index("support-articles"); +``` + +If you pass `url` or `token` values with leading or trailing whitespace, the client will warn you but still attempt requests. Trim secrets in your deployment pipeline and avoid multiline environment variables to prevent intermittent authentication failures. + + + +Telemetry headers are enabled by default so Upstash can understand SDK usage and runtime patterns. In `src/platforms/nodejs.ts`, the SDK reads `UPSTASH_DISABLE_TELEMETRY` and, if set, removes those headers entirely. 
Disabling telemetry reduces outbound metadata but also makes it harder to diagnose runtime-specific issues and measure SDK performance across environments. If you disable it, consider adding your own monitoring around HTTP failures and retries so you still have visibility into request reliability. + + +The default exponential backoff in `src/client/search-client.ts` is safe for most serverless workloads, but it increases tail latency when the network is flaky. Reducing retries can improve worst‑case latency but risks more visible failures at peak load. For user‑facing search, a small number of retries (1‑3) usually gives a better balance. For background ingestion, longer retries can be acceptable because throughput matters more than response time. + + +The HTTP client exposes `cache` settings for runtimes that respect the Fetch API cache semantics. Using `no-store` avoids stale reads but also prevents edge caching of identical searches. In read‑heavy workloads with mostly stable content, `force-cache` can improve latency at the cost of freshness. Always align cache policy with the consistency needs of your content; for example, ingestion-heavy apps should use `no-store` while a static docs search might tolerate cached search results for short periods. + + diff --git a/docs/codedocs/test.md b/docs/codedocs/test.md new file mode 100644 index 0000000..20ea199 --- /dev/null +++ b/docs/codedocs/test.md @@ -0,0 +1,24 @@ +--- +title: "Components Showcase" +description: "Internal page used to validate MDX components and formatting conventions." +--- + +This page exists to validate the MDX component syntax and ensure the documentation build can render core components. It is not meant for end‑user documentation, but the content here is still written in a clear and consistent style to avoid build errors. If you are using this repository as a template, you can remove this page from navigation and treat it as a formatting reference for authors. 
+ +The most important rule is that MDX components in this project must not be self‑closing. Always use explicit opening and closing tags such as `...` or `...`. This page demonstrates that requirement with a simple set of components and a short narrative. + +When adding new documentation pages, keep lists flat, escape pipe characters in tables, and ensure JSX attributes are space‑separated. + +Below is a simple Cards grid that is valid for this MDX setup: + + + See how modules fit together + Learn the main abstractions + Explore the full API surface + + +If you want to test syntax locally, add a temporary page like this and verify that the build passes. Avoid adding non‑registered components or custom tags, as those will cause compilation errors in Fumadocs. Also avoid nested bullet lists; if you need hierarchy, use separate paragraphs or section headings instead. + +This file is deliberately kept in plain English and uses the same frontmatter format as the rest of the docs, which helps catch issues with title and description parsing. Feel free to replace this content with a more specific internal style guide, or delete the file entirely if you manage navigation exclusively through `meta.json` and do not want a playground page. + +If you keep the page, consider renaming it and adding it to the sidebar only for maintainers. It is useful as a smoke test after dependency upgrades because it exercises `Cards` and `Callout` rendering in the same build as the main documentation. diff --git a/docs/codedocs/types.md b/docs/codedocs/types.md new file mode 100644 index 0000000..c8e0447 --- /dev/null +++ b/docs/codedocs/types.md @@ -0,0 +1,82 @@ +--- +title: "Types" +description: "TypeScript types exported by Upstash Search JS and how to use them." +--- + +This SDK ships with TypeScript types that describe documents, search results, and configuration. The definitions below are taken from the source files so you can rely on them in your own code. 
+ +**Source**: `src/types.ts`, `src/client/search-client.ts`, `src/platforms/nodejs.ts`, `src/platforms/cloudflare.ts`, `src/client/metadata.ts` + +## Core data types +```ts +export type Dict = Record; + +export type UpsertParameters = { + id: string; + content: TContent; + metadata?: TIndexMetadata; +}; + +export type Document< + TContent extends Dict, + TMetadata extends Dict, + TWithScore extends boolean = false, +> = { + id: string; + content: TContent; + metadata?: TMetadata; +} & (TWithScore extends true ? { score: number } : {}); + +export type SearchResult = Document< + TContent, + TMetadata, + true +>[]; +``` + +These types let you strongly type content and metadata. For example, `SearchResult` ensures each result includes the fields your application expects and includes `score`. + +## Filter tree +```ts +export type TreeNode = + | Leaf> + | { OR: TreeNode[] } + | { AND: TreeNode[] }; +``` + +`TreeNode` lets you build structured filters. It merges content fields with metadata fields prefixed as `@metadata.` so you can filter across both namespaces without collisions. + +## Client configuration +```ts +export type RetryConfig = + | false + | { + retries?: number; + backoff?: (retryCount: number) => number; + }; + +export type RequesterConfig = { + retry?: RetryConfig; + cache?: "default" | "force-cache" | "no-cache" | "no-store" | "only-if-cached" | "reload" | false; +}; +``` + +`RequesterConfig` is included in the platform `ClientConfig` type, so you can configure retries and cache behavior regardless of runtime. + +## Platform client config +```ts +export type ClientConfig = { + url?: string; + token?: string; + enableTelemetry?: boolean; +} & RequesterConfig; +``` + +This type is exported in both `src/platforms/nodejs.ts` and `src/platforms/cloudflare.ts`. Use it when you want to type custom wrappers or factory helpers around `Search`. 
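The configuration types above can be exercised without the SDK installed. The following is a minimal sketch, not the library's implementation: `ResolvedConfig` and `resolveConfig` are hypothetical names, the type aliases are local copies of the definitions shown on this page, and the defaults (5 retries, `Math.exp(retryCount) * 50` backoff, telemetry enabled) are assumed from the Search Client and Cloudflare reference pages. Unlike the SDK, which only warns about whitespace in credentials, this wrapper trims them.

```ts
// Local copies of the config types documented above, so the sketch is self-contained.
type RetryConfig =
  | false
  | { retries?: number; backoff?: (retryCount: number) => number };

type RequesterConfig = {
  retry?: RetryConfig;
  cache?: "default" | "force-cache" | "no-cache" | "no-store" | "only-if-cached" | "reload" | false;
};

type ClientConfig = {
  url?: string;
  token?: string;
  enableTelemetry?: boolean;
} & RequesterConfig;

// Hypothetical shape: every optional field resolved to a concrete value.
type ResolvedConfig = {
  url: string;
  token: string;
  enableTelemetry: boolean;
  retries: number;
  backoff: (retryCount: number) => number;
  cache: RequesterConfig["cache"];
};

// Hypothetical helper (not part of the SDK): validates credentials and fills in defaults.
function resolveConfig(config: ClientConfig): ResolvedConfig {
  const url = config.url?.trim();
  const token = config.token?.trim();
  if (!url || !token) {
    throw new Error("Both url and token are required");
  }

  // Assumed default, taken from the Search Client page: exponential backoff.
  const defaultBackoff = (retryCount: number): number => Math.exp(retryCount) * 50;

  const retry = config.retry;
  // `retry: false` means a single attempt, so no retries at all.
  const retries = retry === false ? 0 : (retry?.retries ?? 5);
  const backoff = retry === false ? defaultBackoff : (retry?.backoff ?? defaultBackoff);

  return {
    url,
    token,
    enableTelemetry: config.enableTelemetry ?? true,
    retries,
    backoff,
    cache: config.cache,
  };
}

// Placeholder URL and token, for illustration only.
const example = resolveConfig({
  url: " https://example.upstash.io ",
  token: "secret",
  retry: { retries: 3 },
});
console.log(example.url, example.retries, example.backoff(0));
// prints "https://example.upstash.io 3 50"
```

Resolving the config once at startup keeps the optional fields out of the rest of your code: callers work with concrete `retries` and `backoff` values instead of re-checking for `false` or `undefined` at every call site.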
+ +## Re-exported vector types +```ts +export type { QueryResult, Index as VectorIndex } from "@upstash/vector"; +``` + +These types are re-exported from `@upstash/vector` and are useful when you need to interact with lower‑level vector operations or annotate advanced integrations.