A stateless RESTful service that provides five core capabilities: embeddings, similarity, themes, sentiment, and extractions.
The similarity endpoint supports two scenarios:
• Self-similarity: provide a single set of strings (set), and the service always returns a flattened upper-triangle similarity vector (excluding the diagonal). The response also includes the number of input strings (n) so the caller can reshape the full matrix.
• Cross-similarity: provide two distinct sets (set_a and set_b), and the service returns similarities only between items across the two sets (a |A| x |B| matrix). The representation is a matrix by default, or flattened when the flatten=true query parameter is supplied.
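The self-similarity response can be expanded back into a full matrix on the client. The sketch below assumes the flattened vector is in row-major order over pairs (i, j) with i < j, and that the omitted diagonal is 1.0 (cosine similarity of a string with itself); neither detail is stated here, and the helper name `unflatten_upper_triangle` is hypothetical.

```python
from typing import List, Sequence


def unflatten_upper_triangle(flat: Sequence[float], n: int) -> List[List[float]]:
    """Rebuild a symmetric n x n matrix from its strict upper triangle."""
    expected = n * (n - 1) // 2
    if len(flat) != expected:
        raise ValueError(f"expected {expected} values for n={n}, got {len(flat)}")
    # Assumption: the diagonal (self-similarity) is 1.0.
    matrix = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    k = 0
    # Assumption: values are ordered (0,1), (0,2), ..., (1,2), ... (row-major).
    for i in range(n):
        for j in range(i + 1, n):
            matrix[i][j] = matrix[j][i] = flat[k]
            k += 1
    return matrix


full = unflatten_upper_triangle([0.8, 0.1, 0.3], 3)
```

For n input strings the flattened vector holds n(n-1)/2 values, so the length check catches a mismatched n early.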
The themes endpoint groups a collection of open-ended responses (e.g., survey comments, product reviews) into latent themes. Each theme contains:
• shortLabel - a concise (2-4 word) name for dashboards or charts.
• label - a slightly longer descriptive title.
• description - 1-2 sentences summarizing the common idea captured by the theme.
• representatives - exactly two representative input strings for each theme.
Callers may optionally specify minThemes, maxThemes, and a free-text context string to steer clustering (e.g., "focus on UX issues").
The sentiment endpoint classifies each input string as positive, negative, neutral, or mixed and returns a confidence value ∈ [0, 1].
The service maintains internally versioned models. When the optional version
field is omitted, the latest production version is used. Supplying a version
locks behaviour to that specific model version, enabling reproducible results
even after future upgrades.
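One way to handle version pinning on the client is to add the version field only when a caller asks for it. This is a minimal sketch for an endpoint that takes inputs and version (such as sentiment); the helper name `build_request` and the version string used below are hypothetical, since the format of version identifiers is not given here.

```python
from typing import Any, Dict, List, Optional


def build_request(inputs: List[str], version: Optional[str] = None) -> Dict[str, Any]:
    """Build a request body, pinning the model version only when one is given."""
    body: Dict[str, Any] = {"inputs": list(inputs)}
    if version is not None:
        # Omitting the field lets the service pick the latest production version.
        body["version"] = version
    return body


latest = build_request(["I love this"])
pinned = build_request(["I love this"], version="example-version")  # hypothetical value
```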
Embeddings endpoint. Supports synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
Generates dense vector embeddings for input strings in a single batch. Supports synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
• Synchronous mode processes up to 200 input strings and returns embeddings immediately (HTTP 200).
• Asynchronous mode accepts up to 2,000 input strings and returns a jobId (HTTP 202) to poll via the /jobs endpoint.
| inputs required | Array of strings <= 2000 items List of input strings. For synchronous (fast=true) mode, max 200; for asynchronous (fast=false or omitted) mode, max 2000. |
| fast | boolean Flag indicating synchronous (true) or asynchronous (false) processing. Default false. |
| embeddings required | Array of objects (EmbeddingDocument) <= 200 items |
| requestId required | string |
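A client with more strings than one request allows has to split them up. The sketch below picks synchronous mode when everything fits in 200 items and otherwise cuts the list into asynchronous batches of at most 2,000. The helper name `plan_embedding_requests` is hypothetical, and how the fast flag is transmitted (body field or query parameter) is left to the caller.

```python
from typing import List, Tuple

SYNC_LIMIT = 200    # max inputs with fast=true
ASYNC_LIMIT = 2000  # max inputs with fast=false or omitted


def plan_embedding_requests(inputs: List[str]) -> List[Tuple[bool, List[str]]]:
    """Return (fast, batch) pairs that respect the documented size limits."""
    if len(inputs) <= SYNC_LIMIT:
        return [(True, list(inputs))]
    # Too large for synchronous mode: split into asynchronous batches.
    return [
        (False, inputs[start:start + ASYNC_LIMIT])
        for start in range(0, len(inputs), ASYNC_LIMIT)
    ]


batches = plan_embedding_requests(["x"] * 4500)
```

Each asynchronous batch would yield its own jobId, so the caller tracks one job per batch.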
Request sample:
{
  "inputs": [
    "Hello world",
    "Test input"
  ]
}
Response sample:
{
  "embeddings": [
    {
      "text": "Hello world",
      "vector": [0.1, 0.2, 0.3]
    },
    {
      "text": "Test input",
      "vector": [0.4, 0.5, 0.6]
    }
  ],
  "requestId": "example-request-id"
}
Similarity endpoint. Provides self- and cross-similarity computations with sync (fast=true) and async (fast=false or omitted) modes.
Computes pairwise cosine similarity between input strings.
Supports self-similarity (provide set) and cross-similarity (provide set_a and set_b), with synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
• In synchronous mode, self-similarity supports up to 100 input strings; cross-similarity requires |set_a|×|set_b| ≤ 10,000. Returns similarities immediately (HTTP 200).
• In asynchronous mode, supports larger inputs (self up to 44,721 items; cross with |set_a|×|set_b| ≤ 2,000,000,000) and returns a jobId (HTTP 202) to poll via the /jobs endpoint.
| set required | Array of strings [ 2 .. 44721 ] items Array of strings for self-similarity. For synchronous (fast=true), max 100; for asynchronous (fast=false or omitted), max 44721. |
| set_a | Array of strings non-empty Array of strings for cross-similarity. For synchronous (fast=true), ensure |set_a|×|set_b| ≤ 10000; for asynchronous (fast=false or omitted), ensure |set_a|×|set_b| ≤ 2000000000. |
| set_b | Array of strings non-empty Array of strings for cross-similarity. For synchronous (fast=true), ensure |set_a|×|set_b| ≤ 10000; for asynchronous (fast=false or omitted), ensure |set_a|×|set_b| ≤ 2000000000. |
| version | string |
| fast | boolean Flag indicating synchronous (true) or asynchronous (false) processing. Default false. |
| flatten | boolean For cross-similarity, flatten the matrix into a 1-D array. Ignored for self-similarity. Default false. |
| scenario required | string Enum: "self" "cross" |
| mode required | string Enum: "matrix" "flattened" |
| n required | integer |
| flattened required | Array of numbers <float> <= 2000000000 items |
| matrix | Array of arrays of numbers <float> <= 2000000000 items |
| requestId required | string |
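The size limits above can be checked on the client before a request is sent. The sketch below returns which mode a given input size fits into; the helper name `similarity_mode` and its parameters are hypothetical. As an aside, 44,721 is the largest integer whose square does not exceed 2,000,000,000, which appears consistent with the cross-similarity cap, though that relationship is not stated here.

```python
from typing import Optional

SYNC_SELF_MAX = 100
ASYNC_SELF_MAX = 44_721
SYNC_CROSS_MAX = 10_000
ASYNC_CROSS_MAX = 2_000_000_000


def similarity_mode(n_set: Optional[int] = None,
                    n_a: Optional[int] = None,
                    n_b: Optional[int] = None) -> str:
    """Return "sync" or "async" for the given sizes, or raise ValueError."""
    if n_set is not None:
        if n_a is not None or n_b is not None:
            raise ValueError("use either set or set_a/set_b, not both")
        if n_set < 2:
            raise ValueError("self-similarity needs at least 2 strings")
        if n_set <= SYNC_SELF_MAX:
            return "sync"
        if n_set <= ASYNC_SELF_MAX:
            return "async"
        raise ValueError("set exceeds the asynchronous limit")
    if n_a is None or n_b is None:
        raise ValueError("cross-similarity needs both set_a and set_b")
    if n_a < 1 or n_b < 1:
        raise ValueError("set_a and set_b must be non-empty")
    cells = n_a * n_b
    if cells <= SYNC_CROSS_MAX:
        return "sync"
    if cells <= ASYNC_CROSS_MAX:
        return "async"
    raise ValueError("|set_a| x |set_b| exceeds the asynchronous limit")
```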
Request sample:
{
  "set": [
    "alpha",
    "beta",
    "gamma"
  ]
}
Response sample:
{
  "scenario": "cross",
  "mode": "matrix",
  "matrix": [
    [1, 0.8],
    [0.8, 1]
  ],
  "flattened": [1, 0.8, 0.8, 1],
  "requestId": "example-request-id"
}
Themes endpoint. Clusters text into themes with sync (fast=true) and async (fast=false or omitted) modes.
Groups input strings into latent themes using LLM-based clustering. Supports synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
Each theme includes a shortLabel, label, description, and exactly two representative input strings.
Optionally control theme count with minThemes, maxThemes, and steer focus via context.
| inputs required | Array of strings [ 2 .. 500 ] items List of input strings. For synchronous (fast=true) mode, max 200; for asynchronous (fast=false or omitted) mode, max 500. |
| minThemes | integer >= 1 |
| maxThemes | integer <= 50 |
| context | string |
| version | string |
| prune | integer [ 0 .. 50 ] |
| fast | boolean Flag indicating synchronous (true) or asynchronous (false) processing. Default false. |
| themes required | Array of objects (Theme) <= 50 items |
| requestId required | string |
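The optional theme controls can be validated against the documented bounds before a request is built. This is a minimal sketch; the helper name `build_themes_request` is hypothetical, and the check that minThemes does not exceed maxThemes is an assumption rather than a documented rule.

```python
from typing import Any, Dict, List, Optional


def build_themes_request(inputs: List[str],
                         min_themes: Optional[int] = None,
                         max_themes: Optional[int] = None,
                         context: Optional[str] = None) -> Dict[str, Any]:
    """Build a themes request body, enforcing the documented bounds."""
    if not 2 <= len(inputs) <= 500:
        raise ValueError("inputs must contain between 2 and 500 strings")
    body: Dict[str, Any] = {"inputs": list(inputs)}
    if min_themes is not None:
        if min_themes < 1:
            raise ValueError("minThemes must be >= 1")
        body["minThemes"] = min_themes
    if max_themes is not None:
        if max_themes > 50:
            raise ValueError("maxThemes must be <= 50")
        body["maxThemes"] = max_themes
    # Assumption: the service expects minThemes <= maxThemes.
    if min_themes is not None and max_themes is not None and min_themes > max_themes:
        raise ValueError("minThemes must not exceed maxThemes")
    if context:
        body["context"] = context
    return body


request = build_themes_request(
    ["fast service", "slow response", "easy setup"],
    min_themes=2, max_themes=5, context="focus on UX issues",
)
```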
Request sample:
{
  "inputs": [
    "fast service",
    "slow response",
    "easy setup"
  ]
}
Response sample:
{
  "themes": [
    {
      "shortLabel": "UI Issue",
      "label": "User Interface Issues",
      "description": "Problems related to layout and design.",
      "representatives": [
        "Button not aligned",
        "Text too small"
      ]
    }
  ],
  "requestId": "example-request-id"
}
Sentiment endpoint. Classifies sentiment with sync (fast=true) and async (fast=false or omitted) modes.
Classifies the sentiment of each input string as positive, negative, neutral, or mixed, with confidence scores ∈ [0,1]. Supports synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
• In synchronous mode, processes up to 200 input strings and returns results immediately (HTTP 200).
• In asynchronous mode, accepts up to 10,000 input strings and returns a jobId (HTTP 202) to poll via the /jobs endpoint.
Optionally supply version for reproducible outputs.
| inputs required | Array of strings [ 1 .. 10000 ] items List of input strings. For synchronous (fast=true) mode, max 200; for asynchronous (fast=false or omitted) mode, max 10000. |
| version | string |
| fast | boolean Flag indicating synchronous (true) or asynchronous (false) processing. Default false. |
| results required | Array of objects (SentimentResult) <= 10000 items |
| requestId required | string |
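Since each result carries a confidence value, a caller may want to flag low-confidence classifications for review. The sketch below assumes that results[i] corresponds to inputs[i], which is not stated explicitly here; the helper name `low_confidence` and the threshold are hypothetical.

```python
from typing import Any, Dict, List, Tuple


def low_confidence(inputs: List[str],
                   results: List[Dict[str, Any]],
                   threshold: float) -> List[Tuple[str, str, float]]:
    """Return (text, sentiment, confidence) for results below the threshold."""
    if len(inputs) != len(results):
        raise ValueError("inputs and results must have the same length")
    # Assumption: results are returned in the same order as the inputs.
    return [
        (text, result["sentiment"], result["confidence"])
        for text, result in zip(inputs, results)
        if result["confidence"] < threshold
    ]


flagged = low_confidence(
    ["I love this", "I hate that"],
    [{"sentiment": "positive", "confidence": 0.95},
     {"sentiment": "negative", "confidence": 0.85}],
    threshold=0.9,
)
```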
Request sample:
{
  "inputs": [
    "I love this",
    "I hate that"
  ]
}
Response sample:
{
  "results": [
    {
      "sentiment": "positive",
      "confidence": 0.95
    },
    {
      "sentiment": "negative",
      "confidence": 0.85
    }
  ],
  "requestId": "example-request-id"
}
Extractions endpoint. Extracts elements matching themes with sync (fast=true) and async (fast=false or omitted) modes.
Extracts substrings from inputs that match the provided themes. Supports synchronous (fast=true) and asynchronous (fast=false or omitted) modes.
• Both modes support up to 200 input strings and up to 50 themes.
• Synchronous mode returns extraction results immediately (HTTP 200).
• Asynchronous mode returns a jobId (HTTP 202) to poll via the /jobs endpoint.
Returns a 3-dimensional array where extractions[i][j] contains matching elements for input i and theme j.
| inputs required | Array of strings [ 1 .. 200 ] items |
| themes required | Array of strings [ 1 .. 50 ] items |
| version | string |
| fast | boolean Flag indicating synchronous (true) or asynchronous (false) processing. Default false. |
| extractions required | Array of arrays of arrays of strings (each level <= 1000 items) 3D array of extracted elements, shape [inputs.length][themes.length][K] |
| requestId required | string |
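The nested extractions array can be flattened into one record per (input, theme) pair that matched something. This is a minimal sketch that relies only on the documented shape [inputs.length][themes.length][K]; the helper name `flatten_extractions` and the record layout are hypothetical.

```python
from typing import Any, Dict, List


def flatten_extractions(inputs: List[str],
                        themes: List[str],
                        extractions: List[List[List[str]]]) -> List[Dict[str, Any]]:
    """Turn extractions[i][j] into records, skipping pairs with no matches."""
    if len(extractions) != len(inputs):
        raise ValueError("outer dimension must match the number of inputs")
    records: List[Dict[str, Any]] = []
    for text, per_theme in zip(inputs, extractions):
        if len(per_theme) != len(themes):
            raise ValueError("second dimension must match the number of themes")
        for theme, elements in zip(themes, per_theme):
            if elements:
                records.append({"input": text, "theme": theme, "elements": list(elements)})
    return records


records = flatten_extractions(
    ["The food was great and the service was slow."],
    ["food", "service"],
    [[["food was great"], ["service was slow"]]],
)
```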
Request sample:
{
  "inputs": [
    "The food was great and the service was slow."
  ],
  "themes": [
    "food",
    "service"
  ]
}
Response sample:
{
  "extractions": [
    [
      ["food was great"],
      ["service was slow"]
    ]
  ],
  "requestId": "example-request-id"
}
Jobs endpoint. Poll job status for asynchronous requests generated by other endpoints.
Retrieves the status of a previously submitted long-running job.
Returns pending, completed, or failed. When completed, includes
a resultUrl to download results.
| jobId required | string Unique identifier for the job. |
| jobId | string |
| status required | string Enum: "pending" "completed" "failed" |
| resultUrl | string |
Response sample:
{
  "jobId": "string",
  "status": "pending",
  "resultUrl": "string"
}
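A polling loop for asynchronous jobs might look like the sketch below. It takes the status lookup as a caller-supplied function so that no URL layout or HTTP client is assumed; the names `wait_for_job` and `fetch_status`, the polling interval, and the attempt limit are all hypothetical.

```python
import time
from typing import Any, Callable, Dict


def wait_for_job(fetch_status: Callable[[str], Dict[str, Any]],
                 job_id: str,
                 interval_s: float = 2.0,
                 max_attempts: int = 30,
                 sleep: Callable[[float], None] = time.sleep) -> Dict[str, Any]:
    """Poll until the job is completed; raise if it fails or never finishes."""
    for attempt in range(max_attempts):
        job = fetch_status(job_id)  # expected to return the jobs response body
        status = job["status"]
        if status == "completed":
            return job  # includes resultUrl when the job has completed
        if status == "failed":
            raise RuntimeError(f"job {job_id} failed")
        if attempt < max_attempts - 1:
            sleep(interval_s)  # still pending: wait before the next poll
    raise TimeoutError(f"job {job_id} still pending after {max_attempts} attempts")
```

Passing `sleep` as a parameter keeps the loop easy to exercise without real delays; a caller would normally leave it at the default.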