# Taxonomy

## List Fields

**get** `/v1/taxonomy/fields`

Returns the feedback fields that can be used to generate a taxonomy for a tenant, along with the number of text records and embedded records available per field scope (source_type, source_id, field_id). A field with no attributed source is exposed under the canonical "no source" bucket (empty source_id). Requires Hub embeddings to be configured; otherwise the endpoint returns 503.

### Query Parameters

- `tenant_id: string`
  Tenant whose taxonomy-capable fields should be listed.

### Returns

- `data: array of object { embedding_count, field_id, record_count, 5 more }`
  - `embedding_count: number`
    Number of records in the scope that have an embedding.
  - `field_id: string`
  - `record_count: number`
    Number of text feedback records in the scope.
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `tenant_id: string`
  - `field_label: optional string`
  - `source_name: optional string`

### Example

```http
curl "http://localhost:8080/v1/taxonomy/fields?tenant_id=org-123" \
    -H "Authorization: Bearer $HUB_API_KEY"
```

## Domain Types

### Run

- `Run = object { id, cluster_count, created_at, 16 more }`
  A persisted taxonomy generation run.
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Node

- `Node = object { id, created_at, label, 13 more }`
  A node in a taxonomy tree. Non-root nodes have a parent; leaf nodes reference the cluster they summarize.
  - `id: string`
  - `created_at: string`
  - `label: string`
  - `level: number`
    Depth in the tree; the root is level 0.
  - `node_type: "root" or "branch" or "leaf"`
    Position of a node within the taxonomy tree.
    - `"root"`
    - `"branch"`
    - `"leaf"`
  - `run_id: string`
  - `sort_order: number`
  - `updated_at: string`
  - `children: optional array of Node`
    Child nodes, present when the tree is returned hierarchically.
  - `cluster_id: optional string`
    Cluster this node summarizes; typically present on leaf nodes.
  - `description: optional string`
  - `metadata: optional map[unknown]`
  - `original_label: optional string`
    Label as originally generated, before any rename.
  - `parent_id: optional string`
    Parent node ID; absent for the root node.
  - `removed_at: optional string`
    Set when the node has been soft-removed.
  - `removed_by: optional string`
    Actor that soft-removed the node.

# Runs

## List

**get** `/v1/taxonomy/runs`

Returns taxonomy run history for a tenant, most recent first. Optionally filter by source_type, field_id, and source_id. source_id is a tri-state filter: omit it for no source filter, pass an empty string to scope to the canonical "no source" bucket, or pass a concrete value to match that source.
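
The three `source_id` cases can be sketched as a small shell helper that builds the request URL. This is only an illustration: `runs_url`, `BASE_URL`, and the source value `survey-1` are hypothetical names, not part of the API, and the sketch assumes that an empty string is sent as `source_id=` with no value and that the values are already URL-safe.

```shell
# Hypothetical helper (not part of the API): build the List Runs URL.
# BASE_URL is an assumption; it defaults to the host used in the examples.
BASE_URL="${BASE_URL:-http://localhost:8080}"

runs_url() {
  # $1 = tenant_id (required), $2 = source_id (optional, may be empty)
  url="$BASE_URL/v1/taxonomy/runs?tenant_id=$1"
  # ${2+set} expands to "set" whenever a second argument was passed,
  # even an empty one, so "omitted" and "empty" stay distinguishable.
  if [ -n "${2+set}" ]; then
    url="$url&source_id=$2"
  fi
  printf '%s\n' "$url"
}

runs_url org-123            # omitted: no source filter
runs_url org-123 ""         # empty string: the "no source" bucket
runs_url org-123 survey-1   # concrete value: that source only

# Possible use with curl (not executed here):
#   curl "$(runs_url org-123 "")" -H "Authorization: Bearer $HUB_API_KEY"
```

The point of the `${2+set}` test is that a plain `[ -n "$2" ]` check would treat the empty string like an omitted filter and silently drop the "no source" case.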

### Query Parameters

- `tenant_id: string`
  Tenant whose runs should be listed.
- `field_id: optional string`
  Optional field_id filter.
- `limit: optional number`
  Maximum number of runs to return.
- `source_id: optional string`
  Optional source_id filter. Omit for no filter; empty string scopes to the "no source" bucket; a concrete value matches that source.
- `source_type: optional string`
  Optional source_type filter.

### Returns

- `data: array of Run`
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Example

```http
curl "http://localhost:8080/v1/taxonomy/runs?tenant_id=org-123" \
    -H "Authorization: Bearer $HUB_API_KEY"
```

## Start

**post** `/v1/taxonomy/runs`

Starts a manual taxonomy generation run for a field scope.
Hub validates that the field has enough embedded text feedback (a count below the configured minimum returns 400 with an "insufficient data" validation error), creates the run, and hands it to the taxonomy compute service. Idempotent per scope: if a run is already pending or running for the same scope, the existing run is returned with `in_progress: true` (HTTP 200) instead of starting a new one; a newly created run returns HTTP 202 with `in_progress: false`. While a tenant data purge runs for the same tenant_id, the request is rejected with HTTP 409 (code `tenant_write_conflict`) and may be retried. Requires Hub embeddings and the taxonomy service to be configured; otherwise returns 503.

### Body Parameters

- `field_id: string`
- `source_type: string`
- `tenant_id: string`
- `actor_id: optional string`
  Optional identifier of the actor starting the run.
- `field_label: optional string`
  Optional human-readable field label.
- `source_id: optional string`
  Optional; empty or omitted is the canonical "no source" bucket.

### Returns

- `in_progress: boolean`
  True when an existing pending/running run for the scope was returned instead of starting a new one.
- `run: Run`
  A persisted taxonomy generation run.
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Example

```http
curl http://localhost:8080/v1/taxonomy/runs \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $HUB_API_KEY" \
    -d '{
          "field_id": "feedback",
          "source_type": "formbricks",
          "tenant_id": "org-123"
        }'
```

## Retrieve

**get** `/v1/taxonomy/runs/{run_id}`

Returns a single taxonomy run by ID, scoped to the tenant. Returns 404 if the run does not belong to the tenant.

### Path Parameters

- `run_id: string`

### Query Parameters

- `tenant_id: string`
  Tenant that owns the run.

### Returns

- `Run = object { id, cluster_count, created_at, 16 more }`
  A persisted taxonomy generation run.
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Example

```http
curl "http://localhost:8080/v1/taxonomy/runs/$RUN_ID?tenant_id=org-123" \
    -H "Authorization: Bearer $HUB_API_KEY"
```

## Get Tree

**get** `/v1/taxonomy/runs/{run_id}/tree`

Returns the run and its taxonomy tree (visible nodes only; soft-removed nodes are excluded). Tenant-scoped; returns 404 if the run does not belong to the tenant.

### Path Parameters

- `run_id: string`

### Query Parameters

- `tenant_id: string`
  Tenant that owns the run.

### Returns

- `root: Node`
  A node in a taxonomy tree. Non-root nodes have a parent; leaf nodes reference the cluster they summarize.
  - `id: string`
  - `created_at: string`
  - `label: string`
  - `level: number`
    Depth in the tree; the root is level 0.
  - `node_type: "root" or "branch" or "leaf"`
    Position of a node within the taxonomy tree.
    - `"root"`
    - `"branch"`
    - `"leaf"`
  - `run_id: string`
  - `sort_order: number`
  - `updated_at: string`
  - `children: optional array of Node`
    Child nodes, present when the tree is returned hierarchically.
  - `cluster_id: optional string`
    Cluster this node summarizes; typically present on leaf nodes.
  - `description: optional string`
  - `metadata: optional map[unknown]`
  - `original_label: optional string`
    Label as originally generated, before any rename.
  - `parent_id: optional string`
    Parent node ID; absent for the root node.
  - `removed_at: optional string`
    Set when the node has been soft-removed.
  - `removed_by: optional string`
    Actor that soft-removed the node.
- `run: Run`
  A persisted taxonomy generation run.
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Example

```http
curl "http://localhost:8080/v1/taxonomy/runs/$RUN_ID/tree?tenant_id=org-123" \
    -H "Authorization: Bearer $HUB_API_KEY"
```

# Active

## Get Tree

**get** `/v1/taxonomy/runs/active/tree`

Returns the currently active taxonomy run and its tree for a field scope. Exactly one run is active per scope at a time. Returns 404 when no run has been activated for the scope.

### Query Parameters

- `field_id: string`
  Field ID of the scope.
- `source_type: string`
  Source type of the scope.
- `tenant_id: string`
  Tenant that owns the scope.
- `source_id: optional string`
  Source ID of the scope; empty string is the canonical "no source" bucket.

### Returns

- `root: Node`
  A node in a taxonomy tree. Non-root nodes have a parent; leaf nodes reference the cluster they summarize.
  - `id: string`
  - `created_at: string`
  - `label: string`
  - `level: number`
    Depth in the tree; the root is level 0.
  - `node_type: "root" or "branch" or "leaf"`
    Position of a node within the taxonomy tree.
    - `"root"`
    - `"branch"`
    - `"leaf"`
  - `run_id: string`
  - `sort_order: number`
  - `updated_at: string`
  - `children: optional array of Node`
    Child nodes, present when the tree is returned hierarchically.
  - `cluster_id: optional string`
    Cluster this node summarizes; typically present on leaf nodes.
  - `description: optional string`
  - `metadata: optional map[unknown]`
  - `original_label: optional string`
    Label as originally generated, before any rename.
  - `parent_id: optional string`
    Parent node ID; absent for the root node.
  - `removed_at: optional string`
    Set when the node has been soft-removed.
  - `removed_by: optional string`
    Actor that soft-removed the node.
- `run: Run`
  A persisted taxonomy generation run.
  - `id: string`
  - `cluster_count: number`
  - `created_at: string`
  - `embedding_count: number`
  - `field_id: string`
  - `node_count: number`
  - `record_count: number`
  - `source_id: string`
    Empty string is the canonical "no source" bucket.
  - `source_type: string`
  - `status: "pending" or "running" or "succeeded" or 2 more`
    Lifecycle state of a taxonomy run. Allowed transitions are pending -> running|failed|canceled and running -> succeeded|failed|canceled.
    - `"pending"`
    - `"running"`
    - `"succeeded"`
    - `"failed"`
    - `"canceled"`
  - `tenant_id: string`
  - `updated_at: string`
  - `error: optional string`
    Sanitized failure message; present on failed runs.
  - `error_code: optional "insufficient_data" or "service_unavailable" or "generation_failed" or 2 more`
    Machine-readable reason a taxonomy run failed or a prerequisite was not met.
    - `"insufficient_data"`
    - `"service_unavailable"`
    - `"generation_failed"`
    - `"invalid_output"`
    - `"internal_error"`
  - `field_label: optional string`
    Human-readable field label; absent when unknown.
  - `finished_at: optional string`
  - `metrics: optional map[unknown]`
    Opaque run metrics recorded by the taxonomy service.
  - `params: optional map[unknown]`
    Opaque run parameters recorded by Hub.
  - `started_at: optional string`

### Example

```http
curl "http://localhost:8080/v1/taxonomy/runs/active/tree?tenant_id=org-123&source_type=formbricks&field_id=feedback" \
    -H "Authorization: Bearer $HUB_API_KEY"
```

# Nodes

## Rename

**patch** `/v1/taxonomy/nodes/{node_id}`

Renames a taxonomy node's label and records a rename event attributed to actor_id. Tenant-scoped; returns 404 if the node does not belong to the tenant. While a tenant data purge runs for the same tenant_id, the request is rejected with HTTP 409 (code `tenant_write_conflict`) and may be retried.

### Path Parameters

- `node_id: string`

### Body Parameters

- `actor_id: string`
- `label: string`
  New node label.
- `tenant_id: string`

### Returns

- `Node = object { id, created_at, label, 13 more }`
  A node in a taxonomy tree. Non-root nodes have a parent; leaf nodes reference the cluster they summarize.
  - `id: string`
  - `created_at: string`
  - `label: string`
  - `level: number`
    Depth in the tree; the root is level 0.
  - `node_type: "root" or "branch" or "leaf"`
    Position of a node within the taxonomy tree.
    - `"root"`
    - `"branch"`
    - `"leaf"`
  - `run_id: string`
  - `sort_order: number`
  - `updated_at: string`
  - `children: optional array of Node`
    Child nodes, present when the tree is returned hierarchically.
  - `cluster_id: optional string`
    Cluster this node summarizes; typically present on leaf nodes.
  - `description: optional string`
  - `metadata: optional map[unknown]`
  - `original_label: optional string`
    Label as originally generated, before any rename.
  - `parent_id: optional string`
    Parent node ID; absent for the root node.
  - `removed_at: optional string`
    Set when the node has been soft-removed.
  - `removed_by: optional string`
    Actor that soft-removed the node.

### Example

```http
curl http://localhost:8080/v1/taxonomy/nodes/$NODE_ID \
    -X PATCH \
    -H 'Content-Type: application/json' \
    -H "Authorization: Bearer $HUB_API_KEY" \
    -d '{
          "actor_id": "user-42",
          "label": "Authentication Problems",
          "tenant_id": "org-123"
        }'
```

## Soft Remove

**delete** `/v1/taxonomy/nodes/{node_id}`

Soft-removes a taxonomy node (sets removed_at/removed_by) and records a soft_remove event attributed to actor_id. The node is retained for audit but excluded from tree responses. Tenant-scoped; returns 404 if the node does not belong to the tenant. While a tenant data purge runs for the same tenant_id, the request is rejected with HTTP 409 (code `tenant_write_conflict`).

### Path Parameters

- `node_id: string`

### Query Parameters

- `actor_id: string`
  Identifier of the actor performing the removal (recorded in the audit event).
- `tenant_id: string`
  Tenant that owns the node.

### Returns

- `Node = object { id, created_at, label, 13 more }`
  A node in a taxonomy tree. Non-root nodes have a parent; leaf nodes reference the cluster they summarize.
  - `id: string`
  - `created_at: string`
  - `label: string`
  - `level: number`
    Depth in the tree; the root is level 0.
  - `node_type: "root" or "branch" or "leaf"`
    Position of a node within the taxonomy tree.
    - `"root"`
    - `"branch"`
    - `"leaf"`
  - `run_id: string`
  - `sort_order: number`
  - `updated_at: string`
  - `children: optional array of Node`
    Child nodes, present when the tree is returned hierarchically.
  - `cluster_id: optional string`
    Cluster this node summarizes; typically present on leaf nodes.
  - `description: optional string`
  - `metadata: optional map[unknown]`
  - `original_label: optional string`
    Label as originally generated, before any rename.
  - `parent_id: optional string`
    Parent node ID; absent for the root node.
  - `removed_at: optional string`
    Set when the node has been soft-removed.
  - `removed_by: optional string`
    Actor that soft-removed the node.

### Example

```http
curl "http://localhost:8080/v1/taxonomy/nodes/$NODE_ID?tenant_id=org-123&actor_id=user-42" \
    -X DELETE \
    -H "Authorization: Bearer $HUB_API_KEY"
```

## List Records

**get** `/v1/taxonomy/nodes/{node_id}/records`

Returns the feedback records assigned to a node and all of its (visible) descendant nodes, via the clusters those nodes reference. Tenant-scoped. The `limit` in the response reflects the applied cap.

### Path Parameters

- `node_id: string`

### Query Parameters

- `tenant_id: string`
  Tenant that owns the node.
- `limit: optional number`
  Maximum number of feedback records to return.

### Returns

- `data: array of FeedbackRecordData`
  - `id: string`
    UUIDv7 primary key
  - `collected_at: string`
    When the feedback was collected
  - `created_at: string`
    When this record was created
  - `field_id: string`
    Identifier for the question/field
  - `field_type: "text" or "categorical" or "nps" or 6 more`
    Type of field
    - `"text"`
    - `"categorical"`
    - `"nps"`
    - `"csat"`
    - `"ces"`
    - `"rating"`
    - `"number"`
    - `"boolean"`
    - `"date"`
  - `source_type: string`
    Type of feedback source
  - `submission_id: string`
    Identifier for the logical submission this record belongs to (required).
  - `tenant_id: string`
    Tenant/organization identifier. NULL bytes not allowed.
  - `updated_at: string`
    When this record was last updated
  - `field_group_id: optional string`
    Stable identifier grouping related fields (for ranking, matrix, grid questions)
  - `field_group_label: optional string`
    Human-readable question text for the group
  - `field_label: optional string`
    The actual question text
  - `language: optional string`
    ISO language code. NULL bytes not allowed.
  - `metadata: optional map[unknown]`
    Additional context
  - `sentiment: optional "very_negative" or "negative" or "neutral" or 3 more`
    Sentiment polarity inferred from value_text (sentiment enrichment).
    Read-only; absent until the record is enriched.
    - `"very_negative"`
    - `"negative"`
    - `"neutral"`
    - `"positive"`
    - `"very_positive"`
    - `"mixed"`
  - `sentiment_score: optional number`
    Signed sentiment polarity from -1.0 (very negative) to 1.0 (very positive) (sentiment enrichment). Read-only; absent until the record is enriched.
  - `source_id: optional string`
    Reference to survey/form/ticket ID
  - `source_name: optional string`
    Human-readable name
  - `translation_lang_key: optional string`
    BCP-47 target locale that value_text_translated was produced in (language enrichment). Read-only; absent until the record is enriched.
  - `user_id: optional string`
    User ID (e.g., anonymous ID or email hash)
  - `value_boolean: optional boolean`
    Boolean response
  - `value_date: optional string`
    Date response
  - `value_number: optional number`
    Numeric response
  - `value_text: optional string`
    Text response. NULL bytes not allowed.
  - `value_text_translated: optional string`
    value_text translated into the tenant's configured target language (language enrichment). Read-only; absent until the record is enriched.
- `limit: number`
  The applied maximum number of records.

### Example

```http
curl "http://localhost:8080/v1/taxonomy/nodes/$NODE_ID/records?tenant_id=org-123" \
    -H "Authorization: Bearer $HUB_API_KEY"
```
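
When `limit` is passed alongside `tenant_id`, letting curl assemble and encode the query string may be less error-prone than concatenating it by hand. The sketch below wraps that in a function; `list_node_records`, `BASE_URL`, and the `DRY_RUN` switch are hypothetical conveniences rather than part of the API, and the value `50` is only an example.

```shell
# Hypothetical wrapper (not part of the API) around the List Records call.
# BASE_URL is an assumption; it defaults to the host used in the examples.
BASE_URL="${BASE_URL:-http://localhost:8080}"

list_node_records() {
  # $1 = node_id, $2 = tenant_id, $3 = limit (optional)
  node_id=$1 tenant_id=$2 limit=${3:-}

  # Collect the curl arguments in the positional parameters.
  # -G sends the --data-urlencode pairs as a query string on a GET request.
  set -- -sS -G "$BASE_URL/v1/taxonomy/nodes/$node_id/records" \
    -H "Authorization: Bearer ${HUB_API_KEY:-}" \
    --data-urlencode "tenant_id=$tenant_id"
  if [ -n "$limit" ]; then
    set -- "$@" --data-urlencode "limit=$limit"
  fi

  if [ -n "${DRY_RUN:-}" ]; then
    printf '%s\n' "$@"   # print one argument per line instead of calling curl
  else
    curl "$@"
  fi
}

# Inspect the request without sending it:
DRY_RUN=1 list_node_records "${NODE_ID:-node-1}" org-123 50
```

Because the response carries the applied `limit`, a caller can compare it with the value it requested to see whether the server used a different cap.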