This page shows you how to run a semantic search over a Catalog using a text prompt.
This API performs semantic search by embedding user queries with the same Indexing Embed VDP Pipeline that is used by the Process Files operation. The user query embedding is then compared to the embeddings of the chunks in the specified Catalog to find and return the most contextually similar chunks.
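The ranking step described above can be illustrated with a toy sketch. It assumes cosine similarity as the comparison measure and uses hand-written 3-dimensional vectors; the source does not state which similarity measure the service uses, and real embeddings come from the embedding pipeline. The names `cosine_similarity` and `top_k_chunks` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_chunks(query_embedding, chunk_embeddings, k=5):
    """Return (chunk_id, score) pairs for the k chunks most similar to the query."""
    scored = [
        (chunk_id, cosine_similarity(query_embedding, embedding))
        for chunk_id, embedding in chunk_embeddings.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy embeddings (assumption: illustrative values only).
chunks = {
    "chunk-a": [1.0, 0.0, 0.0],
    "chunk-b": [0.7, 0.7, 0.0],
    "chunk-c": [0.0, 0.0, 1.0],
}
ranked = top_k_chunks([1.0, 0.1, 0.0], chunks, k=2)
print(ranked)
```

Here the query vector points mostly along the first axis, so `chunk-a` ranks first and `chunk-b` second, while `chunk-c` is dropped by `k=2`.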
#Retrieve Chunks via API
The `Retrieve Chunks` API allows you to perform a semantic search within a Catalog by providing a text prompt.
This operation returns the most contextually similar chunks based on the provided text.
In the request, replace `NAMESPACE_ID` with the Catalog owner's ID (namespace), `CATALOG_ID` with the identifier of the Catalog you are searching, and `REQUESTER_UID` with the UID of the entity (for example, an organization) on whose behalf the request is being made. `REQUESTER_UID` is optional: if you are not making the request on behalf of another entity, omit the `Instill-Requester-Uid` header.
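As a rough sketch of how these placeholders fit together, the helpers below assemble a request URL and headers. The host `https://api.example.com` is a stand-in, and the path layout is an assumption, since neither appears on this page; check the API reference for the actual endpoint. The function names are hypothetical. Only the `Authorization` and `Instill-Requester-Uid` headers come from this page.

```python
from urllib.parse import quote

# Assumption: placeholder host, not the real service address.
BASE_URL = "https://api.example.com"


def build_retrieve_url(namespace_id, catalog_id, base_url=BASE_URL):
    """Build the request URL. The path layout is an assumed, illustrative shape."""
    namespace = quote(namespace_id, safe="")
    catalog = quote(catalog_id, safe="")
    return f"{base_url}/v1alpha/namespaces/{namespace}/catalogs/{catalog}/chunks/retrieve"


def build_headers(api_token, requester_uid=None):
    """Return request headers, adding Instill-Requester-Uid only when a UID is given."""
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_token}",
    }
    if requester_uid:
        headers["Instill-Requester-Uid"] = requester_uid
    return headers


print(build_retrieve_url("my-namespace", "my-catalog"))
print(build_headers("my-token"))
```

Leaving `requester_uid` unset keeps the optional header out of the request entirely, which matches the case where you act on your own behalf.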
#Body Parameters
- `textPrompt` (string, required): The text prompt to search for in the Catalog.
- `topK` (integer, optional): The number of similar chunks to return. Defaults to 5.
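A minimal sketch of assembling the JSON body from these two parameters follows. The helper name `build_body` is hypothetical, and the checks for a non-empty prompt and a positive `topK` are assumptions about sensible input rather than documented server-side rules.

```python
import json


def build_body(text_prompt, top_k=None):
    """Serialize the request body; topK is left out so the server default (5) applies."""
    # Assumption: an empty prompt is treated as invalid on the client side.
    if not isinstance(text_prompt, str) or not text_prompt.strip():
        raise ValueError("textPrompt is required and must be a non-empty string")
    body = {"textPrompt": text_prompt}
    if top_k is not None:
        # Assumption: topK must be a positive integer (bool is rejected explicitly).
        if isinstance(top_k, bool) or not isinstance(top_k, int) or top_k < 1:
            raise ValueError("topK must be a positive integer")
        body["topK"] = top_k
    return json.dumps(body)


print(build_body("What is Instill Core?"))
print(build_body("What is Instill Core?", top_k=3))
```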
#Example Response
A successful response will return a list of similar chunks found in the Catalog:

    {
      "similarChunks": [
        {
          "chunkUid": "ba30f524-889c-4dc7-82a2-33a8f7be2d47",
          "similarityScore": 0.95,
          "textContent": "Instill Core is a full-stack AI solution to accelerate AI development...",
          "sourceFile": "core-intro.txt"
        },
        {
          "chunkUid": "757ab6d9-e5b4-482e-8017-5582b578e57a",
          "similarityScore": 0.90,
          "textContent": "Transform unstructured data into a knowledge base with a unified format...",
          "sourceFile": "catalog-intro.pdf"
        }
      ]
    }
#Output Description
- `similarChunks` (array of objects): An array where each object represents a similar chunk found in the Catalog.
  - `chunkUid` (string): The unique identifier of the chunk.
  - `similarityScore` (number): The similarity score between the input text prompt and the chunk content. Scores range from 0 to 1, with higher scores indicating greater relevance.
  - `textContent` (string): The content of the similar chunk.
  - `sourceFile` (string): The name of the source file from which the chunk was extracted.
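One way to consume these fields is sketched below: parse the response, keep chunks above a score threshold, and group the survivors by source file. The helper names and the `0.92` threshold are illustrative assumptions; the sample payload is a shortened copy of the example response.

```python
import json
from collections import defaultdict


def filter_chunks(response_text, min_score=0.0):
    """Return chunks scoring at least min_score, highest score first."""
    data = json.loads(response_text)
    chunks = data.get("similarChunks", [])
    kept = [c for c in chunks if c["similarityScore"] >= min_score]
    kept.sort(key=lambda c: c["similarityScore"], reverse=True)
    return kept


def group_by_source(chunks):
    """Map each sourceFile to the list of chunk texts extracted from it."""
    grouped = defaultdict(list)
    for chunk in chunks:
        grouped[chunk["sourceFile"]].append(chunk["textContent"])
    return dict(grouped)


sample = json.dumps({
    "similarChunks": [
        {"chunkUid": "ba30f524-889c-4dc7-82a2-33a8f7be2d47", "similarityScore": 0.95,
         "textContent": "Instill Core is a full-stack AI solution...",
         "sourceFile": "core-intro.txt"},
        {"chunkUid": "757ab6d9-e5b4-482e-8017-5582b578e57a", "similarityScore": 0.90,
         "textContent": "Transform unstructured data into a knowledge base...",
         "sourceFile": "catalog-intro.pdf"},
    ]
})

relevant = filter_chunks(sample, min_score=0.92)  # illustrative threshold
print(group_by_source(relevant))
```

Since scores are relative to the query, a fixed threshold like this is best treated as something to tune per Catalog rather than a universal cutoff.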
Notes:
- Ensure that the `Authorization` header contains a valid API token with the `Bearer` prefix.
- The `Instill-Requester-Uid` header is optional and should be included if the authenticated user is making the request on behalf of another entity, such as an organization they belong to.
- Adjust the `topK` parameter based on how many context chunks you want to retrieve for your search. If omitted, it defaults to 5.
- The API performs semantic search using embeddings, so the results will be based on contextual similarity rather than exact keyword matches.
#Error Responses
- `401 Unauthorized`: Returned when the client credentials are not valid. Ensure your API token is correct and has the necessary permissions.
- `default`: An unexpected error response. The response will include an `rpcStatus` object with details about the error.
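A client might turn these two cases into readable messages along the following lines. This is a sketch under an assumption: the `rpcStatus` object is taken to carry `code` and `message` fields, as in the common `google.rpc.Status` shape, which this page does not confirm. The helper name `describe_error` is hypothetical.

```python
import json


def describe_error(status_code, body_text):
    """Summarize an error response as a human-readable string."""
    if status_code == 401:
        return "Unauthorized: check that the API token is valid and has the necessary permissions."
    try:
        status = json.loads(body_text)
    except (TypeError, ValueError):
        status = {}  # body was missing or not JSON
    if not isinstance(status, dict):
        status = {}
    # Assumption: rpcStatus exposes "code" and "message" fields.
    message = status.get("message") or "no message provided"
    code = status.get("code")
    return f"Unexpected error (HTTP {status_code}, rpc code {code}): {message}"


print(describe_error(401, ""))
print(describe_error(500, '{"code": 13, "message": "internal error"}'))
```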