Pinecone

The Pinecone component is a data component that allows users to build and search vector datasets. It can carry out the following tasks: Query, Upsert, Batch Upsert, and Rerank.

#Release Stage

Alpha

#Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.

#Setup

In order to communicate with Pinecone, the following connection details need to be provided. You may specify them directly in a pipeline recipe as key-value pairs within the component's setup block, or you can create a Connection from the Integration Settings page and reference the whole setup as setup: ${connection.<my-connection-id>}.

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| API Key (required) | api-key | string | Fill in your Pinecone API key. You can create an API key in the Pinecone Console. |
| Pinecone Index URL | url | string | Fill in your Pinecone index URL. It is in the form. |
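
As an illustration of the two fields above, the following Python sketch builds a setup mapping and checks it against the table. The helper name validate_setup and the placeholder values are hypothetical and not part of the component; only the field IDs api-key and url come from the table.

```python
def validate_setup(setup: dict) -> dict:
    """Check a setup mapping against the fields listed in the table above.

    Hypothetical helper for illustration only; the component performs its
    own validation.
    """
    allowed = {"api-key", "url"}
    unknown = set(setup) - allowed
    if unknown:
        raise ValueError(f"unknown setup fields: {sorted(unknown)}")

    # api-key is the only field marked as required.
    api_key = setup.get("api-key")
    if not isinstance(api_key, str) or not api_key:
        raise ValueError("api-key is required and must be a non-empty string")

    # url is optional, but must be a string when given.
    if "url" in setup and not isinstance(setup["url"], str):
        raise ValueError("url must be a string")
    return setup


# Placeholder values, not real credentials or a real index URL.
setup = validate_setup({"api-key": "<your-api-key>", "url": "<your-index-url>"})
```

In practice, referencing a Connection keeps the API key out of the recipe itself.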

#Supported Tasks

#Query

Retrieve the ids of the most similar items in a namespace, along with their similarity scores.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_QUERY |
| ID | id | string | The unique ID of the vector to be used as a query vector. If present, the vector parameter will be ignored. |
| Vector (required) | vector | array[number] | The query vector, given as an array of numbers with one entry per dimension. |
| Top K (required) | top-k | integer | The number of results to return for each query. |
| Namespace | namespace | string | The namespace to query. |
| Filter | filter | object | The filter to apply. You can use vector metadata to limit your search. See the Pinecone documentation on metadata filtering for details. |
| Minimum Score | min-score | number | Exclude results whose score is below this value. |
| Include Metadata | include-metadata | boolean | Indicates whether metadata is included in the response as well as the IDs. |
| Include Values | include-values | boolean | Indicates whether vector values are included in the response. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Namespace | namespace | string | The namespace of the query. |
| Matches | matches | array[object] | The matches returned for the query. |

Output Objects in Query

Matches

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| ID | id | string | The ID of the matched vector. |
| Metadata | metadata | object | Metadata. |
| Score | score | number | A measure of similarity between this vector and the query vector. The higher the score, the more similar they are. |
| Values | values | array | Vector data values. |
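
To make the shape of the Query input and output concrete, here is a minimal Python sketch. The field IDs come from the tables above; the vector values, namespace name, sample matches, and the helper apply_min_score are made up for illustration. The helper only mirrors the documented meaning of min-score and is not the component's implementation.

```python
# Example TASK_QUERY input using the field IDs from the input table.
# All values are illustrative.
query_input = {
    "task": "TASK_QUERY",
    "vector": [0.1, 0.2, 0.3],
    "top-k": 2,
    "namespace": "example-namespace",
    "min-score": 0.5,
    "include-metadata": True,
}


def apply_min_score(matches, min_score):
    """Drop matches whose score is below min_score (hypothetical helper)."""
    return [m for m in matches if m["score"] >= min_score]


# A made-up output with the documented shape: a namespace plus a list of
# matches, each carrying an id, a score and (optionally) metadata.
sample_output = {
    "namespace": "example-namespace",
    "matches": [
        {"id": "a", "score": 0.92, "metadata": {"genre": "news"}},
        {"id": "b", "score": 0.41, "metadata": {"genre": "blog"}},
    ],
}

kept = apply_min_score(sample_output["matches"], query_input["min-score"])
```

With these sample values, only the match with id "a" remains, because the score of "b" falls below the min-score threshold.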

#Upsert

Writes vectors into a namespace. If a new value is upserted for an existing vector ID, it will overwrite the previous value. This task will soon be replaced by TASK_BATCH_UPSERT, which extends its functionality.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_UPSERT |
| ID (required) | id | string | The vector's unique ID. |
| Values (required) | values | array[number] | The vector to be saved, given as an array of numbers with one entry per dimension. |
| Namespace | namespace | string | The namespace to upsert into. |
| Metadata | metadata | object | The vector metadata. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Upserted Count | upserted-count | integer | Number of records modified or added. |
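
The overwrite behaviour described above can be sketched with a plain in-memory dictionary. This is a toy model of upsert semantics, not how Pinecone stores vectors; the function upsert and the nested-dict layout are assumptions made for illustration.

```python
def upsert(index: dict, namespace: str, vector_id: str, values, metadata=None) -> int:
    """Insert or overwrite one vector in a toy in-memory index.

    Returns the number of records modified or added, which is always 1 here.
    """
    vectors = index.setdefault(namespace, {})
    # Assigning to an existing key replaces the previous record.
    vectors[vector_id] = {"values": list(values), "metadata": metadata or {}}
    return 1


index = {}
first = upsert(index, "example-namespace", "vec-1", [0.1, 0.2])
second = upsert(index, "example-namespace", "vec-1", [0.9, 0.8])
```

After the second call the namespace still holds a single record for vec-1, now with the values [0.9, 0.8].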

#Batch Upsert

Writes vectors into a namespace. If a new value is upserted for an existing vector ID, it will overwrite the previous value.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_BATCH_UPSERT |
| Vectors (required) | vectors | array[object] | Array of vectors to upsert. |
| Namespace | namespace | string | The namespace to upsert into. |

Input Objects in Batch Upsert

Vectors

Array of vectors to upsert

| Field | Field ID | Type | Note |
| :--- | :--- | :--- | :--- |
| ID | id | string | The unique ID of the vector. |
| Metadata | metadata | object | The vector metadata. This is a set of key-value pairs that can be used to store additional information about the vector. The values can have the following types: string, number, boolean, or array of strings. |
| Values | values | array | An array of dimensions for the vector to be saved. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Upserted Count | upserted-count | integer | Number of records modified or added. |
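
The metadata value types listed above (string, number, boolean, or array of strings) can be checked before a batch is sent. The sketch below assembles a TASK_BATCH_UPSERT input and rejects unsupported metadata values; both helper functions are hypothetical and the sample vectors are invented.

```python
def is_valid_metadata_value(value) -> bool:
    """Return True for a string, number, boolean, or list of strings."""
    if isinstance(value, (str, int, float, bool)):
        return True
    if isinstance(value, list):
        return all(isinstance(item, str) for item in value)
    return False


def build_batch_upsert_input(vectors, namespace=None) -> dict:
    """Assemble a TASK_BATCH_UPSERT input after validating metadata types."""
    for vector in vectors:
        for key, value in vector.get("metadata", {}).items():
            if not is_valid_metadata_value(value):
                raise ValueError(
                    f"unsupported metadata value for {key!r} in vector {vector.get('id')!r}"
                )
    payload = {"task": "TASK_BATCH_UPSERT", "vectors": vectors}
    # namespace is optional, so it is only included when provided.
    if namespace is not None:
        payload["namespace"] = namespace
    return payload


batch_input = build_batch_upsert_input(
    [
        {"id": "vec-1", "values": [0.1, 0.2], "metadata": {"tags": ["a", "b"], "year": 2024}},
        {"id": "vec-2", "values": [0.3, 0.4], "metadata": {"published": True}},
    ],
    namespace="example-namespace",
)
```

A nested object such as {"author": {"name": "x"}} would be rejected by this check, since objects are not among the listed value types.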

#Rerank

Rerank documents, such as text passages, according to their relevance to a query. The input is a list of documents and a query. The output is a list of documents, sorted by relevance to the query.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | task | string | TASK_RERANK |
| Query (required) | query | string | The query to rank the documents against. |
| Documents (required) | documents | array[string] | The documents to rerank. |
| Top N | top-n | integer | The number of results to return, sorted by relevance. Defaults to the number of inputs. |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Reranked Documents | documents | array[string] | Reranked documents. |
| Scores | scores | array[number] | The relevance scores of the documents, normalized between 0 and 1. |
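
The relationship between the Rerank input and output can be sketched as a sort by relevance score followed by a cut at top-n. In the sketch below the scores are supplied by hand; in the actual task they come from the reranking model, which is not reproduced here. The function rank_by_score is hypothetical, and the pairing of each entry in scores with the document at the same position is an assumption.

```python
def rank_by_score(documents, scores, top_n=None) -> dict:
    """Sort documents by descending score and keep the first top_n.

    top_n defaults to the number of inputs, as described in the table above.
    """
    if len(documents) != len(scores):
        raise ValueError("documents and scores must have the same length")
    if top_n is None:
        top_n = len(documents)
    # Indices of the documents, ordered from highest to lowest score.
    order = sorted(range(len(documents)), key=lambda i: scores[i], reverse=True)
    order = order[:top_n]
    return {
        "documents": [documents[i] for i in order],
        "scores": [scores[i] for i in order],
    }


# Made-up documents and scores, for illustration only.
result = rank_by_score(
    ["doc about cats", "doc about vector search", "doc about indexes"],
    [0.2, 0.9, 0.5],
    top_n=2,
)
```

With these inputs, the result keeps the vector search document first and the indexes document second, and drops the lowest-scoring one.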