The Pinecone component is a data component that allows users to build and search vector datasets.
It can carry out the following tasks:
#Release Stage
Alpha
#Configuration
The component definition and tasks are defined in the definition.json and tasks.json files respectively.
#Setup
In order to communicate with Pinecone, the following connection details need to be
provided. You may specify them directly in a pipeline recipe as key-value pairs
within the component's setup
block, or you can create a Connection from
the Integration Settings
page and reference the whole setup
as setup: ${connection.<my-connection-id>}
.
Field | Field ID | Type | Note |
---|
API Key (required) | api-key | string | Fill in your Pinecone AI API key. You can create an api key in Pinecone Console. |
Pinecone Index URL | url | string | Fill in your Pinecone index URL. It is in the form. |
#Supported Tasks
#Query
Retrieve the ids of the most similar items in a namespace, along with their similarity scores.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_QUERY |
ID | id | string | The unique ID of the vector to be used as a query vector. If present, the vector parameter will be ignored. |
Vector (required) | vector | array[number] | An array of dimensions for the query vector. |
Top K (required) | top-k | integer | The number of results to return for each query. |
Namespace | namespace | string | The namespace to query. |
Filter | filter | object | The filter to apply. You can use vector metadata to limit your search. See more details here. |
Minimum Score | min-score | number | Exclude results whose score is below this value. |
Include Metadata | include-metadata | boolean | Indicates whether metadata is included in the response as well as the IDs. |
Include Values | include-values | boolean | Indicates whether vector values are included in the response. |
Output | ID | Type | Description |
---|
Namespace | namespace | string | The namespace of the query. |
Matches | matches | array[object] | The matches returned for the query. |
Output Objects in Query
Matches
Field | Field ID | Type | Note |
---|
ID | id | string | The ID of the matched vector. |
Metadata | metadata | object | Metadata. |
Score | score | number | A measure of similarity between this vector and the query vector. The higher the score, the more similar they are. |
Values | values | array | Vector data values. |
#Upsert
Writes vectors into a namespace. If a new value is upserted for an existing vector id, it will overwrite the previous value. This task will be soon replaced by TASK_BATCH_UPSERT
, which extends its functionality.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_UPSERT |
ID (required) | id | string | This is the vector's unique id. |
Values (required) | values | array[number] | An array of dimensions for the vector to be saved. |
Namespace | namespace | string | The namespace to query. |
Metadata | metadata | object | The vector metadata. |
Output | ID | Type | Description |
---|
Upserted Count | upserted-count | integer | Number of records modified or added. |
#Batch Upsert
Writes vectors into a namespace. If a new value is upserted for an existing vector ID, it will overwrite the previous value.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_BATCH_UPSERT |
Vectors (required) | vectors | array[object] | Array of vectors to upsert |
Namespace | namespace | string | The namespace to query. |
Input Objects in Batch Upsert
Vectors
Array of vectors to upsert
Field | Field ID | Type | Note |
---|
ID | id | string | The unique ID of the vector. |
Metadata | metadata | object | The vector metadata. This is a set of key-value pairs that can be used to store additional information about the vector. The values can have the following types: string, number, boolean, or array of strings. |
Values | values | array | An array of dimensions for the vector to be saved. |
Output | ID | Type | Description |
---|
Upserted Count | upserted-count | integer | Number of records modified or added. |
#Rerank
Rerank documents, such as text passages, according to their relevance to a query. The input is a list of documents and a query. The output is a list of documents, sorted by relevance to the query.
Input | ID | Type | Description |
---|
Task ID (required) | task | string | TASK_RERANK |
Query (required) | query | string | The query to rerank the documents. |
Documents (required) | documents | array[string] | The documents to rerank. |
Top N | top-n | integer | The number of results to return sorted by relevance. Defaults to the number of inputs. |
Output | ID | Type | Description |
---|
Reranked Documents. | documents | array[string] | Reranked documents. |
Scores | scores | array[number] | The relevance score of the documents normalized between 0 and 1. |