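The Instill Artifact component is a data component that lets users manipulate files and data in the artifact store and run smart search over them.
It can carry out the following tasks:
- Upload File
- Upload Files
- Get Files Metadata
- Get Chunks Metadata
- Get File in Markdown
- Match File Status
- Retrieve
- Ask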
To use the Artifact component on a self-hosted deployment of Instill Core, you need to set up an OpenAI API key. You can do this by setting the OPENAI_API_KEY environment variable. For details, please refer to configuring-the-embedding-feature.
Note: on Instill Cloud, you do not need to set up the OpenAI API key.
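As a quick pre-flight check for a self-hosted setup, the following sketch tests whether OPENAI_API_KEY is present in the environment. The helper name `embedding_key_configured` is hypothetical and not part of Instill Core; it only inspects the environment and does not validate the key itself.

```python
import os


def embedding_key_configured(env=None):
    """Return True if OPENAI_API_KEY is set to a non-empty value.

    `env` defaults to the process environment; passing a dict makes the
    check easy to exercise without touching real settings.
    """
    env = os.environ if env is None else env
    return bool(env.get("OPENAI_API_KEY", "").strip())


if __name__ == "__main__":
    if not embedding_key_configured():
        print("OPENAI_API_KEY is not set; the embedding feature needs it on self-hosted Instill Core.")
```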
#Release Stage
Alpha
#Configuration
The component definition and tasks are defined in the definition.json and tasks.json files respectively.
#Supported Tasks
#Upload File
Upload a file to a catalog and process it into chunks.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_UPLOAD_FILE |
Options (required) | options | object | Choose whether to upload the file to an existing catalog or to create a new catalog |
The options object must fulfill one of the following schemas:
Existing Catalog
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the existing catalog to upload to |
File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload to the catalog |
File Name | file-name | string | The name of the file, including its file extension, e.g. 'example.pdf' |
Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Option | option | string | Must be "existing catalog" |
Create New Catalog
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the new catalog to create |
Description | description | string | Description of the catalog |
File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload to the catalog |
File Name | file-name | string | The name of the file, including its file extension, e.g. 'example.pdf' |
Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Option | option | string | Must be "create new catalog" |
Tags | tags | array | Tags for the catalog |
Output | ID | Type | Description |
---|---|---|---|
File | file | object | Result of uploading the file to the catalog |
Status | status | boolean | Whether file processing was triggered successfully; true on success |
Output Objects in Upload File
File
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
Create Time | create-time | string | The creation time of the file in ISO 8601 format |
File Name | file-name | string | The name of the file |
Type | file-type | string | The type of the file |
File UID | file-uid | string | The unique identifier of the file |
Size | size | number | The size of the file in bytes |
Update Time | update-time | string | The update time of the file in ISO 8601 format |
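To make the Upload File input concrete, here is a minimal sketch that assembles the task and options fields from the tables above as a plain Python dict. The function name `build_upload_file_input` and its parameters are hypothetical; how the resulting dict is handed to a pipeline is outside the scope of this sketch.

```python
import base64


def build_upload_file_input(namespace, catalog_id, file_name, file_bytes,
                            create_new=False, description="", tags=None):
    """Assemble the Upload File input using the field IDs from the tables above."""
    options = {
        # The option value selects which of the two schemas applies.
        "option": "create new catalog" if create_new else "existing catalog",
        "namespace": namespace,
        "catalog-id": catalog_id,
        "file-name": file_name,  # should include the extension, e.g. 'example.pdf'
        "file": base64.b64encode(file_bytes).decode("ascii"),
    }
    if create_new:
        # These two fields only appear in the Create New Catalog schema.
        options["description"] = description
        options["tags"] = list(tags or [])
    return {"task": "TASK_UPLOAD_FILE", "options": options}
```

For example, `build_upload_file_input("instill-ai", "my-catalog", "example.pdf", pdf_bytes)` targets an existing catalog, while passing `create_new=True` adds the description and tags fields.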
#Upload Files
Upload multiple files to a catalog and process them into chunks.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_UPLOAD_FILES |
Options (required) | options | object | Choose whether to upload the files to an existing catalog or to create a new catalog |
The options object must fulfill one of the following schemas:
Existing Catalog
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the existing catalog to upload to |
File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload to the catalog |
Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Option | option | string | Must be "existing catalog" |
Create New Catalog
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the new catalog to create |
Description | description | string | Description of the catalog |
File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload to the catalog |
Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Option | option | string | Must be "create new catalog" |
Tags | tags | array | Tags for the catalog |
Output | ID | Type | Description |
---|---|---|---|
Files | files | array[object] | Metadata of the files in the catalog |
Status | status | boolean | Whether file processing was triggered successfully; true only if ALL files succeeded |
Output Objects in Upload Files
Files
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the catalog the files were uploaded to |
Create Time | create-time | string | The creation time of the file in ISO 8601 format |
File Name | file-name | string | The name of the file |
Type | file-type | string | The type of the file |
File UID | file-uid | string | The unique identifier of the file |
Size | size | number | The size of the file in bytes |
Update Time | update-time | string | The update time of the file in ISO 8601 format |
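For Upload Files, the file-names and files arrays have to be built together. The sketch below does that for the Existing Catalog schema. It assumes the two arrays are matched by position, which the tables above do not state explicitly, and the helper name `build_upload_files_options` is hypothetical.

```python
import base64


def build_upload_files_options(namespace, catalog_id, named_files):
    """Build the Existing Catalog options for Upload Files.

    `named_files` is a list of (file_name, raw_bytes) pairs. Assumption:
    file-names[i] describes files[i].
    """
    file_names, files = [], []
    for name, raw in named_files:
        # The field note asks for the extension to be part of the name.
        if "." not in name:
            raise ValueError(f"file name {name!r} has no extension")
        file_names.append(name)
        files.append(base64.b64encode(raw).decode("ascii"))
    return {
        "option": "existing catalog",
        "namespace": namespace,
        "catalog-id": catalog_id,
        "file-names": file_names,
        "files": files,
    }
```

Building both arrays in one loop keeps them the same length and in the same order, which avoids mismatched names if the positional assumption holds.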
#Get Files Metadata
Get the metadata of the files in the catalog.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_GET_FILES_METADATA |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Catalog ID (required) | catalog-id | string | The ID of the catalog whose files you want to list |
Output | ID | Type | Description |
---|---|---|---|
Files | files | array[object] | Metadata of the files in the catalog |
Output Objects in Get Files Metadata
Files
Field | Field ID | Type | Note |
---|---|---|---|
Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
Create Time | create-time | string | The creation time of the file in ISO 8601 format |
File Name | file-name | string | The name of the file |
Type | file-type | string | The type of the file |
File UID | file-uid | string | The unique identifier of the file |
Size | size | number | The size of the file in bytes |
Update Time | update-time | string | The update time of the file in ISO 8601 format |
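Several later tasks take a file-uid as input, so a common use of this output is looking up UIDs by file name. The following sketch indexes a files array by file-name. The helper name `file_uids_by_name` is hypothetical, and the sketch allows for more than one file sharing a name, since the tables do not say whether names are unique.

```python
def file_uids_by_name(files):
    """Map each file-name to the list of file-uid values carrying that name."""
    uids = {}
    for entry in files:
        uids.setdefault(entry["file-name"], []).append(entry["file-uid"])
    return uids


# Illustrative data shaped like the output table above (values are made up).
sample_files = [
    {"file-name": "example.pdf", "file-uid": "uid-1", "size": 1024},
    {"file-name": "notes.html", "file-uid": "uid-2", "size": 2048},
]
print(file_uids_by_name(sample_files)["example.pdf"])
```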
#Get Chunks Metadata
Get the metadata of the chunks of a file in the catalog.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_GET_CHUNKS_METADATA |
Catalog ID (required) | catalog-id | string | The ID of the catalog that contains the file |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
File UID (required) | file-uid | string | The unique identifier of the file |
Output | ID | Type | Description |
---|---|---|---|
Chunks | chunks | array[object] | Metadata of the file's chunks in the catalog |
Output Objects in Get Chunks Metadata
Chunks
Field | Field ID | Type | Note |
---|---|---|---|
Chunk UID | chunk-uid | string | The unique identifier of the chunk |
Create Time | create-time | string | The creation time of the chunk in ISO 8601 format |
End Position | end-position | integer | The end position of the chunk in the file |
File UID | original-file-uid | string | The unique identifier of the file |
Retrievable | retrievable | boolean | The retrievable status of the chunk |
Start Position | start-position | integer | The start position of the chunk in the file |
Token Count | token-count | integer | The token count of the chunk |
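The chunk metadata lends itself to simple aggregation. As a sketch, the function below sums token-count over the chunks whose retrievable flag is true. The helper name `retrievable_token_count` and the sample values are made up for illustration; only the field IDs come from the table above.

```python
def retrievable_token_count(chunks):
    """Sum token-count over chunks marked as retrievable."""
    return sum(chunk["token-count"] for chunk in chunks if chunk.get("retrievable"))


# Illustrative data shaped like the Chunks output (values are made up).
sample_chunks = [
    {"chunk-uid": "c1", "start-position": 0, "end-position": 399,
     "token-count": 120, "retrievable": True},
    {"chunk-uid": "c2", "start-position": 400, "end-position": 799,
     "token-count": 95, "retrievable": False},
    {"chunk-uid": "c3", "start-position": 800, "end-position": 1100,
     "token-count": 80, "retrievable": True},
]
print(retrievable_token_count(sample_chunks))
```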
#Get File in Markdown
Get the content of a file in Markdown format.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_GET_FILE_IN_MARKDOWN |
Catalog ID (required) | catalog-id | string | The ID of the catalog that contains the file |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
File UID (required) | file-uid | string | The unique identifier of the file |
Output | ID | Type | Description |
---|---|---|---|
File UID | original-file-uid | string | The unique identifier of the file |
Content | content | string | The content of the file in Markdown format |
Create Time | create-time | string | The creation time of the source file in ISO 8601 format |
Update Time | update-time | string | The update time of the source file in ISO 8601 format |
#Match File Status
Check whether processing of the specified file is done.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_MATCH_FILE_STATUS |
Catalog ID (required) | catalog-id | string | The ID of the catalog in which to check the file's processing status |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
File UID (required) | file-uid | string | The unique identifier of the file |
Output | ID | Type | Description |
---|---|---|---|
Status | succeeded | boolean | The status of the file processing; true if it succeeded |
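A caller that uploads a file and then wants to search it may need to wait until processing has finished. The sketch below polls for the succeeded field with a timeout. It assumes the task returns immediately rather than blocking, which the description above does not specify. `check_status` is a hypothetical callable, supplied by the caller, that runs TASK_MATCH_FILE_STATUS and returns its output as a dict.

```python
import time


def wait_until_processed(check_status, timeout_s=60.0, interval_s=2.0,
                         sleep=time.sleep, clock=time.monotonic):
    """Poll check_status() until its output reports succeeded, or time out.

    Returns True once succeeded is true, False if the deadline passes first.
    `sleep` and `clock` are injectable so the loop can be tested without waiting.
    """
    deadline = clock() + timeout_s
    while True:
        if check_status().get("succeeded"):
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

The status is always checked at least once, even with a zero timeout, so a file that is already processed is reported as done without any delay.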
#Retrieve
Search the chunks in the catalog.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_RETRIEVE |
Catalog ID (required) | catalog-id | string | The ID of the catalog to search |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Text Prompt (required) | text-prompt | string | The prompt string used to search the chunks |
Top K | top-k | integer | The number of top chunks to return. Must be between 1 and 20; defaults to 5 |
Output | ID | Type | Description |
---|---|---|---|
Chunks | chunks | array[object] | Chunk data returned by smart search |
Output Objects in Retrieve
Chunks
Field | Field ID | Type | Note |
---|---|---|---|
Chunk UID | chunk-uid | string | The unique identifier of the chunk |
Similarity | similarity-score | number | The similarity score of the chunk |
Source File Name | source-file-name | string | The name of the source file |
Text Content | text-content | string | The text content of the chunk |
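The Retrieve fields can be sketched in two small helpers: one that builds the input and enforces the documented top-k range client-side, and one that filters and orders returned chunks by similarity-score. Both function names are hypothetical. The sketch also assumes that a higher score means a closer match, which the table does not state.

```python
def build_retrieve_input(namespace, catalog_id, text_prompt, top_k=5):
    """Build the Retrieve input; top-k must be between 1 and 20 (default 5)."""
    if not 1 <= top_k <= 20:
        raise ValueError("top-k must be between 1 and 20")
    return {
        "task": "TASK_RETRIEVE",
        "namespace": namespace,
        "catalog-id": catalog_id,
        "text-prompt": text_prompt,
        "top-k": top_k,
    }


def best_chunks(chunks, min_similarity=0.0):
    """Keep chunks at or above min_similarity, highest score first.

    Assumption: a higher similarity-score indicates a closer match.
    """
    kept = [c for c in chunks if c["similarity-score"] >= min_similarity]
    return sorted(kept, key=lambda c: c["similarity-score"], reverse=True)
```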
#Ask
Answer questions based on the files in the catalog.
Input | ID | Type | Description |
---|---|---|---|
Task ID (required) | task | string | TASK_ASK |
Catalog ID (required) | catalog-id | string | The ID of the catalog to search |
Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
Question (required) | question | string | The question to answer |
Top K | top-k | integer | The number of top answers to return. Must be between 1 and 20; defaults to 5 |
Output | ID | Type | Description |
---|---|---|---|
Answer | answer | string | Answer data from smart search |
Chunks (optional) | chunks | array[object] | The chunks used to answer the question |
Output Objects in Ask
Chunks
Field | Field ID | Type | Note |
---|---|---|---|
Chunk UID | chunk-uid | string | The unique identifier of the chunk |
Similarity | similarity-score | number | The similarity score of the chunk |
Source File Name | source-file-name | string | The name of the source file |
Text Content | text-content | string | The text content of the chunk |
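Since the Ask output pairs an answer with optional chunks, one way to present it is to append the distinct source file names to the answer text. The sketch below does that. The function name `format_answer` and the sample values are illustrative, and the sketch handles a missing chunks field because the table marks it as optional.

```python
def format_answer(output):
    """Return the answer, followed by the distinct source files if chunks are present."""
    sources = []
    for chunk in output.get("chunks", []):
        name = chunk["source-file-name"]
        if name not in sources:  # keep first-seen order, drop repeats
            sources.append(name)
    if not sources:
        return output["answer"]
    return output["answer"] + "\n\nSources: " + ", ".join(sources)


# Illustrative data shaped like the Ask output (values are made up).
sample_output = {
    "answer": "Artifact stores and searches files.",
    "chunks": [
        {"chunk-uid": "c1", "source-file-name": "a.pdf"},
        {"chunk-uid": "c2", "source-file-name": "b.pdf"},
        {"chunk-uid": "c3", "source-file-name": "a.pdf"},
    ],
}
print(format_answer(sample_output))
```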
#Example Recipes
Recipe for the Ask your Catalog pipeline.
catalog-id: ${variable.catalog_name}
namespace: ${variable.namespace}
question: ${variable.question}
description: The name of your catalog i.e. "instill-ai"
description: The namespace of your catalog i.e. "instill-ai"
description: The question to ask your catalog i.e. "What is Instill AI doing?", "What is Artifact?"
value: ${artifact-0.output.answer}