Instill Artifact

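The Instill Artifact component is a data component that lets users manipulate files and data in the artifact store and run smart search over them. It can carry out the following tasks:

- Upload File
- Upload Files
- Get Files Metadata
- Get Chunks Metadata
- Get File in Markdown
- Match File Status
- Retrieve
- Ask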

To use the Artifact component in a self-hosted deployment of Instill Core, you need to set up the OpenAI API key by setting the OPENAI_API_KEY environment variable. Please refer to configuring-the-embedding-feature. Note: on Instill Cloud, you do not need to set up the OpenAI API key.

#Release Stage

Alpha

#Configuration

The component definition and tasks are defined in the definition.json and tasks.json files respectively.

#Supported Tasks

#Upload File

Upload a file and process it into chunks in a catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_UPLOAD_FILE |
| Options (required) | options | object | Choose whether to upload the file to an existing catalog or to create a new catalog |

The options Object

options must fulfill one of the following schemas:

Existing Catalog

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID of the existing catalog |
| File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload into the catalog |
| File Name | file-name | string | The name of the file, including the file extension, e.g. 'example.pdf' |
| Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Option | option | string | Must be "existing catalog" |

Create New Catalog

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID for the new catalog you want to create |
| Description | description | string | Description of the catalog |
| File | file | string | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML file to upload into the catalog |
| File Name | file-name | string | The name of the file, including the file extension, e.g. 'example.pdf' |
| Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Option | option | string | Must be "create new catalog" |
| Tags | tags | array | Tags for the catalog |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| File | file | object | Result of uploading the file into the catalog |
| Status | status | boolean | Status of triggering file processing; true if it succeeded |

Output Objects in Upload File

File

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
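
As a rough illustration of the "Existing Catalog" schema above, the following Python sketch assembles an options object from raw file bytes. The helper name `build_existing_catalog_options` and the sample values are hypothetical; only the field IDs and the "existing catalog" value come from the tables above. How the object is then passed to the component depends on your pipeline and is not shown.

```python
import base64
import os
from typing import Dict


def build_existing_catalog_options(
    data: bytes, file_name: str, catalog_id: str, namespace: str
) -> Dict[str, str]:
    """Assemble an `options` object for the "existing catalog" schema.

    Hypothetical helper: the keys mirror the documented field IDs.
    """
    # The file name should carry its extension, e.g. 'example.pdf'.
    if not os.path.splitext(file_name)[1]:
        raise ValueError("file-name should include the file extension")
    return {
        "option": "existing catalog",
        "namespace": namespace,
        "catalog-id": catalog_id,
        # The `file` field expects the content as a base64-encoded string.
        "file": base64.b64encode(data).decode("ascii"),
        "file-name": file_name,
    }


# Usage with made-up values.
options = build_existing_catalog_options(
    b"%PDF-1.4 sample", "example.pdf", "my-catalog", "my-namespace"
)
```

Checking the extension up front is a design choice of this sketch, not documented component behavior.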

#Upload Files

Upload multiple files and process them into chunks in a catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_UPLOAD_FILES |
| Options (required) | options | object | Choose whether to upload the files to an existing catalog or to create a new catalog |

The options Object

options must fulfill one of the following schemas:

Existing Catalog

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID of the existing catalog |
| File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
| Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload into the catalog |
| Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Option | option | string | Must be "existing catalog" |

Create New Catalog

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID for the new catalog you want to create |
| Description | description | string | Description of the catalog |
| File Names | file-names | array | The names of the files, each including its file extension, e.g. 'example.pdf' |
| Files | files | array | Base64-encoded PDF/DOCX/DOC/PPTX/PPT/HTML files to upload into the catalog |
| Namespace | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Option | option | string | Must be "create new catalog" |
| Tags | tags | array | Tags for the catalog |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Files | files | array[object] | Metadata of the files in the catalog |
| Status | status | boolean | Status of triggering file processing; true only if all files succeeded |

Output Objects in Upload Files

Files

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
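
For the multi-file variant, a minimal Python sketch of the "Create New Catalog" schema could look like the following. The helper name `build_new_catalog_options` is hypothetical, and the sketch assumes that `files` and `file-names` are matched by position, which the tables above do not state explicitly.

```python
import base64
from typing import Dict, List, Optional


def build_new_catalog_options(
    files: Dict[str, bytes],
    catalog_id: str,
    namespace: str,
    description: str = "",
    tags: Optional[List[str]] = None,
) -> Dict[str, object]:
    """Assemble an `options` object for the "create new catalog" schema.

    Hypothetical helper: `files` maps each file name to its raw bytes.
    """
    names = list(files)  # dicts preserve insertion order
    return {
        "option": "create new catalog",
        "namespace": namespace,
        "catalog-id": catalog_id,
        "description": description,
        "tags": list(tags or []),
        # Assumption: entry i of `files` belongs to entry i of `file-names`.
        "file-names": names,
        "files": [base64.b64encode(files[n]).decode("ascii") for n in names],
    }


# Usage with made-up values.
options = build_new_catalog_options(
    {"a.pdf": b"%PDF-1.4 sample", "b.html": b"<p>hello</p>"},
    catalog_id="docs",
    namespace="my-namespace",
    tags=["demo"],
)
```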

#Get Files Metadata

Get the metadata of the files in the catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_GET_FILES_METADATA |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to search for files |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Files | files | array[object] | Metadata of the files in the catalog |

Output Objects in Get Files Metadata

Files

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Catalog ID | catalog-id | string | The ID of the catalog the file was uploaded to |
| Create Time | create-time | string | The creation time of the file in ISO 8601 format |
| File Name | file-name | string | The name of the file |
| Type | file-type | string | The type of the file |
| File UID | file-uid | string | The unique identifier of the file |
| Size | size | number | The size of the file in bytes |
| Update Time | update-time | string | The update time of the file in ISO 8601 format |
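
Several other tasks take a File UID as input, so a common follow-up to this task is looking up the UID for a given file name. The Python sketch below works on a `files` array shaped like the output above; the function names and sample records are hypothetical.

```python
from typing import Dict, List


def find_file_uids(files: List[Dict], file_name: str) -> List[str]:
    """Return the file-uid of every entry whose file-name matches exactly."""
    return [f["file-uid"] for f in files if f.get("file-name") == file_name]


def total_size_bytes(files: List[Dict]) -> int:
    """Sum the documented `size` field (bytes) over all entries."""
    return sum(f.get("size", 0) for f in files)


# Made-up records using the documented field IDs.
sample_files = [
    {"file-uid": "uid-1", "file-name": "example.pdf", "size": 2048},
    {"file-uid": "uid-2", "file-name": "notes.html", "size": 512},
]
uids = find_file_uids(sample_files, "example.pdf")
```

The lookup returns a list rather than a single value because this document does not say whether file names are unique within a catalog.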

#Get Chunks Metadata

Get the metadata of the chunks from a file in the catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_GET_CHUNKS_METADATA |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to search for files |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Chunks | chunks | array[object] | Metadata of the file's chunks in the catalog |

Output Objects in Get Chunks Metadata

Chunks

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Create Time | create-time | string | The creation time of the chunk in ISO 8601 format |
| End Position | end-position | integer | The end position of the chunk in the file |
| File UID | original-file-uid | string | The unique identifier of the file |
| Retrievable | retrievable | boolean | The retrievable status of the chunk |
| Start Position | start-position | integer | The start position of the chunk in the file |
| Token Count | token-count | integer | The token count of the chunk |
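
To show how the chunk metadata might be consumed downstream, here is a small Python sketch that counts the retrievable chunks and sums their token counts. The function name `summarize_chunks`, the summary keys, and the sample records are hypothetical; only the field IDs are taken from the table above.

```python
from typing import Dict, List


def summarize_chunks(chunks: List[Dict]) -> Dict[str, int]:
    """Summarize a `chunks` array shaped like the output above."""
    retrievable = [c for c in chunks if c["retrievable"]]
    return {
        "total-chunks": len(chunks),
        "retrievable-chunks": len(retrievable),
        # Tokens held by chunks that are marked retrievable.
        "retrievable-tokens": sum(c["token-count"] for c in retrievable),
    }


# Made-up records using the documented field IDs.
sample_chunks = [
    {"chunk-uid": "c1", "retrievable": True, "token-count": 120},
    {"chunk-uid": "c2", "retrievable": False, "token-count": 80},
    {"chunk-uid": "c3", "retrievable": True, "token-count": 200},
]
summary = summarize_chunks(sample_chunks)
```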

#Get File in Markdown

Get the file content in Markdown format.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_GET_FILE_IN_MARKDOWN |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to search for files |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| File UID | original-file-uid | string | The unique identifier of the file |
| Content | content | string | The content of the file in Markdown format |
| Create Time | create-time | string | The creation time of the source file in ISO 8601 format |
| Update Time | update-time | string | The update time of the source file in ISO 8601 format |

#Match File Status

Check whether processing of the specified file has finished.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_MATCH_FILE_STATUS |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to check the file's processing status |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | file-uid | string | The unique identifier of the file |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Status | succeeded | boolean | The status of the file processing; true if it succeeded |
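
If you drive this task from your own code, one way to use it is to poll until `succeeded` becomes true. The Python sketch below shows only the polling logic. The `check` callable is a hypothetical stand-in for whatever mechanism you use to run TASK_MATCH_FILE_STATUS and read `succeeded`; the timeout and interval defaults are arbitrary assumptions, not documented limits.

```python
import time
from typing import Callable


def wait_until_processed(
    check: Callable[[], bool],
    timeout_s: float = 60.0,
    interval_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll `check` until it returns True or `timeout_s` has elapsed.

    `check` is a hypothetical callable that returns the `succeeded` output.
    Returns True if processing finished in time, False otherwise.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(interval_s)


# Usage with a stand-in check that succeeds on the third call.
answers = iter([False, False, True])
done = wait_until_processed(lambda: next(answers), timeout_s=5, interval_s=0)
```

The `sleep` parameter is injectable so the loop can be exercised without real delays.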

#Retrieve

Search the chunks in the catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_RETRIEVE |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to search for files |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Text Prompt (required) | text-prompt | string | The prompt string used to search the chunks |
| Top K | top-k | integer | The number of top chunks to return. The range is 1 to 20, and the default is 5 |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Chunks | chunks | array[object] | Chunks data from smart search |

Output Objects in Retrieve

Chunks

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Similarity | similarity-score | number | The similarity score of the chunk |
| Source File Name | source-file-name | string | The name of the source file |
| Text Content | text-content | string | The text content of the chunk |
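
The `top-k` constraint and the `similarity-score` field can be illustrated with a short Python sketch. Both function names are hypothetical. The range of 1 to 20 and the default of 5 come from the input table above; the filter additionally assumes that a higher `similarity-score` means a closer match, which this document does not state.

```python
from typing import Dict, List, Optional

DEFAULT_TOP_K = 5  # documented default


def build_retrieve_input(
    catalog_id: str, namespace: str, text_prompt: str, top_k: Optional[int] = None
) -> Dict[str, object]:
    """Assemble the Retrieve input, enforcing the documented top-k range."""
    k = DEFAULT_TOP_K if top_k is None else top_k
    if not 1 <= k <= 20:
        raise ValueError("top-k must be between 1 and 20")
    return {
        "catalog-id": catalog_id,
        "namespace": namespace,
        "text-prompt": text_prompt,
        "top-k": k,
    }


def filter_chunks(chunks: List[Dict], min_score: float) -> List[Dict]:
    """Keep chunks scoring at least `min_score` (assumes higher is better)."""
    return [c for c in chunks if c["similarity-score"] >= min_score]


# Usage with made-up values.
retrieve_input = build_retrieve_input("docs", "my-namespace", "What is Artifact?")
```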

#Ask

Answer questions based on the files in the catalog.

| Input | ID | Type | Description |
| --- | --- | --- | --- |
| Task ID (required) | task | string | TASK_ASK |
| Catalog ID (required) | catalog-id | string | The ID of the catalog in which to search for files |
| Namespace (required) | namespace | string | Your namespace. You can find it in the namespace-switching tab |
| Question (required) | question | string | The question to answer |
| Top K | top-k | integer | The number of top answers to return. The range is 1 to 20, and the default is 5 |

| Output | ID | Type | Description |
| --- | --- | --- | --- |
| Answer | answer | string | Answer data from smart search |
| Chunks (optional) | chunks | array[object] | Chunks data used to answer the question |

Output Objects in Ask

Chunks

| Field | Field ID | Type | Note |
| --- | --- | --- | --- |
| Chunk UID | chunk-uid | string | The unique identifier of the chunk |
| Similarity | similarity-score | number | The similarity score of the chunk |
| Source File Name | source-file-name | string | The name of the source file |
| Text Content | text-content | string | The text content of the chunk |
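
Because `chunks` is optional in the Ask output, code that displays an answer should handle its absence. The Python sketch below formats an answer and, when chunks are present, lists the distinct source file names. The function name `format_answer`, the "Sources:" layout, and the sample output are hypothetical.

```python
from typing import Dict, List


def format_answer(output: Dict) -> str:
    """Render an Ask output as text, citing source files when available."""
    lines: List[str] = [output["answer"]]
    sources: List[str] = []
    # `chunks` is optional, so treat a missing or empty value as no sources.
    for chunk in output.get("chunks") or []:
        name = chunk["source-file-name"]
        if name not in sources:  # keep first-seen order, drop duplicates
            sources.append(name)
    if sources:
        lines.append("Sources: " + ", ".join(sources))
    return "\n".join(lines)


# Made-up output using the documented field IDs.
sample_output = {
    "answer": "Artifact stores and searches your files.",
    "chunks": [
        {"chunk-uid": "c1", "source-file-name": "a.pdf"},
        {"chunk-uid": "c2", "source-file-name": "a.pdf"},
        {"chunk-uid": "c3", "source-file-name": "b.pdf"},
    ],
}
text = format_answer(sample_output)
```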

#Example Recipes

Recipe for the Ask your Catalog pipeline.


version: v1beta
component:
  artifact-0:
    type: instill-artifact
    task: TASK_ASK
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      question: ${variable.question}
      top-k: 5
variable:
  catalog_name:
    title: catalog-name
    description: The name of your catalog, e.g. "instill-ai"
    instill-format: string
  namespace:
    title: namespace
    description: The namespace of your catalog, e.g. "instill-ai"
    instill-format: string
  question:
    title: question
    description: The question to ask your catalog, e.g. "What is Instill AI doing?", "What is Artifact?"
    instill-format: string
output:
  answer:
    title: answer
    value: ${artifact-0.output.answer}