Artifact

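The Artifact component is a data component that lets users manipulate and run smart search over files and data in the artifact store. It can carry out the following tasks: Upload File, Upload Files, Get Files Metadata, Get Chunks Metadata, Get File In Markdown, Match File Status, Retrieve, and Ask.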

To use the Artifact component on a self-hosted deployment of Instill Core, you need to provide an OpenAI API key by setting the OPENAI_API_KEY environment variable. Please refer to configuring-the-embedding-feature for details. On Instill Cloud, you do not need to set up an OpenAI API key.

#Release Stage

Alpha

#Configuration

The component configuration is defined and maintained here.

#Supported Tasks

#Upload File

Upload a file to a catalog and process it into chunks.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_UPLOAD_FILE` |
| Options (required) | `options` | object | Choose whether to upload the file to an existing catalog or to create a new catalog |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| File | `file` | object | Result of uploading the file into the catalog |
| Status | `status` | boolean | Whether file processing was triggered successfully; `true` on success |

#Upload Files

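Upload multiple files to a catalog and process them into chunks.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_UPLOAD_FILES` |
| Options (required) | `options` | object | Choose whether to upload the files to an existing catalog or to create a new catalog |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Files | `files` | array[object] | Metadata of the files in the catalog |
| Status | `status` | boolean | Whether file processing was triggered successfully; `true` only if it succeeded for all files |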

#Get Files Metadata

Get the metadata of the files in the catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_GET_FILES_METADATA` |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog whose files you want to list |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Files | `files` | array[object] | Metadata of the files in the catalog |
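As an illustration, the fragment below sketches how this task might be wired into a recipe. It assumes the same layout as the Ask recipe in the Example Recipes section; the component ID `artifact-0` and the variable names `catalog_name` and `namespace` are borrowed from that recipe, and the variables are assumed to be declared under `variable:` in the same way.

```yaml
# Sketch only: layout assumed to mirror the Ask recipe in Example Recipes.
component:
  artifact-0:
    type: artifact
    task: TASK_GET_FILES_METADATA
    input:
      namespace: ${variable.namespace}
      catalog-id: ${variable.catalog_name}
output:
  files:
    title: files
    # Output ID `files` is taken from the output table above.
    value: ${artifact-0.output.files}
```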

#Get Chunks Metadata

Get the metadata of the chunks of a file in the catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_GET_CHUNKS_METADATA` |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog that contains the file |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | `file-uid` | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Chunks | `chunks` | array[object] | Metadata of the file's chunks in the catalog |
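A possible recipe fragment for this task is sketched below, assuming the same layout as the Ask recipe in the Example Recipes section. The variable `file_uid` is a hypothetical name introduced for illustration and would need to be declared under `variable:` alongside `catalog_name` and `namespace`.

```yaml
# Sketch only: `file_uid` is a hypothetical variable name.
component:
  artifact-0:
    type: artifact
    task: TASK_GET_CHUNKS_METADATA
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      file-uid: ${variable.file_uid}
output:
  chunks:
    title: chunks
    value: ${artifact-0.output.chunks}
```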

#Get File In Markdown

Get the file content in Markdown format.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_GET_FILE_IN_MARKDOWN` |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog that contains the file |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | `file-uid` | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| File UID | `original-file-uid` | string | The unique identifier of the file |
| Content | `content` | string | The content of the file in Markdown format |
| Create Time | `create-time` | string | The creation time of the source file in ISO 8601 format |
| Update Time | `update-time` | string | The update time of the source file in ISO 8601 format |
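The fragment below sketches one way to expose the converted Markdown as a pipeline output, assuming the same layout as the Ask recipe in the Example Recipes section. The variable `file_uid` and the output key `markdown` are hypothetical names chosen for illustration.

```yaml
# Sketch only: `file_uid` and the `markdown` output key are hypothetical names.
component:
  artifact-0:
    type: artifact
    task: TASK_GET_FILE_IN_MARKDOWN
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      file-uid: ${variable.file_uid}
output:
  markdown:
    title: markdown
    # Output ID `content` is taken from the output table above.
    value: ${artifact-0.output.content}
```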

#Match File Status

Check whether processing of the specified file has finished.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_MATCH_FILE_STATUS` |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog that contains the file whose processing status you want to check |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| File UID (required) | `file-uid` | string | The unique identifier of the file |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Status | `succeeded` | boolean | The status of the file processing; `true` if it succeeded |
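As a sketch, a status check could be expressed as follows, assuming the same layout as the Ask recipe in the Example Recipes section. Note that the output ID is `succeeded`, not `status`. The variable `file_uid` and the output key `done` are hypothetical names.

```yaml
# Sketch only: `file_uid` and the `done` output key are hypothetical names.
component:
  artifact-0:
    type: artifact
    task: TASK_MATCH_FILE_STATUS
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      file-uid: ${variable.file_uid}
output:
  done:
    title: done
    value: ${artifact-0.output.succeeded}
```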

#Retrieve

Search the chunks in the catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_RETRIEVE` |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog to search |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| Text Prompt (required) | `text-prompt` | string | The prompt string used to search the chunks |
| Top K | `top-k` | integer | The number of top chunks to return, from 1 to 20. Defaults to 5 |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Chunks | `chunks` | array[object] | Chunk data returned by smart search |
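The fragment below sketches a retrieval step that asks for more chunks than the default, assuming the same layout as the Ask recipe in the Example Recipes section. The variable `query` is a hypothetical name, and `top-k: 10` is an arbitrary value inside the documented 1 to 20 range.

```yaml
# Sketch only: `query` is a hypothetical variable name.
component:
  artifact-0:
    type: artifact
    task: TASK_RETRIEVE
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      text-prompt: ${variable.query}
      top-k: 10  # optional; must be between 1 and 20, defaults to 5
output:
  chunks:
    title: chunks
    value: ${artifact-0.output.chunks}
```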

#Ask

Answer questions based on the files in the catalog.

| Input | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Task ID (required) | `task` | string | `TASK_ASK` |
| Catalog ID (required) | `catalog-id` | string | ID of the catalog to search |
| Namespace (required) | `namespace` | string | Your namespace. You can find it in the namespace-switching tab |
| Question (required) | `question` | string | The question to answer |
| Top K | `top-k` | integer | The number of top answers to return, from 1 to 20. Defaults to 5 |

| Output | ID | Type | Description |
| :--- | :--- | :--- | :--- |
| Answer | `answer` | string | Answer data from smart search |
| Chunks (optional) | `chunks` | array[object] | Chunk data used to answer the question |

#Example Recipes

Recipe for the Ask your Catalog pipeline.


```yaml
version: v1beta
component:
  artifact-0:
    type: artifact
    task: TASK_ASK
    input:
      catalog-id: ${variable.catalog_name}
      namespace: ${variable.namespace}
      question: ${variable.question}
      top-k: 5
variable:
  catalog_name:
    title: catalog-name
    description: The name of your catalog, e.g. "instill-ai"
    instill-format: string
  namespace:
    title: namespace
    description: The namespace of your catalog, e.g. "instill-ai"
    instill-format: string
  question:
    title: question
    description: The question to ask your catalog, e.g. "What is Instill AI doing?", "What is Artifact?"
    instill-format: string
output:
  answer:
    title: answer
    value: ${artifact-0.output.answer}
```