Model

A Model component is an algorithm run on unstructured visual data to solve a specific CV task.

VDP uses the Triton Inference Server for model serving. It supports multiple deep learning frameworks, including TensorFlow, PyTorch, TensorRT and ONNX. In addition, the Python Backend enables Triton to serve any model written in Python. To make your models VDP-ready, please refer to Prepare Models.

#Definition

VDP uses ModelDefinition to define how to configure and import a model from a supported model source. Please check out Import Models to learn more.

Instill AI develops and maintains model sources (ModelDefinition). We use the release stage to indicate a model source's readiness.
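Concretely, importing a model is a single call to the model-backend that takes a ModelDefinition and a source-specific configuration. The sketch below assumes a GitHub model source identified as model-definitions/github; the exact identifier, configuration fields and repository value are placeholders to be checked against Import Models:

# Import a model from a GitHub repository (repository value is a placeholder)
curl -X POST http://localhost:8083/v1alpha/models -d '{
  "id": "my-model",
  "model_definition": "model-definitions/github",
  "configuration": {
    "repository": "<your-github-username>/<your-model-repository>"
  }
}'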

#Model Instance

A Model Instance represents a tagged snapshot of a Model. A model may have multiple model instances. The tag of a model instance depends on the model source and the version control tool that the model source uses.

VDP captures the versioning of your models via model instances. During model development, ML teams version models to track and manage changes and to collaborate among team members. When you use VDP to import your model, the different versions of your model are also imported as tagged model instances. This way, you keep a single history of your models, and the original versioning serves as the single source of truth.

Throughout the documentation, we use the two terms model and model instance interchangeably. For example, when we say that Model is the core component of a data pipeline that converts unstructured visual data into structured insights, what actually happens under the hood is that model instances are used to construct the pipeline.

A VDP pipeline can have multiple model instances. The examples below showcase pipeline recipes that incorporate single or multiple model instances.

A single-model-instance pipeline recipe:

{
  "source": "source-connectors/source-http",
  "model_instances": ["models/model-1/instances/v1.0"],
  "destination": "destination-connectors/postgres-db"
}
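A recipe that incorporates multiple model instances differs only in the model_instances list. A minimal sketch, in which the model and instance names are placeholders:

{
  "source": "source-connectors/source-http",
  "model_instances": [
    "models/model-1/instances/v1.0",
    "models/model-2/instances/v2.0"
  ],
  "destination": "destination-connectors/postgres-db"
}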

#Model deployment

When you import a model into VDP, all versions of the model are imported as model instances associated with the original version tags. Please refer to Import Models to learn about model versioning with supported model sources. Among all model instances, you can pick specific ones to deploy from the model repository and then use them in data pipelines.
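To check which model instances a model carries and their version tags, you can list them via the model-backend. This sketch assumes a list endpoint following the same resource naming as the other endpoints on this page; verify the route against the API reference:

# List all instances of a model (route is an assumption)
curl -X GET http://localhost:8083/v1alpha/models/{model-id}/instances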

Figure: VDP model and model instance.

#State

The state of a model instance can be UNSPECIFIED, OFFLINE, ONLINE or ERROR.

When a model is initially created, all model instances of the model are in the OFFLINE state by default.

A model instance can be switched to the OFFLINE state by invoking the model-backend endpoint :undeploy, but only when its current state is ONLINE.

A model instance can be switched to the ONLINE state by invoking the model-backend endpoint :deploy, but only when its current state is OFFLINE. The deployment operation can take time, depending on factors such as the network connection and the model size. While the deployment is in progress, the state of the model instance will be UNSPECIFIED.
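Both transitions are plain HTTP calls to the model-backend. A minimal sketch, assuming the same URL pattern as the :test endpoint shown in the Inference section below:

# Deploy a model instance (OFFLINE -> ONLINE)
curl -X POST http://localhost:8083/v1alpha/models/{model-id}/instances/{instance-id}:deploy

# Undeploy a model instance (ONLINE -> OFFLINE)
curl -X POST http://localhost:8083/v1alpha/models/{model-id}/instances/{instance-id}:undeploy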

If a model instance ends up in the ERROR state, it is undeployable on VDP. Please refer to Prepare Models to make your model VDP-ready.

Figure: the finite-state-machine (FSM) diagram for the model instance state transition logic.

#Inference

An inference is a prediction made in response to a question or task. In Machine Learning (ML) and Artificial Intelligence (AI), the term inference is often contrasted with training. To put it simply, inference is where the capabilities learnt during training are applied to data to "infer" a result. Inference is applied everywhere across industries, from photo tagging to autonomous driving.

VDP provides an automatic model inference server. After you import a model into VDP and deploy one of its model instances online, VDP dynamically generates dedicated API endpoints for model testing and debugging. You can then build end-to-end data pipelines that use the model to run ETL operations.

The API supports batch processing: you can send multiple images of popular formats (PNG and JPEG) in one request. Check the examples below. The API accepts images

  • sent by remote URL or Base64, or
  • uploaded by multipart.
cURL example using remote image URLs:

curl -X POST http://localhost:8083/v1alpha/models/{model-id}/instances/{instance-id}:test -d '{
  "inputs": [
    {
      "image_url": "https://artifacts.instill.tech/imgs/dog.jpg"
    },
    {
      "image_url": "https://artifacts.instill.tech/imgs/horse.jpg"
    }
  ]
}'
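The Base64 and multipart variants target the same model instance. The payload field image_base64 and the multipart route and field name below are assumptions based on common API conventions; verify them against the API reference:

# Base64: embed the image bytes in the JSON payload
# (base64 -w 0 is GNU coreutils; on macOS use: base64 < dog.jpg | tr -d '\n')
curl -X POST http://localhost:8083/v1alpha/models/{model-id}/instances/{instance-id}:test -d '{
  "inputs": [
    { "image_base64": "'"$(base64 -w 0 dog.jpg)"'" }
  ]
}'

# Multipart: upload local image files (route suffix is an assumption)
curl -X POST http://localhost:8083/v1alpha/models/{model-id}/instances/{instance-id}:test-multipart \
  -F 'file=@dog.jpg' \
  -F 'file=@horse.jpg'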

in which http://localhost:8083 is the default URL of the model-backend, and {model-id} and {instance-id} correspond to the model ID and the model instance ID, respectively.
