A Model component is an algorithm that runs on unstructured data to solve a specific AI task.
VDP uses Triton Inference Server for model serving. Triton supports multiple deep learning frameworks, including TensorFlow, PyTorch, TensorRT and ONNX. In addition, its Python Backend enables Triton to serve any model written in Python. To make your models VDP-ready, please refer to Prepare Models.
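With the Python Backend, serving a Python model boils down to implementing a TritonPythonModel class in a model.py file. Below is a minimal, illustrative stub; the tensor names INPUT0 and OUTPUT0 are assumptions and must match the names declared in the model's config.pbtxt.

```python
# model.py - a minimal Triton Python Backend stub.
import json

import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # args["model_config"] holds the model configuration as a JSON string.
        self.model_config = json.loads(args["model_config"])

    def execute(self, requests):
        responses = []
        for request in requests:
            # Read the input tensor; "INPUT0" is an illustrative name.
            input_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT0")
            data = input_tensor.as_numpy()
            # Run any Python inference logic here; identity pass-through as a placeholder.
            output_tensor = pb_utils.Tensor("OUTPUT0", data.astype(np.float32))
            responses.append(pb_utils.InferenceResponse(output_tensors=[output_tensor]))
        return responses

    def finalize(self):
        # Called once when the model is unloaded.
        pass
```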
#Model Connector (coming soon!)
VDP uses ModelConnector to define how to connect AI models for processing unstructured data from different sources, whether from a third-party model provider or an in-house model serving platform.
#Definition
VDP uses ModelDefinition to define how to configure and import a model from a supported model source. Please check out Import Models to learn more.
Instill AI develops and maintains model sources (ModelDefinition). We use the release stage to indicate a model source's readiness.
#Model
A Model, the core component of a VDP data pipeline, is an ML algorithm designed to process unstructured data for a certain AI task.
A VDP pipeline can have multiple models running in parallel. The examples below showcase pipeline recipes that incorporate single or multiple models.
#Model importing and deployment
VDP provides an automated model inference server. After you import a model from a supported model source (e.g., GitHub or Hugging Face) and deploy it online, VDP dynamically generates dedicated API endpoints for model testing and debugging. You can then build end-to-end data pipelines that use the models to run ETL operations. Please refer to Import Models to learn about model versioning with supported model sources.
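As a rough sketch of the import flow (the base URL, path, and payload fields below are assumptions for illustration; see Import Models for the actual API):

```python
import requests

BASE = "http://localhost:8080/v1alpha"  # assumed local model-backend address

# Import a model from a GitHub model source (payload shape is illustrative).
resp = requests.post(
    f"{BASE}/models",
    json={
        "id": "yolov4",
        "model_definition": "model-definitions/github",
        "configuration": {"repository": "instill-ai/model-yolov4"},
    },
)
resp.raise_for_status()
print(resp.json())  # the created model resource (response shape is an assumption)
```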
#State
The state of a model can be UNSPECIFIED, OFFLINE, ONLINE, or ERROR.
When a model is initially created, its state is OFFLINE by default.
A model can be switched to OFFLINE state by invoking the model-backend endpoint /undeploy only when its original state is ONLINE.
A model can be switched to ONLINE state by invoking the model-backend endpoint /deploy only when its original state is OFFLINE.
A model deployment operation can take time, depending on factors such as network connection and model size.
Before a model is deployed online, the state will be UNSPECIFIED.
If the state of a model ends up in ERROR, the model is undeployable on VDP. Please refer to Prepare Models to make your model VDP-ready.
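Because deployment is not instantaneous, a client can poll the model state until it settles. A minimal sketch, assuming a GET endpoint that returns the model with a state field (the field name and state values are assumptions):

```python
import time

import requests

BASE = "http://localhost:8080/v1alpha"  # assumed model-backend address
MODEL_ID = "yolov4"                     # illustrative model ID

# Poll every 5 seconds, up to 5 minutes.
for _ in range(60):
    model = requests.get(f"{BASE}/models/{MODEL_ID}").json()
    state = model.get("state")  # assumed field name
    if state == "ONLINE":
        break
    if state == "ERROR":
        raise RuntimeError("Model is undeployable on VDP; see Prepare Models")
    time.sleep(5)
else:
    raise TimeoutError("Model did not come online in time")
```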
#Inference
An inference is a prediction for a question or task. In Machine Learning (ML) and Artificial Intelligence (AI), the term inference is often contrasted with training: put simply, inference is where the capabilities learned during training are used to analyze data and "infer" a result. Inference is applied everywhere across industries, from photo tagging to autonomous driving.
After deploying a model, you can send multiple images in popular formats (PNG and JPEG) in one request to the generated model API endpoint. Check the examples below. The API accepts batched images
- sent by remote URL or Base64, or
- uploaded by multipart,
where {id} in the endpoint path corresponds to the ID of the model.
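For illustration, the sketch below sends a batched request in both styles with Python's requests library; the endpoint paths and payload keys are assumptions, and {id} is replaced by the model ID:

```python
import base64

import requests

BASE = "http://localhost:8080/v1alpha"  # assumed model-backend address
MODEL_ID = "yolov4"                     # illustrative; replaces {id} in the endpoint path

# Style 1: batched images by remote URL and Base64 (payload keys are assumptions).
with open("dog.jpg", "rb") as f:
    dog_b64 = base64.b64encode(f.read()).decode()

resp = requests.post(
    f"{BASE}/models/{MODEL_ID}/test",
    json={"inputs": [
        {"image_url": "https://example.com/cat.png"},
        {"image_base64": dog_b64},
    ]},
)
print(resp.json())

# Style 2: the same batch uploaded as multipart file parts.
with open("cat.png", "rb") as cat, open("dog.jpg", "rb") as dog:
    resp = requests.post(
        f"{BASE}/models/{MODEL_ID}/test-multipart",
        files=[("file", cat), ("file", dog)],
    )
print(resp.json())
```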