To prepare a model for Instill Model:
- Create a model card `README.md` to describe your model
- Write a pre-processing model and a post-processing model that are compatible with the Triton Python Backend
- Prepare the model configuration file for your inference model
- Set up an ensemble model to encapsulate a pre-processing model → inference model → post-processing model procedure
- Organise the model files into a valid Instill Model model layout
#Model Card
A model card is a `README.md` file that accompanies the model to describe useful information and additional model metadata. Under the hood, a model card is associated with a specific model.
It is crucial for reproducibility, sharing and discoverability. We highly recommend adding a model card `README.md` file when preparing a model for use in Instill Model.
In a model card, you can provide information about:
- the model itself
- its use cases and limitations
- the datasets used to train the model
- the training experiments and configuration
- benchmarking and evaluation results
- reference materials
After importing a model into Instill Model, the model card will be rendered in the Console on the Model page. Below is an example model card for a model imported from the GitHub repository `model-mobilenetv2`.

Try our Import GitHub models guide to import a model from GitHub.
#Model Card Metadata
You can insert Front Matter at the top of a model card to define the model metadata. Start with three dashes (`---`), include the metadata, and close the section with another `---`, as in the example below.
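Here is a minimal sketch of such front matter; the `Task` key follows the next section, but the exact accepted key and value spellings should be checked against the supported AI tasks.

```yaml
---
# Hedged sketch: "Task" declares the AI task this model solves; confirm the
# accepted values (e.g. for image classification) in the supported AI tasks list.
Task: Classification
---
```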
#Specify an AI Task
When importing the model, Instill Model will detect the `Task` in the model card and verify whether the model output fulfils the AI task requirements.
If the model is verified, Instill Model will automatically convert the model output into the corresponding standardised VDP AI task format whenever the model is used.
Please check the supported AI tasks and the corresponding output format for each task.
If not specified, the model will be recognised as solving an `Unspecified` AI task, and the raw model output will be wrapped in a standard format.
❓ How to know if the AI task metadata is correctly recognised?
If you include valid AI task metadata, it will show on the Model page of the Console like this:

#Model Layout
Leveraging the Triton Inference Server for model serving, Instill Model supports multiple deep learning frameworks such as TensorFlow, PyTorch, TensorRT, and ONNX. Furthermore, the Python Backend enables Instill Model to serve any model written in Python.
To deploy a model on Instill Model, we suggest preparing the model files according to the following layout:
```
├── README.md
├── <pre-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
├── <infer-model>
│   ├── 1
│   │   └── <model-file>
│   └── config.pbtxt
├── <post-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
└── <ensemble-model>
    ├── 1
    │   └── .keep
    └── config.pbtxt
```
The above layout displays a typical Instill Model model consisting of:
- `README.md` - model card to embed the metadata in front matter and descriptions in Markdown format
- `<pre-model>` - Python model to pre-process input images
- `<infer-model>` - model to convert the unstructured data into structured data output, usually a Deep Learning (DL) / Machine Learning (ML) model
- `<post-model>` - Python model to post-process the output of the `<infer-model>` into desired formats
- `<ensemble-model>` - Triton ensemble model to connect the input and output tensors between the pre-processing, inference and post-processing models
- `config.pbtxt` - model configuration for each sub model
You can name the `<pre-model>`, `<infer-model>`, `<post-model>` and `<ensemble-model>` folders freely, provided that the folder names are clear and semantic. All these models bundle into a deployable model for Instill Model.
As long as your model fulfils the required Triton model repository layout, it can be safely imported into Instill Model and deployed online.
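To make the role of the `<ensemble-model>` concrete, below is a hedged sketch of what its `config.pbtxt` could look like for the "cat vs. dog" example used later in this guide; the model names, tensor names, data types and dimensions are placeholders rather than a definitive configuration.

```
name: "ensemble-model"
platform: "ensemble"
max_batch_size: 1
input [
  {
    name: "input"
    data_type: TYPE_STRING
    dims: [ 1 ]
  }
]
output [
  {
    name: "output"
    data_type: TYPE_FP32
    dims: [ 2 ]
  }
]
ensemble_scheduling {
  step [
    {
      model_name: "pre-model"
      model_version: -1
      input_map { key: "input" value: "input" }
      output_map { key: "preprocessed_image" value: "preprocessed_image" }
    },
    {
      model_name: "infer-model"
      model_version: -1
      input_map { key: "input" value: "preprocessed_image" }
      output_map { key: "output" value: "raw_output" }
    },
    {
      model_name: "post-model"
      model_version: -1
      input_map { key: "raw_output" value: "raw_output" }
      output_map { key: "output" value: "output" }
    }
  ]
}
```

Each `input_map`/`output_map` entry maps a tensor name inside a sub model (`key`) to a tensor name in the ensemble (`value`), which is how the pre-processing output is routed into the inference model and the inference output into the post-processing model.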
#Serve Models Written in Python
To deploy your pre-processing and post-processing models with Python code, use the Triton Python Backend, which supports conda-pack to deploy Python models together with their dependencies.
We have prepared a custom Conda environment with pre-installed libraries including scikit-learn, Pillow, PyTorch, torchvision, Transformers and triton_python_model. It is shipped with the NVIDIA GPU Cloud containers and uses Python 3.8.
If your model is not compatible with Python 3.8 or requires additional dependencies, you can create your own Conda environment and configure the `config.pbtxt` to point to the custom conda-pack tar file accordingly.
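As a reference sketch, the Triton Python Backend locates the packed environment through the `EXECUTION_ENV_PATH` parameter in `config.pbtxt`; the archive name below is a placeholder for your own conda-pack file.

```
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: {string_value: "$$TRITON_MODEL_DIRECTORY/custom-conda-env.tar.gz"}
}
```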
#Prepare Pre-processing Model
🙌 After preparing your model to be Instill Model compatible, check out Import Models to learn how to import the model into Instill Model from different sources.
To prepare the pre-processing model in Python, create a Python file with a structure similar to the sketch below:
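The following is a minimal sketch only; the tensor names (`input`, `preprocessed_image`) and the 224×224 target size are illustrative assumptions, and in Instill Model the Triton plumbing shown in `execute` is typically provided by the pre-processing base class in triton_python_model.

```python
# Minimal sketch of a pre-processing model.py for the Triton Python Backend.
# Tensor names and the target size are assumptions for illustration only.
import io

import numpy as np
from PIL import Image
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    def initialize(self, args):
        # Called once when Triton loads the model.
        self.target_size = (224, 224)

    def pre_process_batch_request(self, raw_images):
        """Decode, resize and normalise every image in a batch request."""
        batch = []
        for raw_bytes in raw_images.reshape(-1):
            img = Image.open(io.BytesIO(raw_bytes)).convert("RGB")
            img = img.resize(self.target_size)
            batch.append(np.asarray(img, dtype=np.float32) / 255.0)
        return np.stack(batch)

    def execute(self, requests):
        responses = []
        for request in requests:
            # "input" carries the raw image bytes sent to the pipeline.
            raw = pb_utils.get_input_tensor_by_name(request, "input").as_numpy()
            processed = self.pre_process_batch_request(raw)
            out = pb_utils.Tensor("preprocessed_image", processed)
            responses.append(pb_utils.InferenceResponse(output_tensors=[out]))
        return responses
```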
Follow the above structure and implement the abstract method `pre_process_batch_request` to pre-process the input images in a batch request.
#Prepare Post-processing Model
You can prepare the post-processing model the same way as the pre-processing model. However, to get the model inference output in a standardised format, you can
- specify a supported AI task when creating the model card, and
- create a Python model that inherits the corresponding post-processing task class in triton_python_model.
If no task is specified when creating a model, the output will be the raw model output in a serialised JSON message.
#Image Classification
Learn more about Image Classification task
Assume we have a "cat vs. dog" model to infer whether an image shows a cat or a dog. Create a `labels.txt` file that lists all the pre-defined categories, with one category label per line, and add the file to the folder of the inference model.
`labels.txt` example:
```
cat
dog
```
Include the label file `labels.txt` in the model configuration of the inference model.
`config.pbtxt` example:
```
...
output [
  {
    ...
    label_filename: "labels.txt"
  }
]
...
```
Check the standardised output for Image Classification. Here is an output example:
```json
{
  "task": "TASK_CLASSIFICATION",
  "task_outputs": [
    {
      "classification": {
        "category": "dog",
        "score": 0.9
      }
    }
  ]
}
```
#Object Detection
Learn more about Object Detection task
Create a Python file with a structure similar to the sketch below. The file inherits the `PostDetectionModel` class and implements the `post_process_per_image` abstract method.
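Below is a hedged sketch of such a file; the import path, the content of `inputs` and the return convention are assumptions for illustration, so consult the triton_python_model package for the exact `PostDetectionModel` interface.

```python
# Hypothetical sketch of a detection post-processing model.py.
# The import path and method signature are assumptions; check the
# triton_python_model package for the real PostDetectionModel interface.
import numpy as np

from triton_python_model.task.detection import PostDetectionModel  # assumed path


class TritonPythonModel(PostDetectionModel):
    def post_process_per_image(self, inputs):
        # Assume the inference model emits (boxes, scores, labels) per image,
        # with boxes given as (x1, y1, x2, y2) pixel coordinates.
        boxes, scores, labels = inputs
        # Convert to the (top, left, width, height) convention used by the
        # standardised TASK_DETECTION output.
        bboxes = np.asarray(
            [[y1, x1, x2 - x1, y2 - y1] for x1, y1, x2, y2 in boxes],
            dtype=np.float32,
        )
        return bboxes, np.asarray(scores, dtype=np.float32), labels
```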
Then, add the file to the post-processing model folder.
Check the standardised output for Object Detection. Here is an output example:
```json
{
  "task": "TASK_DETECTION",
  "task_outputs": [
    {
      "detection": {
        "objects": [
          {
            "category": "dog",
            "score": 0.98,
            "bounding_box": {
              "top": 102,
              "left": 324,
              "width": 208,
              "height": 405
            }
          }
        ]
      }
    }
  ]
}
```
#Keypoint Detection
Learn more about Keypoint Detection task
Create a Python file in the same way as the Object Detection example above, inheriting the corresponding keypoint post-processing task class from triton_python_model, and add the file to the post-processing model folder.
Check the standardised output for Keypoint Detection. Here is an output example:
```json
{
  "task": "TASK_KEYPOINT",
  "task_outputs": [
    {
      "keypoint": {
        "objects": [
          {
            "keypoints": [
              { "x": 1052.8419, "y": 610.0058, "v": 0.84 },
              { "x": 1047.5118, "y": 514.04474, "v": 0.81 },
              ...
            ],
            "score": 0.99,
            "bounding_box": {
              "top": 299,
              "left": 185,
              "width": 1130,
              "height": 1210
            }
          }
        ]
      }
    }
  ]
}
```
#Instance Segmentation
Learn more about Instance Segmentation task
Check the standardised output for the Instance Segmentation task. Here is an output example:
```json
{
  "task": "TASK_INSTANCE_SEGMENTATION",
  "task_outputs": [
    {
      "instance_segmentation": {
        "objects": [
          {
            "rle": "2918,12,382,33,...",
            "score": 0.99,
            "bounding_box": {
              "top": 95,
              "left": 320,
              "width": 215,
              "height": 406
            },
            "category": "dog"
          },
          ...
        ]
      }
    }
  ]
}
```
#Unspecified
Learn more about Unspecified AI task
If your model is imported without specifying any task metadata, the model will be recognised as solving an `Unspecified` task. There is no need to prepare your model outputs to fit any particular format.
Check the standardised output for the Unspecified AI task. Assume we import the above "cat vs. dog" model without specifying the AI task metadata; here is an output example:
```json
{
  "task": "TASK_UNSPECIFIED",
  "task_outputs": [
    {
      "unspecified": {
        "raw_outputs": [
          {
            "data": [0, 1],
            "data_type": "FP32",
            "name": "output",
            "shape": [2]
          }
        ]
      }
    }
  ]
}
```