Prepare Models

This page describes how to prepare a model for Instill Model.

#Model Card

A model card is a README.md file that accompanies a model to provide helpful information along with additional model metadata. Under the hood, a model card is associated with a specific model. It is crucial for reproducibility, sharing and discoverability. We highly recommend adding a model card README.md file when preparing a model to be used in Instill Model.

In a model card, you can provide information about:

  • the model itself
  • its use cases and limitations
  • the datasets used to train the model
  • the training experiments and configuration
  • benchmarking and evaluation results
  • reference materials

After importing a model into Instill Model, the model card will be rendered in the Console on the Model page. Below is an example of the model card of a model imported from the GitHub repository model-mobilenetv2.

Model card in Console
INFO

Try our Import GitHub models guide to import a model from GitHub.

#Model Card Metadata

You can insert Front Matter in a model card to define the model metadata. Start with three dashes (---) at the top, then include all the metadata, and close the section with another --- like the example below.

README.md

---
Task: "any Instill Model supported AI task identifier"
Tags:
- tag1
- tag2
- tag3
---

#Specify an AI Task

When importing the model, Instill Model will detect the Task in the model card and verify whether the model output fulfils the requirements of that AI task. If the model passes verification, Instill Model will automatically convert the model output into the standardised format of the corresponding VDP AI task whenever the model is used. Please check the supported AI tasks and the corresponding output format for each task.

For example, set Task in the front matter according to the model type:

  • image-classification: Task: Classification
  • object-detection: Task: Detection
  • keypoint-detection: Task: Keypoint

If not specified, the model will be recognised as solving the Unspecified AI task, and the raw model output will be wrapped in a standard format.

❓ How do I know if the AI task metadata is correctly recognised?

If you include valid AI task metadata, it will show on the Model page of the Console like this:

AI task label in Console

#Model Layout

Leveraging the Triton Inference Server for model serving, Instill Model supports multiple deep learning frameworks such as TensorFlow, PyTorch, TensorRT and ONNX. Furthermore, the Triton Python Backend enables Instill Model to accommodate any model written in Python.

To deploy a model on Instill Model, we suggest preparing the model files in the following layout:


├── README.md
├── <pre-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
├── <infer-model>
│   ├── 1
│   │   └── <model-file>
│   └── config.pbtxt
├── <post-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
└── <ensemble-model>
    ├── 1
    │   └── .keep
    └── config.pbtxt

The above layout shows a typical Instill Model model, which consists of:

  • README.md - model card to embed the metadata in front matter and descriptions in Markdown format
  • <pre-model> - Python model to pre-process input images
  • <infer-model> - Model to convert the unstructured data into structured data output, usually a Deep Learning (DL) / Machine Learning (ML) model
  • <post-model> - Python model to post-process the output of the infer-model into desired formats
  • <ensemble-model> - Triton ensemble model to connect the input and output tensors between the pre-processing, inference and post-processing models
  • config.pbtxt - Model configuration for each sub-model

You can freely name the <pre-model>, <infer-model>, <post-model> and <ensemble-model> folders, provided that the names are clear and semantic. Together, these models bundle into a single deployable model for Instill Model.

INFO

As long as your model fulfils the required Triton model repository layout, it can be safely imported into Instill Model and deployed online.
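
As a quick sanity check before importing, you can verify the layout with a few lines of Python. This is a minimal sketch, assuming the model files live in a local folder named my-model (a hypothetical path): each sub-model folder must contain a config.pbtxt and at least one numeric version directory.

from pathlib import Path

def check_triton_layout(repo):
    """Print whether each sub-model folder has a config.pbtxt and a version directory."""
    for model_dir in sorted(p for p in Path(repo).iterdir() if p.is_dir()):
        has_config = (model_dir / 'config.pbtxt').is_file()
        versions = [p.name for p in model_dir.iterdir() if p.is_dir() and p.name.isdigit()]
        print(f"{model_dir.name}: config.pbtxt={'ok' if has_config else 'MISSING'}, "
              f"versions={versions or 'MISSING'}")

check_triton_layout('my-model')  # hypothetical local model folder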

#Serve Models Written in Python

To deploy your pre-processing and post-processing models with Python code, use the Triton Python Backend, which supports conda-pack for deploying Python models with dependencies. We have prepared a custom Conda environment with pre-installed libraries including scikit-learn, Pillow, PyTorch, torchvision, Transformers and triton_python_model. It is shipped with the NVIDIA GPU Cloud containers and uses Python 3.8.

If your model is not compatible with Python 3.8, or if it requires additional dependencies, you can create your own Conda environment and configure config.pbtxt to point to the custom conda-pack tar file accordingly.
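
For reference, conda-pack also exposes a Python API, so packing a custom environment can be scripted. A minimal sketch, assuming a Conda environment named my-custom-env (a hypothetical name) already exists with your dependencies installed:

import conda_pack

# Pack the existing Conda environment "my-custom-env" (hypothetical name)
# into a portable tar file that config.pbtxt can then point to.
conda_pack.pack(name='my-custom-env', output='my-custom-env.tar.gz')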

#Prepare Pre-processing Model

🙌 After preparing your model to be Instill Model compatible, check out Import Models to learn how to import the model into Instill Model from different sources.

To prepare a pre-processing model in Python, create a Python file with a structure similar to the one below:

model.py

import triton_python_backend_utils as pb_utils
from abc import ABC, abstractmethod


class TritonPythonModel(ABC):
    """Your Python model must use the same class name.
    Every Python model that is created must have "TritonPythonModel" as the class name.
    """

    def initialize(self, args):
        """`initialize` is called only once when the model is being loaded.
        Implementing `initialize` function is optional. This function allows
        the model to initialize any state associated with this model.

        Parameters
        ----------
        args : dict
            Both keys and values are strings. The dictionary keys and values are:
            * model_config: A JSON string containing the model configuration
            * model_instance_kind: A string containing model instance kind
            * model_instance_device_id: A string containing model instance device ID
            * model_repository: Model repository path
            * model_version: Model version
            * model_name: Model name
        """
        print('Initialized...')

    def execute(self, requests):
        """`execute` must be implemented in every Python model. `execute`
        function receives a list of pb_utils.InferenceRequest as the only
        argument. This function is called when an inference is requested
        for this model.

        Parameters
        ----------
        requests : list
            A list of pb_utils.InferenceRequest

        Returns
        -------
        list
            A list of pb_utils.InferenceResponse. The length of this list must
            be the same as `requests`
        """
        responses = []
        # Every Python backend must iterate through list of requests and create
        # an instance of pb_utils.InferenceResponse class for each of them. You
        # should avoid storing any of the input Tensors in the class attributes
        # as they will be overridden in subsequent inference requests. You can
        # make a copy of the underlying NumPy array and store it if it is
        # required.
        for request in requests:
            # Pre-process the request and append the response to the list
            response = self.pre_process_batch_request(request)
            responses.append(response)
        # You must return a list of pb_utils.InferenceResponse. Length
        # of this list must match the length of `requests` list.
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is optional. This function allows
        the model to perform any necessary clean ups before exit.
        """
        print('Cleaning up...')

    @abstractmethod
    def pre_process_batch_request(self, request):
        """`pre_process_batch_request` pre-processes a Triton Inference request
        and outputs a Triton Inference response

        Parameters
        ----------
        request : pb_utils.InferenceRequest
            A Triton Inference Request

        Returns
        -------
        pb_utils.InferenceResponse
            A Triton Inference Response
        """
        raise NotImplementedError(
            'Implement pre-process function for a Triton Inference request')

Follow the above structure and implement the abstract method pre_process_batch_request to pre-process the input images in a batch request.
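
For example, a concrete subclass could decode raw image bytes and resize them for the downstream inference model. The sketch below is illustrative only: the tensor names input and output and the 224×224 target size are hypothetical and must match your config.pbtxt. The method belongs inside the TritonPythonModel class above; the imports are shown for completeness.

import io

import numpy as np
from PIL import Image
import triton_python_backend_utils as pb_utils

def pre_process_batch_request(self, request):
    # The tensor name "input" is hypothetical; use the name declared in
    # your config.pbtxt.
    in_tensor = pb_utils.get_input_tensor_by_name(request, 'input')
    images = []
    for raw in in_tensor.as_numpy():
        # Each element carries the raw bytes of one image in the batch.
        data = raw if isinstance(raw, bytes) else raw.tobytes()
        img = Image.open(io.BytesIO(data)).convert('RGB')
        img = img.resize((224, 224))  # hypothetical target size
        images.append(np.asarray(img, dtype=np.float32) / 255.0)
    # The tensor name "output" is hypothetical as well.
    out_tensor = pb_utils.Tensor('output', np.stack(images))
    return pb_utils.InferenceResponse(output_tensors=[out_tensor])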

#Prepare Post-processing Model

You can prepare the post-processing model the same way as the pre-processing model. To get the model inference output in a standardised format, however, follow the task-specific instructions below.

If no task is specified when creating a model, the output will be the raw model output wrapped in a serialized JSON message.

#Image Classification

INFO

Learn more about Image Classification task

Assume we have a "cat vs. dog" model that infers whether an image contains a cat or a dog. Create a labels.txt file listing all the pre-defined categories, one category label per line, and add the file to the inference model folder.

labels.txt example


cat
dog

Include the label file labels.txt in the model configuration of the inference model.

config.pbtxt example


...
output [
  {
    ...
    label_filename: "labels.txt"
  }
]
...

Check the standardised output format for Image Classification. Here is an output example:


{
  "task": "TASK_CLASSIFICATION",
  "task_outputs": [
    {
      "classification": {
        "category": "dog",
        "score": 0.9
      }
    }
  ]
}
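
Under the hood, Triton uses labels.txt to map the index of the highest-scoring class to its label, which is what fills category and score above. A rough sketch of the equivalent logic in Python, assuming the raw scores are ordered the same way as the lines in labels.txt:

import numpy as np

labels = ['cat', 'dog']        # contents of labels.txt, in order
scores = np.array([0.1, 0.9])  # hypothetical raw classification scores

best = int(np.argmax(scores))
print({'category': labels[best], 'score': float(scores[best])})
# {'category': 'dog', 'score': 0.9}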

#Object Detection

INFO

Learn more about Object Detection task

Create a Python file with a structure similar to the one below. The file defines a class that inherits PostDetectionModel and implements the post_process_per_image abstract method. Then, add the file to the post-processing model folder:

model.py

import numpy as np
from triton_python_model.task.detection import PostDetectionModel


class TritonPythonModel(PostDetectionModel):
    """Your Python model must use the same class name.
    Every Python model that is created must have "TritonPythonModel" as the class name.
    """

    def __init__(self):
        """Constructor function must be implemented in every model.
        This function initializes the names of the input and output
        variables in the model configuration.
        """
        # super().__init__(input_names=[...], output_names=[...])
        # ...

    def post_process_per_image(self, inputs):
        """`post_process_per_image` must be implemented in every Python model.
        This function detects objects in one image of a batch.

        Args:
            inputs (Tuple[np.ndarray]): a sequence of model input array for one image

        Raises:
            NotImplementedError: all subclasses must implement this function of per-image post-processing for `DETECTION` task.

        Returns:
            Tuple[np.ndarray, np.ndarray]: a tuple of bounding box with label array: (`bboxes`, `labels`).
            - `bboxes`: the bounding boxes detected in this image with shape (n,5) or (0,). The bounding box format is [x1, y1, x2, y2, score] in the original image. `n` is the number of bounding boxes detected in this image.
            - `labels`: the labels corresponding to the bounding boxes with shape (n,) or (0,).
            The length of `bboxes` must be the same as that of `labels`.
        """
        # return np.array([[324, 102, 532, 507, 0.98]]), np.array(["dog"])  # Dummy detection example
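
As a concrete illustration, a subclass might convert raw network outputs into the required (bboxes, labels) pair as below. This is a minimal sketch: the tensor names, the score threshold, the category names and the raw output shapes are all hypothetical and depend on your model and config.pbtxt.

import numpy as np
from triton_python_model.task.detection import PostDetectionModel

class TritonPythonModel(PostDetectionModel):
    def __init__(self):
        # Hypothetical tensor names; they must match your config.pbtxt.
        super().__init__(input_names=['boxes', 'scores', 'classes'],
                         output_names=['bboxes', 'labels'])

    def post_process_per_image(self, inputs):
        # Hypothetical raw outputs for one image: boxes (m, 4) as
        # [x1, y1, x2, y2], scores (m,) and class indices (m,).
        boxes, scores, class_ids = inputs
        labels_map = np.array(['cat', 'dog'])  # hypothetical category names
        keep = scores > 0.5                    # drop low-confidence detections
        bboxes = np.hstack([boxes[keep], scores[keep].reshape(-1, 1)])  # (n, 5)
        labels = labels_map[class_ids[keep].astype(int)]                # (n,)
        return bboxes, labels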

Check the standardised output format for Object Detection. Here is an output example:


{
  "task": "TASK_DETECTION",
  "task_outputs": [
    {
      "detection": {
        "objects": [
          {
            "category": "dog",
            "score": 0.98,
            "bounding_box": {
              "top": 102,
              "left": 324,
              "width": 208,
              "height": 405
            }
          }
        ]
      }
    }
  ]
}
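
Note how the standardised bounding_box relates to the [x1, y1, x2, y2, score] rows returned by post_process_per_image: top and left are y1 and x1, while width and height are the side lengths. Using the dummy detection above:

x1, y1, x2, y2, score = 324, 102, 532, 507, 0.98

bounding_box = {
    'top': y1,          # 102
    'left': x1,         # 324
    'width': x2 - x1,   # 208
    'height': y2 - y1,  # 405
}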

#Keypoint Detection

INFO

Learn more about Keypoint Detection task

Create a Python file with a structure similar to the one below and add it to the post-processing model folder:

model.py

import numpy as np
from triton_python_model.task.keypoint import PostKeypointDetectionModel


class TritonPythonModel(PostKeypointDetectionModel):
    """Your Python model must use the same class name.
    Every Python model that is created must have "TritonPythonModel" as the class name.
    """

    def __init__(self):
        """Constructor function must be implemented in every model.
        This function initializes the names of the input and output
        variables in the model configuration.
        """
        # super().__init__(input_names=[...], output_names=[...])
        # ...

    def post_process_per_image(self, inputs):
        """`post_process_per_image` must be implemented in every Python model.
        This function detects keypoints of objects in one image of a batch.

        Args:
            inputs (Tuple[np.ndarray]): a sequence of model input array for one image

        Raises:
            NotImplementedError: all subclasses must implement this function of per-image post-processing for `KEYPOINT` task.

        Returns:
            np.ndarray: Keypoint Detection score array `scores` for this image. The shape of `scores` should be (n,), where n is the number of categories.
        """

Check the standardised output format for Keypoint Detection. Here is an output example:


{
  "task": "TASK_KEYPOINT",
  "task_outputs": [
    {
      "keypoint": {
        "objects": [
          {
            "keypoints": [
              {
                "x": 1052.8419,
                "y": 610.0058,
                "v": 0.84
              },
              {
                "x": 1047.5118,
                "y": 514.04474,
                "v": 0.81
              },
              ...
            ],
            "score": 0.99,
            "bounding_box": {
              "top": 299,
              "left": 185,
              "width": 1130,
              "height": 1210
            }
          }
        ]
      }
    }
  ]
}

#Instance Segmentation

INFO

Learn more about Instance Segmentation task

Check the standardised output format for Instance Segmentation. Here is an output example:


{
  "task": "TASK_INSTANCE_SEGMENTATION",
  "task_outputs": [
    {
      "instance_segmentation": {
        "objects": [
          {
            "rle": "2918,12,382,33,...",
            "score": 0.99,
            "bounding_box": {
              "top": 95,
              "left": 320,
              "width": 215,
              "height": 406
            },
            "category": "dog"
          },
          ...
        ]
      }
    }
  ]
}

#Unspecified

INFO

Learn more about Unspecified AI task

If your model is imported without specifying any task metadata, it will be recognised as solving the Unspecified AI task. There is no need to prepare your model outputs to fit any specific format.

Check the standardised output format for the Unspecified AI task. Assuming we import the above "cat vs. dog" model without specifying any AI task metadata, here is an output example:


{
  "task": "TASK_UNSPECIFIED",
  "task_outputs": [
    {
      "unspecified": {
        "raw_outputs": [
          {
            "data": [0, 1],
            "data_type": "FP32",
            "name": "output",
            "shape": [2]
          }
        ]
      }
    }
  ]
}
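
Since raw_outputs carries plain arrays plus their dtype and shape, it can be turned back into NumPy arrays client-side. A minimal sketch, assuming the JSON above has been parsed into a dict named result and that the dtype is FP32:

import numpy as np

# `result` is the parsed TASK_UNSPECIFIED response shown above.
result = {
    "task": "TASK_UNSPECIFIED",
    "task_outputs": [
        {"unspecified": {"raw_outputs": [
            {"data": [0, 1], "data_type": "FP32", "name": "output", "shape": [2]}
        ]}}
    ],
}

for out in result["task_outputs"][0]["unspecified"]["raw_outputs"]:
    arr = np.asarray(out["data"], dtype=np.float32).reshape(out["shape"])
    print(out["name"], arr)  # output [0. 1.]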
