To prepare a pre-processing model in Python, create a Python file with a structure similar to the one below:
model.py
```python
import triton_python_backend_utils as pb_utils
from abc import ABC, abstractmethod


class TritonPythonModel(ABC):
    """Your Python model must use the same class name. Every Python model
    that is created must have "TritonPythonModel" as the class name.
    """

    def initialize(self, args):
        """`initialize` is called only once when the model is being loaded.
        Implementing `initialize` function is optional. This function allows
        the model to initialize any state associated with this model.

        Parameters
        ----------
        args : dict
          Both keys and values are strings. The dictionary keys and values are:
          * model_config: A JSON string containing the model configuration
          * model_instance_kind: A string containing model instance kind
          * model_instance_device_id: A string containing model instance device ID
          * model_repository: Model repository path
          * model_version: Model version
          * model_name: Model name
        """
        print('Initialized...')

    def execute(self, requests):
        """`execute` must be implemented in every Python model. `execute`
        receives a list of pb_utils.InferenceRequest as its only argument.
        This function is called when an inference is requested for this model.

        Parameters
        ----------
        requests : list
          A list of pb_utils.InferenceRequest

        Returns
        -------
        list
          A list of pb_utils.InferenceResponse. The length of this list must
          be the same as `requests`
        """
        responses = []

        # Every Python backend must iterate through the list of requests and
        # create an instance of pb_utils.InferenceResponse for each of them.
        # You should avoid storing any of the input tensors in class
        # attributes, as they will be overridden in subsequent inference
        # requests. You can make a copy of the underlying NumPy array and
        # store it if required.
        for request in requests:
            # Pre-process the request and append the result to the responses list.
            response = self.pre_process_batch_request(request)
            responses.append(response)

        # You must return a list of pb_utils.InferenceResponse. The length
        # of this list must match the length of the `requests` list.
        return responses

    def finalize(self):
        """`finalize` is called only once when the model is being unloaded.
        Implementing `finalize` function is optional. This function allows
        the model to perform any necessary clean-ups before exit.
        """
        print('Cleaning up...')

    @abstractmethod
    def pre_process_batch_request(self, request):
        """`pre_process_batch_request` pre-processes a Triton Inference
        request and outputs a Triton Inference response.

        Parameters
        ----------
        request : pb_utils.InferenceRequest
          A Triton Inference Request

        Returns
        -------
        pb_utils.InferenceResponse
          A Triton Inference Response
        """
        raise NotImplementedError(
            'Implement pre-process function for a Triton Inference request')
```
Follow the structure above and implement the abstract method `pre_process_batch_request`
to pre-process the input images in a batch request.
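
For illustration, here is a minimal sketch of what such an implementation might look like. It assumes the model's `config.pbtxt` declares an input named `INPUT_0` (batched NHWC `uint8` images) and an output named `OUTPUT_0`; both tensor names and the normalization constants (standard ImageNet statistics) are placeholder assumptions, so adapt them to your own model configuration:

```python
import numpy as np
import triton_python_backend_utils as pb_utils


def pre_process_batch_request(self, request):
    # Read the batched image tensor from the request. The tensor name
    # "INPUT_0" is an assumption and must match the input name declared
    # in your config.pbtxt.
    images = pb_utils.get_input_tensor_by_name(request, "INPUT_0").as_numpy()

    # Example pre-processing: scale raw uint8 pixel values to [0, 1] and
    # normalize per channel. The mean/std values below are the standard
    # ImageNet statistics and serve only as an example; NHWC layout is
    # assumed so the (3,) arrays broadcast over the channel axis.
    images = images.astype(np.float32) / 255.0
    mean = np.array([0.485, 0.456, 0.406], dtype=np.float32)
    std = np.array([0.229, 0.224, 0.225], dtype=np.float32)
    images = (images - mean) / std

    # Wrap the result in an output tensor; "OUTPUT_0" must likewise match
    # the output name in your config.pbtxt.
    output = pb_utils.Tensor("OUTPUT_0", images)
    return pb_utils.InferenceResponse(output_tensors=[output])
```

Any real implementation would also typically handle resizing, datatype checks, and error responses, but those details depend on the downstream model this pre-processing step feeds.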