A model is the pipeline component that processes ingested unstructured data. VDP uses the Triton Inference Server for model serving. Triton supports multiple deep learning frameworks, including TensorFlow, PyTorch, TensorRT, ONNX and OpenVINO. In addition, the Triton Python Backend enables Triton to serve any model written in Python.
## VDP model layout
To deploy a model on VDP, we suggest preparing the model files in the following layout:
```
├── README.md
├── <pre-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
├── <infer-model>
│   ├── 1
│   │   └── <model-file>
│   └── config.pbtxt
├── <post-model>
│   ├── 1
│   │   └── model.py
│   └── config.pbtxt
└── <ensemble-model>
    ├── 1
    │   └── .keep
    └── config.pbtxt
```
The above layout displays a typical VDP model consisting of:

- `README.md` - the model card that embeds metadata in front matter and descriptions in Markdown format
- `<pre-model>` - a Python model to pre-process input images
- `<infer-model>` - a model to convert the unstructured data into structured data output, usually a Deep Learning (DL) / Machine Learning (ML) model (see the example configuration below)
- `<post-model>` - a Python model to post-process the output of the `<infer-model>` into the desired format
- `<ensemble-model>` - a Triton ensemble model to connect the input and output tensors between the pre-processing, inference and post-processing models
- `config.pbtxt` - the model configuration of each sub-model
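As a concrete illustration, the `config.pbtxt` of an `<infer-model>` served through Triton's ONNX Runtime backend might look like the sketch below. This is a minimal example, not a VDP requirement: the platform, tensor names and dimensions are placeholders you must replace with your own model's values.

```
name: "infer-model"
platform: "onnxruntime_onnx"
max_batch_size: 8
input [
  {
    name: "input"            # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 3, 224, 224 ]    # e.g. a CHW image tensor
  }
]
output [
  {
    name: "output"           # placeholder tensor name
    data_type: TYPE_FP32
    dims: [ 1000 ]           # e.g. classification logits
  }
]
```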
You can freely name the `<pre-model>`, `<infer-model>`, `<post-model>` and `<ensemble-model>` folders, provided the folder names are clear and semantic. All these models are bundled into a single deployable model for VDP.
As long as your model fulfils the required Triton model repository layout, it can be safely imported into VDP and deployed online.
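To show how the `<ensemble-model>` wires the sub-models together, here is a hedged sketch of its `config.pbtxt` using Triton's ensemble scheduling. All model names, tensor names and shapes are placeholders that must match your own sub-models' configurations.

```
name: "ensemble-model"
platform: "ensemble"
max_batch_size: 8
input [
  { name: "input", data_type: TYPE_STRING, dims: [ 1 ] }
]
output [
  { name: "output", data_type: TYPE_FP32, dims: [ 1000 ] }
]
ensemble_scheduling {
  step [
    {
      model_name: "pre-model"
      model_version: -1
      input_map { key: "pre_input" value: "input" }
      output_map { key: "pre_output" value: "preprocessed" }
    },
    {
      model_name: "infer-model"
      model_version: -1
      input_map { key: "input" value: "preprocessed" }
      output_map { key: "output" value: "logits" }
    },
    {
      model_name: "post-model"
      model_version: -1
      input_map { key: "post_input" value: "logits" }
      output_map { key: "post_output" value: "output" }
    }
  ]
}
```

In each step, `input_map` and `output_map` connect a sub-model's tensor (the key) to an ensemble-level tensor (the value), so the output of one stage feeds the input of the next.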
## Serve models written in Python
To deploy your pre-processing and post-processing models with Python code, use the Triton Python Backend, which supports conda-pack for deploying Python models with their dependencies.
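For reference, the skeleton below shows the structure the Triton Python Backend expects in a `model.py`. It is a minimal sketch rather than VDP's actual pre-processing code; the tensor names `INPUT_0` and `OUTPUT_0` are hypothetical and must match the names declared in the model's `config.pbtxt`.

```python
# Minimal sketch of a Triton Python Backend model (model.py).
import numpy as np
import triton_python_backend_utils as pb_utils


class TritonPythonModel:
    """Every Python model must implement a class named TritonPythonModel."""

    def initialize(self, args):
        # Called once when the model is loaded; args includes the
        # model configuration as a JSON string.
        self.model_config = args["model_config"]

    def execute(self, requests):
        # Triton batches inference requests; return one response per request.
        responses = []
        for request in requests:
            in_tensor = pb_utils.get_input_tensor_by_name(request, "INPUT_0")
            data = in_tensor.as_numpy()
            # Placeholder pre-processing: scale pixel values to [0, 1].
            processed = data.astype(np.float32) / 255.0
            out_tensor = pb_utils.Tensor("OUTPUT_0", processed)
            responses.append(
                pb_utils.InferenceResponse(output_tensors=[out_tensor])
            )
        return responses

    def finalize(self):
        # Optional clean-up when the model is unloaded.
        pass
```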
We have prepared a custom Conda environment with pre-installed libraries including scikit-learn, Pillow, PyTorch, torchvision, Transformers and `triton_python_model`. It is shipped with the NVIDIA GPU Cloud containers and uses Python 3.8.
If your model is not compatible with Python 3.8, or if it requires additional dependencies, you can create your own Conda environment and configure the `config.pbtxt` to point to the custom conda-pack tar file accordingly.
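As a hedged example, assume you package a custom environment named `custom-env` with conda-pack and place the tarball in the model folder:

```bash
# Hypothetical environment name; conda-pack bundles it into a relocatable tarball.
conda pack -n custom-env -o custom-env.tar.gz
```

The Python model's `config.pbtxt` can then reference the tarball through the Triton Python Backend's `EXECUTION_ENV_PATH` parameter:

```
parameters: {
  key: "EXECUTION_ENV_PATH",
  value: { string_value: "$$TRITON_MODEL_DIRECTORY/custom-env.tar.gz" }
}
```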
## Prepare your model to be VDP compatible
- Create a model card `README.md` to describe your model (see the sketch after this list)
- Write a pre-processing model and a post-processing model that are compatible with the Triton Python Backend
- Prepare the model configuration file for your inference model
- Set up an ensemble model to encapsulate the `pre-processing model → inference model → post-processing model` procedure
- Organise the model files into a valid VDP model layout
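A model card is a plain `README.md` with metadata in front matter followed by free-form Markdown. The sketch below is illustrative only; the front-matter keys (`Task`, `Tags`) are assumptions rather than a confirmed VDP schema.

```markdown
---
Task: "CLASSIFICATION"   # hypothetical front-matter key
Tags:
  - "Image Classification"
---

# Model description

Describe what the model does, its training data and its limitations here.
```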
🙌 After preparing your model to be VDP compatible, check out Import Models to learn how to import the model into VDP from different sources.