Pipeline

A Pipeline component is an end-to-end workflow that automates a sequence of sub-components to process visual data.

It is defined by a recipe which is essentially a JSON object:


{
"source": <source-connector-resource-name>,
"model_instances": [
<model-instance-resource-name>,
...
],
"destination": <destination-connector-resource-name>
}

A recipe describes a pipeline resource consisting of:

  • source - where the pipeline starts to ingest image/video data to be processed
  • model_instances - a number of deployed Vision AI models to process the ingested visual data in parallel to generate structured outputs
  • destination - where to send the structured outputs

Depending on the use case, the user can create a pipeline in SYNC, ASYNC, or FETCH mode. The pipeline mode is determined by the combination of selected source and destination. Continue to read for more details.

#Mode

#SYNC mode

A pipeline in the SYNC mode responds to a request synchronously. The result is sent back to the user right after the data is processed by Model. This mode is for real-time inference where low latency is of concern. The request flow when triggering a SYNC pipeline is shown below:

SYNC pipeline mode

To create a SYNC pipeline, both source and destination needs to be configured with the same protocol type. VDP supports HTTP and gRPC for a SYNC pipeline.

For example, this recipe defines a gRPC SYNC pipeline for real-time YOLOv7 inference:


{
"source": "source-connectors/source-grpc",
"model_instances": ["models/yolov7/instances/v1.0-cpu"],
"destination": "destination-connectors/destination-grpc"
}

#ASYNC mode

A pipeline in the ASYNC mode performs asynchronous workload. The user triggers the pipeline with an asynchronous request and only receives an acknowledged response. Once the data has been processed by Model, the result is sent to the data destination. This mode is for use cases that do not require inference results immediately.

ASYNC pipeline mode

To create an ASYNC pipeline, the source can be either HTTP or gRPC, and the destination can be any Airbyte destination connectors.

For example, this recipe defines anASYNC pipeline to detect images by YOLOv7 via HTTP request and write the structured detection output into a PostgreSQL database:


{
"source": "source-connectors/source-http",
"model_instances": ["models/yolov7/instances/v1.0-cpu"],
"destination": "destination-connectors/postgres"
}

#FETCH mode (coming soon!)

A pipeline in the FETCH mode performs scheduled workload to regularly fetch data from the Source to send to Model for inference and write to destination in the end.

FETCH pipeline mode

#State

When a pipeline is initially created, the pipeline state is determined by the pipeline's sub-components with the precedence:

  • ERROR if ANY of its sub-components are in their ERROR state
  • UNSPECIFIED if ANY of its sub-components are in their UNSPECIFIED state
  • INACTIVE if ANY of its sub-components are in their negative state
  • ACTIVE if ALL of its sub-components are in their positive state

A pipeline can be switched to INACTIVE state by invoking the pipeline-backend endpoint :deactivate only when its original state is ACTIVE.

A pipeline can be switched to ACTIVE state by invoking the pipeline-backend endpoint :activate only when its original state is INACTIVE.

pipeline state

The finite-state-machine (FSM) diagram for the pipeline state transition logic.

Last updated: 8/19/2022, 1:07:06 AM