# Unlock New Possibilities with ⚗️ Instill Model
⚗️ Instill Model is a sophisticated MLOps/LLMOps platform designed to orchestrate model serving and monitoring for consistent, reliable performance. It enables efficient management and deployment of deep learning models for unstructured data ETL, serving models locally with 🔮 Instill Core and in the cloud with ☁️ Instill Cloud.
## Why use ⚗️ Instill Model?
- Seamless Integration with 💧 Instill VDP: Integrate effortlessly with our Versatile Data Pipeline, allowing for streamlined unstructured data ETL and model serving workflows.
- No-Code Console Builder: Easily use custom models defined with ⚗️ Instill Model as modular AI components via 📺 Instill Console, allowing for seamless integration into downstream tasks.
- AutoML Feature (Coming Soon): With the upcoming AutoML feature, ⚗️ Instill Model will soon be capable of automating model training and tuning, simplifying model optimization for deployment.
This step-by-step tutorial will guide you through the process of setting up your own custom model with ⚗️ Instill Model for local deployment with 🔮 Instill Core.
## Prerequisites
- Python SDK: Please ensure that you have installed the latest version of the Python SDK by running:

  ```bash
  pip install instill-sdk
  ```

- Docker: Both 🔮 Instill Core and ⚗️ Instill Model use Docker to ensure that models and code can be deployed in consistent, isolated, and reproducible environments. Please ensure that you have Docker installed and running by following the official instructions, and see our deployment guide for recommended resource settings.

- ⌨️ Instill CLI: The easiest way to launch 🔮 Instill Core for local deployment is via the ⌨️ Instill CLI. To install it using Homebrew, run the following command in your terminal:

  ```bash
  brew install instill-ai/tap/inst
  ```

- Launch 🔮 Instill Core: To launch, simply run:

  ```bash
  inst local deploy
  ```

  Please note that the initial launch process may take up to 1 hour, depending on your internet speed. Subsequent launches will be much faster, usually completing in under 5 minutes.

- 📺 Instill Console: Now that 🔮 Instill Core has been deployed, you can access 📺 Instill Console at http://localhost:3000. Please use the following initial login details to initiate the password reset process for onboarding:

  - Username: `admin`
  - Password: `password`

For further details about launching 🔮 Instill Core, we recommend referring to the deployment guide, which covers alternative ways to launch with Docker Compose or with Kubernetes via Helm.

Finally, please note that this guide assumes you have a basic understanding of machine learning and can code in Python. If you are new to these concepts, we recommend looking at our quickstart guide, which introduces our no/low-code 💧 Instill VDP pipeline builder, and taking a look at some of our other tutorials.
## Step-by-Step Tutorial
### Step 1: Create a Model Namespace
To get started, navigate to the Model page in the console window and click the + Create Model button.
This should bring up a configuration window (see image below) where you can configure your model settings. For a full description of the available fields, please refer to the Create Namespace page.
In this tutorial, we will walk through creating and deploying a version of the TinyLlama-1.1B-Chat model. To follow along, please fill in the configuration fields as per the image below.
You have now created an empty model namespace on ⚗️ Instill Model. In the next sections of this tutorial, we will show you how to define your own custom model for deployment!
### Step 2: Create a Model Config
To prepare a model to be served with ⚗️ Instill Model, you first need to create your own model directory containing two files: `model.py` and `instill.yaml`. Within the Python SDK, you can run the following helper command to generate the corresponding template files, which we can then modify:

```bash
instill init
```
To configure the TinyLlama-1.1B-Chat model, simply open the `instill.yaml` file and populate it with:
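A minimal sketch of such a config is shown below, assuming the schema produced by `instill init`; the package list is an illustrative assumption (unpinned here), so check the generated template for the authoritative fields:

```yaml
build:
  gpu: true               # whether to build a GPU-enabled serving image
  python_version: "3.11"  # must match your local Python version (see Step 4)
  python_packages:
    - torch               # runtime framework for the transformers pipeline
    - transformers        # provides the pipeline() helper used in model.py
    - accelerate          # enables device_map="auto" model placement
```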
This file specifies the dependencies required to run the model. We will be loading these libraries in the next stage where we define our model class!
### Step 3: Write a Model Script
In this step we will create the `model.py` file, which will contain the model class definition. This will be broken down into three phases to demonstrate the structure of the model class and explain the methods it should implement.
#### Define the Model Initialization
The first phase involves defining the model class and creating the `__init__` constructor, which is responsible for loading the model. Here we will use `pipeline()` from the transformers library to load the TinyLlama-1.1B-Chat model directly.
The `instill.helpers.ray_config` package contains the decorators and deployment object for the model class, which we will use to convert the model class into a servable model. These are required to properly define a model class for ⚗️ Instill Model.
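A sketch of this first phase is shown below. The decorator name `instill_deployment` and the pipeline arguments are assumptions based on Instill Python SDK conventions; the model identifier is the public Hugging Face checkpoint for TinyLlama-1.1B-Chat:

```python
import torch
from transformers import pipeline

from instill.helpers.ray_config import instill_deployment


@instill_deployment
class TinyLlama:
    def __init__(self):
        # Load the TinyLlama-1.1B-Chat checkpoint directly via transformers
        self.pipeline = pipeline(
            "text-generation",
            model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
            torch_dtype=torch.bfloat16,  # assumption: half precision to reduce memory
            device_map="auto",
        )
```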
#### Define the Model Metadata Method
In the second phase, we define the `ModelMetadata` method, which is responsible for communicating the model's expected input and output shapes to the backend service. To facilitate this easily, we can make use of the Python SDK through the `instill.helpers` module, which provides a number of functions that can be selected according to the AI Task the model performs.
Here, we recognise that the TinyLlama-1.1B-Chat model falls under the Text Generation Chat AI Task, and so we will make use of the `construct_text_generation_chat_metadata_response` helper function. Please refer here for a full list of the supported AI Tasks.
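Within the class, the method can simply delegate to that helper, which is imported from `instill.helpers`. A sketch, assuming the single `req` argument used by the SDK templates:

```python
    # Inside the TinyLlama class defined above
    def ModelMetadata(self, req):
        # Report the expected input/output shapes for the Text Generation Chat task
        return construct_text_generation_chat_metadata_response(req=req)
```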
#### Implement the Inference Method
In the third phase, we implement the inference method `__call__`, which handles the trigger request from ⚗️ Instill Model, contains the necessary logic to run the inference, and constructs the response. We use the `StandardTaskIO` module to parse the request payload into input parameters and to convert the model outputs to the appropriate response format.
The `TextGenerationChatInput` class from `instill.helpers.const` is used to define the input format for the Text Generation Chat AI Task, and the `construct_text_generation_chat_infer_response` function from `instill.helpers` is used to format the model output into the appropriate response format.
Putting it all together, your `model.py` file should now look like this:
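A minimal end-to-end sketch assembled from the three phases above is shown below. The `StandardTaskIO` import path and parser names, the snake_case attribute names on `TextGenerationChatInput`, the chat-template handling, and the final `InstillDeployable` registration are assumptions based on Instill Python SDK conventions, so treat this as illustrative rather than canonical:

```python
import torch
from transformers import pipeline

from instill.helpers import (
    construct_text_generation_chat_infer_response,
    construct_text_generation_chat_metadata_response,
)
from instill.helpers.const import TextGenerationChatInput
from instill.helpers.ray_config import instill_deployment, InstillDeployable
from instill.helpers.ray_io import StandardTaskIO


@instill_deployment
class TinyLlama:
    def __init__(self):
        # Load the TinyLlama-1.1B-Chat checkpoint directly via transformers
        self.pipeline = pipeline(
            "text-generation",
            model="TinyLlama/TinyLlama-1.1B-Chat-v1.0",
            torch_dtype=torch.bfloat16,
            device_map="auto",
        )

    def ModelMetadata(self, req):
        # Report the expected input/output shapes for the Text Generation Chat task
        return construct_text_generation_chat_metadata_response(req=req)

    async def __call__(self, request):
        # Parse the trigger request payload into task input parameters
        task_input: TextGenerationChatInput = (
            StandardTaskIO.parse_task_text_generation_chat_input(request=request)
        )

        # Build a conversation and render it with the model's chat template
        conv = [
            {"role": "system", "content": task_input.system_message},
            {"role": "user", "content": task_input.prompt},
        ]
        prompt = self.pipeline.tokenizer.apply_chat_template(
            conv, tokenize=False, add_generation_prompt=True
        )

        # Run inference with the sampling parameters from the request
        sequences = self.pipeline(
            prompt,
            max_new_tokens=task_input.max_new_tokens,
            do_sample=True,
            temperature=task_input.temperature,
            top_k=task_input.top_k,
        )

        # Convert the raw pipeline output into the standard task output format
        task_output = StandardTaskIO.parse_task_text_generation_chat_output(
            sequences=sequences
        )

        # Wrap the task output in the response expected by Instill Model
        return construct_text_generation_chat_infer_response(
            req=request,
            outputs=task_output,
        )


# Expose the class as a servable deployment
deployable = InstillDeployable(TinyLlama)
```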
Awesome 🎉, you have now defined your own custom model class for model serving with ⚗️ Instill Model. In the next step, we will show you how to build and deploy this model locally with 🔮 Instill Core!
### Step 4: Build and Deploy the Model
First, you must ensure that you have the same Python version installed in your local environment as specified in the `instill.yaml` file in Step 2, in this case `python_version: "3.11"`.
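As a quick sanity check (a one-liner, assuming `python3` is on your PATH):

```bash
python3 --version  # should print Python 3.11.x to match instill.yaml
```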
#### Build the Model Image
You can now build your model image by running the following command from within the directory containing the `model.py` and `instill.yaml` files:

```bash
instill build USER_ID/MODEL_ID -t v1
```
Importantly, you must replace `USER_ID` with `admin`, and `MODEL_ID` with `tinyllama`, the same Model ID that was specified in Step 1.
If you are building on a different architecture from the one you are deploying on, you must explicitly specify the target architecture using the `--target-arch` flag. For example, when building on an ARM machine and deploying to an AMD64 architecture, you must pass `--target-arch amd64` when running `instill build`. If unspecified, the target architecture will default to that of the system you are building on.
This command will build the model image under the version tag `v1`. Upon successful completion, you should see output similar to the following:

```
2024-05-28 01:54:44,404.404 INFO [Instill Builder] admin/tinyllama:v1 built
2024-05-28 01:54:44,423.423 INFO [Instill Builder] Done
```
#### Push the Model Image
To push the model image to the 🔮 Instill Core instance, we need to be able to log in to the hosted Docker registry. To do this, you first need to create an API token by:
- Selecting the profile icon in the top right corner of the console window and choosing the Settings option.
- Selecting API Tokens from the left-hand menu.
- Clicking the Create Token button, giving it a name, e.g. `tutorial`, and copying the generated API token.
Now we can log in to the Docker registry in the 🔮 Instill Core instance by running:

```bash
docker login localhost:8080
```

and entering the following credentials:

- Username: `admin`
- Password: `API_TOKEN` (replace this with the token you generated in the previous step)
Once logged in, we can push the model image `v1` to ⚗️ Instill Model with:

```bash
instill push USER_ID/MODEL_ID -t v1 -u localhost:8080
```

Again, remember to replace `USER_ID` with `admin` and `MODEL_ID` with `tinyllama`.
Upon successful completion, you should see output similar to the following:

```
2024-05-23 23:05:03,484.484 INFO [Instill Builder] localhost:8080/admin/tinyllama:v1 pushed
2024-05-23 23:05:03,485.485 INFO [Instill Builder] Done
```
⚗️ Instill Model will then automatically allocate the resources required by your model and deploy it. Please note that the deployment time varies based on the model size and hardware type.
#### Status Check
To check the status of your deployed model version, you can:

- Navigate back to the Models page on the Console.
- Select the `admin/tinyllama` model you created in Step 1.
- Click the Versions tab, where you will see the corresponding version ID or tag of your pushed model image and the Status of the deployment.

You should then see a screen similar to the image below.
The Status will initially show as `Starting`, indicating that your model is offline and ⚗️ Instill Model is still in the process of allocating resources and deploying it (this may take a few minutes). Once this status changes to `Active`, your model is ready to serve requests. 🚀
### Step 5: Inference
Once your model is deployed and `Active`, you can easily test its behaviour following these steps:

- Navigate to the Overview tab for your `admin/tinyllama` model.
- Enter a prompt in the Input pane (e.g. "What is a rainbow?").
- Scroll down and hit Run to trigger the model inference.

You should see a response generated by the TinyLlama-1.1B-Chat model in the Output pane.
To access model inferences via the API:

- Navigate to the API tab for your `admin/tinyllama` model.
- Follow the instructions by setting your own `INSTILL_API_TOKEN` as an environment variable:

  ```bash
  export INSTILL_API_TOKEN=********
  ```

  and using the provided `curl` command to send a request to the model endpoint:

  ```bash
  curl --location 'https://api.instill.tech/model/v1alpha/users/admin/models/tinyllama/trigger' \
    --header "Content-Type: application/json" \
    --header "Authorization: Bearer $INSTILL_API_TOKEN" \
    --data '{
      "taskInputs": [
        {
          "textGeneration": {
            "prompt": "How is the weather today?",
            "chatHistory": [
              {
                "role": "user",
                "content": [
                  {
                    "type": "text",
                    "text": "hi"
                  }
                ]
              }
            ],
            "systemMessage": "you are a helpful assistant",
            "maxNewTokens": 1024,
            "topK": 5,
            "temperature": 0.7
          }
        }
      ]
    }'
  ```
### Step 6: Tear Everything Down
After you have finished testing and serving your model, you might want to tear down the local 🔮 Instill Core instance to free up system resources. You can do this using the ⌨️ Instill CLI command:

```bash
inst local undeploy
```
## Conclusion
Congratulations on successfully deploying and serving a custom model with ⚗️ Instill Model and 🔮 Instill Core! 🎉
By following this tutorial, you've accomplished the following:
- Set up 🔮 Instill Core for local deployment.
- Created a model namespace and configured model settings.
- Defined and implemented a custom model class for the TinyLlama-1.1B-Chat model.
- Built and deployed your model image.
- Tested your model's inference capabilities via the Console and API.
- Undeployed the local 🔮 Instill Core instance.
⚗️ Instill Model and 🔮 Instill Core together provide a powerful, streamlined solution for managing and deploying your own deep learning models.
Excitingly, you can now connect your own custom models via the ⚗️ Instill Model AI component to construct bespoke 💧 Instill VDP pipelines tailored to your unstructured data ETL requirements. Please see the Create Pipeline page for more information on building 💧 Instill VDP pipelines with 📺 Instill Console.
Ultimately, these tools give you the freedom and creativity to develop and iterate on innovative AI-powered workflows that solve your real-world use cases.
Thank you for following along with this tutorial, and stay tuned for more updates soon! 🙌