How to utilize TorchServe (3) - Configuration

Configure the server and deploy the model

Sangmyeong Woo

TorchServe Configuration


TorchServe provides model developers like us, who don't specialize in server development, with a ready-made server for deployment. However, there is still a lot of customization to be done. Before running TorchServe, you can configure the server through config.properties. For more information, see the TorchServe configuration documentation. There are too many configuration variables to explain them all, so I'll just cover the essentials.

  • inference_address=
    The API address for inference, which defaults to 127.0.0.1:8080. See TorchServe(1) - Overview for more details.
    If you want external access (e.g. Containers -> Servers -> External Access), bind to an externally reachable address such as 0.0.0.0 instead of the loopback 127.0.0.1.
  • management_address=
    The API address for server management, which defaults to 127.0.0.1:8081. See TorchServe(1) - Overview for details.
  • metrics_address=
    The API address for metrics, which defaults to 127.0.0.1:8082.
  • batch_size=1
    The number of requests aggregated into a batch for the model before the handler is called (default 1).
  • default_workers_per_model=1
    The default number of worker processes to use per model.

There are many, many more configuration options. The descriptions in the documentation are a little terse, but if you dig around, pretty much everything you need is there. Set the appropriate options based on what your deployment actually requires.
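Pulling the options above together, a minimal config.properties might look like the following sketch (the port 8079 matches the inference test later in this post; the other values are just the defaults and are assumptions for illustration):

```properties
# Bind inference to all interfaces on a custom port for external access
inference_address=http://0.0.0.0:8079
# Management and metrics stay on the loopback defaults
management_address=http://127.0.0.1:8081
metrics_address=http://127.0.0.1:8082
# Where .mar archives live; can also be given via --model-store
model_store=./
default_workers_per_model=1
```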

Run TorchServe

Once configuration is done, all that's left is to run the server. The command is simple.

torchserve --start --model-store ./ --models custom=CUSTOM.mar --ts-config custom_config

--model-store is the path where the .mar file is stored; it can also be set in config.properties instead of on the command line. This command starts TorchServe and completes the model deployment. Let's poke at the actual API to see how it works.

I changed the port of the inference API to 8079, ran TorchServe, and pushed a file via PUT to test inference.

Execute command: curl http://localhost:8079/predictions/custom/1.0 -T test.txt

When I ran torchserve, I gave custom to the --models option, so the endpoint is predictions/custom; and since the version was 1.0 when I created the .mar model with torch-model-archiver, I ran the curl command against predictions/custom/1.0. The test.txt pushed with -T contains a single line: {"text": "TorchServe is working great"}.

The result is the server log shown below. You can see that the value was passed to the handle method as the entry point!

Server side message

We also got the text we wanted as the return value of the API call.

The test handle method we used to do this is shown below.

TorchServe Handle method test
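The original post shows the test handler only as a screenshot. A minimal sketch of what such a module-level handle entry point can look like is below; the class name, the echo-style inference step, and the {"result": ...} response shape are assumptions for illustration, not the post's exact code:

```python
import json

class CustomHandler:
    def __init__(self):
        self.initialized = False
        self.model = None

    def initialize(self, context):
        # A real handler would load the serialized model bundled
        # into the .mar here (e.g. with torch.load()).
        self.initialized = True

    def preprocess(self, data):
        # TorchServe passes a batch of requests; each item carries the
        # raw request body under "data" or "body".
        body = data[0].get("data") or data[0].get("body")
        if isinstance(body, (bytes, bytearray)):
            body = body.decode("utf-8")
        return json.loads(body)

    def inference(self, inputs):
        # Placeholder feed-forward: just echoes the input text.
        # A real handler would call self.model here.
        return inputs["text"]

    def postprocess(self, outputs):
        # Must return a list with one entry per request in the batch.
        return [{"result": outputs}]

_handler = CustomHandler()

def handle(data, context):
    # TorchServe calls handle(data, context); data is None on warm-up.
    if not _handler.initialized:
        _handler.initialize(context)
    if data is None:
        return None
    return _handler.postprocess(_handler.inference(_handler.preprocess(data)))
```

Calling handle with a body like the test.txt above returns the echoed text, matching the behavior seen in the curl test.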

Now that we've run TorchServe and verified that we get a value back, feel free to rework the handle and model code for your deployment: in initialize, declare and load the serialized model you included when creating the .mar file; then do any necessary preprocessing on the input, run the model feed-forward, apply post-processing, and return the desired output.
