Learning
TorchServe gives model developers like us, who don't specialize in server development, a well-structured server for deployment. However, there is still a lot of customization to be done. Before running TorchServe, you can configure the server through config.properties. For more information, see the TorchServe configuration documentation. There are so many different configuration variables that it's impossible to explain them all, so I'll just cover the essentials.
There are many, many more variables beyond these. The descriptions are a little unfriendly, but if you fumble around, pretty much everything you need seems to be there. Set the appropriate variables based on your current deployment environment.
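For orientation, a bare-bones config.properties covering the usual suspects might look something like the sketch below; the specific values here are just illustrative assumptions, not a recommendation.

# REST endpoints for inference, management, and metrics
inference_address=http://0.0.0.0:8080
management_address=http://0.0.0.0:8081
metrics_address=http://0.0.0.0:8082
# directory containing the .mar archives
model_store=./
# worker processes started for each loaded model
default_workers_per_model=1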
Once the configuration is ready, all that's left is to run it. The command is simple.
torchserve --start --model-store ./ --models custom=CUSTOM.mar --ts-config custom_config
--model-store is the path where the .mar file is stored; it doesn't matter whether you also set it in the config. Running this starts TorchServe and completes the model deployment. Let's poke at the actual API a bit to see how it works.
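A quick first poke is the built-in health check on the inference API. Assuming the default inference port 8080 (before any port changes), something like this should confirm the server came up healthy:

curl http://localhost:8080/ping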
I changed the port of the inference API to 8079, ran TorchServe, and pushed a file via PUT to test inference.
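That port change is a one-line edit in the config file passed with --ts-config, roughly like this (binding to all interfaces is just an assumption here):

inference_address=http://0.0.0.0:8079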
Execute command: curl http://localhost:8079/predictions/custom/1.0 -T test.txt
When I ran torchserve, I passed custom to the --models option, so the request goes to predictions/custom, and since the version was 1.0 when I created the .mar with torch-model-archiver, the curl command uses predictions/custom/1.0. The test.txt that I push with -T is a single line: {"text": "TorchServe is working great"}.
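For reference, an archiver call producing such a versioned archive looks roughly like this; the serialized-file and handler names are placeholders, not this project's actual files:

torch-model-archiver --model-name CUSTOM --version 1.0 --serialized-file custom_model.pt --handler custom_handler.py --export-path ./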
The result is the server log shown below. You can see that the value was passed to the handle method, the entry point!
We also got the text we wanted as the return value of the API call.
The test handle method we used to do this is shown below.
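As a rough sketch of such a test handle, assuming the module-level handle(data, context) entry-point style, something like this just echoes each request body straight back; the key lookups and the echoed field name are illustrative assumptions:

import json

def handle(data, context):
    # TorchServe calls this with a batch of requests; echo each one back
    if data is None:
        return None
    outputs = []
    for row in data:
        # the request body arrives under "data" or "body" depending on the client
        payload = row.get("data") or row.get("body")
        if isinstance(payload, (bytes, bytearray)):
            payload = payload.decode("utf-8")
        outputs.append(json.dumps({"received": payload}))
    return outputs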
Now that we've run TorchServe and verified that we get a value back, let's shape the handle method and the model code for deployment to our heart's content! In initialize, declare and load the serialized model you included when you created the .mar file; then do whatever preprocessing the input needs, run the model feed-forward, apply post-processing, and return the desired output.
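Putting that structure together, a handler might be sketched as below. This assumes a TorchScript model packaged as custom_model.pt and a JSON-in/JSON-out text task; the class name, file name, and the naive byte-level encoding are illustrative assumptions, not the project's actual code.

import json
import os
import torch


class CustomHandler:
    def __init__(self):
        self.model = None
        self.device = None
        self.initialized = False

    def initialize(self, context):
        # context.system_properties tells us where the .mar contents were extracted
        model_dir = context.system_properties.get("model_dir")
        self.device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
        # load the serialized model that was packaged into the .mar
        self.model = torch.jit.load(
            os.path.join(model_dir, "custom_model.pt"), map_location=self.device
        )
        self.model.eval()
        self.initialized = True

    def preprocess(self, data):
        # pull the JSON body out of each request in the batch and encode the text
        # very naively as a tensor of UTF-8 byte values (a real handler would
        # tokenize however its model expects)
        batch = []
        for row in data:
            payload = row.get("data") or row.get("body")
            if isinstance(payload, (bytes, bytearray)):
                payload = payload.decode("utf-8")
            if isinstance(payload, str):
                payload = json.loads(payload)
            text = payload["text"]
            batch.append(torch.tensor(list(text.encode("utf-8")), dtype=torch.long))
        return batch

    def inference(self, batch):
        # plain feed-forward through the loaded model
        with torch.no_grad():
            return [self.model(x.unsqueeze(0).to(self.device)) for x in batch]

    def postprocess(self, outputs):
        # one serialized result per request in the batch
        return [json.dumps({"prediction": out.squeeze(0).tolist()}) for out in outputs]


_handler = CustomHandler()


def handle(data, context):
    # TorchServe calls this entry point for every batch of requests
    if not _handler.initialized:
        _handler.initialize(context)
    if data is None:
        return None
    return _handler.postprocess(_handler.inference(_handler.preprocess(data)))

From here, swapping the preprocessing, feed-forward, and post-processing for your own model's logic is all that's left.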