Usage

Overview

This section of the Skyrun SDK documentation describes how to use a trained model for inference, generating predictions or recommendations from new input data. After the training phase, the model is compiled into Rust for faster inference and automatically deployed on Kubernetes for scalability. The inference process itself is straightforward, requiring only a user ID as input to generate personalized recommendations.

Configuring Inference Parameters

To perform inference with a sequence-to-sequence context-aware transformer model in the Skyrun SDK, you provide a unique user ID, the number of recommendations to return, and the URI of the deployed model. Inference is conducted through the client.recommender.transformer.inference function, which is optimized for efficiency and simplicity.

Defining Inference Parameters

Here is an example of how to use the inference function:

res = client.recommender.transformer.inference(
    user_id="ARFRKID8WSIIL",          # unique identifier of the user to recommend for
    k=3,                              # number of recommendations to return
    model_uri="first-model-0peibudq"  # URI of the deployed model to query
)
print(res)

Parameters:

  • user_id (Required): A unique identifier for the user. The system uses the user's historical interactions to generate personalized recommendations.

  • k (Required): The number of recommendations to return, letting you tailor the quantity of predictions to the user's or application's needs.

  • model_uri (Required): The unique identifier URI of the deployed model. This specifies which model to send the inference request to.

Usage Notes:

  • After training, the model is compiled into Rust, which makes the inference process more efficient.

  • Automatic deployment to Kubernetes provides scalable, robust handling of inference requests, suitable for applications of any scale.

Example Usage

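Building on the snippet above, the sketch below requests recommendations for several users against the same deployed model. It assumes a client initialized as in the earlier snippet; the second user ID is a placeholder for illustration, not a real identifier.

# Request the top 5 recommendations for a batch of users from one deployed model.
# The user IDs below are placeholders; substitute identifiers from your own data.
user_ids = ["ARFRKID8WSIIL", "A0000000EXAMPLE"]
model_uri = "first-model-0peibudq"  # URI returned when the model was deployed

for uid in user_ids:
    res = client.recommender.transformer.inference(
        user_id=uid,
        k=5,  # return the top 5 items for each user
        model_uri=model_uri,
    )
    print(uid, res)

Looping client-side keeps the example simple; each iteration is an independent request to the deployed model.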

Expected Output

The inference function returns a JSON response containing the model's predictions for the given user ID, namely the top k recommendations. In the event of an error, such as a network issue or an invalid model URI, a detailed exception is raised specifying the nature of the problem.
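Because the exact response schema and exception classes are not specified here, the sketch below is illustrative only: it treats the response as an opaque JSON value and catches a generic Exception. Substitute the SDK's concrete exception types once you know them.

# Illustrative only: the broad except clause is a stand-in for the SDK's
# specific exception classes, which this guide does not enumerate.
try:
    res = client.recommender.transformer.inference(
        user_id="ARFRKID8WSIIL",
        k=3,
        model_uri="first-model-0peibudq"
    )
except Exception as exc:  # e.g. a network failure or an invalid model_uri
    print(f"Inference failed: {exc}")
else:
    print(res)  # JSON response with the top-k recommendations

Catching Exception this broadly is shown for brevity; in production code, handle the SDK's specific error types so transient network failures can be retried while configuration errors surface immediately.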