Multi-task recommenders


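In the basic retrieval tutorial, we built a retrieval system using movie watches as positive interaction signals.

In many applications, however, there are multiple rich sources of feedback to draw on. For example, an e-commerce site may record user visits to product pages (abundant, but relatively low signal), image clicks, adding to cart, and, finally, purchases. It may even record post-purchase signals such as reviews and returns.

Integrating all these different forms of feedback is critical to building systems that users love to use, and that do not optimize for any one metric at the expense of overall performance.

In addition, building a joint model for multiple tasks may produce better results than building a number of task-specific models. This is especially true where some data is abundant (for example, clicks) and some data is sparse (purchases, returns, manual reviews). In those scenarios, a joint model may be able to use representations learned from the abundant task to improve its predictions on the sparse task, a phenomenon known as transfer learning. For example, this paper shows that a model predicting explicit user ratings from sparse user surveys can be substantially improved by adding an auxiliary task that uses abundant click log data.

In this tutorial, we are going to build a multi-objective recommender for Movielens, using both implicit signals (movie watches) and explicit signals (ratings).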

Imports

Let's first get our imports out of the way.

pip install -q tensorflow-recommenders
pip install -q --upgrade tensorflow-datasets
import os
import pprint
import tempfile

from typing import Dict, Text

import numpy as np
import tensorflow as tf
import tensorflow_datasets as tfds
import tensorflow_recommenders as tfrs

Preparing the dataset

We're going to use the Movielens 100K dataset.

ratings = tfds.load('movielens/100k-ratings', split="train")
movies = tfds.load('movielens/100k-movies', split="train")

# Select the basic features.
ratings = ratings.map(lambda x: {
    "movie_title": x["movie_title"],
    "user_id": x["user_id"],
    "user_rating": x["user_rating"],
})
movies = movies.map(lambda x: x["movie_title"])

We then repeat our preparations for building vocabularies and splitting the data into a train set and a test set:

# Randomly shuffle data and split between train and test.
tf.random.set_seed(42)
shuffled = ratings.shuffle(100_000, seed=42, reshuffle_each_iteration=False)

train = shuffled.take(80_000)
test = shuffled.skip(80_000).take(20_000)

movie_titles = movies.batch(1_000)
user_ids = ratings.batch(1_000_000).map(lambda x: x["user_id"])

unique_movie_titles = np.unique(np.concatenate(list(movie_titles)))
unique_user_ids = np.unique(np.concatenate(list(user_ids)))
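The same two preparation steps can be sketched without TensorFlow. This is only an illustration on a hypothetical ten-row interaction log (the `toy_*` arrays are made up for this example), assuming a single seeded shuffle followed by an 80/20 cut, with vocabularies taken as the sorted unique values:

```python
import numpy as np

# Hypothetical toy interaction log: one (user, movie) pair per position.
toy_user_ids = np.array(["1", "2", "1", "3", "2", "3", "1", "2", "3", "1"])
toy_movie_titles = np.array(["A", "B", "C", "A", "C", "B", "B", "A", "C", "A"])

# Shuffle once with a fixed seed so the split is reproducible.
rng = np.random.default_rng(42)
order = rng.permutation(len(toy_user_ids))

# 80/20 split on the shuffled order; the two index sets never overlap.
n_train = int(0.8 * len(order))
train_idx, test_idx = order[:n_train], order[n_train:]

# Vocabularies are the sorted unique values, which is what np.unique returns.
toy_unique_user_ids = np.unique(toy_user_ids)
toy_unique_movie_titles = np.unique(toy_movie_titles)

print(len(train_idx), len(test_idx))
print(toy_unique_user_ids, toy_unique_movie_titles)
```

Shuffling once and then slicing is what keeps the train and test rows disjoint; reshuffling between the two slices would let rows leak from one set into the other.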

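A multi-task model

There are two critical parts to multi-task recommenders:

  1. They optimize for two or more objectives, and so have two or more losses.
  2. They share variables between the tasks, allowing for transfer learning.

In this tutorial, we will define our models as before, but instead of having a single task, we will have two tasks: one that predicts ratings, and one that predicts movie watches.

The user and movie models are as before: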

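# Embedding size; the full model class later in this tutorial uses the same value.
embedding_dimension = 32

user_model = tf.keras.Sequential([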
  tf.keras.layers.StringLookup(
      vocabulary=unique_user_ids, mask_token=None),
  # We add 1 to account for the unknown token.
  tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
])

movie_model = tf.keras.Sequential([
  tf.keras.layers.StringLookup(
      vocabulary=unique_movie_titles, mask_token=None),
  tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
])

However, we will now have two tasks. The first is the rating task:

tfrs.tasks.Ranking(
    loss=tf.keras.losses.MeanSquaredError(),
    metrics=[tf.keras.metrics.RootMeanSquaredError()],
)

Its goal is to predict the ratings as accurately as possible.
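As a worked example of what this task measures, the following NumPy sketch computes the mean squared error (the loss being minimized) and its square root (the reported metric) on four hypothetical label and prediction pairs. It illustrates the arithmetic only and is not the Keras implementation.

```python
import numpy as np

# Hypothetical true ratings and model predictions for four examples.
labels = np.array([4.0, 3.0, 5.0, 2.0])
predictions = np.array([3.5, 3.0, 4.0, 3.0])

# Mean squared error: the quantity the rating loss minimizes.
mse = np.mean((labels - predictions) ** 2)

# Root mean squared error: the metric, in the same units as the ratings.
rmse = np.sqrt(mse)

print(mse, rmse)
```

Because RMSE is expressed in rating units, a value of 0.75 here means predictions are off by roughly three quarters of a star in the root-mean-square sense.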

The second is the retrieval task:

tfrs.tasks.Retrieval(
    metrics=tfrs.metrics.FactorizedTopK(
        # Candidates must be embeddings, so map the titles through movie_model.
        candidates=movies.batch(128).map(movie_model)
    )
)

As before, this task's goal is to predict which movies the user will or will not watch.
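To make the retrieval objective more concrete, here is a minimal NumPy sketch. It assumes an in-batch softmax formulation, in which each user's watched movie is the positive and the other movies in the batch act as negatives, together with a top-k accuracy computed by scoring every candidate. The function names are illustrative and are not part of the TFRS API.

```python
import numpy as np

def in_batch_softmax_loss(user_emb, movie_emb):
    """Summed cross-entropy where row i's positive movie is column i."""
    scores = user_emb @ movie_emb.T                      # (batch, batch) affinities
    scores = scores - scores.max(axis=1, keepdims=True)  # numerical stability
    log_probs = scores - np.log(np.exp(scores).sum(axis=1, keepdims=True))
    return -np.trace(log_probs)                          # only the diagonal is positive

def top_k_accuracy(user_emb, candidate_emb, true_idx, k):
    """Fraction of users whose true candidate is among the k highest scores."""
    scores = user_emb @ candidate_emb.T
    top_k = np.argsort(-scores, axis=1)[:, :k]
    return float(np.mean([t in row for t, row in zip(true_idx, top_k)]))

# Hypothetical embeddings: each user points at exactly one movie.
users = 3.0 * np.eye(3)
aligned_movies = np.eye(3)
misaligned_movies = np.roll(aligned_movies, 1, axis=0)

loss_aligned = in_batch_softmax_loss(users, aligned_movies)
loss_misaligned = in_batch_softmax_loss(users, misaligned_movies)
acc_aligned = top_k_accuracy(users, aligned_movies, [0, 1, 2], k=1)
acc_misaligned = top_k_accuracy(users, misaligned_movies, [0, 1, 2], k=1)
print(loss_aligned, loss_misaligned, acc_aligned, acc_misaligned)
```

When user and movie embeddings line up, the loss is small and the top-1 accuracy is 1.0; when the movie embeddings are rotated away from the true pairs, the loss grows and the accuracy drops to 0.0.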

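Putting it together

We put it all together in a model class.

The new component here is that, since we have two tasks and two losses, we need to decide how important each loss is. We can do this by giving each loss a weight and treating these weights as hyperparameters. If we assign a large loss weight to the rating task, our model will focus on predicting ratings (while still using some information from the retrieval task); if we assign a large loss weight to the retrieval task, it will focus on retrieval instead.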
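For intuition about how the weights steer a shared variable, consider a deliberately simplified toy: a single shared parameter `w` pulled toward 1.0 by one task and toward -1.0 by the other. Both the tasks and the function `train_shared_parameter` are hypothetical, and the two tasks here conflict much more directly than rating prediction and retrieval do.

```python
def train_shared_parameter(weight_a, weight_b, steps=200, lr=0.1):
    """Gradient descent on weight_a * (w - 1)^2 + weight_b * (w + 1)^2."""
    w = 0.3  # shared parameter, arbitrary starting point
    for _ in range(steps):
        grad_a = 2.0 * (w - 1.0)  # gradient of task A's loss
        grad_b = 2.0 * (w + 1.0)  # gradient of task B's loss
        # The combined gradient is the weighted sum of the per-task gradients.
        w -= lr * (weight_a * grad_a + weight_b * grad_b)
    return w

only_a = train_shared_parameter(1.0, 0.0)   # settles at task A's optimum
only_b = train_shared_parameter(0.0, 1.0)   # settles at task B's optimum
balanced = train_shared_parameter(1.0, 1.0) # settles halfway between
tilted = train_shared_parameter(3.0, 1.0)   # settles closer to task A
print(only_a, only_b, balanced, tilted)
```

A weight of zero removes that task's gradient entirely, so the shared parameter ignores it; unequal positive weights move the solution toward the more heavily weighted task.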

class MovielensModel(tfrs.models.Model):

  def __init__(self, rating_weight: float, retrieval_weight: float) -> None:
    # We take the loss weights in the constructor: this allows us to instantiate
    # several model objects with different loss weights.

    super().__init__()

    embedding_dimension = 32

    # User and movie models.
    self.movie_model: tf.keras.layers.Layer = tf.keras.Sequential([
      tf.keras.layers.StringLookup(
        vocabulary=unique_movie_titles, mask_token=None),
      tf.keras.layers.Embedding(len(unique_movie_titles) + 1, embedding_dimension)
    ])
    self.user_model: tf.keras.layers.Layer = tf.keras.Sequential([
      tf.keras.layers.StringLookup(
        vocabulary=unique_user_ids, mask_token=None),
      tf.keras.layers.Embedding(len(unique_user_ids) + 1, embedding_dimension)
    ])

    # A small model to take in user and movie embeddings and predict ratings.
    # We can make this as complicated as we want as long as we output a scalar
    # as our prediction.
    self.rating_model = tf.keras.Sequential([
        tf.keras.layers.Dense(256, activation="relu"),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(1),
    ])

    # The tasks.
    self.rating_task: tf.keras.layers.Layer = tfrs.tasks.Ranking(
        loss=tf.keras.losses.MeanSquaredError(),
        metrics=[tf.keras.metrics.RootMeanSquaredError()],
    )
    self.retrieval_task: tf.keras.layers.Layer = tfrs.tasks.Retrieval(
        metrics=tfrs.metrics.FactorizedTopK(
            candidates=movies.batch(128).map(self.movie_model)
        )
    )

    # The loss weights.
    self.rating_weight = rating_weight
    self.retrieval_weight = retrieval_weight

  def call(self, features: Dict[Text, tf.Tensor]) -> tf.Tensor:
    # We pick out the user features and pass them into the user model.
    user_embeddings = self.user_model(features["user_id"])
    # And pick out the movie features and pass them into the movie model.
    movie_embeddings = self.movie_model(features["movie_title"])

    return (
        user_embeddings,
        movie_embeddings,
        # We apply the multi-layered rating model to a concatenation of
        # user and movie embeddings.
        self.rating_model(
            tf.concat([user_embeddings, movie_embeddings], axis=1)
        ),
    )

  def compute_loss(self, features: Dict[Text, tf.Tensor], training=False) -> tf.Tensor:

    ratings = features.pop("user_rating")

    user_embeddings, movie_embeddings, rating_predictions = self(features)

    # We compute the loss for each task.
    rating_loss = self.rating_task(
        labels=ratings,
        predictions=rating_predictions,
    )
    retrieval_loss = self.retrieval_task(user_embeddings, movie_embeddings)

    # And combine them using the loss weights.
    return (self.rating_weight * rating_loss
            + self.retrieval_weight * retrieval_loss)
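The shapes flowing through `call` can be traced with a small NumPy sketch. The embedding tables, indices, and layer widths below are hypothetical stand-ins (random tables instead of trained `StringLookup` and `Embedding` layers, 16 hidden units instead of 256 and 128), so only the shapes carry over, not the values.

```python
import numpy as np

rng = np.random.default_rng(0)
embedding_dimension = 32

# Hypothetical embedding tables standing in for the lookup + embedding layers.
user_table = rng.normal(size=(5, embedding_dimension))
movie_table = rng.normal(size=(7, embedding_dimension))

# A batch of four (user, movie) pairs, already mapped to integer indices.
user_embeddings = user_table[np.array([0, 1, 2, 3])]    # (4, 32)
movie_embeddings = movie_table[np.array([6, 5, 4, 3])]  # (4, 32)

# Concatenate along the feature axis, as call() does with axis=1.
joint = np.concatenate([user_embeddings, movie_embeddings], axis=1)  # (4, 64)

# A tiny two-layer tower ending in a single output unit.
w1, b1 = rng.normal(size=(64, 16)), np.zeros(16)
w2, b2 = rng.normal(size=(16, 1)), np.zeros(1)
hidden = np.maximum(joint @ w1 + b1, 0.0)  # ReLU activation
rating_predictions = hidden @ w2 + b2      # (4, 1): one scalar per pair

print(joint.shape, rating_predictions.shape)
```

The same user and movie embeddings feed both the rating tower and the retrieval task, which is where the variable sharing between the two tasks comes from.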

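Rating-specialized model

Depending on the weights we assign, the model will encode a different balance of the tasks. Let's start with a model that only considers ratings.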

model = MovielensModel(rating_weight=1.0, retrieval_weight=0.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
cached_train = train.shuffle(100_000).batch(8192).cache()
cached_test = test.batch(4096).cache()
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 7s 319ms/step - root_mean_squared_error: 2.2354 - factorized_top_k/top_1_categorical_accuracy: 3.3750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0026 - factorized_top_k/top_10_categorical_accuracy: 0.0060 - factorized_top_k/top_50_categorical_accuracy: 0.0305 - factorized_top_k/top_100_categorical_accuracy: 0.0599 - loss: 4.5809 - regularization_loss: 0.0000e+00 - total_loss: 4.5809
Epoch 2/3
10/10 [==============================] - 3s 319ms/step - root_mean_squared_error: 1.1220 - factorized_top_k/top_1_categorical_accuracy: 2.6250e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0025 - factorized_top_k/top_10_categorical_accuracy: 0.0056 - factorized_top_k/top_50_categorical_accuracy: 0.0304 - factorized_top_k/top_100_categorical_accuracy: 0.0601 - loss: 1.2614 - regularization_loss: 0.0000e+00 - total_loss: 1.2614
Epoch 3/3
10/10 [==============================] - 3s 315ms/step - root_mean_squared_error: 1.1170 - factorized_top_k/top_1_categorical_accuracy: 2.6250e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0024 - factorized_top_k/top_10_categorical_accuracy: 0.0057 - factorized_top_k/top_50_categorical_accuracy: 0.0304 - factorized_top_k/top_100_categorical_accuracy: 0.0605 - loss: 1.2500 - regularization_loss: 0.0000e+00 - total_loss: 1.2500
5/5 [==============================] - 3s 185ms/step - root_mean_squared_error: 1.1125 - factorized_top_k/top_1_categorical_accuracy: 5.0000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0034 - factorized_top_k/top_10_categorical_accuracy: 0.0065 - factorized_top_k/top_50_categorical_accuracy: 0.0309 - factorized_top_k/top_100_categorical_accuracy: 0.0599 - loss: 1.2326 - regularization_loss: 0.0000e+00 - total_loss: 1.2326
Retrieval top-100 accuracy: 0.060.
Ranking RMSE: 1.113.

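The model does OK on predicting ratings (with an RMSE of around 1.11), but performs poorly at predicting which movies will be watched: its top-100 accuracy (0.060) is almost 4 times worse than that of a model trained solely to predict watches (0.233, as the next section shows).

Retrieval-specialized model

Let's now try a model that focuses on retrieval only.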

model = MovielensModel(rating_weight=0.0, retrieval_weight=1.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 4s 309ms/step - root_mean_squared_error: 3.6972 - factorized_top_k/top_1_categorical_accuracy: 5.8750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0056 - factorized_top_k/top_10_categorical_accuracy: 0.0131 - factorized_top_k/top_50_categorical_accuracy: 0.0751 - factorized_top_k/top_100_categorical_accuracy: 0.1483 - loss: 69829.1612 - regularization_loss: 0.0000e+00 - total_loss: 69829.1612
Epoch 2/3
10/10 [==============================] - 3s 301ms/step - root_mean_squared_error: 3.6905 - factorized_top_k/top_1_categorical_accuracy: 0.0010 - factorized_top_k/top_5_categorical_accuracy: 0.0118 - factorized_top_k/top_10_categorical_accuracy: 0.0272 - factorized_top_k/top_50_categorical_accuracy: 0.1425 - factorized_top_k/top_100_categorical_accuracy: 0.2634 - loss: 67466.0661 - regularization_loss: 0.0000e+00 - total_loss: 67466.0661
Epoch 3/3
10/10 [==============================] - 3s 300ms/step - root_mean_squared_error: 3.6877 - factorized_top_k/top_1_categorical_accuracy: 0.0016 - factorized_top_k/top_5_categorical_accuracy: 0.0183 - factorized_top_k/top_10_categorical_accuracy: 0.0391 - factorized_top_k/top_50_categorical_accuracy: 0.1782 - factorized_top_k/top_100_categorical_accuracy: 0.3048 - loss: 66294.5128 - regularization_loss: 0.0000e+00 - total_loss: 66294.5128
5/5 [==============================] - 1s 188ms/step - root_mean_squared_error: 3.6884 - factorized_top_k/top_1_categorical_accuracy: 9.5000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0093 - factorized_top_k/top_10_categorical_accuracy: 0.0203 - factorized_top_k/top_50_categorical_accuracy: 0.1199 - factorized_top_k/top_100_categorical_accuracy: 0.2330 - loss: 31092.1455 - regularization_loss: 0.0000e+00 - total_loss: 31092.1455
Retrieval top-100 accuracy: 0.233.
Ranking RMSE: 3.688.

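We get the opposite result: a model that does well on retrieval (top-100 accuracy of 0.233), but poorly on predicting ratings (RMSE of about 3.69).

Joint model

Let's now train a model that assigns positive weights to both tasks.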

model = MovielensModel(rating_weight=1.0, retrieval_weight=1.0)
model.compile(optimizer=tf.keras.optimizers.Adagrad(0.1))
model.fit(cached_train, epochs=3)
metrics = model.evaluate(cached_test, return_dict=True)

print(f"Retrieval top-100 accuracy: {metrics['factorized_top_k/top_100_categorical_accuracy']:.3f}.")
print(f"Ranking RMSE: {metrics['root_mean_squared_error']:.3f}.")
Epoch 1/3
10/10 [==============================] - 4s 309ms/step - root_mean_squared_error: 2.0230 - factorized_top_k/top_1_categorical_accuracy: 5.8750e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0051 - factorized_top_k/top_10_categorical_accuracy: 0.0123 - factorized_top_k/top_50_categorical_accuracy: 0.0768 - factorized_top_k/top_100_categorical_accuracy: 0.1509 - loss: 69787.3722 - regularization_loss: 0.0000e+00 - total_loss: 69787.3722
Epoch 2/3
10/10 [==============================] - 3s 312ms/step - root_mean_squared_error: 1.3647 - factorized_top_k/top_1_categorical_accuracy: 0.0012 - factorized_top_k/top_5_categorical_accuracy: 0.0120 - factorized_top_k/top_10_categorical_accuracy: 0.0275 - factorized_top_k/top_50_categorical_accuracy: 0.1438 - factorized_top_k/top_100_categorical_accuracy: 0.2642 - loss: 67453.3125 - regularization_loss: 0.0000e+00 - total_loss: 67453.3125
Epoch 3/3
10/10 [==============================] - 3s 309ms/step - root_mean_squared_error: 1.1934 - factorized_top_k/top_1_categorical_accuracy: 0.0016 - factorized_top_k/top_5_categorical_accuracy: 0.0190 - factorized_top_k/top_10_categorical_accuracy: 0.0394 - factorized_top_k/top_50_categorical_accuracy: 0.1771 - factorized_top_k/top_100_categorical_accuracy: 0.3037 - loss: 66299.1676 - regularization_loss: 0.0000e+00 - total_loss: 66299.1676
5/5 [==============================] - 1s 190ms/step - root_mean_squared_error: 1.1100 - factorized_top_k/top_1_categorical_accuracy: 9.5000e-04 - factorized_top_k/top_5_categorical_accuracy: 0.0086 - factorized_top_k/top_10_categorical_accuracy: 0.0210 - factorized_top_k/top_50_categorical_accuracy: 0.1237 - factorized_top_k/top_100_categorical_accuracy: 0.2349 - loss: 31075.5518 - regularization_loss: 0.0000e+00 - total_loss: 31075.5518
Retrieval top-100 accuracy: 0.235.
Ranking RMSE: 1.110.

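The result is a model that performs roughly as well on both tasks as each specialized model: a top-100 accuracy of 0.235 and an RMSE of 1.110.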
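The three runs are easier to compare side by side. The snippet below simply collects the evaluation numbers printed in the outputs above into a dictionary (the labels are ad hoc) and reports which configuration wins on each metric.

```python
# Evaluation metrics copied from the three runs above, keyed by
# (rating_weight, retrieval_weight).
results = {
    "rating only (1.0, 0.0)": {"top_100": 0.060, "rmse": 1.113},
    "retrieval only (0.0, 1.0)": {"top_100": 0.233, "rmse": 3.688},
    "joint (1.0, 1.0)": {"top_100": 0.235, "rmse": 1.110},
}

for name, m in results.items():
    print(f"{name:<28} top-100 = {m['top_100']:.3f}   RMSE = {m['rmse']:.3f}")

# Higher is better for top-100 accuracy, lower is better for RMSE.
best_retrieval = max(results, key=lambda k: results[k]["top_100"])
best_rating = min(results, key=lambda k: results[k]["rmse"])
print(best_retrieval, best_rating)
```

The joint model comes out marginally ahead on both metrics, but the margins (0.002 in accuracy, 0.003 in RMSE) are small enough that they should not be read as a clear win from a single run.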

Making predictions

We can use the trained multi-task model to get trained user and movie embeddings, as well as the predicted rating:

# call() returns (user_embeddings, movie_embeddings, rating), in that order.
trained_user_embeddings, trained_movie_embeddings, predicted_rating = model({
      "user_id": np.array(["42"]),
      "movie_title": np.array(["Dances with Wolves (1990)"])
  })
print("Predicted rating:")
print(predicted_rating)
Predicted rating:
tf.Tensor([[4.604047]], shape=(1, 1), dtype=float32)
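Once embeddings are available, one common way to use them is to rank candidate movies for a user by dot product. The sketch below does this with hypothetical two-dimensional embeddings and made-up titles; `recommend` is an illustrative helper, not a TFRS function, and a real system would typically use an index layer or an approximate nearest-neighbor search instead of scoring every candidate.

```python
import numpy as np

# Hypothetical trained embeddings (two dimensions for readability).
candidate_titles = np.array(["Movie A", "Movie B", "Movie C", "Movie D"])
candidate_embeddings = np.array([
    [1.0, 0.0],
    [0.0, 1.0],
    [0.7, 0.7],
    [-1.0, 0.0],
])
user_embedding = np.array([0.9, 0.2])

def recommend(user_embedding, candidate_embeddings, candidate_titles, k=2):
    """Return the k titles whose embeddings score highest against the user."""
    scores = candidate_embeddings @ user_embedding  # one affinity per movie
    top = np.argsort(-scores)[:k]                   # indices of the k best scores
    return [str(title) for title in candidate_titles[top]]

print(recommend(user_embedding, candidate_embeddings, candidate_titles))
```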

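While the results here do not show a clear accuracy benefit from a joint model in this case, multi-task learning is in general an extremely useful tool. We can expect better results when we can transfer knowledge from a data-abundant task (such as clicks) to a closely related data-sparse task (such as purchases).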