Uplifting with Decision Forests


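Welcome to the uplifting tutorial for TensorFlow Decision Forests (TF-DF). In this tutorial, you will learn what uplifting is, why it matters, and how to do it in TF-DF.

This tutorial assumes you are familiar with the fundamentals of TF-DF, in particular the installation procedure. The beginner tutorial is a great place to start learning about TF-DF.

In this Colab, you will:

  • Learn what uplift modeling is.
  • Train an uplift random forest model on the Hillstrom Email Marketing dataset.
  • Evaluate the quality of this model.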

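Installing TensorFlow Decision Forests

Run the following cell to install TF-DF.

Wurlitzer is the package needed to display the detailed training logs in Colab (when using verbose=2 in the model constructor).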

pip install tensorflow_decision_forests wurlitzer

Importing libraries

import tensorflow_decision_forests as tfdf

import os
import numpy as np
import pandas as pd
import tensorflow as tf
import math
import matplotlib.pyplot as plt

The hidden code cell limits the output height in Colab.

# Check the version of TensorFlow Decision Forests
print("Found TensorFlow Decision Forests v" + tfdf.__version__)
Found TensorFlow Decision Forests v1.9.0

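What is uplift modeling?

Uplift modeling is a statistical modeling technique for predicting the incremental impact of an action on a subject. The action is often referred to as a treatment, and it may or may not be applied.

Uplift modeling is often used in targeted marketing campaigns to predict the increase in the likelihood that a person makes a purchase (or takes any other desired action) based on the marketing exposure they receive.

For example, uplift modeling can predict the effect of an email. The effect is defined as the difference between two conditional probabilities:
\begin{align} \text{effect}(\text{email}) = &\Pr(\text{outcome}=\text{purchase}\ \vert\ \text{treatment}=\text{with email}) \\ &- \Pr(\text{outcome}=\text{purchase}\ \vert\ \text{treatment}=\text{no email}), \end{align}
where \(\Pr(\text{outcome}=\text{purchase}\ \vert\ ...)\) is the probability of a purchase depending on whether an email was received.

Compare this with a classification model: a classification model can predict the probability of a purchase. However, customers with a high purchase probability are likely to spend money in the store whether or not they receive an email, so emailing them may add little.

Similarly, numerical uplifting can predict the numerical increase in spend when receiving an email. In comparison, a regression model can only predict the expected spend, which is a less useful metric in many cases.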
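The contrast between classification and uplift can be made concrete with a small sketch. The three customers and their probabilities below are hypothetical values chosen for illustration only; they are not taken from any dataset. The sketch ranks the same customers by purchase probability and by effect, and the two rankings disagree.

```python
# Hypothetical customers: (name, P(purchase | email), P(purchase | no email)).
# These numbers are illustrative assumptions, not measured values.
customers = [
    ("loyal", 0.90, 0.88),
    ("persuadable", 0.40, 0.10),
    ("uninterested", 0.05, 0.04),
]

# A classification model would rank by the purchase probability alone.
by_purchase_probability = sorted(customers, key=lambda c: c[1], reverse=True)


def effect(customer):
    """Difference between the two conditional purchase probabilities."""
    _, p_with_email, p_without_email = customer
    return p_with_email - p_without_email


# An uplift model ranks by the effect of the treatment instead.
by_uplift = sorted(customers, key=effect, reverse=True)

print([name for name, _, _ in by_purchase_probability])
print([name for name, _, _ in by_uplift])
```

The "loyal" customer tops the purchase-probability ranking but gains almost nothing from the email, while the "persuadable" customer tops the uplift ranking.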

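Defining uplift models in TF-DF

TF-DF expects uplifting datasets to be presented in a "flat" format. A dataset of customers might look like this:

treatment  outcome  feature_1  feature_2
0          1        0.1        blue
0          0        0.2        blue
1          1        0.3        blue
1          1        0.4        blue

The treatment is a binary variable indicating whether or not the example received the treatment. In the example above, the treatment indicates whether the customer received an email. The outcome (label) indicates the status of the example after receiving the treatment (or not receiving it). TF-DF supports categorical outcomes for categorical uplifting and numerical outcomes for numerical uplifting.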
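As a worked example, the effect formula can be evaluated on the four rows of the table above. This is only a sketch: it computes the dataset-level average effect, whereas an uplift model estimates the effect conditional on the features. The helper name `conversion_rate` is an assumption introduced for illustration.

```python
# The four rows of the example table: (treatment, outcome, feature_1, feature_2).
rows = [
    (0, 1, 0.1, "blue"),
    (0, 0, 0.2, "blue"),
    (1, 1, 0.3, "blue"),
    (1, 1, 0.4, "blue"),
]


def conversion_rate(rows, treatment_value):
    """Fraction of positive outcomes among rows with the given treatment."""
    outcomes = [outcome for treatment, outcome, *_ in rows
                if treatment == treatment_value]
    return sum(outcomes) / len(outcomes)


rate_treated = conversion_rate(rows, 1)  # 2 positive outcomes out of 2
rate_control = conversion_rate(rows, 0)  # 1 positive outcome out of 2
average_effect = rate_treated - rate_control

print(rate_treated, rate_control, average_effect)  # 1.0 0.5 0.5
```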

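Training an uplifting model

In this example, we use the Hillstrom Email Marketing dataset.

This dataset contains 64,000 customers who made a purchase within the past twelve months. The customers took part in an email test:

  • 1/3 were randomly chosen to receive an email campaign featuring men's merchandise.
  • 1/3 were randomly chosen to receive an email campaign featuring women's merchandise.
  • 1/3 were randomly chosen to not receive an email campaign.

Results were tracked during the two weeks following the email campaign. The task is to tell whether the men's or women's email campaign was successful.

Read more about the dataset in its documentation. This tutorial uses the dataset as curated by TensorFlow Datasets.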

# Install the TensorFlow Datasets package
pip install tensorflow-datasets -U --quiet
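# Load the dataset
import tensorflow_datasets as tfds
# Note: 'train[20%:]' overlaps the training split 'train[:80%]'. Use
# 'train[80%:]' for a disjoint held-out test set. The outputs shown in this
# tutorial were produced with the splits as written here.
raw_train, raw_test = tfds.load('hillstrom', split=['train[:80%]', 'train[20%:]'])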

# Display the first 10 examples of the test fold.
pd.DataFrame(list(raw_test.batch(10).take(1))[0])

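Dataset preprocessing

Since TF-DF currently supports only binary treatments, combine the "Men's Email" and the "Women's Email" campaigns. This tutorial uses the binary variable conversion as the outcome, which makes this a categorical uplifting problem. If we used the numerical variable spend instead, the problem would be a numerical uplifting problem.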

def prepare_dataset(example):
  # Use a binary treatment class.
  example['treatment'] = 1 if example['segment'] == b'Mens E-Mail' or example['segment'] == b'Womens E-Mail' else 0
  outcome = example['conversion']
  # Restrict the dataset to the input features.
  input_features = ['channel', 'history', 'mens', 'womens', 'newbie', 'recency', 'zip_code', 'treatment']
  example = {feature: example[feature] for feature in input_features}
  return example, outcome

train_ds = raw_train.map(prepare_dataset).batch(100)
test_ds = raw_test.map(prepare_dataset).batch(100)

Model training

Finally, train and evaluate the model as usual. Note that TF-DF only supports random forest models for uplifting.

%set_cell_height 300

# Configure the model and its hyper-parameters.
model = tfdf.keras.RandomForestModel(
    verbose=2,
    task=tfdf.keras.Task.CATEGORICAL_UPLIFT,
    uplift_treatment='treatment'
)

# Train the model.
model.fit(train_ds)
<IPython.core.display.Javascript object>
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmppyeh4gae as temporary training directory
Reading training dataset...
Training tensor examples:
Features: {'channel': <tf.Tensor 'data:0' shape=(None,) dtype=string>, 'history': <tf.Tensor 'data_1:0' shape=(None,) dtype=float32>, 'mens': <tf.Tensor 'data_2:0' shape=(None,) dtype=int64>, 'womens': <tf.Tensor 'data_3:0' shape=(None,) dtype=int64>, 'newbie': <tf.Tensor 'data_4:0' shape=(None,) dtype=int64>, 'recency': <tf.Tensor 'data_5:0' shape=(None,) dtype=int64>, 'zip_code': <tf.Tensor 'data_6:0' shape=(None,) dtype=string>, 'treatment': <tf.Tensor 'data_7:0' shape=(None,) dtype=int32>}
Label: Tensor("data_8:0", shape=(None,), dtype=int64)
Weights: None
Normalized tensor features:
 {'channel': SemanticTensor(semantic=<Semantic.CATEGORICAL: 2>, tensor=<tf.Tensor 'data:0' shape=(None,) dtype=string>), 'history': SemanticTensor(semantic=<Semantic.NUMERICAL: 1>, tensor=<tf.Tensor 'data_1:0' shape=(None,) dtype=float32>), 'mens': SemanticTensor(semantic=<Semantic.NUMERICAL: 1>, tensor=<tf.Tensor 'Cast:0' shape=(None,) dtype=float32>), 'womens': SemanticTensor(semantic=<Semantic.NUMERICAL: 1>, tensor=<tf.Tensor 'Cast_1:0' shape=(None,) dtype=float32>), 'newbie': SemanticTensor(semantic=<Semantic.NUMERICAL: 1>, tensor=<tf.Tensor 'Cast_2:0' shape=(None,) dtype=float32>), 'recency': SemanticTensor(semantic=<Semantic.NUMERICAL: 1>, tensor=<tf.Tensor 'Cast_3:0' shape=(None,) dtype=float32>), 'zip_code': SemanticTensor(semantic=<Semantic.CATEGORICAL: 2>, tensor=<tf.Tensor 'data_6:0' shape=(None,) dtype=string>)}
Training dataset read in 0:00:04.974222. Found 51200 examples.
Training model...
Standard output detected as not visible to the user e.g. running in a notebook. Creating a training log redirection. If training gets stuck, try calling tfdf.keras.set_training_logs_redirection(False).
[INFO 24-04-20 11:22:14.2334 UTC kernel.cc:771] Start Yggdrasil model training
[INFO 24-04-20 11:22:14.2335 UTC kernel.cc:772] Collect training examples
[INFO 24-04-20 11:22:14.2335 UTC kernel.cc:785] Dataspec guide:
column_guides {
  column_name_pattern: "^__LABEL$"
  type: CATEGORICAL
}
default_column_guide {
  categorial {
    max_vocab_count: 2000
  }
  discretized_numerical {
    maximum_num_bins: 255
  }
}
ignore_columns_without_guides: false
detect_numerical_as_discretized_numerical: false

[INFO 24-04-20 11:22:14.2339 UTC kernel.cc:391] Number of batches: 512
[INFO 24-04-20 11:22:14.2339 UTC kernel.cc:392] Number of examples: 51200
[INFO 24-04-20 11:22:14.2463 UTC kernel.cc:792] Training dataset:
Number of records: 51200
Number of columns: 9

Number of columns by type:
    NUMERICAL: 5 (55.5556%)
    CATEGORICAL: 4 (44.4444%)

Columns:

NUMERICAL: 5 (55.5556%)
    2: "history" NUMERICAL mean:241.833 min:29.99 max:3345.93 sd:255.292
    3: "mens" NUMERICAL mean:0.550391 min:0 max:1 sd:0.497454
    4: "newbie" NUMERICAL mean:0.503086 min:0 max:1 sd:0.49999
    5: "recency" NUMERICAL mean:5.75514 min:1 max:12 sd:3.50281
    7: "womens" NUMERICAL mean:0.549687 min:0 max:1 sd:0.497525

CATEGORICAL: 4 (44.4444%)
    0: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item
    1: "channel" CATEGORICAL has-dict vocab-size:4 zero-ood-items most-frequent:"Web" 22576 (44.0938%)
    6: "treatment" CATEGORICAL integerized vocab-size:3 no-ood-item
    8: "zip_code" CATEGORICAL has-dict vocab-size:4 zero-ood-items most-frequent:"Surburban" 22966 (44.8555%)

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute whose type is manually defined by the user, i.e., the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.

[INFO 24-04-20 11:22:14.2464 UTC kernel.cc:808] Configure learner
[INFO 24-04-20 11:22:14.2466 UTC kernel.cc:822] Training config:
learner: "RANDOM_FOREST"
features: "^channel$"
features: "^history$"
features: "^mens$"
features: "^newbie$"
features: "^recency$"
features: "^womens$"
features: "^zip_code$"
label: "^__LABEL$"
task: CATEGORICAL_UPLIFT
random_seed: 123456
uplift_treatment: "treatment"
metadata {
  framework: "TF Keras"
}
pure_serving_model: false
[yggdrasil_decision_forests.model.random_forest.proto.random_forest_config] {
  num_trees: 300
  decision_tree {
    max_depth: 16
    min_examples: 5
    in_split_min_examples_check: true
    keep_non_leaf_label_distribution: true
    num_candidate_attributes: 0
    missing_value_policy: GLOBAL_IMPUTATION
    allow_na_conditions: false
    categorical_set_greedy_forward {
      sampling: 0.1
      max_num_items: -1
      min_item_frequency: 1
    }
    growing_strategy_local {
    }
    categorical {
      cart {
      }
    }
    axis_aligned_split {
    }
    internal {
      sorting_strategy: PRESORTED
    }
    uplift {
      min_examples_in_treatment: 5
      split_score: KULLBACK_LEIBLER
    }
  }
  winner_take_all_inference: true
  compute_oob_performances: true
  compute_oob_variable_importances: false
  num_oob_variable_importances_permutations: 1
  bootstrap_training_dataset: true
  bootstrap_size_ratio: 1
  adapt_bootstrap_size_ratio_for_maximum_training_duration: false
  sampling_with_replacement: true
}

[INFO 24-04-20 11:22:14.2469 UTC kernel.cc:825] Deployment config:
cache_path: "/tmpfs/tmp/tmppyeh4gae/working_cache"
num_threads: 32
try_resume_training: true

[INFO 24-04-20 11:22:14.2472 UTC kernel.cc:887] Train model
[INFO 24-04-20 11:22:14.2473 UTC random_forest.cc:416] Training random forest on 51200 example(s) and 7 feature(s).
[WARNING 24-04-20 11:22:14.3731 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.3741 UTC random_forest.cc:802] Training of tree  1/300 (tree index:2) done qini:0.000172044 auuc:0.0025137
[WARNING 24-04-20 11:22:14.4012 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.4027 UTC random_forest.cc:802] Training of tree  15/300 (tree index:31) done qini:1.41341e-05 auuc:0.0023575
[WARNING 24-04-20 11:22:14.5302 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.5327 UTC random_forest.cc:802] Training of tree  25/300 (tree index:23) done qini:-2.19346e-05 auuc:0.00235455
[WARNING 24-04-20 11:22:14.6034 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.6058 UTC random_forest.cc:802] Training of tree  35/300 (tree index:33) done qini:0.00013211 auuc:0.0025086
[WARNING 24-04-20 11:22:14.6887 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.6910 UTC random_forest.cc:802] Training of tree  45/300 (tree index:45) done qini:-2.28572e-05 auuc:0.00235363
[WARNING 24-04-20 11:22:14.7656 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.7680 UTC random_forest.cc:802] Training of tree  55/300 (tree index:55) done qini:-8.67727e-05 auuc:0.00228972
[WARNING 24-04-20 11:22:14.8354 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.8379 UTC random_forest.cc:802] Training of tree  65/300 (tree index:56) done qini:-0.000112323 auuc:0.00226417
[WARNING 24-04-20 11:22:14.9052 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.9077 UTC random_forest.cc:802] Training of tree  75/300 (tree index:74) done qini:-0.000109942 auuc:0.00226655
[WARNING 24-04-20 11:22:14.9680 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:14.9704 UTC random_forest.cc:802] Training of tree  101/300 (tree index:101) done qini:-0.000112409 auuc:0.00226408
[WARNING 24-04-20 11:22:15.1148 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.1196 UTC random_forest.cc:802] Training of tree  121/300 (tree index:118) done qini:-0.000299795 auuc:0.00207669
[WARNING 24-04-20 11:22:15.2280 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.2305 UTC random_forest.cc:802] Training of tree  131/300 (tree index:138) done qini:-0.000153133 auuc:0.00222336
[WARNING 24-04-20 11:22:15.3108 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.3155 UTC random_forest.cc:802] Training of tree  141/300 (tree index:139) done qini:-0.000173194 auuc:0.0022033
[WARNING 24-04-20 11:22:15.3853 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.3877 UTC random_forest.cc:802] Training of tree  168/300 (tree index:162) done qini:-0.000130945 auuc:0.00224554
[WARNING 24-04-20 11:22:15.5471 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.5519 UTC random_forest.cc:802] Training of tree  178/300 (tree index:178) done qini:-0.000145457 auuc:0.00223103
[WARNING 24-04-20 11:22:15.6367 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.6414 UTC random_forest.cc:802] Training of tree  188/300 (tree index:189) done qini:-0.000124566 auuc:0.00225192
[WARNING 24-04-20 11:22:15.6876 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.6901 UTC random_forest.cc:802] Training of tree  217/300 (tree index:213) done qini:-0.000161956 auuc:0.00221453
[WARNING 24-04-20 11:22:15.8731 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.8795 UTC random_forest.cc:802] Training of tree  227/300 (tree index:229) done qini:-0.000133605 auuc:0.00224288
[WARNING 24-04-20 11:22:15.9403 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:15.9428 UTC random_forest.cc:802] Training of tree  237/300 (tree index:239) done qini:-0.000101549 auuc:0.00227494
[WARNING 24-04-20 11:22:16.0044 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.0068 UTC random_forest.cc:802] Training of tree  247/300 (tree index:253) done qini:-0.000141334 auuc:0.00223516
[WARNING 24-04-20 11:22:16.0749 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.0773 UTC random_forest.cc:802] Training of tree  257/300 (tree index:257) done qini:-0.000135416 auuc:0.00224107
[WARNING 24-04-20 11:22:16.1446 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.1471 UTC random_forest.cc:802] Training of tree  267/300 (tree index:261) done qini:-0.000131112 auuc:0.00224538
[WARNING 24-04-20 11:22:16.2109 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.2132 UTC random_forest.cc:802] Training of tree  277/300 (tree index:275) done qini:-0.000149751 auuc:0.00222674
[WARNING 24-04-20 11:22:16.2724 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.2746 UTC random_forest.cc:802] Training of tree  287/300 (tree index:283) done qini:-0.000168736 auuc:0.00220775
[WARNING 24-04-20 11:22:16.3282 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.3306 UTC random_forest.cc:802] Training of tree  297/300 (tree index:299) done qini:-0.000181665 auuc:0.00219482
[WARNING 24-04-20 11:22:16.3623 UTC random_forest.cc:1105] Internal error: Non empty oob evaluation
[INFO 24-04-20 11:22:16.3646 UTC random_forest.cc:802] Training of tree  300/300 (tree index:298) done qini:-0.000173258 auuc:0.00220323
[INFO 24-04-20 11:22:16.3680 UTC random_forest.cc:882] Final OOB metrics: qini:-0.000173258 auuc:0.00220323
[INFO 24-04-20 11:22:16.3843 UTC kernel.cc:919] Export model in log directory: /tmpfs/tmp/tmppyeh4gae with prefix 568256236db544eb
[INFO 24-04-20 11:22:16.4274 UTC kernel.cc:937] Save model in resources
[INFO 24-04-20 11:22:16.4309 UTC abstract_model.cc:881] Model self evaluation:
Number of predictions (without weights): 51200
Number of predictions (with weights): 51200
Task: CATEGORICAL_UPLIFT
Label: __LABEL

Number of treatments: 2
AUUC: 0.00220323
Qini: -0.000173258

[INFO 24-04-20 11:22:16.4580 UTC kernel.cc:1233] Loading model from path /tmpfs/tmp/tmppyeh4gae/model/ with prefix 568256236db544eb
[INFO 24-04-20 11:22:16.6557 UTC decision_forest.cc:734] Model loaded with 300 root(s), 60190 node(s), and 7 input feature(s).
[INFO 24-04-20 11:22:16.6557 UTC abstract_model.cc:1344] Engine "RandomForestGeneric" built
[INFO 24-04-20 11:22:16.6557 UTC kernel.cc:1061] Use fast generic engine
Model trained in 0:00:02.442514
Compiling model...
Model compiled.
<tf_keras.src.callbacks.History at 0x7f5dd40eb7c0>

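Evaluating uplift models

Metrics for uplift models

The two most important metrics for evaluating uplift models are the AUUC (Area Under the Uplift Curve) metric and the Qini (Area Under the Qini Curve) metric. This is similar to the use of AUC and accuracy for classification problems. For both metrics, larger values are better.

Neither AUUC nor Qini is a normalized metric. This means that the best possible value of the metric can differ from dataset to dataset. This is unlike, for example, the AUC metric, which always lies between 0 and 1.

A formal definition of AUUC is given below. For more information about these metrics, see Guelman and Betlei et al.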

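Model self-evaluation

TF-DF random forest models perform self-evaluation on the out-of-bag examples of the training dataset. For uplift models, they expose the AUUC and the Qini metrics. You can retrieve both metrics on the training dataset directly through the inspector.

Later, we will recompute the AUUC metric "manually" on the test dataset. Note that the two values (out-of-bag on the training set versus the test set) are not expected to be exactly equal, since AUUC is not a normalized metric.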

# The self-evaluation is available through the model inspector
insp = model.make_inspector()
insp.evaluation()
Evaluation(num_examples=51200, accuracy=None, loss=None, rmse=None, ndcg=None, aucs=None, auuc=0.0022032308892709586, qini=-0.00017325819500263418)

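Manually computing the AUUC

In this section, we manually compute the AUUC and plot the uplift curves.

The next few paragraphs explain the AUUC metric in more detail and may be skipped.

Computing the AUUC

Suppose you have a labeled dataset with \(|T|\) examples that received the treatment and \(|C|\) examples that did not, called control examples. For each example, the uplift model \(f\) produces the conditional probability that treating the example yields a positive outcome.

Suppose a decision-maker needs to decide which clients to send an email to using the uplift model \(f\). The model produces a (conditional) probability that the email will result in a conversion. The decision-maker might therefore choose the number \(k\) of emails to send and send those \(k\) emails to the clients with the highest probability.

Using a labeled test dataset, it is possible to study the impact of \(k\) on the success of the campaign. First, we are interested in the ratio \(\frac{|C \cap T|}{|T|}\) of clients who received an email and converted to the total number of clients who received an email. Here, \(C\) is the set of clients who received an email and converted (in this ratio, \(C\) does not denote the control examples), and \(T\) is the set of clients who received an email. We plot this ratio against \(k\).

Ideally, we would like this curve to rise steeply. This would mean that the model prioritizes sending emails to those clients who will convert when they receive an email.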

# Compute all predictions on the test dataset
predictions = model.predict(test_ds).flatten()
# Extract outcomes and treatments
outcomes = np.concatenate([outcome.numpy() for _, outcome in test_ds])
treatment = np.concatenate([example['treatment'].numpy() for example,_ in test_ds])
control = 1 - treatment

num_treatments = np.sum(treatment)
# Clients without treatment are called 'control' group
num_control = np.sum(control)
num_examples = len(predictions)

# Sort labels and treatments according to predictions in descending order
prediction_order = predictions.argsort()[::-1]
outcomes_sorted = outcomes[prediction_order]
treatment_sorted = treatment[prediction_order]
control_sorted = control[prediction_order]
ratio_treatment = np.cumsum(np.multiply(outcomes_sorted, treatment_sorted), axis=0)/num_treatments

fig, ax = plt.subplots()
ax.plot(ratio_treatment, label='Conversion ratio of treatment')
ax.set_xlabel('k')
ax.set_ylabel('Ratio of conversion')
ax.legend()
512/512 [==============================] - 3s 5ms/step
<matplotlib.legend.Legend at 0x7f5cbc41ac70>

[Figure: conversion ratio of the treatment group plotted against k]

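Similarly, we can compute and plot the conversion ratio of those who did not receive an email, called the control group. Ideally, this curve is initially flat: this means that the model does not prioritize sending emails to clients who would convert even without receiving an email.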

ratio_control = np.cumsum(np.multiply(outcomes_sorted, control_sorted), axis=0)/num_control
ax.plot(ratio_control, label='Conversion ratio of control')
ax.legend()
fig

[Figure: conversion ratios of the treatment and control groups plotted against k]

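The AUUC metric measures the area between these two curves, with the x-axis rescaled to between 0 and 1 (the y-axis values are already ratios between 0 and 1).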

x = np.linspace(0, 1, num_examples)
plt.plot(x,ratio_treatment, label='Conversion ratio of treatment')
plt.plot(x,ratio_control, label='Conversion ratio of control')
plt.fill_between(x, ratio_treatment, ratio_control, where=(ratio_treatment > ratio_control), color='C0', alpha=0.3)
plt.fill_between(x, ratio_treatment, ratio_control, where=(ratio_treatment < ratio_control), color='C1', alpha=0.3)
plt.xlabel('k')
plt.ylabel('Ratio of conversion')
plt.legend()

# Approximate the integral of the difference between the two curves.
auuc = np.trapz(ratio_treatment-ratio_control, dx=1/num_examples)
print(f'The AUUC on the test dataset is {auuc}')
The AUUC on the test dataset is 0.007513928513572819

[Figure: treatment and control conversion ratios with the area between the two curves shaded]
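To make the computation easier to follow by hand, here is a minimal sketch of the same steps on six hypothetical clients. The scores, treatment flags, and outcomes are made-up values for illustration. The trapezoidal rule is written out explicitly so the sketch does not depend on a particular NumPy version; it is meant to mirror the `np.trapz(..., dx=1/num_examples)` call used above.

```python
import numpy as np

# Hypothetical toy data (illustrative assumptions, not from the Hillstrom dataset).
predictions = np.array([0.9, 0.8, 0.6, 0.4, 0.3, 0.1])  # model scores
treatment = np.array([1, 0, 1, 0, 1, 0])                # 1 = received the email
outcomes = np.array([1, 0, 1, 0, 0, 1])                 # 1 = converted
control = 1 - treatment

# Sort clients by decreasing score: the first k entries are the top-k targets.
order = predictions.argsort()[::-1]
outcomes_sorted = outcomes[order]
treatment_sorted = treatment[order]
control_sorted = control[order]

# Cumulative conversion ratios within the treatment and control groups.
ratio_treatment = np.cumsum(outcomes_sorted * treatment_sorted) / treatment.sum()
ratio_control = np.cumsum(outcomes_sorted * control_sorted) / control.sum()

# Trapezoidal rule on the difference, with the x-axis rescaled to [0, 1].
difference = ratio_treatment - ratio_control
dx = 1 / len(predictions)
auuc = float(np.sum((difference[1:] + difference[:-1]) / 2) * dx)

print(ratio_treatment)
print(ratio_control)
print(f"Toy AUUC: {auuc:.4f}")  # 4/9, about 0.4444
```

In this toy case the treated converters are ranked at the top and the only control converter is ranked last, so the treatment curve rises early while the control curve stays flat until the end.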