Migration examples: Canned Estimators

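Canned (or premade) Estimators have traditionally been used in TensorFlow 1 as a quick and easy way to train models for a variety of typical use cases. TensorFlow 2 provides straightforward approximate substitutes for many of them by way of Keras models. For canned estimators that have no built-in TensorFlow 2 substitute, you can still build your own replacement fairly easily.

This guide walks through a few examples of direct equivalents and custom substitutions to demonstrate how models derived from TensorFlow 1's tf.estimator can be migrated to TensorFlow 2 with Keras.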

Namely, this guide includes examples for migrating:

  • From tf.estimator's LinearEstimator, LinearClassifier, or LinearRegressor in TensorFlow 1 to Keras tf.compat.v1.keras.models.LinearModel in TensorFlow 2
  • From tf.estimator's DNNEstimator, DNNClassifier, or DNNRegressor in TensorFlow 1 to a custom Keras DNN model in TensorFlow 2
  • From tf.estimator's DNNLinearCombinedEstimator, DNNLinearCombinedClassifier, or DNNLinearCombinedRegressor in TensorFlow 1 to tf.compat.v1.keras.models.WideDeepModel in TensorFlow 2
  • From tf.estimator's BoostedTreesEstimator, BoostedTreesClassifier, or BoostedTreesRegressor in TensorFlow 1 to tfdf.keras.GradientBoostedTreesModel in TensorFlow 2

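A common precursor to model training is feature preprocessing, which TensorFlow 1 Estimator models do with tf.feature_column. For more information on feature preprocessing in TensorFlow 2, see the guide on migrating from feature columns to the Keras preprocessing layers API.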

Setup

Start with a few necessary TensorFlow imports:

pip install tensorflow_decision_forests
import pandas as pd
import tensorflow as tf
import tensorflow.compat.v1 as tf1
import tensorflow_decision_forests as tfdf
from tensorflow import keras

Prepare some simple data for demonstration from the standard Titanic dataset:

x_train = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
x_eval = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')
x_train['sex'].replace(('male', 'female'), (0, 1), inplace=True)
x_eval['sex'].replace(('male', 'female'), (0, 1), inplace=True)

x_train['alone'].replace(('n', 'y'), (0, 1), inplace=True)
x_eval['alone'].replace(('n', 'y'), (0, 1), inplace=True)

x_train['class'].replace(('First', 'Second', 'Third'), (1, 2, 3), inplace=True)
x_eval['class'].replace(('First', 'Second', 'Third'), (1, 2, 3), inplace=True)

x_train.drop(['embark_town', 'deck'], axis=1, inplace=True)
x_eval.drop(['embark_town', 'deck'], axis=1, inplace=True)

y_train = x_train.pop('survived')
y_eval = x_eval.pop('survived')
# Data setup for TensorFlow 1 with `tf.estimator`
def _input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_train), y_train)).batch(32)


def _eval_input_fn():
  return tf1.data.Dataset.from_tensor_slices((dict(x_eval), y_eval)).batch(32)


FEATURE_NAMES = [
    'age', 'fare', 'sex', 'n_siblings_spouses', 'parch', 'class', 'alone'
]

feature_columns = []
for fn in FEATURE_NAMES:
  feat_col = tf1.feature_column.numeric_column(fn, dtype=tf.float32)
  feature_columns.append(feat_col)
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_10080/2801132002.py:16: numeric_column (from tensorflow.python.feature_column.feature_column_v2) is deprecated and will be removed in a future version.
Instructions for updating:
Use Keras preprocessing layers instead, either directly or via the `tf.keras.utils.FeatureSpace` utility. Each of `tf.feature_column.*` has a functional equivalent in `tf.keras.layers` for feature preprocessing when training a Keras model.

Then create a function that instantiates a simple sample optimizer to use with the various TensorFlow 1 Estimator and TensorFlow 2 Keras models.

def create_sample_optimizer(tf_version):
  if tf_version == 'tf1':
    optimizer = lambda: tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf1.train.exponential_decay(
            learning_rate=0.1,
            global_step=tf1.train.get_global_step(),
            decay_steps=10000,
            decay_rate=0.9))
  elif tf_version == 'tf2':
    optimizer = tf.keras.optimizers.legacy.Ftrl(
        l1_regularization_strength=0.001,
        learning_rate=tf.keras.optimizers.schedules.ExponentialDecay(
            initial_learning_rate=0.1, decay_steps=10000, decay_rate=0.9))
  return optimizer
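
Both branches above decay the learning rate exponentially from 0.1 with decay_steps=10000 and decay_rate=0.9. As a rough sketch of what that schedule computes, the plain-Python function below assumes the continuous (non-staircase) form, initial_learning_rate * decay_rate ** (step / decay_steps). The helper name exponential_decay is hypothetical and is not part of TensorFlow.

```python
def exponential_decay(step, initial_learning_rate=0.1,
                      decay_steps=10000, decay_rate=0.9):
    """Learning rate after `step` optimizer steps.

    Assumes continuous (non-staircase) decay; the defaults mirror the
    values passed to the sample optimizer in this guide.
    """
    return initial_learning_rate * decay_rate ** (step / decay_steps)


# Print the learning rate at a few illustrative steps.
for step in (0, 5000, 10000, 20000):
    print(step, exponential_decay(step))
```

With so few training steps in this guide, the learning rate stays very close to its initial value of 0.1.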

Example 1: Migrating from LinearEstimator

TensorFlow 1: Using LinearEstimator

In TensorFlow 1, you can use tf.estimator.LinearEstimator to create a baseline linear model for regression and classification problems.

linear_estimator = tf.estimator.LinearEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    optimizer=create_sample_optimizer('tf1'))
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_10080/2944250643.py:2: BinaryClassHead.__init__ (from tensorflow_estimator.python.estimator.head.binary_class_head) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_10080/2944250643.py:1: LinearEstimatorV2.__init__ (from tensorflow_estimator.python.estimator.canned.linear) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/canned/linear.py:1124: Estimator.__init__ (from tensorflow_estimator.python.estimator.estimator) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1844: RunConfig.__init__ (from tensorflow_estimator.python.estimator.run_config) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpcvrw6s1d
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpcvrw6s1d', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
linear_estimator.train(input_fn=_input_fn, steps=100)
linear_estimator.evaluate(input_fn=_eval_input_fn, steps=10)
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:385: StopAtStepHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Calling model_fn.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/keras/src/optimizers/legacy/ftrl.py:173: calling Constant.__init__ (from tensorflow.python.ops.init_ops) with dtype is deprecated and will be removed in a future version.
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/model_fn.py:250: EstimatorSpec.__new__ (from tensorflow_estimator.python.estimator.model_fn) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Done calling model_fn.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1416: NanTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1419: LoggingTensorHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/basic_session_run_hooks.py:232: SecondOrStepTimer.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow_estimator/python/estimator/estimator.py:1456: CheckpointSaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Create CheckpointSaverHook.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:579: StepCounterHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:586: SummarySaverHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpcvrw6s1d/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1455: SessionRunArgs.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1454: SessionRunContext.__init__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/monitored_session.py:1474: SessionRunValues.__new__ (from tensorflow.python.training.session_run_hook) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:loss = 0.6931472, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpcvrw6s1d/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.55268794.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2023-09-16T01:21:55
WARNING:tensorflow:From /tmpfs/src/tf_docs_env/lib/python3.9/site-packages/tensorflow/python/training/evaluation.py:260: FinalOpsHook.__init__ (from tensorflow.python.training.basic_session_run_hooks) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpcvrw6s1d/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.58102s
INFO:tensorflow:Finished evaluation at 2023-09-16-01:21:55
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.70075756, accuracy_baseline = 0.625, auc = 0.75472915, auc_precision_recall = 0.65362054, average_loss = 0.5759378, global_step = 20, label/mean = 0.375, loss = 0.5704812, precision = 0.6388889, prediction/mean = 0.41331062, recall = 0.46464646
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpcvrw6s1d/model.ckpt-20
{'accuracy': 0.70075756,
 'accuracy_baseline': 0.625,
 'auc': 0.75472915,
 'auc_precision_recall': 0.65362054,
 'average_loss': 0.5759378,
 'label/mean': 0.375,
 'loss': 0.5704812,
 'precision': 0.6388889,
 'prediction/mean': 0.41331062,
 'recall': 0.46464646,
 'global_step': 20}
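
The log above reports global_step = 20 even though train was called with steps=100. A likely explanation, assuming the input dataset is not repeated and training stops when the data runs out, is that one pass over the training data yields only 20 batches. The sketch below works through that arithmetic. It uses the 627 training examples that the TensorFlow Decision Forests log later in this guide reports and the batch size of 32 from _input_fn. The helper name batches_per_pass is hypothetical.

```python
import math


def batches_per_pass(num_examples, batch_size):
    """Number of batches in one pass when the last partial batch is kept."""
    return math.ceil(num_examples / batch_size)


# 627 training examples in batches of 32.
train_batches = batches_per_pass(627, 32)

# Training runs for the requested steps or until the data is exhausted,
# whichever comes first (assumption: the dataset is not repeated).
steps_run = min(100, train_batches)
print(train_batches, steps_run)
```

The same reasoning would explain why the evaluation log stops at [9/10] although steps=10 was requested.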

TensorFlow 2: Using Keras LinearModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.LinearModel, which is a substitute for tf.estimator.LinearEstimator. The tf.compat.v1.keras path signifies that the premade model exists for compatibility.

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10)
linear_model.evaluate(x_eval, y_eval, return_dict=True)
Epoch 1/10
20/20 [==============================] - 0s 2ms/step - loss: 3.6712 - accuracy: 0.6077
Epoch 2/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2146 - accuracy: 0.6715
Epoch 3/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1980 - accuracy: 0.6874
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1880 - accuracy: 0.7129
Epoch 5/10
20/20 [==============================] - 0s 9ms/step - loss: 0.1805 - accuracy: 0.7337
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1807 - accuracy: 0.7624
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1692 - accuracy: 0.7783
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1691 - accuracy: 0.7927
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1717 - accuracy: 0.7911
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1596 - accuracy: 0.7990
9/9 [==============================] - 0s 2ms/step - loss: 0.1853 - accuracy: 0.7348
{'loss': 0.185323566198349, 'accuracy': 0.7348484992980957}
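
Conceptually, a linear model predicts a weighted sum of the input features plus a bias. The cell above scores those predictions with a mean-squared-error loss and an accuracy metric. The plain-Python sketch below illustrates the three pieces on made-up numbers. The weights, bias, rows, and labels are hypothetical, and the 0.5 threshold in binary_accuracy is an assumption about how a single-output prediction is turned into a class.

```python
def linear_predict(features, weights, bias):
    """Weighted sum of the features plus a bias term."""
    return sum(w * x for w, x in zip(weights, features)) + bias


def mean_squared_error(y_true, y_pred):
    """Average squared difference between labels and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def binary_accuracy(y_true, y_pred, threshold=0.5):
    """Fraction of predictions on the correct side of the threshold (assumed 0.5)."""
    hits = sum(int((p > threshold) == bool(t)) for t, p in zip(y_true, y_pred))
    return hits / len(y_true)


# Hypothetical parameters and data, for illustration only.
weights = [0.5, -0.25]
bias = 0.1
rows = [[1.0, 2.0], [2.0, 0.0], [0.0, 1.0]]
labels = [0, 1, 0]

preds = [linear_predict(row, weights, bias) for row in rows]
print(preds, mean_squared_error(labels, preds), binary_accuracy(labels, preds))
```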

Example 2: Migrating from DNNEstimator

TensorFlow 1: Using DNNEstimator

In TensorFlow 1, you can use tf.estimator.DNNEstimator to create a baseline deep neural network (DNN) model for regression and classification problems.

dnn_estimator = tf.estimator.DNNEstimator(
    head=tf.estimator.BinaryClassHead(),
    feature_columns=feature_columns,
    hidden_units=[128],
    activation_fn=tf.nn.relu,
    optimizer=create_sample_optimizer('tf1'))
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_10080/1828606501.py:1: DNNEstimatorV2.__init__ (from tensorflow_estimator.python.estimator.canned.dnn) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpyih539cq
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpyih539cq', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
dnn_estimator.train(input_fn=_input_fn, steps=100)
dnn_estimator.evaluate(input_fn=_eval_input_fn, steps=10)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
2023-09-16 01:21:57.950353: W tensorflow/core/common_runtime/type_inference.cc:339] Type inference failed. This indicates an invalid graph that escaped type checking. Error message: INVALID_ARGUMENT: expected compatible input types, but input 1:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT64
    }
  }
}
 is neither a subtype nor a supertype of the combined inputs preceding it:
type_id: TFT_OPTIONAL
args {
  type_id: TFT_PRODUCT
  args {
    type_id: TFT_TENSOR
    args {
      type_id: TFT_INT32
    }
  }
}

    for Tuple type infernce function 0
    while inferring type of node 'dnn/zero_fraction/cond/output/_18'
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpyih539cq/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.9991276, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpyih539cq/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.5818331.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2023-09-16T01:21:59
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpyih539cq/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.52606s
INFO:tensorflow:Finished evaluation at 2023-09-16-01:21:59
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.70454544, accuracy_baseline = 0.625, auc = 0.6964494, auc_precision_recall = 0.60180384, average_loss = 0.5988959, global_step = 20, label/mean = 0.375, loss = 0.59320897, precision = 0.6363636, prediction/mean = 0.38547936, recall = 0.4949495
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpyih539cq/model.ckpt-20
{'accuracy': 0.70454544,
 'accuracy_baseline': 0.625,
 'auc': 0.6964494,
 'auc_precision_recall': 0.60180384,
 'average_loss': 0.5988959,
 'label/mean': 0.375,
 'loss': 0.59320897,
 'precision': 0.6363636,
 'prediction/mean': 0.38547936,
 'recall': 0.4949495,
 'global_step': 20}
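
The accuracy_baseline value in the metrics above can be read as the accuracy of always predicting the majority class. A minimal sketch of that arithmetic, using the label/mean of 0.375 from the evaluation output (the helper name is hypothetical):

```python
def accuracy_baseline(label_mean):
    """Accuracy of always predicting the more frequent of two classes."""
    return max(label_mean, 1.0 - label_mean)


# label/mean from the evaluation output above.
print(accuracy_baseline(0.375))  # 0.625
```

Both Estimators evaluated so far reach roughly 0.70 accuracy, which is only modestly above this 0.625 baseline.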

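TensorFlow 2: Using Keras to create a custom DNN model

In TensorFlow 2, you can create a custom DNN model to substitute for one generated by tf.estimator.DNNEstimator, with similar levels of user-specified customization (for instance, as in the previous example, the ability to customize the chosen model optimizer).

A similar workflow can be used to replace tf.estimator.experimental.RNNEstimator with a Keras recurrent neural network (RNN) model. Keras provides a number of built-in, customizable choices by way of tf.keras.layers.RNN, tf.keras.layers.LSTM, and tf.keras.layers.GRU. To learn more, check out the "Built-in RNN layers: a simple example" section of the Keras RNN guide.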

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])

dnn_model.compile(loss='mse', optimizer=create_sample_optimizer('tf2'), metrics=['accuracy'])
dnn_model.fit(x_train, y_train, epochs=10)
dnn_model.evaluate(x_eval, y_eval, return_dict=True)
Epoch 1/10
20/20 [==============================] - 0s 2ms/step - loss: 581.7921 - accuracy: 0.4338
Epoch 2/10
20/20 [==============================] - 0s 2ms/step - loss: 1.2502 - accuracy: 0.4450
Epoch 3/10
20/20 [==============================] - 0s 2ms/step - loss: 0.7286 - accuracy: 0.5183
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.5274 - accuracy: 0.5486
Epoch 5/10
20/20 [==============================] - 0s 2ms/step - loss: 0.4200 - accuracy: 0.5550
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.3405 - accuracy: 0.5821
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2975 - accuracy: 0.6124
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2628 - accuracy: 0.6507
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2382 - accuracy: 0.7002
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2218 - accuracy: 0.7273
9/9 [==============================] - 0s 2ms/step - loss: 0.2414 - accuracy: 0.6894
{'loss': 0.2413530945777893, 'accuracy': 0.689393937587738}

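Example 3: Migrating from DNNLinearCombinedEstimator

TensorFlow 1: Using DNNLinearCombinedEstimator

In TensorFlow 1, you can use tf.estimator.DNNLinearCombinedEstimator to create a baseline combined model for regression and classification problems, with the ability to customize both its linear and DNN components.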

optimizer = create_sample_optimizer('tf1')

combined_estimator = tf.estimator.DNNLinearCombinedEstimator(
    head=tf.estimator.BinaryClassHead(),
    # Wide settings
    linear_feature_columns=feature_columns,
    linear_optimizer=optimizer,
    # Deep settings
    dnn_feature_columns=feature_columns,
    dnn_hidden_units=[128],
    dnn_optimizer=optimizer)
WARNING:tensorflow:From /tmpfs/tmp/ipykernel_10080/1505653152.py:3: DNNLinearCombinedEstimatorV2.__init__ (from tensorflow_estimator.python.estimator.canned.dnn_linear_combined) is deprecated and will be removed in a future version.
Instructions for updating:
Use tf.keras instead.
INFO:tensorflow:Using default config.
WARNING:tensorflow:Using temporary folder as model directory: /tmpfs/tmp/tmpun15otq5
INFO:tensorflow:Using config: {'_model_dir': '/tmpfs/tmp/tmpun15otq5', '_tf_random_seed': None, '_save_summary_steps': 100, '_save_checkpoints_steps': None, '_save_checkpoints_secs': 600, '_session_config': allow_soft_placement: true
graph_options {
  rewrite_options {
    meta_optimizer_iterations: ONE
  }
}
, '_keep_checkpoint_max': 5, '_keep_checkpoint_every_n_hours': 10000, '_log_step_count_steps': 100, '_train_distribute': None, '_device_fn': None, '_protocol': None, '_eval_distribute': None, '_experimental_distribute': None, '_experimental_max_worker_delay_secs': None, '_session_creation_timeout_secs': 7200, '_checkpoint_save_graph_def': True, '_service': None, '_cluster_spec': ClusterSpec({}), '_task_type': 'worker', '_task_id': 0, '_global_id_in_cluster': 0, '_master': '', '_evaluation_master': '', '_is_chief': True, '_num_ps_replicas': 0, '_num_worker_replicas': 1}
combined_estimator.train(input_fn=_input_fn, steps=100)
combined_estimator.evaluate(input_fn=_eval_input_fn, steps=10)
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Create CheckpointSaverHook.
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tmpun15otq5/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 4.244113, step = 0
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 20...
INFO:tensorflow:Saving checkpoints for 20 into /tmpfs/tmp/tmpun15otq5/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 20...
INFO:tensorflow:Loss for final step: 0.5406369.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2023-09-16T01:22:04
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tmpun15otq5/model.ckpt-20
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [1/10]
INFO:tensorflow:Evaluation [2/10]
INFO:tensorflow:Evaluation [3/10]
INFO:tensorflow:Evaluation [4/10]
INFO:tensorflow:Evaluation [5/10]
INFO:tensorflow:Evaluation [6/10]
INFO:tensorflow:Evaluation [7/10]
INFO:tensorflow:Evaluation [8/10]
INFO:tensorflow:Evaluation [9/10]
INFO:tensorflow:Inference Time : 0.58801s
INFO:tensorflow:Finished evaluation at 2023-09-16-01:22:04
INFO:tensorflow:Saving dict for global step 20: accuracy = 0.71590906, accuracy_baseline = 0.625, auc = 0.7440466, auc_precision_recall = 0.6447197, average_loss = 0.5923795, global_step = 20, label/mean = 0.375, loss = 0.5745624, precision = 0.65384614, prediction/mean = 0.3921669, recall = 0.5151515
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 20: /tmpfs/tmp/tmpun15otq5/model.ckpt-20
{'accuracy': 0.71590906,
 'accuracy_baseline': 0.625,
 'auc': 0.7440466,
 'auc_precision_recall': 0.6447197,
 'average_loss': 0.5923795,
 'label/mean': 0.375,
 'loss': 0.5745624,
 'precision': 0.65384614,
 'prediction/mean': 0.3921669,
 'recall': 0.5151515,
 'global_step': 20}

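TensorFlow 2: Using Keras WideDeepModel

In TensorFlow 2, you can create an instance of the Keras tf.compat.v1.keras.models.WideDeepModel to substitute for one generated by tf.estimator.DNNLinearCombinedEstimator, with similar levels of user-specified customization (for instance, as in the previous example, the ability to customize the chosen model optimizer).

This WideDeepModel is constructed from a constituent LinearModel and a custom DNN model, both of which are discussed in the preceding two examples. A custom linear model can also be used in place of the built-in Keras LinearModel if desired.

If you would like to build your own model instead of using a canned estimator, check out the Keras Sequential model guide. For more information on custom training and optimizers, check out the Custom training: walkthrough guide.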

# Create LinearModel and DNN Model as in Examples 1 and 2
optimizer = create_sample_optimizer('tf2')

linear_model = tf.compat.v1.keras.experimental.LinearModel()
linear_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])
linear_model.fit(x_train, y_train, epochs=10, verbose=0)

dnn_model = tf.keras.models.Sequential(
    [tf.keras.layers.Dense(128, activation='relu'),
     tf.keras.layers.Dense(1)])
dnn_model.compile(loss='mse', optimizer=optimizer, metrics=['accuracy'])
combined_model = tf.compat.v1.keras.experimental.WideDeepModel(linear_model,
                                                               dnn_model)
combined_model.compile(
    optimizer=[optimizer, optimizer], loss='mse', metrics=['accuracy'])
combined_model.fit([x_train, x_train], y_train, epochs=10)
combined_model.evaluate(x_eval, y_eval, return_dict=True)
Epoch 1/10
20/20 [==============================] - 1s 3ms/step - loss: 682.2249 - accuracy: 0.6858
Epoch 2/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2909 - accuracy: 0.7193
Epoch 3/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2140 - accuracy: 0.7400
Epoch 4/10
20/20 [==============================] - 0s 2ms/step - loss: 0.2084 - accuracy: 0.7671
Epoch 5/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1809 - accuracy: 0.7719
Epoch 6/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1699 - accuracy: 0.7911
Epoch 7/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1745 - accuracy: 0.7687
Epoch 8/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1665 - accuracy: 0.7974
Epoch 9/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1693 - accuracy: 0.7911
Epoch 10/10
20/20 [==============================] - 0s 2ms/step - loss: 0.1748 - accuracy: 0.7847
9/9 [==============================] - 0s 2ms/step - loss: 0.2050 - accuracy: 0.7159
{'loss': 0.2049505114555359, 'accuracy': 0.7159090638160706}
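
As a rough picture of how a wide-and-deep model combines its two parts, the plain-Python sketch below adds the output of a linear ("wide") component to the output of a small one-hidden-layer network ("deep"). It is a conceptual illustration with made-up parameters, not the Keras implementation. Every function name and number in it is hypothetical, and it assumes the two outputs are simply summed with no final activation.

```python
def relu(x):
    """Rectified linear activation."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer: one weight row and one bias per output unit."""
    outputs = []
    for unit_weights, b in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + b
        outputs.append(activation(z) if activation else z)
    return outputs


def wide_deep_predict(features, linear_w, linear_b,
                      hidden_w, hidden_b, out_w, out_b):
    """Sum of a linear prediction and a one-hidden-layer DNN prediction."""
    wide = sum(w * x for w, x in zip(linear_w, features)) + linear_b
    hidden = dense(features, hidden_w, hidden_b, activation=relu)
    deep = dense(hidden, [out_w], [out_b])[0]
    return wide + deep


# Hypothetical parameters, for illustration only.
prediction = wide_deep_predict(
    features=[1.0, 2.0],
    linear_w=[0.5, 0.5], linear_b=0.0,
    hidden_w=[[1.0, -1.0], [0.5, 0.5]], hidden_b=[0.0, 0.0],
    out_w=[2.0, 1.0], out_b=0.5)
print(prediction)
```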

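Example 4: Migrating from BoostedTreesEstimator

TensorFlow 1: Using BoostedTreesEstimator

In TensorFlow 1, you could use tf.estimator.BoostedTreesEstimator to create a baseline gradient boosting model using an ensemble of decision trees for regression and classification problems. This functionality is no longer included in TensorFlow 2.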

bt_estimator = tf1.estimator.BoostedTreesEstimator(
    head=tf.estimator.BinaryClassHead(),
    n_batches_per_layer=1,
    max_depth=10,
    n_trees=1000,
    feature_columns=feature_columns)
bt_estimator.train(input_fn=_input_fn, steps=1000)
bt_estimator.evaluate(input_fn=_eval_input_fn, steps=100)

TensorFlow 2: Using TensorFlow Decision Forests

In TensorFlow 2, tf.estimator.BoostedTreesEstimator is replaced by tfdf.keras.GradientBoostedTreesModel from the TensorFlow Decision Forests package.

TensorFlow Decision Forests provides various advantages over tf.estimator.BoostedTreesEstimator, notably in quality, speed, ease of use, and flexibility. To learn about TensorFlow Decision Forests, start with the beginner colab.

The following example shows how to train a gradient boosted trees model using TensorFlow 2.

Install TensorFlow Decision Forests.

pip install tensorflow_decision_forests

Create a TensorFlow dataset. Note that Decision Forests natively support many feature types and do not require preprocessing.

train_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/train.csv')
eval_dataframe = pd.read_csv('https://storage.googleapis.com/tf-datasets/titanic/eval.csv')

# Convert the Pandas Dataframes into TensorFlow datasets.
train_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(train_dataframe, label="survived")
eval_dataset = tfdf.keras.pd_dataframe_to_tf_dataset(eval_dataframe, label="survived")

Train the model on the train_dataset dataset.

# Use the default hyper-parameters of the model.
gbt_model = tfdf.keras.GradientBoostedTreesModel()
gbt_model.fit(train_dataset)
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpinxrd9bl as temporary training directory
Reading training dataset...
[WARNING 23-09-16 01:22:09.3074 UTC gradient_boosted_trees.cc:1818] "goss_alpha" set but "sampling_method" not equal to "GOSS".
[WARNING 23-09-16 01:22:09.3074 UTC gradient_boosted_trees.cc:1829] "goss_beta" set but "sampling_method" not equal to "GOSS".
[WARNING 23-09-16 01:22:09.3074 UTC gradient_boosted_trees.cc:1843] "selective_gradient_boosting_ratio" set but "sampling_method" not equal to "SELGB".
Training dataset read in 0:00:03.607059. Found 627 examples.
Training model...
Model trained in 0:00:00.226305
Compiling model...
[INFO 23-09-16 01:22:13.1488 UTC kernel.cc:1243] Loading model from path /tmpfs/tmp/tmpinxrd9bl/model/ with prefix b67347ee49794d62
[INFO 23-09-16 01:22:13.1525 UTC abstract_model.cc:1311] Engine "GradientBoostedTreesQuickScorerExtended" built
[INFO 23-09-16 01:22:13.1525 UTC kernel.cc:1075] Use fast generic engine
WARNING:tensorflow:AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x7f7376b33700> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
WARNING: AutoGraph could not transform <function simple_ml_inference_op_with_handle at 0x7f7376b33700> and will run it as-is.
Please report this to the TensorFlow team. When filing the bug, set the verbosity to 10 (on Linux, `export AUTOGRAPH_VERBOSITY=10`) and attach the full output.
Cause: could not get source code
To silence this warning, decorate the function with @tf.autograph.experimental.do_not_convert
Model compiled.
<keras.src.callbacks.History at 0x7f72a4271430>

Evaluate the quality of the model on the eval_dataset dataset.

gbt_model.compile(metrics=['accuracy'])
gbt_evaluation = gbt_model.evaluate(eval_dataset, return_dict=True)
print(gbt_evaluation)
1/1 [==============================] - 0s 296ms/step - loss: 0.0000e+00 - accuracy: 0.8295
{'loss': 0.0, 'accuracy': 0.8295454382896423}
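
To give a feel for what gradient boosting does, the toy sketch below fits a sequence of one-split trees (stumps) on a single numeric feature, each one trained on the residuals left by the previous ones. It is a minimal squared-error illustration in plain Python, not the algorithm or API used by TensorFlow Decision Forests. All names, the shrinkage value, and the data are hypothetical.

```python
def fit_stump(xs, residuals):
    """Best single threshold split for squared error (needs 2+ distinct xs)."""
    best = None
    for threshold in sorted(set(xs))[:-1]:
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left) +
                 sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    return best[1:]


def fit_boosted_stumps(xs, ys, num_trees=20, shrinkage=0.5):
    """Start from the label mean, then add stumps fitted to the residuals."""
    base = sum(ys) / len(ys)
    preds = [base] * len(ys)
    stumps = []
    for _ in range(num_trees):
        residuals = [y - p for y, p in zip(ys, preds)]
        threshold, left_value, right_value = fit_stump(xs, residuals)
        stumps.append((threshold, left_value, right_value))
        preds = [p + shrinkage * (left_value if x <= threshold else right_value)
                 for x, p in zip(xs, preds)]
    return base, shrinkage, stumps


def predict(model, x):
    """Base value plus the shrunken contribution of every stump."""
    base, shrinkage, stumps = model
    return base + sum(shrinkage * (left if x <= threshold else right)
                      for threshold, left, right in stumps)


# Hypothetical toy data: the label switches from 0 to 1 between x=2 and x=3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [0.0, 0.0, 1.0, 1.0]
model = fit_boosted_stumps(xs, ys)
print(predict(model, 1.5), predict(model, 3.5))
```

Each added stump shrinks the remaining error, which is why the number of trees and the shrinkage rate are the usual knobs for trading accuracy against overfitting.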

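Gradient boosted trees are just one of the many decision forest algorithms available in TensorFlow Decision Forests. For example, Random Forests (available as tfdf.keras.RandomForestModel) are very resistant to overfitting, while CART (available as tfdf.keras.CartModel) is great for model interpretation.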

In the next example, train and evaluate a Random Forest model.

# Train a Random Forest model
rf_model = tfdf.keras.RandomForestModel()
rf_model.fit(train_dataset)

# Evaluate the Random Forest model
rf_model.compile(metrics=['accuracy'])
rf_evaluation = rf_model.evaluate(eval_dataset, return_dict=True)
print(rf_evaluation)
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpdvqhwuwo as temporary training directory
Reading training dataset...
Training dataset read in 0:00:00.187950. Found 627 examples.
Training model...
Model trained in 0:00:00.191396
Compiling model...
[INFO 23-09-16 01:22:15.4265 UTC kernel.cc:1243] Loading model from path /tmpfs/tmp/tmpdvqhwuwo/model/ with prefix 39c681f57c12496f
[INFO 23-09-16 01:22:15.5264 UTC decision_forest.cc:660] Model loaded with 300 root(s), 34556 node(s), and 9 input feature(s).
[INFO 23-09-16 01:22:15.5265 UTC kernel.cc:1075] Use fast generic engine
Model compiled.
1/1 [==============================] - 0s 141ms/step - loss: 0.0000e+00 - accuracy: 0.8333
{'loss': 0.0, 'accuracy': 0.8333333134651184}

In the final example, train and plot a CART model.

# Train a CART model
cart_model = tfdf.keras.CartModel()
cart_model.fit(train_dataset)

# Plot the CART model
tfdf.model_plotter.plot_model_in_colab(cart_model, max_depth=2)
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
WARNING:absl:The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpmgmlcvs6 as temporary training directory
Reading training dataset...
Training dataset read in 0:00:00.187860. Found 627 examples.
Training model...
Model trained in 0:00:00.018191
Compiling model...
Model compiled.
[INFO 23-09-16 01:22:16.0918 UTC kernel.cc:1243] Loading model from path /tmpfs/tmp/tmpmgmlcvs6/model/ with prefix efef116800e041d8
[INFO 23-09-16 01:22:16.0921 UTC decision_forest.cc:660] Model loaded with 1 root(s), 21 node(s), and 5 input feature(s).
[INFO 23-09-16 01:22:16.0922 UTC kernel.cc:1075] Use fast generic engine