TFX Estimator Component Tutorial

A Component-by-Component Introduction to TensorFlow Extended (TFX)

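This Colab-based tutorial interactively walks through each built-in component of TensorFlow Extended (TFX).


When you're done, the contents of this notebook can be automatically exported as TFX pipeline source code, which you can orchestrate with Apache Airflow and Apache Beam.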


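This notebook demonstrates how to use TFX in a Jupyter/Colab environment. Here, we walk through the Chicago Taxi example in an interactive notebook.

Working in an interactive notebook is a useful way to become familiar with the structure of a TFX pipeline. It is also useful as a lightweight development environment when you develop your own pipelines, but be aware that interactive notebooks differ in how they are orchestrated and in how they access metadata artifacts.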


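In a production deployment of TFX, you will use an orchestrator such as Apache Airflow, Kubeflow Pipelines, or Apache Beam to orchestrate a pre-defined pipeline graph of TFX components. In an interactive notebook, the notebook itself is the orchestrator, running each TFX component as you execute the notebook cells.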
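
As a rough illustration of this difference, the sketch below uses only the Python standard library, not the TFX API. The component names mirror the ones used in this tutorial, but `PIPELINE_GRAPH`, `run_component`, and `orchestrate` are hypothetical stand-ins. An orchestrator derives the run order from the predefined graph, whereas in a notebook the order is simply the order in which you execute cells.

```python
from graphlib import TopologicalSorter

# Hypothetical dependency graph: each component maps to the components whose
# outputs it consumes. The names mirror the components used in this tutorial.
PIPELINE_GRAPH = {
    "CsvExampleGen": set(),
    "StatisticsGen": {"CsvExampleGen"},
    "SchemaGen": {"StatisticsGen"},
    "ExampleValidator": {"StatisticsGen", "SchemaGen"},
    "Transform": {"CsvExampleGen", "SchemaGen"},
}

executed = []


def run_component(name):
    """Stand-in for running one component; it only records the call."""
    executed.append(name)


def orchestrate(graph):
    """Run every component after all of its upstream components."""
    for name in TopologicalSorter(graph).static_order():
        run_component(name)


orchestrate(PIPELINE_GRAPH)
print(executed)
```

In the notebook there is no such graph walker: if you run the Transform cell before the SchemaGen cell, nothing reorders the calls for you.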


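In a production deployment of TFX, you will access metadata through the ML Metadata (MLMD) API. MLMD stores metadata properties in a database such as MySQL or SQLite, and stores the metadata payloads in a persistent store such as your filesystem. In an interactive notebook, both properties and payloads are stored in an ephemeral SQLite database in the /tmp directory on the Jupyter notebook or Colab server.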
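
The split between properties and payloads can be pictured with a small standard-library sketch. It does not use the MLMD API, and the `artifacts` table layout is a hypothetical simplification rather than MLMD's real schema. It only shows the idea of a database row that describes an artifact and points at a payload stored elsewhere.

```python
import os
import sqlite3
import tempfile

# Payloads live on the file system; this directory stands in for persistent storage.
payload_dir = tempfile.mkdtemp(prefix="payloads-")
payload_path = os.path.join(payload_dir, "examples.txt")
with open(payload_path, "w") as f:
    f.write("serialized examples would go here\n")

# Properties live in a database. The table layout is hypothetical, not MLMD's schema.
db_path = os.path.join(tempfile.mkdtemp(prefix="metadata-"), "metadata.sqlite")
conn = sqlite3.connect(db_path)
conn.execute("CREATE TABLE artifacts (id INTEGER PRIMARY KEY, type TEXT, uri TEXT)")
conn.execute("INSERT INTO artifacts (type, uri) VALUES (?, ?)",
             ("Examples", payload_path))
conn.commit()

# A downstream step looks up the property row, then follows the URI to the payload.
artifact_type, uri = conn.execute(
    "SELECT type, uri FROM artifacts WHERE id = 1").fetchone()
with open(uri) as f:
    payload = f.read()
conn.close()
print(artifact_type, uri)
```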



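Upgrade Pip

To avoid upgrading Pip on a local system, we first check that we are running in Colab. Local systems can of course be upgraded separately.

try:
  # This import only succeeds inside Colab.
  import colab
  !pip install --upgrade pip
except ImportError:
  pass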

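Install TFX

!pip install tfx


If you are using Google Colab, the first time that you run the cell above you must restart the runtime (Runtime > Restart runtime ...). This is because of the way that Colab loads packages.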


We import the necessary packages, including the standard TFX component classes.

import os
import pprint
import tempfile
import urllib.request

import absl
import tensorflow as tf
import tensorflow_model_analysis as tfma
tf.get_logger().propagate = False
pp = pprint.PrettyPrinter()

from tfx import v1 as tfx
from tfx.orchestration.experimental.interactive.interactive_context import InteractiveContext

%load_ext tfx.orchestration.experimental.interactive.notebook_extensions.skip
print('TensorFlow version: {}'.format(tf.__version__))
print('TFX version: {}'.format(tfx.__version__))
TensorFlow version: 2.15.1
TFX version: 1.15.0


# This is the root directory for your TFX pip package installation.
_tfx_root = tfx.__path__[0]

# This is the directory containing the TFX Chicago Taxi Pipeline example.
_taxi_root = os.path.join(_tfx_root, 'examples/chicago_taxi_pipeline')

# This is the path where your model will be pushed for serving.
_serving_model_dir = os.path.join(
    tempfile.mkdtemp(), 'serving_model/taxi_simple')

# Set up logging.
absl.logging.set_verbosity(absl.logging.INFO)


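We download the example dataset for use in our TFX pipeline.

The dataset we are using is the Taxi Trips dataset released by the City of Chicago. The columns in this dataset are pickup_community_area, fare, trip_start_month, trip_start_hour, trip_start_day, trip_start_timestamp, pickup_latitude, pickup_longitude, dropoff_latitude, dropoff_longitude, trip_miles, pickup_census_tract, dropoff_census_tract, payment_type, company, trip_seconds, dropoff_community_area, and tips.


With this dataset, we will build a model that predicts the tips of a trip.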

_data_root = tempfile.mkdtemp(prefix='tfx-data')
_data_filepath = os.path.join(_data_root, "data.csv")
# DATA_PATH must hold the URL of the Chicago Taxi CSV file to download.
urllib.request.urlretrieve(DATA_PATH, _data_filepath)
 <http.client.HTTPMessage at 0x7f393fe02ee0>)

Take a quick look at the CSV file.

!head {_data_filepath}
,12.45,5,19,6,1400269500,,,,,0.0,,,Credit Card,Chicago Elite Cab Corp. (Chicago Carriag,0,,0.0
,0,3,19,5,1362683700,,,,,0,,,Unknown,Chicago Elite Cab Corp.,300,,0
60,27.05,10,2,3,1380593700,41.836150155,-87.648787952,,,12.6,,,Cash,Taxi Affiliation Services,1380,,0.0
10,5.85,10,1,2,1382319000,41.985015101,-87.804532006,,,0.0,,,Cash,Taxi Affiliation Services,180,,0.0
14,16.65,5,7,5,1369897200,41.968069,-87.721559063,,,0.0,,,Cash,Dispatch Taxi Affiliation,1080,,0.0
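
To make the rows above easier to read, the sketch below pairs one of them with column names. The column order is an assumption: it is inferred by matching the values in these rows against the decoded examples shown later in this tutorial, not taken from the file's header. `COLUMNS`, `SAMPLE_ROW`, and `parse_row` are hypothetical names used only for illustration.

```python
import csv
import io

# Column order inferred from the sample rows and the decoded examples in this
# tutorial; treat it as an assumption rather than the file's documented header.
COLUMNS = [
    "pickup_community_area", "fare", "trip_start_month", "trip_start_hour",
    "trip_start_day", "trip_start_timestamp", "pickup_latitude",
    "pickup_longitude", "dropoff_latitude", "dropoff_longitude", "trip_miles",
    "pickup_census_tract", "dropoff_census_tract", "payment_type", "company",
    "trip_seconds", "dropoff_community_area", "tips",
]

# One of the rows printed above.
SAMPLE_ROW = ("60,27.05,10,2,3,1380593700,41.836150155,-87.648787952,,,12.6,,,"
              "Cash,Taxi Affiliation Services,1380,,0.0")


def parse_row(line):
    """Map one CSV line to a dict, using None for empty (missing) cells."""
    values = next(csv.reader(io.StringIO(line)))
    return {name: (value if value != "" else None)
            for name, value in zip(COLUMNS, values)}


row = parse_row(SAMPLE_ROW)
missing = [name for name, value in row.items() if value is None]
print(row["fare"], row["payment_type"], missing)
```

Empty cells are common in this dataset, which is why the preprocessing code later in the tutorial fills in missing values before transforming a feature.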

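Disclaimer: This site provides applications using data that has been modified for use from its original source, the official website of the City of Chicago. The City of Chicago makes no claims as to the content, accuracy, timeliness, or completeness of any of the data provided at this site. The data provided at this site is subject to change at any time. It is understood that the data provided at this site is being used at one's own risk.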

Create the InteractiveContext

Last, we create an InteractiveContext, which allows us to run TFX components interactively in this notebook.

# Here, we create an InteractiveContext using default parameters. This will
# use a temporary directory with an ephemeral ML Metadata database instance.
# To use your own pipeline root or database, the optional properties
# `pipeline_root` and `metadata_connection_config` may be passed to
# InteractiveContext. Calls to InteractiveContext are no-ops outside of the
# notebook.
context = InteractiveContext()

Run TFX components interactively

In the cells that follow, we create TFX components one by one, run each of them, and visualize their output artifacts.


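The ExampleGen component is usually at the start of a TFX pipeline. It will:

  1. Split data into training and evaluation sets (by default, 2/3 training + 1/3 eval)
  2. Convert data into the tf.Example format
  3. Copy data into the pipeline's output directory for other components to access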
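
The 2/3 + 1/3 split in step 1 can be pictured as hash bucketing: each record is hashed into one of three buckets, two of which go to training. The sketch below is a standard-library illustration under that assumption. It is not the code ExampleGen runs, and `SPLITS` and `assign_split` are hypothetical names.

```python
import hashlib

# Hypothetical split configuration: 2 hash buckets for train, 1 for eval.
SPLITS = [("train", 2), ("eval", 1)]


def assign_split(record, splits=SPLITS):
    """Deterministically assign a record to a split by hashing its contents."""
    total = sum(buckets for _, buckets in splits)
    digest = hashlib.sha256(record.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % total
    upper = 0
    for name, buckets in splits:
        upper += buckets
        if bucket < upper:
            return name
    raise AssertionError("unreachable: bucket is always below total")


records = [f"row-{i}" for i in range(3000)]
counts = {"train": 0, "eval": 0}
for record in records:
    counts[assign_split(record)] += 1
print(counts)
```

Because the assignment depends only on the record's contents, rerunning the split on the same data sends every record to the same split again.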

ExampleGen takes as input the path to your data source. In our case, this is the _data_root path that contains the downloaded CSV.

example_gen = tfx.components.CsvExampleGen(input_base=_data_root)
context.run(example_gen)
INFO:absl:Running driver for CsvExampleGen
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:select span and version = (0, None)
INFO:absl:latest span and version = (0, None)
INFO:absl:Running executor for CsvExampleGen
INFO:absl:Generating examples.
WARNING:apache_beam.runners.interactive.interactive_environment:Dependencies required for Interactive Beam PCollection visualization are not available, please use: `pip install apache-beam[interactive]` to install necessary dependencies to enable all data visualization features.
INFO:absl:Processing input csv data /tmpfs/tmp/tfx-datafzazc6h4/* to TFExample.
WARNING:apache_beam.io.tfrecordio:Couldn't find python-snappy so the implementation of _TFRecordUtil._masked_crc32c is not as fast as it could be.
INFO:absl:Examples generated.
INFO:absl:Running publisher for CsvExampleGen
INFO:absl:MetadataStore with DB connection initialized

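Let's examine the output artifacts of ExampleGen. This component produces an examples artifact with two splits, training examples and evaluation examples: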

artifact = example_gen.outputs['examples'].get()[0]
print(artifact.split_names, artifact.uri)
["train", "eval"] /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/CsvExampleGen/examples/1


# Get the URI of the output artifact representing the training examples, which is a directory
train_uri = os.path.join(example_gen.outputs['examples'].get()[0].uri, 'Split-train')

# Get the list of files in this directory (all compressed TFRecord files)
tfrecord_filenames = [os.path.join(train_uri, name)
                      for name in os.listdir(train_uri)]

# Create a `TFRecordDataset` to read these files
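dataset = tf.data.TFRecordDataset(tfrecord_filenames, compression_type="GZIP")

# Iterate over the first 3 records and decode them.
for tfrecord in dataset.take(3):
  serialized_example = tfrecord.numpy()
  example = tf.train.Example()
  example.ParseFromString(serialized_example)
  pp.pprint(example)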
features {
  feature {
    key: "company"
    value {
      bytes_list {
        value: "Chicago Elite Cab Corp. (Chicago Carriag"
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 12.449999809265137
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      bytes_list {
        value: "Credit Card"
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "tips"
    value {
      float_list {
        value: 0.0
      }
    }
  }
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: 0.0
      }
    }
  }
  feature {
    key: "trip_seconds"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 6
      }
    }
  }
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 19
      }
    }
  }
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 5
      }
    }
  }
  feature {
    key: "trip_start_timestamp"
    value {
      int64_list {
        value: 1400269500
      }
    }
  }
}

features {
  feature {
    key: "company"
    value {
      bytes_list {
        value: "Taxi Affiliation Services"
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 27.049999237060547
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      bytes_list {
        value: "Cash"
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
        value: 60
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      float_list {
        value: 41.836151123046875
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      float_list {
        value: -87.64878845214844
      }
    }
  }
  feature {
    key: "tips"
    value {
      float_list {
        value: 0.0
      }
    }
  }
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: 12.600000381469727
      }
    }
  }
  feature {
    key: "trip_seconds"
    value {
      int64_list {
        value: 1380
      }
    }
  }
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 3
      }
    }
  }
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 2
      }
    }
  }
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 10
      }
    }
  }
  feature {
    key: "trip_start_timestamp"
    value {
      int64_list {
        value: 1380593700
      }
    }
  }
}

features {
  feature {
    key: "company"
    value {
      bytes_list {
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      float_list {
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 16.450000762939453
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      bytes_list {
        value: "Cash"
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
        value: 13
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      float_list {
        value: 41.98363494873047
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      float_list {
        value: -87.72357940673828
      }
    }
  }
  feature {
    key: "tips"
    value {
      float_list {
        value: 0.0
      }
    }
  }
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: 6.900000095367432
      }
    }
  }
  feature {
    key: "trip_seconds"
    value {
      int64_list {
        value: 780
      }
    }
  }
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 3
      }
    }
  }
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 12
      }
    }
  }
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 11
      }
    }
  }
  feature {
    key: "trip_start_timestamp"
    value {
      int64_list {
        value: 1446554700
      }
    }
  }
}

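Now that ExampleGen has finished ingesting the data, the next step is data analysis.


The StatisticsGen component computes statistics over your dataset for data analysis, as well as for use in downstream components. It uses the TensorFlow Data Validation library.

StatisticsGen takes as input the dataset we just ingested using ExampleGen.

statistics_gen = tfx.components.StatisticsGen(examples=example_gen.outputs['examples'])
context.run(statistics_gen)

After StatisticsGen finishes running, we can visualize the outputted statistics. Try playing with the different plots!

context.show(statistics_gen.outputs['statistics'])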
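
To give a feel for the kind of summaries involved, here is a minimal standard-library sketch. It is not how TensorFlow Data Validation computes statistics; the mini-dataset and the `summarize` helper are hypothetical, with a few fares borrowed from the CSV rows shown earlier.

```python
import statistics

# Hypothetical mini-dataset; None marks a missing value.
examples = [
    {"fare": 12.45, "trip_miles": 0.0, "payment_type": "Credit Card"},
    {"fare": 27.05, "trip_miles": 12.6, "payment_type": "Cash"},
    {"fare": 5.85, "trip_miles": 0.0, "payment_type": "Cash"},
    {"fare": None, "trip_miles": 0.0, "payment_type": "Unknown"},
]


def summarize(examples, feature):
    """Return count, missing count, and numeric or categorical summaries."""
    values = [ex[feature] for ex in examples]
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present and all(isinstance(v, (int, float)) for v in present):
        summary.update(mean=statistics.mean(present),
                       min=min(present), max=max(present))
    else:
        summary["unique"] = len(set(present))
    return summary


stats = {name: summarize(examples, name) for name in examples[0]}
for name, summary in stats.items():
    print(name, summary)
```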


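The SchemaGen component generates a schema based on your data statistics. (A schema defines the expected bounds, types, and properties of the features in your dataset.) It also uses the TensorFlow Data Validation library.

SchemaGen takes as input the statistics that we generated with StatisticsGen, looking at the training split by default.

schema_gen = tfx.components.SchemaGen(
    statistics=statistics_gen.outputs['statistics'],
    infer_feature_shape=False)
context.run(schema_gen)

After SchemaGen finishes running, we can visualize the generated schema as a table.

context.show(schema_gen.outputs['schema'])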
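
The idea of inferring a schema from observed data can be sketched in a few lines of plain Python. This is a simplified illustration, not TensorFlow Data Validation's inference logic: `infer_schema` and `train_examples` are hypothetical, and the sketch records only a type name and whether a feature was always present.

```python
def infer_schema(examples):
    """Infer a type name and a required/optional flag for every feature."""
    schema = {}
    for name in examples[0]:
        present = [ex[name] for ex in examples if ex.get(name) is not None]
        kinds = {type(v).__name__ for v in present}
        if len(kinds) == 1:
            type_name = next(iter(kinds))
        elif not kinds:
            type_name = "unknown"
        else:
            type_name = "mixed"
        schema[name] = {
            "type": type_name,
            # A feature is required only if every example carries a value.
            "required": len(present) == len(examples),
        }
    return schema


# Hypothetical training rows; None marks a missing value.
train_examples = [
    {"fare": 12.45, "trip_seconds": 0, "company": "Chicago Elite Cab Corp."},
    {"fare": 27.05, "trip_seconds": 1380, "company": "Taxi Affiliation Services"},
    {"fare": 5.85, "trip_seconds": 180, "company": None},
]

schema = infer_schema(train_examples)
for name, spec in schema.items():
    print(name, spec)
```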


To learn more about schemas, see the SchemaGen documentation.


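The ExampleValidator component detects anomalies in your data, based on the expectations defined by the schema. It also uses the TensorFlow Data Validation library.

ExampleValidator takes as input the statistics from StatisticsGen and the schema from SchemaGen.

example_validator = tfx.components.ExampleValidator(
    statistics=statistics_gen.outputs['statistics'],
    schema=schema_gen.outputs['schema'])
context.run(example_validator)

After ExampleValidator finishes running, we can visualize the anomalies as a table.

context.show(example_validator.outputs['anomalies'])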
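
What counts as an anomaly can be illustrated with a small standard-library sketch. ExampleValidator works from the statistics and the schema; the sketch below checks rows one at a time only to keep the idea visible. `SCHEMA`, `find_anomalies`, and the sample rows are hypothetical.

```python
# Hypothetical schema: expected Python type and whether the feature is required.
SCHEMA = {
    "fare": {"type": float, "required": True},
    "payment_type": {"type": str, "required": True,
                     "domain": {"Cash", "Credit Card", "Unknown"}},
    "company": {"type": str, "required": False},
}


def find_anomalies(examples, schema=SCHEMA):
    """Return a list of (row index, feature, description) tuples."""
    anomalies = []
    for index, example in enumerate(examples):
        for name, spec in schema.items():
            value = example.get(name)
            if value is None:
                if spec["required"]:
                    anomalies.append((index, name, "missing required feature"))
            elif not isinstance(value, spec["type"]):
                anomalies.append((index, name, "unexpected type"))
            elif "domain" in spec and value not in spec["domain"]:
                anomalies.append((index, name, "value outside domain"))
    return anomalies


# Hypothetical evaluation rows; the second and third break the schema.
eval_examples = [
    {"fare": 16.65, "payment_type": "Cash", "company": "Dispatch Taxi Affiliation"},
    {"fare": None, "payment_type": "Cash", "company": None},
    {"fare": "5.85", "payment_type": "Bitcoin", "company": None},
]

anomalies = find_anomalies(eval_examples)
for anomaly in anomalies:
    print(anomaly)
```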



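The Transform component performs feature engineering for both training and serving. It uses the TensorFlow Transform library.

Transform takes as input the data from ExampleGen, the schema from SchemaGen, as well as a module that contains user-defined Transform code.

Let's see an example of user-defined Transform code below (for an introduction to the TensorFlow Transform APIs, see the tutorial). First, we define a few constants for feature engineering: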

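_taxi_constants_module_file = 'taxi_constants.py'

%%writefile {_taxi_constants_module_file}

# Categorical features are assumed to each have a maximum value in the dataset.
# Assumed values, as in the TFX Chicago Taxi example.
MAX_CATEGORICAL_FEATURE_VALUES = [24, 31, 12]

CATEGORICAL_FEATURE_KEYS = [
    'trip_start_hour', 'trip_start_day', 'trip_start_month',
    'pickup_census_tract', 'dropoff_census_tract', 'pickup_community_area',
    'dropoff_community_area'
]

DENSE_FLOAT_FEATURE_KEYS = ['trip_miles', 'fare', 'trip_seconds']

# Number of buckets used by tf.transform for encoding each feature.
# Assumed value, as in the TFX Chicago Taxi example.
FEATURE_BUCKET_COUNT = 10

BUCKET_FEATURE_KEYS = [
    'pickup_latitude', 'pickup_longitude', 'dropoff_latitude',
    'dropoff_longitude'
]

# Number of vocabulary terms used for encoding VOCAB_FEATURES by tf.transform
# Assumed value, as in the TFX Chicago Taxi example.
VOCAB_SIZE = 1000

# Count of out-of-vocab buckets in which unrecognized VOCAB_FEATURES are hashed.
# Assumed value, as in the TFX Chicago Taxi example.
OOV_SIZE = 10

VOCAB_FEATURE_KEYS = [
    'payment_type',
    'company',
]

# Keys
LABEL_KEY = 'tips'
FARE_KEY = 'fare'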

Next, we write a preprocessing_fn that takes in raw data as input, and returns transformed features that our model can train on:

_taxi_transform_module_file = 'taxi_transform.py'

%%writefile {_taxi_transform_module_file}

import tensorflow as tf
import tensorflow_transform as tft

import taxi_constants

_DENSE_FLOAT_FEATURE_KEYS = taxi_constants.DENSE_FLOAT_FEATURE_KEYS
_VOCAB_FEATURE_KEYS = taxi_constants.VOCAB_FEATURE_KEYS
_VOCAB_SIZE = taxi_constants.VOCAB_SIZE
_OOV_SIZE = taxi_constants.OOV_SIZE
_FEATURE_BUCKET_COUNT = taxi_constants.FEATURE_BUCKET_COUNT
_BUCKET_FEATURE_KEYS = taxi_constants.BUCKET_FEATURE_KEYS
_CATEGORICAL_FEATURE_KEYS = taxi_constants.CATEGORICAL_FEATURE_KEYS
_FARE_KEY = taxi_constants.FARE_KEY
_LABEL_KEY = taxi_constants.LABEL_KEY

def preprocessing_fn(inputs):
  """tf.transform's callback function for preprocessing inputs.
  Args:
    inputs: map from feature keys to raw not-yet-transformed features.
  Returns:
    Map from string feature key to transformed feature operations.
  """
  outputs = {}
  for key in _DENSE_FLOAT_FEATURE_KEYS:
    # If sparse make it dense, setting nan's to 0 or '', and apply zscore.
    outputs[key] = tft.scale_to_z_score(
        _fill_in_missing(inputs[key]))

  for key in _VOCAB_FEATURE_KEYS:
    # Build a vocabulary for this feature.
    outputs[key] = tft.compute_and_apply_vocabulary(
        _fill_in_missing(inputs[key]),
        top_k=_VOCAB_SIZE,
        num_oov_buckets=_OOV_SIZE)

  for key in _BUCKET_FEATURE_KEYS:
    outputs[key] = tft.bucketize(
        _fill_in_missing(inputs[key]), _FEATURE_BUCKET_COUNT)

  for key in _CATEGORICAL_FEATURE_KEYS:
    # Categorical features are passed through with missing values filled in.
    outputs[key] = _fill_in_missing(inputs[key])

  # Was this passenger a big tipper?
  taxi_fare = _fill_in_missing(inputs[_FARE_KEY])
  tips = _fill_in_missing(inputs[_LABEL_KEY])
  outputs[_LABEL_KEY] = tf.where(
      tf.math.is_nan(taxi_fare),
      tf.cast(tf.zeros_like(taxi_fare), tf.int64),
      # Test if the tip was > 20% of the fare.
      tf.cast(
          tf.greater(tips, tf.multiply(taxi_fare, tf.constant(0.2))), tf.int64))

  return outputs

def _fill_in_missing(x):
  """Replace missing values in a SparseTensor.
  Fills in missing values of `x` with '' or 0, and converts to a dense tensor.
  Args:
    x: A `SparseTensor` of rank 2.  Its dense shape should have size at most 1
      in the second dimension.
  Returns:
    A rank 1 tensor where missing values of `x` have been filled in.
  """
  if not isinstance(x, tf.sparse.SparseTensor):
    return x

  default_value = '' if x.dtype == tf.string else 0
  return tf.squeeze(
      tf.sparse.to_dense(
          tf.SparseTensor(x.indices, x.values, [x.dense_shape[0], 1]),
          default_value),
      axis=1)
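
To see what these transformations do to actual numbers, here is a plain-Python sketch of the four ideas used above: z-score scaling, vocabulary lookup with out-of-vocabulary buckets, bucketizing, and the big-tipper label. It is an illustration under simplifying assumptions, not the TensorFlow Transform implementation: all helper names are hypothetical, the z-score uses the population standard deviation, the out-of-vocabulary hash is `zlib.crc32`, and bucket boundaries are exact quantiles of a tiny list.

```python
import statistics
import zlib


def z_score(values):
    """Scale values to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std if std else 0.0 for v in values]


def apply_vocabulary(values, top_k, num_oov_buckets):
    """Map the top_k most frequent strings to ids; hash the rest into OOV ids."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    ranked = sorted(counts, key=lambda v: (-counts[v], v))[:top_k]
    vocab = {term: index for index, term in enumerate(ranked)}
    return [
        vocab[v] if v in vocab
        else len(vocab) + zlib.crc32(v.encode("utf-8")) % num_oov_buckets
        for v in values
    ]


def bucketize(values, num_buckets):
    """Assign each value to one of num_buckets quantile-based buckets."""
    ordered = sorted(values)
    # Boundaries sit at evenly spaced ranks of the sorted values.
    boundaries = [ordered[len(ordered) * i // num_buckets]
                  for i in range(1, num_buckets)]
    return [sum(v >= b for b in boundaries) for v in values]


def is_big_tipper(fare, tips):
    """Label is 1 when the tip exceeds 20% of the fare, else 0."""
    return int(tips > fare * 0.2)


scaled = z_score([12.45, 27.05, 5.85, 16.65])
payment_ids = apply_vocabulary(["Cash", "Cash", "Credit Card", "Unknown"],
                               top_k=2, num_oov_buckets=3)
buckets = bucketize([41.83, 41.98, 41.96, 41.90], num_buckets=2)
labels = [is_big_tipper(10.0, 2.5), is_big_tipper(10.0, 2.0)]
print(scaled, payment_ids, buckets, labels)
```

Note that a tip of exactly 20% does not count as a big tip, because the comparison is strictly greater than.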

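Now, we pass in this feature engineering code to the Transform component and run it to transform your data.

transform = tfx.components.Transform(
    examples=example_gen.outputs['examples'],
    schema=schema_gen.outputs['schema'],
    module_file=os.path.abspath(_taxi_transform_module_file))
context.run(transform)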
INFO:absl:Feature company has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature fare has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature payment_type has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature tips has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_miles has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_seconds has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_day has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_hour has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_month has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_timestamp has no shape. Setting to varlen_sparse_tensor.
INFO:absl:If the number of unique tokens is smaller than the provided top_k or approximation error is acceptable, consider using tft.experimental.approximate_vocabulary for a potentially more efficient implementation.
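
Let's examine the output artifacts of Transform. This component produces several outputs; the two used most often are:

  • transform_graph is the graph that can perform the preprocessing operations (this graph will be included in the serving and evaluation models).
  • transformed_examples represents the preprocessed training and evaluation data.

transform.outputs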
{'transform_graph': OutputChannel(artifact_type=TransformGraph, producer_component_id=Transform, output_key=transform_graph, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'transformed_examples': OutputChannel(artifact_type=Examples, producer_component_id=Transform, output_key=transformed_examples, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'updated_analyzer_cache': OutputChannel(artifact_type=TransformCache, producer_component_id=Transform, output_key=updated_analyzer_cache, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'pre_transform_schema': OutputChannel(artifact_type=Schema, producer_component_id=Transform, output_key=pre_transform_schema, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'pre_transform_stats': OutputChannel(artifact_type=ExampleStatistics, producer_component_id=Transform, output_key=pre_transform_stats, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'post_transform_schema': OutputChannel(artifact_type=Schema, producer_component_id=Transform, output_key=post_transform_schema, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'post_transform_stats': OutputChannel(artifact_type=ExampleStatistics, producer_component_id=Transform, output_key=post_transform_stats, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'post_transform_anomalies': OutputChannel(artifact_type=ExampleAnomalies, producer_component_id=Transform, output_key=post_transform_anomalies, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False)}

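Take a peek at the transform_graph artifact. It points to a directory containing three subdirectories.

train_uri = transform.outputs['transform_graph'].get()[0].uri
os.listdir(train_uri)
['transform_fn', 'metadata', 'transformed_metadata']

The transformed_metadata subdirectory contains the schema of the preprocessed data. The transform_fn subdirectory contains the actual preprocessing graph. The metadata subdirectory contains the schema of the original data.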


# Get the URI of the output artifact representing the transformed examples, which is a directory
train_uri = os.path.join(transform.outputs['transformed_examples'].get()[0].uri, 'Split-train')

# Get the list of files in this directory (all compressed TFRecord files)
tfrecord_filenames = [os.path.join(train_uri, name)
                      for name in os.listdir(train_uri)]

# Create a `TFRecordDataset` to read these files
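dataset = tf.data.TFRecordDataset(tfrecord_filenames, compression_type="GZIP")

# Iterate over the first 3 records and decode them.
for tfrecord in dataset.take(3):
  serialized_example = tfrecord.numpy()
  example = tf.train.Example()
  example.ParseFromString(serialized_example)
  pp.pprint(example)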
features {
  feature {
    key: "company"
    value {
      int64_list {
        value: 8
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      int64_list {
        value: 9
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 0.061060599982738495
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      int64_list {
        value: 1
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      int64_list {
        value: 9
      }
    }
  }
  feature {
    key: "tips"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: -0.15886740386486053
      }
    }
  }
  feature {
    key: "trip_seconds"
    value {
      float_list {
        value: -0.7118487358093262
      }
    }
  }
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 6
      }
    }
  }
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 19
      }
    }
  }
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 5
      }
    }
  }
}

features {
  feature {
    key: "company"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      int64_list {
        value: 9
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 1.2521240711212158
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
        value: 60
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      int64_list {
        value: 3
      }
    }
  }
  feature {
    key: "tips"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: 0.532160758972168
      }
    }
  }
  feature {
    key: "trip_seconds"
    value {
      float_list {
        value: 0.5509493350982666
      }
    }
  }
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 3
      }
    }
  }
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 2
      }
    }
  }
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 10
      }
    }
  }
}

features {
  feature {
    key: "company"
    value {
      int64_list {
        value: 48
      }
    }
  }
  feature {
    key: "dropoff_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_community_area"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_latitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "dropoff_longitude"
    value {
      int64_list {
        value: 9
      }
    }
  }
  feature {
    key: "fare"
    value {
      float_list {
        value: 0.3873794376850128
      }
    }
  }
  feature {
    key: "payment_type"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_census_tract"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "pickup_community_area"
    value {
      int64_list {
        value: 13
      }
    }
  }
  feature {
    key: "pickup_latitude"
    value {
      int64_list {
        value: 9
      }
    }
  }
  feature {
    key: "pickup_longitude"
    value {
      int64_list {
        value: 0
      }
    }
  }
  feature {
    key: "tips"
    value {
      int64_list {
        value: 0
  feature {
    key: "trip_miles"
    value {
      float_list {
        value: 0.21955278515815735
  feature {
    key: "trip_seconds"
    value {
      float_list {
        value: 0.0019067146349698305
  feature {
    key: "trip_start_day"
    value {
      int64_list {
        value: 3
  feature {
    key: "trip_start_hour"
    value {
      int64_list {
        value: 12
  feature {
    key: "trip_start_month"
    value {
      int64_list {
        value: 11

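After the Transform component has turned your data into features, the next step is to train a model.


The Trainer component trains a model that you define in TensorFlow (using either the Estimator API or the Keras API with model_to_estimator).

Trainer takes as input the schema from SchemaGen, the transformed data and transform graph from Transform, training parameters, and a module that contains user-defined model code.

Let's look at an example of user-defined model code below (for an introduction to the TensorFlow Estimator API, see the tutorial):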

_taxi_trainer_module_file = 'taxi_trainer.py'
%%writefile {_taxi_trainer_module_file}

import tensorflow as tf
import tensorflow_model_analysis as tfma
import tensorflow_transform as tft
from tensorflow_transform.tf_metadata import schema_utils
from tfx_bsl.tfxio import dataset_options

import taxi_constants

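# The feature-key constants used below are assumed to be defined in
# taxi_constants alongside VOCAB_SIZE, OOV_SIZE and LABEL_KEY.
_DENSE_FLOAT_FEATURE_KEYS = taxi_constants.DENSE_FLOAT_FEATURE_KEYS
_VOCAB_FEATURE_KEYS = taxi_constants.VOCAB_FEATURE_KEYS
_VOCAB_SIZE = taxi_constants.VOCAB_SIZE
_OOV_SIZE = taxi_constants.OOV_SIZE
_FEATURE_BUCKET_COUNT = taxi_constants.FEATURE_BUCKET_COUNT
_BUCKET_FEATURE_KEYS = taxi_constants.BUCKET_FEATURE_KEYS
_CATEGORICAL_FEATURE_KEYS = taxi_constants.CATEGORICAL_FEATURE_KEYS
_MAX_CATEGORICAL_FEATURE_VALUES = taxi_constants.MAX_CATEGORICAL_FEATURE_VALUES
_LABEL_KEY = taxi_constants.LABEL_KEY

# Tf.Transform considers these features as "raw"
def _get_raw_feature_spec(schema):
  return schema_utils.schema_as_feature_spec(schema).feature_spec

def _build_estimator(config, hidden_units=None, warm_start_from=None):
  """Build an estimator for predicting the tipping behavior of taxi riders.

  Args:
    config: tf.estimator.RunConfig defining the runtime environment for the
      estimator (including model_dir).
    hidden_units: [int], the layer sizes of the DNN (input layer first)
    warm_start_from: Optional directory to warm start from.

  Returns:
    A tf.estimator.DNNLinearCombinedClassifier that will be used for training
    and eval.
  """
  # Dense float features feed the DNN part of the model.
  real_valued_columns = [
      tf.feature_column.numeric_column(key, shape=())
      for key in _DENSE_FLOAT_FEATURE_KEYS
  ]
  # Integer-encoded features feed the linear part of the model.
  categorical_columns = [
      tf.feature_column.categorical_column_with_identity(
          key, num_buckets=_VOCAB_SIZE + _OOV_SIZE, default_value=0)
      for key in _VOCAB_FEATURE_KEYS
  ]
  categorical_columns += [
      tf.feature_column.categorical_column_with_identity(
          key, num_buckets=_FEATURE_BUCKET_COUNT, default_value=0)
      for key in _BUCKET_FEATURE_KEYS
  ]
  categorical_columns += [
      tf.feature_column.categorical_column_with_identity(  # pylint: disable=g-complex-comprehension
          key,
          num_buckets=num_buckets,
          default_value=0) for key, num_buckets in zip(
              _CATEGORICAL_FEATURE_KEYS,
              _MAX_CATEGORICAL_FEATURE_VALUES)
  ]
  return tf.estimator.DNNLinearCombinedClassifier(
      config=config,
      linear_feature_columns=categorical_columns,
      dnn_feature_columns=real_valued_columns,
      dnn_hidden_units=hidden_units or [100, 70, 50, 25],
      warm_start_from=warm_start_from)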

def _example_serving_receiver_fn(tf_transform_graph, schema):
  """Build the serving inputs.

  Args:
    tf_transform_graph: A TFTransformOutput.
    schema: the schema of the input data.

  Returns:
    Tensorflow graph which parses examples, applying tf-transform to them.
  """
  raw_feature_spec = _get_raw_feature_spec(schema)

  raw_input_fn = tf.estimator.export.build_parsing_serving_input_receiver_fn(
      raw_feature_spec, default_batch_size=None)
  serving_input_receiver = raw_input_fn()

  # Apply the tf-transform preprocessing graph to the parsed raw features.
  transformed_features = tf_transform_graph.transform_raw_features(
      serving_input_receiver.features)

  return tf.estimator.export.ServingInputReceiver(
      transformed_features, serving_input_receiver.receiver_tensors)

def _eval_input_receiver_fn(tf_transform_graph, schema):
  """Build everything needed for the tf-model-analysis to run the model.

  Args:
    tf_transform_graph: A TFTransformOutput.
    schema: the schema of the input data.

  Returns:
    EvalInputReceiver function, which contains:
      - Tensorflow graph which parses raw untransformed features, applies the
        tf-transform preprocessing operators.
      - Set of raw, untransformed features.
      - Label against which predictions will be compared.
  """
  # Notice that the inputs are raw features, not transformed features here.
  raw_feature_spec = _get_raw_feature_spec(schema)

  serialized_tf_example = tf.compat.v1.placeholder(
      dtype=tf.string, shape=[None], name='input_example_tensor')

  # Add a parse_example operator to the tensorflow graph, which will parse
  # raw, untransformed, tf examples.
  features = tf.io.parse_example(serialized_tf_example, raw_feature_spec)

  # Now that we have our raw examples, process them through the tf-transform
  # function computed during the preprocessing step.
  transformed_features = tf_transform_graph.transform_raw_features(
      features)

  # The key name MUST be 'examples'.
  receiver_tensors = {'examples': serialized_tf_example}

  # NOTE: Model is driven by transformed features (since training works on the
  # materialized output of TFT), but slicing will happen on raw features.
  features.update(transformed_features)

  return tfma.export.EvalInputReceiver(
      features=features,
      receiver_tensors=receiver_tensors,
      labels=transformed_features[_LABEL_KEY])

def _input_fn(file_pattern, data_accessor, tf_transform_output, batch_size=200):
  """Generates features and label for tuning/training.

  Args:
    file_pattern: List of paths or patterns of input tfrecord files.
    data_accessor: DataAccessor for converting input to RecordBatch.
    tf_transform_output: A TFTransformOutput.
    batch_size: representing the number of consecutive elements of returned
      dataset to combine in a single batch

  Returns:
    A dataset that contains (features, indices) tuple where features is a
      dictionary of Tensors, and indices is a single Tensor of label indices.
  """
  return data_accessor.tf_dataset_factory(
      file_pattern,
      dataset_options.TensorFlowDatasetOptions(
          batch_size=batch_size, label_key=_LABEL_KEY),
      tf_transform_output.transformed_metadata.schema)

# TFX will call this function
def trainer_fn(trainer_fn_args, schema):
  """Build the estimator using the high level API.

  Args:
    trainer_fn_args: Holds args used to train the model as name/value pairs.
    schema: Holds the schema of the training examples.

  Returns:
    A dict of the following:
      - estimator: The estimator that will be used for training and eval.
      - train_spec: Spec for training.
      - eval_spec: Spec for eval.
      - eval_input_receiver_fn: Input function for eval.
  """
  # Number of nodes in the first layer of the DNN
  first_dnn_layer_size = 100
  num_dnn_layers = 4
  dnn_decay_factor = 0.7

  train_batch_size = 40
  eval_batch_size = 40

  tf_transform_graph = tft.TFTransformOutput(trainer_fn_args.transform_output)

  train_input_fn = lambda: _input_fn(  # pylint: disable=g-long-lambda
      trainer_fn_args.train_files,
      trainer_fn_args.data_accessor,
      tf_transform_graph,
      batch_size=train_batch_size)

  eval_input_fn = lambda: _input_fn(  # pylint: disable=g-long-lambda
      trainer_fn_args.eval_files,
      trainer_fn_args.data_accessor,
      tf_transform_graph,
      batch_size=eval_batch_size)

  train_spec = tf.estimator.TrainSpec(  # pylint: disable=g-long-lambda
      train_input_fn,
      max_steps=trainer_fn_args.train_steps)

  serving_receiver_fn = lambda: _example_serving_receiver_fn(  # pylint: disable=g-long-lambda
      tf_transform_graph, schema)

  exporter = tf.estimator.FinalExporter('chicago-taxi', serving_receiver_fn)
  eval_spec = tf.estimator.EvalSpec(
      eval_input_fn,
      steps=trainer_fn_args.eval_steps,
      exporters=[exporter],
      name='chicago-taxi-eval')

  run_config = tf.estimator.RunConfig(
      save_checkpoints_steps=999, keep_checkpoint_max=1)

  run_config = run_config.replace(model_dir=trainer_fn_args.serving_model_dir)

  estimator = _build_estimator(
      # Construct layer sizes with exponential decay
      hidden_units=[
          max(2, int(first_dnn_layer_size * dnn_decay_factor**i))
          for i in range(num_dnn_layers)
      ],
      config=run_config,
      warm_start_from=trainer_fn_args.base_model)

  # Create an input receiver for TFMA processing
  receiver_fn = lambda: _eval_input_receiver_fn(  # pylint: disable=g-long-lambda
      tf_transform_graph, schema)

  return {
      'estimator': estimator,
      'train_spec': train_spec,
      'eval_spec': eval_spec,
      'eval_input_receiver_fn': receiver_fn
  }
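
The hidden-layer widths that trainer_fn hands to the estimator shrink geometrically from the first layer. The following is a minimal standalone sketch of that computation, using the same constants as the module above (first layer of 100 units, 4 layers, decay factor 0.7). The helper name `decayed_layer_sizes` is hypothetical and introduced only for illustration; it is not part of TFX or of the module.

```python
def decayed_layer_sizes(first_size, decay, num_layers, floor=2):
    """Return layer widths that shrink geometrically, never below `floor`."""
    # int() truncates toward zero, so each width is rounded down.
    return [max(floor, int(first_size * decay**i)) for i in range(num_layers)]


# Same settings as first_dnn_layer_size, dnn_decay_factor and num_dnn_layers.
sizes = decayed_layer_sizes(first_size=100, decay=0.7, num_layers=4)
print(sizes)

# With a steep decay the floor takes over after the first layer.
print(decayed_layer_sizes(first_size=10, decay=0.1, num_layers=3))
```

Because int() truncates and 0.7 has no exact binary floating-point representation, a width can come out one lower than the value you would compute by hand, so it is worth printing the list rather than assuming it.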

Now, we pass this model code to the Trainer component and run it to train the model.

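from tfx.components.trainer.executor import Executor
from tfx.dsl.components.base import executor_spec

# `transform`, `schema_gen` and `context` are assumed to come from earlier cells.
trainer = tfx.components.Trainer(
    module_file=os.path.abspath(_taxi_trainer_module_file),
    custom_executor_spec=executor_spec.ExecutorClassSpec(Executor),
    examples=transform.outputs['transformed_examples'],
    schema=schema_gen.outputs['schema'],
    transform_graph=transform.outputs['transform_graph'],
    train_args=tfx.proto.TrainArgs(num_steps=10000),
    eval_args=tfx.proto.EvalArgs(num_steps=5000))
context.run(trainer)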
INFO:absl:Full user module path is 'taxi_trainer@/tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl'
INFO:absl:Running driver for Trainer
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Running executor for Trainer
INFO:absl:Train on the 'train' split when train_args.splits is not set.
INFO:absl:Evaluate on the 'eval' split when eval_args.splits is not set.
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
WARNING:absl:Examples artifact does not have payload_format custom property. Falling back to FORMAT_TF_EXAMPLE
INFO:absl:udf_utils.get_fn {'train_args': '{\n  "num_steps": 10000\n}', 'eval_args': '{\n  "num_steps": 5000\n}', 'module_file': None, 'run_fn': None, 'trainer_fn': None, 'custom_config': 'null', 'module_path': 'taxi_trainer@/tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl'} 'trainer_fn'
INFO:absl:Installing '/tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl' to a temporary directory.
INFO:absl:Executing: ['/tmpfs/src/tf_docs_env/bin/python', '-m', 'pip', 'install', '--target', '/tmpfs/tmp/tmp86irddos', '/tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl']
Processing /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl
INFO:absl:Successfully installed '/tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/_wheels/tfx_user_code_Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618-py3-none-any.whl'.
Installing collected packages: tfx-user-code-Trainer
Successfully installed tfx-user-code-Trainer-0.0+e337a512821685b6d91445dbd0628b47de0e4c751e9e54edf78bcf0866309618
INFO:absl:Training model.
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 0...
INFO:tensorflow:Saving checkpoints for 0 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 0...
INFO:tensorflow:loss = 0.7004041, step = 0
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2024-05-08T09:29:55
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Evaluation [500/5000]
INFO:tensorflow:Inference Time : 31.08471s
INFO:tensorflow:global_step/sec: 2.90748
INFO:tensorflow:loss = 0.42686376, step = 1000 (34.394 sec)
INFO:tensorflow:global_step/sec: 126.964
INFO:tensorflow:loss = 0.44425964, step = 1100 (0.788 sec)
INFO:tensorflow:global_step/sec: 127.535
INFO:tensorflow:loss = 0.47197676, step = 1200 (0.784 sec)
INFO:tensorflow:global_step/sec: 125.471
INFO:tensorflow:loss = 0.506038, step = 1300 (0.797 sec)
INFO:tensorflow:global_step/sec: 127.967
INFO:tensorflow:loss = 0.4099118, step = 1400 (0.781 sec)
INFO:tensorflow:global_step/sec: 126.812
INFO:tensorflow:loss = 0.5171037, step = 1500 (0.789 sec)
INFO:tensorflow:global_step/sec: 131.046
INFO:tensorflow:loss = 0.4371317, step = 1600 (0.763 sec)
INFO:tensorflow:global_step/sec: 128.146
INFO:tensorflow:loss = 0.45776543, step = 1700 (0.780 sec)
INFO:tensorflow:global_step/sec: 128.92
INFO:tensorflow:loss = 0.50659925, step = 1800 (0.776 sec)
INFO:tensorflow:global_step/sec: 129.06
INFO:tensorflow:loss = 0.43417373, step = 1900 (0.775 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 1998...
INFO:tensorflow:Saving checkpoints for 1998 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 1998...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 103.772
INFO:tensorflow:loss = 0.45863923, step = 2000 (0.963 sec)
INFO:tensorflow:global_step/sec: 125.899
INFO:tensorflow:loss = 0.3514557, step = 2100 (0.795 sec)
INFO:tensorflow:global_step/sec: 129.752
INFO:tensorflow:loss = 0.43468484, step = 2200 (0.771 sec)
INFO:tensorflow:global_step/sec: 131.014
INFO:tensorflow:loss = 0.48132992, step = 2300 (0.763 sec)
INFO:tensorflow:global_step/sec: 130.271
INFO:tensorflow:loss = 0.44048753, step = 2400 (0.768 sec)
INFO:tensorflow:global_step/sec: 129.449
INFO:tensorflow:loss = 0.3523005, step = 2500 (0.773 sec)
INFO:tensorflow:global_step/sec: 130.936
INFO:tensorflow:loss = 0.3773502, step = 2600 (0.764 sec)
INFO:tensorflow:global_step/sec: 129.258
INFO:tensorflow:loss = 0.43350023, step = 2700 (0.774 sec)
INFO:tensorflow:global_step/sec: 133.75
INFO:tensorflow:loss = 0.37304792, step = 2800 (0.748 sec)
INFO:tensorflow:global_step/sec: 130.275
INFO:tensorflow:loss = 0.3801176, step = 2900 (0.768 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 2997...
INFO:tensorflow:Saving checkpoints for 2997 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 2997...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 107.092
INFO:tensorflow:loss = 0.3836586, step = 3000 (0.933 sec)
INFO:tensorflow:global_step/sec: 133.249
INFO:tensorflow:loss = 0.43525982, step = 3100 (0.751 sec)
INFO:tensorflow:global_step/sec: 124.615
INFO:tensorflow:loss = 0.42075485, step = 3200 (0.803 sec)
INFO:tensorflow:global_step/sec: 122.737
INFO:tensorflow:loss = 0.3901537, step = 3300 (0.815 sec)
INFO:tensorflow:global_step/sec: 122.54
INFO:tensorflow:loss = 0.35952353, step = 3400 (0.816 sec)
INFO:tensorflow:global_step/sec: 124.721
INFO:tensorflow:loss = 0.3873772, step = 3500 (0.802 sec)
INFO:tensorflow:global_step/sec: 128.574
INFO:tensorflow:loss = 0.36566123, step = 3600 (0.778 sec)
INFO:tensorflow:global_step/sec: 126.009
INFO:tensorflow:loss = 0.40229043, step = 3700 (0.794 sec)
INFO:tensorflow:global_step/sec: 122.1
INFO:tensorflow:loss = 0.4070228, step = 3800 (0.819 sec)
INFO:tensorflow:global_step/sec: 123.903
INFO:tensorflow:loss = 0.4688112, step = 3900 (0.807 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 3996...
INFO:tensorflow:Saving checkpoints for 3996 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 3996...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 100.776
INFO:tensorflow:loss = 0.49602365, step = 4000 (0.992 sec)
INFO:tensorflow:global_step/sec: 123.848
INFO:tensorflow:loss = 0.2742646, step = 4100 (0.808 sec)
INFO:tensorflow:global_step/sec: 123.423
INFO:tensorflow:loss = 0.44800407, step = 4200 (0.810 sec)
INFO:tensorflow:global_step/sec: 123.175
INFO:tensorflow:loss = 0.43835735, step = 4300 (0.812 sec)
INFO:tensorflow:global_step/sec: 123.891
INFO:tensorflow:loss = 0.32207388, step = 4400 (0.807 sec)
INFO:tensorflow:global_step/sec: 125.024
INFO:tensorflow:loss = 0.38216084, step = 4500 (0.800 sec)
INFO:tensorflow:global_step/sec: 124.891
INFO:tensorflow:loss = 0.45092455, step = 4600 (0.801 sec)
INFO:tensorflow:global_step/sec: 123.988
INFO:tensorflow:loss = 0.3750969, step = 4700 (0.807 sec)
INFO:tensorflow:global_step/sec: 125.77
INFO:tensorflow:loss = 0.3925597, step = 4800 (0.795 sec)
INFO:tensorflow:global_step/sec: 125.459
INFO:tensorflow:loss = 0.43026057, step = 4900 (0.797 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 4995...
INFO:tensorflow:Saving checkpoints for 4995 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 4995...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 103.654
INFO:tensorflow:loss = 0.41449785, step = 5000 (0.964 sec)
INFO:tensorflow:global_step/sec: 125.195
INFO:tensorflow:loss = 0.3261056, step = 5100 (0.799 sec)
INFO:tensorflow:global_step/sec: 127.462
INFO:tensorflow:loss = 0.35417694, step = 5200 (0.785 sec)
INFO:tensorflow:global_step/sec: 127.844
INFO:tensorflow:loss = 0.4030676, step = 5300 (0.782 sec)
INFO:tensorflow:global_step/sec: 130.923
INFO:tensorflow:loss = 0.4126954, step = 5400 (0.764 sec)
INFO:tensorflow:global_step/sec: 130.98
INFO:tensorflow:loss = 0.32259554, step = 5500 (0.764 sec)
INFO:tensorflow:global_step/sec: 128.589
INFO:tensorflow:loss = 0.3811575, step = 5600 (0.777 sec)
INFO:tensorflow:global_step/sec: 125.918
INFO:tensorflow:loss = 0.40286702, step = 5700 (0.794 sec)
INFO:tensorflow:global_step/sec: 129.137
INFO:tensorflow:loss = 0.32921094, step = 5800 (0.775 sec)
INFO:tensorflow:global_step/sec: 130.104
INFO:tensorflow:loss = 0.4013093, step = 5900 (0.768 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 5994...
INFO:tensorflow:Saving checkpoints for 5994 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 5994...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 106.58
INFO:tensorflow:loss = 0.31678432, step = 6000 (0.938 sec)
INFO:tensorflow:global_step/sec: 126.52
INFO:tensorflow:loss = 0.3150363, step = 6100 (0.791 sec)
INFO:tensorflow:global_step/sec: 128.723
INFO:tensorflow:loss = 0.39068645, step = 6200 (0.777 sec)
INFO:tensorflow:global_step/sec: 128.852
INFO:tensorflow:loss = 0.2832807, step = 6300 (0.776 sec)
INFO:tensorflow:global_step/sec: 125.866
INFO:tensorflow:loss = 0.32048258, step = 6400 (0.794 sec)
INFO:tensorflow:global_step/sec: 125.299
INFO:tensorflow:loss = 0.38626963, step = 6500 (0.798 sec)
INFO:tensorflow:global_step/sec: 125.342
INFO:tensorflow:loss = 0.39416704, step = 6600 (0.798 sec)
INFO:tensorflow:global_step/sec: 127.235
INFO:tensorflow:loss = 0.30232263, step = 6700 (0.786 sec)
INFO:tensorflow:global_step/sec: 126.055
INFO:tensorflow:loss = 0.41977397, step = 6800 (0.793 sec)
INFO:tensorflow:global_step/sec: 127.477
INFO:tensorflow:loss = 0.47491065, step = 6900 (0.784 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 6993...
INFO:tensorflow:Saving checkpoints for 6993 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 6993...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 104.669
INFO:tensorflow:loss = 0.35919297, step = 7000 (0.955 sec)
INFO:tensorflow:global_step/sec: 126.032
INFO:tensorflow:loss = 0.42433387, step = 7100 (0.794 sec)
INFO:tensorflow:global_step/sec: 126.079
INFO:tensorflow:loss = 0.3359905, step = 7200 (0.793 sec)
INFO:tensorflow:global_step/sec: 124.834
INFO:tensorflow:loss = 0.4118205, step = 7300 (0.801 sec)
INFO:tensorflow:global_step/sec: 126.048
INFO:tensorflow:loss = 0.3594822, step = 7400 (0.793 sec)
INFO:tensorflow:global_step/sec: 127.316
INFO:tensorflow:loss = 0.3544901, step = 7500 (0.786 sec)
INFO:tensorflow:global_step/sec: 125.922
INFO:tensorflow:loss = 0.3517708, step = 7600 (0.794 sec)
INFO:tensorflow:global_step/sec: 127.473
INFO:tensorflow:loss = 0.32316074, step = 7700 (0.784 sec)
INFO:tensorflow:global_step/sec: 127.602
INFO:tensorflow:loss = 0.28583208, step = 7800 (0.784 sec)
INFO:tensorflow:global_step/sec: 128.23
INFO:tensorflow:loss = 0.379911, step = 7900 (0.780 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 7992...
INFO:tensorflow:Saving checkpoints for 7992 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 7992...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 105.424
INFO:tensorflow:loss = 0.3968008, step = 8000 (0.948 sec)
INFO:tensorflow:global_step/sec: 128.249
INFO:tensorflow:loss = 0.43308416, step = 8100 (0.780 sec)
INFO:tensorflow:global_step/sec: 128.472
INFO:tensorflow:loss = 0.42253828, step = 8200 (0.778 sec)
INFO:tensorflow:global_step/sec: 125.642
INFO:tensorflow:loss = 0.39132017, step = 8300 (0.796 sec)
INFO:tensorflow:global_step/sec: 128.607
INFO:tensorflow:loss = 0.30107036, step = 8400 (0.777 sec)
INFO:tensorflow:global_step/sec: 126.434
INFO:tensorflow:loss = 0.30194753, step = 8500 (0.791 sec)
INFO:tensorflow:global_step/sec: 127.391
INFO:tensorflow:loss = 0.30165237, step = 8600 (0.785 sec)
INFO:tensorflow:global_step/sec: 127.042
INFO:tensorflow:loss = 0.44196972, step = 8700 (0.787 sec)
INFO:tensorflow:global_step/sec: 126.923
INFO:tensorflow:loss = 0.42164555, step = 8800 (0.788 sec)
INFO:tensorflow:global_step/sec: 127.39
INFO:tensorflow:loss = 0.3490799, step = 8900 (0.785 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 8991...
INFO:tensorflow:Saving checkpoints for 8991 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 8991...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:global_step/sec: 105.59
INFO:tensorflow:loss = 0.31310123, step = 9000 (0.947 sec)
INFO:tensorflow:global_step/sec: 124.933
INFO:tensorflow:loss = 0.4325568, step = 9100 (0.801 sec)
INFO:tensorflow:global_step/sec: 127.59
INFO:tensorflow:loss = 0.30360752, step = 9200 (0.784 sec)
INFO:tensorflow:global_step/sec: 130.138
INFO:tensorflow:loss = 0.29442087, step = 9300 (0.768 sec)
INFO:tensorflow:global_step/sec: 130.544
INFO:tensorflow:loss = 0.31136292, step = 9400 (0.766 sec)
INFO:tensorflow:global_step/sec: 130.849
INFO:tensorflow:loss = 0.34016177, step = 9500 (0.764 sec)
INFO:tensorflow:global_step/sec: 129.551
INFO:tensorflow:loss = 0.39522016, step = 9600 (0.772 sec)
INFO:tensorflow:global_step/sec: 130.262
INFO:tensorflow:loss = 0.34697112, step = 9700 (0.768 sec)
INFO:tensorflow:global_step/sec: 129.093
INFO:tensorflow:loss = 0.38748953, step = 9800 (0.775 sec)
INFO:tensorflow:global_step/sec: 131.427
INFO:tensorflow:loss = 0.29107836, step = 9900 (0.761 sec)
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 9990...
INFO:tensorflow:Saving checkpoints for 9990 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 9990...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:tensorflow:Calling checkpoint listeners before saving checkpoint 10000...
INFO:tensorflow:Saving checkpoints for 10000 into /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt.
INFO:tensorflow:Calling checkpoint listeners after saving checkpoint 10000...
INFO:tensorflow:Skip the current checkpoint eval due to throttle secs (600 secs).
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:absl:Feature company has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature dropoff_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature fare has a shape . Setting to DenseTensor.
INFO:absl:Feature payment_type has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_census_tract has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_community_area has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_latitude has a shape . Setting to DenseTensor.
INFO:absl:Feature pickup_longitude has a shape . Setting to DenseTensor.
INFO:absl:Feature tips has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_miles has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_seconds has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_day has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_hour has a shape . Setting to DenseTensor.
INFO:absl:Feature trip_start_month has a shape . Setting to DenseTensor.
INFO:tensorflow:Calling model_fn.
INFO:tensorflow:Done calling model_fn.
INFO:tensorflow:Starting evaluation at 2024-05-08T09:31:40
INFO:tensorflow:Graph was finalized.
INFO:tensorflow:Restoring parameters from /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt-10000
INFO:tensorflow:Running local_init_op.
INFO:tensorflow:Done running local_init_op.
INFO:tensorflow:Evaluation [500/5000]
INFO:tensorflow:Evaluation [1000/5000]
INFO:tensorflow:Evaluation [1500/5000]
INFO:tensorflow:Evaluation [2000/5000]
INFO:tensorflow:Evaluation [2500/5000]
INFO:tensorflow:Evaluation [3000/5000]
INFO:tensorflow:Evaluation [3500/5000]
INFO:tensorflow:Evaluation [4000/5000]
INFO:tensorflow:Evaluation [4500/5000]
INFO:tensorflow:Evaluation [5000/5000]
INFO:tensorflow:Inference Time : 30.60751s
INFO:tensorflow:Finished evaluation at 2024-05-08-09:32:11
INFO:tensorflow:Saving dict for global step 10000: accuracy = 0.78671, accuracy_baseline = 0.771205, auc = 0.93178755, auc_precision_recall = 0.69907445, average_loss = 0.34567952, global_step = 10000, label/mean = 0.228795, loss = 0.34567845, precision = 0.7014421, prediction/mean = 0.23119248, recall = 0.117987715
INFO:tensorflow:Saving 'checkpoint_path' summary for global step 10000: /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model_run/6/Format-Serving/model.ckpt-10000
INFO:tensorflow:Performing the final export in the end of training.
INFO:absl:Feature company has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature fare has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature payment_type has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature tips has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_miles has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_seconds has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_day has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_hour has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_month has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_timestamp has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Exporting eval_savedmodel for TFMA.
INFO:absl:Feature company has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature dropoff_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature fare has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature payment_type has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_census_tract has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_community_area has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_latitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature pickup_longitude has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature tips has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_miles has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_seconds has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_day has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_hour has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_month has no shape. Setting to varlen_sparse_tensor.
INFO:absl:Feature trip_start_timestamp has no shape. Setting to varlen_sparse_tensor.
Analyze Training with TensorBoard

Optionally, you can connect TensorBoard to the Trainer to analyze your model's training curves.

# Get the URI of the output artifact representing the training logs, which is a directory
model_run_dir = trainer.outputs['model_run'].get()[0].uri

%load_ext tensorboard
%tensorboard --logdir {model_run_dir}


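The Evaluator component computes model performance metrics over the evaluation set. It uses the TensorFlow Model Analysis library. The Evaluator can also optionally validate that a newly trained model is better than the previous model. This is useful in a production pipeline setting, where you may automatically train and validate a model every day. In this notebook we only train one model, so the Evaluator automatically labels the model as "good".

The Evaluator takes as input the data from ExampleGen, the trained model from Trainer, and a slicing configuration. The slicing configuration lets you slice your metrics on feature values (for example, how does your model perform on taxi trips that start at 8 a.m. versus 8 p.m.?). See an example of this configuration below: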

eval_config = tfma.EvalConfig(
    model_specs=[
        # Using signature 'eval' implies the use of an EvalSavedModel. To use
        # a serving model, remove the signature so that it defaults to
        # 'serving_default' and add a label_key.
        tfma.ModelSpec(signature_name='eval')
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            # The metrics added here are in addition to those saved with the
            # model (assuming either a keras model or EvalSavedModel is used).
            # Any metrics added into the saved model (for example using
            # model.compile(..., metrics=[...]), etc) will be computed
            # automatically.
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount')
            ],
            # To add validation thresholds for metrics saved with the model,
            # add them keyed by metric name to the thresholds map.
            thresholds={
                'accuracy': tfma.MetricThreshold(
                    value_threshold=tfma.GenericValueThreshold(
                        lower_bound={'value': 0.5}),
                    # Change threshold will be ignored if there is no
                    # baseline model resolved from MLMD (first run).
                    change_threshold=tfma.GenericChangeThreshold(
                        direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                        absolute={'value': -1e-10}))
            }
        )
    ],
    slicing_specs=[
        # An empty slice spec means the overall slice, i.e. the whole dataset.
        tfma.SlicingSpec(),
        # Data can be sliced along a feature column. In this case, data is
        # sliced along feature column trip_start_hour.
        tfma.SlicingSpec(feature_keys=['trip_start_hour'])
    ])
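
To make the two thresholds in this configuration concrete, here is a minimal pure-Python sketch of how a value threshold and a change threshold could be combined for a metric where higher is better. The function name `passes_thresholds` and its parameters are hypothetical, and the comparison logic is a simplified reading of the configuration, not TFMA's actual implementation. The 0.78671 accuracy comes from the training log above; the 0.80 baseline is made up for illustration.

```python
from typing import Optional


def passes_thresholds(
    candidate: float,
    baseline: Optional[float],
    lower_bound: float = 0.5,
    min_absolute_change: float = -1e-10,
) -> bool:
    """Simplified check for a HIGHER_IS_BETTER metric such as accuracy."""
    # Value threshold: the candidate metric must reach the lower bound.
    if candidate < lower_bound:
        return False
    # Change threshold: skipped when no baseline model exists (first run).
    if baseline is None:
        return True
    # Otherwise the candidate may not fall below the baseline by more than
    # the allowed absolute change.
    return (candidate - baseline) >= min_absolute_change


print(passes_thresholds(0.78671, baseline=None))  # first run, no baseline
print(passes_thresholds(0.78671, baseline=0.80))  # hypothetical better baseline
print(passes_thresholds(0.40, baseline=None))     # below the lower bound
```

Under these assumptions the three calls print True, False, and False. A slightly negative absolute change such as -1e-10 effectively means "the candidate must be at least as good as the baseline, up to floating-point noise".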

Next, we pass the eval_config defined above to the Evaluator and run it.

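# Use TFMA to compute evaluation statistics over features of a model and
# validate them against a baseline.

# The model resolver is only required if performing model validation in addition
# to evaluation. In this case we validate against the latest blessed model. If
# no model has been blessed before (as in this case) the evaluator will make our
# candidate the first blessed model.
model_resolver = tfx.dsl.Resolver(
    strategy_class=tfx.dsl.experimental.LatestBlessedModelStrategy,
    model=tfx.dsl.Channel(type=tfx.types.standard_artifacts.Model),
    model_blessing=tfx.dsl.Channel(
        type=tfx.types.standard_artifacts.ModelBlessing),
).with_id('latest_blessed_model_resolver')
context.run(model_resolver, enable_cache=True)

evaluator = tfx.components.Evaluator(
    examples=example_gen.outputs['examples'],
    model=trainer.outputs['model'],
    baseline_model=model_resolver.outputs['model'],
    eval_config=eval_config)
context.run(evaluator, enable_cache=True)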
INFO:absl:Running driver for latest_blessed_model_resolver
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Running publisher for latest_blessed_model_resolver
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Running driver for Evaluator
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Running executor for Evaluator
INFO:absl:udf_utils.get_fn {'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "ExampleCount"\n        }\n      ],\n      "thresholds": {\n        "accuracy": {\n          "change_threshold": {\n            "absolute": -1e-10,\n            "direction": "HIGHER_IS_BETTER"\n          },\n          "value_threshold": {\n            "lower_bound": 0.5\n          }\n        }\n      }\n    }\n  ],\n  "model_specs": [\n    {\n      "signature_name": "eval"\n    }\n  ],\n  "slicing_specs": [\n    {},\n    {\n      "feature_keys": [\n        "trip_start_hour"\n      ]\n    }\n  ]\n}', 'feature_slicing_spec': None, 'fairness_indicator_thresholds': 'null', 'example_splits': 'null', 'module_file': None, 'module_path': None} 'custom_eval_shared_model'
INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
model_specs {
  signature_name: "eval"
slicing_specs {
slicing_specs {
  feature_keys: "trip_start_hour"
metrics_specs {
  metrics {
    class_name: "ExampleCount"
  thresholds {
    key: "accuracy"
    value {
      value_threshold {
        lower_bound {
          value: 0.5

INFO:absl:Using /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Trainer/model/6/Format-TFMA as  model.
WARNING:tensorflow:SavedModel saved prior to TF 2.5 detected when loading Keras model. Please ensure that you are saving the model with model.save() or tf.keras.models.save_model(), *NOT* tf.saved_model.save(). To confirm, there should be a file named "keras_metadata.pb" in the SavedModel directory.
INFO:absl:The 'example_splits' parameter is not set, using 'eval' split.
INFO:absl:Evaluating model.
INFO:absl:udf_utils.get_fn {'eval_config': '{\n  "metrics_specs": [\n    {\n      "metrics": [\n        {\n          "class_name": "ExampleCount"\n        }\n      ],\n      "thresholds": {\n        "accuracy": {\n          "change_threshold": {\n            "absolute": -1e-10,\n            "direction": "HIGHER_IS_BETTER"\n          },\n          "value_threshold": {\n            "lower_bound": 0.5\n          }\n        }\n      }\n    }\n  ],\n  "model_specs": [\n    {\n      "signature_name": "eval"\n    }\n  ],\n  "slicing_specs": [\n    {},\n    {\n      "feature_keys": [\n        "trip_start_hour"\n      ]\n    }\n  ]\n}', 'feature_slicing_spec': None, 'fairness_indicator_thresholds': 'null', 'example_splits': 'null', 'module_file': None, 'module_path': None} 'custom_extractors'
INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
model_specs {
  signature_name: "eval"
slicing_specs {
slicing_specs {
  feature_keys: "trip_start_hour"
metrics_specs {
  metrics {
    class_name: "ExampleCount"
  model_names: ""
  thresholds {
    key: "accuracy"
    value {
      value_threshold {
        lower_bound {
          value: 0.5

INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
model_specs {
  signature_name: "eval"
slicing_specs {
slicing_specs {
  feature_keys: "trip_start_hour"
metrics_specs {
  metrics {
    class_name: "ExampleCount"
  model_names: ""
  thresholds {
    key: "accuracy"
    value {
      value_threshold {
        lower_bound {
          value: 0.5

INFO:absl:eval_shared_models have model_types: {'tfma_eval'}
INFO:absl:Request was made to ignore the baseline ModelSpec and any change thresholds. This is likely because a baseline model was not provided: updated_config=
model_specs {
  signature_name: "eval"
slicing_specs {
slicing_specs {
  feature_keys: "trip_start_hour"
metrics_specs {
  metrics {
    class_name: "ExampleCount"
  model_names: ""
  thresholds {
    key: "accuracy"
    value {
      value_threshold {
        lower_bound {
          value: 0.5
INFO:absl:Checking validation results.
Now let's examine the output artifacts of the Evaluator.

evaluator.outputs
{'evaluation': OutputChannel(artifact_type=ModelEvaluation, producer_component_id=Evaluator, output_key=evaluation, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False),
 'blessing': OutputChannel(artifact_type=ModelBlessing, producer_component_id=Evaluator, output_key=blessing, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False)}

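Using the evaluation output, we can show the default visualization of global metrics on the entire evaluation set.

context.show(evaluator.outputs['evaluation'])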
SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'Overall', 'metrics':…

To see the visualization for sliced evaluation metrics, we can call the TensorFlow Model Analysis library directly.

import tensorflow_model_analysis as tfma

# Get the TFMA output result path and load the result.
PATH_TO_RESULT = evaluator.outputs['evaluation'].get()[0].uri
tfma_result = tfma.load_eval_result(PATH_TO_RESULT)

# Show data sliced along feature column trip_start_hour.
tfma.view.render_slicing_metrics(
    tfma_result, slicing_column='trip_start_hour')
SlicingMetricsViewer(config={'weightedExamplesColumn': 'example_count'}, data=[{'slice': 'trip_start_hour:19',…

This visualization shows the same metrics, but computed for each feature value of trip_start_hour instead of over the entire evaluation set.
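
As a rough illustration of what slicing means, the sketch below computes accuracy for the overall slice and for each trip_start_hour value on a handful of made-up records. It is plain Python and does not use TFMA; the `records` data and the `accuracy_by_slice` helper are hypothetical and only mirror the idea of an empty slicing spec plus a `trip_start_hour` slicing spec.

```python
from collections import defaultdict

# Hypothetical toy records: (trip_start_hour, true_label, predicted_label).
records = [
    (8, 1, 1),
    (8, 0, 0),
    (8, 0, 1),
    (20, 1, 1),
    (20, 0, 0),
]


def accuracy_by_slice(rows):
    """Returns {slice_key: accuracy}, using None as the overall slice key."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for hour, label, prediction in rows:
        # Every row counts toward the overall slice and toward its hour slice.
        for key in (None, hour):
            total[key] += 1
            correct[key] += int(label == prediction)
    return {key: correct[key] / total[key] for key in total}


for key, value in accuracy_by_slice(records).items():
    name = "Overall" if key is None else f"trip_start_hour:{key}"
    print(f"{name}: accuracy={value:.3f}")
```

With this toy data the overall accuracy is 0.800, while the 8 o'clock slice scores 0.667 and the 20 o'clock slice scores 1.000, which is the kind of per-slice difference the sliced view is meant to surface.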

TensorFlow Model Analysis supports many other visualizations, such as Fairness Indicators and plotting a time series of model performance. To learn more, see the tutorial.

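Since we added thresholds to our config, validation output is also available. The presence of a blessing artifact indicates that our model passed validation. Since this is the first validation being performed, the candidate is automatically blessed.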

blessing_uri = evaluator.outputs['blessing'].get()[0].uri
!ls -l {blessing_uri}
total 0
-rw-rw-r-- 1 kbuilder kbuilder 0 May  8 09:32 BLESSED
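
The listing above suggests that a passing validation is recorded as a marker file named BLESSED inside the blessing artifact directory. A minimal sketch of checking for that marker programmatically follows; the helper name `is_blessed` is hypothetical, and a temporary directory stands in for `blessing_uri` so the example runs without TFX.

```python
import os
import tempfile


def is_blessed(blessing_dir: str) -> bool:
    """Returns True if the directory contains a file named BLESSED."""
    return os.path.isfile(os.path.join(blessing_dir, "BLESSED"))


# Demonstration on a temporary directory standing in for blessing_uri.
with tempfile.TemporaryDirectory() as demo_dir:
    print(is_blessed(demo_dir))  # no marker file yet
    open(os.path.join(demo_dir, "BLESSED"), "w").close()
    print(is_blessed(demo_dir))  # marker file present
```

In the notebook, the same check could be applied to `blessing_uri` instead of the temporary directory.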


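# Load the validation result record to confirm the outcome.
PATH_TO_RESULT = evaluator.outputs['evaluation'].get()[0].uri
print(tfma.load_validation_result(PATH_TO_RESULT))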
validation_ok: true
validation_details {
  slicing_details {
    slicing_spec {
    num_matching_slices: 25


The Pusher component is usually at the end of a TFX pipeline. It checks whether a model has passed validation and, if so, exports the model to _serving_model_dir.

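pusher = tfx.components.Pusher(
    model=trainer.outputs['model'],
    model_blessing=evaluator.outputs['blessing'],
    push_destination=tfx.proto.PushDestination(
        filesystem=tfx.proto.PushDestination.Filesystem(
            base_directory=_serving_model_dir)))
context.run(pusher, enable_cache=True)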
INFO:absl:Running driver for Pusher
INFO:absl:MetadataStore with DB connection initialized
INFO:absl:Running executor for Pusher
INFO:absl:Model version: 1715160747
INFO:absl:Model written to serving path /tmpfs/tmp/tmp15vk_44e/serving_model/taxi_simple/1715160747.
INFO:absl:Model pushed to /tmpfs/tmp/tfx-interactive-2024-05-08T09_28_51.324450-cz1zlfzs/Pusher/pushed_model/9.
INFO:absl:Running publisher for Pusher
INFO:absl:MetadataStore with DB connection initialized
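
The log shows the model being written to a subdirectory of the serving path named after a numeric model version (1715160747 here). Assuming each push creates one such numeric subdirectory, a serving-side script could pick the newest version as sketched below. The helper `latest_model_version` is hypothetical, and the second version number in the demo is made up.

```python
import os
import tempfile
from typing import Optional


def latest_model_version(serving_dir: str) -> Optional[str]:
    """Returns the path of the numerically largest version subdirectory."""
    versions = [
        name for name in os.listdir(serving_dir)
        if name.isdigit() and os.path.isdir(os.path.join(serving_dir, name))
    ]
    if not versions:
        return None
    return os.path.join(serving_dir, max(versions, key=int))


# Demonstration on a temporary directory standing in for the serving path.
with tempfile.TemporaryDirectory() as demo_dir:
    for version in ("1715160000", "1715160747"):
        os.makedirs(os.path.join(demo_dir, version))
    print(os.path.basename(latest_model_version(demo_dir)))
```

Comparing with `key=int` rather than as strings keeps the ordering correct even if version numbers differ in length.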

Let's examine the output artifacts of the Pusher.

pusher.outputs
{'pushed_model': OutputChannel(artifact_type=PushedModel, producer_component_id=Pusher, output_key=pushed_model, additional_properties={}, additional_custom_properties={}, _input_trigger=None, _is_async=False)}

In particular, the Pusher exports your model in the SavedModel format, which looks like this:

push_uri = pusher.outputs['pushed_model'].get()[0].uri
model = tf.saved_model.load(push_uri)

# Print each (signature name, concrete function) pair of the exported model.
for item in model.signatures.items():
  pp.pprint(item)
 <ConcreteFunction () -> Dict[['outputs', TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)]] at 0x7F38C020BDF0>)
 <ConcreteFunction () -> Dict[['class_ids', TensorSpec(shape=(None, 1), dtype=tf.int64, name=None)], ['classes', TensorSpec(shape=(None, 1), dtype=tf.string, name=None)], ['probabilities', TensorSpec(shape=(None, 2), dtype=tf.float32, name=None)], ['logits', TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)], ['all_class_ids', TensorSpec(shape=(None, 2), dtype=tf.int32, name=None)], ['logistic', TensorSpec(shape=(None, 1), dtype=tf.float32, name=None)], ['all_classes', TensorSpec(shape=(None, 2), dtype=tf.string, name=None)]] at 0x7F39100E0A90>)
 <ConcreteFunction () -> Dict[['scores', TensorSpec(shape=(None, 2), dtype=tf.float32, name=None)], ['classes', TensorSpec(shape=(None, 2), dtype=tf.string, name=None)]] at 0x7F38C0424700>)
 <ConcreteFunction () -> Dict[['classes', TensorSpec(shape=(None, 2), dtype=tf.string, name=None)], ['scores', TensorSpec(shape=(None, 2), dtype=tf.float32, name=None)]] at 0x7F38C0620D60>)

We have finished our tour of the built-in TFX components!