The Evaluator TFX Pipeline Component

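The Evaluator TFX pipeline component performs deep analysis on your model's training results to help you understand how the model performs on subsets of your data. The Evaluator also helps you validate exported models, ensuring that they are "good enough" to be pushed to production.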

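When validation is enabled, the Evaluator compares the new model against a baseline (such as the currently serving model) to determine whether it is "good enough" relative to that baseline. It does so by evaluating both models on an eval dataset and computing their performance on metrics (e.g. AUC, loss). If the new model's metrics meet the developer-specified criteria relative to the baseline model (e.g. AUC is not lower), the model is "blessed" (marked as good), which tells the Pusher that the model may be pushed to production.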
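The comparison described above can be sketched in plain Python. This is a minimal illustration, not TFMA's implementation: the function `is_blessed` and its parameters are hypothetical, and it assumes a higher-is-better metric such as AUC with an optional absolute lower bound and an optional minimum change relative to the baseline. The exact comparison rules TFMA applies may differ.

```python
from typing import Optional


def is_blessed(candidate: float,
               baseline: Optional[float],
               lower_bound: Optional[float] = None,
               min_change: Optional[float] = None) -> bool:
    """Return True if a higher-is-better metric passes both checks.

    Hypothetical helper for illustration; not part of TFX or TFMA.
    """
    # Absolute check: the candidate's metric must reach the lower bound.
    if lower_bound is not None and candidate < lower_bound:
        return False
    # Relative check: skipped when there is no baseline (e.g. the first run).
    if baseline is not None and min_change is not None:
        if candidate - baseline < min_change:
            return False
    return True


# Made-up AUC values: the candidate did not drop below the baseline.
print(is_blessed(candidate=0.91, baseline=0.90, lower_bound=0.5, min_change=0.0))
# Made-up AUC values: the candidate dropped by 0.05, so it is not blessed.
print(is_blessed(candidate=0.85, baseline=0.90, lower_bound=0.5, min_change=0.0))
```

With no baseline available, only the absolute check applies, which mirrors the source's note that the change threshold is ignored on the first run.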

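Consumes:
- An eval split from Examples
- A trained model from Trainer
- A previously blessed model (if validation is to be performed)
Emits:
- Analysis results to ML Metadata
- Validation results to ML Metadata (if validation is to be performed)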

Evaluator and TensorFlow Model Analysis

Evaluator uses the TensorFlow Model Analysis library to perform the analysis, which in turn uses Apache Beam for scalable processing.

Using the Evaluator Component

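An Evaluator pipeline component is typically very easy to deploy and requires little customization, since most of the work is done by the Evaluator TFX component.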

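To set up the Evaluator, the following information is needed:

- Metrics to configure (only required if metrics are being added beyond those saved with the model). See TensorFlow Model Analysis Metrics for more information.
- Slices to configure (if no slices are given, an "overall" slice is added by default). See TensorFlow Model Analysis Setup for more information.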
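To make the idea of slices concrete, the following sketch groups a few hand-made examples by a feature value and computes accuracy for the overall slice and for each per-feature slice. It is plain Python rather than TFMA; the helper `accuracy_by_slice` and the sample records are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def accuracy_by_slice(examples, feature_key):
    """Compute accuracy for the overall slice and each value of feature_key."""
    # Each bucket holds [number correct, number of examples].
    buckets = defaultdict(lambda: [0, 0])
    for ex in examples:
        correct = int(ex['label'] == ex['prediction'])
        # Every example counts toward the overall slice and its feature slice.
        for slice_name in ('Overall', f"{feature_key}:{ex[feature_key]}"):
            buckets[slice_name][0] += correct
            buckets[slice_name][1] += 1
    return {name: hits / total for name, (hits, total) in buckets.items()}


# Made-up records for illustration only.
examples = [
    {'trip_start_hour': 8, 'label': 1, 'prediction': 1},
    {'trip_start_hour': 8, 'label': 0, 'prediction': 1},
    {'trip_start_hour': 17, 'label': 1, 'prediction': 1},
    {'trip_start_hour': 17, 'label': 0, 'prediction': 0},
]
result = accuracy_by_slice(examples, 'trip_start_hour')
print(result)
```

A model can look acceptable on the overall slice while performing poorly on one feature slice, which is the kind of gap that slicing is meant to surface.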

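If validation is to be included, the following additional information is needed:

- Which model to compare against (the latest blessed model, etc.).
- Model validations (thresholds) to verify. See TensorFlow Model Analysis Model Validations for more information.

When enabled, validation is performed against all of the metrics and slices that were defined.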
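As a rough illustration of checking every defined metric on every defined slice, the sketch below applies lower bounds to a table of per-slice metric values and reports each failure. The function `find_failures`, the metric values, and the thresholds are all hypothetical; TFMA's own validation logic is more general than this.

```python
def find_failures(metrics_by_slice, lower_bounds):
    """Return (slice, metric, value, bound) for every value below its bound."""
    failures = []
    for slice_name, metrics in metrics_by_slice.items():
        for metric_name, bound in lower_bounds.items():
            value = metrics[metric_name]
            if value < bound:
                failures.append((slice_name, metric_name, value, bound))
    return failures


# Made-up metric values and thresholds for illustration only.
metrics_by_slice = {
    'Overall': {'binary_accuracy': 0.82, 'auc': 0.90},
    'trip_start_hour:8': {'binary_accuracy': 0.45, 'auc': 0.71},
}
lower_bounds = {'binary_accuracy': 0.5, 'auc': 0.7}

failures = find_failures(metrics_by_slice, lower_bounds)
# In this sketch, validation passes only when no check failed.
validation_ok = not failures
print(validation_ok, failures)
```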

Typical code looks like this:

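import tensorflow_model_analysis as tfma
# The imports below assume the TFX 1.x public API (tfx.v1); adjust them to
# match the TFX version you have installed.
from tfx.v1 import dsl
from tfx.v1.components import Evaluator
from tfx.v1.dsl import Channel, Resolver
from tfx.v1.types.standard_artifacts import Model, ModelBlessing
...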

# For TFMA evaluation

eval_config = tfma.EvalConfig(
    model_specs=[
        # This assumes a serving model with signature 'serving_default'. If
        # using estimator based EvalSavedModel, add signature_name='eval' and
        # remove the label_key. Note, if using a TFLite model, then you must set
        # model_type='tf_lite'.
        tfma.ModelSpec(label_key='<label_key>')
    ],
    metrics_specs=[
        tfma.MetricsSpec(
            # The metrics added here are in addition to those saved with the
            # model (assuming either a keras model or EvalSavedModel is used).
            # Any metrics added into the saved model (for example using
            # model.compile(..., metrics=[...]), etc) will be computed
            # automatically.
            metrics=[
                tfma.MetricConfig(class_name='ExampleCount'),
                tfma.MetricConfig(
                    class_name='BinaryAccuracy',
                    threshold=tfma.MetricThreshold(
                        value_threshold=tfma.GenericValueThreshold(
                            lower_bound={'value': 0.5}),
                        change_threshold=tfma.GenericChangeThreshold(
                            direction=tfma.MetricDirection.HIGHER_IS_BETTER,
                            absolute={'value': -1e-10})))
            ]
        )
    ],
    slicing_specs=[
        # An empty slice spec means the overall slice, i.e. the whole dataset.
        tfma.SlicingSpec(),
        # Data can be sliced along a feature column. In this case, data is
        # sliced along feature column trip_start_hour.
        tfma.SlicingSpec(feature_keys=['trip_start_hour'])
    ])

# The following component is experimental and may change in the future. It is
# required to specify that the latest blessed model is used as the baseline.
model_resolver = Resolver(
      strategy_class=dsl.experimental.LatestBlessedModelStrategy,
      model=Channel(type=Model),
      model_blessing=Channel(type=ModelBlessing)
).with_id('latest_blessed_model_resolver')

model_analyzer = Evaluator(
      examples=examples_gen.outputs['examples'],
      model=trainer.outputs['model'],
      baseline_model=model_resolver.outputs['model'],
      # Change threshold will be ignored if there is no baseline (first run).
      eval_config=eval_config)

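The Evaluator produces an EvalResult (and, if validation was used, a ValidationResult) that can be loaded using TFMA. The following example shows how to load the results into a Jupyter notebook: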

import tensorflow_model_analysis as tfma

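# `evaluator` is the Evaluator component instance (for example, the
# `model_analyzer` defined in the pipeline snippet).
output_path = evaluator.outputs['evaluation'].get()[0].uri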

# Load the evaluation results.
eval_result = tfma.load_eval_result(output_path)

# Visualize the metrics and plots using tfma.view.render_slicing_metrics,
# tfma.view.render_plot, etc.
tfma.view.render_slicing_metrics(eval_result)
...

# Load the validation results
validation_result = tfma.load_validation_result(output_path)
if not validation_result.validation_ok:
  ...

More details are available in the Evaluator API reference.