TF.Text Metrics

Overview

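TensorFlow Text provides a collection of text-metric classes and ops for use with TensorFlow 2.0. The library includes implementations of text-similarity metrics such as ROUGE-L, which are needed for the automatic evaluation of text generation models.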

The benefit of using these ops to evaluate your models is that they are compatible with TPU evaluation and work well with the TF streaming metric APIs.

Setup

pip install -q "tensorflow-text==2.11.*"
import tensorflow as tf
import tensorflow_text as text

ROUGE-L

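The ROUGE-L metric is a score between 0 and 1 that indicates how similar two sequences are, based on the length of their longest common subsequence (LCS). Specifically, ROUGE-L is the weighted harmonic mean (or F-measure) that combines the LCS precision (the percentage of the hypothesis sequence covered by the LCS) and the LCS recall (the percentage of the reference sequence covered by the LCS).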
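The definition above can be sketched in plain Python. This is a minimal illustration rather than the TF.Text implementation: the helper names `lcs_length` and `lcs_precision_recall` and the sample tokens are assumptions made for this example.

```python
def lcs_length(a, b):
    """Length of the longest common subsequence of two token lists."""
    # prev[j] holds the LCS length of the processed prefix of `a` and b[:j].
    prev = [0] * (len(b) + 1)
    for tok_a in a:
        curr = [0]
        for j, tok_b in enumerate(b, start=1):
            if tok_a == tok_b:
                # Matching tokens extend the LCS of both shorter prefixes.
                curr.append(prev[j - 1] + 1)
            else:
                # Otherwise keep the better result of dropping one token.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def lcs_precision_recall(hypothesis, reference):
    """Return (precision, recall) derived from the LCS length."""
    lcs = lcs_length(hypothesis, reference)
    precision = lcs / len(hypothesis) if hypothesis else 0.0
    recall = lcs / len(reference) if reference else 0.0
    return precision, recall


# Illustrative tokens chosen for this sketch (hypothetical data).
hyp = ['the', 'cat', 'sat', 'down']
ref = ['the', 'cat', 'was', 'sitting', 'down']
p, r = lcs_precision_recall(hyp, ref)
print('LCS length:', lcs_length(hyp, ref))
print('precision:', p, 'recall:', r)
```

Here the LCS is (the, cat, down), so precision is 3 of 4 hypothesis tokens and recall is 3 of 5 reference tokens. Note that a subsequence does not need to be contiguous, only in the same order.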

Source: https://www.microsoft.com/en-us/research/publication/rouge-a-package-for-automatic-evaluation-of-summaries/

The TF.Text implementation returns the F-measure, precision, and recall for each (hypothesis, reference) pair.

Consider the following hypothesis/reference pairs:

hypotheses = tf.ragged.constant([['captain', 'of', 'the', 'delta', 'flight'],
                                 ['the', '1990', 'transcript']])
references = tf.ragged.constant([['delta', 'air', 'lines', 'flight'],
                                 ['this', 'concludes', 'the', 'transcript']])

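The hypotheses and references are expected to be tf.RaggedTensors of tokens. Tokens are required rather than raw sentences because no single tokenization strategy fits all tasks.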
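To illustrate why the choice of tokenizer is left to the caller, the sketch below splits the same sentence in two ways using only the Python standard library. Both strategies and the sample sentence are assumptions made for this example, not recommendations from the library. Nested lists of tokens like these can then be passed to tf.ragged.constant, as in the example above.

```python
import re

# Hypothetical raw sentence used only for this illustration.
sentence = 'The 1990 transcript.'

# Strategy A: lowercase and split on whitespace (punctuation stays attached).
whitespace_tokens = sentence.lower().split()

# Strategy B: lowercase and keep only runs of letters and digits.
word_tokens = re.findall(r'[a-z0-9]+', sentence.lower())

print(whitespace_tokens)
print(word_tokens)
```

The two strategies disagree on the last token ('transcript.' versus 'transcript'), which would change the LCS against a reference and therefore the ROUGE-L score.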

Now we can call text.metrics.rouge_l and get the result back:

result = text.metrics.rouge_l(hypotheses, references)
print('F-Measure: %s' % result.f_measure)
print('P-Measure: %s' % result.p_measure)
print('R-Measure: %s' % result.r_measure)
F-Measure: tf.Tensor([0.44444448 0.57142854], shape=(2,), dtype=float32)
P-Measure: tf.Tensor([0.4       0.6666667], shape=(2,), dtype=float32)
R-Measure: tf.Tensor([0.5 0.5], shape=(2,), dtype=float32)
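As a sanity check, the first pair can be worked by hand. The longest common subsequence of (captain, of, the, delta, flight) and (delta, air, lines, flight) is (delta, flight), which has length 2. The hypothesis has 5 tokens and the reference has 4. Assuming the equal-weight harmonic mean, which is consistent with the printed output:

```latex
\begin{aligned}
P_{\mathrm{lcs}} &= \frac{2}{5} = 0.4, \\
R_{\mathrm{lcs}} &= \frac{2}{4} = 0.5, \\
F &= \frac{2\,P_{\mathrm{lcs}}\,R_{\mathrm{lcs}}}{P_{\mathrm{lcs}} + R_{\mathrm{lcs}}}
   = \frac{2 \cdot 0.4 \cdot 0.5}{0.4 + 0.5}
   = \frac{0.4}{0.9} \approx 0.4444.
\end{aligned}
```

The second pair works the same way: the LCS (the, transcript) has length 2, giving a precision of 2/3, a recall of 2/4, and an F-measure of about 0.5714.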

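ROUGE-L has an additional hyperparameter, alpha, which determines the weighting of the harmonic mean used to compute the F-Measure. Values closer to 0 give more importance to recall, and values closer to 1 give more importance to precision. alpha defaults to 0.5, which weights precision and recall equally.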

# Compute ROUGE-L with alpha=0
result = text.metrics.rouge_l(hypotheses, references, alpha=0)
print('F-Measure (alpha=0): %s' % result.f_measure)
print('P-Measure (alpha=0): %s' % result.p_measure)
print('R-Measure (alpha=0): %s' % result.r_measure)
F-Measure (alpha=0): tf.Tensor([0.5 0.5], shape=(2,), dtype=float32)
P-Measure (alpha=0): tf.Tensor([0.4       0.6666667], shape=(2,), dtype=float32)
R-Measure (alpha=0): tf.Tensor([0.5 0.5], shape=(2,), dtype=float32)
# Compute ROUGE-L with alpha=1
result = text.metrics.rouge_l(hypotheses, references, alpha=1)
print('F-Measure (alpha=1): %s' % result.f_measure)
print('P-Measure (alpha=1): %s' % result.p_measure)
print('R-Measure (alpha=1): %s' % result.r_measure)
F-Measure (alpha=1): tf.Tensor([0.4       0.6666667], shape=(2,), dtype=float32)
P-Measure (alpha=1): tf.Tensor([0.4       0.6666667], shape=(2,), dtype=float32)
R-Measure (alpha=1): tf.Tensor([0.5 0.5], shape=(2,), dtype=float32)
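The effect of alpha can be reproduced with a few lines of plain Python. The formula below is an assumption inferred from the outputs shown above (alpha=0 returns recall, alpha=1 returns precision), not code taken from the library, and `weighted_f_measure` is a hypothetical helper name.

```python
def weighted_f_measure(precision, recall, alpha=0.5):
    """Weighted harmonic mean of precision and recall.

    Assumed form: F = P * R / (alpha * R + (1 - alpha) * P),
    so alpha=1 yields precision and alpha=0 yields recall.
    """
    denominator = alpha * recall + (1 - alpha) * precision
    # Return 0 when the denominator vanishes (for example, no overlap at all).
    return precision * recall / denominator if denominator else 0.0


# Precision and recall of the two example pairs, as printed above.
pairs = [(0.4, 0.5), (2 / 3, 0.5)]
for alpha in (0.0, 0.5, 1.0):
    scores = [round(weighted_f_measure(p, r, alpha), 4) for p, r in pairs]
    print('alpha=%s: %s' % (alpha, scores))
```

Rounded to four decimals, the scores agree with the F-Measure tensors printed above for alpha values of 0, 0.5, and 1.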