TensorFlow graph optimization with Grappler

Overview

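TensorFlow uses both graph and eager execution to run computations. A tf.Graph contains a set of tf.Operation objects (ops), which represent units of computation, and tf.Tensor objects, which represent the units of data that flow between ops.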
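
As a minimal sketch of the relationship between a tf.Graph, its tf.Operation objects, and their tf.Tensor outputs, the snippet below traces a small function and lists the ops in the resulting graph. The function name scaled_sum is hypothetical and chosen only for illustration. The graph shown here is the one produced by tracing, so it should reflect the ops as written rather than any later rewrites.

```python
import tensorflow as tf


@tf.function
def scaled_sum(x, y):
    # Two units of computation: a multiplication followed by an addition.
    return 2.0 * x + y


# Tracing with concrete inputs yields a ConcreteFunction that owns a tf.Graph.
concrete = scaled_sum.get_concrete_function(tf.constant(1.0), tf.constant(3.0))
graph = concrete.graph

# Each tf.Operation is a node; its outputs are the tf.Tensor objects
# that flow to downstream ops.
for op in graph.get_operations():
    print(op.type, op.name, [t.dtype.name for t in op.outputs])

op_types = [op.type for op in graph.get_operations()]
```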

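Grappler is the default graph optimization system in the TensorFlow runtime. Grappler applies optimizations in graph mode (within tf.function) to improve the performance of your TensorFlow computations through graph simplifications and other high-level optimizations, such as inlining function bodies to enable inter-procedural optimizations. Optimizing the tf.Graph also reduces the device peak memory usage and improves hardware utilization by optimizing the mapping of graph nodes to compute resources.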

Use tf.config.optimizer.set_experimental_options() for finer control over your tf.Graph optimizations.
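
A short sketch of that call, assuming the 'constant_folding' key used later in this guide: it turns one optimizer off, reads the setting back with tf.config.optimizer.get_experimental_options(), and then explicitly turns the optimizer on again so the rest of the session is unaffected.

```python
import tensorflow as tf

# Turn a single optimizer off; options that are not mentioned are left alone.
tf.config.optimizer.set_experimental_options({"constant_folding": False})
current = tf.config.optimizer.get_experimental_options()
print(current)

# Explicitly switch the optimizer back on afterwards.
tf.config.optimizer.set_experimental_options({"constant_folding": True})
restored = tf.config.optimizer.get_experimental_options()
```

Setting the key back explicitly, rather than relying on a saved copy of the options, keeps the example independent of how unset keys are handled.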

Available graph optimizers

Grappler performs graph optimizations through a top-level driver called the MetaOptimizer. The following graph optimizers are available with TensorFlow:

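Constant folding optimizer - Statically infers the value of tensors when possible by folding constant nodes in the graph, and materializes the result using constants.
Arithmetic optimizer - Simplifies arithmetic operations by eliminating common subexpressions and simplifying arithmetic statements.
Layout optimizer - Optimizes tensor layouts to execute data-format-dependent operations, such as convolutions, more efficiently.
Remapper optimizer - Remaps subgraphs onto more efficient implementations by replacing commonly occurring subgraphs with optimized fused monolithic kernels.
Memory optimizer - Analyzes the graph to inspect the peak memory usage of each operation and inserts CPU-GPU memory copy operations that swap GPU memory to CPU, reducing peak memory usage.
Dependency optimizer - Removes or rearranges control dependencies to shorten the critical path of a model step or to enable other optimizations. Also removes nodes that are effectively no-ops, such as Identity.
Pruning optimizer - Prunes nodes that have no effect on the output from the graph. It is usually run first to reduce the size of the graph and speed up processing in other Grappler passes.
Function optimizer - Optimizes the function library of a TensorFlow program and inlines function bodies to enable other inter-procedural optimizations.
Shape optimizer - Optimizes subgraphs that operate on shape and shape-related information.
Autoparallel optimizer - Automatically parallelizes graphs by splitting along the batch dimension. This optimizer is turned OFF by default.
Loop optimizer - Optimizes the graph control flow by hoisting loop-invariant subgraphs out of loops and by removing redundant stack operations in loops. Also optimizes loops with statically known trip counts and removes statically known dead branches in conditionals.
Scoped allocator optimizer - Introduces scoped allocators to reduce data movement and to consolidate some operations.
Pin to host optimizer - Swaps small operations onto the CPU. This optimizer is turned OFF by default.
Auto mixed precision optimizer - Converts data types to float16 where applicable to improve performance. Currently applies only to GPUs.
Debug stripper - Strips nodes related to debugging operations, such as tf.debugging.Assert, tf.debugging.check_numerics, and tf.print, from the graph. This optimizer is turned OFF by default.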
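
To make the first entry in the list concrete, here is a toy constant folder written in plain Python. It is only an illustration of the idea of precomputing all-constant subtrees; it is not Grappler's implementation, and the Const, Placeholder, Add, and Mul classes are hypothetical stand-ins for graph nodes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: float


@dataclass(frozen=True)
class Placeholder:
    name: str


@dataclass(frozen=True)
class Add:
    left: object
    right: object


@dataclass(frozen=True)
class Mul:
    left: object
    right: object


def fold_constants(node):
    """Return an equivalent expression with all-constant subtrees precomputed."""
    if isinstance(node, (Const, Placeholder)):
        return node
    left = fold_constants(node.left)
    right = fold_constants(node.right)
    if isinstance(left, Const) and isinstance(right, Const):
        # Both inputs are known statically, so evaluate the node once, here.
        if isinstance(node, Add):
            return Const(left.value + right.value)
        return Const(left.value * right.value)
    # At least one input depends on runtime data; keep the node.
    return type(node)(left, right)


# (2 * 3 + 4) * x has a constant subtree that folds to 10, leaving 10 * x.
expr = Mul(Add(Mul(Const(2.0), Const(3.0)), Const(4.0)), Placeholder("x"))
folded = fold_constants(expr)
print(folded)
```

The folded expression does less work each time it is evaluated, because the constant part was computed once up front.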

Setup

import contextlib
import timeit
import traceback

import numpy as np
import tensorflow as tf

Create a context manager to easily toggle optimizer states.

@contextlib.contextmanager
def options(opts):
  # Remember the current settings so they can be reapplied on exit.
  old_opts = tf.config.optimizer.get_experimental_options()
  tf.config.optimizer.set_experimental_options(opts)
  try:
    yield
  finally:
    tf.config.optimizer.set_experimental_options(old_opts)

Compare execution performance with and without Grappler

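TensorFlow 2 and later execute eagerly by default. Use tf.function to switch the default execution to graph mode. Grappler then runs automatically in the background to apply the graph optimizations above and improve execution performance.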

Constant folding optimizer

As a preliminary example, consider a function that performs operations on constants and returns an output.

def test_function_1():
  @tf.function
  def simple_function(input_arg):
    print('Tracing!')
    a = tf.constant(np.random.randn(2000,2000), dtype = tf.float32)
    c = a
    for n in range(50):
      c = c@a
    return tf.reduce_mean(c+input_arg)

  return simple_function

Turn off the constant folding optimizer and execute the function:

with options({'constant_folding': False}):
  print(tf.config.optimizer.get_experimental_options())
  simple_function = test_function_1()
  # Trace once
  x = tf.constant(2.2)
  simple_function(x)
  print("Vanilla execution:", timeit.timeit(lambda: simple_function(x), number = 1), "s")

Enable the constant folding optimizer and execute the function again to observe a speed-up in function execution.

with options({'constant_folding': True}):
  print(tf.config.optimizer.get_experimental_options())
  simple_function = test_function_1()
  # Trace once
  x = tf.constant(2.2)
  simple_function(x)
  print("Constant folded execution:", timeit.timeit(lambda: simple_function(x), number = 1), "s")
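
A single timed call can be noisy. One common way to steady the comparison, sketched here with only the standard library, is to repeat the measurement and keep the fastest run. The helper name best_time and the stand-in workload are assumptions for illustration; in this guide the callable would be lambda: simple_function(x).

```python
import timeit


def best_time(fn, repeats=5):
    """Call fn once per repeat and return the fastest wall-clock time in seconds."""
    return min(timeit.repeat(fn, number=1, repeat=repeats))


# Stand-in workload so the sketch runs on its own.
elapsed = best_time(lambda: sum(i * i for i in range(100_000)))
print(f"best of 5: {elapsed:.6f} s")
```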

Debug stripper optimizer

Consider a simple function that checks the numeric value of its input argument and returns it.

def test_function_2():
  @tf.function
  def simple_func(input_arg):
    output = input_arg
    tf.debugging.check_numerics(output, "Bad!")
    return output
  return simple_func

First, execute the function with the debug stripper optimizer turned off.

test_func = test_function_2()
p1 = tf.constant(float('inf'))
try:
  test_func(p1)
except tf.errors.InvalidArgumentError as e:
  traceback.print_exc(limit=2)

tf.debugging.check_numerics raises an invalid argument error because of the Inf argument to test_func.
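
For reference, the same check can be exercised eagerly, outside tf.function. The sketch below wraps it in a hypothetical helper, raises_invalid_argument, to show that an infinite value is rejected while a finite one passes. Because Grappler applies only in graph mode, optimizer settings should not affect this eager check.

```python
import tensorflow as tf


def raises_invalid_argument(value):
    """Return True if check_numerics rejects the given scalar value."""
    try:
        tf.debugging.check_numerics(tf.constant(value), "Bad!")
    except tf.errors.InvalidArgumentError:
        return True
    return False


inf_rejected = raises_invalid_argument(float("inf"))
finite_rejected = raises_invalid_argument(1.0)
print(inf_rejected, finite_rejected)
```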

Enable the debug stripper optimizer and execute the function again.

with options({'debug_stripper': True}):
  test_func2 = test_function_2()
  p1 = tf.constant(float('inf'))
  try:
    test_func2(p1)
  except tf.errors.InvalidArgumentError as e:
    traceback.print_exc(limit=2)

The debug stripper optimizer strips the tf.debugging.check_numerics node from the graph and executes the function without raising any errors.

Summary

The TensorFlow runtime uses Grappler to optimize graphs automatically before execution. Use tf.config.optimizer.set_experimental_options to enable or disable the various graph optimizers.

For more information on Grappler, see TensorFlow graph optimization.