Basic training loops

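In the previous guides, you learned about tensors, variables, gradient tape, and modules. In this guide, you will fit these together to train a model.

TensorFlow also includes the tf.keras API, a high-level neural network API that provides useful abstractions to reduce boilerplate. In this guide, however, you will use the basic classes.

Setup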

import tensorflow as tf

import matplotlib.pyplot as plt

colors = plt.rcParams['axes.prop_cycle'].by_key()['color']

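Solving machine learning problems

Solving a machine learning problem usually consists of the following steps:

Obtain training data.
Define the model.
Define a loss function.
Run through the training data, calculating the loss from the ideal value.
Calculate the gradients of that loss and use an optimizer to adjust the variables to fit the data.
Evaluate your results.

For illustration purposes, in this guide you will develop a simple linear model, \(f(x) = x * W + b\), which has two variables: \(W\) (weights) and \(b\) (bias).

This is the most basic of machine learning problems: given \(x\) and \(y\), try to find the slope and offset of a line via simple linear regression.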
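Simple linear regression also has a closed-form least-squares solution, which makes a handy reference for checking what a training loop should converge to. The following is a minimal sketch in plain Python, not part of the TensorFlow workflow in this guide; the helper name `fit_line` and the toy data (a noise-free line with slope 3 and offset 2) are assumptions chosen for illustration.

```python
def fit_line(xs, ys):
  """Return (slope, intercept) of the ordinary least-squares line."""
  n = len(xs)
  mean_x = sum(xs) / n
  mean_y = sum(ys) / n
  # Slope is the covariance of x and y divided by the variance of x.
  cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
  var_x = sum((x - mean_x) ** 2 for x in xs)
  slope = cov_xy / var_x
  intercept = mean_y - slope * mean_x
  return slope, intercept

# Hypothetical noise-free data on the line y = 3x + 2.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3.0 * x + 2.0 for x in xs]

slope, intercept = fit_line(xs, ys)
print(slope, intercept)
```

With noisy data the recovered slope and offset would only approximate the true values, which is also what gradient-based training is expected to produce.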

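Data

Supervised learning uses inputs (usually denoted as x) and outputs (denoted y, often called labels). The goal is to learn from paired inputs and outputs so that you can predict the value of an output from an input.

Each input of your data, in TensorFlow, is almost always represented by a tensor, and is often a vector. In supervised training, the output (or the value you would like to predict) is also a tensor.

Here is some data synthesized by adding Gaussian (normal) noise to points along a line.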

# The actual line
TRUE_W = 3.0
TRUE_B = 2.0

NUM_EXAMPLES = 201

# A vector of evenly spaced x values in [-2, 2]
x = tf.linspace(-2,2, NUM_EXAMPLES)
x = tf.cast(x, tf.float32)

def f(x):
  return x * TRUE_W + TRUE_B

# Generate some noise
noise = tf.random.normal(shape=[NUM_EXAMPLES])

# Calculate y
y = f(x) + noise

# Plot all the data
plt.plot(x, y, '.')
plt.show()

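Tensors are usually gathered together in batches, that is, groups of inputs and outputs stacked together. Batching can confer some training benefits and works well with accelerators and vectorized computation. Given how small this dataset is, you can treat the entire dataset as a single batch.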
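As a rough illustration of what batching means, the sketch below splits aligned inputs and labels into consecutive batches using plain Python lists. The helper `make_batches` and the ten-example toy dataset are hypothetical; real TensorFlow code would normally batch tensors rather than lists.

```python
def make_batches(xs, ys, batch_size):
  """Split aligned inputs and labels into consecutive batches."""
  if len(xs) != len(ys):
    raise ValueError("inputs and labels must have the same length")
  return [
      (xs[start:start + batch_size], ys[start:start + batch_size])
      for start in range(0, len(xs), batch_size)
  ]

# Hypothetical toy dataset of ten (x, y) pairs.
xs = list(range(10))
ys = [3 * x + 2 for x in xs]

mini_batches = make_batches(xs, ys, batch_size=4)
full_batch = make_batches(xs, ys, batch_size=len(xs))

print([len(batch_x) for batch_x, _ in mini_batches])  # [4, 4, 2]
print(len(full_batch))  # 1
```

Setting the batch size to the dataset size yields exactly one batch, which is the situation this guide relies on.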

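Define the model

Use tf.Variable to represent all weights in a model. A tf.Variable stores a value and provides it in tensor form as needed. See the variable guide for more details.

Use tf.Module to encapsulate the variables and the computation. You could use any Python object, but this way the model can be saved easily.

Here, you define both w and b as variables.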

class MyModel(tf.Module):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)
    # Initialize the weights to `5.0` and the bias to `0.0`
    # In practice, these should be randomly initialized
    self.w = tf.Variable(5.0)
    self.b = tf.Variable(0.0)

  def __call__(self, x):
    return self.w * x + self.b

model = MyModel()

# List the variables using tf.Module's built-in variable aggregation.
print("Variables:", model.variables)

# Verify the model works
assert model(3.0).numpy() == 15.0

The initial variables are set here in a fixed way, but Keras comes with a number of initializers you could use, with or without the rest of Keras.
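One possible way to use a Keras initializer inside a plain tf.Module is sketched below. The class name `RandomInitModel`, the choice of `tf.keras.initializers.RandomNormal`, and the seed are assumptions made for illustration; the guide itself keeps the fixed initial values.

```python
import tensorflow as tf

class RandomInitModel(tf.Module):  # hypothetical name, not from the guide
  def __init__(self, seed=42, **kwargs):
    super().__init__(**kwargs)
    # Draw the initial weight from a normal distribution (assumed choice).
    init = tf.keras.initializers.RandomNormal(mean=0.0, stddev=1.0, seed=seed)
    self.w = tf.Variable(init(shape=(), dtype="float32"))
    # A bias of zero is a common starting point.
    self.b = tf.Variable(tf.zeros(shape=(), dtype=tf.float32))

  def __call__(self, x):
    return self.w * x + self.b

random_model = RandomInitModel()
print("Variables:", random_model.variables)
```

The initializer is only called to produce a starting tensor; nothing else from Keras is involved in the module.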

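Define a loss function

A loss function measures how well the output of a model for a given input matches the target output. The goal is to minimize this difference during training. Define the standard L2 loss, also known as the "mean squared" error: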

# This computes a single loss value for an entire batch
def loss(target_y, predicted_y):
  return tf.reduce_mean(tf.square(target_y - predicted_y))
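To make the arithmetic concrete, here is the same computation on a tiny batch in plain Python. The function name `mean_squared_error` and the four sample values are made up for illustration.

```python
def mean_squared_error(targets, predictions):
  """Average of the squared differences over the whole batch."""
  squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
  return sum(squared_errors) / len(squared_errors)

# Hypothetical batch: only the third prediction is off, by 2.
targets = [1.0, 2.0, 3.0, 4.0]
predictions = [1.0, 2.0, 5.0, 4.0]

batch_loss = mean_squared_error(targets, predictions)
print(batch_loss)  # 1.0, because (0 + 0 + 4 + 0) / 4 = 1
```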

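Before training the model, you can visualize the loss value by plotting the model's predictions together with the training data and the ground-truth line: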

plt.plot(x, y, '.', label="Data")
plt.plot(x, f(x), label="Ground truth")
plt.plot(x, model(x), label="Predictions")
plt.legend()
plt.show()

print("Current loss: %1.6f" % loss(y, model(x)).numpy())

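Define a training loop

The training loop consists of repeatedly doing four tasks in order:

Sending a batch of inputs through the model to generate outputs
Calculating the loss by comparing the outputs to the targets (or labels)
Using gradient tape to find the gradients
Optimizing the variables with those gradients

For this example, you can train the model using gradient descent.

There are many variants of the gradient descent scheme, and they are captured in tf.keras.optimizers. In the spirit of building from first principles, however, here you will implement the basic math yourself with the help of tf.GradientTape for automatic differentiation and tf.Variable.assign_sub for decrementing a value (it combines tf.Variable.assign and tf.subtract):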
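For the linear model and mean squared error used in this guide, the math behind a single gradient descent step can be written out explicitly. The symbols \(N\) (number of examples in the batch) and \(\eta\) (learning rate) are notation introduced here for illustration.

```latex
\[
L(W, b) = \frac{1}{N} \sum_{i=1}^{N} \bigl(y_i - (W x_i + b)\bigr)^2
\]
\[
\frac{\partial L}{\partial W} = -\frac{2}{N} \sum_{i=1}^{N} x_i \bigl(y_i - (W x_i + b)\bigr),
\qquad
\frac{\partial L}{\partial b} = -\frac{2}{N} \sum_{i=1}^{N} \bigl(y_i - (W x_i + b)\bigr)
\]
\[
W \leftarrow W - \eta \, \frac{\partial L}{\partial W},
\qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
\]
```

Automatic differentiation computes these two partial derivatives for you, so the training code only has to apply the update.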

# Given a callable model, inputs, outputs, and a learning rate...
def train(model, x, y, learning_rate):

  with tf.GradientTape() as t:
    # Trainable variables are automatically tracked by GradientTape
    current_loss = loss(y, model(x))

  # Use GradientTape to calculate the gradients with respect to W and b
  dw, db = t.gradient(current_loss, [model.w, model.b])

  # Subtract the gradient scaled by the learning rate
  model.w.assign_sub(learning_rate * dw)
  model.b.assign_sub(learning_rate * db)
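The gradients that the tape returns for this model can be cross-checked by hand. The sketch below uses plain Python rather than TensorFlow and assumes a noise-free five-point dataset; the helper names `mse`, `analytic_gradients`, and `numeric_gradients` are illustrative only. It compares the hand-derived gradients of the mean squared error with a central finite-difference estimate.

```python
def mse(w, b, xs, ys):
  """Mean squared error of the line w * x + b on the data."""
  return sum((y - (w * x + b)) ** 2 for x, y in zip(xs, ys)) / len(xs)

def analytic_gradients(w, b, xs, ys):
  """Hand-derived partial derivatives of the MSE with respect to w and b."""
  n = len(xs)
  residuals = [y - (w * x + b) for x, y in zip(xs, ys)]
  dw = sum(-2.0 * x * r for x, r in zip(xs, residuals)) / n
  db = sum(-2.0 * r for r in residuals) / n
  return dw, db

def numeric_gradients(w, b, xs, ys, eps=1e-4):
  """Central finite-difference estimate of the same derivatives."""
  dw = (mse(w + eps, b, xs, ys) - mse(w - eps, b, xs, ys)) / (2 * eps)
  db = (mse(w, b + eps, xs, ys) - mse(w, b - eps, xs, ys)) / (2 * eps)
  return dw, db

# Hypothetical noise-free data on y = 3x + 2, starting from w = 5, b = 0.
xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
ys = [3.0 * x + 2.0 for x in xs]
w, b = 5.0, 0.0

dw, db = analytic_gradients(w, b, xs, ys)
num_dw, num_db = numeric_gradients(w, b, xs, ys)
print(dw, db)  # 8.0 -4.0

# One gradient descent step with a learning rate of 0.1.
new_w, new_b = w - 0.1 * dw, b - 0.1 * db
print(mse(w, b, xs, ys), mse(new_w, new_b, xs, ys))
```

A positive gradient for w and a negative one for b mean the step lowers w toward 3 and raises b toward 2, so the loss drops after the update.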

For a look at training, you can send the same batch of x and y through the training loop and see how W and b evolve.

model = MyModel()

# Collect the history of W-values and b-values to plot later
weights = []
biases = []
epochs = range(10)

# Format the model's current parameters and loss for printing
def report(model, loss):
  return f"W = {model.w.numpy():1.2f}, b = {model.b.numpy():1.2f}, loss={loss:2.5f}"


# Define a training loop
def training_loop(model, x, y):

  for epoch in epochs:
    # Update the model with the single giant batch
    train(model, x, y, learning_rate=0.1)

    # Record the parameters and loss after this update
    weights.append(model.w.numpy())
    biases.append(model.b.numpy())
    current_loss = loss(y, model(x))

    print(f"Epoch {epoch:2d}:")
    print("    ", report(model, current_loss))

Do the training

current_loss = loss(y, model(x))

print("Starting:")
print("    ", report(model, current_loss))

training_loop(model, x, y)
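The learning rate of 0.1 used above is a choice, and plain gradient descent is sensitive to it. The sketch below, in plain Python on an assumed noise-free five-point dataset, runs the same update rule with two learning rates; the function name `run_gradient_descent` and the value 0.6 are hypothetical and chosen only to show a diverging run.

```python
def run_gradient_descent(learning_rate, epochs=10):
  """Full-batch gradient descent on y = 3x + 2, starting from w = 5, b = 0."""
  xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
  ys = [3.0 * x + 2.0 for x in xs]
  w, b = 5.0, 0.0
  n = len(xs)
  losses = []
  for _ in range(epochs):
    residuals = [y - (w * x + b) for x, y in zip(xs, ys)]
    # Record the loss before applying this epoch's update.
    losses.append(sum(r * r for r in residuals) / n)
    dw = sum(-2.0 * x * r for x, r in zip(xs, residuals)) / n
    db = sum(-2.0 * r for r in residuals) / n
    w -= learning_rate * dw
    b -= learning_rate * db
  return w, b, losses

w_small, b_small, losses_small = run_gradient_descent(learning_rate=0.1)
w_large, b_large, losses_large = run_gradient_descent(learning_rate=0.6)

print(f"lr=0.1: W = {w_small:.3f}, b = {b_small:.3f}, last loss = {losses_small[-1]:.4f}")
print(f"lr=0.6: W = {w_large:.3f}, b = {b_large:.3f}, last loss = {losses_large[-1]:.4f}")
```

On this toy data the smaller rate moves W and b steadily toward 3 and 2, while the larger rate overshoots further on every step and the loss grows.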

Plot the evolution of the weights over time:

plt.plot(epochs, weights, label='Weights', color=colors[0])
plt.plot(epochs, [TRUE_W] * len(epochs), '--',
         label = "True weight", color=colors[0])

plt.plot(epochs, biases, label='bias', color=colors[1])
plt.plot(epochs, [TRUE_B] * len(epochs), "--",
         label="True bias", color=colors[1])

plt.legend()
plt.show()

Visualize how the trained model performs:

plt.plot(x, y, '.', label="Data")
plt.plot(x, f(x), label="Ground truth")
plt.plot(x, model(x), label="Predictions")
plt.legend()
plt.show()

print("Current loss: %1.6f" % loss(y, model(x)).numpy())

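The same solution, but with Keras

It's useful to contrast the code above with the equivalent in Keras.

Defining the model looks exactly the same if you subclass tf.keras.Model. Remember that Keras models ultimately inherit from module.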

class MyModelKeras(tf.keras.Model):
  def __init__(self, **kwargs):
    super().__init__(**kwargs)
    # Initialize the weights to `5.0` and the bias to `0.0`
    # In practice, these should be randomly initialized
    self.w = tf.Variable(5.0)
    self.b = tf.Variable(0.0)

  def call(self, x):
    return self.w * x + self.b

keras_model = MyModelKeras()

# Reuse the training loop with a Keras model
training_loop(keras_model, x, y)

# You can also save a checkpoint using Keras's built-in support
keras_model.save_weights("my_checkpoint")

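Rather than write a new training loop each time you create a model, you can use the built-in features of Keras as a shortcut. This is useful when you do not want to write or debug Python training loops.

If you do, you will need to use model.compile() to set the parameters and model.fit() to train. Using the Keras implementations of L2 loss and gradient descent can take less code, again as a shortcut. Keras losses and optimizers can also be used outside of these convenience functions, and the previous example could have used them.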

keras_model = MyModelKeras()

# compile sets the training parameters
keras_model.compile(
    # By default, fit() uses tf.function().  You can
    # turn that off for debugging, but it is on now.
    run_eagerly=False,

    # Using a built-in optimizer, configuring as an object
    optimizer=tf.keras.optimizers.SGD(learning_rate=0.1),

    # Keras comes with built-in MSE error
    # However, you could use the loss function
    # defined above
    loss=tf.keras.losses.mean_squared_error,
)

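Keras fit expects batched data or a complete dataset as a NumPy array. NumPy arrays are chopped into batches, with a default batch size of 32.

In this case, to match the behavior of the hand-written loop, you should pass x in as a single batch. A batch_size of 1000 exceeds the 201 examples in the dataset, so all of them land in one batch.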

print(x.shape[0])
keras_model.fit(x, y, epochs=10, batch_size=1000)
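The effect of the batch size on the number of gradient updates is simple arithmetic. The short sketch below assumes each batch produces one update and that the final, smaller batch is kept; the helper name `updates_per_epoch` is illustrative.

```python
import math

def updates_per_epoch(num_examples, batch_size):
  """Number of batches, and so gradient updates, in one pass over the data."""
  return math.ceil(num_examples / batch_size)

NUM_EXAMPLES = 201  # same size as the dataset in this guide

print(updates_per_epoch(NUM_EXAMPLES, batch_size=32))    # 7
print(updates_per_epoch(NUM_EXAMPLES, batch_size=1000))  # 1
```

With the default batch size, each epoch would apply seven updates instead of the single update per epoch that the hand-written loop performs.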

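Note that Keras prints out the loss after training, not before, so the first loss appears lower. Otherwise, this shows essentially the same training performance.

Next steps

In this guide, you have seen how to use the core classes of tensors, variables, modules, and gradient tape to build and train a model, and how those ideas map to Keras.

This is, however, an extremely simple problem. For a more practical introduction, see the Custom training walkthrough.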

For more on using the built-in Keras training loops, on training loops and Keras, and on writing custom distributed training loops, see the corresponding TensorFlow guides.