Author: fchollet
Setup
import tensorflow as tf
import keras
from keras import layers
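When to use a Sequential model
A Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.
Schematically, the following Sequential model: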
# Define Sequential model with 3 layers
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu", name="layer1"),
        layers.Dense(3, activation="relu", name="layer2"),
        layers.Dense(4, name="layer3"),
    ]
)
# Call model on a test input
x = tf.ones((3, 3))
y = model(x)
is equivalent to this function:
# Create 3 layers
layer1 = layers.Dense(2, activation="relu", name="layer1")
layer2 = layers.Dense(3, activation="relu", name="layer2")
layer3 = layers.Dense(4, name="layer3")
# Call layers on a test input
x = tf.ones((3, 3))
y = layer3(layer2(layer1(x)))
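A Sequential model is not appropriate when:
- Your model has multiple inputs or multiple outputs
- Any of your layers has multiple inputs or multiple outputs
- You need to do layer sharing
- You want a non-linear topology (e.g. a residual connection, a multi-branch model)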
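As a point of contrast, here is a minimal sketch of a model with a residual connection, one of the topologies a Sequential model cannot express. It assumes the Functional API (keras.Input, keras.Model, layers.Add); the layer sizes and the name residual_model are illustrative choices, not taken from this guide.

```python
import numpy as np
import keras
from keras import layers

# Illustrative sizes; nothing here is prescribed by the guide.
inputs = keras.Input(shape=(8,))
x = layers.Dense(8, activation="relu")(inputs)
# Residual connection: the block's input is added back to its output,
# so the Add layer receives two input tensors. A plain stack of
# single-input layers cannot describe this.
x = layers.Add()([x, inputs])
outputs = layers.Dense(4)(x)
residual_model = keras.Model(inputs, outputs)

y = residual_model(np.ones((3, 8), dtype="float32"))
print(tuple(y.shape))  # (3, 4)
```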
Creating a Sequential model
You can create a Sequential model by passing a list of layers to the Sequential constructor:
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)
Its layers are accessible via the layers attribute:
model.layers
[<keras.src.layers.core.dense.Dense at 0x7fa3c8de0100>, <keras.src.layers.core.dense.Dense at 0x7fa3c8de09a0>, <keras.src.layers.core.dense.Dense at 0x7fa5181b5c10>]
You can also create a Sequential model incrementally via the add() method:
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu"))
model.add(layers.Dense(3, activation="relu"))
model.add(layers.Dense(4))
Note that there is also a corresponding pop() method to remove layers: a Sequential model behaves very much like a list of layers.
model.pop()
print(len(model.layers)) # 2
2
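Also note that the Sequential constructor accepts a name argument, just like any layer or model in Keras. This is useful for annotating TensorBoard graphs with semantically meaningful names.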
model = keras.Sequential(name="my_sequential")
model.add(layers.Dense(2, activation="relu", name="layer1"))
model.add(layers.Dense(3, activation="relu", name="layer2"))
model.add(layers.Dense(4, name="layer3"))
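Specifying the input shape in advance
In general, every layer in Keras needs to know the shape of its inputs in order to create its weights. So when you create a layer like this, it initially has no weights: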
layer = layers.Dense(3)
layer.weights # Empty
[]
It creates its weights the first time it is called on an input, since the shape of the weights depends on the shape of the input:
# Call layer on a test input
x = tf.ones((1, 4))
y = layer(x)
layer.weights # Now it has weights, of shape (4, 3) and (3,)
[<tf.Variable 'dense_6/kernel:0' shape=(4, 3) dtype=float32, numpy= array([[ 0.1752373 , 0.47623062, 0.24374962], [-0.0298934 , 0.50255656, 0.78478384], [-0.58323103, -0.56861055, -0.7190975 ], [-0.3191281 , -0.23635858, -0.8841506 ]], dtype=float32)>, <tf.Variable 'dense_6/bias:0' shape=(3,) dtype=float32, numpy=array([0., 0., 0.], dtype=float32)>]
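The same lazy behaviour can be checked programmatically rather than by reading the printed variables. The sketch below assumes the layer's built flag and uses a NumPy array as the test input; the variable name weight_shapes is illustrative.

```python
import numpy as np
from keras import layers

layer = layers.Dense(3)
print(layer.built)  # False: no input seen yet, so no weights exist

# Calling the layer on a batch with 4 features triggers weight creation.
layer(np.ones((1, 4), dtype="float32"))
print(layer.built)  # True

# Kernel is (input features, units); bias is (units,).
weight_shapes = [tuple(w.shape) for w in layer.weights]
print(weight_shapes)  # [(4, 3), (3,)]
```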
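Naturally, this also applies to Sequential models. When you instantiate a Sequential model without an input shape, it isn't "built": it has no weights (and calling model.weights results in an error stating just this). The weights are created when the model first sees some input data: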
model = keras.Sequential(
    [
        layers.Dense(2, activation="relu"),
        layers.Dense(3, activation="relu"),
        layers.Dense(4),
    ]
)  # No weights at this stage!
# At this point, you can't do this:
# model.weights
# You also can't do this:
# model.summary()
# Call the model on a test input
x = tf.ones((1, 4))
y = model(x)
print("Number of weights after calling the model:", len(model.weights)) # 6
Number of weights after calling the model: 6
Once a model is "built", you can call its summary() method to display its contents:
model.summary()
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_7 (Dense)             (1, 2)                    10
 dense_8 (Dense)             (1, 3)                    9
 dense_9 (Dense)             (1, 4)                    16
=================================================================
Total params: 35 (140.00 Byte)
Trainable params: 35 (140.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
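The Param # column in this summary can be reproduced by hand: a Dense layer holds one kernel entry per (input feature, unit) pair plus one bias per unit. The helper dense_param_count below is a hypothetical name used only for this sketch, which needs nothing beyond plain Python.

```python
def dense_param_count(input_dim, units):
    """Kernel entries (input_dim * units) plus one bias value per unit."""
    return input_dim * units + units


input_dim = 4            # feature size of the test input, tf.ones((1, 4))
layer_units = [2, 3, 4]  # units of the three Dense layers in the model

counts = []
for units in layer_units:
    counts.append(dense_param_count(input_dim, units))
    input_dim = units    # this layer's output feeds the next layer

print(counts)       # [10, 9, 16]
print(sum(counts))  # 35
```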
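However, when building a Sequential model incrementally, it can be very useful to display a summary of the model so far, including the current output shape. In this case, you should start your model by passing an Input object to it, so that it knows its input shape from the start: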
model = keras.Sequential()
model.add(keras.Input(shape=(4,)))
model.add(layers.Dense(2, activation="relu"))
model.summary()
Model: "sequential_4"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_10 (Dense)            (None, 2)                 10
=================================================================
Total params: 10 (40.00 Byte)
Trainable params: 10 (40.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Note that the Input object is not displayed as part of model.layers, since it isn't a layer:
model.layers
[<keras.src.layers.core.dense.Dense at 0x7fa3bc0ba820>]
A simple alternative is to pass an input_shape argument to your first layer:
model = keras.Sequential()
model.add(layers.Dense(2, activation="relu", input_shape=(4,)))
model.summary()
Model: "sequential_5"
_________________________________________________________________
 Layer (type)                Output Shape              Param #
=================================================================
 dense_11 (Dense)            (None, 2)                 10
=================================================================
Total params: 10 (40.00 Byte)
Trainable params: 10 (40.00 Byte)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
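Models built with a predefined input shape like this always have weights (even before seeing any data) and always have a defined output shape.
In general, it is a recommended best practice to always specify the input shape of a Sequential model in advance if you know what it is.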
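A quick way to confirm this is to inspect a model that starts with an Input object before feeding it any data. This is a small sketch; the layer sizes are illustrative, and it relies on the model's weights and output_shape properties.

```python
import keras
from keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(4,)),
        layers.Dense(2, activation="relu"),
        layers.Dense(3),
    ]
)

# No data has been passed through the model yet.
print(len(model.weights))  # 4: a kernel and a bias for each Dense layer
print(model.output_shape)  # (None, 3); None is the unknown batch size
```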
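A common debugging workflow: add() + summary()
When building a new Sequential architecture, it is useful to stack layers incrementally with add() and print model summaries frequently. For instance, this lets you monitor how a stack of Conv2D and MaxPooling2D layers downsamples image feature maps: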
model = keras.Sequential()
model.add(keras.Input(shape=(250, 250, 3))) # 250x250 RGB images
model.add(layers.Conv2D(32, 5, strides=2, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
# Can you guess what the current output shape is at this point? Probably not.
# Let's just print it:
model.summary()
# The answer was: (40, 40, 32), so we can keep downsampling...
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(3))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.Conv2D(32, 3, activation="relu"))
model.add(layers.MaxPooling2D(2))
# And now?
model.summary()
# Now that we have 4x4 feature maps, time to apply global max pooling.
model.add(layers.GlobalMaxPooling2D())
# Finally, we add a classification layer.
model.add(layers.Dense(10))
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                     Output Shape            Param #
=================================================================
 conv2d (Conv2D)                  (None, 123, 123, 32)    2432
 conv2d_1 (Conv2D)                (None, 121, 121, 32)    9248
 max_pooling2d (MaxPooling2D)     (None, 40, 40, 32)      0
=================================================================
Total params: 11680 (45.62 KB)
Trainable params: 11680 (45.62 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
Model: "sequential_6"
_________________________________________________________________
 Layer (type)                     Output Shape            Param #
=================================================================
 conv2d (Conv2D)                  (None, 123, 123, 32)    2432
 conv2d_1 (Conv2D)                (None, 121, 121, 32)    9248
 max_pooling2d (MaxPooling2D)     (None, 40, 40, 32)      0
 conv2d_2 (Conv2D)                (None, 38, 38, 32)      9248
 conv2d_3 (Conv2D)                (None, 36, 36, 32)      9248
 max_pooling2d_1 (MaxPooling2D)   (None, 12, 12, 32)      0
 conv2d_4 (Conv2D)                (None, 10, 10, 32)      9248
 conv2d_5 (Conv2D)                (None, 8, 8, 32)        9248
 max_pooling2d_2 (MaxPooling2D)   (None, 4, 4, 32)        0
=================================================================
Total params: 48672 (190.12 KB)
Trainable params: 48672 (190.12 KB)
Non-trainable params: 0 (0.00 Byte)
_________________________________________________________________
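The spatial sizes in these summaries follow from simple arithmetic. The sketch below assumes the Keras defaults used in the code above: "valid" padding for Conv2D and MaxPooling2D, and a pooling stride equal to the pool size. The helper valid_output_size and the stack list are hypothetical names for this illustration.

```python
def valid_output_size(size, window, stride):
    """Output length along one spatial axis with 'valid' (no) padding."""
    return (size - window) // stride + 1


# (layer name, window size, stride), in the order the layers were added.
stack = [
    ("conv2d", 5, 2),
    ("conv2d_1", 3, 1),
    ("max_pooling2d", 3, 3),
    ("conv2d_2", 3, 1),
    ("conv2d_3", 3, 1),
    ("max_pooling2d_1", 3, 3),
    ("conv2d_4", 3, 1),
    ("conv2d_5", 3, 1),
    ("max_pooling2d_2", 2, 2),
]

size = 250  # the input images are 250x250
sizes = []
for name, window, stride in stack:
    size = valid_output_size(size, window, stride)
    sizes.append(size)
    print(f"{name}: {size}x{size}")

print(sizes)  # [123, 121, 40, 38, 36, 12, 10, 8, 4]
```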
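Very practical, right?
What to do once you have a model
Once your model architecture is ready, you will want to:
- Train your model, evaluate it, and run inference. See our guide to training and evaluation with the built-in loops.
- Save your model to disk and restore it. See our guide to serialization and saving.
- Speed up model training by leveraging multiple GPUs. See our guide to multi-GPU and distributed training.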
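Feature extraction with a Sequential model
Once a Sequential model has been built, it behaves like a Functional API model. This means that every layer has an input and an output attribute. These attributes can be used to do neat things, such as quickly creating a model that extracts the outputs of all intermediate layers in a Sequential model: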
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
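To see what the extractor returns, the sketch below rebuilds the same model and prints the shape of each feature map. It uses a NumPy array as the test input, and the name feature_shapes is illustrative; the result is one tensor per layer of initial_model, in layer order.

```python
import numpy as np
import keras
from keras import layers

initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=[layer.output for layer in initial_model.layers],
)

features = feature_extractor(np.ones((1, 250, 250, 3), dtype="float32"))

# One feature map per layer, listed in the same order as model.layers.
for layer, feature in zip(initial_model.layers, features):
    print(layer.name, tuple(feature.shape))

feature_shapes = [tuple(f.shape) for f in features]
print(feature_shapes)
# [(1, 123, 123, 32), (1, 121, 121, 32), (1, 119, 119, 32)]
```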
Here is a similar example that only extracts features from one layer:
initial_model = keras.Sequential(
    [
        keras.Input(shape=(250, 250, 3)),
        layers.Conv2D(32, 5, strides=2, activation="relu"),
        layers.Conv2D(32, 3, activation="relu", name="my_intermediate_layer"),
        layers.Conv2D(32, 3, activation="relu"),
    ]
)
feature_extractor = keras.Model(
    inputs=initial_model.inputs,
    outputs=initial_model.get_layer(name="my_intermediate_layer").output,
)
# Call feature extractor on test input.
x = tf.ones((1, 250, 250, 3))
features = feature_extractor(x)
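Transfer learning with a Sequential model
Transfer learning consists of freezing the bottom layers in a model and only training the top layers. If you aren't familiar with it, make sure to read our guide to transfer learning.
Here are two common transfer learning blueprints involving Sequential models.
First, let's say that you have a Sequential model, and you want to freeze all layers except the last one. In this case, you can simply iterate over model.layers and set layer.trainable = False on each layer except the last one. Like this: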
model = keras.Sequential([
    keras.Input(shape=(784,)),  # (784,) is a 1-tuple; (784) would be a plain int
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(32, activation='relu'),
    layers.Dense(10),
])
# Presumably you would want to first load pre-trained weights.
model.load_weights(...)
# Freeze all layers except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False
# Recompile and train (this will only update the weights of the last layer).
model.compile(...)
model.fit(...)
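To check that the freeze took effect before training, one option is to count the trainable and non-trainable weights. This is a sketch on a freshly initialized copy of the model above (no pre-trained weights are loaded here), using the model's trainable_weights and non_trainable_weights properties.

```python
import keras
from keras import layers

model = keras.Sequential(
    [
        keras.Input(shape=(784,)),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(32, activation="relu"),
        layers.Dense(10),
    ]
)

# Freeze all layers except the last one.
for layer in model.layers[:-1]:
    layer.trainable = False

# Only the last Dense layer's kernel and bias remain trainable.
print(len(model.trainable_weights))      # 2
# Three frozen Dense layers, each with a kernel and a bias.
print(len(model.non_trainable_weights))  # 6
```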
Another common blueprint is to use a Sequential model to stack a pre-trained model and some freshly initialized classification layers. Like this:
# Load a convolutional base with pre-trained weights
base_model = keras.applications.Xception(
    weights='imagenet',
    include_top=False,
    pooling='avg')
# Freeze the base model
base_model.trainable = False
# Use a Sequential model to add a trainable classifier on top
model = keras.Sequential([
    base_model,
    layers.Dense(1000),
])
# Compile & train
model.compile(...)
model.fit(...)
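The same stacking pattern can be tried without downloading ImageNet weights. In the sketch below, a tiny randomly initialized model named toy_base stands in for the pre-trained Xception base; it is a hypothetical placeholder, and the input size and layer widths are arbitrary.

```python
import numpy as np
import keras
from keras import layers

# Hypothetical stand-in for a pre-trained convolutional base.
base_model = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(8, 3, activation="relu"),
        layers.GlobalAveragePooling2D(),
    ],
    name="toy_base",
)
# Freeze the base model; this also freezes the layers inside it.
base_model.trainable = False

# Stack a trainable classifier on top of the frozen base.
model = keras.Sequential([base_model, layers.Dense(10)])

# Calling the model once creates the classifier's weights.
y = model(np.ones((2, 32, 32, 3), dtype="float32"))
print(tuple(y.shape))                    # (2, 10)
print(len(model.trainable_weights))      # 2: the new Dense kernel and bias
print(len(model.non_trainable_weights))  # 2: the frozen Conv2D kernel and bias
```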
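If you do transfer learning, you will probably find yourself using these two patterns frequently.
That's about all you need to know about Sequential models!
To find out more about building models in Keras, see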