注意： TensorFlow Lite 現在是 Google AI Edge 的一部分。最新的文件現已移至 ai.google.dev/edge/lite。瞭解詳情

使用 TensorFlow Lite Model Maker 進行物件偵測

在 TensorFlow.org 上檢視

在 Google Colab 中執行

在 GitHub 上檢視原始碼

下載筆記本

在這個 Colab 筆記本中，您將學習如何使用 TensorFlow Lite Model Maker 函式庫，訓練自訂物件偵測模型，以便在行動裝置上偵測圖片中的沙拉。

Model Maker 函式庫使用遷移學習來簡化使用自訂資料集訓練 TensorFlow Lite 模型的流程。使用您自己的自訂資料集重新訓練 TensorFlow Lite 模型，可減少所需的訓練資料量並縮短訓練時間。

您將使用公開可用的沙拉資料集，該資料集是從 Open Images Dataset V4 建立的。

資料集中的每張圖片都包含標示為下列其中一個類別的物件

烘焙食品
起司
沙拉
海鮮
番茄

資料集包含指定每個物件位置的邊界框，以及物件的標籤。

以下是資料集中的範例圖片

先決條件

安裝必要的套件

首先安裝必要的套件，包括來自 GitHub 存放區的 Model Maker 套件，以及您將用於評估的 pycocotools 函式庫。

sudo apt -y install libportaudio2
pip install -q --use-deprecated=legacy-resolver tflite-model-maker
pip install -q pycocotools
pip install -q opencv-python-headless==4.1.2.30
pip uninstall -y tensorflow && pip install -q tensorflow==2.8.0

匯入必要的套件。

import numpy as np
import os

from tflite_model_maker.config import QuantizationConfig
from tflite_model_maker.config import ExportFormat
from tflite_model_maker import model_spec
from tflite_model_maker import object_detector

import tensorflow as tf
assert tf.__version__.startswith('2')

tf.get_logger().setLevel('ERROR')
from absl import logging
logging.set_verbosity(logging.ERROR)

準備資料集

在這裡，您將使用與 AutoML 快速入門相同的資料集。

沙拉資料集位於：gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv。

它包含 175 張用於訓練的圖片、25 張用於驗證的圖片和 25 張用於測試的圖片。資料集有五個類別：沙拉、海鮮、番茄、烘焙食品、起司。

資料集以 CSV 格式提供

TRAINING,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Salad,0.0,0.0954,,,0.977,0.957,,
VALIDATION,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Seafood,0.0154,0.1538,,,1.0,0.802,,
TEST,gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg,Tomato,0.0,0.655,,,0.231,0.839,,

每一列對應於較大圖片內本地化的物件，每個物件都特別指定為測試、訓練或驗證資料。您將在本筆記本後面的階段中瞭解更多關於這方面的資訊。
此處包含的三行表示 位於同一張圖片內的三個不同物件，該圖片位於 gs://cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg。
每一列都有不同的標籤：沙拉、海鮮、番茄等。
邊界框是使用左上角和右下角頂點為每張圖片指定的。

以下是這三行的視覺化

如果您想瞭解更多關於如何準備您自己的 CSV 檔案以及建立有效資料集的最低要求，請參閱準備您的訓練資料指南以瞭解更多詳細資訊。

如果您是 Google Cloud 的新手，您可能會想知道 gs:// URL 的含義。它們是儲存在 Google Cloud Storage (GCS) 上的檔案 URL。如果您將 GCS 上的檔案公開或驗證您的用戶端，Model Maker 可以像讀取您的本機檔案一樣讀取這些檔案。

但是，您不需要將圖片保存在 Google Cloud 上即可使用 Model Maker。您可以在 CSV 檔案中使用本機路徑，Model Maker 也能正常運作。

快速入門

訓練物件偵測模型有六個步驟

步驟 1. 選擇物件偵測模型架構。

本教學課程使用 EfficientDet-Lite0 模型。EfficientDet-Lite[0-4] 是一系列行動裝置/IoT 友善的物件偵測模型，衍生自 EfficientDet 架構。

以下是每個 EfficientDet-Lite 模型相互比較的效能。

模型架構	大小 (MB)*	延遲時間 (毫秒)**	平均精確度***
EfficientDet-Lite0	4.4	37	25.69%
EfficientDet-Lite1	5.8	49	30.55%
EfficientDet-Lite2	7.2	69	33.97%
EfficientDet-Lite3	11.4	116	37.70%
EfficientDet-Lite4	19.9	260	41.96%

* 整數量化模型的大小。
** 在 Pixel 4 上使用 CPU 上的 4 個執行緒測量的延遲時間。
*** 平均精確度是 COCO 2017 驗證資料集上的 mAP（平均平均精確度）。

spec = model_spec.get('efficientdet_lite0')

步驟 2. 載入資料集。

Model Maker 將採用 CSV 格式的輸入資料。使用 object_detector.DataLoader.from_csv 方法載入資料集，並將其分割為訓練、驗證和測試圖片。

訓練圖片：這些圖片用於訓練物件偵測模型以識別沙拉食材。
驗證圖片：這些是模型在訓練過程中未看到的圖片。您將使用它們來決定何時應停止訓練，以避免過度擬合。
測試圖片：這些圖片用於評估最終模型效能。

您可以直接從 Google Cloud Storage 載入 CSV 檔案，但您不需要將圖片保存在 Google Cloud 上即可使用 Model Maker。您可以在電腦上指定本機 CSV 檔案，Model Maker 也能正常運作。

train_data, validation_data, test_data = object_detector.DataLoader.from_csv('gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv')

步驟 3. 使用訓練資料訓練 TensorFlow 模型。

EfficientDet-Lite0 模型預設使用 epochs = 50，這表示它將執行訓練資料集 50 次。您可以查看訓練期間的驗證準確度並提早停止以避免過度擬合。
在此處設定 batch_size = 8，以便您看到需要 21 個步驟才能完成訓練資料集中 175 張圖片的處理。
設定 train_whole_model=True 以微調整個模型，而不僅僅是訓練頭部層，以提高準確度。缺點是訓練模型可能需要更長的時間。

model = object_detector.create(train_data, model_spec=spec, batch_size=8, train_whole_model=True, validation_data=validation_data)

步驟 4. 使用測試資料評估模型。

在使用訓練資料集中的圖片訓練物件偵測模型後，使用測試資料集中剩餘的 25 張圖片來評估模型在面對從未見過的新資料時的效能。

由於預設批次大小為 64，因此需要 1 個步驟才能完成測試資料集中 25 張圖片的處理。

評估指標與 COCO 相同。

model.evaluate(test_data)

步驟 5. 匯出為 TensorFlow Lite 模型。

透過指定您要將量化模型匯出到的資料夾，將已訓練的物件偵測模型匯出為 TensorFlow Lite 格式。預設的訓練後量化技術是完整整數量化。

model.export(export_dir='.')

步驟 6. 評估 TensorFlow Lite 模型。

匯出到 TFLite 時，有幾個因素會影響模型準確度

量化有助於將模型大小縮小 4 倍，但會犧牲一些準確度。
原始 TensorFlow 模型針對後處理使用每個類別的非最大值抑制 (NMS)，而 TFLite 模型使用全域 NMS，速度更快但準確度較低。Keras 最多輸出 100 個偵測結果，而 tflite 最多輸出 25 個偵測結果。

因此，您必須評估匯出的 TFLite 模型，並將其準確度與原始 TensorFlow 模型進行比較。

model.evaluate_tflite('model.tflite', test_data)

您可以使用 Colab 左側邊欄下載 TensorFlow Lite 模型檔案。在 model.tflite 檔案上按一下滑鼠右鍵，然後選擇 下載，將其下載到您的本機電腦。

此模型可以使用 ObjectDetector API 的 TensorFlow Lite Task Library 整合到 Android 或 iOS 應用程式中。

請參閱 TFLite 物件偵測範例應用程式，以瞭解更多關於如何在工作應用程式中使用模型的詳細資訊。

(選用) 在您的圖片上測試 TFLite 模型

您可以使用來自網際網路的圖片測試已訓練的 TFLite 模型。

將下方的 INPUT_IMAGE_URL 替換為您想要的輸入圖片。
調整 DETECTION_THRESHOLD 以變更模型的靈敏度。較低的閾值表示模型將偵測到更多物件，但也會有更多誤報。同時，較高的閾值表示模型將僅偵測到它有信心偵測到的物件。

雖然目前在 Python 中執行模型需要一些重複程式碼，但將模型整合到行動應用程式中只需要幾行程式碼。

載入已訓練的 TFLite 模型並定義一些視覺化函式

import cv2

from PIL import Image

model_path = 'model.tflite'

# Load the labels into a list
classes = ['???'] * model.model_spec.config.num_classes
label_map = model.model_spec.config.label_map
for label_id, label_name in label_map.as_dict().items():
  classes[label_id-1] = label_name

# Define a list of colors for visualization
COLORS = np.random.randint(0, 255, size=(len(classes), 3), dtype=np.uint8)

def preprocess_image(image_path, input_size):
  """Preprocess the input image to feed to the TFLite model"""
  img = tf.io.read_file(image_path)
  img = tf.io.decode_image(img, channels=3)
  img = tf.image.convert_image_dtype(img, tf.uint8)
  original_image = img
  resized_img = tf.image.resize(img, input_size)
  resized_img = resized_img[tf.newaxis, :]
  resized_img = tf.cast(resized_img, dtype=tf.uint8)
  return resized_img, original_image


def detect_objects(interpreter, image, threshold):
  """Returns a list of detection results, each a dictionary of object info."""

  signature_fn = interpreter.get_signature_runner()

  # Feed the input image to the model
  output = signature_fn(images=image)

  # Get all outputs from the model
  count = int(np.squeeze(output['output_0']))
  scores = np.squeeze(output['output_1'])
  classes = np.squeeze(output['output_2'])
  boxes = np.squeeze(output['output_3'])

  results = []
  for i in range(count):
    if scores[i] >= threshold:
      result = {
        'bounding_box': boxes[i],
        'class_id': classes[i],
        'score': scores[i]
      }
      results.append(result)
  return results


def run_odt_and_draw_results(image_path, interpreter, threshold=0.5):
  """Run object detection on the input image and draw the detection results"""
  # Load the input shape required by the model
  _, input_height, input_width, _ = interpreter.get_input_details()[0]['shape']

  # Load the input image and preprocess it
  preprocessed_image, original_image = preprocess_image(
      image_path,
      (input_height, input_width)
    )

  # Run object detection on the input image
  results = detect_objects(interpreter, preprocessed_image, threshold=threshold)

  # Plot the detection results on the input image
  original_image_np = original_image.numpy().astype(np.uint8)
  for obj in results:
    # Convert the object bounding box from relative coordinates to absolute
    # coordinates based on the original image resolution
    ymin, xmin, ymax, xmax = obj['bounding_box']
    xmin = int(xmin * original_image_np.shape[1])
    xmax = int(xmax * original_image_np.shape[1])
    ymin = int(ymin * original_image_np.shape[0])
    ymax = int(ymax * original_image_np.shape[0])

    # Find the class index of the current object
    class_id = int(obj['class_id'])

    # Draw the bounding box and label on the image
    color = [int(c) for c in COLORS[class_id]]
    cv2.rectangle(original_image_np, (xmin, ymin), (xmax, ymax), color, 2)
    # Make adjustments to make the label visible for all objects
    y = ymin - 15 if ymin - 15 > 15 else ymin + 15
    label = "{}: {:.0f}%".format(classes[class_id], obj['score'] * 100)
    cv2.putText(original_image_np, label, (xmin, y),
        cv2.FONT_HERSHEY_SIMPLEX, 0.5, color, 2)

  # Return the final image
  original_uint8 = original_image_np.astype(np.uint8)
  return original_uint8

執行物件偵測並顯示偵測結果

INPUT_IMAGE_URL = "https://storage.googleapis.com/cloud-ml-data/img/openimage/3/2520/3916261642_0a504acd60_o.jpg"
DETECTION_THRESHOLD = 0.3

TEMP_FILE = '/tmp/image.png'

!wget -q -O $TEMP_FILE $INPUT_IMAGE_URL
im = Image.open(TEMP_FILE)
im.thumbnail((512, 512), Image.ANTIALIAS)
im.save(TEMP_FILE, 'PNG')

# Load the TFLite model
interpreter = tf.lite.Interpreter(model_path=model_path)
interpreter.allocate_tensors()

# Run inference and draw detection result on the local copy of the original file
detection_result_image = run_odt_and_draw_results(
    TEMP_FILE,
    interpreter,
    threshold=DETECTION_THRESHOLD
)

# Show the detection result
Image.fromarray(detection_result_image)

(選用) 為 Edge TPU 編譯

現在您有了一個量化的 EfficientDet Lite 模型，可以編譯並部署到 Coral EdgeTPU。

步驟 1. 安裝 EdgeTPU 編譯器

 curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo apt-key add -

 echo "deb https://packages.cloud.google.com/apt coral-edgetpu-stable main" | sudo tee /etc/apt/sources.list.d/coral-edgetpu.list

 sudo apt-get update

 sudo apt-get install edgetpu-compiler

步驟 2. 選取 Edge TPU 的數量，編譯

EdgeTPU 具有 8MB 的 SRAM，用於快取模型參數 (更多資訊)。這表示對於大於 8MB 的模型，為了傳輸模型參數，推論時間將會增加。避免這種情況的一種方法是模型管線化 - 將模型分割成可以具有專用 EdgeTPU 的區段。這可以顯著改善延遲時間。

下表可用作要使用的 Edge TPU 數量的參考 - 較大的模型將無法為單個 TPU 編譯，因為中間張量無法容納在晶片上記憶體中。

模型架構	最少 TPU	建議 TPU
EfficientDet-Lite0	1	1
EfficientDet-Lite1	1	1
EfficientDet-Lite2	1	2
EfficientDet-Lite3	2	2
EfficientDet-Lite4	2	3

NUMBER_OF_TPUS =  1

!edgetpu_compiler model.tflite --num_segments=$NUMBER_OF_TPUS

步驟 3. 下載、執行模型

在編譯模型後，現在可以在 EdgeTPU 上執行它們以進行物件偵測。首先，使用 Colab 左側邊欄下載編譯後的 TensorFlow Lite 模型檔案。在 model_edgetpu.tflite 檔案上按一下滑鼠右鍵，然後選擇 下載，將其下載到您的本機電腦。

現在您可以以您偏好的方式執行模型。偵測範例包括

進階使用

本節涵蓋進階使用主題，例如調整模型和訓練超參數。

載入資料集

載入您自己的資料

您可以上傳您自己的資料集來完成本教學課程。使用 Colab 中的左側邊欄上傳您的資料集。

Upload File

如果您不想將資料集上傳到雲端，您也可以按照指南在本機執行函式庫。

以不同的資料格式載入您的資料

Model Maker 函式庫也支援 object_detector.DataLoader.from_pascal_voc 方法，以載入具有 PASCAL VOC 格式的資料。makesense.ai 和 LabelImg 是可以註釋圖片並將註釋儲存為 PASCAL VOC 資料格式 XML 檔案的工具

object_detector.DataLoader.from_pascal_voc(image_dir, annotations_dir, label_map={1: "person", 2: "notperson"})

自訂 EfficientDet 模型超參數

您可以調整的模型和訓練管線參數包括

model_dir：儲存模型檢查點檔案的位置。如果未設定，將使用暫存目錄。
steps_per_execution：每次訓練執行的步驟數。
moving_average_decay：Float。用於維持已訓練參數移動平均值的衰減。
var_freeze_expr：用於對應要凍結的變數前綴名稱的規則運算式，這表示在訓練期間保持不變。更具體地說，在程式碼中使用 re.match(var_freeze_expr, variable_name) 來對應要凍結的變數。
tflite_max_detections：整數，預設為 25。TFLite 模型中輸出偵測結果的最大數量。
strategy：指定要使用的分佈策略的字串。接受的值為 'tpu'、'gpus'、None。tpu' 表示使用 TPUStrategy。'gpus' 表示使用 MirroredStrategy 進行多 GPU 處理。如果為 None，則使用 TF 預設值 OneDeviceStrategy。
tpu：用於訓練的 Cloud TPU。這應該是建立 Cloud TPU 時使用的名稱，或是 grpc://ip.address.of.tpu:8470 url。
use_xla：即使策略不是 tpu 也使用 XLA。如果策略是 tpu，則始終使用 XLA，並且此標誌無效。
profile：啟用設定檔模式。
debug：啟用偵錯模式。

可以在 hparams_config.py 中查看可以調整的其他參數。

例如，您可以設定 var_freeze_expr='efficientnet'，這會凍結名稱前綴為 efficientnet 的變數（預設值為 '(efficientnet|fpn_cells|resample_p6)'）。這允許模型凍結不可訓練的變數，並在整個訓練過程中保持其值不變。

spec = model_spec.get('efficientdet_lite0')
spec.config.var_freeze_expr = 'efficientnet'

變更模型架構

您可以透過變更 model_spec 來變更模型架構。例如，將 model_spec 變更為 EfficientDet-Lite4 模型。

spec = model_spec.get('efficientdet_lite4')

調整訓練超參數

create 函式是 Model Maker 函式庫用於建立模型的主要函式。model_spec 參數定義模型規格。object_detector.EfficientDetSpec 類別目前受到支援。create 函式包含以下步驟

根據 model_spec 建立物件偵測模型。
訓練模型。預設 epoch 和預設批次大小由 model_spec 物件中的 epochs 和 batch_size 變數設定。您也可以調整訓練超參數，例如影響模型準確度的 epochs 和 batch_size。例如：

epochs：整數，預設為 50。更多 epoch 可以實現更好的準確度，但也可能導致過度擬合。
batch_size：整數，預設為 64。在一個訓練步驟中要使用的樣本數。
train_whole_model：布林值，預設為 False。如果為 true，則訓練整個模型。否則，僅訓練與 var_freeze_expr 不符的層。

例如，您可以使用較少的 epoch 進行訓練，並且僅訓練頭部層。您可以增加 epoch 的數量以獲得更好的結果。

model = object_detector.create(train_data, model_spec=spec, epochs=10, validation_data=validation_data)

匯出為不同格式

匯出格式可以是以下其中之一或清單

預設情況下，它僅匯出包含模型 metadata 的 TensorFlow Lite 模型檔案，以便您稍後在裝置端 ML 應用程式中使用。標籤檔案嵌入在 metadata 中。

在許多裝置端 ML 應用程式中，模型大小是一個重要因素。因此，建議您量化模型以使其更小並可能更快地執行。至於 EfficientDet-Lite 模型，預設使用完整整數量化來量化模型。請參閱訓練後量化以瞭解更多詳細資訊。

model.export(export_dir='.')

您也可以選擇匯出與模型相關的其他檔案，以便更好地檢查。例如，同時匯出儲存的模型和標籤檔案，如下所示

model.export(export_dir='.', export_format=[ExportFormat.SAVED_MODEL, ExportFormat.LABEL])

自訂 TensorFlow Lite 模型上的訓練後量化

訓練後量化是一種轉換技術，可以減少模型大小和推論延遲時間，同時還可以提高 CPU 和硬體加速器推論速度，但模型準確度會略有下降。因此，它被廣泛用於優化模型。

Model Maker 函式庫在匯出模型時應用預設的訓練後量化技術。如果您想自訂訓練後量化，Model Maker 支援使用 QuantizationConfig 的多個訓練後量化選項。讓我們以 float16 量化為例。首先，定義量化設定。

config = QuantizationConfig.for_float16()

然後，我們使用此設定匯出 TensorFlow Lite 模型。

model.export(export_dir='.', tflite_filename='model_fp16.tflite', quantization_config=config)

閱讀更多

您可以閱讀我們的物件偵測範例以瞭解技術細節。如需更多資訊，請參閱

TensorFlow Lite Model Maker 指南和 API 參考資料。
Task Library：ObjectDetector 用於部署。
端對端參考應用程式：Android、iOS 和 Raspberry PI。