YOLO モデルの学習の事例

Welcome to Mashykom WebSite

≡ Pytorch Menu

YOLO(You Only Look Once) モデルの学習の実際

　PyTorch はディープラーニングを実装する際に用いられるディープラーニング用ライブラリのPython APIの一つです。もともとは、Torch7と呼ばれるLua言語で書かれたライブラリでした。Chainerは日本のPreferred Networks社が開発したライブラリですが、Pytorchに統合されました。Caffe2もPyTorchに併合されました。現在、PyTorch は Team PyTorch によって開発されています。PyTorchの利点はDefine by Run（動的計算グラフ）と呼ばれる特徴です。Define by Runは入力データのサイズや次元数に合わせてニューラルネットワークの形や計算方法を変更することができます。

　PyTorchは、Define by Runという特徴ゆえに、AIを開発する専門家に必須のアイテムになりつつあります。Object Detection用のライブラリの中では処理速度が最も最速です。現在、PytorchはPytorch team によって開発が担われていますが、中心集団はFacebookの研究者たちです。Caffe2及びDetectronを統合しつつ開発が進められています。

　このページでは、PyTorch を利用した物体検出の Python 実装において重要な役割を果たしているYOLO(You Only Look Once)モデルを学習する実際の事例について取り上げます。YOLOのバージョンはV.2からV.5まであります。YOLOv.3までは、当時大学院生だったJoseph Chet Redmon 氏が開発したものです。YOLO v.3はDarknetというC 言語コードで書かれたスクリプトをコンパイルした実行ファイルを用いて処理を行います。こちらのホームページを参照ください。その後、Aleksey Bochkovskiy氏がDarknetの Githubに依拠して、YOLOV.4の開発を行いました。そのGithub サイトはgithub.com/AlexeyAB/darknetです。

　2020年に入って、Roboflowという会社が Pytorch 向けの YOLOV4 を発表しました。そのGithubがここです。さらに最近（2020年6月に）、Ultralytics社がPytorch向けのYOLO v.5 を公開しました。そのGithubはgithub.com/ultralytics/yolov5です。YOLOv5は、検出精度と演算負荷に応じてs、m、l、xまでの4モデルがあります。YOLOv5のsモデルを使用することで、YOLOv3のfullモデルに近い性能を、1/4以下の演算量で達成することができています。YOLOv5をYOLO名称の系列に含めるべきか否かという論争も起きています。その後（2020年12月）、YOLOv4 並びに EfficientDet の性能を超えると言われる Scaled-YOLOv4モデルが公開され、Darknetを用いた実装と、Pytorchを用いたPython実装が開発されています。

　このページでは、Pytorch-YOLO モデルと Darknet-YOLO モデルのカスタム・データセットを用いた学習、トレーニングの実際の事例を取り上げます。Pytorch のインストール方法は公式ページに行き、install ページの手順に従って、Run this Command:のコマンドラインをコピペして下さい。2021年3月時点での最新バージョンはPytorch 1.8 です。以下の説明も注意書きがしていない限り、このバージョンで実証しています。PyTorch による物体検出に関する説明は、Pytorchを用いた物体検出のページを参照ください。

　Darknetを扱った学習事例におけるC 言語コードのコンパイルの方法は、Google Colabを前提に記述しますので、Linux系OSの方法に従います。WindowsでのC言語コードのビルドの方法は説明していません。

Last updated: 2021.3.25（first uploaded 2021.2.10）

YOLOv5モデルの学習:coco128を用いた例

　YOLOモデルの学習の具体的手順について説明します。最初に、最もシンプルな ultralytics/yolov5を用いた学習の実際を取り上げます。学習にはGPUが必要なので、Google Colabで実装することにします。このColabにあるコードで実行します。

　最初に、Github repoをcloneし、必要なモジュールを読み込みます。ディレクトリyolov5に移動して、GPUの可能性をチェックします。



!git clone https://github.com/ultralytics/yolov5  # clone repo
!pip install -r yolov5/requirements.txt  # install dependencies
%cd yolov5

import torch
from IPython.display import Image, clear_output  # to display images
from utils.google_utils import gdrive_download  # to download models/datasets

clear_output()
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

次に、学習済みモデルの重み yolov5{x}.pt をダウンロードします。


!python detect.py --weights yolov5s.pt --img 416 --conf 0.4 --source ./data/images/
Image(filename='runs/detect/exp/zidane.jpg', width=600)

　とdetect.pyを実行すると、自動的にモデルの重みyolovs.ptがダウンロードされます。

　なお、４種類の重みをダウンロードしたいときは


!bash weights/download_weights.sh

　とします。yolov5i.pt, yolov5m.pt, yolov5s.pt, yolov5x.ptの４種類の重みがダウンロードされます。

　ここでは、学習のデータとして、COCO2017の中から128個のデータだけを使用します。coco128という名前で用意されています。このcoco128 dataset、`./data/coco128.yaml`、をダウンロードします。


# Download COCO128
torch.hub.download_url_to_file('https://github.com/ultralytics/yolov5/releases/download/v1.0/coco128.zip', 'tmp.zip')
!unzip -q tmp.zip -d ../ && rm tmp.zip

　tensorboardを開始させます。


# Start tensorboard
%load_ext tensorboard
%tensorboard --logdir runs

yolov5s モデルを用いて、回数5 epochs で学習させます。実際のデータでは、300-1000 epochs が必要とされます。各種の設定は
model configuration file `--cfg ./models/yolo5s.yaml`
dataset configuration file `--data ./data/coco128.yaml`
Start training from pretrained `--weights yolov5s.pt`, or from scratch using `--weights ''`
the directory to save results `--name tutorial`
とします。


# Train YOLOv5s on coco128 for 5 epochs
!python train.py --img 640 --batch 16 --epochs 5 --data ./data/coco128.yaml --cfg ./models/yolov5s.yaml --weights yolov5s.pt --name tutorial --nosave --cache

　以下のような結果が表示されます。

--- 
---
autoanchor: Analyzing anchors... anchors/target = 4.26, Best Possible Recall (BPR) = 0.9946
Image sizes 640 train, 640 test
Using 2 dataloader workers
Logging results to runs/train/tutorial
Starting training for 5 epochs...

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
       0/4     3.21G   0.04237   0.06417   0.02121    0.1277       183       640: 100% 8/8 [00:05<00:00,  1.46it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:07<00:00,  1.88s/it]
                 all         128         929       0.642       0.638       0.661        0.43

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
       1/4     6.73G    0.0443     0.064   0.01899    0.1273       166       640: 100% 8/8 [00:02<00:00,  3.94it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:01<00:00,  3.48it/s]
                 all         128         929       0.662       0.625       0.658       0.432

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
       2/4     6.73G   0.04504   0.06833   0.01908    0.1324       182       640: 100% 8/8 [00:01<00:00,  4.00it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:01<00:00,  3.35it/s]
                 all         128         929       0.656       0.628       0.662       0.434

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
       3/4     6.73G   0.04356   0.06578    0.0172    0.1265       256       640: 100% 8/8 [00:02<00:00,  3.82it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:01<00:00,  3.19it/s]
                 all         128         929       0.694       0.616       0.664       0.433

     Epoch   gpu_mem       box       obj       cls     total   targets  img_size
       4/4     6.73G    0.0437   0.06515   0.01908    0.1279       289       640: 100% 8/8 [00:02<00:00,  3.86it/s]
               Class      Images     Targets           P           R      mAP@.5  mAP@.5:.95: 100% 4/4 [00:02<00:00,  1.41it/s]
                 all         128         929       0.689       0.617       0.669       0.439
Optimizer stripped from runs/train/tutorial/weights/last.pt, 14.8MB
Optimizer stripped from runs/train/tutorial/weights/best.pt, 14.8MB
5 epochs completed in 0.010 hours.

　すべての学習の結果は runs/train/に蓄積されます。学習された重みは、runs/train/tutorial/weights/ に保存されたことが分かります。last.pt、best.pt の２種類です。確かめてみましょう。


!ls runs/train/tutorial/

confusion_matrix.png				   results.txt
events.out.tfevents.1615266650.45d2d3616093.179.0  test_batch0_labels.jpg
F1_curve.png					   test_batch0_pred.jpg
hyp.yaml					   test_batch1_labels.jpg
labels_correlogram.jpg				   test_batch1_pred.jpg
labels.jpg					   test_batch2_labels.jpg
opt.yaml					   test_batch2_pred.jpg
P_curve.png					   train_batch0.jpg
PR_curve.png					   train_batch1.jpg
R_curve.png					   train_batch2.jpg
results.png					   weights

　学習結果を表示してみましょう。


Image(filename='runs/train/tutorial/train_batch0.jpg', width=800)  # train batch 0 mosaics and labels
Image(filename='runs/train/tutorial/test_batch0_labels.jpg', width=800)  # test batch 0 labels
Image(filename='runs/train/tutorial/test_batch0_pred.jpg', width=800)  # test batch 0 predictions

　test_batch0_pred.jpg は test_batch0.jpg からの検出結果を表示しています。

　学習過程の損失と成果は results.txt logfile に蓄積されているので、表示できます。results.pngとして表示します。


from utils.plots import plot_results 
plot_results(save_dir='runs/train/tutorial')  # plot all results*.txt as results.png
Image(filename='runs/train/tutorial/results.png', width=800)

　使用された画像を調べるためには、例えば、


!ls ../coco128/images/train2017

　とすると、coco128データのtrainの画像がリストが表示できます。この中の任意の画像から物体検出をしたい場合、


!python detect.py --weights runs/train/tutorial/weights/best.pt --img 416 --conf 0.4 --source ../coco128/images/train2017/*370.jpg

　と検出を実行します。ここでは、画像番号 *370.jpg を使用しました。

実行するたびに、runs/train/exp2, runs/train/exp3 で表現される通り、結果は exp3,4,5,...とフォルダーが作成され、保存されます。結果は以下のコードを実行すると表示できます。


Image(filename='runs/detect/exp7/000000000370.jpg', width=600)

370.jpg

　ここでの学習では、websiteに用意されているデータ、coco128 datasets、を利用しました。web上で公開されているdatasetsの代表例として、Roboflowがあります。

　実際には、個人が自らのdatasets を作成して、学習をする必要に迫られます。yolov5におけるdatasetsは、画像用のxyz.jpg とそれに対応するラベリング（アノテーション）の xyz.txt からなるペアの集合です。そして、モデル作成のためのデータ(教師データ、検証データ)格納場所、クラス分類数やクラス名についての設定ファイル data.yaml を作成する必要があります。

　画像のラベリング（アノテーション）を作成するためには、的確なソフトを利用する必要があるでしょう。yoloの仕様に合わせたアノテーションを作成できるソフトとして、labelImgがあります。

YOLOv5モデルの学習:Roboflowのデータを用いた例

　次に、Roboflowにあるデータセットをダウンロードして、それを用いたモデルの学習を取り上げます。その1例が、「Training a Custom Object Detection Model With Yolo-V5」に紹介されています。この解説に沿って、実行してみましょう。

以下の順序で進めます。Google Colab上のノートブックで行います。GPUが利用可能なPC実機で行うときも、Google Driveへのmountの部分を若干修正すれば実行できます。

Preparing the dataset
Environment Setup: Install YOLOv5 dependencies
Setup the data and the directories
Setup the YAML files for training
Training the model
Evaluate the model
Visualize the training data
Running inference on test images
Export the weight files for later use

　「Preparing the dataset」の部分は、スクラッチから自身で作成することはせず、Roboflowからのデータセットを借りることにします。画像のアノテーションもデータセットに含まれています。 Aquarium Datasetをダウンロードします。Githubのアカウント持っていれば、容易にダウンロードできます。ダウンロードしたzipファイルをroboflow.zipという名前で保存します。Google Colabを使用するので、My\ Drive にフォルダーYoloV5を作成して、そのなかに、roboflow.zipをアップロードします。

　Environment Setup: Install YOLOv5 dependenciesをGoogle Colabで行います。このColab を開いてください。


!git clone https://github.com/ultralytics/yolov5  # clone repo
%cd yolov5
%pip install -qr requirements.txt  # install dependencies

import torch
from IPython.display import Image, clear_output  # to display images

clear_output()
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

　と入力します。
Setup complete. Using torch 1.8.0+cu101 _CudaDeviceProperties(name='Tesla T4', major=7, minor=5, total_memory=15109MB, multi_processor_count=40)
と表示されます。


!python detect.py --weights yolov5s.pt --img 640 --conf 0.25 --source data/images/
Image(filename='runs/detect/exp/zidane.jpg', width=600)

として、作動状況をチェックします。正常ならOK。

　「Setup the data and the directories」を行います。My\ Driveにmountを実行して、roboflow.zipファイルを読み込み、展開します。


from google.colab import drive
drive.mount('/content/drive')


!cp /content/drive/My\ Drive/YoloV5/roboflow.zip /content/
!unzip /content/roboflow.zip 
!rm /content/roboflow.zip


!ls /content/yolov5

************ result***********
data	    LICENSE		 requirements.txt  train.py	   yolov5s.pt
data.yaml   models		 runs		   tutorial.ipynb
detect.py   README.dataset.txt	 test		   utils
Dockerfile  README.md		 test.py	   valid
hubconf.py  README.roboflow.txt  train		   weights

　データセットがyolov5のなかに、test、train 、valid、data.yamlとして保存されました。

　次に、「Setup the YAML files for training」を行います。


%cat data.yaml

************ result*****
train: ../train/images
val: ../valid/images

nc: 7
names: ['fish', 'jellyfish', 'penguin', 'puffin', 'shark', 'starfish', 'stingray']

　data.yamlはこのような中身になっています。train及びvalのパスが異なっているので、
train: ./train/images
val: ./valid/images
と修正します。Google Colab上での修正は若干複雑なコードが必要です。モデルの設定ファイル yalov5s.yamlを修正します。なぜなら、yolov5s.yamlでは numbe of classes が80 となっているので、ネットワークの構造を同じとしても、roboflowのデータはnumber of classes が７なので、nc:7 とする必要があります。この修正したyamlファイルをcustom_yolov5s.yamlとします。この修正も若干複雑です。


# Below we are changing the configuration so that it becomes compatible to number of classes required in this project
%%writetemplate /content/yolov5/models/custom_yolov5s.yaml

# parameters
nc: {num_classes}  # number of classes  # CHANGED HERE
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, BottleneckCSP, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, BottleneckCSP, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, BottleneckCSP, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, BottleneckCSP, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, BottleneckCSP, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, BottleneckCSP, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, BottleneckCSP, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, BottleneckCSP, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

この部分のコードの説明は省略します。自身のPCで実行するときは、このような修正はエディターを使えばすごく単純です。

　「Training the model」に着手します。


%%time
%cd /content/yolov5/
!python train.py --img 416 --batch 80 --epochs 100 --data './data.yaml' --cfg ./models/custom_yolov5s.yaml --weights ''

---
---
   Epoch   gpu_mem       box       obj       cls     total    labels  img_size
     95/99     6.14G   0.08579   0.06447   0.03307    0.1833       589       416: 100% 6/6 [00:17<00:00,  2.85s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 1/1 [00:02<00:00,  2.67s/it]
                 all         127         909       0.366       0.215      0.0866      0.0207

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
     96/99     6.14G   0.08445   0.06218   0.03317    0.1798       540       416: 100% 6/6 [00:16<00:00,  2.79s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 1/1 [00:02<00:00,  2.18s/it]
                 all         127         909       0.679      0.0885      0.0678      0.0156

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
     97/99     6.14G    0.0846   0.06318   0.03267    0.1805       692       416: 100% 6/6 [00:17<00:00,  2.91s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 1/1 [00:02<00:00,  2.03s/it]
                 all         127         909       0.668      0.0975      0.0669      0.0145

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
     98/99     6.14G   0.08446   0.06337   0.03221      0.18       741       416: 100% 6/6 [00:17<00:00,  2.90s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 1/1 [00:02<00:00,  2.26s/it]
                 all         127         909       0.657       0.112      0.0693      0.0148

     Epoch   gpu_mem       box       obj       cls     total    labels  img_size
     99/99     6.14G   0.08437   0.06652   0.03114     0.182       652       416: 100% 6/6 [00:17<00:00,  2.88s/it]
               Class      Images      Labels           P           R      mAP@.5  mAP@.5:.95: 100% 1/1 [00:06<00:00,  6.02s/it]
                 all         127         909      0.0607       0.291       0.067      0.0145
                fish         127         459       0.071       0.558       0.176       0.039
           jellyfish         127         155      0.0648         0.6       0.137      0.0286
             penguin         127         104      0.0387       0.394      0.0467      0.0113
              puffin         127          74     0.00646       0.027     0.00689     0.00128
               shark         127          57      0.0812       0.386       0.073      0.0162
            starfish         127          27       0.163      0.0741      0.0249     0.00483
            stingray         127          33           0           0     0.00458    0.000619
Optimizer stripped from runs/train/exp/weights/last.pt, 14.8MB
Optimizer stripped from runs/train/exp/weights/best.pt, 14.8MB
100 epochs completed in 0.585 hours.

CPU times: user 4.61 s, sys: 597 ms, total: 5.21 s
Wall time: 35min 27s

　学習が終了するまで約35分かかります。学習の結果、モデルの重みが runs/train/exp/weights/best.pt に保存されます。

　「Running inference on test images」を行います。学習されたモデルの重みを用いて実行します。


# use the best weights!
# Final weights will be by-default stored at /content/yolov5/runs/train/exp/weights/best.pt
%cd /content/yolov5/
!python detect.py --weights /content/yolov5/runs/train/exp2/weights/best.pt --img 416 --conf 0.4 --source ./test/images

　結果を表示します。


#display inference on ALL test images
#this looks much better with longer training above

import glob
from IPython.display import Image, display

for imageName in glob.glob('/content/yolov5/runs/detect/exp2/*.jpg'): #assuming JPG
    display(Image(filename=imageName))
    print("\n")

roboflow.jpg

　検出精度はかなり低いですね。データの個数が少ないのと、エポック数が少ないからだと思われます。

　学習したモデルの重みを保存します。


%cp /content/yolov5/runs/train/exp/weights/best.pt /content/gdrive/My\ Drive/YoloV5

　「Export the weight files for later use」が完了しました。この後の利用の仕方は省略します。

YOLOv4-Darknetモデルの学習

　YOLOv4-Darknet モデルは実行ファイルのdarknetを使用することもあり、学習により多くの時間が必要です。AlexeyABさんのGithub repoにある「 how to train (to detect your custom objects)」に従って準備します。

　自身で用意するデータセットには、obj.names, obj.data, train.txt, test.txt という設定ファイルが必要です。objという名称は任意です。obj.names は物体検出の対象となる物体の名称を設定します。obj.data はデータのクラス数および画像データへのパスを記述します。train.txt は学習用の画像データのリスト、test.txt はテスト用の画像データのリストを記述します。自身の dataset はディレクトリ/data/ の下に以下のように配置します。ディレクトリdata/obj/の下に画像データとそのラベルデータを配置します。
data/obj.names
data/obj.data
data/train.txt
data/test.txt
data/obj/ images & labels are placed in this directory

　以下で用いる画像データセット（血液画像のBCCD dataset）のラベル化された物体の種類（クラス数）は3 なので、obj.data は


classes = 3
train  = data/train.txt
valid  = data/test.txt
names = data/obj.names
backup = backup/

となります。obj.namesは


Platelets
RBC
WBC

です。plateletsは血小板、RBC（Red Blood Cell）は赤血球、WBC（White Blood Cell）は白血球です。/p>

　この「how to train」に従えば、用いるデータに対応して、yolov4-custom.cfg を書き換える必要があります。修正の具体的な内容は以下のように記述されています。

batch を batch=64 に変更
subdivisions を subdivisions=16 に変更
max_batches を classes*2000 、ただし、画像数より多く、 6000 以下に： f.e. max_batches=6000 if you train for 3 classes
steps を max_batches の 80% と 90% に： f.e. steps=4800,5400
network size の設定で、width=416 height=416 に、または、 any value multiple of 32
３つの各[yolo]-layers で classes=80 を使用するデータの物体のクラス数に変更： line no. 970, 1058, 1146
各[YOLO]層の前にある[convolutional]層にある[filters=255]をfilters=(classes + 5)x3に変更：この[convolutional]層は [YOLO]層の直前に位置するものだけ、合計３つ. line no. 963,1051,1139

　yolov4-custom.cfg は非常に長いので、Colabで行うのが複雑なので、Githubのrepoにあるファイルで書き直します。変更したcfg ファイルをcustom_yolov4_detector.cfg と名前をつけて、ディレクトリcfg のなかに保存します。このように修正を加えたYOLOv4のパッケージを私のGithub repoにアップしました。これを gi clone して利用します。


# clone darknet repo
!git clone https://github.com/mashyko/darknet

　OpenCVとGPUを使用できるように Makefile を書き換えて、コンパイルして、darkent 実行ファイルを作成します。


# change makefile to have GPU and OPENCV enabled
%cd darknet
!sed -i 's/OPENCV=0/OPENCV=1/' Makefile
!sed -i 's/GPU=0/GPU=1/' Makefile
!sed -i 's/CUDNN=0/CUDNN=1/' Makefile
!sed -i 's/CUDNN_HALF=0/CUDNN_HALF=1/' Makefile

# make darknet (builds darknet so that you can then use the darknet executable file to run or train object detectors)

!make

　以上で、darknet の実行ファイルがビルドされました。次に、pre-trained weights-file (162 MB): yolov4.conv.137 をダウンロードします。これを学習の時の初期重みとして利用します。


#download the newly released yolov4 ConvNet weights
%cd /content/darknet
!wget https://github.com/AlexeyAB/darknet/releases/download/darknet_yolo_v3_optimal/yolov4.conv.137

　custom dataset を用意します。RoboflowにあるBCCD Datasetをダウンロードします。ダウンロードする前に、このサイトに行って、keyを取得しておく必要があります。


#if you already have YOLO darknet format, you can skip this step
%cd /content/darknet
!curl -L "https://app.roboflow.com/ds/[your key here]" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

　BCCD Dataset を取得したら、obj.data 、obj.names 、および、train.txt、 test.txt ファイルを作成します。


#Set up training file directories for custom dataset
%cd /content/darknet/
%cp train/_darknet.labels data/obj.names
%mkdir data/obj
#copy image and labels
%cp train/*.jpg data/obj/
%cp valid/*.jpg data/obj/

%cp train/*.txt data/obj/
%cp valid/*.txt data/obj/

with open('data/obj.data', 'w') as out:
  out.write('classes = 3\n')
  out.write('train = data/train.txt\n')
  out.write('valid = data/valid.txt\n')
  out.write('names = data/obj.names\n')
  out.write('backup = backup/')

#write train file (just the image list)
import os

with open('data/train.txt', 'w') as out:
  for img in [f for f in os.listdir('train') if f.endswith('jpg')]:
    out.write('data/obj/' + img + '\n')

#write the valid file (just the image list)
import os

with open('data/valid.txt', 'w') as out:
  for img in [f for f in os.listdir('valid') if f.endswith('jpg')]:
    out.write('data/obj/' + img + '\n')

　学習を開始します。


!./darknet detector train data/obj.data cfg/custom-yolov4-detector.cfg yolov4.conv.137 -dont_show -map
#If you get CUDA out of memory adjust subdivisions above!
#adjust max batches down for shorter training above

　1時間以上学習を続けましたが、学習は終了しませんでした。おそらく、数時間かかるでしょう。しかし、1時間程度の学習結果を用いても、検出精度はそれなりにでます。バウンディング・ボックスの精度が低いです。この結果は、このColabで確認できます。

BCCD.png

　YOLOv4-darknetモデルよりも非常に短い時間で学習ができるPytorch版のScaled-YOLOv4の学習を取り上げます。上記のYOLOv5モデルの学習と類似の手順を用いた Scaled-YOLOv4のカスタムデータによる学習の事例を見てみましょう。

　この事例はこのColabにあります。

***********
To be continued
***********

ページの先頭に戻る

PyTorch 入門編のページに行く

トップページに戻る

ご質問、コメントなどはこちらからメール送信して下さい。