Releases · ultralytics/yolov3

14 Nov 21:26

v9.6.0

7eb23e3

v9.6.0 - YOLOv5 v6.0 release compatibility update for YOLOv3 Latest

Latest

This release merges the most recent updates to YOLOv5 🚀 from the October 12th, 2021 YOLOv5 v6.0 release into this Ultralytics YOLOv3 repository. This is part of Ultralytics YOLOv3 maintenance and takes place on every major YOLOv5 release. Full details on the YOLOv5 v6.0 release are below.

https://github.com/ultralytics/yolov5/releases/tag/v6.0

This YOLOv5 v6.0 release incorporates many new features and bug fixes (465 PRs from 73 contributors) since our last release v5.0 in April, brings architecture tweaks, and also introduces new P5 and P6 'Nano' models: YOLOv5n and YOLOv5n6. Nano models maintain the YOLOv5s depth multiple of 0.33 but reduce the YOLOv5s width multiple from 0.50 to 0.25, resulting in ~75% fewer parameters, from 7.5M to 1.9M, ideal for mobile and CPU solutions.

Example usage:
python detect.py --weights yolov5n.pt --img 640    # Nano P5 model trained at --img 640 (28.4 mAP@0.5:0.95)
python detect.py --weights yolov5n6.pt --img 1280  # Nano P6 model trained at --img 1280 (34.0 mAP0.5:0.95)
Important Updates

Roboflow Integration ⭐ NEW: Train YOLOv5 models directly on any Roboflow dataset with our new integration! (ultralytics/yolov5#4975 by @Jacobsolawetz)

YOLOv5n 'Nano' models ⭐ NEW: New smaller YOLOv5n (1.9M params) model below YOLOv5s (7.5M params), exports to 2.1 MB INT8 size, ideal for ultralight mobile solutions. (ultralytics/yolov5#5027 by @glenn-jocher)

TensorFlow and Keras Export: TensorFlow, Keras, TFLite, TF.js model export now fully integrated using python export.py --include saved_model pb tflite tfjs (ultralytics/yolov5#1127 by @zldrobit)

OpenCV DNN: YOLOv5 ONNX models are now compatible with both OpenCV DNN and ONNX Runtime (ultralytics/yolov5#4833 by @SamFC10).

Model Architecture: Updated backbones are slightly smaller, faster and more accurate.

Replacement of Focus() with an equivalent Conv(k=6, s=2, p=2) layer (ultralytics/yolov5#4825 by @thomasbi1) for improved exportability

New SPPF() replacement for SPP() layer for reduced ops (ultralytics/yolov5#4420 by @glenn-jocher)

Reduction in P3 backbone layer C3() repeats from 9 to 6 for improved speeds

Reorder places SPPF() at end of backbone

Reintroduction of shortcut in the last C3() backbone layer

Updated hyperparameters with increased mixup and copy-paste augmentation

New Results

YOLOv5-P5 640 Figure (click to expand)

Figure Notes (click to expand)

COCO AP val denotes mAP@0.5:0.95 metric measured on the 5000-image COCO val2017 dataset over various inference sizes from 256 to 1536.

GPU Speed measures average inference time per image on COCO val2017 dataset using a AWS p3.2xlarge V100 instance at batch-size 32.

EfficientDet data from google/automl at batch size 8.

Reproduce by python val.py --task study --data coco.yaml --iou 0.7 --weights yolov5n6.pt yolov5s6.pt yolov5m6.pt yolov5l6.pt yolov5x6.pt

mAP improves from +0.3% to +1.1% across all models, and ~5% FLOPs reduction produces slight speed improvements and a reduced CUDA memory footprint. Example YOLOv5l before and after metrics:

YOLOv5l
^Large size
^(pixels) mAP^val
0.5:0.95 mAP^val
0.5 Speed
^{CPU b1
(ms)} Speed
^{V100 b1
(ms)} Speed
^{V100 b32
(ms)} params
^(M) FLOPs
^{@640 (B)}

v5.0 (previous) 640 48.2 66.9 457.9 11.6 2.8 47.0 115.4

v6.0 (this release) 640 48.8 67.2 424.5 10.9 2.7 46.5 109.1

Pretrained Checkpoints

Model size
^(pixels) mAP^val
0.5:0.95 mAP^val
0.5 Speed
^{CPU b1
(ms)} Speed
^{V100 b1
(ms)} Speed
^{V100 b32
(ms)} params
^(M) FLOPs
^{@640 (B)}

YOLOv5n 640 28.4 46.0 45 6.3 0.6 1.9 4.5

YOLOv5s 640 37.2 56.0 98 6.4 0.9 7.2 16.5

YOLOv5m 640 45.2 63.9 224 8.2 1.7 21.2 49.0

YOLOv5l 640 48.8 67.2 430 10.1 2.7 46.5 109.1

YOLOv5x 640 50.7 68.9 766 12.1 4.8 86.7 205.7

YOLOv5n6 1280 34.0 50.7 153 8.1 2.1 3.2 4.6

YOLOv5s6 1280 44.5 63.0 385 8.2 3.6 16.8 12.6

YOLOv5m6 1280 51.0 69.0 887 11.1 6.8 35.7 50.0

YOLOv5l6 1280 53.6 71.6 1784 15.8 10.5 76.8 111.4

YOLOv5x6
+ TTA 1280
1536 54.7
55.4 72.4
72.3 3136
- 26.2
- 19.4
- 140.7
- 209.8
-

Table Notes (click to expand)

All checkpoints are trained to 300 epochs with default settings. Nano models use hyp.scratch-low.yaml hyperparameters, all others use hyp.scratch-high.yaml.

mAP^val values are for single-model single-scale on COCO val2017 dataset.
Reproduce by python val.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65

Speed averaged over COCO val images using a AWS p3.2xlarge instance. NMS times (~1 ms/img) not included.
Reproduce by python val.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45

TTA Test Time Augmentation includes reflection and scale augmentations.
Reproduce by python val.py --data coco.yaml --img 1536 --iou 0.7 --augment

Changelog

Changes between previous release and this release: ultralytics/yolov5@v5.0...v6.0
Changes since this release: ultralytics/yolov5@v6.0...HEAD

New Features and Bug Fixes (465)

YOLOv5 v5.0 Release patch 1 by @glenn-jocher in ultralytics/yolov5#2764

Flask REST API Example by @robmarkcole in ultralytics/yolov5#2732

ONNX Simplifier by @glenn-jocher in ultralytics/yolov5#2815

YouTube Bug Fix by @glenn-jocher in ultralytics/yolov5#2818

PyTorch Hub cv2 .save() .show() bug fix by @glenn-jocher in ultralytics/yolov5#2831

Create FUNDING.yml by @glenn-jocher in ultralytics/yolov5#2832

Update FUNDING.yml by @glenn-jocher in ultralytics/yolov5#2833

Fix ONNX dynamic axes export support with onnx simplifier, make onnx simplifier optional by @timstokman in ultralytics/yolov5#2856

Update increment_path() to handle file paths by @glenn-jocher in ultralytics/yolov5#2867

Detection cropping+saving feature addition for detect.py and PyTorch Hub by @Ab-Abdurrahman in ultralytics/yolov5#2827

Implement yaml.safe_load() by @glenn-jocher in ultralytics/yolov5#2876

Cleanup load_image() by @JoshSong in ultralytics/yolov5#2871

bug fix: switched rows and cols for correct detections in confusion matrix by @MichHeilig in ultralytics/yolov5#2883

VisDrone2019-DET Dataset Auto-Download by @glenn-jocher in ultralytics/yolov5#2882

Uppercase model filenames enabled by @r-blmnr in ultralytics/yolov5#2890

ACON activation function by @glenn-jocher in ultralytics/yolov5#2893

Explicit opt function arguments by @fcakyon in ultralytics/yolov5#2817

Update yolo.py by @glenn-jocher in ultralytics/yolov5#2899

Update google_utils.py by @glenn-jocher in ultralytics/yolov5#2900

Add detect.py --hide-conf --hide-labels --line-thickness options by @Ashafix in ultralytics/yolov5#2658

Default optimize_for_mobile() on TorchScript models by @glenn-jocher in ultralytics/yolov5#290...

YOLOv5l ^Large	size ^(pixels)	mAP^val 0.5:0.95	mAP^val 0.5	Speed ^{CPU b1 (ms)}	Speed ^{V100 b1 (ms)}	Speed ^{V100 b32 (ms)}	params ^(M)	FLOPs ^{@640 (B)}
v5.0 (previous)	640	48.2	66.9	457.9	11.6	2.8	47.0	115.4
v6.0 (this release)	640	48.8	67.2	424.5	10.9	2.7	46.5	109.1

Model	size ^(pixels)	mAP^val 0.5:0.95	mAP^val 0.5	Speed ^{CPU b1 (ms)}	Speed ^{V100 b1 (ms)}	Speed ^{V100 b32 (ms)}	params ^(M)	FLOPs ^{@640 (B)}
YOLOv5n	640	28.4	46.0	45	6.3	0.6	1.9	4.5
YOLOv5s	640	37.2	56.0	98	6.4	0.9	7.2	16.5
YOLOv5m	640	45.2	63.9	224	8.2	1.7	21.2	49.0
YOLOv5l	640	48.8	67.2	430	10.1	2.7	46.5	109.1
YOLOv5x	640	50.7	68.9	766	12.1	4.8	86.7	205.7

YOLOv5n6	1280	34.0	50.7	153	8.1	2.1	3.2	4.6
YOLOv5s6	1280	44.5	63.0	385	8.2	3.6	16.8	12.6
YOLOv5m6	1280	51.0	69.0	887	11.1	6.8	35.7	50.0
YOLOv5l6	1280	53.6	71.6	1784	15.8	10.5	76.8	111.4
YOLOv5x6 + TTA	1280 1536	54.7 55.4	72.4 72.3	3136 -	26.2 -	19.4 -	140.7 -	209.8 -

Contributors

timstokman, cgerum, and 80 other contributors

Assets 6

12 Apr 16:01

glenn-jocher

v9.5.0

8eb4cde

v9.5.0 - YOLOv5 v5.0 release compatibility update for YOLOv3

This YOLOv3 release merges the most recent updates to YOLOv5 featured in the April 11th, 2021 YOLOv5 v5.0 release into this repository. This is part of routine Ultralytics maintenance and takes place on every major YOLOv5 release. Full details on the YOLOv5 v5.0 release is below.

https://github.com/ultralytics/yolov5/releases/tag/v5.0

This release implements YOLOv5-P6 models and retrained YOLOv5-P5 models:

YOLOv5-P5 models (same architecture as v4.0 release): 3 output layers P3, P4, P5 at strides 8, 16, 32, trained at --img 640

YOLOv5-P6 models: 4 output layers P3, P4, P5, P6 at strides 8, 16, 32, 64 trained at --img 1280

Example usage:
# Command Line
python detect.py --weights yolov5m.pt --img 640  # P5 model at 640
python detect.py --weights yolov5m6.pt --img 640  # P6 model at 640
python detect.py --weights yolov5m6.pt --img 1280  # P6 model at 1280
# PyTorch Hub
model = torch.hub.load('ultralytics/yolov5', 'yolov5m6')  #  P6 model
results = model(imgs, size=1280)  # inference at 1280
All model sizes YOLOv5s/m/l/x are now available in P5 and P6 architectures:
python detect.py --weights yolov5s.pt  # P5 models
                           yolov5m.pt
                           yolov5l.pt
                           yolov5x.pt
                           yolov5s6.pt  # P6 models
                           yolov5m6.pt
                           yolov5l6.pt
                           yolov5x6.pt
Notable Updates

YouTube Inference: Direct inference from YouTube videos, i.e. python detect.py --source 'https://youtu.be/NUsoVlDFqZg'. Live streaming videos and normal videos supported. (ultralytics/yolov5#2752)

AWS Integration: Amazon AWS integration and new AWS Quickstart Guide for simple EC2 instance YOLOv5 training and resuming of interrupted Spot instances. (ultralytics/yolov5#2185)

Supervise.ly Integration: New integration with the Supervisely Ecosystem for training and deploying YOLOv5 models with Supervise.ly (ultralytics/yolov5#2518)

Improved W&B Integration: Allows saving datasets and models directly to Weights & Biases. This allows for --resume directly from W&B (useful for temporary environments like Colab), as well as enhanced visualization tools. See this blog by @AyushExel for details. (ultralytics/yolov5#2125)

Updated Results

P6 models include an extra P6/64 output layer for detection of larger objects, and benefit the most from training at higher resolution. For this reason we trained all P5 models at 640, and all P6 models at 1280.

YOLOv5-P5 640 Figure (click to expand)

Figure Notes (click to expand)

GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS.

EfficientDet data from google/automl at batch size 8.

April 11, 2021: v5.0 release: YOLOv5-P6 1280 models, AWS, Supervise.ly and YouTube integrations.

January 5, 2021: v4.0 release: nn.SiLU() activations, Weights & Biases logging, PyTorch Hub integration.

August 13, 2020: v3.0 release: nn.Hardswish() activations, data autodownload, native AMP.

July 23, 2020: v2.0 release: improved model definition, training and mAP.

Pretrained Checkpoints

Model size
^(pixels) mAP^val
0.5:0.95 mAP^{test
0.5:0.95} mAP^val
0.5 Speed
^{V100 (ms)} params
^(M) FLOPS
^{640 (B)}

YOLOv5s 640 36.7 36.7 55.4 2.0 7.3 17.0

YOLOv5m 640 44.5 44.5 63.3 2.7 21.4 51.3

YOLOv5l 640 48.2 48.2 66.9 3.8 47.0 115.4

YOLOv5x 640 50.4 50.4 68.8 6.1 87.7 218.8

YOLOv5s6 1280 43.3 43.3 61.9 4.3 12.7 17.4

YOLOv5m6 1280 50.5 50.5 68.7 8.4 35.9 52.4

YOLOv5l6 1280 53.4 53.4 71.1 12.3 77.2 117.7

YOLOv5x6 1280 54.4 54.4 72.0 22.4 141.8 222.9

YOLOv5x6 TTA 1280 55.0 55.0 72.0 70.8 - -

Table Notes (click to expand)

AP^test denotes COCO test-dev2017 server results, all other AP results denote val2017 accuracy.

AP values are for single-model single-scale unless otherwise noted. Reproduce mAP by python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65

Speed_GPU averaged over 5000 COCO val2017 images using a GCP n1-standard-16 V100 instance, and includes FP16 inference, postprocessing and NMS. Reproduce speed by python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45

All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).

Test Time Augmentation (TTA) includes reflection and scale augmentation. Reproduce TTA by python test.py --data coco.yaml --img 1536 --iou 0.7 --augment

Changelog

Changes between previous release and this release: ultralytics/yolov5@v4.0...v5.0
Changes since this release: ultralytics/yolov5@v5.0...HEAD

Click a section below to expand details:

Implemented Enhancements (26)

Return predictions as json #2703

Single channel image training？ #2609

Images in MPO Format are considered corrupted #2446

Improve Validation Visualization #2384

Add ASFF (three fuse feature layers) int the Head for V5(s,m,l,x) #2348

Dear author, can you provide a visualization scheme for YOLOV5 feature graphs during detect.py? Thank you! #2259

Dataloader #2201

Update Train Custom Data wiki page #2187

Multi-class NMS #2162

💡Idea: Mosaic cropping using segmentation labels #2151

Improving Confusion Matrix Interpretability: FP and FN vectors should be switched to align with Predicted and True axis #2071

Interpreting model YoloV5 by Grad-cam #2065

Output optimal confidence threshold based on PR curve #2048

is it valuable that add --cache-images option to detect.py? #2004

I want to change the anchor box to anchor circles, where do you think the change to be made ? #1987

Support for imgaug #1954

Any plan for Knowledge Distillation? #1762

Is there a wasy to run detections on a video/webcam/rtrsp, etc EVERY x SECONDS? #1742

Can yolov5 support rotated target detection? #1728

Deploying yolov5 to TorchServe (GPU compatible) #1681

Why diffrent colors of bboxs? #1638

Yet ano...

Model	size ^(pixels)	mAP^val 0.5:0.95	mAP^{test 0.5:0.95}	mAP^val 0.5	Speed ^{V100 (ms)}	params ^(M)	FLOPS ^{640 (B)}
YOLOv5s	640	36.7	36.7	55.4	2.0	7.3	17.0
YOLOv5m	640	44.5	44.5	63.3	2.7	21.4	51.3
YOLOv5l	640	48.2	48.2	66.9	3.8	47.0	115.4
YOLOv5x	640	50.4	50.4	68.8	6.1	87.7	218.8

YOLOv5s6	1280	43.3	43.3	61.9	4.3	12.7	17.4
YOLOv5m6	1280	50.5	50.5	68.7	8.4	35.9	52.4
YOLOv5l6	1280	53.4	53.4	71.1	12.3	77.2	117.7
YOLOv5x6	1280	54.4	54.4	72.0	22.4	141.8	222.9

YOLOv5x6 TTA	1280	55.0	55.0	72.0	70.8	-	-

Assets 6

13 Jan 07:10

glenn-jocher

v9.1

166a4d5

v9.1 - YOLOv5 Forward Compatibility Updates

This release is a minor update implementing numerous bug fixes, feature additions and performance improvements from https://github.com/ultralytics/yolov5 to this repo. Models remain unchanged from v9.0 release.

Branch Notice

The ultralytics/yolov3 repository is now divided into two branches:

Master branch: Forward-compatible with all YOLOv5 models and methods (recommended).

$ git clone https://github.com/ultralytics/yolov3  # master branch (default)

Archive branch: Backwards-compatible with original darknet *.cfg models (⚠️ no longer maintained).

$ git clone -b archive https://github.com/ultralytics/yolov3  # archive branch

** GPU Speed measures end-to-end time per image averaged over 5000 COCO val2017 images using a V100 GPU with batch size 32, and includes image preprocessing, PyTorch FP16 inference, postprocessing and NMS. EfficientDet data from google/automl at batch size 8.

Pretrained Checkpoints

Model	AP^val	AP^test	AP₅₀	Speed_GPU	FPS_GPU	params	FLOPS
YOLOv3	43.3	43.3	63.0	4.8ms	208	61.9M	156.4B
YOLOv3-SPP	44.3	44.3	64.6	4.9ms	204	63.0M	157.0B
YOLOv3-tiny	17.6	34.9	34.9	1.7ms	588	8.9M	13.3B

** AP^test denotes COCO test-dev2017 server results, all other AP results denote val2017 accuracy.
** All AP numbers are for single-model single-scale without ensemble or TTA. Reproduce mAP by python test.py --data coco.yaml --img 640 --conf 0.001 --iou 0.65
** Speed_GPU averaged over 5000 COCO val2017 images using a GCP n1-standard-16 V100 instance, and includes image preprocessing, FP16 inference, postprocessing and NMS. NMS is 1-2ms/img. Reproduce speed by python test.py --data coco.yaml --img 640 --conf 0.25 --iou 0.45
** All checkpoints are trained to 300 epochs with default settings and hyperparameters (no autoaugmentation).
** Test Time Augmentation (TTA) runs at 3 image sizes. Reproduce TTA by python test.py --data coco.yaml --img 832 --iou 0.65 --augment

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Assets 5

06 Dec 17:04

glenn-jocher

v9.0

e285034

v9.0 - YOLOv5 Forward Compatibility Release

This release is a major update to the https://github.com/ultralytics/yolov3 repository that brings forward-compatibility with YOLOv5, and incorporates numerous bug fixes, feature additions and performance improvements from https://github.com/ultralytics/yolov5 to this repo.

Branch Notice

The ultralytics/yolov3 repository is now divided into two branches:

Master branch: Forward-compatible with all YOLOv5 models and methods (recommended).

$ git clone https://github.com/ultralytics/yolov3  # master branch (default)

Archive branch: Backwards-compatible with original darknet *.cfg models (⚠️ no longer maintained).

$ git clone -b archive https://github.com/ultralytics/yolov3  # archive branch

Pretrained Checkpoints

Model	AP^val	AP^test	AP₅₀	Speed_GPU	FPS_GPU	params	FLOPS
YOLOv3	43.3	43.3	63.0	4.8ms	208	61.9M	156.4B
YOLOv3-SPP	44.3	44.3	64.6	4.9ms	204	63.0M	157.0B
YOLOv3-tiny	17.6	34.9	34.9	1.7ms	588	8.9M	13.3B

Requirements

Python 3.8 or later with all requirements.txt dependencies installed, including torch>=1.7. To install run:

$ pip install -r requirements.txt

Assets 5

18 Nov 23:20

glenn-jocher

98068ef

v8 - Final Darknet Compatible Release

This is the final release of the darknet-compatible version of the https://github.com/ultralytics/yolov3 repository. This release is backwards-compatible with darknet *.cfg files for model configuration.

All pytorch (*.pt) and darknet (*.weights) models/backbones available are attached to this release in the Assets section below.

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

Various

Added Functionality

Various

Speed

https://cloud.google.com/deep-learning-vm/
Machine type: preemptible n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Skylake
GPUs: K80 ($0.14/hr), T4 ($0.11/hr), V100 ($0.74/hr) CUDA with Nvidia Apex FP16/32
HDD: 300 GB SSD
Dataset: COCO train 2014 (117,263 images)
Model: yolov3-spp.cfg
Command: python3 train.py --data coco2017.data --img 416 --batch 32

GPU	n	`--batch-size`	img/s	epoch time	epoch cost
K80	1	32 x 2	11	175 min	$0.41
T4	1 2	32 x 2 64 x 1	41 61	48 min 32 min	$0.09 $0.11
V100	1 2	32 x 2 64 x 1	122 178	16 min 11 min	$0.21 $0.28
2080Ti	1 2	32 x 2 64 x 1	81 140	24 min 14 min	- -

mAP

	Size	COCO mAP @0.5...0.95	COCO mAP @0.5
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	320	14.0 28.7 30.5 37.7	29.1 51.8 52.3 56.8
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	416	16.0 31.2 33.9 41.2	33.0 55.4 56.9 60.6
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	512	16.6 32.7 35.6 42.6	34.9 57.7 59.5 62.4
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	608	16.6 33.1 37.0 43.1	35.4 58.2 60.7 62.8

TODO

Assets 11

04 May 19:24

glenn-jocher

5d42cc1

43.1mAP@0.5:0.95 on COCO2014

This release requires PyTorch >= v1.4 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

Various

Added Functionality

Improved training and test ground truth and prediction plotting. #1114
Increased augmentation speed. #1110
Improved Tensorboard integration.
Auto class hyperparameter update based on dataset class count.
Inference time augmentation option added now with --augment argument in test.py and detect.py.
Rectangular training with --rect argument in train.py

Speed

GPU	n	`--batch-size`	img/s	epoch time	epoch cost
K80	1	32 x 2	11	175 min	$0.41
T4	1 2	32 x 2 64 x 1	41 61	48 min 32 min	$0.09 $0.11
V100	1 2	32 x 2 64 x 1	122 178	16 min 11 min	$0.21 $0.28
2080Ti	1 2	32 x 2 64 x 1	81 140	24 min 14 min	- -

mAP

	Size	COCO mAP @0.5...0.95	COCO mAP @0.5
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	320	14.0 28.7 30.5 37.7	29.1 51.8 52.3 56.8
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	416	16.0 31.2 33.9 41.2	33.0 55.4 56.9 60.6
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	512	16.6 32.7 35.6 42.6	34.9 57.7 59.5 62.4
YOLOv3-tiny YOLOv3 YOLOv3-SPP YOLOv3-SPP-ultralytics	608	16.6 33.1 37.0 43.1	35.4 58.2 60.7 62.8

TODO (help and PR's welcome!)

Add iOS App inference to photos and videos in Camera Roll, as well as 'Flexible', or at least rectangular inference. #224

Assets 2

24 Apr 15:52

glenn-jocher

fbf0014

Rectangular Inference, Conv2d + Batchnorm2d Layer Fusion

This release requires PyTorch >= v1.0.0 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

NMS now screens out nan and inf values which caused it to hang during some edge cases.

Added Functionality

Rectangular Inference. detect.py now automatically processes images, videos and webcam feeds using rectangular inference, letterboxing to the minimum viable 32-multiple. This speeds up inference by up to 40% on HD video: #232
Conv2d + Batchnorm2d Layer Fusion: detect.py now automatically fuses the Conv2d and Batchnorm2d layers in the model before running inference. This speeds up inference by about 5-10%. #224
Hyperparameters all parameterized and grouped togethor in train.py now. Genetic Hyperparameter Evolution code added to train.py.

Performance

https://cloud.google.com/deep-learning-vm/
Machine type: n1-standard-8 (8 vCPUs, 30 GB memory)
CPU platform: Intel Skylake
GPUs: K80 ($0.198/hr), P4 ($0.279/hr), T4 ($0.353/hr), P100 ($0.493/hr), V100 ($0.803/hr)
HDD: 100 GB SSD
Dataset: COCO train 2014

GPUs	`batch_size`	batch time	epoch time	epoch cost
	(images)	(s/batch)
1 K80	16	1.43s	175min	$0.58
1 P4	8	0.51s	125min	$0.58
1 T4	16	0.78s	94min	$0.55
1 P100	16	0.39s	48min	$0.39
2 P100	32	0.48s	29min	$0.47
4 P100	64	0.65s	20min	$0.65
1 V100	16	0.25s	31min	$0.41
2 V100	32	0.29s	18min	$0.48
4 V100	64	0.41s	13min	$0.70
8 V100	128	0.49s	7min	$0.80

TODO (help and PR's welcome!)

Add iOS App inference to photos and videos in Camera Roll, as well as 'Flexible', or at least rectangular inference. #224
Add parameter to switch between 'darknet' and 'power' wh methods. #168
YAPF linting (including possible wrap to PEP8 79 character-line standard) #88.
Resolve mAP bug: #222
Rectangular training. #232
Genetic Hyperparameter Evolution. HELP NEEDED HERE. If you have available hardware please contact us, as we need help expanding our hyperparameter search, for the benefit of everyone!

Assets 2

03 Apr 10:38

glenn-jocher

5170cd3

Video Inference, Transfer Learning Improvements

This release requires PyTorch >= v1.0.0 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

None

Added Functionality

Video Inference. detect.py now automatically processes both images and videos. Image and video results are saved in their respective formats now (video inference saves new videos).
Transfer learning now operates automatically regardless of yolo layer size #152.

Performance

GPUs	`batch_size`	batch time	epoch time	epoch cost
	(images)	(s/batch)
1 K80	16	1.43s	175min	$0.58
1 P4	8	0.51s	125min	$0.58
1 T4	16	0.78s	94min	$0.55
1 P100	16	0.39s	48min	$0.39
2 P100	32	0.48s	29min	$0.47
4 P100	64	0.65s	20min	$0.65
1 V100	16	0.25s	31min	$0.41
2 V100	32	0.29s	18min	$0.48
4 V100	64	0.41s	13min	$0.70
8 V100	128	0.49s	7min	$0.80

TODO (help and PR's welcome!)

Add iOS App inference to photos and videos in Camera Roll.
Add parameter to switch between 'darknet' and 'power' wh methods. #168
YAPF linting (including possible wrap to PEP8 79 character-line standard) #88.
Hyperparameter search for loss function constants.

Assets 2

31 Mar 18:36

glenn-jocher

b56952d

mAP Improvements Past Darknet, Multithreaded DataLoader

This release requires PyTorch >= v1.0.0 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

Multi GPU support is now working correctly #21.
test.py now natively outputs the same results as pycocotools to within 1% under most circumstances #2

Added Functionality

Dataloader is now multithread. #141
mAP improved by smarter NMS. mAP now exceeds darknet mAP by a small amount in all image sizes 320-608.

	ultralytics/yolov3 with `pycocotools`	darknet/yolov3
YOLOv3-320	51.8	51.5
YOLOv3-416	55.4	55.3
YOLOv3-608	58.2	57.9

sudo rm -rf yolov3 && git clone https://github.com/ultralytics/yolov3
# bash yolov3/data/get_coco_dataset.sh
sudo rm -rf cocoapi && git clone https://github.com/cocodataset/cocoapi && cd cocoapi/PythonAPI && make && cd ../.. && cp -r cocoapi/PythonAPI/pycocotools yolov3
cd yolov3

python3 test.py --save-json --conf-thres 0.001 --img-size 416
Namespace(batch_size=32, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=416, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3.weights')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
      Image      Total          P          R        mAP
Calculating mAP: 100%|█████████████████████████████████| 157/157 [08:34<00:00,  2.53s/it]
       5000       5000     0.0896      0.756      0.555
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.312
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.554
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.317
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.145
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.343
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.452
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.268
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.411
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.435
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.244
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.477
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.587
 
python3 test.py --save-json --conf-thres 0.001 --img-size 608 --batch-size 16
Namespace(batch_size=16, cfg='cfg/yolov3.cfg', conf_thres=0.001, data_cfg='cfg/coco.data', img_size=608, iou_thres=0.5, nms_thres=0.5, save_json=True, weights='weights/yolov3.weights')
Using cuda _CudaDeviceProperties(name='Tesla V100-SXM2-16GB', major=7, minor=0, total_memory=16130MB, multi_processor_count=80)
      Image      Total          P          R        mAP
Calculating mAP: 100%|█████████████████████████████████| 313/313 [08:54<00:00,  1.55s/it]
       5000       5000     0.0966      0.786      0.579
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.331
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.582
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.344
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.198
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.362
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.427
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.281
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.437
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.463
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.309
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.494
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.577

Performance

mAP computation is much slower now than before using default settings, as --conf-thres 0.001 captures many boxes that all must be passed through NMS. On a V100 test.py runs in about 8 minutes
Training speed is improved substantially compared to v3.0 due to the addition of the multithreaded PyTorch dataloader.

GPUs	`batch_size`	batch time	epoch time	epoch cost
	(images)	(s/batch)
1 K80	16	1.43s	175min	$0.58
1 P4	8	0.51s	125min	$0.58
1 T4	16	0.78s	94min	$0.55
1 P100	16	0.39s	48min	$0.39
2 P100	32	0.48s	29min	$0.47
4 P100	64	0.65s	20min	$0.65
1 V100	16	0.25s	31min	$0.41
2 V100	32	0.29s	18min	$0.48
4 V100	64	0.41s	13min	$0.70
8 V100	128	0.49s	7min	$0.80

TODO (help and PR's welcome!)

Video Inference. Pass a video file to detect.py.
YAPF linting (including possible wrap to PEP8 79 character-line standard) #88.
Add iOS App inference to photos and videos in Camera Roll.
Add parameter to switch between 'darknet' and 'power' wh methods. #168
Hyperparameter search for loss function constants.

Assets 2

20 Mar 23:19

glenn-jocher

v3.0

d1a1ea2

Multi-GPU, Tutorials, Official COCO mAP Support

This release requires PyTorch >= v1.0.0 to function properly. Please install the latest version from https://github.com/pytorch/pytorch/releases

Breaking Changes

There are no breaking changes in this release.

Bug Fixes

Multi GPU support #21.

Added Functionality

Tutorial created: https://github.com/ultralytics/yolov3/wiki/Example:-Transfer-Learning
Tutorial created: https://github.com/ultralytics/yolov3/wiki/Example:-Train-Single-Image
Tutorial created: https://github.com/ultralytics/yolov3/wiki/Example:-Train-Single-Class
Tutorial created: https://github.com/ultralytics/yolov3/wiki/Train-Custom-Data
test.py optionally outputs pycocotools compatible json files now with the --save-json flag, and computes official COCO mAP using pycocotools. Output is verified against official darknet results from https://arxiv.org/abs/1804.02767. #2 (comment).

	ultralytics/yolov3 mAP	darknet mAP
YOLOv3-320	51.3	51.5
YOLOv3-416	54.9	55.3
YOLOv3-608	57.9	57.9

Performance

10% improvement in training speed via code optimization. Performance should be further improved by multithreading the dataloader (on the TODO list).

https://cloud.google.com/deep-learning-vm/
Machine type: n1-highmem-4 (4 vCPUs, 26 GB memory)
CPU platform: Intel Skylake
GPUs: 1-4 x NVIDIA Tesla P100
HDD: 100 GB SSD

GPUs	`batch_size`	speed	COCO epoch
(P100)	(images)	(s/batch)	(min/epoch)
1	16	0.54s	66min
2	32	0.99s	61min
4	64	1.61s	49min

TODO (help and PR's welcome!)

Dataloader should enable multithread. Single thread loading takes about 230 ms per batch currently, out of total batch time of 550 ms (40%). Low GPU utilization reported. #141
test.py should natively output the same results as pycocotools #2
Video Inference. Pass a video file to detect.py.
YAPF linting (including possible wrap to PEP8 79 character-line standard) #88.
Add iOS App inference to photos and videos in Camera Roll.
Add parameter to switch between 'darknet' and 'power' wh methods.

Assets 2

Releases: ultralytics/yolov3

v9.6.0 - YOLOv5 v6.0 release compatibility update for YOLOv3

Important Updates

New Results

Pretrained Checkpoints

Changelog

Contributors

v9.5.0 - YOLOv5 v5.0 release compatibility update for YOLOv3

Notable Updates

Updated Results

Pretrained Checkpoints

Changelog

v9.1 - YOLOv5 Forward Compatibility Updates

Branch Notice

Pretrained Checkpoints

Requirements

v9.0 - YOLOv5 Forward Compatibility Release

Branch Notice

Pretrained Checkpoints

Requirements

v8 - Final Darknet Compatible Release

Breaking Changes

Bug Fixes

Added Functionality

Speed

mAP

TODO

43.1mAP@0.5:0.95 on COCO2014

Breaking Changes

Bug Fixes

Added Functionality

Speed

mAP

TODO (help and PR's welcome!)

Rectangular Inference, Conv2d + Batchnorm2d Layer Fusion

Breaking Changes

Bug Fixes

Added Functionality

Performance

TODO (help and PR's welcome!)

Video Inference, Transfer Learning Improvements

Breaking Changes

Bug Fixes

Added Functionality

Performance

TODO (help and PR's welcome!)

mAP Improvements Past Darknet, Multithreaded DataLoader

Breaking Changes

Bug Fixes

Added Functionality

Performance

TODO (help and PR's welcome!)

Multi-GPU, Tutorials, Official COCO mAP Support

Breaking Changes

Bug Fixes

Added Functionality

Performance

TODO (help and PR's welcome!)