
CocoDetection dataset incompatible with Faster R-CNN model Training and mAP calculation #8270

Open
anirudh6415 opened this issue Feb 12, 2024 · 2 comments


anirudh6415 commented Feb 12, 2024

🐛 Describe the bug

I would like to thank you for the Object Detection Finetuning tutorial. The CocoDetection dataset appears to be incompatible with the Faster R-CNN model; I have been using the transforms-v2-end-to-end-object-detection-segmentation-example for COCO detection. The TorchVision Object Detection Finetuning tutorial specifies the dataset format required by the Mask R-CNN model: the dataset's __getitem__ method should return an image and a target with the fields boxes, labels, area, etc. CocoDetection instead returns the raw COCO annotations as the target.

After training and evaluation (engine.evaluate), the mAP scores are always 0 for every epoch.
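For reference, raw CocoDetection targets are lists of COCO annotation dicts with XYWH boxes, while torchvision detection models expect one dict per image with absolute XYXY boxes and integer labels. A minimal sketch of that conversion (the annotation values below are made up for illustration; coco_to_frcnn_target is a hypothetical helper, not part of torchvision):

```python
import torch

# Raw CocoDetection target: a list of COCO annotation dicts (hypothetical values).
coco_target = [
    {"id": 1, "image_id": 42, "category_id": 13,
     "bbox": [10.0, 20.0, 30.0, 40.0],  # COCO boxes are [x, y, width, height]
     "area": 1200.0, "iscrowd": 0},
]

def coco_to_frcnn_target(anns):
    # Convert a list of COCO annotations into the dict format the
    # detection models consume: XYXY float boxes and int64 labels.
    boxes = torch.tensor([a["bbox"] for a in anns], dtype=torch.float32)
    boxes[:, 2:] += boxes[:, :2]  # XYWH -> XYXY
    labels = torch.tensor([a["category_id"] for a in anns], dtype=torch.int64)
    return {"boxes": boxes, "labels": labels}

target = coco_to_frcnn_target(coco_target)
print(target["boxes"])   # tensor([[10., 20., 40., 60.]])
print(target["labels"])  # tensor([13])
```

This is essentially what wrap_dataset_for_transforms_v2 is meant to take care of, which is why the raw, unwrapped targets confuse the training loop.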

Dataset:

import torch
from torchvision import tv_tensors
from torchvision.datasets import CocoDetection, wrap_dataset_for_transforms_v2
from torchvision.transforms import v2

transforms = v2.Compose(
    [
        v2.ToPILImage(),
        v2.Resize(512),
        v2.RandomPhotometricDistort(p=1),
        v2.RandomZoomOut(fill={tv_tensors.Image: (123, 117, 104), "others": 0}),
        v2.RandomIoUCrop(),
        # Note: the transforms v2 docs recommend following RandomIoUCrop
        # with v2.SanitizeBoundingBoxes() to drop degenerate boxes.
        v2.RandomHorizontalFlip(p=1),
        v2.ToTensor(),
        v2.ToDtype(torch.float32, scale=True),
    ]
)
dataset = CocoDetection(root_dir, annotation_file, transforms=transforms)
dataset = wrap_dataset_for_transforms_v2(dataset, target_keys=("boxes", "labels"))
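Feeding the wrapped dataset to a DataLoader also requires a collate function that keeps variable-sized images and per-image targets as tuples rather than stacking them into one tensor. A minimal sketch with a synthetic stand-in dataset (the tensors below are made up, not loaded from COCO):

```python
import torch
from torch.utils.data import DataLoader

# Detection collate_fn: keep (image, target) pairs as parallel tuples,
# since image sizes and box counts differ per sample and cannot be stacked.
def collate_fn(batch):
    return tuple(zip(*batch))

# Hypothetical tiny dataset standing in for the wrapped CocoDetection.
samples = [
    (torch.rand(3, 64, 64), {"boxes": torch.tensor([[1., 2., 10., 12.]]),
                             "labels": torch.tensor([1])}),
    (torch.rand(3, 80, 48), {"boxes": torch.empty(0, 4),
                             "labels": torch.empty(0, dtype=torch.int64)}),
]
loader = DataLoader(samples, batch_size=2, collate_fn=collate_fn)
images, targets = next(iter(loader))  # tuple of 2 images, tuple of 2 target dicts
```

The models in torchvision.models.detection accept exactly this shape: a list/tuple of images and a list/tuple of target dicts.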

Expected behavior

Faster R-CNN model output from evaluation:

45514: {'boxes': tensor([[ 40.1482, 48.6490, 46.0760, 50.4945],
[ 56.4980, 98.9506, 59.9611, 99.8110],
[ 16.5955, 50.3514, 20.7766, 51.7256],
[ 7.7093, 49.7779, 9.9628, 51.1177],
[ 23.2416, 115.2277, 27.9833, 116.0603],
[ 6.2100, 43.7826, 12.1565, 44.4718],
[ 84.3244, 92.1326, 89.6173, 92.8679],
[ 27.8029, 111.4202, 33.1342, 112.3421],
[ 6.4772, 83.6187, 11.6571, 85.1347],
[ 12.0571, 57.7298, 17.3467, 58.6374],
[ 52.0026, 100.2936, 55.6111, 101.0397],
[ 32.9334, 95.2229, 36.7513, 96.0473],
[ 11.9714, 50.6148, 16.0876, 51.3073],
[ 36.5298, 99.2084, 40.1270, 100.3034],
[ 56.8915, 95.4639, 59.8486, 96.2372],
[ 29.7059, 95.3435, 34.5201, 96.1218],
[ 83.4291, 96.4723, 89.3993, 97.1828],
[ 80.5682, 114.7297, 86.4728, 115.2060],
[ 55.3361, 96.1351, 57.5648, 97.3579],
[ 87.8969, 120.9048, 91.9940, 122.8857],
[ 79.1790, 95.7387, 83.8181, 96.0961],
[ 5.2113, 81.8440, 12.3168, 82.5170],
[ 11.9503, 9.1723, 15.8027, 10.5436],
[ 43.5947, 115.0965, 46.6917, 116.0323],
[ 36.3678, 44.5311, 45.5149, 45.0957],
[ 64.0280, 91.6801, 70.0944, 92.6666],
[ 34.9408, 48.2833, 39.4942, 48.7989],
[ 44.6860, 34.5384, 48.7593, 35.4988],
[ 8.5666, 52.0507, 9.7412, 53.2962],
[ 59.0582, 114.7045, 62.3767, 115.6113],
[ 42.6140, 95.4168, 47.2140, 95.9096],
[ 51.6593, 116.3869, 54.5132, 117.3841],
[ 10.2391, 8.1375, 15.3591, 9.5619],
[ 79.1855, 103.1416, 83.1228, 104.2892],
[ 11.6779, 115.0183, 15.2959, 115.6937],
[ 92.2911, 64.1361, 97.0701, 65.2341],
[ 77.5316, 94.2480, 88.9283, 103.3851],
[ 20.0655, 29.8961, 25.1227, 31.4927],
[ 41.8090, 91.6322, 66.6452, 114.4692],
[ 1.7047, 99.8426, 5.1214, 100.9323],
[ 21.0609, 30.4280, 25.2658, 31.7394],
[ 77.2151, 111.6185, 83.4417, 112.1318],
[ 21.3434, 105.4950, 25.2204, 106.5172],
[ 0.0000, 100.4384, 44.0517, 127.3968],
[ 37.8296, 43.6599, 41.4970, 44.9153],
[101.9358, 28.7851, 107.8658, 29.7577],
[ 84.6480, 112.4950, 89.6441, 113.4983],
[ 32.0187, 48.2305, 34.6299, 49.9225],
[ 21.0508, 96.1484, 49.0644, 115.7840],
[ 78.7586, 91.0307, 83.5991, 92.0456],
[ 7.4563, 99.3216, 11.9333, 100.4296],
[ 41.8862, 39.9427, 48.0095, 40.6046],
[ 64.2320, 110.5276, 69.4041, 111.1726],
[ 48.5087, 35.6968, 51.0966, 36.3254],
[ 69.9470, 80.5252, 77.2012, 81.4588],
[ 64.5411, 5.3913, 69.1044, 6.1687],
[ 14.9313, 118.7279, 18.5300, 119.9807],
[ 67.1189, 75.9650, 74.1020, 76.6732],
[104.7447, 31.4134, 109.5322, 32.3514],
[ 68.2009, 112.8180, 71.6182, 113.7542],
[ 77.4721, 33.7746, 80.8393, 34.7606],
[ 9.3352, 80.4828, 12.7890, 82.0704],
[ 65.1386, 107.5109, 71.5020, 108.5092],
[ 0.0000, 95.9446, 24.0685, 127.6092],
[ 43.3848, 100.4263, 46.2075, 101.2998],
[ 14.2563, 116.5107, 17.7641, 117.3038],
[ 75.3176, 28.1107, 79.5012, 29.0837],
[ 21.1844, 99.2131, 24.2272, 100.0415],
[ 59.2131, 7.5139, 63.0848, 8.2906],
[ 0.0000, 48.6410, 43.9125, 57.1178],
[ 92.4588, 61.7449, 97.9917, 62.5649],
[ 0.0000, 40.3103, 21.4952, 74.7319],
[ 26.3296, 18.2271, 30.6895, 19.6554],
[ 24.1920, 8.1525, 29.7535, 9.5155],
[ 0.0000, 82.6814, 1.6050, 86.1069],
[ 69.8429, 91.4722, 74.5880, 92.6022],
[ 40.0346, 106.1212, 43.5103, 107.1371],
[ 77.5447, 109.2828, 81.9013, 110.5845],
[ 68.1803, 44.8517, 73.7433, 45.6382],
[ 0.0000, 84.0534, 3.4056, 85.0159],
[ 38.7503, 35.8580, 56.7992, 47.4848],
[ 50.1666, 31.5740, 53.6334, 32.6867],
[ 31.0113, 101.1863, 33.5362, 101.8402],
[ 53.7563, 9.7722, 55.8778, 10.9179],
[ 51.0325, 13.5929, 54.7412, 14.6228],
[ 18.1654, 104.8237, 21.8600, 105.6227],
[ 19.5623, 35.9696, 24.1356, 37.0714],
[ 69.2776, 28.0173, 88.3530, 39.2125],
[ 75.1365, 115.5374, 77.8445, 116.9997],
[ 31.0881, 58.5975, 34.6037, 59.4643],
[ 1.6351, 80.1350, 6.1082, 81.3295],
[ 22.8064, 117.3966, 62.7837, 127.8345],
[ 63.6129, 70.9242, 69.0646, 71.9814],
[ 3.4624, 87.2172, 8.6216, 88.5247],
[ 55.8403, 28.8055, 59.7083, 30.4423],
[ 26.2743, 18.7395, 31.5674, 19.9733],
[ 26.8567, 117.3554, 32.5164, 117.9245],
[ 55.5966, 104.9360, 58.4963, 105.7098],
[ 88.1490, 100.0630, 91.5376, 101.2722],
[ 61.8169, 10.4709, 64.8416, 11.5875]], device='cuda:0'), 'labels': tensor([13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13], device='cuda:0'), 'scores': tensor([0.3700, 0.3590, 0.3542, 0.3493, 0.3464, 0.3439, 0.3411, 0.3396, 0.3373,
0.3338, 0.3312, 0.3304, 0.3297, 0.3272, 0.3269, 0.3263, 0.3233, 0.3213,
0.3156, 0.3152, 0.3146, 0.3145, 0.3119, 0.3090, 0.3085, 0.3070, 0.3041,
0.3034, 0.3016, 0.3015, 0.3014, 0.3011, 0.3009, 0.2997, 0.2996, 0.2984,
0.2978, 0.2973, 0.2968, 0.2955, 0.2954, 0.2952, 0.2939, 0.2928, 0.2921,
0.2914, 0.2912, 0.2906, 0.2905, 0.2902, 0.2898, 0.2891, 0.2878, 0.2870,
0.2870, 0.2867, 0.2864, 0.2861, 0.2856, 0.2854, 0.2844, 0.2830, 0.2829,
0.2821, 0.2808, 0.2808, 0.2805, 0.2804, 0.2799, 0.2796, 0.2796, 0.2791,
0.2791, 0.2789, 0.2788, 0.2780, 0.2775, 0.2774, 0.2767, 0.2766, 0.2765,
0.2762, 0.2761, 0.2757, 0.2751, 0.2748, 0.2744, 0.2743, 0.2740, 0.2737,
0.2737, 0.2736, 0.2735, 0.2733, 0.2733, 0.2731, 0.2728, 0.2726, 0.2725,
0.2721], device='cuda:0')}}

But the mAP calculation is always:

Test: [ 0/245] eta: 0:00:23 model_time: 0.0417 (0.0417) evaluator_time: 0.0033 (0.0033) time: 0.0973 data: 0.0519 max mem: 3903
Test: [100/245] eta: 0:00:12 model_time: 0.0388 (0.0390) evaluator_time: 0.0016 (0.0018) time: 0.0882 data: 0.0472 max mem: 3903
Test: [200/245] eta: 0:00:03 model_time: 0.0388 (0.0389) evaluator_time: 0.0015 (0.0018) time: 0.0874 data: 0.0466 max mem: 3903
Test: [244/245] eta: 0:00:00 model_time: 0.0388 (0.0388) evaluator_time: 0.0019 (0.0018) time: 0.0880 data: 0.0478 max mem: 3903
Test: Total time: 0:00:21 (0.0879 s / it)
Averaged stats: model_time: 0.0388 (0.0388) evaluator_time: 0.0019 (0.0018)
Accumulating evaluation results...
DONE (t=0.06s).
Accumulating evaluation results...
DONE (t=0.06s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
IoU metric: segm
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

To Reproduce

Steps to reproduce the behaviour:
Follow the steps in the Object Detection Finetuning tutorial, substituting the dataset with torchvision.datasets.CocoDetection.
The predicted mAP within coco_eval is 0:
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

Versions

Collecting environment information...
PyTorch version: 2.1.2
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
Nvidia driver version: 522.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2300
DeviceID=CPU0
Family=198
L2CacheSize=11776
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2300
Name=12th Gen Intel(R) Core(TM) i7-12700H
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.2
[pip3] pytorch-ignite==0.4.13
[pip3] pytorch-lightning==1.9.5
[pip3] torch==2.1.2
[pip3] torchmetrics==0.10.3
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.16.2
[conda] blas 1.0 mkl
[conda] mkl 2023.1.0 h6b88ed4_46358
[conda] mkl-service 2.4.0 py311h2bbff1b_1
[conda] mkl_fft 1.3.8 py311h2bbff1b_0
[conda] mkl_random 1.2.4 py311h59b6b97_0
[conda] numpy 1.26.2 py311hdab7c0b_0
[conda] numpy-base 1.26.2 py311hd01c5d8_0
[conda] pytorch 2.1.2 py3.11_cuda11.8_cudnn8_0 pytorch
[conda] pytorch-cuda 11.8 h24eeafa_5 pytorch
[conda] pytorch-ignite 0.4.13 pypi_0 pypi
[conda] pytorch-lightning 1.9.5 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchmetrics 0.10.3 pypi_0 pypi
[conda] torchsummary 1.5.1 pypi_0 pypi
[conda] torchvision 0.16.2 pypi_0 pypi

@NicolasHug (Member)

Sorry @anirudh6415 it's difficult to help without having a minimal reproducible example. Would you mind sharing exactly the steps that you've been using (but as minimal as possible)?

@anirudh6415 (Author)

I have a dataset in COCO format that I want to load using CocoDetection from torchvision.datasets and build a loader from. However, when I start training with Faster R-CNN, the model does not seem to be fine-tuned, and my evaluation mAP results are zero.

Note: with the same training code, a custom dataset class I built myself works as expected.
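One way to narrow this down (check_sample below is a hypothetical helper, not from torchvision or this thread) is to sanity-check a wrapped sample before training; an all-zero mAP is often caused by degenerate boxes or labels falling outside the model's class range:

```python
import torch

def check_sample(image, target, num_classes):
    # Quick pre-training sanity check for one (image, target) sample.
    assert image.dtype == torch.float32 and image.ndim == 3
    boxes, labels = target["boxes"], target["labels"]
    assert boxes.ndim == 2 and boxes.shape[1] == 4
    # XYXY boxes must have positive width and height.
    assert (boxes[:, 2] > boxes[:, 0]).all() and (boxes[:, 3] > boxes[:, 1]).all()
    assert labels.dtype == torch.int64
    # Label 0 is reserved for background in torchvision detection models.
    assert ((labels >= 1) & (labels < num_classes)).all()

# Synthetic example in place of dataset[0]:
image = torch.rand(3, 128, 128)
target = {"boxes": torch.tensor([[5., 5., 20., 30.]]),
          "labels": torch.tensor([13])}
check_sample(image, target, num_classes=91)  # passes silently
```

If a check like this fails on the wrapped CocoDetection samples but passes on the custom dataset, that difference would point directly at the incompatibility.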
