
CocoDetection dataset incompatible with Faster R-CNN model Training and mAP calculation #8270

Open
anirudh6415 opened this issue Feb 12, 2024 · 2 comments


anirudh6415 commented Feb 12, 2024

🐛 Describe the bug

I would like to thank you for the Object Detection Finetuning tutorial. The CocoDetection dataset appears to be incompatible with the Faster R-CNN model; I have been using the transforms-v2-end-to-end-object-detection-segmentation-example for COCO detection. The TorchVision Object Detection Finetuning tutorial specifies the dataset format required by the Mask R-CNN model: the dataset's __getitem__ method should return an image and a target with the fields boxes, labels, area, etc. CocoDetection instead returns the raw COCO annotations as the target.

After training and evaluation (engine.evaluate), the mAP scores are always 0 for every epoch.
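For reference, raw CocoDetection targets are lists of COCO annotation dicts with XYWH boxes, while torchvision detection models expect one dict per image with absolute XYXY boxes and integer labels. A minimal sketch of that conversion (the annotation values below are made up for illustration; coco_to_frcnn_target is a hypothetical helper, not part of torchvision):

```python
import torch

# Raw CocoDetection target: a list of COCO annotation dicts (hypothetical values).
coco_target = [
    {"id": 1, "image_id": 42, "category_id": 13,
     "bbox": [10.0, 20.0, 30.0, 40.0],  # COCO boxes are [x, y, width, height]
     "area": 1200.0, "iscrowd": 0},
]

def coco_to_frcnn_target(anns):
    # Convert a list of COCO annotations into the dict format the
    # detection models consume: XYXY float boxes and int64 labels.
    boxes = torch.tensor([a["bbox"] for a in anns], dtype=torch.float32)
    boxes[:, 2:] += boxes[:, :2]  # XYWH -> XYXY
    labels = torch.tensor([a["category_id"] for a in anns], dtype=torch.int64)
    return {"boxes": boxes, "labels": labels}

target = coco_to_frcnn_target(coco_target)
print(target["boxes"])   # tensor([[10., 20., 40., 60.]])
print(target["labels"])  # tensor([13])
```

This is essentially what wrap_dataset_for_transforms_v2 is meant to take care of, which is why the raw, unwrapped targets confuse the training loop.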

Dataset:

import torch
from torchvision import tv_tensors
from torchvision.datasets import CocoDetection, wrap_dataset_for_transforms_v2
from torchvision.transforms import v2

transforms = v2.Compose(
    [
        v2.ToPILImage(),
        v2.Resize(512),
        v2.RandomPhotometricDistort(p=1),
        v2.RandomZoomOut(fill={tv_tensors.Image: (123, 117, 104), "others": 0}),
        v2.RandomIoUCrop(),
        # Note: the transforms v2 docs recommend following RandomIoUCrop
        # with v2.SanitizeBoundingBoxes() to drop degenerate boxes.
        v2.RandomHorizontalFlip(p=1),
        v2.ToTensor(),
        v2.ToDtype(torch.float32, scale=True),
    ]
)
dataset = CocoDetection(root_dir, annotation_file, transforms=transforms)
dataset = wrap_dataset_for_transforms_v2(dataset, target_keys=("boxes", "labels"))
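Feeding the wrapped dataset to a DataLoader also requires a collate function that keeps variable-sized images and per-image targets as tuples rather than stacking them into one tensor. A minimal sketch with a synthetic stand-in dataset (the tensors below are made up, not loaded from COCO):

```python
import torch
from torch.utils.data import DataLoader

# Detection collate_fn: keep (image, target) pairs as parallel tuples,
# since image sizes and box counts differ per sample and cannot be stacked.
def collate_fn(batch):
    return tuple(zip(*batch))

# Hypothetical tiny dataset standing in for the wrapped CocoDetection.
samples = [
    (torch.rand(3, 64, 64), {"boxes": torch.tensor([[1., 2., 10., 12.]]),
                             "labels": torch.tensor([1])}),
    (torch.rand(3, 80, 48), {"boxes": torch.empty(0, 4),
                             "labels": torch.empty(0, dtype=torch.int64)}),
]
loader = DataLoader(samples, batch_size=2, collate_fn=collate_fn)
images, targets = next(iter(loader))  # tuple of 2 images, tuple of 2 target dicts
```

The models in torchvision.models.detection accept exactly this shape: a list/tuple of images and a list/tuple of target dicts.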

Expected behavior

Faster R-CNN model output from evaluation:

45514: {'boxes': tensor([[ 40.1482, 48.6490, 46.0760, 50.4945],
[ 56.4980, 98.9506, 59.9611, 99.8110],
[ 16.5955, 50.3514, 20.7766, 51.7256],
[ 7.7093, 49.7779, 9.9628, 51.1177],
[ 23.2416, 115.2277, 27.9833, 116.0603],
[ 6.2100, 43.7826, 12.1565, 44.4718],
[ 84.3244, 92.1326, 89.6173, 92.8679],
[ 27.8029, 111.4202, 33.1342, 112.3421],
[ 6.4772, 83.6187, 11.6571, 85.1347],
[ 12.0571, 57.7298, 17.3467, 58.6374],
[ 52.0026, 100.2936, 55.6111, 101.0397],
[ 32.9334, 95.2229, 36.7513, 96.0473],
[ 11.9714, 50.6148, 16.0876, 51.3073],
[ 36.5298, 99.2084, 40.1270, 100.3034],
[ 56.8915, 95.4639, 59.8486, 96.2372],
[ 29.7059, 95.3435, 34.5201, 96.1218],
[ 83.4291, 96.4723, 89.3993, 97.1828],
[ 80.5682, 114.7297, 86.4728, 115.2060],
[ 55.3361, 96.1351, 57.5648, 97.3579],
[ 87.8969, 120.9048, 91.9940, 122.8857],
[ 79.1790, 95.7387, 83.8181, 96.0961],
[ 5.2113, 81.8440, 12.3168, 82.5170],
[ 11.9503, 9.1723, 15.8027, 10.5436],
[ 43.5947, 115.0965, 46.6917, 116.0323],
[ 36.3678, 44.5311, 45.5149, 45.0957],
[ 64.0280, 91.6801, 70.0944, 92.6666],
[ 34.9408, 48.2833, 39.4942, 48.7989],
[ 44.6860, 34.5384, 48.7593, 35.4988],
[ 8.5666, 52.0507, 9.7412, 53.2962],
[ 59.0582, 114.7045, 62.3767, 115.6113],
[ 42.6140, 95.4168, 47.2140, 95.9096],
[ 51.6593, 116.3869, 54.5132, 117.3841],
[ 10.2391, 8.1375, 15.3591, 9.5619],
[ 79.1855, 103.1416, 83.1228, 104.2892],
[ 11.6779, 115.0183, 15.2959, 115.6937],
[ 92.2911, 64.1361, 97.0701, 65.2341],
[ 77.5316, 94.2480, 88.9283, 103.3851],
[ 20.0655, 29.8961, 25.1227, 31.4927],
[ 41.8090, 91.6322, 66.6452, 114.4692],
[ 1.7047, 99.8426, 5.1214, 100.9323],
[ 21.0609, 30.4280, 25.2658, 31.7394],
[ 77.2151, 111.6185, 83.4417, 112.1318],
[ 21.3434, 105.4950, 25.2204, 106.5172],
[ 0.0000, 100.4384, 44.0517, 127.3968],
[ 37.8296, 43.6599, 41.4970, 44.9153],
[101.9358, 28.7851, 107.8658, 29.7577],
[ 84.6480, 112.4950, 89.6441, 113.4983],
[ 32.0187, 48.2305, 34.6299, 49.9225],
[ 21.0508, 96.1484, 49.0644, 115.7840],
[ 78.7586, 91.0307, 83.5991, 92.0456],
[ 7.4563, 99.3216, 11.9333, 100.4296],
[ 41.8862, 39.9427, 48.0095, 40.6046],
[ 64.2320, 110.5276, 69.4041, 111.1726],
[ 48.5087, 35.6968, 51.0966, 36.3254],
[ 69.9470, 80.5252, 77.2012, 81.4588],
[ 64.5411, 5.3913, 69.1044, 6.1687],
[ 14.9313, 118.7279, 18.5300, 119.9807],
[ 67.1189, 75.9650, 74.1020, 76.6732],
[104.7447, 31.4134, 109.5322, 32.3514],
[ 68.2009, 112.8180, 71.6182, 113.7542],
[ 77.4721, 33.7746, 80.8393, 34.7606],
[ 9.3352, 80.4828, 12.7890, 82.0704],
[ 65.1386, 107.5109, 71.5020, 108.5092],
[ 0.0000, 95.9446, 24.0685, 127.6092],
[ 43.3848, 100.4263, 46.2075, 101.2998],
[ 14.2563, 116.5107, 17.7641, 117.3038],
[ 75.3176, 28.1107, 79.5012, 29.0837],
[ 21.1844, 99.2131, 24.2272, 100.0415],
[ 59.2131, 7.5139, 63.0848, 8.2906],
[ 0.0000, 48.6410, 43.9125, 57.1178],
[ 92.4588, 61.7449, 97.9917, 62.5649],
[ 0.0000, 40.3103, 21.4952, 74.7319],
[ 26.3296, 18.2271, 30.6895, 19.6554],
[ 24.1920, 8.1525, 29.7535, 9.5155],
[ 0.0000, 82.6814, 1.6050, 86.1069],
[ 69.8429, 91.4722, 74.5880, 92.6022],
[ 40.0346, 106.1212, 43.5103, 107.1371],
[ 77.5447, 109.2828, 81.9013, 110.5845],
[ 68.1803, 44.8517, 73.7433, 45.6382],
[ 0.0000, 84.0534, 3.4056, 85.0159],
[ 38.7503, 35.8580, 56.7992, 47.4848],
[ 50.1666, 31.5740, 53.6334, 32.6867],
[ 31.0113, 101.1863, 33.5362, 101.8402],
[ 53.7563, 9.7722, 55.8778, 10.9179],
[ 51.0325, 13.5929, 54.7412, 14.6228],
[ 18.1654, 104.8237, 21.8600, 105.6227],
[ 19.5623, 35.9696, 24.1356, 37.0714],
[ 69.2776, 28.0173, 88.3530, 39.2125],
[ 75.1365, 115.5374, 77.8445, 116.9997],
[ 31.0881, 58.5975, 34.6037, 59.4643],
[ 1.6351, 80.1350, 6.1082, 81.3295],
[ 22.8064, 117.3966, 62.7837, 127.8345],
[ 63.6129, 70.9242, 69.0646, 71.9814],
[ 3.4624, 87.2172, 8.6216, 88.5247],
[ 55.8403, 28.8055, 59.7083, 30.4423],
[ 26.2743, 18.7395, 31.5674, 19.9733],
[ 26.8567, 117.3554, 32.5164, 117.9245],
[ 55.5966, 104.9360, 58.4963, 105.7098],
[ 88.1490, 100.0630, 91.5376, 101.2722],
[ 61.8169, 10.4709, 64.8416, 11.5875]], device='cuda:0'), 'labels': tensor([13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13, 13,
13, 13, 13, 13, 13, 13, 13, 13, 13, 13], device='cuda:0'), 'scores': tensor([0.3700, 0.3590, 0.3542, 0.3493, 0.3464, 0.3439, 0.3411, 0.3396, 0.3373,
0.3338, 0.3312, 0.3304, 0.3297, 0.3272, 0.3269, 0.3263, 0.3233, 0.3213,
0.3156, 0.3152, 0.3146, 0.3145, 0.3119, 0.3090, 0.3085, 0.3070, 0.3041,
0.3034, 0.3016, 0.3015, 0.3014, 0.3011, 0.3009, 0.2997, 0.2996, 0.2984,
0.2978, 0.2973, 0.2968, 0.2955, 0.2954, 0.2952, 0.2939, 0.2928, 0.2921,
0.2914, 0.2912, 0.2906, 0.2905, 0.2902, 0.2898, 0.2891, 0.2878, 0.2870,
0.2870, 0.2867, 0.2864, 0.2861, 0.2856, 0.2854, 0.2844, 0.2830, 0.2829,
0.2821, 0.2808, 0.2808, 0.2805, 0.2804, 0.2799, 0.2796, 0.2796, 0.2791,
0.2791, 0.2789, 0.2788, 0.2780, 0.2775, 0.2774, 0.2767, 0.2766, 0.2765,
0.2762, 0.2761, 0.2757, 0.2751, 0.2748, 0.2744, 0.2743, 0.2740, 0.2737,
0.2737, 0.2736, 0.2735, 0.2733, 0.2733, 0.2731, 0.2728, 0.2726, 0.2725,
0.2721], device='cuda:0')}}

But the mAP calculation is always:

Test: [ 0/245] eta: 0:00:23 model_time: 0.0417 (0.0417) evaluator_time: 0.0033 (0.0033) time: 0.0973 data: 0.0519 max mem: 3903
Test: [100/245] eta: 0:00:12 model_time: 0.0388 (0.0390) evaluator_time: 0.0016 (0.0018) time: 0.0882 data: 0.0472 max mem: 3903
Test: [200/245] eta: 0:00:03 model_time: 0.0388 (0.0389) evaluator_time: 0.0015 (0.0018) time: 0.0874 data: 0.0466 max mem: 3903
Test: [244/245] eta: 0:00:00 model_time: 0.0388 (0.0388) evaluator_time: 0.0019 (0.0018) time: 0.0880 data: 0.0478 max mem: 3903
Test: Total time: 0:00:21 (0.0879 s / it)
Averaged stats: model_time: 0.0388 (0.0388) evaluator_time: 0.0019 (0.0018)
Accumulating evaluation results...
DONE (t=0.06s).
Accumulating evaluation results...
DONE (t=0.06s).
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
IoU metric: segm
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

To Reproduce

Steps to reproduce the behaviour:
Follow the steps in the Object Detection Finetuning tutorial, substituting the dataset with torchvision.datasets.CocoDetection.
The predicted mAP within coco_eval is 0:
IoU metric: bbox
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.000

Versions

Collecting environment information...
PyTorch version: 2.1.2
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A

OS: Microsoft Windows 11 Home
GCC version: Could not collect
Clang version: Could not collect
CMake version: Could not collect
Libc version: N/A

Python version: 3.11.5 | packaged by Anaconda, Inc. | (main, Sep 11 2023, 13:26:23) [MSC v.1916 64 bit (AMD64)] (64-bit runtime)
Python platform: Windows-10-10.0.22631-SP0
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
GPU models and configuration: GPU 0: NVIDIA GeForce RTX 3060 Laptop GPU
Nvidia driver version: 522.06
cuDNN version: Could not collect
HIP runtime version: N/A
MIOpen runtime version: N/A
Is XNNPACK available: True

CPU:
Architecture=9
CurrentClockSpeed=2300
DeviceID=CPU0
Family=198
L2CacheSize=11776
L2CacheSpeed=
Manufacturer=GenuineIntel
MaxClockSpeed=2300
Name=12th Gen Intel(R) Core(TM) i7-12700H
ProcessorType=3
Revision=

Versions of relevant libraries:
[pip3] numpy==1.26.2
[pip3] pytorch-ignite==0.4.13
[pip3] pytorch-lightning==1.9.5
[pip3] torch==2.1.2
[pip3] torchmetrics==0.10.3
[pip3] torchsummary==1.5.1
[pip3] torchvision==0.16.2
[conda] blas 1.0 mkl
[conda] mkl 2023.1.0 h6b88ed4_46358
[conda] mkl-service 2.4.0 py311h2bbff1b_1
[conda] mkl_fft 1.3.8 py311h2bbff1b_0
[conda] mkl_random 1.2.4 py311h59b6b97_0
[conda] numpy 1.26.2 py311hdab7c0b_0
[conda] numpy-base 1.26.2 py311hd01c5d8_0
[conda] pytorch 2.1.2 py3.11_cuda11.8_cudnn8_0 pytorch
[conda] pytorch-cuda 11.8 h24eeafa_5 pytorch
[conda] pytorch-ignite 0.4.13 pypi_0 pypi
[conda] pytorch-lightning 1.9.5 pypi_0 pypi
[conda] pytorch-mutex 1.0 cuda pytorch
[conda] torchmetrics 0.10.3 pypi_0 pypi
[conda] torchsummary 1.5.1 pypi_0 pypi
[conda] torchvision 0.16.2 pypi_0 pypi

@NicolasHug (Member)

Sorry @anirudh6415 it's difficult to help without having a minimal reproducible example. Would you mind sharing exactly the steps that you've been using (but as minimal as possible)?

@anirudh6415 (Author)

I have a dataset in COCO format that I want to load using CocoDetection from torchvision.datasets and build a loader from. However, when I start training with Faster R-CNN, the model does not seem to be fine-tuned, and my evaluation mAP results are zero.

Note: with the same training code, a custom dataset class I built myself works as expected.
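One way to narrow this down (check_sample below is a hypothetical helper, not from torchvision or this thread) is to sanity-check a wrapped sample before training; an all-zero mAP is often caused by degenerate boxes or labels falling outside the model's class range:

```python
import torch

def check_sample(image, target, num_classes):
    # Quick pre-training sanity check for one (image, target) sample.
    assert image.dtype == torch.float32 and image.ndim == 3
    boxes, labels = target["boxes"], target["labels"]
    assert boxes.ndim == 2 and boxes.shape[1] == 4
    # XYXY boxes must have positive width and height.
    assert (boxes[:, 2] > boxes[:, 0]).all() and (boxes[:, 3] > boxes[:, 1]).all()
    assert labels.dtype == torch.int64
    # Label 0 is reserved for background in torchvision detection models.
    assert ((labels >= 1) & (labels < num_classes)).all()

# Synthetic example in place of dataset[0]:
image = torch.rand(3, 128, 128)
target = {"boxes": torch.tensor([[5., 5., 20., 30.]]),
          "labels": torch.tensor([13])}
check_sample(image, target, num_classes=91)  # passes silently
```

If a check like this fails on the wrapped CocoDetection samples but passes on the custom dataset, that difference would point directly at the incompatibility.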
