Accuracy is low on the testing set (split from the same distribution as the training set) but high when doing k-fold validation on the training set. #1896

Open
freshn opened this issue May 11, 2024 · 0 comments

Branch

main branch (mmpretrain version)

Describe the bug

Thanks to KFoldDataset and tools/kfold-cross-valid.py, we can now do k-fold validation easily. However, I found a weird problem when evaluating the network on my own dataset. The dataset stores the training set in the folder 'train' and the testing set in the folder 'test'. Annotations are available for the samples in both sets. The training set and the testing set are split from the same original dataset and can be assumed to have the same distribution.
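
For context, the dataset part of my config looks roughly like this (a simplified sketch: the data root, annotation files, batch size and the CustomDataset type stand in for my actual setup, and the pipelines are omitted; the point is only that training reads from 'train' while evaluation reads from 'test'):

# Sketch of the dataset section of my_config.py (paths and annotation files are placeholders).
data_root = 'data/my_dataset'

train_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        data_prefix='train',        # training samples live in the 'train' folder
        ann_file='meta/train.txt',  # annotations for the training set
    ),
)

val_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        data_prefix='test',         # held-out samples live in the 'test' folder
        ann_file='meta/test.txt',   # annotations for the testing set
    ),
)
test_dataloader = val_dataloader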

Now I want to train and test a model, e.g. a ResNet-50, on this dataset. I evaluate two strategies:

  1. Train with 'train'; test with 'test'.
  2. Do 5-fold validation with 'train' (split the training set into 5 folds, then train on any 4 folds and test on the remaining fold); a conceptual sketch of this split is shown below.
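
Conceptually, the 5-fold split in strategy 2 amounts to something like the following (a rough sketch of the index splitting, not the exact KFoldDataset implementation):

import numpy as np

def five_fold_indices(num_samples, num_splits=5, seed=None):
    # Split sample indices into `num_splits` folds; each iteration yields
    # the training indices (4 folds) and the validation indices (1 fold).
    indices = np.arange(num_samples)
    if seed is not None:
        np.random.default_rng(seed).shuffle(indices)
    folds = np.array_split(indices, num_splits)
    for k in range(num_splits):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[i] for i in range(num_splits) if i != k])
        yield train_idx, val_idx

for fold, (train_idx, val_idx) in enumerate(five_fold_indices(1000, seed=0)):
    print(f'fold {fold}: train={len(train_idx)}, val={len(val_idx)}')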

For strategy 1, I run the command:
python tools/train.py configs/resnet/my_config.py --auto-scale-lr
The result looks like this:

Epoch(train) [100][100/227]  lr: 1.2500e-05  eta: 0:00:43  time: 0.3435  data_time: 0.0007  memory: 8052  loss: 0.0046
INFO - Epoch(train) [100][200/227]  lr: 1.2500e-05  eta: 0:00:09  time: 0.3436  data_time: 0.0007  memory: 8052  loss: 0.0058
INFO - Saving checkpoint at 100 epochs
INFO - Epoch(val) [100][40/40]    accuracy/top1: 34.5571  accuracy/top3: 73.9824  single-label/precision: 24.9772  single-label/recall: 23.9435  single-label/f1-score: 23.8994  spec: 11.4134  data_time: 0.0090  time: 0.1106

For strategy 2, I run the command:
python tools/kfold-cross-valid.py configs/resnet/my_config.py --num-splits 5 --auto-scale-lr
The result looks like this:

Epoch(train) [100][100/182]  lr: 1.2500e-05  eta: 0:00:28  time: 0.3404  data_time: 0.0004  memory: 8053  loss: 0.0027
INFO - Exp name: resnet50_8xb32_spinal_all_20230514_032355
INFO - Saving checkpoint at 100 epochs
INFO - Epoch(val) [100][46/46]    accuracy/top1: 89.2562  accuracy/top3: 99.0358  single-label/precision: 90.5743  single-label/recall: 87.7679  single-label/f1-score: 89.0269  data_time: 0.0078  time: 0.1091
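
Note that in strategy 2 the validation samples also come from the 'train' folder. If I understand tools/kfold-cross-valid.py correctly, it effectively sets up each fold roughly like this (a simplified sketch; same placeholder paths as above, and the fold/num_splits/test_mode arguments are those of the KFoldDataset wrapper as I understand them):

# Simplified view of one fold in strategy 2: both loaders read from the
# 'train' folder, and KFoldDataset selects complementary index subsets.
base_dataset = dict(
    type='CustomDataset',
    data_root='data/my_dataset',
    data_prefix='train',
    ann_file='meta/train.txt',
)

train_dataloader = dict(
    batch_size=32,
    dataset=dict(type='KFoldDataset', dataset=base_dataset,
                 fold=0, num_splits=5, test_mode=False),  # the 4 training folds
)
val_dataloader = dict(
    batch_size=32,
    dataset=dict(type='KFoldDataset', dataset=base_dataset,
                 fold=0, num_splits=5, test_mode=True),   # the held-out fold
)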

From my understanding, the accuracies of these two experiments should not differ by as large a gap as in my case (34.56 vs. 89.26). Does anyone have any idea what is going on?
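
In case it helps, this is a minimal sanity check I can run to confirm that the two splits really share the same label set and a similar class distribution (assuming plain 'path label' text annotation files, which is the layout sketched above):

from collections import Counter

def label_counts(ann_file):
    # Count how many samples each label has in a 'path label' annotation file.
    with open(ann_file) as f:
        return Counter(line.split()[-1] for line in f if line.strip())

train_counts = label_counts('data/my_dataset/meta/train.txt')
test_counts = label_counts('data/my_dataset/meta/test.txt')

print('labels only in train:', set(train_counts) - set(test_counts))
print('labels only in test :', set(test_counts) - set(train_counts))
for label in sorted(set(train_counts) | set(test_counts)):
    print(label, train_counts.get(label, 0), test_counts.get(label, 0))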

Environment information

{'sys.platform': 'linux',
'Python': '3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]',
'CUDA available': True,
'numpy_random_seed': 2147483648,
'GPU 0': 'Tesla V100-SXM2-32GB-LS',
'CUDA_HOME': '/jmain02/apps/cuda/10.2',
'NVCC': 'Cuda compilation tools, release 10.2, V10.2.8',
'GCC': 'gcc (GCC) 9.1.0',
'PyTorch': '1.12.1',
'TorchVision': '0.13.1',
'OpenCV': '4.7.0',
'MMEngine': '0.7.3',
'MMCV': '2.0.0rc4',
'MMPreTrain': '1.0.0rc7+e80418a'}

Other information

No response
