Accuracy is low on the testing set (split from the same distribution as the training set) but high when doing k-fold validation on the training set. #1896

Open
freshn opened this issue May 11, 2024 · 0 comments

Branch

main branch (mmpretrain version)

Describe the bug

Thanks to KFoldDataset and tools/kfold-cross-valid.py, we can now do k-fold validation easily. However, I found a weird problem when evaluating the network on my own dataset. The dataset stores the training set in the folder 'train' and the testing set in the folder 'test'. Annotations are available for the samples in both sets. The training set and the testing set are split from the same original dataset and can be assumed to have the same distribution.
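
For context, the dataset part of my config looks roughly like this (a simplified sketch: the data root, annotation files, batch size and the CustomDataset type stand in for my actual setup, and the pipelines are omitted; the point is only that training reads from 'train' while evaluation reads from 'test'):

# Sketch of the dataset section of my_config.py (paths and annotation files are placeholders).
data_root = 'data/my_dataset'

train_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        data_prefix='train',        # training samples live in the 'train' folder
        ann_file='meta/train.txt',  # annotations for the training set
    ),
)

val_dataloader = dict(
    batch_size=32,
    dataset=dict(
        type='CustomDataset',
        data_root=data_root,
        data_prefix='test',         # held-out samples live in the 'test' folder
        ann_file='meta/test.txt',   # annotations for the testing set
    ),
)
test_dataloader = val_dataloader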

Now I want to train and test a model, e.g. a ResNet-50, on this dataset. I evaluate two strategies:

  1. Train with 'train'; test with 'test'.
  2. Do 5-fold validation with 'train' (split the training set into 5 folds, then train on any 4 folds and test on the remaining fold); a conceptual sketch of this split is shown below.
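
Conceptually, the 5-fold split in strategy 2 amounts to something like the following (a rough sketch of the index splitting, not the exact KFoldDataset implementation):

import numpy as np

def five_fold_indices(num_samples, num_splits=5, seed=None):
    # Split sample indices into `num_splits` folds; each iteration yields
    # the training indices (4 folds) and the validation indices (1 fold).
    indices = np.arange(num_samples)
    if seed is not None:
        np.random.default_rng(seed).shuffle(indices)
    folds = np.array_split(indices, num_splits)
    for k in range(num_splits):
        val_idx = folds[k]
        train_idx = np.concatenate([folds[i] for i in range(num_splits) if i != k])
        yield train_idx, val_idx

for fold, (train_idx, val_idx) in enumerate(five_fold_indices(1000, seed=0)):
    print(f'fold {fold}: train={len(train_idx)}, val={len(val_idx)}')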

For strategy 1, I run the command:
python tools/train.py configs/resnet/my_config.py --auto-scale-lr
The result looks like this:

Epoch(train) [100][100/227]  lr: 1.2500e-05  eta: 0:00:43  time: 0.3435  data_time: 0.0007  memory: 8052  loss: 0.0046
INFO - Epoch(train) [100][200/227]  lr: 1.2500e-05  eta: 0:00:09  time: 0.3436  data_time: 0.0007  memory: 8052  loss: 0.0058
INFO - Saving checkpoint at 100 epochs
INFO - Epoch(val) [100][40/40]    accuracy/top1: 34.5571  accuracy/top3: 73.9824  single-label/precision: 24.9772  single-label/recall: 23.9435  single-label/f1-score: 23.8994  spec: 11.4134  data_time: 0.0090  time: 0.1106

For strategy 2, I run the command:
python tools/kfold-cross-valid.py configs/resnet/my_config.py --num-splits 5 --auto-scale-lr
The result looks like this:

Epoch(train) [100][100/182]  lr: 1.2500e-05  eta: 0:00:28  time: 0.3404  data_time: 0.0004  memory: 8053  loss: 0.0027
INFO - Exp name: resnet50_8xb32_spinal_all_20230514_032355
INFO - Saving checkpoint at 100 epochs
INFO - Epoch(val) [100][46/46]    accuracy/top1: 89.2562  accuracy/top3: 99.0358  single-label/precision: 90.5743  single-label/recall: 87.7679  single-label/f1-score: 89.0269  data_time: 0.0078  time: 0.1091
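
Note that in strategy 2 the validation samples also come from the 'train' folder. If I understand tools/kfold-cross-valid.py correctly, it effectively sets up each fold roughly like this (a simplified sketch; same placeholder paths as above, and the fold/num_splits/test_mode arguments are those of the KFoldDataset wrapper as I understand them):

# Simplified view of one fold in strategy 2: both loaders read from the
# 'train' folder, and KFoldDataset selects complementary index subsets.
base_dataset = dict(
    type='CustomDataset',
    data_root='data/my_dataset',
    data_prefix='train',
    ann_file='meta/train.txt',
)

train_dataloader = dict(
    batch_size=32,
    dataset=dict(type='KFoldDataset', dataset=base_dataset,
                 fold=0, num_splits=5, test_mode=False),  # the 4 training folds
)
val_dataloader = dict(
    batch_size=32,
    dataset=dict(type='KFoldDataset', dataset=base_dataset,
                 fold=0, num_splits=5, test_mode=True),   # the held-out fold
)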

From my understanding, the accuracies of these two experiments should not differ by as large a gap as in my case (34.56 vs. 89.26). Does anyone have any idea what is going on?
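
In case it helps, this is a minimal sanity check I can run to confirm that the two splits really share the same label set and a similar class distribution (assuming plain 'path label' text annotation files, which is the layout sketched above):

from collections import Counter

def label_counts(ann_file):
    # Count how many samples each label has in a 'path label' annotation file.
    with open(ann_file) as f:
        return Counter(line.split()[-1] for line in f if line.strip())

train_counts = label_counts('data/my_dataset/meta/train.txt')
test_counts = label_counts('data/my_dataset/meta/test.txt')

print('labels only in train:', set(train_counts) - set(test_counts))
print('labels only in test :', set(test_counts) - set(train_counts))
for label in sorted(set(train_counts) | set(test_counts)):
    print(label, train_counts.get(label, 0), test_counts.get(label, 0))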

Environment information

{'sys.platform': 'linux',
'Python': '3.8.16 (default, Mar 2 2023, 03:21:46) [GCC 11.2.0]',
'CUDA available': True,
'numpy_random_seed': 2147483648,
'GPU 0': 'Tesla V100-SXM2-32GB-LS',
'CUDA_HOME': '/jmain02/apps/cuda/10.2',
'NVCC': 'Cuda compilation tools, release 10.2, V10.2.8',
'GCC': 'gcc (GCC) 9.1.0',
'PyTorch': '1.12.1',
'TorchVision': '0.13.1',
'OpenCV': '4.7.0',
'MMEngine': '0.7.3',
'MMCV': '2.0.0rc4',
'MMPreTrain': '1.0.0rc7+e80418a'}

Other information

No response
