Prepare Data

Download the datasets you need and organize them as follows:

code_root/
└── data/
   ├── conceptual-captions/
   │   ├── train_image/
   │   ├── val_image/
   │   ├── train_frcnn/
   │   ├── val_frcnn/
   │   ├── train.json
   │   ├── val.json
   │   ├── train_frcnn.json
   │   └── val_frcnn.json
   ├── en_corpus/
   │   ├── wiki.doc
   │   └── bc1g.doc
   ├── vcr/
   │   ├── vcr1images/
   │   ├── train.jsonl
   │   ├── val.jsonl
   │   └── test.jsonl
   └── coco/
       ├── train2014/
       ├── val2014/
       ├── test2015/
       ├── annotations/
       ├── vqa/
       ├── refcoco+/
       │   └── proposal/
       └── vgbua_res101_precomputed/
           ├── trainval2014_resnet101_faster_rcnn_genome
           └── test2015_resnet101_faster_rcnn_genome
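
Once the downloads are in place, the layout above can be checked with a short script. The following is a minimal sketch, not part of the repository: the function name `missing_paths` and the `EXPECTED` list are hypothetical, and the entries are copied from the tree above. Entries ending in `/` are checked as directories; all others are only checked for existence, since the tree does not say whether the precomputed feature entries are files or directories.

```python
from pathlib import Path

# Relative paths copied from the directory tree above.
# A trailing "/" marks an entry that is expected to be a directory.
EXPECTED = [
    "data/conceptual-captions/train_image/",
    "data/conceptual-captions/val_image/",
    "data/conceptual-captions/train_frcnn/",
    "data/conceptual-captions/val_frcnn/",
    "data/conceptual-captions/train.json",
    "data/conceptual-captions/val.json",
    "data/conceptual-captions/train_frcnn.json",
    "data/conceptual-captions/val_frcnn.json",
    "data/en_corpus/wiki.doc",
    "data/en_corpus/bc1g.doc",
    "data/vcr/vcr1images/",
    "data/vcr/train.jsonl",
    "data/vcr/val.jsonl",
    "data/vcr/test.jsonl",
    "data/coco/train2014/",
    "data/coco/val2014/",
    "data/coco/test2015/",
    "data/coco/annotations/",
    "data/coco/vqa/",
    "data/coco/refcoco+/proposal/",
    "data/coco/vgbua_res101_precomputed/trainval2014_resnet101_faster_rcnn_genome",
    "data/coco/vgbua_res101_precomputed/test2015_resnet101_faster_rcnn_genome",
]


def missing_paths(code_root):
    """Return the entries of EXPECTED that are absent under code_root."""
    root = Path(code_root)
    missing = []
    for rel in EXPECTED:
        path = root / rel
        # Directories must be directories; other entries only need to exist.
        present = path.is_dir() if rel.endswith("/") else path.exists()
        if not present:
            missing.append(rel)
    return missing
```

Calling `missing_paths(".")` from `code_root/` returns the entries still to be downloaded. Since only the datasets you need are required, a non-empty result is not necessarily an error.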
       

Pre-training Data

Conceptual Captions

See ReadMe.txt.

English Wikipedia & BooksCorpus

Fine-tuning Data

VCR

  • Download and unzip images & annotations from here.

VQA & RefCOCO+

Common

  • Download and unzip COCO 2014 images & annotations from here.

VQA

  • Download and unzip annotations from here (including "VQA Annotations" and "VQA Input Questions"), and place all of these files directly under ./data/coco/vqa.

  • Download and unzip the following precomputed boxes & features into ./data/coco/vgbua_res101_precomputed.

  • Download the answer vocabulary from GoogleDrive / BaiduPan and place it under ./data/coco/vqa/.

RefCOCO+

  • Download and unzip the annotations, and place all files in refcoco+/ directly under ./data/coco/refcoco+.
  • Download the region proposals, and place all files in detections/refcoco+_unc directly under ./data/coco/refcoco+/proposal.
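
Placing files "directly under" a folder means moving the contents of the extracted directory, not the directory itself. A minimal sketch of that step follows; the helper name `place_directly_under` is hypothetical and not part of the repository, and it refuses to overwrite anything already present in the destination.

```python
import shutil
from pathlib import Path


def place_directly_under(src_dir, dest_dir):
    """Move every entry of src_dir into dest_dir without nesting src_dir itself."""
    src, dest = Path(src_dir), Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    # sorted() materializes the listing before any entry is moved.
    for entry in sorted(src.iterdir()):
        target = dest / entry.name
        if target.exists():
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(entry), str(target))
        moved.append(target)
    return moved
```

For example, assuming the archives were extracted into a hypothetical `downloads/` folder, `place_directly_under("downloads/refcoco+", "data/coco/refcoco+")` handles the annotations and `place_directly_under("downloads/detections/refcoco+_unc", "data/coco/refcoco+/proposal")` handles the region proposals.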