Use Builtin Datasets

A dataset can be used by wrapping it into a torch Dataset. This document explains how to set up the builtin datasets so that X-modaler can use them.

X-modaler has builtin support for a few datasets (e.g., MSCOCO or MSVD). The corresponding dataset wrappers are provided in ./xmodaler/datasets:

xmodaler/datasets/
  images/
    mscoco.py
  videos/
    msvd.py  
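
To illustrate what wrapping a dataset means, here is a minimal sketch of a map-style dataset over one annotation file. It is not the code in mscoco.py or msvd.py: the class name `PickleAnnotationDataset` is hypothetical, and the sketch assumes that a `.pkl` annotation file holds a list of records. It implements only `__len__` and `__getitem__`, the two methods a map-style `torch.utils.data.Dataset` subclass provides; a real wrapper would subclass that class and also load the features.

```python
import pickle
from pathlib import Path


class PickleAnnotationDataset:
    """Hypothetical map-style dataset over one annotation pickle.

    Assumption: the pickle stores a list of records. Records are returned
    untouched, so nothing is assumed about the fields they contain.
    """

    def __init__(self, anno_file):
        # Load all annotation records into memory once.
        with Path(anno_file).open("rb") as f:
            self.records = pickle.load(f)

    def __len__(self):
        # Number of annotation records in the file.
        return len(self.records)

    def __getitem__(self, index):
        # Return the record at position `index`.
        return self.records[index]
```

Because the class exposes `__len__` and `__getitem__`, an instance can be indexed like a list, for example `PickleAnnotationDataset(path)[0]`.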

You can select which dataset wrapper to use by setting DATASETS.TRAIN, DATASETS.VAL and DATASETS.TEST in the config file.

Expected structure for xmodaler

First, download the dataset files, the pre-trained models and coco_caption, and arrange them as follows:

xmodaler
coco_caption
open_source_dataset/
  mscoco_dataset
  msvd_dataset
  msrvtt_dataset
  ConceptualCaptions
  VQA
  VCR
  flickr30k
pretrain/
  BERT
  TDEN
  Uniter

Expected dataset structure for COCO:

mscoco_dataset/
  mscoco_caption_anno_train.pkl
  mscoco_caption_anno_val.pkl
  mscoco_caption_anno_test.pkl
  vocabulary.txt
  captions_val5k.json
  captions_test5k.json
  # image files that are mentioned in the corresponding json
features/
  up_down/
    *.npz

Expected dataset structure for MSVD:

msvd_dataset/
  msvd_caption_anno_train.pkl
  msvd_caption_anno_val.pkl
  msvd_caption_anno_test.pkl
  vocabulary.txt
  captions_val.json
  captions_test.json
  # video files that are mentioned in the corresponding json
features/
  resnet152/
    *.npy

Expected dataset structure for MSR-VTT:

msrvtt_dataset/
  msrvtt_caption_anno_train.pkl
  msrvtt_caption_anno_val.pkl
  msrvtt_caption_anno_test.pkl
  vocabulary.txt
  captions_val.json
  captions_test.json
  # video files that are mentioned in the corresponding json
msrvtt_torch/
  feature/
    resnet152/
      *.npy
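
Before editing the config, it can help to confirm that a dataset folder contains the files listed above. The following is a small sketch, not part of X-modaler: the names `EXPECTED_FILES` and `missing_files` are hypothetical, and only the annotation, vocabulary and caption files are checked. Feature folders are left out because their location is given separately in the config.

```python
from pathlib import Path


def _expected(prefix, val_json, test_json):
    # Annotation pickles for the three splits, plus vocabulary and caption files.
    splits = ("train", "val", "test")
    files = [f"{prefix}_caption_anno_{split}.pkl" for split in splits]
    return files + ["vocabulary.txt", val_json, test_json]


# File names copied from the layouts listed in this document.
EXPECTED_FILES = {
    "mscoco": _expected("mscoco", "captions_val5k.json", "captions_test5k.json"),
    "msvd": _expected("msvd", "captions_val.json", "captions_test.json"),
    "msrvtt": _expected("msrvtt", "captions_val.json", "captions_test.json"),
}


def missing_files(dataset_dir, name):
    """Return the expected files of dataset `name` that are absent from `dataset_dir`."""
    root = Path(dataset_dir)
    return [f for f in EXPECTED_FILES[name] if not (root / f).is_file()]
```

For example, `missing_files("../open_source_dataset/msvd_dataset", "msvd")` returns an empty list when every listed file is in place.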

When the dataset wrapper and data files are ready, specify the paths to these data files in the config file. For example:

DATALOADER:
  FEATS_FOLDER: '../open_source_dataset/mscoco_dataset/features/up_down'  # feature folder
  ANNO_FOLDER: '../open_source_dataset/mscoco_dataset'  # annotation folder
INFERENCE:
  VOCAB: '../open_source_dataset/mscoco_dataset/vocabulary.txt'
  VAL_ANNFILE: '../open_source_dataset/mscoco_dataset/captions_val5k.json'
  TEST_ANNFILE: '../open_source_dataset/mscoco_dataset/captions_test5k.json'
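
All five paths in this example share one dataset root, so they can be derived from it rather than typed by hand. The sketch below shows one way to do that for the COCO example. The helper `coco_config_paths` is hypothetical and not part of X-modaler; it only builds a nested dictionary that mirrors the keys shown above.

```python
from pathlib import PurePosixPath


def coco_config_paths(dataset_root="../open_source_dataset/mscoco_dataset"):
    """Build the COCO path entries of the config from a single dataset root."""
    root = PurePosixPath(dataset_root)
    return {
        "DATALOADER": {
            "FEATS_FOLDER": str(root / "features" / "up_down"),  # feature folder
            "ANNO_FOLDER": str(root),  # annotation folder
        },
        "INFERENCE": {
            "VOCAB": str(root / "vocabulary.txt"),
            "VAL_ANNFILE": str(root / "captions_val5k.json"),
            "TEST_ANNFILE": str(root / "captions_test5k.json"),
        },
    }
```

Calling `coco_config_paths()` with the default root reproduces the paths in the example; passing another root moves all five entries together.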