For the Alpaca dataset, the default split comprises 51,800 samples for training and 200 samples for testing [1]. What is the rationale behind such a small test set? I haven't been able to find any recommended split ratios for the Alpaca dataset.
Is the small test set intended merely as a reference point, with the expectation that more rigorous evaluation should use a separate dataset or framework, such as HELM?
[1] https://github.com/facebookresearch/llama-recipes/blob/main/src/llama_recipes/datasets/alpaca_dataset.py#L30
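For comparison, the kind of ratio-based split I had expected is sketched below. This is not what llama-recipes does (it takes a fixed slice of the data); the function name and parameters here are my own illustration:

```python
import random

def split_alpaca(samples, test_ratio=0.1, seed=42):
    """Shuffle and split a list of samples into (train, test) by ratio.

    A hypothetical alternative to the fixed 200-sample holdout;
    `split_alpaca` and its arguments are illustrative, not part of
    the llama-recipes API.
    """
    rng = random.Random(seed)            # seeded for reproducibility
    shuffled = samples[:]                # copy so the input is untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Dummy records standing in for entries of alpaca_data.json:
data = [{"instruction": f"task {i}"} for i in range(1000)]
train, test = split_alpaca(data, test_ratio=0.1)
print(len(train), len(test))  # 900 100
```

With ~52k samples, a conventional 90/10 split would hold out roughly 5,200 samples rather than 200, which is why the default surprised me.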