For the Alpaca dataset, the default split comprises 51,800 samples for training and 200 samples for testing [1]. What is the rationale behind such a small test set? I haven't been able to find any recommended split ratios for the Alpaca dataset.
Is the small test set intended merely as a reference point, with the expectation that more rigorous evaluation should use a separate dataset or framework, such as HELM?
[1] https://github.com/facebookresearch/llama-recipes/blob/main/src/llama_recipes/datasets/alpaca_dataset.py#L30
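For comparison, the kind of ratio-based split I had expected is sketched below. This is not what llama-recipes does (it takes a fixed slice of the data); the function name and parameters here are my own illustration:

```python
import random

def split_alpaca(samples, test_ratio=0.1, seed=42):
    """Shuffle and split a list of samples into (train, test) by ratio.

    A hypothetical alternative to the fixed 200-sample holdout;
    `split_alpaca` and its arguments are illustrative, not part of
    the llama-recipes API.
    """
    rng = random.Random(seed)            # seeded for reproducibility
    shuffled = samples[:]                # copy so the input is untouched
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_ratio))
    return shuffled[n_test:], shuffled[:n_test]

# Dummy records standing in for entries of alpaca_data.json:
data = [{"instruction": f"task {i}"} for i in range(1000)]
train, test = split_alpaca(data, test_ratio=0.1)
print(len(train), len(test))  # 900 100
```

With ~52k samples, a conventional 90/10 split would hold out roughly 5,200 samples rather than 200, which is why the default surprised me.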