Reminder
I have read the README and searched the existing issues.
Reproduction
The run reports that the data file cannot be read, even though a data sample has already been printed on the command line.
Running tokenizer on dataset (num_proc=8): 100%|██████████| 50/50 [00:00<00:00, 119.29 examples/s]
Running tokenizer on dataset (num_proc=8): 100%|██████████| 50/50 [00:00<00:00, 118.64 examples/s]
multiprocess.pool.RemoteTraceback:
"""
Traceback (most recent call last):
  File "/home//.conda/envs/LLaMA/lib/python3.10/site-packages/multiprocess/pool.py", line 125, in worker
    result = (True, func(*args, **kwds))
  File "/home//.conda/envs/LLaMA/lib/python3.10/site-packages/datasets/utils/py_utils.py", line 678, in _write_generator_to_queue
    for i, result in enumerate(func(**kwargs)):
  File "/home//.conda/envs/LLaMA/lib/python3.10/site-packages/datasets/arrow_dataset.py", line 3595, in _map_single
    os.chmod(cache_file_name, 0o666 & ~umask)
FileNotFoundError: [Errno 2] No such file or directory: '/home/*/.cache/huggingface/datasets/json/default-83535c0c955e8ee5/0.0.0/c8d2d9508a2a2067ab02cd118834ecef34c3700d143b31835ec4235bf10109f7/cache-6939a289e8425e94_00000_of_00008.arrow'
"""
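Since the traceback fails at `os.chmod` on a cache shard that does not exist, one thing worth ruling out is the cache directory itself. The following is a minimal diagnostic sketch, not a fix: `check_cache_dir` is a hypothetical helper (not part of `datasets`), and it assumes the cache lives at `HF_DATASETS_CACHE` or the default `~/.cache/huggingface/datasets` seen in the error path. It only checks that a file can be created and chmod-ed there and how much space is left.

```python
import os
import shutil
import tempfile
from pathlib import Path


def check_cache_dir(cache_dir):
    """Hypothetical helper: report whether cache_dir exists, accepts a
    create + chmod of a scratch file, and how many bytes are free."""
    path = Path(cache_dir).expanduser()
    report = {"exists": path.is_dir(), "writable": False, "free_bytes": None}
    if not report["exists"]:
        return report

    report["free_bytes"] = shutil.disk_usage(path).free

    tmp_name = None
    try:
        # Create a scratch file in the cache directory.
        with tempfile.NamedTemporaryFile(dir=path, delete=False) as tmp:
            tmp_name = tmp.name
        # Read the current umask without changing it permanently.
        umask = os.umask(0o666)
        os.umask(umask)
        # Same kind of chmod call as the one shown in the traceback.
        os.chmod(tmp_name, 0o666 & ~umask)
        report["writable"] = True
    except OSError:
        # Creation or chmod failed: leave "writable" as False.
        pass
    finally:
        if tmp_name and os.path.exists(tmp_name):
            os.remove(tmp_name)
    return report


if __name__ == "__main__":
    # Assumption: default cache location unless HF_DATASETS_CACHE is set.
    cache = os.environ.get("HF_DATASETS_CACHE", "~/.cache/huggingface/datasets")
    print(check_cache_dir(cache))
```

If this reports `writable: False` or very little free space, the missing `.arrow` shard may simply never have been written; if everything looks fine, the cause is more likely elsewhere.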
Expected behavior
input_ids:
[21586, 63, 684, 9226, 100, 14482, 100, 4440, 45, 28595, 731, 303, 1547, 303, 14991, 341, 10984, 7692, 451, 331, 1168, 451, 36822, 3966, 51, 465, 10485, 63, 303, 29113, 327, 687, 731, 418, 1168, 451, 29113, 451, 36822, 3966, 51, 465, 3777, 63, 303, 1916, 63, 906, 10984, 7692, 451, 341, 36822, 3966, 51, 303, 1547, 222, 35222, 63, 244]
inputs:
Human: def calculate_average_price(prices):
"""
Calculate the average price of a list of fashion items.
Assistant:
System Info
transformers 4.40.2
torch 2.2.0+cu11.8
deepspeed 0.14.2
datasets 2.19.1
accelerate 0.30.1
Others
No response