
docker CPU version: downloading a model fails: Server error: 500 - [address=0.0.0.0:33434, pid=15] 'NoneType' object is not subscriptable #1517

Closed
Jamel-jun opened this issue May 18, 2024 · 5 comments · Fixed by #1539

@Jamel-jun

Describe the bug

Downloading minicpm-2b-dpo-bf16 fails with an error.

To Reproduce

Deployed with docker compose, CPU version.

Expected behavior

The model should download and deploy normally.

Additional context

Error logs:
(error-log screenshots not captured in this export)

Page message:
(UI screenshot not captured in this export)

docker compose 部署文件

version: '3.4'
services:
  xinference:
    image: xprobe/xinference:latest-cpu
    container_name: xinference
    hostname: xinference
    command: xinference-local -H 0.0.0.0 --log-level debug
    privileged: true
    user: root
    environment:
      - TZ=Asia/Shanghai
      - XINFERENCE_MODEL_SRC=modelscope
      - XINFERENCE_HOME=/home/data
    volumes:
      - ./data:/home/data
    networks:
      - diy_network
    ports:
      - 9997:9997
    logging:
      driver: "json-file"
      options:
        max-size: "500m"

networks:
  diy_network:
    external: true
@XprobeBot XprobeBot added this to the v0.11.2 milestone May 18, 2024
@ciekawy

ciekawy commented May 21, 2024

I'm getting the same error trying to run on macOS with xinference-local --host 0.0.0.0 --port 9997 (non-Docker); I tried a few models.

@NTLx

NTLx commented May 22, 2024

I hit the same error. I suspect it is caused by a configuration value being None when the model is loaded; this is the part of the error output I'm focusing on:

kwargs: {'model_uid': 'bge-m3-1-0', 'model_name': 'bge-m3', 'model_size_in_billions': None, 'model_format': None, 'quantization': None, 'model_engine': None, 'model_type': 'embedding', 'n_gpu': None, 'request_limits': None, 'peft_model_config': None, 'gpu_idx': None}

Here is the full related error output (docker-compose logs):

xinference    | 2024-05-22 16:31:26,623 xinference.core.supervisor 39 DEBUG    Enter list_model_registrations, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fdcee861d30>, 'rerank'), kwargs: {'detailed': True}
xinference    | 2024-05-22 16:31:26,625 xinference.core.supervisor 39 DEBUG    Leave list_model_registrations, elapsed time: 0 s
xinference    | 2024-05-22 16:31:27,032 xinference.core.supervisor 39 DEBUG    Enter list_model_registrations, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fdcee861d30>, 'embedding'), kwargs: {'detailed': True}
xinference    | 2024-05-22 16:31:27,046 xinference.core.supervisor 39 DEBUG    Leave list_model_registrations, elapsed time: 0 s
xinference    | 2024-05-22 16:31:33,312 xinference.core.supervisor 39 DEBUG    Enter launch_builtin_model, model_uid: bge-m3, model_name: bge-m3, model_size: , model_format: None, quantization: None, replica: 1
xinference    | 2024-05-22 16:31:33,313 xinference.core.worker 39 DEBUG    Enter get_model_count, args: (<xinference.core.worker.WorkerActor object at 0x7fdcee8ca5d0>,), kwargs: {}
xinference    | 2024-05-22 16:31:33,313 xinference.core.worker 39 DEBUG    Leave get_model_count, elapsed time: 0 s
xinference    | 2024-05-22 16:31:33,313 xinference.core.worker 39 DEBUG    Enter launch_builtin_model, args: (<xinference.core.worker.WorkerActor object at 0x7fdcee8ca5d0>,), kwargs: {'model_uid': 'bge-m3-1-0', 'model_name': 'bge-m3', 'model_size_in_billions': None, 'model_format': None, 'quantization': None, 'model_engine': None, 'model_type': 'embedding', 'n_gpu': None, 'request_limits': None, 'peft_model_config': None, 'gpu_idx': None}
xinference    | 2024-05-22 16:31:33,313 xinference.core.worker 39 DEBUG    GPU disabled for model bge-m3-1-0
xinference    | Process IndigenActorPool163577856:
xinference    | 2024-05-22 16:31:37,378 xinference.core.supervisor 39 DEBUG    Enter terminate_model, args: (<xinference.core.supervisor.SupervisorActor object at 0x7fdcee861d30>, 'bge-m3'), kwargs: {'suppress_exception': True}
xinference    | 2024-05-22 16:31:37,378 xinference.core.supervisor 39 DEBUG    Leave terminate_model, elapsed time: 0 s
xinference    | Traceback (most recent call last):
xinference    |   File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 314, in _bootstrap
xinference    |     self.run()
xinference    |   File "/opt/conda/lib/python3.11/multiprocessing/process.py", line 108, in run
xinference    |     self._target(*self._args, **self._kwargs)
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/indigen/pool.py", line 278, in _start_sub_pool
xinference    |     asyncio.run(coro)
xinference    |   File "/opt/conda/lib/python3.11/asyncio/runners.py", line 190, in run
xinference    |     return runner.run(main)
xinference    |            ^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/asyncio/runners.py", line 118, in run
xinference    |     return self._loop.run_until_complete(task)
xinference    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/asyncio/base_events.py", line 653, in run_until_complete
xinference    |     return future.result()
xinference    |            ^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/indigen/pool.py", line 293, in _create_sub_pool
xinference    |     os.environ.update(env)
xinference    |   File "<frozen _collections_abc>", line 949, in update
xinference    |   File "<frozen os>", line 683, in __setitem__
xinference    |   File "<frozen os>", line 758, in encode
xinference    | TypeError: str expected, not NoneType
xinference    | 2024-05-22 16:31:37,381 xinference.api.restful_api 1 ERROR    [address=0.0.0.0:44795, pid=39] 'NoneType' object is not subscriptable
xinference    | Traceback (most recent call last):
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/api/restful_api.py", line 697, in launch_model
xinference    |     model_uid = await (await self._get_supervisor_ref()).launch_builtin_model(
xinference    |                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/context.py", line 227, in send
xinference    |     return self._process_result_message(result)
xinference    |            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/context.py", line 102, in _process_result_message
xinference    |     raise message.as_instanceof_cause()
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 659, in send
xinference    |     result = await self._run_coro(message.message_id, coro)
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/pool.py", line 370, in _run_coro
xinference    |     return await coro
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/api.py", line 384, in __on_receive__
xinference    |     return await super().__on_receive__(message)  # type: ignore
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "xoscar/core.pyx", line 558, in __on_receive__
xinference    |     raise ex
xinference    |   File "xoscar/core.pyx", line 520, in xoscar.core._BaseActor.__on_receive__
xinference    |     async with self._lock:
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "xoscar/core.pyx", line 521, in xoscar.core._BaseActor.__on_receive__
xinference    |     with debug_async_timeout('actor_lock_timeout',
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "xoscar/core.pyx", line 526, in xoscar.core._BaseActor.__on_receive__
xinference    |     result = await result
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/supervisor.py", line 836, in launch_builtin_model
xinference    |     await _launch_model()
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/supervisor.py", line 800, in _launch_model
xinference    |     await _launch_one_model(rep_model_uid)
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/supervisor.py", line 781, in _launch_one_model
xinference    |     await worker_ref.launch_builtin_model(
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "xoscar/core.pyx", line 284, in __pyx_actor_method_wrapper
xinference    |     async with lock:
xinference    |   File "xoscar/core.pyx", line 287, in xoscar.core.__pyx_actor_method_wrapper
xinference    |     result = await result
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/utils.py", line 45, in wrapped
xinference    |     ret = await func(*args, **kwargs)
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/worker.py", line 629, in launch_builtin_model
xinference    |     subpool_address, devices = await self._create_subpool(
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xinference/core/worker.py", line 487, in _create_subpool
xinference    |     subpool_address = await self._main_pool.append_sub_pool(
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    |   File "/opt/conda/lib/python3.11/site-packages/xoscar/backends/indigen/pool.py", line 385, in append_sub_pool
xinference    |     process_index, process_status.external_addresses[0]
xinference    |     ^^^^^^^^^^^^^^^^^
xinference    | TypeError: [address=0.0.0.0:44795, pid=39] 'NoneType' object is not subscriptable
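Reading the two tracebacks together suggests a chain of failures: the worker passes an env dict containing a None value to the sub-pool, os.environ.update() rejects it, the sub-pool process dies before reporting its addresses, and the parent pool then subscripts a None external_addresses. A minimal sketch of the two TypeErrors (the env key name below is hypothetical; the real key that ends up None is not shown in the logs):

```python
import os

# Failure 1 (sub-pool process): os.environ values must be strings.
# If the worker builds an env dict with a None value (plausibly a
# GPU-related variable on a CPU-only deployment), update() raises.
env = {"HYPOTHETICAL_VAR": None}  # placeholder; real key unknown
try:
    os.environ.update(env)
except TypeError as exc:
    print(exc)  # str expected, not NoneType

# Failure 2 (parent pool): the sub-pool died before reporting its
# addresses, so external_addresses is still None and subscripting it
# raises the error surfaced to the REST API as a 500.
external_addresses = None
try:
    external_addresses[0]
except TypeError as exc:
    print(exc)  # 'NoneType' object is not subscriptable
```

This is only a reconstruction of the failure mode from the logs, not the actual xinference/xoscar code path.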

@wencan

wencan commented May 22, 2024

+1

@zhubinn

zhubinn commented May 22, 2024

+1

@wencan

wencan commented May 23, 2024

The bug is not present in v0.11.0.
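Since the regression reportedly does not exist in v0.11.0, a possible workaround until the fix (#1539) is released is to pin that release in the compose file instead of latest-cpu. This assumes the CPU images follow a v&lt;version&gt;-cpu tag pattern; verify the exact tag on Docker Hub before relying on it:

```yaml
services:
  xinference:
    # pin the last release without the regression instead of latest-cpu
    # (tag name is an assumption; confirm it exists on Docker Hub)
    image: xprobe/xinference:v0.11.0-cpu
```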
