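Issue Type: Bug
Source: binary
Secretflow Version: 1.5.0b0
OS Platform and Distribution: CentOS 7.9.2009
Python version: 3.10.14
Bazel version: No response
GCC/Compiler version: No response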
Start a Ray cluster on each of the two machines separately:
# 192.168.3.21
export ip="192.168.3.21"
ray start --head --node-ip-address="${ip}" --port="9010" --include-dashboard=False --disable-usage-stats

# 192.168.3.23
export ip="192.168.3.23"
ray start --head --node-ip-address="${ip}" --port="9010" --include-dashboard=False --disable-usage-stats
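Before running the demo, it can help to confirm that each Ray head is actually listening on the port passed to `ray start`. The following is a minimal sketch using only the Python standard library; the helper name `is_port_open` is hypothetical and is not part of Ray or SecretFlow.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example usage on 192.168.3.21 (address and port taken from the commands above):
#   is_port_open("192.168.3.21", 9010)
```

The same check could be pointed at the other ports used later in the report (9020 for the rayfed link, 9030 for SPU), but those are only open while `demo.py` is running.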
Then run the following on each machine:
# 192.168.3.23
python3 demo.py -p=client

# 192.168.3.21
python3 demo.py -p=server
After execution completes, the logs are as follows:
[root@sf-3-23 ~]# python3 demo.py -p=client
2024-05-09 11:24:55,420 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 192.168.3.23:9010...
2024-05-09 11:24:55,436 INFO worker.py:1724 -- Connected to Ray cluster.
2024-05-09 11:24:55.481 INFO api.py:233 [client] -- [Anonymous_job] Started rayfed with {'CLUSTER_ADDRESSES': {'client': '0.0.0.0:9020', 'server': '192.168.3.21:9020'}, 'CURRENT_PARTY_NAME': 'client', 'TLS_CONFIG': {}}
2024-05-09 11:24:55.481 DEBUG message_queue.py:56 [client] -- [Anonymous_job] Starting new thread[DataSendingQueueThread] for message polling.
2024-05-09 11:24:55.482 DEBUG cleanup.py:67 [client] -- [Anonymous_job] Start check sending thread.
2024-05-09 11:24:55.482 DEBUG message_queue.py:56 [client] -- [Anonymous_job] Starting new thread[ErrorSendingQueueThread] for message polling.
2024-05-09 11:24:55.483 DEBUG cleanup.py:69 [client] -- [Anonymous_job] Start check error sending thread.
2024-05-09 11:24:55.483 DEBUG barriers.py:445 [client] -- [Anonymous_job] Starting ReceiverProxyActor with options: {'max_concurrency': 1, 'name': 'SenderReceiverProxyActor'}
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:24:56.954 INFO link.py:38 [client] -- [Anonymous_job] brpc options: {'message_max_size_in_bytes': 2147483647, 'timeout_in_ms': 1800000, 'connect_retry_times': 8640, 'connect_retry_interval_ms': 10000, 'recv_timeout_ms': 21600000, 'http_timeout_ms': 21600000, 'exit_on_sending_failure': True}
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:24:56.954 WARNING link_config.py:34 [client] -- [Anonymous_job] http_timeout_ms and timeout_ms are set at the same time, http_timeout_ms 21600000 will be used.
(SenderReceiverProxyActor pid=17006) I0509 11:24:56.980060 17006 external/com_github_brpc_brpc/src/brpc/server.cpp:1158] Server[yacl::link::transport::internal::ReceiverServiceImpl] is serving on port=9020.
(SenderReceiverProxyActor pid=17006) W0509 11:24:56.980113 17006 external/com_github_brpc_brpc/src/brpc/server.cpp:1164] Builtin services are disabled according to ServerOptions.has_builtin_services
2024-05-09 11:25:02.569 INFO barriers.py:465 [client] -- [Anonymous_job] Succeeded to create receiver proxy actor.
2024-05-09 11:25:02.569 INFO barriers.py:520 [client] -- [Anonymous_job] Try ping ['server'] at 0 attemp, up to 3600 attemps.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:02.579 DEBUG barriers.py:397 [client] -- [Anonymous_job] Sending send data to seq_id ping of server from ping without credentials.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:02.579 DEBUG barriers.py:408 [client] -- [Anonymous_job] Succeeded to send data to seq_id ping of server from ping. Response is True
=========================Start
2024-05-09 11:25:02.631 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function get_data at 0x7f406d3205e0>, num_returns=None, args len: 1, kwargs len: 0.
2024-05-09 11:25:02.632 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:02.636 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.get_shares_chunk_count at 0x7f404853d480>, num_returns=None, args len: 4, kwargs len: 0.
(_run pid=4520) INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
(_run pid=4520) INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
(_run pid=4520) INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
2024-05-09 11:25:04.529 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:04.540 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.run_spu_io at 0x7f404853e320>, num_returns=4, args len: 4, kwargs len: 0.
2024-05-09 11:25:04.541 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:04.541 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:04.542 DEBUG fed_actor.py:104 [client] -- [Anonymous_job] Actor method call: infeed_share, num_returns: 1
2024-05-09 11:25:04.544 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:04.545 DEBUG fed_actor.py:104 [client] -- [Anonymous_job] Actor method call: del_share, num_returns: 1
2024-05-09 11:25:04.545 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function get_data at 0x7f406d3205e0>, num_returns=None, args len: 1, kwargs len: 0.
2024-05-09 11:25:04.545 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.get_shares_chunk_count at 0x7f404853e320>, num_returns=None, args len: 4, kwargs len: 0.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:04.530 DEBUG barriers.py:397 [client] -- [Anonymous_job] Sending send data to seq_id 7 of server from 6#0 without credentials.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:04.530 DEBUG barriers.py:408 [client] -- [Anonymous_job] Succeeded to send data to seq_id 7 of server from 6#0. Response is True
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.609 DEBUG barriers.py:397 [client] -- [Anonymous_job] Sending send data to seq_id 10 of server from 8#1 without credentials.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.609 DEBUG barriers.py:408 [client] -- [Anonymous_job] Succeeded to send data to seq_id 10 of server from 8#1. Response is True
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.611 DEBUG barriers.py:397 [client] -- [Anonymous_job] Sending send data to seq_id 10 of server from 8#3 without credentials.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.611 DEBUG barriers.py:408 [client] -- [Anonymous_job] Succeeded to send data to seq_id 10 of server from 8#3. Response is True
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.612 DEBUG link.py:93 [client] -- [Anonymous_job] Getting data for 15 from 14#0 of server
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:05.612 DEBUG link.py:114 [client] -- [Anonymous_job] Received data for ping from ping.
2024-05-09 11:25:06.650 DEBUG pyu.py:105 [client] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.run_spu_io at 0x7f404853e680>, num_returns=4, args len: 4, kwargs len: 0.
2024-05-09 11:25:06.651 DEBUG utils.py:66 [client] -- [Anonymous_job] Insert recv_op, arg task id 16#1, current task id 17
2024-05-09 11:25:06.652 DEBUG utils.py:66 [client] -- [Anonymous_job] Insert recv_op, arg task id 16#2, current task id 17
2024-05-09 11:25:06.653 DEBUG fed_actor.py:104 [client] -- [Anonymous_job] Actor method call: infeed_share, num_returns: 1
2024-05-09 11:25:06.653 DEBUG utils.py:63 [client] -- [Anonymous_job] Insert fed object, arg.party=client
2024-05-09 11:25:06.653 DEBUG fed_actor.py:104 [client] -- [Anonymous_job] Actor method call: del_share, num_returns: 1
=========================Success
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.647 DEBUG link.py:114 [client] -- [Anonymous_job] Received data for 15 from 14#0.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.648 DEBUG link.py:120 [client] -- [Anonymous_job] Getted data for 15 from 14#0 of server.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.654 DEBUG link.py:93 [client] -- [Anonymous_job] Getting data for 17 from 16#1 of server
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.655 DEBUG link.py:114 [client] -- [Anonymous_job] Received data for 17 from 16#1.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.655 DEBUG link.py:120 [client] -- [Anonymous_job] Getted data for 17 from 16#1 of server.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.657 DEBUG link.py:93 [client] -- [Anonymous_job] Getting data for 17 from 16#2 of server
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.657 DEBUG link.py:114 [client] -- [Anonymous_job] Received data for 17 from 16#2.
(SenderReceiverProxyActor pid=17006) 2024-05-09 11:25:06.657 DEBUG link.py:120 [client] -- [Anonymous_job] Getted data for 17 from 16#2 of server.
[root@sf-3-21 ~]# python3 demo.py -p=server
2024-05-09 11:24:52,188 INFO worker.py:1540 -- Connecting to existing Ray cluster at address: 192.168.3.21:9010...
2024-05-09 11:24:52,203 INFO worker.py:1724 -- Connected to Ray cluster.
2024-05-09 11:24:52.249 INFO api.py:233 [server] -- [Anonymous_job] Started rayfed with {'CLUSTER_ADDRESSES': {'client': '192.168.3.23:9020', 'server': '0.0.0.0:9020'}, 'CURRENT_PARTY_NAME': 'server', 'TLS_CONFIG': {}}
2024-05-09 11:24:52.249 DEBUG message_queue.py:56 [server] -- [Anonymous_job] Starting new thread[DataSendingQueueThread] for message polling.
2024-05-09 11:24:52.250 DEBUG cleanup.py:67 [server] -- [Anonymous_job] Start check sending thread.
2024-05-09 11:24:52.250 DEBUG message_queue.py:56 [server] -- [Anonymous_job] Starting new thread[ErrorSendingQueueThread] for message polling.
2024-05-09 11:24:52.250 DEBUG cleanup.py:69 [server] -- [Anonymous_job] Start check error sending thread.
2024-05-09 11:24:52.250 DEBUG barriers.py:445 [server] -- [Anonymous_job] Starting ReceiverProxyActor with options: {'max_concurrency': 1, 'name': 'SenderReceiverProxyActor'}
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:24:53.721 INFO link.py:38 [server] -- [Anonymous_job] brpc options: {'message_max_size_in_bytes': 2147483647, 'timeout_in_ms': 1800000, 'connect_retry_times': 8640, 'connect_retry_interval_ms': 10000, 'recv_timeout_ms': 21600000, 'http_timeout_ms': 21600000, 'exit_on_sending_failure': True}
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:24:53.722 WARNING link_config.py:34 [server] -- [Anonymous_job] http_timeout_ms and timeout_ms are set at the same time, http_timeout_ms 21600000 will be used.
(SenderReceiverProxyActor pid=24536) I0509 11:24:53.749538 24536 external/com_github_brpc_brpc/src/brpc/server.cpp:1158] Server[yacl::link::transport::internal::ReceiverServiceImpl] is serving on port=9020.
(SenderReceiverProxyActor pid=24536) W0509 11:24:53.749588 24536 external/com_github_brpc_brpc/src/brpc/server.cpp:1164] Builtin services are disabled according to ServerOptions.has_builtin_services
(SenderReceiverProxyActor pid=24536) I0509 11:24:53.869094 24630 external/com_github_brpc_brpc/src/brpc/socket.cpp:2466] Checking Socket{id=0 addr=192.168.3.23:9020} (0x3513080)
(SenderReceiverProxyActor pid=24536) I0509 11:24:59.871872 24662 external/com_github_brpc_brpc/src/brpc/socket.cpp:2526] Revived Socket{id=0 addr=192.168.3.23:9020} (0x3513080) (Connectable)
2024-05-09 11:25:02.792 INFO barriers.py:465 [server] -- [Anonymous_job] Succeeded to create receiver proxy actor.
2024-05-09 11:25:02.792 INFO barriers.py:520 [server] -- [Anonymous_job] Try ping ['client'] at 0 attemp, up to 3600 attemps.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:02.799 DEBUG barriers.py:397 [server] -- [Anonymous_job] Sending send data to seq_id ping of client from ping without credentials.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:02.800 DEBUG barriers.py:408 [server] -- [Anonymous_job] Succeeded to send data to seq_id ping of client from ping. Response is True
=========================Start
2024-05-09 11:25:02.852 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function get_data at 0x7fa10e3005e0>, num_returns=None, args len: 1, kwargs len: 0.
2024-05-09 11:25:02.852 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.get_shares_chunk_count at 0x7fa10471b0a0>, num_returns=None, args len: 4, kwargs len: 0.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:02.856 DEBUG link.py:93 [server] -- [Anonymous_job] Getting data for 7 from 6#0 of client
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:02.857 DEBUG link.py:114 [server] -- [Anonymous_job] Received data for ping from ping.
2024-05-09 11:25:04.757 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.run_spu_io at 0x7fa10471b130>, num_returns=4, args len: 4, kwargs len: 0.
2024-05-09 11:25:04.758 DEBUG utils.py:66 [server] -- [Anonymous_job] Insert recv_op, arg task id 8#1, current task id 10
2024-05-09 11:25:04.760 DEBUG utils.py:66 [server] -- [Anonymous_job] Insert recv_op, arg task id 8#3, current task id 10
2024-05-09 11:25:04.762 DEBUG fed_actor.py:104 [server] -- [Anonymous_job] Actor method call: infeed_share, num_returns: 1
2024-05-09 11:25:04.764 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:04.764 DEBUG fed_actor.py:104 [server] -- [Anonymous_job] Actor method call: del_share, num_returns: 1
2024-05-09 11:25:04.772 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function get_data at 0x7fa10e3005e0>, num_returns=None, args len: 1, kwargs len: 0.
2024-05-09 11:25:04.772 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:04.778 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.get_shares_chunk_count at 0x7fa10471b130>, num_returns=None, args len: 4, kwargs len: 0.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:04.753 DEBUG link.py:114 [server] -- [Anonymous_job] Received data for 7 from 6#0.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:04.754 DEBUG link.py:120 [server] -- [Anonymous_job] Getted data for 7 from 6#0 of client.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:04.761 DEBUG link.py:93 [server] -- [Anonymous_job] Getting data for 10 from 8#1 of client
2024-05-09 11:25:06.716 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:06.720 DEBUG pyu.py:105 [server] -- [Anonymous_job] PYU remote function: <function pyu_to_spu.<locals>.run_spu_io at 0x7fa104719480>, num_returns=4, args len: 4, kwargs len: 0.
2024-05-09 11:25:06.722 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:06.722 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:06.722 DEBUG fed_actor.py:104 [server] -- [Anonymous_job] Actor method call: infeed_share, num_returns: 1
2024-05-09 11:25:06.723 DEBUG utils.py:63 [server] -- [Anonymous_job] Insert fed object, arg.party=server
2024-05-09 11:25:06.723 DEBUG fed_actor.py:104 [server] -- [Anonymous_job] Actor method call: del_share, num_returns: 1
=========================Success
(_run pid=10863) INFO:jax._src.xla_bridge:Unable to initialize backend 'cuda':
(_run pid=10863) INFO:jax._src.xla_bridge:Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
(_run pid=10863) INFO:jax._src.xla_bridge:Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.865 DEBUG link.py:114 [server] -- [Anonymous_job] Received data for 10 from 8#1.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.866 DEBUG link.py:120 [server] -- [Anonymous_job] Getted data for 10 from 8#1 of client.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.867 DEBUG link.py:93 [server] -- [Anonymous_job] Getting data for 10 from 8#3 of client
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.868 DEBUG link.py:114 [server] -- [Anonymous_job] Received data for 10 from 8#3.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.868 DEBUG link.py:120 [server] -- [Anonymous_job] Getted data for 10 from 8#3 of client.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.868 DEBUG barriers.py:397 [server] -- [Anonymous_job] Sending send data to seq_id 15 of client from 14#0 without credentials.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.869 DEBUG barriers.py:408 [server] -- [Anonymous_job] Succeeded to send data to seq_id 15 of client from 14#0. Response is True
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.870 DEBUG barriers.py:397 [server] -- [Anonymous_job] Sending send data to seq_id 17 of client from 16#1 without credentials.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.870 DEBUG barriers.py:408 [server] -- [Anonymous_job] Succeeded to send data to seq_id 17 of client from 16#1. Response is True
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.871 DEBUG barriers.py:397 [server] -- [Anonymous_job] Sending send data to seq_id 17 of client from 16#2 without credentials.
(SenderReceiverProxyActor pid=24536) 2024-05-09 11:25:06.871 DEBUG barriers.py:408 [server] -- [Anonymous_job] Succeeded to send data to seq_id 17 of client from 16#2. Response is True
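When reading logs like these, one way to check that nothing was lost in transit is to pair each "Succeeded to send data" line on one party with a "Received data for" line on the other. The sketch below does this with regular expressions written against the message wording seen in the logs above; the function names are hypothetical, and the patterns are an assumption that may not hold for other rayfed versions.

```python
import re

# Patterns based on the log wording shown above (an assumption, not a stable format).
SENT = re.compile(r"Succeeded to send data to seq_id (\S+) of \w+ from ([\w#]+)")
RECEIVED = re.compile(r"Received data for (\S+) from ([\w#]+)")


def sent_pairs(log_text: str) -> set:
    """Collect (seq_id, upstream_id) pairs that a party reports as sent."""
    return set(SENT.findall(log_text))


def received_pairs(log_text: str) -> set:
    """Collect (seq_id, upstream_id) pairs that a party reports as received."""
    return set(RECEIVED.findall(log_text))


def unmatched_sends(sender_log: str, receiver_log: str) -> set:
    """Return pairs the sender logged that the receiver never logged."""
    return sent_pairs(sender_log) - received_pairs(receiver_log)
```

Applied to the two logs above, every pair sent by one party has a matching receive on the other, which suggests the data exchange itself completed.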
The code of demo.py is as follows:
import argparse
import secretflow as sf
import logging


def ray_init(self_party):
    sf.shutdown()
    ip = {
        "server": "192.168.3.21",
        "client": "192.168.3.23",
    }[self_party]
    sf.init(
        address=ip + ":9010",
        cluster_config={
            'self_party': self_party,
            'parties': {
                'client': {
                    'id': 'client',
                    'party': 'client',
                    'address': '192.168.3.23:9020',
                    'listen_addr': '0.0.0.0:9020',
                },
                'server': {
                    'id': 'server',
                    'party': 'server',
                    'address': '192.168.3.21:9020',
                    'listen_addr': '0.0.0.0:9020',
                },
            },
        },
        log_to_driver=True,
        logging_level=logging.getLevelName(logging.DEBUG).lower(),
        cross_silo_comm_backend='brpc_link',
        cross_silo_comm_options={
            "message_max_size_in_bytes": (2 << 30) - 1,
            "timeout_in_ms": 30 * 60 * 1000,
            # BRPC Config
            "connect_retry_times": 6 * 60 * 24,
            "connect_retry_interval_ms": 10 * 1000,
            "recv_timeout_ms": 6 * 3600 * 1000,
            "http_timeout_ms": 6 * 3600 * 1000,
        },
    )


def spu_init():
    cluster_def = {
        "runtime_config": {
            "protocol": "SEMI2K",
            "field": "FM128",
            "fxp_fraction_bits": 32,
            "fxp_div_goldschmidt_iters": 10,
        },
        "nodes": [
            {
                "party": 'client',
                'address': '192.168.3.23:9030',
                "listen_address": "0.0.0.0:9030",
            },
            {
                "party": 'server',
                'address': '192.168.3.21:9030',
                "listen_address": "0.0.0.0:9030",
            },
        ],
    }
    # link_desc
    link_desc = {
        "connect_retry_times": 6 * 60 * 24,
        "connect_retry_interval_ms": 10 * 1000,
        "recv_timeout_ms": 6 * 3600 * 1000,
        "http_timeout_ms": 6 * 3600 * 1000,
        "throttle_window_size": 0,
        "brpc_channel_protocol": "http",
        "brpc_channel_connection_type": "pooled",
    }
    return sf.SPU(cluster_def=cluster_def, link_desc=link_desc)


def get_data(i):
    return i


if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--party", default="", help="party id")
    args = parser.parse_args()

    # Ray init
    ray_init(args.party)
    spu_device = spu_init()

    pyus = [sf.PYU("client"), sf.PYU("server")]
    print("=========================Start")
    for pyu in pyus:
        pyu(get_data)(1).to(spu_device)
    print("=========================Success")
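The timeout and retry expressions in `cross_silo_comm_options` are easier to reason about once evaluated. The sketch below is plain arithmetic that needs no SecretFlow install; it reproduces the values and confirms they match the `brpc options` line printed in the logs. The `HUMAN_READABLE` dictionary is only an illustrative helper.

```python
# Same expressions as in demo.py, evaluated on their own.
options = {
    "message_max_size_in_bytes": (2 << 30) - 1,   # 2**31 - 1, the largest signed 32-bit int
    "timeout_in_ms": 30 * 60 * 1000,              # 30 minutes
    "connect_retry_times": 6 * 60 * 24,           # 8640 attempts
    "connect_retry_interval_ms": 10 * 1000,       # 10 seconds between attempts
    "recv_timeout_ms": 6 * 3600 * 1000,           # 6 hours
    "http_timeout_ms": 6 * 3600 * 1000,           # 6 hours
}

# Derived quantities, for orientation only.
HUMAN_READABLE = {
    # Waiting time between retries alone adds up to 24 hours.
    "max_retry_window_hours": options["connect_retry_times"]
    * options["connect_retry_interval_ms"] / 1000 / 3600,
    "recv_timeout_hours": options["recv_timeout_ms"] / 1000 / 3600,
}

for name, value in options.items():
    print(f"{name} = {value}")
```

With these settings a party keeps retrying a connection for up to about a day and waits up to six hours for data, so a process that is blocked on communication will appear to hang rather than fail quickly.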
The relevant version information is as follows (spu has been modified locally, so version 0.8.0b0 is used):
# pip3 list | grep secretflow
secretflow 1.5.0b0
secretflow-rayfed 0.2.1a1
secretflow-serving-lib 0.3.0.dev20240320
# pip3 list | grep spu
spu 0.8.0b0
You need to add sf.shutdown() at the end of the script. After doing so you may see the error AttributeError: 'NoneType' object has no attribute 'get_job_name'. This is a known issue and will be fixed as soon as possible.
In addition, to make sure the tasks have finished before shutdown, it is recommended to call sf.wait(<some result>) before sf.shutdown(), for example:
if __name__ == "__main__":
    parser = argparse.ArgumentParser()
    parser.add_argument("-p", "--party", default="", help="party id")
    args = parser.parse_args()

    # Ray init
    ray_init(args.party)
    spu_device = spu_init()

    pyus = [sf.PYU("client"), sf.PYU("server")]
    print("=========================Start")
    spu_objs = []
    for pyu in pyus:
        obj = pyu(get_data)(1).to(spu_device)
        spu_objs.append(obj)
    print("=========================Success")
    sf.wait(spu_objs)
    sf.shutdown()
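The reason for waiting before shutting down can be illustrated with a standard-library analogy. This is not how SecretFlow or Ray are implemented; it is only a sketch of the general pattern that submitting asynchronous work returns immediately, so tearing the runtime down without waiting can drop work that has not run yet. `ThreadPoolExecutor` stands in for the runtime, and both function names are hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor, wait


def get_data(i):
    # Stand-in for remote work that takes a moment to finish.
    time.sleep(0.2)
    return i


def shutdown_without_waiting():
    """Submit tasks, then tear down at once; return how many were cancelled."""
    pool = ThreadPoolExecutor(max_workers=1)
    futures = [pool.submit(get_data, i) for i in range(3)]
    # Queued tasks that have not started yet are cancelled (Python 3.9+).
    pool.shutdown(wait=False, cancel_futures=True)
    return sum(f.cancelled() for f in futures)


def wait_then_shutdown():
    """Submit tasks, wait for all of them, then tear down; return the results."""
    pool = ThreadPoolExecutor(max_workers=1)
    futures = [pool.submit(get_data, i) for i in range(3)]
    wait(futures)  # block until every task has finished
    pool.shutdown()
    return [f.result() for f in futures]
```

In the first function at least two of the three tasks never run; in the second, all results are available before teardown, which mirrors the role of calling sf.wait(spu_objs) before sf.shutdown().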
ian-huu