Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WebSocket received error while receiving messages: no close frame received or sent #339

Open
ValentaTomas opened this issue Mar 24, 2024 · 7 comments · Fixed by e2b-dev/infra#129
Labels
bug Something isn't working

Comments

@ValentaTomas
Copy link
Member

Sometimes process.wait times out without any reason.

@ValentaTomas ValentaTomas added the bug Something isn't working label Mar 24, 2024
@ValentaTomas
Copy link
Member Author

ValentaTomas commented Mar 24, 2024

With testing code loop:

import logging
import time

from dotenv import load_dotenv
from e2b import Sandbox

load_dotenv()

logging.basicConfig(level=logging.DEBUG)


def run_code(code, num):
    template_id = "code-interpreter"
    binary = "python3"
    filepath = "/index.py"

    print(f"\n\n------------(START - {num})-------------\n\n")
    with Sandbox(template=template_id, timeout=15) as sandbox:
        print(f"\n\n------------(SANDBOX - {sandbox.id})-------------\n\n")
        sandbox.filesystem.write(filepath, code, 15)

        # We use start and wait separately here instead of start_and_wait in order to define the timeout for the wait explicitly
        process = sandbox.process.start(cmd=f"{binary} {filepath}", timeout=30)
        try:
            process.wait(timeout=10)
        except:
            print("TimeoutError -> writing to /timeout")
            sandbox.filesystem.write("/timeout", "true", 15)
            raise
        print(f"\n\n------------(END - {num} - {sandbox.id})-------------\n\n")


for i in range(1000):
    run_code("str('')", i)
    time.sleep(1)

the error logs are:

------------(START - 36)-------------
INFO:e2b.sandbox.main:Creating sandbox code-interpreter
INFO:e2b.sandbox.sandbox_connection:Sandbox for template code-interpreter initialized
INFO:e2b.sandbox.main:Opening sandbox code-interpreter
INFO:e2b.sandbox.sandbox_connection:Sandbox fv7bfqp4wyo42829htzt created (id:ifjurtskbfr0jf8alwukk-40c5eb20)
INFO:e2b.sandbox.sandbox_connection:Started refreshing sandbox fv7bfqp4wyo42829htzt (id: ifjurtskbfr0jf8alwukk-40c5eb20)
DEBUG:asyncio:Using selector: KqueueSelector
INFO:e2b.sandbox.sandbox_rpc:WebSocket waiting to start
DEBUG:e2b.sandbox.websocket_client:WebSocket connecting to wss://49982-ifjurtskbfr0jf8alwukk-40c5eb20.e2b.dev/ws
INFO:e2b.sandbox.websocket_client:WebSocket connected to wss://49982-ifjurtskbfr0jf8alwukk-40c5eb20.e2b.dev/ws
DEBUG:e2b.sandbox.websocket_client:WebSocket starting to send messages
INFO:e2b.sandbox.sandbox_rpc:WebSocket started
INFO:e2b.sandbox.main:Sandbox code-interpreter opened
------------(SANDBOX - ifjurtskbfr0jf8alwukk-40c5eb20)-------------
DEBUG:e2b.sandbox.filesystem:Writing file /index.py
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/index.py", "str('')"], "id": 1}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/index.py", "str('')"], "id": 1}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/index.py", "str('')"], "id": 1}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/index.py", "str('')"], "id": 1}
DEBUG:e2b.sandbox.websocket_client:WebSocket received message: {"jsonrpc":"2.0","id":1,"result":null}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket received message: {"jsonrpc":"2.0","id":1,"result":null}
DEBUG:e2b.sandbox.sandbox_rpc:Processing message: {"jsonrpc":"2.0","id":1,"result":null}
DEBUG:e2b.sandbox.sandbox_rpc:Current waiting handlers: [1]
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 1
DEBUG:e2b.sandbox.filesystem:Wrote file /index.py
INFO:e2b.sandbox.process:Starting process: python3 /index.py
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onExit", "qXHQf12FhSaU"], "id": 2}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStdout", "qXHQf12FhSaU"], "id": 3}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onExit", "qXHQf12FhSaU"], "id": 2}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStderr", "qXHQf12FhSaU"], "id": 4}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStdout", "qXHQf12FhSaU"], "id": 3}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onExit", "qXHQf12FhSaU"], "id": 2}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStderr", "qXHQf12FhSaU"], "id": 4}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onExit", "qXHQf12FhSaU"], "id": 2}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStdout", "qXHQf12FhSaU"], "id": 3}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStdout", "qXHQf12FhSaU"], "id": 3}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStderr", "qXHQf12FhSaU"], "id": 4}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "process_subscribe", "params": ["onStderr", "qXHQf12FhSaU"], "id": 4}
DEBUG:e2b.sandbox.websocket_client:WebSocket received message: {"jsonrpc":"2.0","id":2,"result":"0xfa10253392e0945d8eeb40e6376ed84d"}
DEBUG:e2b.sandbox.websocket_client:WebSocket received message: {"jsonrpc":"2.0","id":3,"result":"0x5da45d078b642df698fb51e7a499cccf"}
DEBUG:e2b.sandbox.websocket_client:WebSocket received message: {"jsonrpc":"2.0","id":4,"result":"0x28ddaa165d77b226c3e4c2a9a1c9b9d"}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket received message: {"jsonrpc":"2.0","id":2,"result":"0xfa10253392e0945d8eeb40e6376ed84d"}
DEBUG:e2b.sandbox.sandbox_rpc:Processing message: {"jsonrpc":"2.0","id":2,"result":"0xfa10253392e0945d8eeb40e6376ed84d"}
DEBUG:e2b.sandbox.sandbox_rpc:Current waiting handlers: [2, 3, 4]
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket received message: {"jsonrpc":"2.0","id":3,"result":"0x5da45d078b642df698fb51e7a499cccf"}
DEBUG:e2b.sandbox.sandbox_rpc:Processing message: {"jsonrpc":"2.0","id":3,"result":"0x5da45d078b642df698fb51e7a499cccf"}
DEBUG:e2b.sandbox.sandbox_rpc:Current waiting handlers: [2, 3, 4]
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket received message: {"jsonrpc":"2.0","id":4,"result":"0x28ddaa165d77b226c3e4c2a9a1c9b9d"}
DEBUG:e2b.sandbox.sandbox_rpc:Processing message: {"jsonrpc":"2.0","id":4,"result":"0x28ddaa165d77b226c3e4c2a9a1c9b9d"}
DEBUG:e2b.sandbox.sandbox_rpc:Current waiting handlers: [2, 3, 4]
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 3
DEBUG:e2b.sandbox.sandbox_connection:Subscribed to process on_stdout (sub id: 0x5da45d078b642df698fb51e7a499cccf)
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 2
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 4
DEBUG:e2b.sandbox.sandbox_connection:Subscribed to process on_exit (sub id: 0xfa10253392e0945d8eeb40e6376ed84d)
DEBUG:e2b.sandbox.sandbox_connection:Subscribed to process on_stderr (sub id: 0x28ddaa165d77b226c3e4c2a9a1c9b9d)
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "process_start", "params": ["qXHQf12FhSaU", "python3 /index.py", {"PYTHONUNBUFFERED": "1"}, ""], "id": 5}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "process_start", "params": ["qXHQf12FhSaU", "python3 /index.py", {"PYTHONUNBUFFERED": "1"}, ""], "id": 5}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "process_start", "params": ["qXHQf12FhSaU", "python3 /index.py", {"PYTHONUNBUFFERED": "1"}, ""], "id": 5}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "process_start", "params": ["qXHQf12FhSaU", "python3 /index.py", {"PYTHONUNBUFFERED": "1"}, ""], "id": 5}
DEBUG:e2b.sandbox.websocket_client:WebSocket received message: {"jsonrpc":"2.0","id":5,"result":"qXHQf12FhSaU"}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket received message: {"jsonrpc":"2.0","id":5,"result":"qXHQf12FhSaU"}
DEBUG:e2b.sandbox.sandbox_rpc:Processing message: {"jsonrpc":"2.0","id":5,"result":"qXHQf12FhSaU"}
DEBUG:e2b.sandbox.sandbox_rpc:Current waiting handlers: [5]
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 5
INFO:e2b.sandbox.process:Started process (id: qXHQf12FhSaU)
DEBUG:e2b.sandbox.sandbox_connection:Refreshed sandbox isc5zlsuh9yylauaey00w-40c5eb20
DEBUG:e2b.sandbox.sandbox_connection:Cannot refresh sandbox - it was closed. isc5zlsuh9yylauaey00w-40c5eb20
INFO:e2b.sandbox.sandbox_connection:Stopped refreshing sandbox (id: isc5zlsuh9yylauaey00w-40c5eb20)
DEBUG:e2b.sandbox.sandbox_connection:Refreshed sandbox ifjurtskbfr0jf8alwukk-40c5eb20
DEBUG:e2b.sandbox.sandbox_connection:Refreshed sandbox ifjurtskbfr0jf8alwukk-40c5eb20
TimeoutError -> writing to /timeout
DEBUG:e2b.sandbox.filesystem:Writing file /timeout
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket queueing message: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6}
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket waiting for reply: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6}
DEBUG:e2b.sandbox.websocket_client:WebSocket message to send: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6}
DEBUG:e2b.sandbox.websocket_client:WebSocket message sent: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6}
DEBUG:e2b.sandbox.sandbox_connection:Refreshed sandbox ifjurtskbfr0jf8alwukk-40c5eb20
DEBUG:e2b.sandbox.sandbox_connection:Refreshed sandbox ifjurtskbfr0jf8alwukk-40c5eb20
ERROR:e2b.sandbox.sandbox_rpc:WebSocket timed out while waiting for: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6} 
ERROR:e2b.sandbox.sandbox_rpc:WebSocket received error while waiting for: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6} WebSocket timed out while waiting for: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6} 
DEBUG:e2b.sandbox.sandbox_rpc:WebSocket removed waiting handler for 6
INFO:e2b.sandbox.sandbox_connection:Closing sandbox fv7bfqp4wyo42829htzt (id: ifjurtskbfr0jf8alwukk-40c5eb20)
INFO:e2b.sandbox.sandbox_connection:Sandbox closed
Traceback (most recent call last):
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/example.py", line 25, in run_code
INFO:e2b.sandbox.websocket_client:WebSocket stopped
    process.wait(timeout=10)
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/process.py", line 184, in wait
    raise TimeoutException(f"Process did not finish within {timeout} seconds")
e2b.sandbox.exception.TimeoutException: Process did not finish within 10 seconds
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/sandbox_rpc.py", line 136, in send_message
    r = future_reply.result(timeout=timeout)
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/utils/future.py", line 23, in result
    return self._future.result(timeout=timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.6_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/concurrent/futures/_base.py", line 458, in result
    raise TimeoutError()
TimeoutError
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/example.py", line 34, in <module>
    run_code("str('')", i)
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/example.py", line 28, in run_code
    sandbox.filesystem.write("/timeout", "true", 15)
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/filesystem.py", line 111, in write
    self._sandbox._call(
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/sandbox_connection.py", line 314, in _call
    return self._rpc.send_message(f"{service}_{method}", params, timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/sandbox_rpc.py", line 145, in send_message
    raise e
  File "/Users/valenta.and.thomas/Developer/e2b/packages/python-sdk/e2b/sandbox/sandbox_rpc.py", line 139, in send_message
    raise TimeoutException(
e2b.sandbox.exception.TimeoutException: WebSocket timed out while waiting for: {"jsonrpc": "2.0", "method": "filesystem_write", "params": ["/timeout", "true"], "id": 6} 

All Grafana logs for this instance are:

2024-03-22 16:15:15.136	
processSvc
Started new process
2024-03-22 16:15:15.124	
processSvc
Process with ID doesn't exist yet. Creating a new process
2024-03-22 16:15:15.117	
processSvc
Start process
2024-03-22 16:15:15.065	
processSvc
Subscribed to process stderr
2024-03-22 16:15:15.060	
processSvc
Subscribed to process stdout
2024-03-22 16:15:15.054	
processSvc
Subscribed to process exit
2024-03-22 16:15:15.048	
processSvc
Subscribing to process stderr
2024-03-22 16:15:15.043	
processSvc
Subscribing to process exit
2024-03-22 16:15:15.036	
processSvc
Subscribing to process stdout



2024-03-22 16:15:14.909	
filesystemSvc
Write file

@ValentaTomas
Copy link
Member Author

After looking at the looks it looks like the whole sandbox becomes unresponsive after the process starts — the second filesystem write is not executed when we try to do that after catching the timeout exception.

@ValentaTomas
Copy link
Member Author

I want to test this in JS too, to check if this is not some problem with WS/connection implementation.

@jsuhncc3d
Copy link

I also get timeout in JS. I set timeouts to 5min but it timeouts after 60sec. This happens with any startAndWait call.

TimeoutError: Promise timed out after 60000 milliseconds.

@ValentaTomas
Copy link
Member Author

Error

ERROR:e2b.sandbox.websocket_client:WebSocket received error while receiving messages: no close frame received or sent

Is probably related — when the sandbox stops responding during the sending of message the message becomes mangled.

@ValentaTomas ValentaTomas changed the title process.wait time out process.wait sometimes times out Apr 10, 2024
@mlejva mlejva reopened this Apr 21, 2024
@mlejva mlejva changed the title process.wait sometimes times out WebSocket received error while receiving messages: no close frame received or sent Apr 21, 2024
@mlejva
Copy link
Member

mlejva commented Apr 21, 2024

I'm re-opening this issue because it's still happening to our users. It's most likely connected to a sandbox running out of memory.

This issue should stay open until

  • We have a better and friendlier error message mentioning OOM
  • We have good CPU + RAM usage monitoring for our users

@ValentaTomas
Copy link
Member Author

ValentaTomas commented Apr 23, 2024

We should also improve the OOM handling by exploring the following:

  • zram
  • zswap
  • swap
  • uresourced
  • systemd-oomd
  • changing oom_score_adj for envd, start command and for the processes started via envd (the score is by default inherited)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: 🚧 Working on it
Development

Successfully merging a pull request may close this issue.

3 participants