Help- Issue about the failure of control #18

Open
1998frankchen opened this issue May 7, 2024 · 3 comments

Comments

@1998frankchen

Hi~
My character can't be controlled by the agent, although I can control it manually with the keyboard and mouse. The logger output suggests that self-reflection, information gathering, and action output are all working as expected.
Here is my CLI output. Is the failure caused by the error in ms_deformable_im2col_cuda?
Thanks very much~

Frank


(cradle) C:\Users\vipuser\agent\Cradle>python prototype_runner.py
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:3527.)
return VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
2024-05-07 20:30:57,041 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-05-07 20:31:16,574 - UAC Logger - INFO - Screen capture started
2024-05-07 20:31:20,878 - UAC Logger - INFO - Gather Information Start Frame ID: -1, End Frame ID: 11
2024-05-07 20:31:21,074 - UAC Logger - INFO - >> Calling INFORMATION GATHERING
2024-05-07 20:31:21,075 - UAC Logger - INFO - Using frame extractor to gather information
2024-05-07 20:31:21,076 - UAC Logger - INFO - Extracting Informative Frames from C:\Users\vipuser\agent\Cradle\runs\1715084999.6247497\video_splits\video-00001.mp4 .....
2024-05-07 20:31:24,843 - UAC Logger - INFO - Frame Extraction Completed! Total Frames: 0
2024-05-07 20:31:24,845 - UAC Logger - INFO - Using icon replacer to gather information
2024-05-07 20:31:24,845 - UAC Logger - INFO - Start gathering text information from the whole video in parallel
2024-05-07 20:31:24,850 - UAC Logger - INFO - Finish gathering text information from the whole video
2024-05-07 20:31:24,850 - UAC Logger - INFO - Using llm description to gather information
2024-05-07 20:31:24,869 - UAC Logger - INFO - Requesting gpt-4-vision-preview completion...
2024-05-07 20:31:41,320 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-07 20:31:41,349 - UAC Logger - INFO - Response received from gpt-4-vision-preview.
2024-05-07 20:31:41,362 - UAC Logger - INFO - Using object detector to gather information
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\transformers\modeling_utils.py:1051: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\utils\checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\utils\checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
2024-05-07 20:31:51,635 - UAC Logger - INFO - Image Description: The image shows a third-person view in the game Red Dead Redemption 2. The player character is on horseback, holding a lantern in a snowy environment at night. The camera is positioned behind the player character, looking towards another character ahead, probably Dutch van der Linde, indicated by the dialogue caption "Dutch We have to try. Stay close and we'll do our best to stick to the trail." The lower left corner has a mini-map that shows the immediate surroundings of the character; it indicates buildings and the landscape, along with a yellow waypoint line the player character should follow. There is an on-screen prompt that says "Use W to follow Dutch," indicating the control needed to perform the action.
2024-05-07 20:31:51,641 - UAC Logger - INFO - Object Name: null
2024-05-07 20:31:51,642 - UAC Logger - INFO - Reasoning: 1. There is no need to detect an object given the current context; the task is to follow another character, which is not an object detection task.
2. There is no explicit weapon, shoot target, or item specified in the current interface, hence no relevant object needs to be detected according to the provided rules.
2024-05-07 20:31:51,643 - UAC Logger - INFO - Screen Classification: General game interface without any menu
2024-05-07 20:31:51,644 - UAC Logger - INFO - Dialogue: []
2024-05-07 20:31:51,645 - UAC Logger - INFO - Gathered Information: {}
2024-05-07 20:31:51,646 - UAC Logger - INFO - Classification Reasons: []
2024-05-07 20:31:51,647 - UAC Logger - INFO - All Task Guidance: []
2024-05-07 20:31:51,648 - UAC Logger - INFO - Last Task Guidance:
2024-05-07 20:31:51,648 - UAC Logger - INFO - Long Horizon: True
2024-05-07 20:31:51,650 - UAC Logger - INFO - Generated Actions: []
2024-05-07 20:31:51,650 - UAC Logger - INFO - Current Task Guidance:
2024-05-07 20:31:52,734 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-05-07 20:31:52,752 - UAC Logger - INFO - skill_library: ['fight', 'shoot_wolves', 'aim', 'follow', 'mount_horse', 'shoot', 'turn', 'select_weapon', 'turn_and_move_forward', 'select_sidearm', 'turn', 'move_forward', 'turn_and_move_forward']
2024-05-07 20:31:52,791 - UAC Logger - INFO - minimap_information: {'red points': [], 'yellow points': [], 'yellow region': []}
2024-05-07 20:31:52,793 - UAC Logger - INFO - minimap_info_str:
2024-05-07 20:31:52,800 - UAC Logger - INFO - Requesting gpt-4-vision-preview completion...
2024-05-07 20:32:05,753 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-07 20:32:05,768 - UAC Logger - INFO - Response received from gpt-4-vision-preview.
2024-05-07 20:32:05,771 - UAC Logger - INFO - R: ['follow()']
2024-05-07 20:32:05,772 - UAC Logger - INFO - Skill Steps: ['follow()']
2024-05-07 20:32:08,026 - UAC Logger - INFO - Executing skill: follow with params: {}
2024-05-07 20:32:11,980 - UAC Logger - INFO - KeyboardInterrupt Ctrl+C detected, exiting.
2024-05-07 20:32:12,251 - UAC Logger - INFO - Screen capture finished
2024-05-07 20:32:12,260 - UAC Logger - INFO - Screen capture thread is not executing

(cradle) C:\Users\vipuser\agent\Cradle>python prototype_runner.py
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\functional.py:504: UserWarning: torch.meshgrid: in an upcoming release, it will be required to pass the indexing argument. (Triggered internally at ..\aten\src\ATen\native\TensorShape.cpp:3527.)
return VF.meshgrid(tensors, **kwargs) # type: ignore[attr-defined]
final text_encoder_type: bert-base-uncased
2024-05-07 20:36:59,913 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-05-07 20:37:14,483 - UAC Logger - INFO - Screen capture started
2024-05-07 20:37:18,884 - UAC Logger - INFO - Gather Information Start Frame ID: -1, End Frame ID: 15
2024-05-07 20:37:19,160 - UAC Logger - INFO - >> Calling INFORMATION GATHERING
2024-05-07 20:37:19,164 - UAC Logger - INFO - Using frame extractor to gather information
2024-05-07 20:37:19,166 - UAC Logger - INFO - Extracting Informative Frames from C:\Users\vipuser\agent\Cradle\runs\1715085406.0478091\video_splits\video-00001.mp4 .....
2024-05-07 20:37:21,891 - UAC Logger - INFO - Frame Extraction Completed! Total Frames: 1
2024-05-07 20:37:21,895 - UAC Logger - INFO - Using icon replacer to gather information
2024-05-07 20:37:23,268 - UAC Logger - INFO - Start gathering text information from the whole video in parallel
2024-05-07 20:37:25,288 - UAC Logger - INFO - Start gathering text information from the 1th frame
2024-05-07 20:37:25,310 - UAC Logger - INFO - Requesting gpt-4-vision-preview completion...
2024-05-07 20:37:33,203 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-07 20:37:33,214 - UAC Logger - INFO - Response received from gpt-4-vision-preview.
2024-05-07 20:37:33,215 - UAC Logger - INFO - Finish gathering text information from the 1th frame
2024-05-07 20:37:33,218 - UAC Logger - INFO - Finish gathering text information from the whole video
2024-05-07 20:37:33,218 - UAC Logger - INFO - Using llm description to gather information
2024-05-07 20:37:33,231 - UAC Logger - INFO - Requesting gpt-4-vision-preview completion...
2024-05-07 20:37:47,632 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 200 OK"
2024-05-07 20:37:47,645 - UAC Logger - INFO - Response received from gpt-4-vision-preview.
2024-05-07 20:37:47,646 - UAC Logger - INFO - Using object detector to gather information
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\transformers\modeling_utils.py:1051: FutureWarning: The device argument is deprecated and will be removed in v5 of Transformers.
warnings.warn(
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\utils\checkpoint.py:429: UserWarning: torch.utils.checkpoint: please pass in use_reentrant=True or use_reentrant=False explicitly. The default value of use_reentrant will be updated to be False in the future. To maintain current behavior, pass use_reentrant=True. It is recommended that you use use_reentrant=False. Refer to docs for more details on the differences between the two variants.
warnings.warn(
C:\Users\vipuser.conda\envs\cradle\lib\site-packages\torch\utils\checkpoint.py:61: UserWarning: None of the inputs have requires_grad=True. Gradients will be None
warnings.warn(
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
error in ms_deformable_im2col_cuda: no kernel image is available for execution on the device
2024-05-07 20:37:51,837 - UAC Logger - INFO - Image Description: The image shows a snowy nighttime scene with the player character riding a horse. The character is holding a lantern, illuminating the area directly in front of them. The background features wooden buildings covered with snow, and heavy snowfall is visible in the air. The horse appears to be spotted with a white and dark coat. On the left side of the screen, there's a minimap with icons: the player's current position is indicated by an arrow, and there seem to be a few structures around, as well as waypoints or objectives marked on the map. No enemies or NPCs are visible in this image.
2024-05-07 20:37:51,839 - UAC Logger - INFO - Object Name: null
2024-05-07 20:37:51,840 - UAC Logger - INFO - Reasoning: 1. The screenshot does not show the weapon interface, hence no weapon is specified.
2. There is no explicit shoot target indicated.
3. No explicit item is specified for interaction in the image.
4. The screenshot is not on the trade or map interfaces.
5. There is no indication of a task that requires detection of an object.
2024-05-07 20:37:51,840 - UAC Logger - INFO - Screen Classification: General game interface without any menu
2024-05-07 20:37:51,842 - UAC Logger - INFO - Dialogue: [{'index': 0, 'object_id': '-00001_0_00_00_500', 'values': 'Dialogue is null'}]
2024-05-07 20:37:51,842 - UAC Logger - INFO - Gathered Information: {0: {'-00001_0_00_00_500': [{'information': '1. null', 'reasoning': '1. The screenshot does not display any text prompts.', 'item_status': 'Item_status is null', 'environment_information': 'Environment information is null', 'notification': 'Notification is null', 'task_guidance': 'Task is null', 'action_guidance': [], 'dialogue': 'Dialogue is null', 'other': 'Other information is null'}]}}
2024-05-07 20:37:51,842 - UAC Logger - INFO - Classification Reasons: [{'index': 0, 'object_id': '-00001_0_00_00_500', 'values': '1. The screenshot does not display any text prompts.'}]
2024-05-07 20:37:51,843 - UAC Logger - INFO - All Task Guidance: [{'index': 0, 'object_id': '-00001_0_00_00_500', 'values': 'Task is null'}]
2024-05-07 20:37:51,843 - UAC Logger - INFO - Last Task Guidance:
2024-05-07 20:37:51,844 - UAC Logger - INFO - Long Horizon: False
2024-05-07 20:37:51,845 - UAC Logger - INFO - Generated Actions: []
2024-05-07 20:37:51,845 - UAC Logger - INFO - Current Task Guidance:
2024-05-07 20:37:52,149 - httpx - INFO - HTTP Request: POST https://api.openai.com/v1/embeddings "HTTP/1.1 200 OK"
2024-05-07 20:37:52,169 - UAC Logger - INFO - skill_library: ['fight', 'shoot_wolves', 'aim', 'follow', 'mount_horse', 'shoot', 'turn', 'select_weapon', 'turn_and_move_forward', 'select_sidearm', 'turn', 'move_forward', 'turn_and_move_forward']
2024-05-07 20:37:52,216 - UAC Logger - INFO - minimap_information: {'red points': [], 'yellow points': [], 'yellow region': []}
2024-05-07 20:37:52,219 - UAC Logger - INFO - minimap_info_str:
2024-05-07 20:37:52,225 - UAC Logger - INFO - Requesting gpt-4-vision-preview completion...
2024-05-07 20:38:05,748 - UAC Logger - INFO - KeyboardInterrupt Ctrl+C detected, exiting.
2024-05-07 20:38:06,079 - UAC Logger - INFO - Screen capture finished
2024-05-07 20:38:06,081 - UAC Logger - INFO - Screen capture thread is not executing

@WeihaoTan
Collaborator

Hi, thanks for reaching out. It seems that your CUDA version is not compatible with your torch version. Did you install them strictly according to our README? The recommended CUDA version is 11.8 and the recommended torch version is 2.1.1.

@1998frankchen
Author

nvcc: NVIDIA (R) Cuda compiler driver
Copyright (c) 2005-2022 NVIDIA Corporation
Built on Wed_Sep_21_10:41:10_Pacific_Daylight_Time_2022
Cuda compilation tools, release 11.8, V11.8.89
Build cuda_11.8.r11.8/compiler.31833905_0

print(torch.__version__)
2.1.1+cu118
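
For readers who want to repeat this comparison without eyeballing the two outputs, here is a minimal sketch. The helper names (`cuda_tag_from_torch`, `cuda_release_from_nvcc`, `versions_match`) are hypothetical and not part of Cradle or torch; the sketch only parses the two strings shown above and assumes the usual `+cuXYZ` build-tag convention.

```python
import re


def cuda_tag_from_torch(version):
    """Return the CUDA version encoded in a torch build string, or None.

    Example: '2.1.1+cu118' -> '11.8'. CPU-only builds ('+cpu') return None.
    """
    match = re.search(r"\+cu(\d+)$", version)
    if match is None:
        return None
    digits = match.group(1)
    # The tag packs major and minor together: the last digit is the minor.
    return f"{digits[:-1]}.{digits[-1]}"


def cuda_release_from_nvcc(nvcc_output):
    """Extract the 'release X.Y' value from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def versions_match(torch_version, nvcc_output):
    """True when the torch build tag and the nvcc release name the same CUDA version."""
    torch_cuda = cuda_tag_from_torch(torch_version)
    return torch_cuda is not None and torch_cuda == cuda_release_from_nvcc(nvcc_output)


# Strings copied from the outputs posted in this comment.
nvcc_text = "Cuda compilation tools, release 11.8, V11.8.89"
print(versions_match("2.1.1+cu118", nvcc_text))
```

A match here only says that the toolkit and the torch wheel agree on the CUDA version. It says nothing about which GPU architectures a separately compiled extension was built for.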

Thanks for your reply.

I strictly followed the README. Do you think the "no kernel image is available" messages in the CLI output are the root cause of the failed control?

If so, maybe I should switch to another server provider to rule out other potential incompatibilities in the pre-installed environment.

@DVampire
Contributor

DVampire commented May 9, 2024

The error is caused by ms_deformable_im2col_cuda in groundingdino. Maybe you can try:

  1. Download CUDA Toolkit 11.8 from https://developer.nvidia.com/cuda-11-8-0-download-archive and install it. We have found that installing 11.3 or another version of CUDA may result in this error.
  2. Configure the system environment variables.
CUDA_HOME=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8
CUDA_PATH=C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8

# Add CUDA_PATH and CUDA_HOME to Path
Path=[Path];C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8;C:\Program Files\NVIDIA GPU Computing Toolkit\CUDA\v11.8


  3. Install Torch
pip3 install --upgrade torch==2.1.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html
pip3 install torchvision==0.16.1+cu118 -f https://download.pytorch.org/whl/torch_stable.html
  4. Install cudatoolkit
conda install pytorch torchvision cudatoolkit=11.8 -c nvidia -c pytorch
  5. Install groundingdino
git clone https://github.com/IDEA-Research/GroundingDINO.git
cd GroundingDINO

# Build and install it
pip3 install -r requirements.txt
pip3 install -e .
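
The CUDA message "no kernel image is available for execution on the device" generally means that a compiled binary contains no code for the GPU's compute capability. As a rough diagnostic after reinstalling, the sketch below compares the GPU's capability with the architectures torch itself was built for. The helper `arch_is_covered` and the function `report` are hypothetical names, and this is an assumption-laden check rather than the project's own procedure: it inspects only the torch build, while the GroundingDINO extension is compiled separately during `pip3 install -e .` and may target different architectures.

```python
def arch_is_covered(capability, arch_list):
    """Check whether a (major, minor) capability can run code from arch_list.

    arch_list uses the names torch reports, e.g. ['sm_80', 'compute_80'].
    An 'sm_XY' binary runs on GPUs with the same major and an equal or higher
    minor version; 'compute_XY' PTX can be JIT-compiled for any equal or
    newer capability.
    """
    major, minor = capability
    for arch in arch_list:
        kind, _, digits = arch.partition("_")
        if not digits.isdigit():
            continue  # skip entries this sketch does not understand
        arch_major, arch_minor = int(digits[:-1]), int(digits[-1])
        if kind == "sm" and arch_major == major and arch_minor <= minor:
            return True
        if kind == "compute" and (arch_major, arch_minor) <= (major, minor):
            return True
    return False


def report():
    """Print the local GPU capability and whether the torch build covers it."""
    try:
        import torch
    except ImportError:
        print("torch is not installed in this environment")
        return
    if not torch.cuda.is_available():
        print("CUDA is not available to torch")
        return
    capability = torch.cuda.get_device_capability(0)
    arch_list = torch.cuda.get_arch_list()
    print("GPU:", torch.cuda.get_device_name(0), "capability:", capability)
    print("torch built for:", arch_list)
    print("covered:", arch_is_covered(capability, arch_list))


if __name__ == "__main__":
    report()
```

If the GPU's capability turns out not to be covered by the extension build, one option worth trying is setting the `TORCH_CUDA_ARCH_LIST` environment variable to that capability (for example `8.6`) before rebuilding GroundingDINO, since torch's extension builder reads it when choosing target architectures.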
