Necessity of GDINO? #5
Comments
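I have a question. The paper explains that GPT-4V lacks localization ability, but given a description of an icon, is GPT-4V really unable to output its coordinates? Is GDINO necessary?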
We tried having GPT-4V output the coordinates of an icon from its description, but found that it cannot do this. In other words, GPT-4V has perception ability but not localization ability. Hope this helps.
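Thanks for the reply. Are there plans for a future update that addresses this (localization handled end-to-end by the whole model rather than by a separate module)?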
You can refer to this reply.
GPT-4V's grounding ability is not good, which is why this project trains its own model. Most models used for grounding tasks are custom-trained.
Interesting Work!