The program always executes behavior cloning when running CQLLearner. #305

Open
jiangjiadi opened this issue Jul 6, 2023 · 0 comments

When I run the CQL algorithm, it only ever executes behavior cloning. I checked the config in use: the number of training steps is 100 and 'num_bc_iters' is set to 50.
Digging further into the source code of CQLLearner, I found that 'counts' in the function 'step' has two keys, "steps" and "walltime".
However, the implementation of 'step' uses the key "learner_steps".
Because the key "learner_steps" is invalid, "cur_step" is always 0, so the algorithm only ever executes behavior cloning.
After I changed the key "learner_steps" to "steps", the problem was solved.
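
For anyone who wants to reproduce the effect without the full training setup, here is a minimal sketch of the behavior described above. It is not the library's code: `select_phase` and `counts_history` are hypothetical names, and the lookup with a default of 0 is an assumption that is consistent with "cur_step" always being 0. Only the keys "steps", "walltime", "learner_steps" and the values 100 and 50 come from the report.

```python
# Hypothetical stand-in for the phase selection inside the learner's step.
# Assumption: the step counter is read with a default of 0 when the key is
# missing, which matches the observed "cur_step is always 0".
def select_phase(counts, num_bc_iters, step_key):
    cur_step = counts.get(step_key, 0)
    return "behavior_cloning" if cur_step < num_bc_iters else "cql"


# Simulated counter output for 100 training steps: only the keys
# "steps" and "walltime" are present, as observed in the issue.
counts_history = [{"steps": i, "walltime": 0.1 * i} for i in range(1, 101)]

# Wrong key: the lookup never finds a value, so the phase never switches.
buggy = [select_phase(c, 50, "learner_steps") for c in counts_history]

# Corrected key: the phase switches once the step count reaches num_bc_iters.
fixed = [select_phase(c, 50, "steps") for c in counts_history]

print("wrong key, cql updates:", buggy.count("cql"))
print("fixed key, cql updates:", fixed.count("cql"))
```

With the wrong key every update stays in the behavior cloning phase, while the corrected key hands over to the CQL update after the warm-up.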
