Streamline LitGPT API #1403
You should accompany any decision with a PoC of how to implement it. I say this because (to the best of my knowledge) a call like

```python
def dispatch_finetune(
    repo_id,  # required
    method="lora",
):
    if method == "lora":
        from litgpt.finetune.lora import main
        main(repo_id)
    elif ...
```

where based on the arguments you call a different function, is something jsonargparse needs to understand in order to pull out the arguments from the inner function. So it might need to be replaced with an alternative tool. It will depend strongly on the tool chosen for the job proposed.
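For illustration, a minimal sketch of this dispatch pattern using only the standard library; `lora_main` and `full_main` are hypothetical stand-ins for the real `litgpt.finetune.*` entry points, and the hand-written arguments show exactly what jsonargparse would otherwise have to introspect automatically:

```python
import argparse

# Hypothetical stand-ins for litgpt.finetune.lora.main / litgpt.finetune.full.main.
def lora_main(repo_id, lora_r=8):
    return f"lora finetune of {repo_id} (r={lora_r})"

def full_main(repo_id, lr=1e-4):
    return f"full finetune of {repo_id} (lr={lr})"

def dispatch_finetune(argv=None):
    parser = argparse.ArgumentParser(prog="litgpt finetune")
    parser.add_argument("repo_id")
    parser.add_argument("--method", default="lora", choices=["lora", "full"])
    # With plain argparse, the inner functions' own arguments (lora_r, lr, ...)
    # would have to be duplicated here by hand; the point above is that
    # jsonargparse must introspect the selected inner function to expose them.
    args = parser.parse_args(argv)
    if args.method == "lora":
        return lora_main(args.repo_id)
    return full_main(args.repo_id)
```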
See also my previous comment on this topic: #996 (comment)
The limitation you mentioned would be for selectively showing the LoRA args, correct? An alternative would be to show all finetune arguments (full, adapter, lora). I think users will know that the LoRA parameters only have an effect if they select the LoRA method. Switching to click could be an option longer term, but I think this would be a bigger lift.
On 2): could we keep it pretraining from scratch by default? If not, then there would have to be a very loud warning IMO, and a way to opt out of auto-loading a checkpoint. What would that look like?

To add to Carlos' comment: if a CLI rewrite is considered, we would have to be super sure it can support all our use cases and requirements. There might also be an option to work more closely with the jsonargparse author if we're blocked by missing features.
Personally, I have a slight preference to keep it pretraining from scratch, because it's also what most users would expect in my opinion. |
Yes. But also for the `--data` argument or the `--generate` subcommand, etc. These are technical details that are currently covered by jsonargparse automagically, and an alternative might require you to do something very different. This is why I insist on creating a PoC that mimics the script/args/config structure to see how difficult it becomes. You could also completely change the args and config structure if you are doing breaking changes anyway.
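As one example of what such a PoC would have to cover, the subcommand-plus-`--data` structure could be mimicked with standard-library argparse like this (command names are from the discussion; the `--data` default and argument set are made up, not LitGPT's actual ones):

```python
import argparse

def build_cli():
    # Hypothetical sketch of the subcommand + --data structure discussed above.
    parser = argparse.ArgumentParser(prog="litgpt")
    sub = parser.add_subparsers(dest="command", required=True)

    finetune = sub.add_parser("finetune")
    finetune.add_argument("repo_id")
    finetune.add_argument("--data", default="Alpaca")  # illustrative default

    generate = sub.add_parser("generate")
    generate.add_argument("repo_id")
    return parser
```

With plain argparse, every nested config option would need to be registered by hand like this, which is exactly the "automagic" coverage jsonargparse currently provides.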
To summarize from our meeting this morning, an easier path forward might be to use

```
litgpt finetune_full
litgpt finetune_lora
litgpt finetune_adapter
litgpt finetune_adapter_v2
```

where we also keep […]. To keep things simple for newcomers, we would only show […] in the finetuning docs (plus the […]).
Following up on a longer internal discussion we had (cc @carmocca @lantiga @awaelchli), we want to support the following user-friendly API in LitGPT:

```
# litgpt [action] [model]
litgpt download meta-llama/Meta-Llama-3-8B-Instruct
litgpt chat meta-llama/Meta-Llama-3-8B-Instruct
litgpt finetune meta-llama/Meta-Llama-3-8B-Instruct
litgpt pretrain meta-llama/Meta-Llama-3-8B-Instruct
litgpt serve meta-llama/Meta-Llama-3-8B-Instruct
```
So, in other words, we would make the `<repo_id>` a positional argument.

1) Root dir

We probably should introduce a `--checkpoint_root_dir` defaulting to `"checkpoints"`, so that `litgpt download` downloads to `checkpoints/<repo_id>` and the other commands use `checkpoints/<repo_id>`.
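The proposed path convention could be sketched as a small helper (the function name is hypothetical, chosen here only for illustration):

```python
from pathlib import Path

def resolve_checkpoint_dir(repo_id, checkpoint_root_dir="checkpoints"):
    # Hypothetical helper: map the positional <repo_id> to the local
    # directory that `litgpt download` would populate and that the
    # other commands (chat, finetune, serve, ...) would read from.
    return Path(checkpoint_root_dir) / repo_id
```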
2) Pretrain behavior
In pretrain, if someone runs `litgpt pretrain <repo_id>`, the question is whether the `<repo_id>` takes the place of `--model_name` or `--initial_checkpoint_dir`. I don't have a strong opinion here, but pretraining from scratch would probably be the first thing someone thinks of when reading `pretrain` (as opposed to continued pretraining). In this case we would use `meta-llama/Meta-Llama-3-8B-Instruct` as `--model_name` (or extract `Meta-Llama-3-8B-Instruct` as the model name). The counterargument is that using this positional argument to replace `initial_checkpoint_dir` would be more consistent with `litgpt download`, `litgpt finetune`, etc.

3) Removing subcommands
This means that we probably cannot support subcommands as in `litgpt finetune <repo_id>` and `litgpt finetune lora <repo_id>`. (Please correct me if I'm wrong.) So, we would change it to the `finetune_full` / `finetune_lora` / `finetune_adapter` / `finetune_adapter_v2` commands listed above.
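The two interpretations of the positional argument in `pretrain` (point 2 above) could be sketched as follows; the helper name, flag, and returned keys are hypothetical:

```python
def interpret_pretrain_arg(repo_id, from_scratch=True):
    # Hypothetical: treat the positional repo_id either as a model/config
    # name (pretraining from scratch, the proposed default) or as a local
    # checkpoint to continue pretraining from.
    if from_scratch:
        # e.g. extract "Meta-Llama-3-8B-Instruct" as the model name
        return {"model_name": repo_id.split("/")[-1]}
    return {"initial_checkpoint_dir": f"checkpoints/{repo_id}"}
```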
4) Deprecating `--repo_id` and `--checkpoint_dir`

Maybe a full deprecation would not be possible due to the subcommands issue above, but we should probably at least keep `--repo_id` and `--checkpoint_dir` for now, and if they are not empty strings, quit with a message explaining the API change (instead of an error stack trace).
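A sketch of that graceful failure, with a hypothetical guard function (the message wording and function name are illustrative):

```python
import sys

def reject_deprecated_flags(repo_id="", checkpoint_dir=""):
    # Hypothetical guard: if the old flags were passed (non-empty strings),
    # exit with a short explanation instead of an error stack trace.
    if repo_id or checkpoint_dir:
        sys.exit(
            "--repo_id and --checkpoint_dir are deprecated; pass the model "
            "as a positional argument instead, e.g. "
            "`litgpt finetune meta-llama/Meta-Llama-3-8B-Instruct`."
        )
```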