
Refactor load weights #1603

Open

grimoire wants to merge 7 commits into base: main
Conversation

@grimoire (Collaborator) commented May 16, 2024

Optimize TP (tensor parallel) model loading.

requirement
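
Background, not part of the PR description above: in tensor-parallel (TP) loading, each rank materializes only its own slice of every weight rather than the full tensor. A minimal hypothetical sketch, with invented names:

```python
import torch


def shard_for_rank(full: torch.Tensor, rank: int, world_size: int,
                   dim: int = 0) -> torch.Tensor:
    """Return this tensor-parallel rank's slice of a full weight."""
    assert full.size(dim) % world_size == 0, 'weight must split evenly'
    return full.chunk(world_size, dim=dim)[rank].clone()


# Example: an (out_features=8, in_features=4) linear weight split over 2 ranks.
w = torch.randn(8, 4)
print(shard_for_rank(w, rank=0, world_size=2).shape)  # torch.Size([4, 4])
```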

@grimoire grimoire mentioned this pull request May 21, 2024
2 tasks
@grimoire grimoire marked this pull request as draft May 22, 2024 09:09
@grimoire grimoire marked this pull request as ready for review May 22, 2024 09:31
@lvhan028 lvhan028 requested a review from RunningLeon June 4, 2024 06:50
@RunningLeon (Collaborator):
Hi @zhulinJulia24, could you start a full-scope test of all PyTorch engine models using the daily_test CI? Thanks.

    rank=rank,
    world_size=world_size,
    prefix='query_key_value')
rowwise_parallelize_linear(self.dense,
Collaborator:
Much better than the previous version.
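
For context on this hunk: in the usual Megatron-style split, a column-parallel linear shards its weight along the output dimension and a row-parallel linear along the input dimension (its partial outputs then need an all-reduce). A rough sketch of the idea, with illustrative helpers rather than lmdeploy's actual rowwise_parallelize_linear implementation:

```python
import torch
from torch import nn


def colwise_shard_(linear: nn.Linear, rank: int, world_size: int) -> None:
    """Column parallel: split out_features, i.e. dim 0 of the (out, in) weight."""
    w = linear.weight.data.chunk(world_size, dim=0)[rank].clone()
    linear.weight = nn.Parameter(w)
    if linear.bias is not None:
        b = linear.bias.data.chunk(world_size, dim=0)[rank].clone()
        linear.bias = nn.Parameter(b)


def rowwise_shard_(linear: nn.Linear, rank: int, world_size: int) -> None:
    """Row parallel: split in_features (dim 1 of the weight); each rank's
    partial output is summed across ranks with an all-reduce afterwards."""
    w = linear.weight.data.chunk(world_size, dim=1)[rank].clone()
    linear.weight = nn.Parameter(w)
```

In the hunk above, self.dense is sharded row-wise, the standard choice for the attention output projection; the preceding query_key_value call is presumably the column-parallel counterpart.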

logger = get_logger('lmdeploy')


def _get_weight_type(model_path: str, use_safetensors: bool = None):
Collaborator:
use_safetensors can be {True, False, None}. Why not True or False?
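
One common reason for the tri-state, offered here as an assumption rather than as the author's recorded answer: None usually means "auto-detect from the files in the checkpoint directory", while True/False force a format. A hypothetical version of that pattern (not the actual lmdeploy code):

```python
import os


def _get_weight_type(model_path: str, use_safetensors: bool = None) -> str:
    """Hypothetical re-implementation of the tri-state pattern."""
    has_st = any(f.endswith('.safetensors') for f in os.listdir(model_path))
    if use_safetensors is None:        # auto-detect: prefer safetensors
        return 'safetensors' if has_st else 'pytorch'
    if use_safetensors and not has_st:
        raise FileNotFoundError(
            f'use_safetensors=True but no .safetensors files in {model_path}')
    return 'safetensors' if use_safetensors else 'pytorch'
```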

for name, param in mod.named_parameters(recurse=False):
    dtype = param.dtype
    if not loader.has(name):
        logger.debug(f'rank [{rank}]'
Collaborator:
How would this condition be triggered?

Collaborator (Author):
Some models share the token embedding weight (weight tying), so they do not save the redundant weight in the checkpoint.
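
Concretely, when an output head ties its weight to the token embedding, the checkpoint stores that tensor only once, so loader.has(name) can be False for a perfectly valid model. A small illustration with an invented module (not the PR's code):

```python
import torch
from torch import nn


class TinyLM(nn.Module):
    """Toy model with tied weights: the head reuses the embedding matrix."""

    def __init__(self, vocab: int = 100, dim: int = 16):
        super().__init__()
        self.embed = nn.Embedding(vocab, dim)
        self.lm_head = nn.Linear(dim, vocab, bias=False)
        self.lm_head.weight = self.embed.weight   # weight tying


model = TinyLM()
state = {'embed.weight': torch.randn(100, 16)}    # checkpoint stores it once

# 'lm_head.weight' is absent -- the "not loader.has(name)" case above --
# yet the model is complete once 'embed.weight' is loaded, because both
# names refer to the same tensor.
print([k for k in model.state_dict() if k not in state])   # ['lm_head.weight']
model.load_state_dict(state, strict=False)
```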

@@ -160,204 +157,3 @@ def sync_qparam_to_context(context: Any, layer_id: str, qparams: dict):
        context.set_output(layer_id, last_qparam)
    else:
        context.set_output(layer_id, qparams)


@torch.no_grad()
Collaborator:
Was this used anywhere before?

Collaborator (Author):
Almost never.
