NeMo Dev Doc Feature Updates 1: Some parallelisms #9184
Conversation
Signed-off-by: Elena Rastorgueva <erastorgueva@nvidia.com>
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
# Conflicts:
#   docs/source/features/memory_optimizations.rst
#   docs/source/nlp/text_normalization/text_normalization_as_tagging.rst
Implementation
^^^^^^^^^^^^^^

NeMo's support for GQA and MQA is enabled through the integration of Megatron-Core's Attention mechanism. The underlying implementation details can be explored within the Attention class of Megatron-Core, which provides the functional backbone for these advanced attention methods. To understand the specific modifications and implementations of MQA and GQA, refer to the source code in the Attention class:
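To make the GQA/MQA distinction concrete, here is a minimal, self-contained sketch of the grouped-query attention head mapping. This is illustrative pseudocode in NumPy, not NeMo or Megatron Core code; the function name and shapes are hypothetical. Each group of query heads shares a single key/value head (MQA is the special case of one group).

```python
import numpy as np

def grouped_query_attention(q, k, v, num_query_groups):
    """Toy GQA sketch. q: (num_heads, seq, d); k, v: (num_query_groups, seq, d)."""
    num_heads = q.shape[0]
    assert num_heads % num_query_groups == 0
    heads_per_group = num_heads // num_query_groups
    # Each group of query heads shares one key/value head: repeat KV
    # heads so every query head in a group attends to the same KV.
    k = np.repeat(k, heads_per_group, axis=0)
    v = np.repeat(v, heads_per_group, axis=0)
    # Standard scaled dot-product attention per head.
    scores = q @ k.transpose(0, 2, 1) / np.sqrt(q.shape[-1])
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights = weights / weights.sum(axis=-1, keepdims=True)
    return weights @ v  # (num_heads, seq, d)
```

With `num_query_groups=1` this reduces to MQA; with `num_query_groups == num_heads` it is ordinary multi-head attention. The savings come from the smaller KV projections and KV cache, not from this repeat step.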
Change all instances of Megatron-Core to Megatron Core (unless it is part of the directory structure)
it's called megatron-core here: https://docs.nvidia.com/megatron-core/index.html
Regarding the correct use of the product title: I checked with our editor Zenobia (Z) Redeaux <zredeaux@nvidia.com>. She then checked with Avisheh Madani <avmadani@nvidia.com> in the Legal Dept and our Tech Pubs manager Robert Morrish <rmorrish@nvidia.com>.
The correct use is Megatron Core, not Megatron-Core.
Thanks for checking. I fixed this.
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
I completed the copyedit of the files.

1. I suggest making a global change from Megatron-Core to Megatron Core.
2. Please review the comments, decide on a consistent naming convention for the models, and make them consistent throughout the file. For example, Tensor Parallelism or Tensor Model Parallelism, Expert Parallelism or Expert Model Parallelism, etc.
I made a few more copyedits to the file. Please review and implement the changes.
Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>
I looked at your latest comments and added my responses to your questions. Otherwise, the changes to the files look good.
Review completed and approved
Approved copyedit changes.
What does this PR do?
In this PR, we updated the docs for TP/PP/VP/CP and EP (moved from #9148).
We added a section for GQA/MQA.
We also added documentation on sequence packing for NeVA.
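Since this PR documents several parallelism schemes, a toy sketch of the core idea behind tensor parallelism may help reviewers: a linear layer's weight matrix is split column-wise across devices, each device computes its slice of the output, and a gather recovers the full result. This is illustrative NumPy code, not NeMo or Megatron Core code; the function name and the simulated-device loop are hypothetical.

```python
import numpy as np

def column_parallel_linear(x, weight, num_partitions):
    """Toy tensor-parallel (column-parallel) matmul on one process."""
    # Split the weight column-wise: one shard per (simulated) device.
    shards = np.split(weight, num_partitions, axis=1)
    # Each device would compute its slice of the output independently...
    partials = [x @ w for w in shards]
    # ...and an all-gather-style concatenation recovers the full output.
    return np.concatenate(partials, axis=-1)
```

In a real setup each shard lives on a different GPU and the concatenation is a collective communication op; the point of the sketch is only that the sharded computation is mathematically identical to the unsharded `x @ weight`.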
Changelog
Usage
# Add a code snippet demonstrating how to use this
GitHub Actions CI
The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.
The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and add the label again.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".
Before your PR is "Ready for review"
Pre-checks:
PR Type:
If you haven't finished some of the above items, you can still open a "Draft" PR.
Who can review?
Anyone in the community is free to review the PR once the checks have passed.
The contributor guidelines list specific people who can review PRs to various areas.
Additional Information