
NeMo Dev Doc Feature Updates 1: Some parallelisms #9184

Merged: 30 commits merged into main from yuya/feature_updates_1 on May 17, 2024

Conversation

yaoyu-33 (Collaborator) commented May 13, 2024

What does this PR do?

In this PR, we update the docs for tensor parallelism (TP), pipeline parallelism (PP), virtual pipeline parallelism (VP), context parallelism (CP), and expert parallelism (EP) (moved from #9148).
We add a section on GQA/MQA.

We also add sequence packing for NeVA.

Changelog

  • Add specific line-by-line info of high-level changes in this PR.

Usage

  • A minimal configuration sketch illustrating the parallelism settings documented in this PR is shown below.
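The following is a hedged sketch, not code from the PR itself: it shows the standard NeMo/Megatron model-config fields that the updated docs cover, with purely illustrative sizes.

# Hedged sketch (illustrative values, not from this PR): the standard
# NeMo/Megatron model-config fields for the parallelisms documented here.
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "model": {
        "tensor_model_parallel_size": 2,            # TP: shard each layer's weights across GPUs
        "pipeline_model_parallel_size": 2,          # PP: split the layer stack into sequential stages
        "virtual_pipeline_model_parallel_size": 2,  # VP: interleave stages to reduce the pipeline bubble
        "context_parallel_size": 2,                 # CP: shard activations along the sequence dimension
        "expert_model_parallel_size": 2,            # EP: distribute MoE experts across GPUs
    }
})
# VP consumes no extra GPUs; the world size must be divisible by
# TP * PP * CP, and data parallelism fills the remaining factor.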

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and re-add the label.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre-checks:

  • Make sure you have read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g., Numba, Pynini, Apex)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

erastorgueva-nv and others added 24 commits April 22, 2024 16:41 (signed off by Elena Rastorgueva <erastorgueva@nvidia.com> and yaoyu-33 <yaoyu.094@gmail.com>; one merge resolved conflicts in docs/source/features/memory_optimizations.rst and docs/source/nlp/text_normalization/text_normalization_as_tagging.rst).
akoumpa mentioned this pull request May 13, 2024
ericharper requested a review from jgerh May 13, 2024 19:11
Implementation
^^^^^^^^^^^^^^

NeMo's support for GQA and MQA is enabled through the integration of Megatron-Core's Attention mechanism. The underlying implementation details can be explored within the Attention class of Megatron-Core, which provides the functional backbone for these advanced attention methods. To understand the specific modifications and implementations of MQA and GQA, refer to the source code in the Attention class:
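
For reference, a minimal sketch (with illustrative head counts, not taken from this PR) of how these attention variants are typically selected through the standard num_query_groups config field consumed by this Attention implementation:

# Hedged sketch (illustrative head counts): MHA, GQA, and MQA differ only in
# num_query_groups, i.e. how many key/value heads the query heads share.
mha_cfg = {"num_attention_heads": 32, "num_query_groups": 32}  # MHA: one KV head per query head
gqa_cfg = {"num_attention_heads": 32, "num_query_groups": 8}   # GQA: every 4 query heads share one KV head
mqa_cfg = {"num_attention_heads": 32, "num_query_groups": 1}   # MQA: all query heads share a single KV head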
jgerh (Collaborator) commented May 14, 2024:

Change all instances of Megatron-Core to Megatron Core (unless it is part of the directory structure).

yaoyu-33 (Collaborator, Author) replied:

It's called megatron-core here: https://docs.nvidia.com/megatron-core/index.html

jgerh (Collaborator) replied:

Regarding the correct use of the product title: I checked with our editor Zenobia (Z) Redeaux <zredeaux@nvidia.com>. She then checked with the Legal Dept (Avisheh Madani <avmadani@nvidia.com>) and our Tech Pubs manager Robert Morrish <rmorrish@nvidia.com>.

The correct use is Megatron Core, not Megatron-Core.

yaoyu-33 (Collaborator, Author) replied:

Thanks for checking. I fixed this.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) left a review:

I completed the copyedit of the files.

1. I suggest making a global change from Megatron-Core to Megatron Core.
2. Please review the comments, decide on a consistent naming convention for the models, and make them consistent throughout the file. For example, Tensor Parallelism or Tensor Model Parallelism, Expert Parallelism or Expert Model Parallelism, etc.

Files with review comments:
  • docs/source/features/memory_optimizations.rst
  • docs/source/features/parallelisms.rst (multiple threads)
yaoyu-33 added 3 commits (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) commented May 15, 2024:

I made a few more copyedits to the file. Please review and implement the changes.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) commented May 16, 2024:

I looked at your latest comments and added my responses to your questions. Otherwise, the changes to the files look good.

jgerh (Collaborator) left a review:

Review completed and approved.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) left a review:

Approved copyedit changes.

ericharper merged commit 51c2c3f into main May 17, 2024 (12 checks passed)
ericharper deleted the yuya/feature_updates_1 branch May 17, 2024 16:47