
NeMo Dev Doc Feature Updates 1: Some parallelisms #9184

Merged: 30 commits merged into main from yuya/feature_updates_1 on May 17, 2024

Conversation

yaoyu-33 (Collaborator) commented May 13, 2024

What does this PR do?

In this PR, we update the docs for tensor parallelism (TP), pipeline parallelism (PP), virtual pipeline parallelism (VP), context parallelism (CP), and expert parallelism (EP) (moved from #9148).
We add a section on GQA/MQA.

We also add sequence packing for NeVA.

Changelog

  • Add specific line-by-line info of high-level changes in this PR.

Usage

  • A minimal configuration sketch illustrating the parallelism settings documented in this PR is shown below.
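The following is a hedged sketch, not code from the PR itself: it shows the standard NeMo/Megatron model-config fields that the updated docs cover, with purely illustrative sizes.

# Hedged sketch (illustrative values, not from this PR): the standard
# NeMo/Megatron model-config fields for the parallelisms documented here.
from omegaconf import OmegaConf

cfg = OmegaConf.create({
    "model": {
        "tensor_model_parallel_size": 2,            # TP: shard each layer's weights across GPUs
        "pipeline_model_parallel_size": 2,          # PP: split the layer stack into sequential stages
        "virtual_pipeline_model_parallel_size": 2,  # VP: interleave stages to reduce the pipeline bubble
        "context_parallel_size": 2,                 # CP: shard activations along the sequence dimension
        "expert_model_parallel_size": 2,            # EP: distribute MoE experts across GPUs
    }
})
# VP consumes no extra GPUs; the world size must be divisible by
# TP * PP * CP, and data parallelism fills the remaining factor.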

GitHub Actions CI

The Jenkins CI system has been replaced by GitHub Actions self-hosted runners.

The GitHub Actions CI will run automatically when the "Run CICD" label is added to the PR.
To re-run CI, remove and re-add the label.
To run CI on an untrusted fork, a NeMo user with write access must first click "Approve and run".

Before your PR is "Ready for review"

Pre-checks:

  • Make sure you have read and followed the Contributor guidelines
  • Did you write any new necessary tests?
  • Did you add or update any necessary documentation?
  • Does the PR affect components that are optional to install? (e.g., Numba, Pynini, Apex)
    • Reviewer: Does the PR have correct import guards for all optional libraries?

PR Type:

  • New Feature
  • Bugfix
  • Documentation

If you haven't finished some of the above items, you can still open a "Draft" PR.

Who can review?

Anyone in the community is free to review the PR once the checks have passed.
The Contributor guidelines list specific people who can review PRs to various areas.

Additional Information

  • Related to # (issue)

erastorgueva-nv and others added 24 commits April 22, 2024 16:41 (signed off by Elena Rastorgueva <erastorgueva@nvidia.com> and yaoyu-33 <yaoyu.094@gmail.com>; one merge resolved conflicts in docs/source/features/memory_optimizations.rst and docs/source/nlp/text_normalization/text_normalization_as_tagging.rst).
akoumpa mentioned this pull request May 13, 2024
ericharper requested a review from jgerh May 13, 2024 19:11
Implementation
^^^^^^^^^^^^^^

NeMo's support for GQA and MQA is enabled through the integration of Megatron-Core's Attention mechanism. The underlying implementation details can be explored within the Attention class of Megatron-Core, which provides the functional backbone for these advanced attention methods. To understand the specific modifications and implementations of MQA and GQA, refer to the source code in the Attention class:
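
For reference, a minimal sketch (with illustrative head counts, not taken from this PR) of how these attention variants are typically selected through the standard num_query_groups config field consumed by this Attention implementation:

# Hedged sketch (illustrative head counts): MHA, GQA, and MQA differ only in
# num_query_groups, i.e. how many key/value heads the query heads share.
mha_cfg = {"num_attention_heads": 32, "num_query_groups": 32}  # MHA: one KV head per query head
gqa_cfg = {"num_attention_heads": 32, "num_query_groups": 8}   # GQA: every 4 query heads share one KV head
mqa_cfg = {"num_attention_heads": 32, "num_query_groups": 1}   # MQA: all query heads share a single KV head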
jgerh (Collaborator) commented May 14, 2024:

Change all instances of Megatron-Core to Megatron Core (unless it is part of the directory structure).

yaoyu-33 (Collaborator, Author) replied:

It's called megatron-core here: https://docs.nvidia.com/megatron-core/index.html

jgerh (Collaborator) replied:

Regarding the correct use of the product title: I checked with our editor Zenobia (Z) Redeaux <zredeaux@nvidia.com>. She then checked with the Legal Dept (Avisheh Madani <avmadani@nvidia.com>) and our Tech Pubs manager Robert Morrish <rmorrish@nvidia.com>.

The correct use is Megatron Core, not Megatron-Core.

yaoyu-33 (Collaborator, Author) replied:

Thanks for checking. I fixed this.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) left a review:

I completed the copyedit of the files.

1. I suggest making a global change from Megatron-Core to Megatron Core.
2. Please review the comments, decide on a consistent naming convention for the models, and make them consistent throughout the file. For example, Tensor Parallelism or Tensor Model Parallelism, Expert Parallelism or Expert Model Parallelism, etc.

Files with review comments:
  • docs/source/features/memory_optimizations.rst
  • docs/source/features/parallelisms.rst (multiple threads)
yaoyu-33 added 3 commits (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) commented May 15, 2024:

I made a few more copyedits to the file. Please review and implement the changes.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) commented May 16, 2024:

I looked at your latest comments and added my responses to your questions. Otherwise, the changes to the files look good.

jgerh (Collaborator) left a review:

Review completed and approved.

yaoyu-33 added a commit (Signed-off-by: yaoyu-33 <yaoyu.094@gmail.com>).
jgerh (Collaborator) left a review:

Approved copyedit changes.

ericharper merged commit 51c2c3f into main May 17, 2024 (12 checks passed)
ericharper deleted the yuya/feature_updates_1 branch May 17, 2024 16:47