{"payload":{"feedbackUrl":"https://github.com/orgs/community/discussions/53140","repo":{"id":283874375,"defaultBranch":"main","name":"param","ownerLogin":"facebookresearch","currentUserCanPush":false,"isFork":false,"isEmpty":false,"createdAt":"2020-07-30T20:51:50.000Z","ownerAvatar":"https://avatars.githubusercontent.com/u/16943930?v=4","public":true,"private":false,"isOrgOwned":true},"refInfo":{"name":"","listCacheKey":"v0:1714780169.0","currentOid":""},"activityList":{"items":[{"before":"3c0ce7670e554ac7174fd6875a5af104c7a7d45d","after":"b683e798d6d5748bec243eb436b56b2ea543bab4","ref":"refs/heads/main","pushedAt":"2024-06-05T01:39:00.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"conda laucher\n\nSummary: Added support to launch param tests using conda launcher.\n\nReviewed By: venkatrag1\n\nDifferential Revision: D58159782\n\nfbshipit-source-id: d57ba403066c3fe1304a2c86206dd4e7ab637f95","shortMessageHtmlLink":"conda laucher"}},{"before":"24fc087dc67fc2db7c9d5d0b9552a3fe0e0b6c1d","after":"3c0ce7670e554ac7174fd6875a5af104c7a7d45d","ref":"refs/heads/main","pushedAt":"2024-06-03T15:27:07.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Retry of D58015187 Move AsyncCompile to a different file (#123)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/123\n\nX-link: https://github.com/pytorch/tnt/pull/842\n\nX-link: https://github.com/pytorch/ao/pull/302\n\nX-link: https://github.com/pytorch/pytorch/pull/127691\n\nThis is a retry of https://github.com/pytorch/pytorch/pull/127545/files\nand\nD58015187, fixing the internal test that also imported codecache\n\nReviewed By: oulgen, msaroufim\n\nDifferential Revision: D58054611\n\nfbshipit-source-id: 7a4d6602effa51c839ee8f650548e254c82a42a4","shortMessageHtmlLink":"Retry of D58015187 Move AsyncCompile to a different file (#123)"}},{"before":"ee9071f4157b3e470aecead2aba428482fe53e8f","after":"24fc087dc67fc2db7c9d5d0b9552a3fe0e0b6c1d","ref":"refs/heads/main","pushedAt":"2024-05-30T21:20:49.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add ET files for integration tests and a couple fixes. (#119)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/119\n\nThis DIFF includes the following:\n\n1. Added Resnet 2 GPU ET.\n2. Added HuggingFace GPT2 1 GPU ET\n3. Added HiggingFace GPT2 PT2 1 GPU ET\n4. Two bug fixes in et-replay.\n\nReviewed By: sanrise\n\nDifferential Revision: D57760228\n\nfbshipit-source-id: 8cf2a818ca45dede7a4ea97533d438591f320f8e","shortMessageHtmlLink":"Add ET files for integration tests and a couple fixes. (#119)"}},{"before":"49da106be5b584ef4739ca60cffb9fd251b7f051","after":"ee9071f4157b3e470aecead2aba428482fe53e8f","ref":"refs/heads/main","pushedAt":"2024-05-22T22:37:28.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Change et_replay test to a unit test (#118)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/118\n\nThis DIFF is to change test_et_replay from a binary to a unit test, so when the dependent libraries change, the test will run automatically.\n\nReviewed By: briancoutinho\n\nDifferential Revision: D57590633\n\nfbshipit-source-id: 6724b975936bd36b753f01d609669fdebcdc5f4c","shortMessageHtmlLink":"Change et_replay test to a unit test (#118)"}},{"before":"71b22854c5d4c2dffe955c2c1f81d22ef74f2c58","after":"49da106be5b584ef4739ca60cffb9fd251b7f051","ref":"refs/heads/main","pushedAt":"2024-05-22T17:06:32.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Refactor python_lint.yml to enhance readability and trigger on PRs only (#117)\n\nSummary:\nRefactor python_lint.yml to enhance readability and trigger on PRs only.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/117\n\nTest Plan: The CI pipeline passes.\n\nReviewed By: sanrise, haowangludx\n\nDifferential Revision: D57676335\n\nPulled By: briancoutinho\n\nfbshipit-source-id: 42b8bd36fdcad509eed900c298febcb340c68b1b","shortMessageHtmlLink":"Refactor python_lint.yml to enhance readability and trigger on PRs on…"}},{"before":"5d1cb1301f513d7b23f5fd31c528122e426d2078","after":"71b22854c5d4c2dffe955c2c1f81d22ef74f2c58","ref":"refs/heads/main","pushedAt":"2024-05-22T16:57:38.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Remove dead code from et_replay/tools/et_replay.py (#115)\n\nSummary:\nRemove dead code from et_replay/tools/et_replay.py. I understand that et_replay should support replaying both computation operators and communication operators simultaneously. However, currently, et_replay does not properly support communication replay. As a step for refactoring, briancoutinho commented out communication-related code from et_replay. I understand this was intended to ease the revival of the communication operator replay feature. However, this approach is not helpful because the interface to the communication operator is likely to undergo significant changes. Therefore, I am proposing this PR to remove the dead commented-out code and start from a clean slate.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/115\n\nTest Plan: Not needed. Removed commented-out code only.\n\nReviewed By: kingchc\n\nDifferential Revision: D57527434\n\nPulled By: briancoutinho\n\nfbshipit-source-id: 207e18fc3232f56c2fad4826d7fc674e1715c432","shortMessageHtmlLink":"Remove dead code from et_replay/tools/et_replay.py (#115)"}},{"before":"e99ef20a62b216c0158312acd31241e6df225300","after":"5d1cb1301f513d7b23f5fd31c528122e426d2078","ref":"refs/heads/main","pushedAt":"2024-05-20T18:22:47.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Update .gitignore to ignore `__pycache__` (#116)\n\nSummary:\nUpdate .gitignore to ignore `__pycache__`\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/116\n\nTest Plan: Not needed as this is a gitignore file.\n\nReviewed By: kingchc\n\nDifferential Revision: D57527430\n\nPulled By: briancoutinho\n\nfbshipit-source-id: 62f3962990041316d65baaac1d3fce92afea9250","shortMessageHtmlLink":"Update .gitignore to ignore __pycache__ (#116)"}},{"before":"675754e17cbf752f94ad35fc5b779f77c51262f5","after":"e99ef20a62b216c0158312acd31241e6df225300","ref":"refs/heads/main","pushedAt":"2024-05-17T01:36:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Enable comm_replay in PARAM by Integrating and Refactoring Comm Code (#112)\n\nSummary:\nX-link: https://github.com/facebookresearch/HolisticTraceAnalysis/pull/137\n\n- **Code Migration**: Copied all comm_replay-related code from train/comms/pt to et_replay/lib/comm. The decision to copy rather than create symbolic links was mandatory to avoid dependency issues and maintain a stable and self-contained code environment, ensuring that the et_replay project remains functional even if the source files change.\n- **Code Cleanup**: Removed obsolete files such as dlrm.py and comms.py to streamline the codebase.\n- **Configuration Update**: Modified import statements and updated pyproject.toml to align with the new directory structure, ensuring proper package management.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/112\n\nTest Plan:\n```\n$ pip install .\nProcessing /Users/theo/param/et_replay\n Installing build dependencies ... done\n Getting requirements to build wheel ... done\n Installing backend dependencies ... done\n Preparing metadata (pyproject.toml) ... done\nBuilding wheels for collected packages: et_replay\n Building wheel for et_replay (pyproject.toml) ... done\n Created wheel for et_replay: filename=et_replay-1.0.0-py3-none-any.whl size=61490 sha256=d4e4433c55487d790e6bb1bb892eca268348a148f3365d3587fac90aa38692ee\n Stored in directory: /private/var/folders/z0/c9mq5j4s6n14n0_gs7nlt6mc0000gp/T/pip-ephem-wheel-cache-jxux47rn/wheels/3b/3f/aa/d3fc853f83c22c6f3eeb09763570c2cc8031a1a414cb3c18b6\nSuccessfully built et_replay\nInstalling collected packages: et_replay\n Attempting uninstall: et_replay\n Found existing installation: et_replay 1.0.0\n Uninstalling et_replay-1.0.0:\n Successfully uninstalled et_replay-1.0.0\nSuccessfully installed et_replay-1.0.0\n\n$ comm_replay\n[BLOCKED as expected]\n```\n\nRun on mast\nbuck2 run mode/opt -c hpc_comms.use_ncclx=2.18.3 param_bench/train/comms/pt:launcher -- --launcher mast --cluster=MastProdCluster --dp networkai_mast_job_identity --hw tc_any --nnode 8 --ppn 8 --z=0 --module commsTraceReplay --trace-path manifold://param/tree/shengbao/et/torchx-conda-xlformers_ncclexp_70b_fp8_fsdp_pp_ctran_ag-tgqvxwkz --trace-type et --reuse-tensors\n\nhttps://www.internalfb.com/mlhub/pipelines/runs/mast/torchx-param-commsTraceReplay-64gpus-allreduce-5f66a4?job_attempt=0&version=0&tab=scheduling&env=PRODUCTION\n\nReviewed By: shengbao-zheng\n\nDifferential Revision: D57354772\n\nPulled By: briancoutinho\n\nfbshipit-source-id: f4563f6f4823e8f8b097d68aa35da3461aa4c0a0","shortMessageHtmlLink":"Enable comm_replay in PARAM by Integrating and Refactoring Comm Code (#…"}},{"before":"9b1946f62b67c5b9a0357803a02803ac4527b750","after":"675754e17cbf752f94ad35fc5b779f77c51262f5","ref":"refs/heads/main","pushedAt":"2024-05-13T22:28:14.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add ISSUE_TEMPLATE and PULL_REQUEST_TEMPLATE (#113)\n\nSummary:\nAdd ISSUE_TEMPLATE and PULL_REQUEST_TEMPLATE\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/113\n\nTest Plan: Not needed. These are the same templates used in https://github.com/mlcommons/chakra.\n\nDifferential Revision: D57299654\n\nPulled By: briancoutinho\n\nfbshipit-source-id: 38b456180aadb8b09562fbbc1ef1e38a39ee8eff","shortMessageHtmlLink":"Add ISSUE_TEMPLATE and PULL_REQUEST_TEMPLATE (#113)"}},{"before":"2b4cf3e224c10712443f89fc6315c3dc28e03a83","after":"9b1946f62b67c5b9a0357803a02803ac4527b750","ref":"refs/heads/main","pushedAt":"2024-05-07T23:57:56.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Draft refactor of et replay (#110)\n\nSummary:\nX-link: https://github.com/facebookresearch/HolisticTraceAnalysis/pull/131\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/110\n\nAdd a new tree structure for et_replay to enable better encapsulation of this code. We keep comms and compute unchanged, only moving et_replay files.\n\n```\n[bcoutinho@devgpu038.ftw6 ~/fbsource/fbcode/param_bench/et_replay (d4b11e786)]$ tree .\n.\n├── lib\n│ ├── et_replay_utils.py\n│ ├── execution_trace.py\n│ └── utils.py\n├── README.md\n├── tests\n│ ├── inputs\n│ │ ├── 1.0.3-chakra.0.0.4\n│ │ │ └── resnet_1gpu_et.json.gz\n│ │ ├── 1.1.0-chakra.0.0.4\n│ │ │ └── resnet_2gpu_et.json.gz\n│ │ ├── dlrm_kineto.tar.gz\n│ │ ├── dlrm_pytorch_et.tar.gz\n│ │ ├── __init__.py\n│ │ ├── linear_et.json.gz\n│ │ ├── linear_kineto.json.gz\n│ │ ├── resnet_et.json.gz\n│ │ └── resnet_kineto.json.gz\n│ └── test_execution_trace.py\n└── tools\n ├── et_replay.py\n └── validate_trace.py\n```\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56960365\n\nfbshipit-source-id: d2ef172bc6c4629d78222357e616df9bddaec81e","shortMessageHtmlLink":"Draft refactor of et replay (#110)"}},{"before":"721c137abf5737be32bf2d845559ebabb8c0b961","after":"497533d9009cc79d763fd26ac1809f681db503e0","ref":"refs/heads/export-D56960365","pushedAt":"2024-05-07T02:15:09.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"Draft refactor of et replay (#110)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/110\n\nAdd a new tree structure for et_replay to enable better encapsulation of this code. We keep comms and compute unchanged, only moving et_replay files.\n\n```\n[bcoutinho@devgpu038.ftw6 ~/fbsource/fbcode/param_bench/et_replay (d4b11e786)]$ tree .\n.\n├── lib\n│ ├── et_replay_utils.py\n│ ├── execution_trace.py\n│ └── utils.py\n├── README.md\n├── tests\n│ ├── inputs\n│ │ ├── 1.0.3-chakra.0.0.4\n│ │ │ └── resnet_1gpu_et.json.gz\n│ │ ├── 1.1.0-chakra.0.0.4\n│ │ │ └── resnet_2gpu_et.json.gz\n│ │ ├── dlrm_kineto.tar.gz\n│ │ ├── dlrm_pytorch_et.tar.gz\n│ │ ├── __init__.py\n│ │ ├── linear_et.json.gz\n│ │ ├── linear_kineto.json.gz\n│ │ ├── resnet_et.json.gz\n│ │ └── resnet_kineto.json.gz\n│ └── test_execution_trace.py\n└── tools\n ├── et_replay.py\n └── validate_trace.py\n```\n\nDifferential Revision: D56960365","shortMessageHtmlLink":"Draft refactor of et replay (#110)"}},{"before":"5b0f878dc97d49fd8493dc7e7496d0cd430e93c5","after":"721c137abf5737be32bf2d845559ebabb8c0b961","ref":"refs/heads/export-D56960365","pushedAt":"2024-05-07T00:05:21.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"Draft refactor of et replay (#110)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/110\n\nAdd a new tree structure for et_replay to enable better encapsulation of this code. We keep comms and compute unchanged, only moving et_replay files.\n\n```\n[bcoutinho@devgpu038.ftw6 ~/fbsource/fbcode/param_bench/et_replay (d4b11e786)]$ tree .\n.\n├── lib\n│ ├── et_replay_utils.py\n│ ├── execution_trace.py\n│ └── utils.py\n├── README.md\n├── tests\n│ ├── inputs\n│ │ ├── 1.0.3-chakra.0.0.4\n│ │ │ └── resnet_1gpu_et.json.gz\n│ │ ├── 1.1.0-chakra.0.0.4\n│ │ │ └── resnet_2gpu_et.json.gz\n│ │ ├── dlrm_kineto.tar.gz\n│ │ ├── dlrm_pytorch_et.tar.gz\n│ │ ├── __init__.py\n│ │ ├── linear_et.json.gz\n│ │ ├── linear_kineto.json.gz\n│ │ ├── resnet_et.json.gz\n│ │ └── resnet_kineto.json.gz\n│ └── test_execution_trace.py\n└── tools\n ├── et_replay.py\n └── validate_trace.py\n```\n\nDifferential Revision: D56960365","shortMessageHtmlLink":"Draft refactor of et replay (#110)"}},{"before":null,"after":"5b0f878dc97d49fd8493dc7e7496d0cd430e93c5","ref":"refs/heads/export-D56960365","pushedAt":"2024-05-03T23:49:29.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"Draft refactor of et replay\n\nDifferential Revision: D56960365","shortMessageHtmlLink":"Draft refactor of et replay"}},{"before":"c83ce8429110a86549c40fec5a01acbd9fbd54a4","after":"2b4cf3e224c10712443f89fc6315c3dc28e03a83","ref":"refs/heads/main","pushedAt":"2024-05-03T17:15:03.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"add example trace with comm args metadata (#105)\n\nSummary:\nAs above\n1. Adds a test trace with 2 GPU resnet job and 12 communication collectives.\n2. Add a commArgs optional argument to ET node, this will be populated soon.\n3. Minor updates to parser and add a new unittest that tries to validate this test\n\n## Testing\n\n```\n(pytorch) [bcoutinho@devgpu038.ftw6 /data/users/bcoutinho]$ export PYTHONPATH=/data/users/bcoutinho\n(pytorch) [bcoutinho@devgpu038.ftw6 /data/users/bcoutinho]$ python3 param_bench/train/compute/python/test/test_execution_trace.py\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n.None\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\n.\n----------------------------------------------------------------------\nRan 2 tests in 1.322s\n\nOK\n```\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/105\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56739869\n\nPulled By: briancoutinho\n\nfbshipit-source-id: b53a3f36eb57e637e77b988cc071136b71e96caa","shortMessageHtmlLink":"add example trace with comm args metadata (#105)"}},{"before":"4792cb71ec14643a7702bcf0066cae778e8c62ae","after":"4f61a7ff1f7aa657111ee80f9a8f9604d5402654","ref":"refs/heads/add_comms_arg_trace","pushedAt":"2024-05-02T23:25:27.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"add example trace with comm args metadata (#105)\n\nSummary:\nAs above\n1. Adds a test trace with 2 GPU resnet job and 12 communication collectives.\n2. Add a commArgs optional argument to ET node, this will be populated soon.\n3. Minor updates to parser and add a new unittest that tries to validate this test\n\n## Testing\n\n```\n(pytorch) [bcoutinho@devgpu038.ftw6 /data/users/bcoutinho]$ export PYTHONPATH=/data/users/bcoutinho\n(pytorch) [bcoutinho@devgpu038.ftw6 /data/users/bcoutinho]$ python3 param_bench/train/compute/python/test/test_execution_trace.py\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['0', 'default_pg'], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n record_param_comms, process group args = ('Tuple[String,String]', ['', ''], [[], []])\n.None\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\nNone\n.\n----------------------------------------------------------------------\nRan 2 tests in 1.322s\n\nOK\n```\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/105\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56739869\n\nPulled By: briancoutinho","shortMessageHtmlLink":"add example trace with comm args metadata (#105)"}},{"before":"26771701638ef24d7e350edbfc3e8954060513b5","after":"4792cb71ec14643a7702bcf0066cae778e8c62ae","ref":"refs/heads/add_comms_arg_trace","pushedAt":"2024-05-02T16:38:21.000Z","pushType":"force_push","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"follow up on comments, merge with triton kernel parse","shortMessageHtmlLink":"follow up on comments, merge with triton kernel parse"}},{"before":"425e08abebeac26e891ab553f42ab79053d57e5f","after":"c83ce8429110a86549c40fec5a01acbd9fbd54a4","ref":"refs/heads/main","pushedAt":"2024-05-01T17:41:31.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"add an init-only mode to benchmark initialization alone\n\nSummary:\nas $title\nAdd an init only mode to param benchmark to measure nccl initialization time along. This requires NCCL communicator being initialized in eager mode(not lazy mode that the initialization is trigger by the first collective)\n\nReviewed By: cenzhaometa\n\nDifferential Revision: D56767297\n\nfbshipit-source-id: c4d70540d3f9dc007e2b1a51b08a477da2ca8938","shortMessageHtmlLink":"add an init-only mode to benchmark initialization alone"}},{"before":"c42dd9e71222b4483cb35d5084fdcdf026c4635a","after":"425e08abebeac26e891ab553f42ab79053d57e5f","ref":"refs/heads/main","pushedAt":"2024-04-30T23:48:33.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Support triton kernel replay in PARAM\n\nSummary: This DIFF is to import the captured triton kernels into et_replay, load the kernel file, compile it into cuda binary, and replay it.\n\nReviewed By: briancoutinho\n\nDifferential Revision: D56320143\n\nfbshipit-source-id: f965c69b9cbc922482ae2eb7795d50621e4d0c5e","shortMessageHtmlLink":"Support triton kernel replay in PARAM"}},{"before":"364d73f106b88311c70f373a472ec271f1fab5e8","after":"26771701638ef24d7e350edbfc3e8954060513b5","ref":"refs/heads/add_comms_arg_trace","pushedAt":"2024-04-29T23:56:55.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"fix black","shortMessageHtmlLink":"fix black"}},{"before":"c42dd9e71222b4483cb35d5084fdcdf026c4635a","after":"364d73f106b88311c70f373a472ec271f1fab5e8","ref":"refs/heads/add_comms_arg_trace","pushedAt":"2024-04-29T23:52:11.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"add example trace with comm args metadata","shortMessageHtmlLink":"add example trace with comm args metadata"}},{"before":null,"after":"c42dd9e71222b4483cb35d5084fdcdf026c4635a","ref":"refs/heads/add_comms_arg_trace","pushedAt":"2024-04-29T23:51:16.000Z","pushType":"branch_creation","commitsCount":0,"pusher":{"login":"briancoutinho","name":"Brian Coutinho","path":"/briancoutinho","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6922212?s=80&v=4"},"commit":{"message":"Move trace_link.py to mlcommons/chakra (#100)\n\nSummary:\nMove trace_link.py to [mlcommons/chakra](https://github.com/mlcommons/chakra) based on the discussion between Meta and NVIDIA.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/100\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56406386\n\nPulled By: briancoutinho\n\nfbshipit-source-id: e01130ac4f409c86a51f730d28bd0f167fb13900","shortMessageHtmlLink":"Move trace_link.py to mlcommons/chakra (#100)"}},{"before":"6c0067fcdbfc04c726bc9c0f5e2797ced9e1af81","after":"c42dd9e71222b4483cb35d5084fdcdf026c4635a","ref":"refs/heads/main","pushedAt":"2024-04-25T17:25:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Move trace_link.py to mlcommons/chakra (#100)\n\nSummary:\nMove trace_link.py to [mlcommons/chakra](https://github.com/mlcommons/chakra) based on the discussion between Meta and NVIDIA.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/100\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56406386\n\nPulled By: briancoutinho\n\nfbshipit-source-id: e01130ac4f409c86a51f730d28bd0f167fb13900","shortMessageHtmlLink":"Move trace_link.py to mlcommons/chakra (#100)"}},{"before":"05abe4efacd04630878f90acc1b90fc2143c1826","after":"6c0067fcdbfc04c726bc9c0f5e2797ced9e1af81","ref":"refs/heads/main","pushedAt":"2024-04-23T04:12:43.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add Tensor(signed char)\n\nSummary: The most popular trace (according to this dashboard: https://fburl.com/unidash/iv4evgh6) is currently unsupported by param trace because we get a key error when encountering the `Tensor(signed char)` type. This diff resolves this issue.\n\nReviewed By: bahlneeraj\n\nDifferential Revision: D56451067\n\nfbshipit-source-id: a9ef08d7b73b665d7f6e2f4414aec961b9548589","shortMessageHtmlLink":"Add Tensor(signed char)"}},{"before":"7868e09eb10a4b26e466a04ef93d5f02c2ce6f1c","after":"05abe4efacd04630878f90acc1b90fc2143c1826","ref":"refs/heads/main","pushedAt":"2024-04-22T18:36:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Add a trace validator helper and simple unit test for\n\nSummary:\n## Summary\n* Adds a trace validation tool that can check PyTorch host execution traces. This is helpful for schema changes, and integration testing\n* Add a unit test to check if execution_trace.py works correctly on preset traces.\n* Minor: helpers to read semantic version of pytorch and chakra!\n\nReviewed By: shengbao-zheng\n\nDifferential Revision: D56325885\n\nfbshipit-source-id: 6092ca6592f1d3d29c84e6f6f69fbdc56cea7310","shortMessageHtmlLink":"Add a trace validator helper and simple unit test for"}},{"before":"c8e3f2f1da22b5f5fa29867b4c12825798873296","after":"7868e09eb10a4b26e466a04ef93d5f02c2ce6f1c","ref":"refs/heads/main","pushedAt":"2024-04-18T19:30:04.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"update commsReplay for 1.0.3-chakra.0.0.4 schema\n\nSummary:\n1.0.3-chakra.0.0.4 schema (PR #124035) logs as the new pg_name instead of pg uid in profiler.\n- group_name remains as the unique identifier, e.g. “0”, \"1\"\n- group_desc will be the user specified name, e.g. \"fsdp\".\n\nThis diff updates the commsReplay to support the new schema\n\nReviewed By: shengfukevin\n\nDifferential Revision: D56288398\n\nfbshipit-source-id: 3663e45507e098cedb407609eecc2e0cec6890a1","shortMessageHtmlLink":"update commsReplay for 1.0.3-chakra.0.0.4 schema"}},{"before":"0a073429d2139b5947212863b32b222a09239cd3","after":"c8e3f2f1da22b5f5fa29867b4c12825798873296","ref":"refs/heads/main","pushedAt":"2024-04-16T18:29:42.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fixed comm parser issue\n\nSummary:\nThis DIFF is to fix the following two comm parser issue:\n1. process_group:init support both u_id and backend_id\n2. record_param_comms has different number of input.\n\nReviewed By: shengbao-zheng\n\nDifferential Revision: D56091619\n\nfbshipit-source-id: 58e12a515b17150ee68557fc6b4ad729e1614d49","shortMessageHtmlLink":"Fixed comm parser issue"}},{"before":"923ebe763351b8d7698e1ad45dca29e887c3284a","after":"0a073429d2139b5947212863b32b222a09239cd3","ref":"refs/heads/main","pushedAt":"2024-03-23T00:17:32.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Set up black for GitHub CI pipeline (#99)\n\nSummary:\nThis PR introduces a linter, black, for linting purposes.\n\nPull Request resolved: https://github.com/facebookresearch/param/pull/99\n\nTest Plan: Please refer to the actions tab of the forked repository ([link](https://github.com/TaekyungHeo/param/actions)). At present, PARAM does not pass the linter checks. Additional refactoring is necessary.\n\nReviewed By: briancoutinho\n\nDifferential Revision: D55261475\n\nPulled By: shengfukevin\n\nfbshipit-source-id: 2c92790e60e1e3fb1ed17fb2f4c6954a755bec18","shortMessageHtmlLink":"Set up black for GitHub CI pipeline (#99)"}},{"before":"d5f365de9f886205b522da2a2e9e259d01b254fc","after":"923ebe763351b8d7698e1ad45dca29e887c3284a","ref":"refs/heads/main","pushedAt":"2024-03-14T22:33:05.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fixed missing device for torch.cuda.synchronize (#97)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/97\n\nUser found when running benchmark with --device option, the timing for devices other than cuda:0 is wrong. The bug is in timer.py, the call to torch.cuda.synchronize does not pass in torch.device. This DIFF is to fix this issue.\n\nDifferential Revision: D54907227\n\nfbshipit-source-id: 78fa7931c39913451a7c0e94f165943dce5055d1","shortMessageHtmlLink":"Fixed missing device for torch.cuda.synchronize (#97)"}},{"before":"529bf781966e5f8cdd5278313810c775bca17aa6","after":"d5f365de9f886205b522da2a2e9e259d01b254fc","ref":"refs/heads/main","pushedAt":"2024-03-11T23:09:13.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Refactor TBE generate requests return type (#95)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/95\n\nX-link: https://github.com/pytorch/FBGEMM/pull/2411\n\nThis diff refactors the return type of `generate_requests` (TBE random\ninput generator). Prior to this diff, `generate_requests` returns\na list of indices, offsets and per sample weights (optional) tuple\n(`List[Tuple(Tensor, Tensor, Optional[Tensor])]`). If we add another\nreturn value to the tuple, we need to update every request tuple\nunpacking site and update typing to satisfy Pyre requirements. Thus,\nthis diff adds `TBERequest` which is a wrapper of return values of\n`generate_requests`. It allows the user to access each return value\nindividually or as an arbitrary length tuple.\n\nReviewed By: q10\n\nDifferential Revision: D54710610\n\nfbshipit-source-id: fec2e926ff186ea233d4dea109208a68339e3388","shortMessageHtmlLink":"Refactor TBE generate requests return type (#95)"}},{"before":"3c20655099f025dff18e720b4e5255d2a30dcba0","after":"529bf781966e5f8cdd5278313810c775bca17aa6","ref":"refs/heads/main","pushedAt":"2024-03-11T20:14:35.000Z","pushType":"push","commitsCount":1,"pusher":{"login":"facebook-github-bot","name":"Facebook Community Bot","path":"/facebook-github-bot","primaryAvatarUrl":"https://avatars.githubusercontent.com/u/6422482?s=80&v=4"},"commit":{"message":"Fixed missing device for torch.cuda APIs and added benchmark configs for RTP (#96)\n\nSummary:\nPull Request resolved: https://github.com/facebookresearch/param/pull/96\n\nThis DIFF include three parts:\n\n1. Added the missing device or torch.cuda APIs\n\n2. Provided two configs for RTP benchmark, llama2.json includes a list of matmuls extracted from LLama2, resnet.json includes resnet50.\n\n3. Added L2 cache size for H100 and set default of --cuda-l2-cache to on.\n\nDifferential Revision: D54511670\n\nfbshipit-source-id: 1194f0a0f44045e45e2285f5812d051284ec7026","shortMessageHtmlLink":"Fixed missing device for torch.cuda APIs and added benchmark configs …"}}],"hasNextPage":true,"hasPreviousPage":false,"activityType":"all","actor":null,"timePeriod":"all","sort":"DESC","perPage":30,"cursor":"djE6ks8AAAAEXJrAwwA","startCursor":null,"endCursor":null}},"title":"Activity · facebookresearch/param"}