Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce redundant walks when resolving module depends_on #35157

Merged
merged 2 commits into from
May 15, 2024

Conversation

jbardin
Copy link
Member

@jbardin jbardin commented May 14, 2024

The use of depends_on in modules can cause large numbers of nodes to become connected to all of the same dependencies. When using Ancestors to walk the graph and find all of these dependencies, there can be a lot of redundant ancestors shared between each of the original nodes, causing an exponential increase in processing time and memory use.

Instead of traversing the graph from each dependency individually, which can frequently end up duplicating results, we can start a walk from all nodes at once, reducing the redundant graph traversals and automatically removing duplicates.

Fixes #35154

Target Release

v1.8.4

A walk can start from multiple roots, so making the Ancestors and
Descendents methods variadic is the easiest way to add access to that
via the API with no changes in existing callers. We can use this from
terraform to avoid repeatedly walking overlapping sets of nodes.
The existing implementation of parentModuleDependsOn would walk each
reference separately, but if there were many dependencies which were all
interconnected, that would end up walking over the same nodes multiple
times, and storing many duplicate results.

Since a graph walk can start at multiple nodes, we use the extended
version of Ancestors to start at all dependencies simultaneously,
reducing the time spend traversing the graph as well as automatically
removing duplicates.
@jbardin jbardin added the 1.8-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged label May 14, 2024
@jbardin jbardin requested a review from a team May 14, 2024 19:05
@crw crw added the bug label May 14, 2024
@jbardin jbardin merged commit 3845aa1 into main May 15, 2024
6 checks passed
@jbardin jbardin deleted the jbardin/depends-on-references branch May 15, 2024 15:11
Copy link

Reminder for the merging maintainer: if this is a user-visible change, please update the changelog on the appropriate release branch.

jbardin added a commit that referenced this pull request May 15, 2024
@LaurentLesle
Copy link

Amazing! now Terraform faster than ever on large graph!. Thanks

@remi-f-artelia
Copy link

I had this issue without understand really why

If I did a plan or an apply locally, the plan|apply took ~1m30 on a large graph (>2'000 resources)
But when running on a default azure devops runner ubuntu (7GB RAM if I'm right), it took ~20 min (even if previously it took the same time locally than into the pipeline).
I was thinking about a memory issue (I have 32 GB locally) but I couldn't see a memory spike locally when I ran a plan | apply so...

And then I removed by coincidence the more depend_on block I could find in the codebase.
And I saw a few days later when this housekeeping was done that everything gone back to normal time.

So I was suspecting an issue on Azure Devops runner because everything came back to normal.
But now I understand that was because of the multiple depends_on put everywhere in the code base.

Anyway, good news because now my plan run in ~20 sec for ~2'500 resources...
Thank you for this

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
1.8-backport If you add this label to a PR before merging, backport-assistant will open a new PR once merged bug
Projects
None yet
Development

Successfully merging this pull request may close these issues.

ReferenceTransformer hangs in plan when module has depends_on
5 participants