Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] Speed up ONNX graphsurgeon toposort #2763

Open
wants to merge 1 commit into
base: release/8.6
Choose a base branch
from

Conversation

fynnsu
Copy link

@fynnsu fynnsu commented Mar 14, 2023

The current implementation of toposort doesn't memoize the heirarchy_levels as it computes them. This can result in prohibitively slow toposorts, even on small graphs.

As an example, the below code creates a graph that consists of a sequence of fully connected pairs of nodes. The running time to toposort the nodes increases exponentially with respect to the number of nodes.

import onnx_graphsurgeon as gs
import numpy as np
import time

def create_graph(levels):
    x = inputs = [gs.Variable(name=f'in_{i}', dtype=np.float32, shape=(1)) for i in range(2)]
    
    all_nodes = []
    for l in range(levels):
        outs = [gs.Variable(name=f'out_{l}_{i}', dtype=np.float32, shape=(1)) for i in range(2)]
        all_nodes.extend([gs.Node(op="", inputs=x, outputs=[outs[i]]) for i in range(2)])
        x = outs
    graph = gs.Graph(nodes=all_nodes, inputs=inputs, outputs=x)
    
    return graph


for n_levels in range(10, 23):
    start_time = time.time()
    graph = create_graph(n_levels)
    graph.nodes = list(reversed(graph.nodes))
    graph.toposort()
    print(f'Graph with {len(graph.nodes)} nodes and {len(graph.tensors())} tensors took {time.time() - start_time:.2f} seconds to toposort.')

Output:

Graph with 20 nodes and 22 tensors took 0.01 seconds to toposort.
Graph with 22 nodes and 24 tensors took 0.01 seconds to toposort.
Graph with 24 nodes and 26 tensors took 0.02 seconds to toposort.
Graph with 26 nodes and 28 tensors took 0.04 seconds to toposort.
Graph with 28 nodes and 30 tensors took 0.08 seconds to toposort.
Graph with 30 nodes and 32 tensors took 0.16 seconds to toposort.
Graph with 32 nodes and 34 tensors took 0.32 seconds to toposort.
Graph with 34 nodes and 36 tensors took 0.63 seconds to toposort.
Graph with 36 nodes and 38 tensors took 1.26 seconds to toposort.
Graph with 38 nodes and 40 tensors took 2.47 seconds to toposort.
Graph with 40 nodes and 42 tensors took 4.73 seconds to toposort.
Graph with 42 nodes and 44 tensors took 9.26 seconds to toposort.
Graph with 44 nodes and 46 tensors took 18.52 seconds to toposort.

Signed-off-by: fynnsu <fynnsu@outlook.com>
@rajeevsrao rajeevsrao changed the base branch from main to release/8.6 March 16, 2023 04:56
@rajeevsrao
Copy link
Collaborator

@pranavm-nvidia has this been merged to the internal repo?

@pranavm-nvidia
Copy link
Collaborator

@pranavm-nvidia has this been merged to the internal repo?

Integrating it now

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants