Suggestion to improve efficiency in the unicorn notebook  #2

@abouelkhair5

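In the unicorn notebook, specifically in the prepare_graph function, every iteration of the edge loop calls list(nodes.keys()) and .index() twice. These are expensive calls: each one rebuilds the full key list and then scans it linearly, so the loop's cost grows with the number of edges times the number of nodes. As a result, the prepare_graph call takes over 2 hours on a very strong machine.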

edge_index = [[], []]
for src, dst in edges:
    src_index = list(nodes.keys()).index(src)
    dst_index = list(nodes.keys()).index(dst)
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)

An alternative would be to precompute all the indices once, store them in a hash map, and then build the graph in a few seconds.
For example:

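from tqdm import tqdm

# Build the node -> position lookup once; each later lookup is O(1) on average.
node_index_map = {node: i for i, node in enumerate(nodes.keys())}
edge_index = [[], []]
for src, dst in tqdm(edges):
    src_index = node_index_map[src]
    dst_index = node_index_map[dst]
    edge_index[0].append(src_index)
    edge_index[1].append(dst_index)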
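
To check that the change does not alter the output, here is a minimal sketch that runs both versions on toy data and compares the results. The function names (build_edge_index_slow, build_edge_index_fast) and the sample nodes and edges are made up for illustration and are not from the notebook; I am only assuming that nodes is a dict keyed by node id and edges is an iterable of (src, dst) pairs, as in the snippets above.

```python
def build_edge_index_slow(nodes, edges):
    """Current approach: rebuild the key list and scan it for every endpoint."""
    edge_index = [[], []]
    for src, dst in edges:
        edge_index[0].append(list(nodes.keys()).index(src))
        edge_index[1].append(list(nodes.keys()).index(dst))
    return edge_index


def build_edge_index_fast(nodes, edges):
    """Suggested approach: build the lookup table once, then use dict lookups."""
    node_index_map = {node: i for i, node in enumerate(nodes)}
    edge_index = [[], []]
    for src, dst in edges:
        edge_index[0].append(node_index_map[src])
        edge_index[1].append(node_index_map[dst])
    return edge_index


# Hypothetical toy data, only for illustration.
nodes = {"a": {}, "b": {}, "c": {}, "d": {}}
edges = [("a", "b"), ("b", "c"), ("d", "a"), ("c", "c")]

slow = build_edge_index_slow(nodes, edges)
fast = build_edge_index_fast(nodes, edges)
assert slow == fast  # both versions yield the same edge index
print(fast)  # [[0, 1, 3, 2], [1, 2, 0, 2]]
```

Both versions rely on the dict's insertion order, so the indices match as long as nodes is not modified between building the map and using it.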
