Add Mooncake extension for propagate copy_xj (+) fast path and enable tests#677

Open
Parvm1102 wants to merge 1 commit into JuliaGraphs:master from Parvm1102:perf/graphconv-mooncake-friendly
@Parvm1102

GNN AD Benchmark Results

The rules matched for the following layers. The benchmarks and the resulting improvements are listed below; all layers pass the gradient correctness check:

GCNConv
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     0.541 ms /   1136 alloc
  Mooncake + RULE     MC     2.443 ms /    401 alloc   (MC+rule / Zyg =  4.52x)
  Mooncake - rule     MC     9.628 ms /    582 alloc   (no-rule / Zyg = 17.80x)

  >>> RULE SPEEDUP (no-rule / +rule) =  3.94x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     1.752 ms /   1136 alloc
  Mooncake + RULE     MC    10.260 ms /    401 alloc   (MC+rule / Zyg =  5.86x)
  Mooncake - rule     MC    37.645 ms /    582 alloc   (no-rule / Zyg = 21.49x)

  >>> RULE SPEEDUP (no-rule / +rule) =  3.67x
GraphConv
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     0.486 ms /    470 alloc
  Mooncake + RULE     MC     1.482 ms /    198 alloc   (MC+rule / Zyg =  3.05x)
  Mooncake - rule     MC     7.895 ms /    377 alloc   (no-rule / Zyg = 16.26x)

  >>> RULE SPEEDUP (no-rule / +rule) =  5.33x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     1.806 ms /    470 alloc
  Mooncake + RULE     MC     6.200 ms /    198 alloc   (MC+rule / Zyg =  3.43x)
  Mooncake - rule     MC    30.904 ms /    377 alloc   (no-rule / Zyg = 17.11x)

  >>> RULE SPEEDUP (no-rule / +rule) =  4.98x
SAGEConv (aggr=+)
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     0.579 ms /    462 alloc
  Mooncake + RULE     MC     1.868 ms /    190 alloc   (MC+rule / Zyg =  3.23x)
  Mooncake - rule     MC     8.168 ms /    367 alloc   (no-rule / Zyg = 14.11x)

  >>> RULE SPEEDUP (no-rule / +rule) =  4.37x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     1.426 ms /    462 alloc
  Mooncake + RULE     MC     7.171 ms /    190 alloc   (MC+rule / Zyg =  5.03x)
  Mooncake - rule     MC    32.122 ms /    367 alloc   (no-rule / Zyg = 22.52x)

  >>> RULE SPEEDUP (no-rule / +rule) =  4.48x
GINConv
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     0.425 ms /    497 alloc
  Mooncake + RULE     MC     1.023 ms /    200 alloc   (MC+rule / Zyg =  2.41x)
  Mooncake - rule     MC     7.461 ms /    379 alloc   (no-rule / Zyg = 17.56x)

  >>> RULE SPEEDUP (no-rule / +rule) =  7.29x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     1.174 ms /    497 alloc
  Mooncake + RULE     MC     3.858 ms /    200 alloc   (MC+rule / Zyg =  3.29x)
  Mooncake - rule     MC    29.031 ms /    379 alloc   (no-rule / Zyg = 24.72x)

  >>> RULE SPEEDUP (no-rule / +rule) =  7.53x
SGConv (k=2)
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     2.704 ms /   1813 alloc
  Mooncake + RULE     MC     4.545 ms /    491 alloc   (MC+rule / Zyg =  1.68x)
  Mooncake - rule     MC    17.955 ms /    854 alloc   (no-rule / Zyg =  6.64x)

  >>> RULE SPEEDUP (no-rule / +rule) =  3.95x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     5.115 ms /   1813 alloc
  Mooncake + RULE     MC    15.896 ms /    491 alloc   (MC+rule / Zyg =  3.11x)
  Mooncake - rule     MC    71.492 ms /    854 alloc   (no-rule / Zyg = 13.98x)

  >>> RULE SPEEDUP (no-rule / +rule) =  4.50x
TAGConv (k=2)
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     2.625 ms /   1858 alloc
  Mooncake + RULE     MC     6.983 ms /    524 alloc   (MC+rule / Zyg =  2.66x)
  Mooncake - rule     MC    18.938 ms /    887 alloc   (no-rule / Zyg =  7.22x)

  >>> RULE SPEEDUP (no-rule / +rule) =  2.71x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg     8.645 ms /   1858 alloc
  Mooncake + RULE     MC    21.477 ms /    524 alloc   (MC+rule / Zyg =  2.48x)
  Mooncake - rule     MC    85.246 ms /    887 alloc   (no-rule / Zyg =  9.86x)

  >>> RULE SPEEDUP (no-rule / +rule) =  3.97x
GatedGraphConv (L=2)
[n=512]
  forward: ok   grad-check PASS

  Zygote             Zyg     7.233 ms /   3182 alloc
  Mooncake + RULE     MC    15.051 ms /   2298 alloc   (MC+rule / Zyg =  2.08x)
  Mooncake - rule     MC    31.563 ms /   2661 alloc   (no-rule / Zyg =  4.36x)

  >>> RULE SPEEDUP (no-rule / +rule) =  2.10x

[n=2048]
  forward: ok   grad-check PASS

  Zygote             Zyg    13.064 ms /   3182 alloc
  Mooncake + RULE     MC    58.372 ms /   2298 alloc   (MC+rule / Zyg =  4.47x)
  Mooncake - rule     MC   108.064 ms /   2661 alloc   (no-rule / Zyg =  8.27x)

  >>> RULE SPEEDUP (no-rule / +rule) =  1.85x
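
For anyone cross-checking the numbers, the ratios in the listing above appear to be plain quotients of the reported times. Here is a minimal Python sketch, assuming that definition, that reproduces the GCNConv n=512 row. The function name `ratios` and its dictionary keys are illustrative and not part of the benchmark script.

```python
def ratios(zygote_ms, rule_ms, no_rule_ms):
    """Recompute the three ratios printed for each benchmark row.

    Assumption: each ratio is a simple quotient of the reported times.
    """
    return {
        "rule_vs_zygote": rule_ms / zygote_ms,        # MC+rule / Zyg
        "no_rule_vs_zygote": no_rule_ms / zygote_ms,  # no-rule / Zyg
        "rule_speedup": no_rule_ms / rule_ms,         # no-rule / +rule
    }


# Times in milliseconds, copied from the GCNConv [n=512] row.
gcn_512 = ratios(zygote_ms=0.541, rule_ms=2.443, no_rule_ms=9.628)

for name, value in gcn_512.items():
    print(f"{name}: {value:.2f}x")
# rule_vs_zygote: 4.52x
# no_rule_vs_zygote: 17.80x
# rule_speedup: 3.94x
```

Since the printed times are rounded to three decimals, ratios recomputed this way for other rows can differ slightly from the printed ones in the last digits.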

The PR also enables Mooncake testing, following the approach currently used here:

https://github.com/JuliaGraphs/GraphNeuralNetworks.jl/blob/f09fcc4af170c1d87dc1d51648c3a015357fea93/GraphNeuralNetworks/test/test_module.jl

I will fix the format as mentioned in #676 in a future PR.

Signed-off-by: Parvm1102 <parvmittal31757@gmail.com>
@Parvm1102 changed the title from "Add Mooncake extension for propagate copy_xj fast path and enabled tests" to "Add Mooncake extension for propagate copy_xj (+) fast path and enable tests" on Jun 27, 2026
@Parvm1102

@CarloLucibello Could you please review this?
