Fix #10632: RandomNodeSplit fails on HeteroData by JiwaniZakir · Pull Request #10654 · pyg-team/pytorch_geometric

JiwaniZakir · 2026-04-02T01:48:37Z

Before

RandomNodeSplit.__call__ in torch_geometric/transforms/random_node_split.py skipped any node store that lacked self.key (default: 'y'), regardless of the split mode. On HeteroData, this caused split='train_rest' to silently skip every node type that had no y attribute, producing a transformed object with no masks at all. Accessing .train_mask on the result then raised AttributeError: 'HeteroData' has no attribute 'train_mask'.

After

The skip condition in random_node_split.py (line 77) now gates on self.split != 'train_rest' in addition to the key check:

if (self.split != 'train_rest' and self.key is not None
        and not hasattr(store, self.key)):
    continue

train_rest distributes all remaining nodes to the training set without needing class labels, so it correctly processes every node store in a HeteroData object — including those without a y attribute (e.g., author nodes in MovieLens1M or OGB_MAG). test_rest and random still require the key to be present because they rely on per-class counts.

Changes

torch_geometric/transforms/random_node_split.py: Added self.split != 'train_rest' guard to the early-continue condition so unlabeled node stores are not skipped when the split mode doesn't require y.
test/transforms/test_random_node_split.py: Rewrote test_random_node_split_on_hetero_data to:
- Increase author nodes from 300 → 1000 so num_val=100, num_test=200 fits within the store.
- Explicitly test train_rest on HeteroData with both a labeled (paper) and unlabeled (author) node type, asserting exact mask cardinalities.
- Add a second sub-test for test_rest confirming that author (no y) is still skipped while paper (has y) receives masks.

Testing

The updated test_random_node_split_on_hetero_data asserts:

data['author']: 4 attrs  → x, train_mask, val_mask, test_mask
data['paper']:  5 attrs  → x, y, train_mask, val_mask, test_mask

paper.train_mask.sum() == 1700   # 2000 - 100 - 200
paper.val_mask.sum()   == 100
paper.test_mask.sum()  == 200

author.train_mask.sum() == 700   # 1000 - 100 - 200
author.val_mask.sum()   == 100
author.test_mask.sum()  == 200

For test_rest with num_train_per_class=10, num_val=100, author (no y) is skipped (len == 1) and paper receives all three masks (len == 5).

This PR was created with AI assistance (Claude). The changes were reviewed by quality gates and a critic model before submission.

`split='train_rest'` does not use the `key` attribute at all, so the guard that skips node stores lacking `key` (e.g. `y`) was overly restrictive. Node types without a label tensor were silently skipped, leaving HeteroData graphs (such as MovieLens1M) with no masks set. Scope the guard to only `test_rest` and `random` splits, which actually require class labels, and update the HeteroData test accordingly. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

for more information, see https://pre-commit.ci

codecov · 2026-04-02T01:57:55Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 84.18%. Comparing base (c211214) to head (623ce43).
⚠️ Report is 191 commits behind head on master.

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #10654      +/-   ##
==========================================
- Coverage   86.11%   84.18%   -1.93%     
==========================================
  Files         496      510      +14     
  Lines       33655    36019    +2364     
==========================================
+ Hits        28981    30323    +1342     
- Misses       4674     5696    +1022

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

JiwaniZakir requested review from akihironitta, rusty1s and wsad1 as code owners April 2, 2026 01:48

[pre-commit.ci] auto fixes from pre-commit.com hooks

623ce43

for more information, see https://pre-commit.ci

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #10632: RandomNodeSplit fails on HeteroData#10654

Fix #10632: RandomNodeSplit fails on HeteroData#10654
JiwaniZakir wants to merge 2 commits intopyg-team:masterfrom
JiwaniZakir:fix/10632-randomnodesplit-fails-on-heterodata

JiwaniZakir commented Apr 2, 2026

Uh oh!

codecov bot commented Apr 2, 2026 •

edited

Loading

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

JiwaniZakir commented Apr 2, 2026

Before

After

Changes

Testing

Uh oh!

codecov bot commented Apr 2, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

codecov bot commented Apr 2, 2026 •

edited

Loading