Add Storage TSGs: physical disk add (HowTo) + CanPool=False (Troubleshoot) #284

Open
AlBurns-MSFT wants to merge 4 commits into Azure:main from AlBurns-MSFT:tsg/storage-add-physical-disks

Collaborator

AlBurns-MSFT commented May 11, 2026

Re-opens PR #281, which was inadvertently closed when the fork was deleted. Same two commits, recovered from refs/pull/281/head — no content change.

Original PR added two Storage TSGs from a customer disk-add engagement:

  • HowTo-Storage-AddPhysicalDisksToS2DPool.md
  • Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md

Copilot review threads from PR #281 (addressed in commit 267151c) do NOT migrate to the new PR. Reviewers can reference the original PR for context.

AlBurns-MSFT and others added 2 commits May 6, 2026 16:31
…hoot)

Adds two new Storage TSGs derived from a customer disk-add engagement:

- HowTo-Storage-AddPhysicalDisksToS2DPool.md
  End-to-end safe procedure for online capacity expansion: pre-checks,
  symmetric insertion, automatic vs manual pool claim, monitoring storage
  jobs, capacity confirmation, and final validation.

- Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md
  Resolution paths for every common CannotPoolReason value (In a Pool,
  Verification in progress / failed, Hardware/Firmware not compliant,
  Offline, Stale metadata), plus data-collection checklist for support.

Both TSGs follow the HowTo-Template and Troubleshoot-Template and reference
the existing Troubleshooting-Storage-With-Support-Diagnostics-Tool TSG.

Internal tracking: msazure/One #37486687.
Drafts were reviewed for public safety (no telemetry, KQL, customer names,
or ARM URIs).

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
…arify version applicability

- Both files: replace manual Add-PhysicalDisk snippet with a defensive
  variant that validates exactly one non-primordial pool and requires
  the operator to enumerate intended new disks by serial number, so
  Add-PhysicalDisk cannot accidentally claim unintended CanPool=True
  disks.

- Troubleshoot file: clarify Affected Versions from 'All versions' to
  'All Azure Local releases (Storage Spaces Direct)' to convey scope
  rather than appearing version-agnostic.

Addresses Copilot review threads on PR Azure#281.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copilot AI review requested due to automatic review settings May 11, 2026 18:23
Contributor

Copilot AI left a comment

Pull request overview

Adds two new Storage troubleshooting guides (TSGs) to the Azure Local Supportability repo: one “HowTo” for safely adding physical disks to an existing S2D pool, and one “Troubleshoot” guide for resolving CanPool=False scenarios using CannotPoolReason. Updates the Storage component index to link to the new guides.

Changes:

  • Added a HowTo guide for end-to-end, safety-gated disk add + post-add monitoring/verification in S2D.
  • Added a Troubleshoot guide mapping common CannotPoolReason values to targeted remediation steps (including guarded destructive actions).
  • Updated TSG/Storage/README.md to include links to both new documents.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| TSG/Storage/HowTo-Storage-AddPhysicalDisksToS2DPool.md | New HowTo TSG covering safe disk-add sequence, monitoring, and verification for S2D pools. |
| TSG/Storage/Troubleshoot-Storage-PhysicalDiskCanPoolFalse.md | New Troubleshoot TSG providing reason-by-reason remediation for CanPool=False disks. |
| TSG/Storage/README.md | Adds index entries linking to the two new Storage TSGs. |

Comment on lines +230 to +231
if ($disksToAdd.Count -ne $intendedSerials.Count) {
throw "Disk count mismatch: $($disksToAdd.Count) eligible disks matched " +
Collaborator Author

Good catch. Fixed in cb4d1aa — wrapped the assignment in @() and added an inline comment so future readers understand why:

$disksToAdd = @(Get-PhysicalDisk -CanPool $true |
                Where-Object SerialNumber -in $intendedSerials)

The single-pool guard a few lines above already uses the same @($pool).Count pattern, so this brings the two checks into line.
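
As a minimal sketch of why the `@()` wrapper matters, the snippet below uses hypothetical in-memory objects (`$candidates` and made-up serial numbers) in place of real `Get-PhysicalDisk` output. It is illustrative only and is not the code from the TSG.

```powershell
# Hypothetical stand-ins for Get-PhysicalDisk output; serial numbers are made up.
$candidates = @(
    [pscustomobject]@{ SerialNumber = 'SN-0001' }
    [pscustomobject]@{ SerialNumber = 'SN-0002' }
)
$intendedSerials = @('SN-0001', 'SN-0003')

# @() forces an array even when the filter returns zero or one object,
# so .Count is always a genuine array length.
$disksToAdd = @($candidates | Where-Object SerialNumber -in $intendedSerials)

if ($disksToAdd.Count -ne $intendedSerials.Count) {
    Write-Warning ("Disk count mismatch: {0} matched, {1} intended." -f `
        $disksToAdd.Count, $intendedSerials.Count)
}
```

Here only one of the two intended serials matches, so the guard fires instead of silently adding a partial set.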

Comment on lines +250 to +251
if ($disksToAdd.Count -ne $intendedSerials.Count) {
throw "Disk count mismatch: $($disksToAdd.Count) eligible disks matched " +
Collaborator Author

Same fix as the HowTo. Wrapped in @() in cb4d1aa.

| `Offline` | The disk is offline | Bring only the intended disk online |
| `Insufficient Capacity` | The disk is too small | Replace with a supported disk |
| `Removable media not supported` | The disk is removable or presented as removable | Replace with supported internal storage |
| Stale metadata suspected | The disk has previous data or pool metadata | Reset only after confirming the disk is safe to wipe |
Collaborator Author

Verified against MSFT_PhysicalDisk.CannotPoolReason on Microsoft Learn — the canonical Windows enum is Unknown / Other / In a Pool / Not Healthy / Removable Media / In Use by Cluster / Offline / Insufficient Capacity / Spare Disk / Reserved by subsystem / Starting, plus the S2D Health Service extensions (Verification in progress, Hardware not compliant, etc.). 'Stale metadata suspected' isn't in either set; the real-world signature is CannotPoolReason = 'In a Pool' with Get-StoragePool | Get-PhysicalDisk finding no matching pool.

Fixed in cb4d1aa:

  • Dropped the 'Stale metadata suspected' row from the table.
  • Added a note above the table explaining stale-metadata is a sub-case of In a Pool handled separately in Step 2g.
  • Updated the In a Pool row's Action to: 'Confirm pool membership via Step 2a. If Get-StoragePool | Get-PhysicalDisk finds no match for the disk, treat it as stale metadata (Step 2g).'

Step 2g (Reset-PhysicalDisk) was already correctly written, so no change there.
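
The stale-metadata signature described above could be checked roughly as follows. This is a sketch under assumptions: `Test-StaleMetadataCandidate` is a hypothetical helper name, and it assumes `CannotPoolReason` surfaces as display strings such as 'In a Pool'. It only reports a candidate and does not reset anything.

```powershell
# Sketch only: flags a disk as a stale-metadata candidate when it reports
# 'In a Pool' but is not a member of any pool passed in.
function Test-StaleMetadataCandidate {   # hypothetical helper name
    param(
        [Parameter(Mandatory)] $Disk,
        [AllowEmptyCollection()] [object[]] $PooledDisks = @()
    )
    # Assumption: CannotPoolReason is exposed as one or more display strings.
    $reportsInPool = @($Disk.CannotPoolReason) -contains 'In a Pool'
    $pooledIds     = @(foreach ($p in $PooledDisks) { $p.UniqueId })
    $isPoolMember  = $pooledIds -contains $Disk.UniqueId
    return ($reportsInPool -and -not $isPoolMember)
}

# Assumed usage on a cluster node (the serial number is a placeholder):
# $disk   = Get-PhysicalDisk | Where-Object SerialNumber -eq '<serial>'
# $pooled = Get-StoragePool -IsPrimordial $false | Get-PhysicalDisk
# Test-StaleMetadataCandidate -Disk $disk -PooledDisks $pooled
```

A `$true` result is only a prompt to follow Step 2g after confirming the disk is safe to wipe.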

…fety refinements

- HowTo + Troubleshoot: wrap $disksToAdd assignment in @() so .Count is
  reliable when 0 or 1 disk matches (Copilot review).
- Troubleshoot CannotPoolReason table: drop 'Stale metadata suspected'
  row (not a real enum value). Add explanatory note that stale-metadata
  is a sub-case of 'In a Pool' handled in Step 2g, and link to it from
  the 'In a Pool' row.
- Troubleshoot Step 2e: clarify that firmware-not-compliant is usually
  fixed by updating firmware via OEM tooling, not vendor contact.
- Troubleshoot Step 2f: add [!CAUTION] block warning operators to check
  active storage jobs before bringing a disk online with Set-Disk.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AlBurns-MSFT
Collaborator Author

Two additional refinements applied in cb4d1aa beyond the Copilot review:

  1. Step 2e (Firmware not compliant) action wording — softened 'Contact the hardware vendor for firmware alignment' to 'Update firmware to a supported version using OEM update tooling (Dell DSU, HPE SUM, Lenovo XClarity Essentials, etc.), or escalate to the hardware vendor if no supported firmware exists.' This matches the most common real-world resolution path.

  2. Step 2f (Set-Disk -IsOffline $false) safety — added a [!CAUTION] block before the Set-Disk commands warning operators to check Get-StorageJob first. Forcing a disk online during an active repair/regeneration/rebalance can cause the job to retry against the newly-online path and extend the impact window.
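
The pre-check in refinement 2 might look like the sketch below. `Test-StorageJobsIdle` is a hypothetical helper name, and treating every job whose `JobState` is not 'Completed' as active is an assumption, not the wording of the TSG.

```powershell
# Sketch only: report whether all supplied storage jobs have finished.
function Test-StorageJobsIdle {   # hypothetical helper name
    param([AllowEmptyCollection()] [object[]] $Jobs = @())
    # Assumption: any JobState other than 'Completed' counts as still active.
    $active = @(foreach ($j in $Jobs) {
        if ("$($j.JobState)" -ne 'Completed') { $j }
    })
    return ($active.Count -eq 0)
}

# Assumed usage (the disk number is a placeholder):
# if (Test-StorageJobsIdle -Jobs @(Get-StorageJob)) {
#     Set-Disk -Number <n> -IsOffline $false
# } else {
#     Write-Warning 'Storage jobs still active; wait before bringing the disk online.'
# }
```

Keeping the check separate from `Set-Disk` lets an operator re-run it until the jobs drain without touching the disk.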

@Karl-WE

Karl-WE commented May 12, 2026

Please add a note that Dell OMIMSWAC offers a software-guided wizard.
This is more convenient for most admins and is included in Premier offerings.

Address @Karl-WE comment on PR Azure#284: flag that Dell OMIMSWAC (and
similar OEM WAC extensions) offer a guided disk-add alternative to
the manual PowerShell sequence. Vendor-neutral framing with Dell as
the named example since Microsoft Learn already references Dell
OpenManage Integration by name. Includes the WAC-in-Azure-portal
caveat (OEM extensions not supported in embedded WAC) so the note
doesn't mislead operators using the Azure portal management surface.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@AlBurns-MSFT
Collaborator Author

AlBurns-MSFT commented May 12, 2026

Thanks @Karl-WE, I added this in commit 4750df8. I kept the framing vendor-neutral with OMIMSWAC as the named example, and flagged that OEM WAC extensions aren't supported in the WAC instance embedded in the Azure portal (only standalone).

PR #284 now shows 4 commits, head is 4750df8.

@Karl-WE

Karl-WE commented May 12, 2026

That's true. I wonder who is working with the Azure WAC extension. It doesn't work most of the time anyway, and I gave up reporting it.
WAC gateway (aMode), though, is stable.

3 participants