[ENH] Add probabilistic regressor with shrinking intervals #987

Open
arnavk23 wants to merge 15 commits into sktime:main from arnavk23:reducing-interval-regression

arnavk23 commented Mar 22, 2026

Reference Issues/PRs

Towards #7

What does this implement/fix? Explain your changes.

This PR adds a new probabilistic baseline estimator ShrinkingNormalIntervalRegressor for interval and quantile prediction.

  • In mean_sd mode, intervals are built as mean ± z * sd / sqrt(n), where mean and sd are estimated from the training targets, n is the training sample size, and z is the two-sided standard Normal critical value for the requested coverage.
  • Because the standard error sd / sqrt(n) scales as 1/sqrt(n), interval width decreases as sample size grows, giving a simple demonstration of shrinking uncertainty with more data.
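
As a rough illustration of the formula above (not the code in this PR), the sketch below computes mean ± z * sd / sqrt(n) with scipy.stats.norm.ppf. The helper name mean_sd_interval, the use of the sample standard deviation (ddof=1), and the fallback to sd = 0 when n == 1 are assumptions made for the example.

```python
import numpy as np
from scipy.stats import norm


def mean_sd_interval(y, coverage=0.9):
    """Return (lower, upper) computed as mean ± z * sd / sqrt(n).

    Hypothetical helper for illustration; not the PR's implementation.
    """
    y = np.asarray(y, dtype=float)
    n = y.size
    mean = y.mean()
    # Assumption: sample standard deviation, with 0.0 when only one point exists.
    sd = y.std(ddof=1) if n > 1 else 0.0
    # Two-sided standard Normal critical value for the requested coverage.
    z = norm.ppf(0.5 + coverage / 2.0)
    half_width = z * sd / np.sqrt(n)
    return mean - half_width, mean + half_width


# Width of the 90% interval for growing training sets drawn from N(0, 1).
rng = np.random.default_rng(0)
widths = []
for n in (25, 100, 400):
    lower, upper = mean_sd_interval(rng.normal(size=n), coverage=0.9)
    widths.append(upper - lower)
    print(f"n={n:4d}  width={upper - lower:.4f}")
```

With sd roughly constant across samples, quadrupling n should roughly halve the printed width.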

What is implemented.

  • A feature-agnostic probabilistic regressor that estimates target location/spread from training targets.
  • Predictive intervals via predict_interval and predictive quantiles via predict_quantiles.
  • Two modes:
  1. mean_sd: Normal-approximation intervals/quantiles with shrinkage in n.
  2. quantile: empirical quantile baseline from training targets.
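
For contrast with mean_sd, a static empirical quantile baseline could look roughly like the sketch below. The function name empirical_quantiles and its arguments are hypothetical, and the (n_rows, n_alpha) output layout is an assumption for the example rather than the PR's return format.

```python
import numpy as np


def empirical_quantiles(y_train, alpha, n_rows):
    """Repeat the empirical quantiles of y_train for every row to predict.

    Hypothetical helper for illustration; features are ignored entirely.
    """
    y_train = np.asarray(y_train, dtype=float)
    alpha = np.atleast_1d(alpha)
    # One empirical quantile of the training targets per requested level.
    q = np.quantile(y_train, alpha)
    # The same quantiles are returned for each prediction row.
    return np.tile(q, (n_rows, 1))


preds = empirical_quantiles([1.0, 2.0, 3.0, 4.0, 5.0], alpha=[0.1, 0.5, 0.9], n_rows=2)
print(preds.shape)
```

Because the result depends only on the empirical distribution of the training targets, there is no 1/sqrt(n) factor and therefore no built-in shrinkage.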

Limitations

  • mean_sd assumes an approximately Normal target distribution around a global mean and uses a single global spread estimate.
  • quantile mode is static: by construction, its intervals do not shrink with n.
  • The estimator ignores feature effects in uncertainty (and in location beyond a global mean), so it is a calibration/baseline-style method rather than a full conditional model.

Does your contribution introduce a new dependency? If yes, which one?

No new dependencies.

What should a reviewer concentrate their feedback on?

  1. Correctness of the shrinkage logic in mean_sd mode, especially width scaling with 1/sqrt(n).
  2. Clarity and appropriateness of the two-mode API (shrinking Normal mode vs static quantile baseline).
  3. Whether the documented limitations are sufficiently explicit for users.

Did you add any tests for the change?

Yes.

  • Added focused tests for interval shrinkage with increasing n in mean_sd mode.
  • Added tests for predict_interval and predict_quantiles shape and value correctness.
  • Added a small-n edge-case test to ensure stable finite outputs.
  • Generic estimator suite coverage remains in place.
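
The shrinkage and small-n checks listed above might be written along the following lines. This is a standalone sketch against a hypothetical interval_width helper, not the tests added in this PR, which exercise the estimator's predict_interval and predict_quantiles methods.

```python
import numpy as np
from scipy.stats import norm


def interval_width(y, coverage=0.9):
    """Width of mean ± z * sd / sqrt(n); hypothetical stand-in for the estimator."""
    y = np.asarray(y, dtype=float)
    # Assumption: fall back to sd = 0.0 when a single point gives no spread estimate.
    sd = y.std(ddof=1) if y.size > 1 else 0.0
    z = norm.ppf(0.5 + coverage / 2.0)
    return 2.0 * z * sd / np.sqrt(y.size)


def test_width_shrinks_with_n():
    # Tiling a fixed pattern keeps the spread comparable while n grows.
    base = np.array([-1.0, 1.0])
    widths = [interval_width(np.tile(base, k)) for k in (2, 8, 32)]
    assert widths[0] > widths[1] > widths[2]


def test_small_n_is_finite():
    # Edge cases: one and two training targets must not yield nan or inf.
    assert np.isfinite(interval_width([3.0]))
    assert np.isfinite(interval_width([3.0, 4.0]))
```

Both functions follow the pytest naming convention, so a pytest run would collect them without further setup.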

Any other comments?

This estimator is intentionally a simple, educational probabilistic baseline. Feedback on API naming, mode semantics, and whether to keep both modes in one class or split them into separate classes is especially welcome.

PR checklist

For all contributions
  • I've added myself to the list of contributors with any new badges I've earned :-)
  • The PR title starts with either [ENH], [MNT], [DOC], or [BUG].
For new estimators
  • I've added the estimator to the API reference - in docs/source/api_reference/regression.rst, follow the pattern.
  • I've added one or more illustrative usage examples to the docstring, in a pydocstyle compliant Examples section.
  • If the estimator relies on a soft dependency, I've set the python_dependencies tag and ensured dependency isolation.

arnavk23 added 14 commits March 22, 2026 18:09
…ntervalRegressor

- Implement method dispatch for 'mean_sd' and 'quantile' in _predict_interval and _predict_quantiles
- Replace Monte Carlo z-score with scipy.stats.norm.ppf
- Add input validation for method, coverage, and alpha
- Add type hints and improve robustness
- Remove try/except for scipy import; require scipy unconditionally
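
A minimal sketch of what the validation and z-score changes in this commit could look like. The names validate_params, critical_value, and VALID_METHODS are hypothetical, and the exact checks and error messages in the PR may differ.

```python
from scipy.stats import norm

VALID_METHODS = ("mean_sd", "quantile")


def validate_params(method, coverage, alpha):
    """Raise ValueError for an unknown method or levels outside (0, 1)."""
    if method not in VALID_METHODS:
        raise ValueError(f"method must be one of {VALID_METHODS}, got {method!r}")
    for name, values in (("coverage", coverage), ("alpha", alpha)):
        for v in values:
            if not 0.0 < v < 1.0:
                raise ValueError(f"{name} values must lie in (0, 1), got {v}")


def critical_value(coverage):
    """Exact two-sided Normal critical value, replacing a Monte Carlo estimate."""
    return float(norm.ppf(0.5 + coverage / 2.0))


validate_params("mean_sd", coverage=[0.9], alpha=[0.05, 0.95])
print(critical_value(0.95))
```

Using norm.ppf makes the critical value deterministic, so repeated calls with the same coverage give identical interval widths.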
…le docs/examples so they now match the behavior: mean_sd is the shrinking normal-approximation mode, while quantile is a static empirical baseline. The public import in __init__.py:10 and the tests were updated to use the new name.