-
Notifications
You must be signed in to change notification settings - Fork 184
[Cleanup] Combine Batched and Regular KMeans Impl #2015
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Changes from 16 commits
66d7fd3
0a09e6f
99a5730
a077406
d659875
ec2e8b7
03a6473
42a8d9d
86af2fa
d4e4e2c
0819af5
e0f079c
c2f7390
b9c3102
e3956c1
986d78a
a8e1d26
384d054
455b286
5462809
6ba759c
e76eaac
afbefdf
e62a63c
e4f08bf
6e4a8f0
4a8a85c
bbf2a9f
410092c
c515c1e
e8e63ab
30c457c
ab96623
269f23c
80a22ca
ac06b05
0569340
8cac63a
f6df4ae
9fc74b1
dec3dc4
0d030a2
b1c034e
a482495
8ecfdc1
1e1525e
ec22e07
d2e410d
b791c38
a05a006
73293cf
880c7b9
e2035ec
e28c200
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -113,9 +113,14 @@ struct params : base_params { | |
| int batch_centroids = 0; | ||
|
|
||
| /** | ||
| * If true, check inertia during iterations for early convergence. | ||
| * Number of samples to randomly draw for the KMeansPlusPlus initialization | ||
| * step. A random subset of this size is used for centroid seeding. | ||
| * When set to 0 the default depends on the data location: | ||
| * - Device data: n_samples (use the full dataset). | ||
| * - Host data: min(3 * n_clusters, n_samples). | ||
| * Default: 0. | ||
| */ | ||
| bool inertia_check = false; | ||
| int64_t init_size = 0; | ||
|
Comment on lines
115
to
+127
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Breaking removal of The Please either:
As per coding guidelines: "API changes flagged for docs/ updates" and "Breaking changes require deprecation warnings and migration guide updates". 🤖 Prompt for AI Agents
Member
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I soflty agree with this rabbit comment, would be worth to add a not in the PR description and code, C++ callers using the C++ API directly will fail to compile, not warn. That's a real source-API break worth a release-notes line at least There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
Concretely, I'd suggest three things:
/**
* `@deprecated` since 26.06 — ignored. Inertia-based convergence now always runs.
* Use `init_size` to control KMeans++ seeding sample count.
* Will be removed in a future release.
*/
[[deprecated("inertia_check is ignored since 26.06; inertia-based convergence always runs")]]
bool inertia_check = false;
Would you like me to plan the code changes for items 1 and 2 above? 🧠 Learnings used |
||
|
|
||
| /** | ||
| * Number of samples to process per GPU batch when fitting with host data. | ||
|
|
||
Uh oh!
There was an error while loading. Please reload this page.