Use float64 in Jenks natural breaks internals (#1100) by brendancol · Pull Request #1101 · xarray-contrib/xarray-spatial

brendancol · 2026-03-30T20:55:31Z

Summary

Fixes #1100. The Jenks natural breaks algorithm used float32 for its internal matrices and bin edge array. The naive variance formula sum_squares - (sum * sum) / w subtracts two large, nearly equal numbers, so in float32 it suffers catastrophic cancellation and can lose all significant digits when the data has a large offset relative to its spread (elevations around 1000m, projected coordinates in the millions, etc.).
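A minimal sketch of the cancellation, for illustration only. This is not the xarray-spatial code: `naive_sse` and the sample values are made up for this example, and it only mirrors the sum_squares - (sum * sum) / w formula quoted above.

```python
import numpy as np


def naive_sse(values, dtype):
    """Sum of squared deviations via sum_squares - (sum * sum) / w,
    with every accumulator held in `dtype` (hypothetical helper)."""
    total = dtype(0)
    total_sq = dtype(0)
    w = dtype(0)
    for v in values:
        val = dtype(v)
        w += dtype(1)
        total += val
        total_sq += val * val
    return total_sq - (total * total) / w


# Made-up sample: offset of 100,000 with a spread of 10.
data = np.array([100000.0, 100002.0, 100004.0, 100006.0, 100008.0, 100010.0])

# Two-pass reference computed in float64: subtract the mean first, then square.
exact = float(np.sum((data - data.mean()) ** 2))

sse64 = float(naive_sse(data, np.float64))
sse32 = float(naive_sse(data, np.float32))

print("two-pass reference:", exact)
print("naive, float64    :", sse64)
print("naive, float32    :", sse32)
```

In float64 the naive formula still recovers the reference value of 70 for this sample. In float32 both terms are near 6e10, where adjacent representable values are 4096 apart, so the difference cannot carry any information about a quantity as small as 70.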

Changed four float32 sites to float64:

lower_class_limits matrix dtype
var_combinations matrix dtype
val = np.float32(data[i4]) cast removed
kclass bin edge array dtype

Test plan

test_natural_breaks_large_offset_1100: five tight clusters at offset 100,000 with spread of 10 -- all 5 classes must be separated cleanly
Full test_classify.py suite: 85 passed, no regressions
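A rough sketch of why data at this scale triggers the problem. The cluster layout below is an assumption (the actual values in test_natural_breaks_large_offset_1100 are not shown here); only the offset of 100,000 and the spread of 10 come from the test description.

```python
import numpy as np

offset, spread = 100_000.0, 10.0

# Hypothetical layout: five cluster centers 1,000 apart, 20 points each,
# every point within spread / 2 of its center.
rng = np.random.default_rng(0)
centers = offset + 1_000.0 * np.arange(5)
data = np.concatenate(
    [c + rng.uniform(-spread / 2, spread / 2, size=20) for c in centers]
)

# Gap between adjacent representable values near offset**2 (about 1e10),
# which is the magnitude of each squared value that gets accumulated.
ulp32 = float(np.spacing(np.float32(offset ** 2)))
ulp64 = float(np.spacing(np.float64(offset ** 2)))

# Upper bound on any one point's squared deviation from its cluster mean.
max_sq_dev = spread ** 2

print("float32 spacing near 1e10:", ulp32)
print("float64 spacing near 1e10:", ulp64)
print("max squared deviation    :", max_sq_dev)
```

Under these assumptions a single squared value in float32 is already rounded to a multiple of 1024, which is coarser than the largest possible within-cluster squared deviation (100). The float64 spacing at the same magnitude is on the order of 1e-6, which leaves ample room.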

The Jenks matrices and bin edge array used float32, causing the naive variance formula (sum_squares - sum*sum/w) to lose all significant digits when data had a large offset relative to its spread. Changed lower_class_limits, var_combinations, val cast, and kclass to float64.

brendancol added 2 commits March 30, 2026 13:54

Add regression test for Jenks float32 precision (#1100)

0b8a5c1

test_natural_breaks_large_offset_1100: five tight clusters offset by 100,000 must be separated into 5 distinct classes. With float32 internals, the variance calculation lost all signal and merged clusters.

github-actions bot added the "performance" label (PR touches performance-sensitive code) Mar 30, 2026

Retrigger CI

d5e8b02

brendancol merged commit 629d533 into master Mar 31, 2026
11 checks passed

brendancol merged 3 commits into master from issue-1100
