Handle Pandas string-like columns #154

Merged
zmek merged 1 commit into main from handle-pandas-stringlike-cols
Mar 18, 2026
Collaborator

zmek commented Mar 18, 2026

Summary

Fixes ValueError: could not convert string to float, raised when training classifiers on categorical columns that use pandas StringDtype, CategoricalDtype, or ArrowDtype instead of the legacy object dtype.
Replaces the brittle dtype == "object" checks in create_column_transformer and FeatureColumnTransformer.fit with an exclusion-based helper, _is_string_like_column, that is robust to future pandas dtype changes.
Adds tests for StringDtype and CategoricalDtype columns.
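
For readers without the diff at hand, here is a minimal sketch of what an exclusion-based check could look like. It is not the code merged in this PR: the function name is_string_like_column, the set of excluded dtypes, and the example column names are all assumptions made for illustration. The idea is to rule out dtypes known not to hold text (numeric, boolean, datetime, timedelta) and treat everything else as string-like, so object, StringDtype, and CategoricalDtype columns are caught without naming each one.

```python
import pandas as pd
from pandas.api import types as pdt


def is_string_like_column(series: pd.Series) -> bool:
    """Hypothetical stand-in for the PR's _is_string_like_column helper."""
    dtype = series.dtype
    # Exclusion-based: reject dtypes that cannot hold text.
    # is_numeric_dtype covers ints, floats, and booleans.
    if pdt.is_numeric_dtype(dtype):
        return False
    if pdt.is_datetime64_any_dtype(dtype) or pdt.is_timedelta64_dtype(dtype):
        return False
    # Anything left (object, StringDtype, CategoricalDtype, ...) is
    # treated as string-like.
    return True


# Illustrative data only; column names are made up for this sketch.
df = pd.DataFrame(
    {
        "age": [31.0, 45.5, 27.0],
        "is_member": [True, False, True],
        "city_object": pd.Series(["Oslo", "Lima", "Oslo"], dtype=object),
        "city_string": pd.Series(["Oslo", "Lima", "Oslo"], dtype="string"),
        "city_category": pd.Series(["Oslo", "Lima", "Oslo"], dtype="category"),
        "joined": pd.to_datetime(["2024-01-01", "2024-02-01", "2024-03-01"]),
    }
)

string_like = [name for name in df.columns if is_string_like_column(df[name])]
```

A check written as dtype == "object" would select only city_object here, which is the kind of gap the summary describes. The trade-off of the exclusion approach is that any unfamiliar non-numeric dtype is routed to the categorical path by default.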

Test plan

[x] Existing classifier tests pass
[x] New test_string_dtype_columns passes
[x] New test_categorical_dtype_columns passes
[x] Pre-commit hooks pass

zmek merged commit 9b4831a into main Mar 18, 2026
6 checks passed
zmek deleted the handle-pandas-stringlike-cols branch March 18, 2026 14:48