Handle Pandas string-like columns by zmek · Pull Request #154 · UCL-CORU/patientflow

zmek · 2026-03-18T14:44:31Z

Summary

- Fixes "ValueError: could not convert string to float", raised when training classifiers whose categorical columns use pandas StringDtype, CategoricalDtype, or ArrowDtype instead of the legacy object dtype
- Replaces brittle dtype == "object" checks in create_column_transformer and FeatureColumnTransformer.fit with an exclusion-based helper, _is_string_like_column, that is robust to future pandas dtype changes
- Adds tests for StringDtype and CategoricalDtype columns
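
For readers unfamiliar with the failure mode, here is a minimal sketch of what an exclusion-based check could look like. It is not the code merged in this PR: the function name is_string_like_column, the set of excluded dtype families, and the example column names are all assumptions made for illustration. The idea is to treat a column as string-like whenever it is not numeric, boolean, datetime, or timedelta, rather than asking whether its dtype equals "object".

```python
import pandas as pd
from pandas.api import types as ptypes


def is_string_like_column(series: pd.Series) -> bool:
    """Hypothetical stand-in for the PR's _is_string_like_column helper.

    Exclusion-based: any column that is not numeric, boolean, datetime, or
    timedelta is treated as string-like (i.e. a candidate for categorical
    encoding). The excluded families are an assumption of this sketch.
    """
    dtype = series.dtype
    return not (
        ptypes.is_numeric_dtype(dtype)
        or ptypes.is_bool_dtype(dtype)
        or ptypes.is_datetime64_any_dtype(dtype)
        or ptypes.is_timedelta64_dtype(dtype)
    )


# Illustrative frame; the column names are made up for this example.
df = pd.DataFrame(
    {
        "age": [34, 71, 58],
        "is_admitted": [True, False, True],
        "arrival_method": pd.Series(
            ["walk-in", "ambulance", "walk-in"], dtype="string"
        ),
        "specialty": pd.Series(
            ["medical", "surgical", "medical"], dtype="category"
        ),
        "triage_note": pd.Series(["a", "b", "c"], dtype="object"),
    }
)

# The old style of check only finds columns whose dtype is literally object.
legacy_check = [c for c in df.columns if df[c].dtype == "object"]

# The exclusion-based check also finds StringDtype and CategoricalDtype columns.
robust_check = [c for c in df.columns if is_string_like_column(df[c])]

print("dtype == 'object':", legacy_check)
print("exclusion-based:  ", robust_check)
```

With the equality check, the StringDtype and CategoricalDtype columns are not recognised as categorical, so they can end up reaching a numeric step unencoded, which is consistent with the ValueError described above. The exclusion-based version classifies them as string-like without having to enumerate every string dtype pandas may introduce.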

Test plan

[x] Existing classifier tests pass
[x] New test_string_dtype_columns passes
[x] New test_categorical_dtype_columns passes
[x] Pre-commit hooks pass

Commit 25859ff: Fix classifier crash when categorical columns use StringDtype or CategoricalDtype instead of object dtype

zmek merged commit 9b4831a into main Mar 18, 2026
6 checks passed

zmek deleted the handle-pandas-stringlike-cols branch March 18, 2026 14:48

zmek mentioned this pull request Mar 18, 2026

Fix: Handle StringDtype and CategoricalDtype columns in classifier training pipeline #153

Closed
