The HistGradientBoosting estimators in scikit-learn natively support categorical features, so no preprocessing is needed, as documented here:
https://scikit-learn.org/stable/modules/ensemble.html#categorical-support-gbdt
However, the MLJ interface appears to restrict the input to purely continuous tables, as seen in the source code:
meta(HistGradientBoostingClassifier,
    input   = Table(Continuous),
    target  = AbstractVector{<:Finite},
    weights = false
)
Since MLJ already enforces the scitype schema, categorical feature columns could be inferred automatically. I hope this can be addressed; thanks.
On second thought, since scikit-learn auto-infers categorical features from the dtype, relaxing the `Table` scitype to some union type may suffice.
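Concretely, the proposed relaxation might look like the following sketch. This mirrors the `meta` declaration quoted above, but `Table(Continuous, Finite)` is the suggested union, not the current code, and the exact scitype to accept would be up to the maintainers:

```julia
# Hypothetical sketch: accept tables mixing Continuous and
# Finite (categorical) columns instead of Continuous only.
meta(HistGradientBoostingClassifier,
    input   = Table(Continuous, Finite),
    target  = AbstractVector{<:Finite},
    weights = false
)
```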