Regression Operator
The Regression Operator is a low-code operator for supervised tabular regression. It trains a model from a training dataset, optionally evaluates on held-out test data, and writes a consistent set of artifacts such as predictions, metrics, an HTML report, and a serialized model bundle.
Overview
Required inputs
The current implementation requires:
training_data
target_column
All columns in training_data except target_column are treated as features.
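The feature rule above can be illustrated with a minimal sketch. The helper name `split_features_target` and the sample columns are hypothetical and are not part of the operator; the sketch only shows the "everything except the target" convention.

```python
def split_features_target(rows, target_column):
    """Split a list of row dicts into feature dicts and target values."""
    features, targets = [], []
    for row in rows:
        if target_column not in row:
            raise KeyError(f"target column {target_column!r} missing from row")
        # Every column other than the target is kept as a feature.
        features.append({k: v for k, v in row.items() if k != target_column})
        targets.append(row[target_column])
    return features, targets


# Hypothetical sample data: "price" plays the role of target_column.
rows = [
    {"sqft": 1200, "city": "Austin", "price": 310000.0},
    {"sqft": 950, "city": "Boise", "price": 240000.0},
]
X, y = split_features_target(rows, "price")
```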
Optional inputs
The operator also supports:
test_data for held-out evaluation
output_directory for artifact location
column_types to override automatic type inference
model_kwargs to control explicit model runs
save_and_deploy_to_md to save the trained model to OCI Model Catalog and create a Model Deployment
Supported models
The supported model values are:
auto
linear_regression
random_forest
knn
xgboost
auto performs cross-validation across the explicit model families and selects the best one for the configured metric. Explicit models use Optuna-based tuning by default.
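To make the selection idea concrete, here is a small standard-library sketch of picking a model by cross-validated score. It is not the operator's implementation: the two toy models (`fit_mean`, `fit_line`), the RMSE metric, and the interleaved fold scheme are assumptions chosen to keep the example self-contained, and they stand in for the real model families listed above.

```python
from statistics import mean


def fit_mean(xs, ys):
    """Toy model: always predict the training mean."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Toy model: one-feature least-squares line."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def rmse(y_true, y_pred):
    """Root mean squared error (lower is better)."""
    return (sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)) ** 0.5


def cross_val_score(fit, xs, ys, k=3):
    """Average RMSE over k folds; fold i holds out every k-th sample."""
    scores = []
    for fold in range(k):
        train = [i for i in range(len(xs)) if i % k != fold]
        test = [i for i in range(len(xs)) if i % k == fold]
        model = fit([xs[i] for i in train], [ys[i] for i in train])
        preds = [model(xs[i]) for i in test]
        scores.append(rmse([ys[i] for i in test], preds))
    return mean(scores)


def select_best(candidates, xs, ys, k=3):
    """Return the name of the lowest-RMSE candidate and all CV scores."""
    scored = {name: cross_val_score(fit, xs, ys, k) for name, fit in candidates.items()}
    return min(scored, key=scored.get), scored


# Hypothetical, nearly linear data.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ys = [2.1, 3.9, 6.2, 7.8, 10.1, 12.0]
best_name, scores = select_best({"mean_baseline": fit_mean, "line": fit_line}, xs, ys)
```

On this data the straight-line model wins because the target is close to linear in the single feature. Whether a metric is minimized or maximized depends on the metric; RMSE is minimized here.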
Preprocessing
By default, the operator:
infers numeric, categorical, and date columns
imputes missing numeric values with the median
imputes missing categorical values with the mode
one-hot encodes categorical columns
expands date columns into year, month, day, dayofweek, and dayofyear
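The default steps above can be sketched on a few columns using only the Python standard library. This is an illustration of the transformations, not the operator's code: the helper names, the sample columns, and the Monday=0 numbering for dayofweek are assumptions.

```python
from datetime import date
from statistics import median, mode


def impute(values, fill):
    """Replace None entries with the given fill value."""
    return [fill if v is None else v for v in values]


def one_hot(values):
    """One-hot encode a categorical column into {category: list of 0/1}."""
    categories = sorted(set(values))
    return {c: [1 if v == c else 0 for v in values] for c in categories}


def expand_date(values):
    """Expand date objects into the five calendar parts."""
    return {
        "year": [d.year for d in values],
        "month": [d.month for d in values],
        "day": [d.day for d in values],
        # Assumption: Monday=0, as returned by date.weekday().
        "dayofweek": [d.weekday() for d in values],
        "dayofyear": [d.timetuple().tm_yday for d in values],
    }


# Hypothetical columns with missing values marked as None.
sqft = [1200.0, None, 900.0, 1000.0]
city = ["Austin", None, "Boise", "Austin"]
listed = [date(2024, 1, 15), date(2024, 2, 29), date(2024, 3, 1), date(2024, 12, 31)]

# Numeric column: fill with the median of the observed values.
sqft_filled = impute(sqft, median([v for v in sqft if v is not None]))
# Categorical column: fill with the most frequent observed value, then encode.
city_filled = impute(city, mode([v for v in city if v is not None]))
city_encoded = one_hot(city_filled)
# Date column: replace with derived calendar features.
date_parts = expand_date(listed)
```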
Artifacts
Depending on the configuration and available data, the operator can write:
training_predictions.csv
test_predictions.csv
training_metrics.csv
test_metrics.csv
global_explanations.csv
report.html
model.pkl
model_registration_info.json
deployment_info.json
global_explanations.csv is written only when generate_explanations: true and explainability output is successfully produced.
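Because the set of files depends on configuration, it can help to check which artifacts a run actually produced. The sketch below is a hypothetical helper (`list_artifacts` is not part of the operator) that assumes the artifacts are written directly into the output directory; the file names are the ones listed above.

```python
import tempfile
from pathlib import Path

# File names taken from the artifact list above.
KNOWN_ARTIFACTS = [
    "training_predictions.csv",
    "test_predictions.csv",
    "training_metrics.csv",
    "test_metrics.csv",
    "global_explanations.csv",
    "report.html",
    "model.pkl",
    "model_registration_info.json",
    "deployment_info.json",
]


def list_artifacts(output_directory):
    """Map each known artifact name to whether it exists in the directory."""
    out = Path(output_directory)
    return {name: (out / name).is_file() for name in KNOWN_ARTIFACTS}


# Demo against a temporary directory that contains only a report.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "report.html").write_text("<html></html>")
    present = list_artifacts(tmp)
```

A missing `global_explanations.csv` in such a check is expected when explanations were not requested or could not be produced.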