crop-yield-regressor

Model Details

  • Version: v1.0.0
  • Model type: HistGradientBoostingRegressor
  • Task: Crop yield regression
  • Target: hg/ha_yield
  • Target unit: hectograms per hectare (hg/ha)
  • Framework: scikit-learn
  • Training timestamp: 2026-07-03T08:18:48.846104+00:00

Intended Use

This model predicts crop yield from country, crop, year, rainfall, pesticide usage, and average temperature features. It is intended for portfolio demonstration, MLOps workflows, API serving examples, and educational experimentation.

It should not be used as the sole basis for agricultural, financial, insurance, or policy decisions without validation on current local agronomic data.

Features

  • Area
  • Item
  • Year
  • average_rain_fall_mm_per_year
  • pesticides_tonnes
  • avg_temp

Preprocessing

  • Numeric features: Year, average_rain_fall_mm_per_year, pesticides_tonnes, avg_temp
  • Categorical features: Area, Item
  • Numeric imputation: median
  • Numeric scaling: standard_scaler
  • Categorical imputation: most_frequent
  • Categorical encoding: one_hot_encoder
  • Unknown categories during inference: ignore
  • Encoded feature count: 115

Evaluation

Holdout split configuration:

  • Test size: 0.2
  • Random state: 42

  Metric            Value               Unit
  mae               8942.264751         hg/ha
  mse               263803589.277396    (hg/ha)^2
  rmse              16242.031563        hg/ha
  r2                0.963603            dimensionless
  mean_prediction   77857.593610        hg/ha
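As a quick consistency check on the reported metrics, the short sketch here recomputes RMSE from MSE and expresses MAE relative to the mean prediction. The relative figure is only a rough sense of scale. It is not MAPE, and it uses the mean of the predictions rather than the mean of the true target.

```python
import math

# Values copied from the reported evaluation metrics.
mae = 8942.264751
mse = 263803589.277396
rmse = 16242.031563
mean_prediction = 77857.593610

# RMSE should equal the square root of MSE, up to rounding.
rmse_from_mse = math.sqrt(mse)
rmse_matches = math.isclose(rmse_from_mse, rmse, abs_tol=1e-4)

# MAE as a fraction of the mean prediction (rough scale reference only).
relative_mae = mae / mean_prediction

print(f"sqrt(mse) = {rmse_from_mse:.6f}, matches reported rmse: {rmse_matches}")
print(f"mae / mean_prediction = {relative_mae:.3f}")
```

The typical absolute error works out to roughly 11 to 12 percent of the mean predicted yield.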

Reproducibility

  • Dataset SHA-256: 3d47d3fdc35950b5333348c0d28dbe5534346237813dd1db9d4c26f2935d888b
  • Rows after cleaning (before the train/test split): 25932
  • Train rows: 20745
  • Test rows: 5187
  • Python version: 3.10.12
  • pandas version: 2.3.3
  • numpy version: 2.2.6
  • scikit-learn version: 1.7.2
  • joblib version: 1.5.3
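A minimal sketch of how the reproducibility figures could be checked locally. The dataset path in the usage comment is a placeholder, since the card does not name the file. `sha256_of_file` and `split_sizes` are hypothetical helper names. The rounding rule (test rows rounded up) is an assumption that happens to reproduce the reported 20745 / 5187 split.

```python
import hashlib
import math
from pathlib import Path

# Digest copied from the reproducibility list.
EXPECTED_SHA256 = "3d47d3fdc35950b5333348c0d28dbe5534346237813dd1db9d4c26f2935d888b"


def sha256_of_file(path: Path, chunk_size: int = 1 << 20) -> str:
    """Stream a file through SHA-256 so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def split_sizes(n_rows: int, test_size: float) -> tuple[int, int]:
    """Return (train_rows, test_rows), rounding the test share up (assumed rule)."""
    n_test = math.ceil(n_rows * test_size)
    return n_rows - n_test, n_test


print(split_sizes(25932, 0.2))

# Hypothetical usage; replace the placeholder path with the actual dataset file:
# assert sha256_of_file(Path("path/to/dataset.csv")) == EXPECTED_SHA256
```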

The full sklearn inference artifact is saved as:

  • crop_yield_model.joblib

The fitted preprocessing artifact is saved as:

  • crop_yield_preprocessor.joblib
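A hedged sketch of loading the inference artifact and requesting a prediction. It assumes that `crop_yield_model.joblib` is a complete pipeline that accepts a pandas DataFrame with the raw feature columns. The card describes it as the "full sklearn inference artifact" but does not confirm the input format. The helper names `load_model`, `to_feature_frame`, and `predict_yield` are hypothetical.

```python
import joblib
import pandas as pd

FEATURE_COLUMNS = [
    "Area",
    "Item",
    "Year",
    "average_rain_fall_mm_per_year",
    "pesticides_tonnes",
    "avg_temp",
]


def load_model(path: str = "crop_yield_model.joblib"):
    """Load the artifact. joblib uses pickle, so only load files you trust."""
    return joblib.load(path)


def to_feature_frame(records: list[dict]) -> pd.DataFrame:
    """Build a DataFrame with the documented feature columns in a fixed order."""
    frame = pd.DataFrame.from_records(records)
    missing = [name for name in FEATURE_COLUMNS if name not in frame.columns]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    return frame[FEATURE_COLUMNS]


def predict_yield(model, records: list[dict]) -> list[float]:
    """Return predictions in hg/ha from any object with a scikit-learn style predict()."""
    return [float(value) for value in model.predict(to_feature_frame(records))]


# Hypothetical usage with illustrative input values:
# model = load_model()
# predict_yield(model, [{
#     "Area": "India", "Item": "Maize", "Year": 2010,
#     "average_rain_fall_mm_per_year": 1083.0,
#     "pesticides_tonnes": 100.0, "avg_temp": 25.0,
# }])
```

Loading the artifact with the library versions listed under Reproducibility is the safest choice, because pickled scikit-learn objects are not guaranteed to load across versions.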

Versioned Artifacts

This export contains:

  • crop_yield_model.joblib
  • crop_yield_preprocessor.joblib
  • metrics.json
  • model_metadata.json
  • preprocessing_metadata.json
  • artifact_manifest.json
  • VERSION
  • LICENSE
  • README.md
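Before serving, a deployment script might confirm that an export directory contains every file in the list above. The sketch checks presence only. It does not read `artifact_manifest.json`, whose schema is not described in this card. The function name `missing_artifacts` is hypothetical.

```python
from pathlib import Path

# File names copied from the export listing.
EXPECTED_FILES = [
    "crop_yield_model.joblib",
    "crop_yield_preprocessor.joblib",
    "metrics.json",
    "model_metadata.json",
    "preprocessing_metadata.json",
    "artifact_manifest.json",
    "VERSION",
    "LICENSE",
    "README.md",
]


def missing_artifacts(export_dir: Path) -> list[str]:
    """Return the expected file names that are absent from export_dir."""
    return [name for name in EXPECTED_FILES if not (export_dir / name).is_file()]


# Hypothetical usage:
# missing = missing_artifacts(Path("path/to/export"))
# if missing:
#     raise FileNotFoundError(f"incomplete export, missing: {missing}")
```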

Limitations

The model is trained on historical tabular data and may not generalize to unseen regions, new farming practices, extreme climate events, or changed measurement methods. Input values should be validated by the serving API before inference.
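One way a serving API could validate inputs before inference is sketched here. The field names and types follow the feature list. The specific rules (finite numbers, non-negative rainfall and pesticide values) are assumptions chosen for illustration, not checks documented for this model, and `validate_record` is a hypothetical helper.

```python
import math
from numbers import Real

# Expected type per feature; Real accepts both int and float.
REQUIRED_FIELDS = {
    "Area": str,
    "Item": str,
    "Year": int,
    "average_rain_fall_mm_per_year": Real,
    "pesticides_tonnes": Real,
    "avg_temp": Real,
}
# Assumed sanity rule: these quantities cannot be negative.
NON_NEGATIVE = ("average_rain_fall_mm_per_year", "pesticides_tonnes")


def validate_record(record: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means the record passed."""
    errors = []
    for name, expected in REQUIRED_FIELDS.items():
        if name not in record:
            errors.append(f"{name}: missing")
            continue
        value = record[name]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected):
            errors.append(f"{name}: expected {expected.__name__}")
        elif expected is Real and not math.isfinite(value):
            errors.append(f"{name}: must be finite")
        elif name in NON_NEGATIVE and value < 0:
            errors.append(f"{name}: must be non-negative")
    return errors


# Illustrative record with made-up values.
example = {
    "Area": "India",
    "Item": "Maize",
    "Year": 2010,
    "average_rain_fall_mm_per_year": 1083.0,
    "pesticides_tonnes": 100.0,
    "avg_temp": 25.0,
}
print(validate_record(example))
```

Because unknown categories are ignored by the encoder rather than rejected, an API may also want to warn when `Area` or `Item` falls outside the values seen in training.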
