How to use thananchayan/crop-yield-regressor with Scikit-learn:

```python
from huggingface_hub import hf_hub_download
import joblib

# Only load pickle files from sources you trust.
# Read more about it here: https://skops.readthedocs.io/en/stable/persistence.html
model = joblib.load(
    hf_hub_download("thananchayan/crop-yield-regressor", "sklearn_model.joblib")
)
```
crop-yield-regressor
Model Details
- Version: `v1.0.0`
- Model type: `HistGradientBoostingRegressor`
- Task: Crop yield regression
- Target: `hg/ha_yield`
- Target unit: hectograms per hectare (`hg/ha`)
- Framework: scikit-learn
- Training timestamp: `2026-07-03T08:18:48.846104+00:00`
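Because the target is expressed in hectograms per hectare, predictions are easier to read after a unit conversion. The sketch below relies only on the definitions 1 hg = 0.1 kg and 1 tonne = 10,000 hg; the function names are illustrative and not part of the model export.

```python
# Unit helpers for the hg/ha target. Function names are hypothetical.
HG_PER_KG = 10          # 1 kg = 10 hectograms
HG_PER_TONNE = 10_000   # 1 tonne = 1,000 kg = 10,000 hectograms


def hg_per_ha_to_kg_per_ha(value_hg_per_ha: float) -> float:
    """Convert a yield from hectograms per hectare to kilograms per hectare."""
    return value_hg_per_ha / HG_PER_KG


def hg_per_ha_to_t_per_ha(value_hg_per_ha: float) -> float:
    """Convert a yield from hectograms per hectare to tonnes per hectare."""
    return value_hg_per_ha / HG_PER_TONNE


# 50,000 hg/ha is 5,000 kg/ha, or 5 t/ha.
print(hg_per_ha_to_kg_per_ha(50_000))
print(hg_per_ha_to_t_per_ha(50_000))
```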
Intended Use
This model predicts crop yield from country, crop, year, rainfall, pesticide usage, and average temperature features. It is intended for portfolio demonstration, MLOps workflows, API serving examples, and educational experimentation.
It should not be used as the sole basis for agricultural, financial, insurance, or policy decisions without validation on current local agronomic data.
Features
- `Area`
- `Item`
- `Year`
- `average_rain_fall_mm_per_year`
- `pesticides_tonnes`
- `avg_temp`
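A minimal sketch of how an inference request could be shaped from these features. It assumes the loaded artifact is a full pipeline that accepts a raw pandas DataFrame with these six columns, which this card suggests but does not state outright. The helper names and the example values are hypothetical.

```python
import pandas as pd

# Feature columns in the order listed in this card.
FEATURE_COLUMNS = [
    "Area",
    "Item",
    "Year",
    "average_rain_fall_mm_per_year",
    "pesticides_tonnes",
    "avg_temp",
]


def make_input_frame(records):
    """Build a DataFrame holding exactly the expected feature columns."""
    frame = pd.DataFrame(records)
    missing = [name for name in FEATURE_COLUMNS if name not in frame.columns]
    if missing:
        raise ValueError(f"missing feature columns: {missing}")
    return frame[FEATURE_COLUMNS]


def predict_yield(model, records):
    """Return one prediction (in hg/ha) per record.

    `model` is assumed to be any object with a scikit-learn style
    `predict` method that accepts the raw feature DataFrame.
    """
    return list(model.predict(make_input_frame(records)))


# Illustrative values only; they are not taken from the training data.
example_records = [
    {
        "Area": "India",
        "Item": "Maize",
        "Year": 2010,
        "average_rain_fall_mm_per_year": 1083.0,
        "pesticides_tonnes": 40000.0,
        "avg_temp": 25.5,
    }
]
```

With a model loaded through `joblib.load`, the call would be `predict_yield(model, example_records)`.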
Preprocessing
- Numeric features: `Year`, `average_rain_fall_mm_per_year`, `pesticides_tonnes`, `avg_temp`
- Categorical features: `Area`, `Item`
- Numeric imputation: `median`
- Numeric scaling: `standard_scaler`
- Categorical imputation: `most_frequent`
- Categorical encoding: `one_hot_encoder`
- Unknown categories during inference: `ignore`
- Encoded feature count: `115`
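The settings above map naturally onto a scikit-learn `ColumnTransformer`. The following is a sketch of one way to express them, not the author's training code: the step names and the toy rows are assumptions made for illustration.

```python
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.impute import SimpleImputer
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import OneHotEncoder, StandardScaler

NUMERIC = ["Year", "average_rain_fall_mm_per_year", "pesticides_tonnes", "avg_temp"]
CATEGORICAL = ["Area", "Item"]


def build_preprocessor() -> ColumnTransformer:
    """Median-impute and scale numerics; impute and one-hot encode categoricals."""
    numeric = Pipeline([
        ("impute", SimpleImputer(strategy="median")),
        ("scale", StandardScaler()),
    ])
    categorical = Pipeline([
        ("impute", SimpleImputer(strategy="most_frequent")),
        # Categories unseen during fit are encoded as all zeros.
        ("encode", OneHotEncoder(handle_unknown="ignore")),
    ])
    return ColumnTransformer([
        ("num", numeric, NUMERIC),
        ("cat", categorical, CATEGORICAL),
    ])


# Toy rows with made-up values, including one missing numeric entry.
toy = pd.DataFrame({
    "Area": ["India", "Brazil", "India"],
    "Item": ["Maize", "Wheat", "Maize"],
    "Year": [2008, 2009, 2010],
    "average_rain_fall_mm_per_year": [1083.0, 1761.0, 1083.0],
    "pesticides_tonnes": [100.0, None, 300.0],
    "avg_temp": [25.5, 22.0, 26.1],
})

preprocessor = build_preprocessor().fit(toy)
encoded = preprocessor.transform(toy)

# 4 numeric columns + 2 Area categories + 2 Item categories = 8 columns.
print(encoded.shape)

# An unseen Area still transforms; its one-hot block is simply empty.
unseen = toy.iloc[[0]].assign(Area="Atlantis")
print(preprocessor.transform(unseen).shape)
```

The same arithmetic explains the encoded feature count of 115 reported above: four numeric columns plus 111 one-hot columns across `Area` and `Item`.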
Evaluation
Holdout split configuration:
- Test size: `0.2`
- Random state: `42`
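The split configuration can be checked against the row counts reported under Reproducibility. This sketch assumes the split was produced with scikit-learn's `train_test_split`, which the card does not state; that function rounds a fractional test size up and assigns the remaining rows to the training set.

```python
import math


def holdout_sizes(n_rows: int, test_size: float) -> tuple[int, int]:
    """Return (train_rows, test_rows) for a fractional test size.

    Mirrors the rounding used by scikit-learn's train_test_split when
    only test_size is given: the test set is rounded up.
    """
    n_test = math.ceil(test_size * n_rows)
    return n_rows - n_test, n_test


# 25,932 cleaned rows with a 0.2 test fraction.
print(holdout_sizes(25932, 0.2))  # (20745, 5187)
```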
| Metric | Value |
|---|---|
| mae | 8942.264751 |
| mse | 263803589.277396 |
| rmse | 16242.031563 |
| r2 | 0.963603 |
| mean_prediction | 77857.593610 |
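The reported metrics can be cross-checked with a few lines of arithmetic. The sketch below confirms that the RMSE is the square root of the MSE and derives two context figures from the definitions: the standard deviation of the test target implied by R² (since R² = 1 − MSE / Var(y)), and the MAE relative to the mean prediction. The derived values are approximations based on the rounded numbers in the table, not additional reported results.

```python
import math

# Values copied from the metrics table.
mae = 8942.264751
mse = 263803589.277396
rmse_reported = 16242.031563
r2 = 0.963603
mean_prediction = 77857.593610

# RMSE is the square root of MSE.
rmse_from_mse = math.sqrt(mse)

# R^2 = 1 - MSE / Var(y), so Var(y) = MSE / (1 - R^2) on the test set.
implied_target_std = math.sqrt(mse / (1.0 - r2))

# MAE as a fraction of the mean prediction, a rough sense of relative error.
mae_relative_to_mean_prediction = mae / mean_prediction

print(f"RMSE from MSE:        {rmse_from_mse:.6f} hg/ha")
print(f"Implied target std:   {implied_target_std:.1f} hg/ha")
print(f"MAE / mean prediction: {mae_relative_to_mean_prediction:.3f}")
```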
Reproducibility
- Dataset SHA-256: `3d47d3fdc35950b5333348c0d28dbe5534346237813dd1db9d4c26f2935d888b`
- Training rows after cleaning: `25932`
- Train rows: `20745`
- Test rows: `5187`
- Python version: `3.10.12`
- pandas version: `2.3.3`
- numpy version: `2.2.6`
- scikit-learn version: `1.7.2`
- joblib version: `1.5.3`
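To confirm that a local copy of the dataset matches the one used for training, its SHA-256 digest can be compared with the value above. This is a minimal sketch using only the standard library; the card does not name the dataset file, so the path passed in is up to the caller, and the helper names are hypothetical.

```python
import hashlib

# Digest reported in this card.
EXPECTED_SHA256 = "3d47d3fdc35950b5333348c0d28dbe5534346237813dd1db9d4c26f2935d888b"


def sha256_of_file(path, chunk_size: int = 1 << 20) -> str:
    """Hash a file in chunks so large datasets need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_card(path) -> bool:
    """Return True if the file's digest equals the one reported in the card."""
    return sha256_of_file(path) == EXPECTED_SHA256
```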
The full scikit-learn inference artifact is saved as `crop_yield_model.joblib`.
The fitted preprocessing artifact is saved as `crop_yield_preprocessor.joblib`.
Versioned Artifacts
This export contains:
- `crop_yield_model.joblib`
- `crop_yield_preprocessor.joblib`
- `metrics.json`
- `model_metadata.json`
- `preprocessing_metadata.json`
- `artifact_manifest.json`
- `VERSION`
- `LICENSE`
- `README.md`
Limitations
The model is trained on historical tabular data and may not generalize to unseen regions, new farming practices, extreme climate events, or changed measurement methods. Input values should be validated by the serving API before inference.
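A sketch of what such validation might look like in a serving layer, using only the standard library. The numeric bounds are hypothetical placeholders chosen for illustration; they are not taken from the training data and should be replaced with limits appropriate to the deployment.

```python
import math

CATEGORICAL_FIELDS = ("Area", "Item")

# Hypothetical plausibility bounds (inclusive); adjust for real use.
NUMERIC_BOUNDS = {
    "Year": (1900, 2100),
    "average_rain_fall_mm_per_year": (0.0, 20000.0),
    "pesticides_tonnes": (0.0, 10_000_000.0),
    "avg_temp": (-50.0, 60.0),
}


def validate_record(record: dict) -> list[str]:
    """Return a list of validation errors; an empty list means the record passed."""
    errors = []
    for name in CATEGORICAL_FIELDS:
        value = record.get(name)
        if not isinstance(value, str) or not value.strip():
            errors.append(f"{name}: expected a non-empty string")
    for name, (low, high) in NUMERIC_BOUNDS.items():
        value = record.get(name)
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append(f"{name}: expected a number")
        elif math.isnan(value) or not (low <= value <= high):
            errors.append(f"{name}: {value} is outside [{low}, {high}]")
    return errors


# Illustrative records only.
good = {
    "Area": "India", "Item": "Maize", "Year": 2010,
    "average_rain_fall_mm_per_year": 1083.0,
    "pesticides_tonnes": 40000.0, "avg_temp": 25.5,
}
bad = {**good, "Area": "", "avg_temp": 999.0}

print(validate_record(good))  # []
print(validate_record(bad))
```

Range checks of this kind catch malformed requests, but they do not detect inputs that are plausible yet far from the training distribution, such as a region or crop the model never saw.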