Add spatial-temporal metadata embeddings to CropModel #1334
Species distributions vary by location and season. This adds an optional late-fusion pathway that encodes (lat, lon, date) as a small sinusoidal embedding (~1.5% of image features) concatenated with ResNet-50 features before the classifier. Metadata is fully optional: the model gracefully degrades to image-only predictions when metadata is absent.
Key changes:
- SpatialTemporalEncoder with sinusoidal encoding + small MLP
- CropModel.create_model splits into backbone + classifier when use_metadata=True
- MetadataImageFolder wraps ImageFolder with CSV sidecar for training
- BoundingBoxDataset supports per-crop metadata for inference
- predict_tile accepts metadata dict (lat, lon, date)
- Config: use_metadata, metadata_dim, metadata_dropout
- 19 new tests covering encoder, model, datasets, and integration
- Documentation for CropModel docs and configuration reference
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
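To make the "~1.5% of image features" figure concrete, here is a minimal sketch of late fusion by concatenation. It assumes the 2048-wide pooled feature vector that ResNet-50 produces and a metadata embedding of width 32 (the default listed in the PR summary). The `fuse` helper and the zero-vector fallback for missing metadata are hypothetical illustrations, not the code in this PR, which may handle the image-only case differently.

```python
IMAGE_DIM = 2048    # width of ResNet-50 pooled features
METADATA_DIM = 32   # assumed default for metadata_dim


def fuse(image_features, metadata_embedding=None):
    """Concatenate image features with a metadata embedding (late fusion).

    Hypothetical helper: when no metadata is supplied, a zero vector stands
    in so the classifier always sees the same input width.
    """
    if len(image_features) != IMAGE_DIM:
        raise ValueError(f"expected {IMAGE_DIM} image features")
    if metadata_embedding is None:
        metadata_embedding = [0.0] * METADATA_DIM
    if len(metadata_embedding) != METADATA_DIM:
        raise ValueError(f"expected {METADATA_DIM} metadata features")
    return list(image_features) + list(metadata_embedding)


# Image-only call: the metadata slots are filled with zeros.
fused = fuse([0.5] * IMAGE_DIM)
print(len(fused))  # 2080

# Size of the metadata embedding relative to the image features.
share = METADATA_DIM / IMAGE_DIM
print(share)
```

Under these assumptions the metadata block is 32 / 2048 of the image feature width, roughly 1.5%, so the classifier is still dominated by visual features.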
I changed something small; I don't agree with Bugbot.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reapply the CropModel metadata changes on current main so the PR branch can merge cleanly.
Co-authored-by: Cursor <cursoragent@cursor.com>
@jveitchmichaelis can you skim this? There was a lot of Cursor-generated code at the beginning, and I'm starting to take ownership of it, as it seemed effective and well balanced when I tested it in the BOEM repo.
There are implementation pieces I'm not sure about, and then there are scientific things that merit further review. For example, what happens if we see something that is not supposed to occur in a new case? I want to manually change a few test items to see if the metadata overrides the vision model. It's a small portion of the concatenated features, but I will test it. In terms of engineering and our workflows, I am playing with this process of working with the agents to prototype something, checking it out on a different project, and then assuming ownership of the PR for DeepForest. That hand-off is probably awkward. I want to see which things strike you as needing developer time. @ethanwhite, just keeping you abreast of this as a trial pattern. I'm taking it out of WIP, but not triggering review yet?
Co-authored-by: Cursor <cursoragent@cursor.com>

Summary
Key changes
- SpatialTemporalEncoder: Sinusoidal positional encoding + MLP (6 → configurable dim, default 32)
- CropModel: Splits into backbone + metadata encoder + classifier when use_metadata=True
- MetadataImageFolder: Wraps ImageFolder with a CSV sidecar (filename,lat,lon,date) for training
- BoundingBoxDataset: Supports per-crop metadata dict for inference
- predict_tile: Accepts metadata={"lat": ..., "lon": ..., "date": "YYYY-MM-DD"}
- Config: use_metadata, metadata_dim, metadata_dropout
Test plan
- pytest tests/test_metadata_cropmodel.py
- pytest tests/test_crop_model.py
🤖 Generated with Claude Code
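As a rough illustration of the sinusoidal encoding step, the sketch below turns (lat, lon, date) into six cyclic features. The layout (sine and cosine of latitude, longitude, and day of year) is an assumption chosen to match the "6 → configurable dim" note; the PR's SpatialTemporalEncoder may order or scale its inputs differently, and `encode_metadata` is a hypothetical name.

```python
import math
from datetime import date


def encode_metadata(lat, lon, date_str):
    """Return six cyclic features for a (lat, lon, date) triple.

    Assumed layout: sin/cos of latitude, sin/cos of longitude,
    sin/cos of day of year. Dates are ISO strings (YYYY-MM-DD).
    """
    day_of_year = date.fromisoformat(date_str).timetuple().tm_yday
    lat_rad = math.radians(lat)
    lon_rad = math.radians(lon)
    # Map the day of year onto a circle so late December sits next to early January.
    season = 2.0 * math.pi * day_of_year / 365.25
    return [
        math.sin(lat_rad), math.cos(lat_rad),
        math.sin(lon_rad), math.cos(lon_rad),
        math.sin(season), math.cos(season),
    ]


features = encode_metadata(29.65, -82.32, "2024-06-15")
print(len(features))  # 6
```

In a design like the one described, a small MLP would then project these six values up to metadata_dim before concatenation with the image features. Encoding each quantity as a sine/cosine pair keeps every input in [-1, 1] and avoids a discontinuity at the year boundary or at ±180° longitude.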
Note
Medium Risk
Touches core CropModel training/inference data flow and changes batch/forward signatures to support metadata; defaults remain unchanged, but incorrect metadata wiring or shape handling could break crop-model prediction/training in edge cases.
Overview
Adds optional spatial-temporal metadata (lat/lon/date) support to CropModel, including a new SpatialTemporalEncoder and a backbone+metadata+classifier path when cropmodel.use_metadata is enabled, while keeping image-only behavior as the default.
Training can now load per-image metadata from a sidecar CSV via CropModel.load_from_disk(..., metadata_csv=...), and inference can pass image-level metadata through deepforest.predict_tile(..., metadata=...), which is expanded to per-crop tensors and threaded through predict/BoundingBoxDataset.
Updates config/schema/docs to expose use_metadata, metadata_dim, and metadata_dropout, and adds a dedicated test suite covering encoder behavior, training/predict steps, dataset wrappers, and checkpoint save/load.
Written by Cursor Bugbot for commit c763c17.
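The two data paths described in the overview can be sketched with the standard library alone. The example below reads a sidecar CSV with the filename,lat,lon,date columns named in the PR summary and repeats one image-level metadata dict once per detected crop. Both helper names (`load_sidecar`, `expand_to_crops`) and the sample values are hypothetical; the actual MetadataImageFolder and predict_tile code may differ, for instance by producing tensors rather than dicts.

```python
import csv
import io


def load_sidecar(csv_file):
    """Map filename -> (lat, lon, date) from a sidecar CSV with a header row."""
    reader = csv.DictReader(csv_file)
    return {
        row["filename"]: (float(row["lat"]), float(row["lon"]), row["date"])
        for row in reader
    }


def expand_to_crops(metadata, num_crops):
    """Repeat one image-level metadata dict once per detected crop."""
    return [dict(metadata) for _ in range(num_crops)]


# Training side: an in-memory stand-in for a sidecar CSV on disk.
sidecar = io.StringIO(
    "filename,lat,lon,date\n"
    "crop_001.png,29.65,-82.32,2024-06-15\n"
)
lookup = load_sidecar(sidecar)

# Inference side: every crop from one tile shares the tile's metadata.
tile_metadata = {"lat": 29.65, "lon": -82.32, "date": "2024-06-15"}
per_crop = expand_to_crops(tile_metadata, num_crops=3)
print(len(per_crop))  # 3
```

Copying the dict per crop, rather than reusing one object, keeps later per-crop edits from leaking across crops.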