docs: Add guide for training models for new countries #125

Merged
peterdudfield merged 3 commits into openclimatefix:main from mahendra-918:docs/add-training-guide-new-country on Feb 16, 2026

@mahendra-918
Contributor

Pull Request

Description

This PR adds a guide for training models on new countries, which addresses issue #113. I've created a step-by-step guide that covers all the points mentioned in the issue.

The guide walks through:

  1. Getting generation data - different ways to find and collect solar PV data for your country
  2. Getting NWP data - how to download and process GFS data, including how to crop it for your specific country
  3. Creating configs - I've included a full example config file with all the normalization constants (I noticed the example config had all of them, so I made sure to include everything)
  4. Training the model - instructions for both streaming data and using pre-generated samples
  5. Saving model weights - different options for sharing your trained model
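
As a rough illustration of the cropping mentioned in step 2, here is a minimal sketch in plain Python. It is not taken from the guide: the function names are hypothetical, the bounding box is only an approximate box around France, and it assumes the grid stores longitudes as 0 to 360 degrees, which is common for global NWP grids such as GFS. A real pipeline would more likely select on an xarray dataset.

```python
def to_0_360(lon):
    """Convert a longitude in -180..180 to the 0..360 convention."""
    return lon % 360.0


def crop_indices(lats, lons, lat_min, lat_max, lon_min, lon_max):
    """Return (lat_idx, lon_idx): grid indices that fall inside the box.

    `lons` is assumed to be in 0..360; the box longitudes are in -180..180.
    """
    west, east = to_0_360(lon_min), to_0_360(lon_max)
    lat_idx = [i for i, lat in enumerate(lats) if lat_min <= lat <= lat_max]
    if west <= east:
        lon_idx = [j for j, lon in enumerate(lons) if west <= lon <= east]
    else:
        # The box crosses the 0/360 seam (the prime meridian), so take the
        # western part first to keep the result in west-to-east order.
        lon_idx = [j for j, lon in enumerate(lons) if lon >= west]
        lon_idx += [j for j, lon in enumerate(lons) if lon <= east]
    return lat_idx, lon_idx


# Illustrative 1-degree global grid (real GFS data is finer, e.g. 0.25 degrees).
lats = [90.0 - i for i in range(181)]
lons = [float(j) for j in range(360)]

# Approximate, illustrative box around France.
lat_idx, lon_idx = crop_indices(lats, lons, 41.0, 52.0, -5.0, 10.0)
```

The seam handling matters for any country that spans the prime meridian; for countries entirely east or west of it, the simple range check is enough.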

I also added a section on choosing countries based on the feedback about focusing on countries with large solar installations. I included specific data sources for countries like USA, Netherlands, Belgium, Germany, and France since those were mentioned as good starting points.

Fixes #113

Changes Made

  • Added docs/training_model_new_country.md - the main guide (about 600 lines)
  • Updated README.md - added a link to the new guide in the documentation section

How Has This Been Tested?

  • Yes

I went through the guide and checked:

  • All the code examples against the actual codebase (especially the config structure)
  • File paths match what's actually in the project
  • The normalization constants section includes all channels (not just placeholders)
  • Links to other docs work correctly
  • Compared the structure with the existing getting_started.md to keep things consistent

I also ran the linter to make sure there are no formatting issues.

No data processing changes, so no plotting needed.

Checklist:

  • My code follows OCF's coding style guidelines
  • I have performed a self-review of my own code
  • I have made corresponding changes to the documentation
  • I have added tests that prove my fix is effective or that my feature works
  • I have checked my code and corrected any misspellings

   Create comprehensive step-by-step guide covering
   Add link to guide in README.md
   Resolves openclimatefix#113
@mahendra-918
Contributor Author

Hi @peterdudfield

I've added a guide for training models on new countries (issue #113). Included all the steps and added country selection guidance based on your feedback.

Checked the config examples against the actual codebase to make sure everything matches. Happy to make any changes if needed!

Contributor

@peterdudfield peterdudfield left a comment

This looks really good. @siddharth7113 do you want to check this?

@siddharth7113
Contributor

> This looks really good. @siddharth7113 do you want to check this?

Sure, I'll take a look at this either today or tomorrow. Thanks @mahendra-918 for doing this.

Comment thread docs/training_model_new_country.md Outdated
Contributor

@peterdudfield peterdudfield left a comment

I've just left one change. Would you be able to do that, @mahendra-918? Then we can get this merged.

@mahendra-918
Contributor Author

@peterdudfield Thanks for pointing that out! I've updated the code example to use location_id instead of gsp_id and added the link to ocf-data-sampler as requested.

Comment thread docs/training_model_new_country.md Outdated
Comment thread docs/training_model_new_country.md
```
time_resolution_minutes: 30
```

### 3.2 Calculate Normalization Constants
Contributor

@peterdudfield Do we need to do this for each new country?

Contributor

I think we do. We could probably use global constants instead, but local constants might make the models better; perhaps something to test out.
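
To make the local-versus-global question concrete, here is a minimal sketch of computing per-channel normalization constants (mean and standard deviation) from data cropped to one country. It uses only the Python standard library; the channel names `t2m` and `dswrf`, the sample values, and both function names are illustrative assumptions, not the guide's or ocf-data-sampler's actual code.

```python
from statistics import fmean, pstdev


def normalization_constants(samples):
    """Return {channel: (mean, std)} for a dict of channel -> list of values."""
    constants = {}
    for channel, values in samples.items():
        if not values:
            raise ValueError(f"no values for channel {channel!r}")
        constants[channel] = (fmean(values), pstdev(values))
    return constants


def normalize(value, mean, std):
    """Standardize one value; a zero std (constant channel) maps to 0.0."""
    return 0.0 if std == 0 else (value - mean) / std


# Hypothetical values for two NWP channels cropped to one country.
local_samples = {
    "t2m": [280.0, 285.0, 290.0],
    "dswrf": [0.0, 200.0, 400.0],
}
local_constants = normalization_constants(local_samples)
```

Running the same function over a global sample would give the global constants, so comparing models trained with each set is one way to test which works better.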

Comment thread docs/training_model_new_country.md Outdated
@siddharth7113
Contributor

Hi @mahendra-918, I have left some comments on the PR. Sorry it took so long; after some minor tweaks this should be good to merge.

@mahendra-918
Contributor Author

Thanks @siddharth7113, I've addressed all the feedback in the latest commit:

  • Documentation Cleanup: Removed the untested 'Complete Workflow' section and the manual 'GFS Download' instructions.
  • S3 Access: Updated the GFS section to demonstrate direct access via s3fs instead of downloading.
  • Configuration: Replaced the hardcoded example with a link to example_configuration.yaml.
  • Data Schema: Enforced Zarr format for manual data and added the detailed schema/dimensions requirements.
  • Notes: Added the contact note regarding S3 uploads.

The guide is now strictly focused on the core requirements. Ready for re-review.

@mahendra-918
Contributor Author

Hi @peterdudfield @siddharth7113, just checking in to see if you have any further thoughts on the latest documentation updates. Ready to merge whenever you are!

@peterdudfield
Contributor

Let's get it merged, and we can always add a few more bits here and there.

@peterdudfield peterdudfield merged commit 60d726e into openclimatefix:main Feb 16, 2026
2 checks passed
@mahendra-918
Contributor Author

Thanks for the support and the reviews, @peterdudfield. Happy to have this merged.

Development

Successfully merging this pull request may close these issues.

Create "General steps on how to train a model for a new country" - readme.md

3 participants