
Commit 203d866

Merge pull request #104 from openclimatefix/docs/pvnet-instructions (Docs/pvnet instructions)
2 parents 89cd8e0 + 07569f7

7 files changed: 127 additions & 6 deletions

File tree:
.gitignore
docs/getting_started.md
run.py
src/open_data_pvnet/configs/PVNet_configs/config.yaml
src/open_data_pvnet/configs/PVNet_configs/datamodule/premade_batches.yaml
src/open_data_pvnet/configs/PVNet_configs/datamodule/streamed_batches.yaml
src/open_data_pvnet/configs/PVNet_configs/logger/wandb.yaml

.gitignore

Lines changed: 4 additions & 1 deletion

@@ -102,5 +102,8 @@ data/
 # custom
 config_tree.txt
 GFS_samples/
+GFS_TEST_RUN/
 PLACEHOLDER/
-*.zarr
+*.zarr
+example_configuration.yaml # config file for samples adjusted for local paths
+experiment_config.yaml

docs/getting_started.md

Lines changed: 47 additions & 0 deletions

@@ -20,6 +20,7 @@ Welcome to the Solar Forecasting project! This document will introduce you to th
 15. [How This Project Fits into Renewable Energy](#how-this-project-fits-into-renewable-energy)
 16. [Development and Testing Guide](#development-and-testing-guide)
 17. [Command Line Interface (CLI)](#command-line-interface-cli)
+18. [Running PVNet Model](#running-pvnet-model)
 
 ---
 
@@ -722,6 +723,52 @@ Common error messages and their solutions:
 - "Error loading dataset": Verify your internet connection and credentials
 - "Invalid chunks specification": Ensure chunk string follows the format "dim1:size1,dim2:size2"
 
+
+## Running PVNet Model
+
+1. Update the configuration file
+   Go to src/open_data_pvnet/configs/PVNet_configs/datamodule/streamed_batches.yaml
+
+   Change these values if desired (increase at your discretion):
+   num_train_samples: 5
+   num_val_samples: 5
+
+2. Update src/open_data_pvnet/configs/PVNet_configs/datamodule/premade_batches.yaml
+   Change this line to configuration: <your_directory...open-data-pvnet/src/open_data_pvnet/configs/PVNet_configs/datamodule/configuration/example_configuration.yaml>
+
+3. Update src/open_data_pvnet/configs/PVNet_configs/config.yaml
+   Change the line to - datamodule: premade_batches.yaml
+
+4. Open a Weights & Biases account at https://wandb.ai/
+   Go to src/open_data_pvnet/configs/PVNet_configs/logger/wandb.yaml
+   Change to project: "GFS_TEST_RUN"
+   Change to save_dir: "GFS_TEST_RUN"
+
+5. Run the samples
+   We recommend you save the samples locally for faster processing.
+   In your main open-data-pvnet directory, run the following commands (assumes the AWS CLI is installed locally):
+   aws s3 sync s3://ocf-open-data-pvnet/data/gfs/v4/2023.zarr/ ./gfs_2023.zarr --no-sign-request
+   aws s3 sync s3://ocf-open-data-pvnet/data/uk/pvlive/v2/combined_2023_gsp.zarr ./gsp_2023.zarr --no-sign-request
+   Change the example_configuration.yaml `zarr_path` attributes to the local paths you made above.
+   Comment out both of these lines:
+   `public: True` # If you are going to use the actual s3 buckets then leave this alone; however, it may be really slow
+   In streamed_batches.yaml change this line:
+   `configuration: null` to the actual path of your example_configuration.yaml file
+
+   # If running in a virtual environment, be sure to activate it: `source ./venv/bin/activate`
+   `rm -rf GFS_samples PLACEHOLDER` # to remove previous sample runs
+   `python src/open_data_pvnet/scripts/save_samples.py`
+
+6. Run the training
+   Go to config.yaml and change this line
+   `- datamodule: streamed_batches.yaml` to `- datamodule: premade_batches.yaml`
+   `python run.py`
+
 ---
 
 Thank you for joining us on this journey to advance solar forecasting and renewable energy solutions!
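Step 5 of the new guide has you retarget every `zarr_path` in example_configuration.yaml at the locally synced stores. A minimal sketch of doing that substitution programmatically on a loaded config; the helper name, the config shape, and the prefix-to-path mapping are illustrative assumptions, not part of the repo:

```python
def localize_zarr_paths(node, mapping):
    """Recursively replace any 'zarr_path' value that matches a mapped
    s3 prefix with its local equivalent. Mutates and returns the node."""
    if isinstance(node, dict):
        for key, value in node.items():
            if key == "zarr_path" and isinstance(value, str):
                for s3_prefix, local in mapping.items():
                    if value.startswith(s3_prefix):
                        node[key] = local
            else:
                localize_zarr_paths(value, mapping)
    elif isinstance(node, list):
        for item in node:
            localize_zarr_paths(item, mapping)
    return node


# Illustrative config shape; the real example_configuration.yaml layout may differ.
config = {
    "input_data": {
        "nwp": {"gfs": {"zarr_path": "s3://ocf-open-data-pvnet/data/gfs/v4/2023.zarr"}},
        "gsp": {"zarr_path": "s3://ocf-open-data-pvnet/data/uk/pvlive/v2/combined_2023_gsp.zarr"},
    }
}
mapping = {
    "s3://ocf-open-data-pvnet/data/gfs/v4/2023.zarr": "./gfs_2023.zarr",
    "s3://ocf-open-data-pvnet/data/uk/pvlive/v2/combined_2023_gsp.zarr": "./gsp_2023.zarr",
}
localize_zarr_paths(config, mapping)
print(config["input_data"]["gsp"]["zarr_path"])  # → ./gsp_2023.zarr
```

The same local paths then need to be reflected wherever the configuration is referenced (streamed_batches.yaml's `configuration:` key).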

run.py

Lines changed: 70 additions & 0 deletions

@@ -0,0 +1,70 @@
+"""Run training
+"""
+
+import os
+
+import torch
+
+try:
+    torch.multiprocessing.set_start_method("spawn")
+    import torch.multiprocessing as mp
+
+    mp.set_start_method("spawn")
+except RuntimeError:
+    pass
+
+import logging
+import sys
+
+# Tired of seeing these warnings
+import warnings
+from datetime import datetime
+
+import hydra
+from omegaconf import DictConfig
+from sqlalchemy import exc as sa_exc
+
+warnings.filterwarnings("ignore", category=sa_exc.SAWarning)
+
+logging.basicConfig(stream=sys.stdout, level=logging.ERROR)
+
+os.environ["HYDRA_FULL_ERROR"] = "1"
+
+if "WANDB_RUN_ID" not in os.environ:
+    os.environ["WANDB_RUN_ID"] = datetime.now().strftime("%y%m%d%H%M%S")
+
+
+# this file can be run for example using
+# python run.py experiment=example_simple
+
+
+@hydra.main(
+    config_path="src/open_data_pvnet/configs/PVNet_configs",
+    config_name="config.yaml",
+    version_base="1.2",
+)
+def main(config: DictConfig):
+    """Runs training"""
+    # Imports should be nested inside @hydra.main to optimize tab completion
+    # Read more here: https://github.com/facebookresearch/hydra/issues/934
+    from pvnet.training import train
+    from pvnet.utils import extras, print_config
+
+    # A couple of optional utilities:
+    # - disabling python warnings
+    # - easier access to debug mode
+    # - forcing debug friendly configuration
+    # - forcing multi-gpu friendly configuration
+    # You can safely get rid of this line if you don't want those
+    extras(config)
+
+    # Pretty print config using Rich library
+    if config.get("print_config"):
+        print_config(config, resolve=True)
+
+    # Train model
+    return train(config)
+
+
+if __name__ == "__main__":
+    main()
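The new run.py seeds WANDB_RUN_ID with a timestamp only when the variable is unset, so a caller-supplied id (used to resume a run) survives. The same guard can be exercised against a plain dict; the helper below is an illustrative rephrasing, not a function in the repo:

```python
from datetime import datetime


def default_run_id(env):
    # Mirrors the guard in run.py: set a timestamp id only when absent.
    if "WANDB_RUN_ID" not in env:
        env["WANDB_RUN_ID"] = datetime.now().strftime("%y%m%d%H%M%S")
    return env["WANDB_RUN_ID"]


env = {}
run_id = default_run_id(env)
print(run_id)  # 12-digit yymmddHHMMSS timestamp

# An id supplied by the caller (e.g. to resume a run) is left untouched.
assert default_run_id({"WANDB_RUN_ID": "resume-me"}) == "resume-me"
```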

src/open_data_pvnet/configs/PVNet_configs/config.yaml

Lines changed: 1 addition & 1 deletion

@@ -5,7 +5,7 @@ defaults:
 - _self_
 - trainer: default.yaml
 - model: multimodal.yaml
-- datamodule: streamed_batches.yaml
+- datamodule: streamed_batches.yaml #
 - callbacks: default.yaml # set this to null if you don't want to use callbacks
 # - logger: null
 - logger: wandb.yaml # set logger here or use command line (e.g. `python run.py logger=wandb`)

src/open_data_pvnet/configs/PVNet_configs/datamodule/premade_batches.yaml

Lines changed: 1 addition & 1 deletion

@@ -7,7 +7,7 @@ configuration: null
 # The sample_dir is the location batches were saved to using the save_batches.py script
 # The sample_dir should contain train and val subdirectories with batches
 
-sample_output_dir: "GFS_samples"
+sample_dir: "GFS_samples"
 num_workers: 8
 prefetch_factor: 2
 batch_size: 8

src/open_data_pvnet/configs/PVNet_configs/datamodule/streamed_batches.yaml

Lines changed: 2 additions & 2 deletions

@@ -8,8 +8,8 @@ prefetch_factor: 2
 batch_size: 8
 
 sample_output_dir: "GFS_samples"
-num_train_samples: 1000
-num_val_samples: 1000
+num_train_samples: 5 #1000 Increase at your discretion
+num_val_samples: 5 #1000 Increase at your discretion
 
 train_period:
 - null

src/open_data_pvnet/configs/PVNet_configs/logger/wandb.yaml

Lines changed: 2 additions & 1 deletion

@@ -3,12 +3,13 @@
 wandb:
   _target_: lightning.pytorch.loggers.wandb.WandbLogger
   # wandb project to log to
-  project: "PLACEHOLDER"
+  project: "GFS_TEST_RUN"
   name: "${model_name}"
   # location to store the wandb local logs
   save_dir: "PLACEHOLDER"
   offline: False # set True to store all logs only locally
   id: null # pass correct id to resume experiment!
+  id: "${oc.env:WANDB_RUN_ID}"
   # entity: "" # set to name of your wandb team or just remove it
   log_model: True
   prefix: ""
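Note that this diff adds a second `id` key while keeping `id: null`. PyYAML-style loaders typically keep the last occurrence (strict parsers may reject duplicate keys outright), so under that assumption the effective logger block after this change looks roughly like the following sketch; the resolved values are inferred, not part of the commit:

```yaml
# Effective logger config, assuming a last-key-wins YAML loader.
wandb:
  _target_: lightning.pytorch.loggers.wandb.WandbLogger
  project: "GFS_TEST_RUN"
  save_dir: "PLACEHOLDER"       # still a placeholder; step 4 of the guide sets it to "GFS_TEST_RUN"
  id: "${oc.env:WANDB_RUN_ID}"  # resolved from the env var that run.py seeds with a timestamp
```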
