feat: Add script to fetch US solar data from EIA (Issue #109) #127
Hi @peterdudfield, @jcamier, @siddharth7113, I’ve implemented the EIA data collection script for Issue #109.
siddharth7113
left a comment
EIA returns both daily and hourly data. Could you modify the script to fetch the hourly data, or let the caller specify which frequency to use? EIA also reports data per region as well as for US-48. For our use case I would recommend fetching only the US-48 data and not the other regions; otherwise this could lead to duplicated data.
end_date: End date string
data_cols: List of data columns to retrieve
facets: Dictionary of facets to filter by
offset: Pagination offset
What are offset and pagination here? Why do we need them?
We need them for large datasets because the API paginates its responses. offset allows us to request subsequent "pages" of data when the total number of records exceeds the API's single-request limit (usually 5,000).
Shouldn't we check that when we hit the API, and then pull more data if we need to?
Great point. I'll update the script to automatically handle pagination so it fetches all available data for the requested period without needing manual offset management.
Done! I updated `get_data` to automatically loop and fetch all available pages until the API returns fewer records than the requested length. This way, users don't need to manually manage offsets. I also added a `test_get_data_pagination` case to verify it.
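For readers following the thread, the loop described above can be sketched roughly as follows. This is a minimal illustration and not the PR's actual `get_data`: `fetch_all_pages`, `fetch_page`, and `fake_fetch_page` are hypothetical names, and the real API call is replaced by a stand-in that slices a small in-memory list.

```python
from typing import Callable


def fetch_all_pages(
    fetch_page: Callable[[int, int], list[dict]],
    length: int = 5000,
) -> list[dict]:
    """Collect records page by page until a short page signals the end."""
    records: list[dict] = []
    offset = 0
    while True:
        page = fetch_page(offset, length)
        records.extend(page)
        # A page shorter than the requested length means there is no more data.
        if len(page) < length:
            break
        offset += length
    return records


# Stand-in for the real API call (assumption): serves slices of 12 fake rows.
fake_rows = [{"value": i} for i in range(12)]
calls: list[int] = []


def fake_fetch_page(offset: int, length: int) -> list[dict]:
    calls.append(offset)  # record which offsets were requested
    return fake_rows[offset : offset + length]


all_rows = fetch_all_pages(fake_fetch_page, length=5)
```

With a page size of 5 and 12 rows, the stand-in is called three times (offsets 0, 5, and 10). One consequence of stopping on a short page is that a dataset whose size is an exact multiple of the page size costs one extra request, which returns an empty page.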
Thanks @siddharth7113 for the feedback! I've addressed the points:
Thanks @mahendra-918, you happy this is merged?
Thanks @peterdudfield. Yes, I’m happy with the changes. Everything is ready from my side for the merge.
Thanks @mahendra-918 for all this. @all-contributors please add @mahendra-918 for code
I've put up a pull request to add @mahendra-918! 🎉
Thanks for the support and the reviews, @peterdudfield. Happy to have this merged.

Description
This PR adds a new script to fetch solar generation data from the US Energy Information Administration (EIA) Open Data API v2. This is the foundational step required to extend PVNet models to the United States (supporting [META] Issue #103).
The new EIAData class allows users to fetch hourly electricity generation data by fuel type (e.g., Solar) for major US grid operators (RTOs).
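As a rough illustration of what such a request might carry, the sketch below builds the query parameters for an hourly, solar-only, US-48 query. It is not the PR's `EIAData` implementation: `build_eia_params` is a hypothetical helper, and the route, the facet names, and the codes `SUN` and `US48` are assumptions that should be checked against the EIA API v2 documentation.

```python
# Assumed route for hourly generation by fuel type; verify against the EIA docs.
EIA_FUEL_TYPE_URL = "https://api.eia.gov/v2/electricity/rto/fuel-type-data/data/"


def build_eia_params(
    api_key: str,
    start: str,
    end: str,
    respondent: str = "US48",  # assumed code for the Lower 48 aggregate
    fuel_type: str = "SUN",    # assumed code for solar
    offset: int = 0,
    length: int = 5000,
) -> dict:
    """Build query parameters for one page of hourly generation data."""
    return {
        "api_key": api_key,
        "frequency": "hourly",
        "data[0]": "value",
        # Restricting to a single respondent avoids mixing regional and
        # national rows, which would otherwise duplicate generation.
        "facets[respondent][]": respondent,
        "facets[fueltype][]": fuel_type,
        "start": start,
        "end": end,
        "offset": offset,
        "length": length,
    }


params = build_eia_params("YOUR_API_KEY", "2024-01-01T00", "2024-01-02T00")
```

The resulting dictionary could then be passed as the `params` argument of an HTTP GET against `EIA_FUEL_TYPE_URL`.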
Fixes #109
How Has This Been Tested?
I have tested this change in two ways:
unittest.mock to verify the API request logic, parameter formatting, and error handling without requiring a real API connection.
If your changes affect data processing, have you plotted any changes? i.e. have you done a quick sanity check?
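A mocked test of the kind mentioned above might look like the following sketch. It does not reproduce the PR's test suite: `fetch_records` is a hypothetical helper, the URL is a placeholder, and the `response` / `data` payload shape is an assumption about the API's JSON envelope.

```python
from unittest.mock import Mock


def fetch_records(session, url: str, params: dict) -> list[dict]:
    """Hypothetical helper: GET the URL and return the rows in the payload."""
    response = session.get(url, params=params, timeout=30)
    response.raise_for_status()  # surface HTTP errors to the caller
    return response.json()["response"]["data"]


# Fake session whose .get() returns a canned payload, so no network is needed.
fake_response = Mock()
fake_response.json.return_value = {
    "response": {"data": [{"period": "2024-01-01T00", "value": 0}]}
}
fake_session = Mock()
fake_session.get.return_value = fake_response

rows = fetch_records(fake_session, "https://example.invalid/data", {"frequency": "hourly"})

# Verify the request was formed as expected and errors were checked.
fake_session.get.assert_called_once_with(
    "https://example.invalid/data", params={"frequency": "hourly"}, timeout=30
)
fake_response.raise_for_status.assert_called_once()
```

Passing the session in as an argument keeps the helper easy to test, since a `Mock` can stand in for a real HTTP session without patching any module.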
Checklist: