feat(wateruse): add water-use module for the NWDC API by thodson-usgs · Pull Request #328 · DOI-USGS/dataretrieval-python

thodson-usgs · 2026-06-22T19:38:13Z

Summary

Adds a dataretrieval.wateruse module for retrieving USGS National Water
Availability Assessment Data Companion (NWDC) water-use estimates from
https://api.water.usgs.gov/nwaa-data/data. Estimates are modeled on a HUC12
grid and queryable by county, state, or hydrologic unit. This is the modern
replacement for the defunct legacy NWIS water-use service, so
nwis.get_water_use now points callers here.

It covers the same data as the R dataRetrieval::read_waterdata_use_data
getter, but is written to the Python package's conventions rather than ported
from the R structure.

from dataretrieval import wateruse

df, md = wateruse.get_wateruse(
    model="wu-public-supply-wd",
    variable=["pswdtot", "pswdgw", "pswdsw"],
    state="RI",
    start_date="2020-01",
    time_resolution="monthly",
)

Design notes

The NWDC is a plain CSV REST service, not an OGC API Features collection —
it has no /collections or /conformance, and its error envelope is
{"detail": ...} rather than the OGC engine's {code, description}. So it does
not use the high-level OGC path (get_ogc_data, the CQL2 byte-chunker, the
GeoJSON pager). It does reuse the engine's generic transport plumbing,
supplying only NWDC-specific strategies, and stays consistent with the package
where the shared pieces fit:

- Returns the conventional (DataFrame, BaseMetadata) tuple.
- Reuses utils._default_headers(), so the documented API_USGS_PAT token
  raises the NWDC rate limit just as it does for the OGC getters.
- Raises through the shared typed DataRetrievalError taxonomy (via
  utils._raise_for_status with an injected detail extractor), surfacing the
  NWDC detail (e.g. "Invalid model name: ...") in the message.
- Locations are idiomatic state / county / huc selectors (mirroring
  ngwmn / waterdata), each accepting a single value or a list. Since NWDC
  takes one location per request, a multi-value selector fans out: one
  request per location, run concurrently over a shared client.
- Date / resolution params are idiomatic snake_case (start_date, end_date,
  time_resolution), mapped to the NWDC wire names internally.
- Multi-valued variable is comma-joined into a single GET.
- Pagination is real and handled transparently. Large areas paginate with
  an RFC 8288 Link: <...>; rel="next" header (a huc2 → 7 pages, a populous
  state → 4; small queries → a single page). wateruse drives the engine's
  generic _paginate with NWDC parse / cursor / error strategies and
  concatenates the pages.
- huc12_id is parsed as a string so leading zeros survive.
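
The pagination and string-parsing notes above can be illustrated with a minimal, standard-library-only sketch. It is not the module's implementation: `next_link`, `iter_pages`, the `fetch` callable, the `example.test` URLs, and the sample rows are all hypothetical stand-ins, and the real code drives the engine's `_paginate` instead. Note that the Link header is not split on commas, because a target URL may itself carry a comma-joined `variable` list.

```python
import csv
import io
import re
from typing import Callable, Iterator, Optional, Tuple

# One RFC 8288 link-value: <target> followed by zero or more ";param" pieces.
# The target is matched inside the angle brackets, so commas in the URL are safe.
_LINK_VALUE = re.compile(r'<([^>]*)>((?:\s*;\s*[^;,]+)*)')
_REL_PARAM = re.compile(r'\brel\s*=\s*"?([^";]*)"?', re.IGNORECASE)

Page = Tuple[str, Optional[str]]  # (CSV body, raw Link header or None)


def next_link(link_header: Optional[str]) -> Optional[str]:
    """Return the target of the rel="next" link, or None if there is none."""
    if not link_header:
        return None
    for match in _LINK_VALUE.finditer(link_header):
        target, params = match.group(1), match.group(2)
        rel = _REL_PARAM.search(params)
        # rel may hold several space-separated relation types.
        if rel and "next" in rel.group(1).lower().split():
            return target
    return None


def iter_pages(
    first_url: str, fetch: Callable[[str], Page], max_pages: int = 1000
) -> Iterator[str]:
    """Yield each page body, following rel="next" until it disappears."""
    url = first_url
    for _ in range(max_pages):
        body, link = fetch(url)
        yield body
        following = next_link(link)
        if following is None:
            return
        url = following
    raise RuntimeError("pagination did not terminate within max_pages")


# Hypothetical two-page response set; the values are made up for illustration.
FAKE_PAGES = {
    "https://example.test/data?page=1": (
        "huc12_id,value\n010900020101,1.5\n",
        '<https://example.test/data?page=2>; rel="next"',
    ),
    "https://example.test/data?page=2": (
        "huc12_id,value\n010900020102,2.5\n",
        None,
    ),
}

bodies = list(iter_pages("https://example.test/data?page=1", FAKE_PAGES.__getitem__))

# csv.DictReader keeps every field as a string, so the leading zero survives.
rows = [row for body in bodies for row in csv.DictReader(io.StringIO(body))]
print(rows[0]["huc12_id"])
```

With pandas, the equivalent of the last step is passing `dtype={"huc12_id": str}` to `pandas.read_csv`; without it the column would be inferred as an integer and the leading zero dropped.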

Engine refactor

Building wateruse surfaced that it could reuse the OGC engine's transport
instead of re-implementing it — and extracting the reusable seams also
de-duplicated the engine itself. Net source ≈ −66 LOC, behavior-preserving:

- planning._merge_response: one low-level "fold N responses into one"
  behind both pagination (_paginate) and the chunked / fan-out aggregation
  (_combine_chunk_responses), replacing two near-duplicate implementations.
- utils.Ambient[T]: a small generic ContextVar-with-scope class that
  collapses each per-call ambient (_row_cap, _ogc_base_url, _dialect, the
  chunker's _chunked_client) from a var + hand-written @contextmanager
  setter pair into a single declaration.
- Rate-limit correctness fix: x-ratelimit-remaining now reports the
  lowest value any concurrent sub-request saw (the quota actually left after
  a fan-out) via a shared _lowest_remaining, instead of the last-by-index,
  fixing a latent inaccuracy in the OGC chunker too.
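
For readers unfamiliar with the ContextVar-with-scope pattern, here is a minimal sketch of what a generic ambient helper of this kind might look like. The class body, the `get` / `scope` method names, and the `row_cap` declaration are assumptions made for illustration; the actual `utils.Ambient[T]` interface is not shown in this PR description and may differ.

```python
from contextlib import contextmanager
from contextvars import ContextVar
from typing import Generic, Iterator, Optional, TypeVar

T = TypeVar("T")


class Ambient(Generic[T]):
    """A ContextVar plus a scoped setter, declared in one line per ambient."""

    def __init__(self, name: str, default: T) -> None:
        self._var: ContextVar[T] = ContextVar(name, default=default)

    def get(self) -> T:
        """Return the value visible in the current context."""
        return self._var.get()

    @contextmanager
    def scope(self, value: T) -> Iterator[T]:
        """Set the value for the duration of a with-block, then restore it."""
        token = self._var.set(value)
        try:
            yield value
        finally:
            # reset() restores whatever was visible before this scope,
            # so nested scopes unwind correctly even on exceptions.
            self._var.reset(token)


# Hypothetical declaration replacing a "var + @contextmanager setter" pair.
row_cap: Ambient[Optional[int]] = Ambient("row_cap", None)

with row_cap.scope(500):
    inside = row_cap.get()       # value seen by code running in the scope
    with row_cap.scope(10):
        nested = row_cap.get()   # innermost scope wins
    restored = row_cap.get()     # back to the outer scope's value
outside = row_cap.get()          # back to the declared default
```

Because the state lives in a `ContextVar`, each thread or asyncio task sees its own value, which is what makes a per-call ambient safe alongside concurrent sub-requests.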

What's included

- dataretrieval/wateruse.py, wired into dataretrieval/__init__.py.
- The engine refactor across ogc/{engine,planning,chunking}.py, utils.py,
  and waterdata/utils.py.
- tests/wateruse_test.py: offline pytest-httpx coverage of single-page parse,
  string huc12_id, comma-joined variables, dropped-None params, snake_case →
  wire-name mapping, Link-header pagination, bare-host normalization,
  shared-header reuse, state/county/huc selectors + fan-out, and
  typed-error / detail handling; plus updates to tests/waterdata_* for the
  engine changes.
- docs/source/reference/wateruse.rst + toctree entry.
- README.md usage example and "Available Data Services" entry.
- demos/USGS_WaterUse_Examples.ipynb: a motivating walkthrough (where
  Wisconsin's public water supply comes from, and its summer demand peak).

Verification

- Offline suites pass: wateruse plus the OGC engine / chunking / utils suites
  the refactor touches; ruff check / ruff format / mypy --strict clean.
- Smoke-tested against the live API: single- and multi-page queries, monthly
  and annual resolutions, paginated results byte-identical to the unpaginated
  equivalent, concurrent fan-out over multiple states, and the lowest-remaining
  rate-limit header confirmed.
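
To make the fan-out and lowest-remaining behavior checked above concrete, the following is a rough sketch under stated assumptions. `fan_out`, `lowest_remaining`, the fake `fetch`, and the canned header values are hypothetical; a thread pool stands in for the package's actual concurrency, and plain lower-cased dicts stand in for real (case-insensitive) HTTP headers.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Mapping, Optional, Sequence, Tuple

Headers = Mapping[str, str]
Response = Tuple[str, Headers]  # (body, response headers)


def lowest_remaining(header_sets: Sequence[Headers]) -> Optional[int]:
    """Smallest x-ratelimit-remaining seen across all sub-responses."""
    values: List[int] = []
    for headers in header_sets:
        raw = headers.get("x-ratelimit-remaining")
        if raw is not None and raw.strip().isdigit():
            values.append(int(raw))
    return min(values) if values else None


def fan_out(
    locations: Sequence[str], fetch: Callable[[str], Response]
) -> Tuple[List[str], Optional[int]]:
    """Issue one request per location concurrently; keep input order."""
    workers = max(1, min(8, len(locations)))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        responses = list(pool.map(fetch, locations))  # map preserves order
    bodies = [body for body, _ in responses]
    return bodies, lowest_remaining([headers for _, headers in responses])


# Canned responses: requests can complete in any order, so the last location
# by index does not necessarily carry the lowest remaining quota.
CANNED = {
    "RI": ("ri-rows", {"x-ratelimit-remaining": "97"}),
    "CT": ("ct-rows", {"x-ratelimit-remaining": "95"}),
    "MA": ("ma-rows", {"x-ratelimit-remaining": "96"}),
}

bodies, remaining = fan_out(["RI", "CT", "MA"], CANNED.__getitem__)
last_by_index = int(CANNED["MA"][1]["x-ratelimit-remaining"])
```

In this made-up case the last response by index reports 96 while the lowest value seen is 95, and 95 is the figure that reflects the quota actually left after the fan-out.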

🤖 Generated with Claude Code

Add `dataretrieval.wateruse` for USGS National Water Availability Assessment Data Companion (NWDC) water-use estimates, modeled on a HUC12 grid and queryable by state, county, or hydrologic unit. This is the modern replacement for the defunct legacy NWIS water-use service (`nwis.get_water_use` now points callers here).

    from dataretrieval import wateruse

    df, md = wateruse.get_wateruse(
        model="wu-public-supply-wd",
        variable=["pswdtot", "pswdgw", "pswdsw"],
        state="RI",
        start_date="2020-01",
        time_resolution="monthly",
    )

The NWDC is a plain CSV REST service, not an OGC API Features collection, so the module supplies the NWDC-specific pieces (CSV parsing, the RFC 8288 Link-header pagination cursor, the `{detail}` error envelope, and state/county/huc location builders) but reuses the OGC engine's generic transport rather than re-implementing it: the shared pager (`_paginate`), the Jupyter-safe anyio sync bridge (`_run_sync`), response/frame aggregation, and `_default_headers`.

It keeps the package conventions where they fit: a `(DataFrame, BaseMetadata)` return, the typed `DataRetrievalError` taxonomy (surfacing the NWDC `detail`), `API_USGS_PAT` token support, idiomatic snake_case params, and `state` / `county` / `huc` selectors that each accept a value or a list (a list fans out one concurrent request per location). Large areas paginate transparently. A `FutureWarning` flags the module as experimental, since the NWDC service is new and still changing.

Extracting the reusable engine seams also de-duplicated the engine itself (~-66 LOC, behavior-preserving): `planning._merge_response` now backs both pagination and fan-out aggregation; a generic `utils.Ambient[T]` contextvar-with-scope helper collapses the per-call ambients; and `x-ratelimit-remaining` now reports the lowest value any concurrent sub-request saw (the quota actually left after a fan-out), fixing a latent inaccuracy in the OGC chunker too.

Includes offline pytest-httpx coverage, a reference page, a README example, and a demo notebook.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01Sjb14HkwuCydKSKMsaXsgd

thodson-usgs force-pushed the feat/wateruse branch from 8841d52 to 2d55d48 Compare June 22, 2026 19:47

thodson-usgs changed the title ~~feat(wateruse): add water-use module wrapping the NWDC API~~ feat(wateruse): add water-use module for the NWDC API Jun 22, 2026

thodson-usgs mentioned this pull request Jun 23, 2026

explore: reuse + unify the OGC engine (pager, aggregation, ambient state) — net −64 LOC thodson-usgs/dataretrieval-python#4

Merged

thodson-usgs force-pushed the feat/wateruse branch 4 times, most recently from 19105d8 to 0f20ada Compare June 24, 2026 21:00

thodson-usgs force-pushed the feat/wateruse branch from 0f20ada to d9e8c7e Compare June 24, 2026 21:03

thodson-usgs marked this pull request as ready for review June 24, 2026 21:10

thodson-usgs merged commit 4daf771 into DOI-USGS:main Jun 24, 2026
9 checks passed

thodson-usgs deleted the feat/wateruse branch June 24, 2026 21:10

thodson-usgs merged 1 commit into DOI-USGS:main from thodson-usgs:feat/wateruse
