Channel Measurements and more #865
Conversation
… we still want to convert those types.
…take these functions out.
jzemmels left a comment
These changes look great! I'm excited about the new metadata endpoints and continuous vignette.
I have a few thoughts but didn't run into any major issues, so I'll approve the PR.
```r
#'
#' groundwater <- read_waterdata_field_meta(monitoring_location_id = "USGS-375907091432201")
#'
#' gwl_data <- read_waterdata_field_meta(monitoring_location_id = "USGS-02238500",
```
Does gwl stand for groundwater level? If yes, I don't think the data pulled here are groundwater.
```r
httr2::req_url_path_append("samples-data") |>
  httr2::req_url_query(mimeType = "text/csv")

token <- Sys.getenv("API_USGS_PAT")
```
Does samples-data use API tokens too?
What we're just learning is YES! I was working with a developer, and it seems like the API token can be used throughout the api.gov universe. It has different limits than the OGC APIs, but it's still useful.
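For reference, a minimal sketch of how a token could be attached to a samples-data request with `httr2`. The `X-Api-Key` header name and the base URL here are assumptions for illustration (based on general api.data.gov conventions), not confirmed from this PR:

```r
library(httr2)

# Build (but don't perform) a samples-data request; attach the api.gov
# token only when API_USGS_PAT is set. "X-Api-Key" is an assumed header
# name, not confirmed from this PR.
req <- request("https://api.waterdata.usgs.gov") |>
  req_url_path_append("samples-data") |>
  req_url_query(mimeType = "text/csv")

token <- Sys.getenv("API_USGS_PAT")
if (nzchar(token)) {
  req <- req_headers(req, `X-Api-Key` = token)
}
```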
```r
httr2::req_headers(`Accept-Encoding` = "gzip") |>
  httr2::req_timeout(seconds = 180) |>
  httr2::req_url_path_append(paste0("v", version)) |>
  httr2::req_url_path_append(getOption("dataRetrieval.api_version_stat")) |>
  httr2::req_url_path_append(paste0("observation", service))

token <- Sys.getenv("API_USGS_PAT")
```
Is this to future-proof these functions if they eventually use API request limits? Would this header field be ignored currently?
The idea is: there will probably be a time when both v0 and v1 exist (or more). Right now the user can't pass a version into the top-level functions, so they couldn't compare the different versions. BUT, by putting it in the package options, there is a mechanism to change the version (without needing a "version" argument passed around all over the place). So if v1 came out tomorrow, users could theoretically run:

```r
options("dataRetrieval.api_version_stat" = "v1")
x1 <- read_waterdata_stats_por(
  monitoring_location_id = c("USGS-02319394", "USGS-02171500")
)
```

Requesting:

```
https://api.waterdata.usgs.gov/statistics/v1/observationNormals?monitoring_location_id=USGS-02319394&monitoring_location_id=USGS-02171500
```

Since there is no v1, that results in:

```
Error in `req_perform()`:
! HTTP 404 Not Found.
```

So set it back and it works again:

```r
options("dataRetrieval.api_version_stat" = "v0")
x1 <- read_waterdata_stats_por(
  monitoring_location_id = c("USGS-02319394", "USGS-02171500")
)
```

Since updates to versions are usually pretty infrequent, and most users won't know/care 99% of the time, this seems like a great place for options.
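The fallback behavior described above can be sketched with a small hypothetical helper. `stat_api_version()` is not a real dataRetrieval function, just an illustration of the `getOption()` pattern, and "v0" matches the current default named in this thread:

```r
# Read the statistics API version from options, falling back to "v0".
stat_api_version <- function() {
  getOption("dataRetrieval.api_version_stat", default = "v0")
}

stat_api_version()   # "v0" when the option is unset
options("dataRetrieval.api_version_stat" = "v1")
stat_api_version()   # "v1" after overriding
options("dataRetrieval.api_version_stat" = NULL)  # restore the default
```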
"If they eventually use API request limits": what I was recently told is that all api.gov endpoints do use a form of per-token request limiting, so this should be good to go. It's just that in the OGC APIs the limits are low enough that users need a token much sooner than with the other services (where the request limits are set by the api.gov rules).
There is an increasing amount of continuous data available from the USGS. Continuous data are collected via automated sensors installed at a monitoring location. They are collected at a high frequency and often at a fixed 15-minute interval. Depending on the specific monitoring location, the data may be transmitted automatically via telemetry and be available on WDFN within minutes of collection, while other times the delivery of data may be delayed if the monitoring location does not have the capacity to automatically transmit data. Continuous data are described by parameter name and parameter code (pcode). These data might also be referred to as "instantaneous values" or "IV".
Is uv ever used anymore for continuous data?
Not to my knowledge. This text comes directly from the continuous schema:

```r
dataRetrieval:::get_description("continuous")
```
That's all fine and good if everything works perfectly. What if something goes wrong in the middle of the pull? You could put some `tryCatch` statements in the above code, post-process out what was missed, and re-request the missing data...OR...consider using a `targets` pipeline to take care of all of that!
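To make the `targets` suggestion concrete, here is a rough `_targets.R` sketch, not the vignette's actual pipeline. `get_one_site()` is a hypothetical wrapper around whatever dataRetrieval call is being made; the `error = "null"` plus dynamic-branching combination is what lets `tar_make()` re-run only the sites that failed:

```r
# _targets.R (sketch under assumptions; get_one_site() is made up)
library(targets)

# Hypothetical per-site download wrapper; swap in the real call, e.g.
# dataRetrieval::read_waterdata_daily(monitoring_location_id = site).
get_one_site <- function(site) {
  site
}

list(
  tar_target(sites, c("USGS-02319394", "USGS-02171500")),
  # One branch per site; a failed branch stores NULL instead of
  # stopping the pipeline, and only failed/changed branches re-run.
  tar_target(raw, get_one_site(sites), pattern = map(sites), error = "null"),
  tar_target(all_data, raw)
)
```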
```r
library(future.apply)
plan(multisession)
```
I think Macs can use multicore also due to their Unix-based architecture, which can be more efficient than multisession. Might be worth mentioning here, or linking to the future/related documentation.
So, like this?

```r
library(future.apply)
plan(multicore)
```

Or something else? I know when I was trying to write parallel docs for EGRET there were some hiccups in getting the Macs to work right (mostly because I don't have access).
Yep, that is what a Mac user could run. I don't know if it's absolutely necessary to address here, but you might link to the future documentation, similar to how you've linked to the targets documentation.
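If it helps, one hedged way to handle this in the vignette is to let the `parallelly` helper (a dependency of `future`) decide whether forked workers are safe, rather than hard-coding an OS check:

```r
library(future)

# supportsMulticore() is TRUE on Unix-alikes (macOS, Linux) outside
# environments like RStudio, where forked multicore workers are safe
# and cheaper than multisession's background R sessions.
if (parallelly::supportsMulticore()) {
  plan(multicore)
} else {
  plan(multisession)
}
```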
```r
tar_load(all_data)
```
## Run in parallel
There was a problem hiding this comment.
Great addition! This will come in handy for lots of technical users.
As mentioned above, if you are running on a fairly standard laptop, feel free to make requests in parallel. However, please don't run queries in parallel on a supercomputer or HPC-type environment; your requests will be stopped/killed. There may be techniques to avoid overwhelming the system; contact comptools@usgs.gov if you need help figuring that out.
This maybe doesn't require clarification here, but can the API automatically detect whether a request originates from an HPC environment? Or is it just assumed that there would be too many simultaneous requests originating from the associated IP address?
I'm honestly not sure what the fix would be here. Maybe adding a randomized Sys.sleep() after each call?
I think that's probably the idea (a randomized Sys.sleep). I'm guessing they've got some automated way to detect if an IP or API token is making >X simultaneous requests and kill it (the text in the vignette comes directly from the API developers).
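A rough sketch of the randomized-delay idea. `polite_pull()` is a made-up helper name, and the 0.5-2 second bounds are arbitrary, not a documented api.gov requirement:

```r
# Pull sites one at a time with a jittered pause between requests,
# so a loop doesn't look like a burst of simultaneous calls.
polite_pull <- function(sites, pull_fun) {
  results <- vector("list", length(sites))
  names(results) <- sites
  for (site in sites) {
    results[[site]] <- pull_fun(site)
    Sys.sleep(runif(1, min = 0.5, max = 2))  # randomized delay
  }
  results
}
```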
vignettes/Status.Rmd (Outdated)
```r
"readNWISpCode",
"readNWISgwl",
"readNWISmeas",
"readNWISgwl (deprecated)",
```
Should these still be exported in the NAMESPACE if the associated services are completely offline?
They aren't in the NAMESPACE anymore. I'm switching it to "defunct" to match the Python docs.
Okay, I must have messed up loading the package locally, because I still saw it in the tab-complete. I can confirm they're gone.
Forgot to add that I ran each of the new functions a couple of times to verify the output. I also re-ran the unit tests locally and everything passed.
This PR adds:
and read_waterdata_channel