We have been talking for a long time about a plugin API that lets earthaccess use services like NASA Harmony (see #328). Today at the bi-weekly meeting, engineers from the transformation train joined to talk about what this might look like, and we agreed to start working on an API contract that defines what a plugin will need from earthaccess and what earthaccess should expect from such plugins.
A good way of generalizing would be to search in earthaccess and then pass the list of results to a plugin. The plugin processes those results and returns a job-id that earthaccess can use to query the state of the request. When the job is done, earthaccess does what it already does: download those results, or open them to stream the data into xarray.
Some details still need to be decided. When we search we get a list of UMM records from CMR, and each of these result items is a big JSON dictionary. Are plugins going to use any of this metadata, or can we pass just the concept-id or the links to the data files listed under "GET DATA"? Currently, Harmony supports a list of granule concept-ids, so we could start there.
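To make the "pass only the concept-id" option concrete, here is a minimal sketch of reducing a list of search results to granule concept-ids. It assumes each result is a dictionary in the CMR UMM-JSON layout, with the identifier stored under `meta` and `concept-id`. The `concept_ids` helper and the sample records are hypothetical and are not part of earthaccess.

```python
from typing import Any, Mapping, Sequence


def concept_ids(results: Sequence[Mapping[str, Any]]) -> list[str]:
    """Collect granule concept-ids from UMM-JSON style search results.

    Assumption: each item stores its identifier at item["meta"]["concept-id"].
    """
    return [item["meta"]["concept-id"] for item in results]


# Hand-written stand-ins for two search results (not real CMR records).
fake_results = [
    {"meta": {"concept-id": "G0000000001-EXAMPLE"}, "umm": {}},
    {"meta": {"concept-id": "G0000000002-EXAMPLE"}, "umm": {}},
]

print(concept_ids(fake_results))
```

If plugins turn out to need more than the identifier, the same helper boundary is where the full UMM record could be passed through instead.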
These services are usually async, as is the case for Harmony, so earthaccess will have to implement a job class to track status and completion. I think we are in good shape to implement the following:
import earthaccess as ea
import xarray as xr

ea.login()
results = ea.search_data(**params)
# Internally the API will use only the concept-ids and pass the kwargs through.
job_id = ea.services.harmony.request(results, **kwargs)
status = ea.services.harmony.status(job_id)
if status == "COMPLETED":  # or an enum member
    files = ea.services.harmony.results(job_id)
    ds = xr.open_mfdataset(files)
From this pseudocode, the initial proposal is that a service plugin for earthaccess will implement at least:
request(results, **kwargs) -> job_id: earthaccess doesn't know anything about the plugin's internals. It just passes a list of results and kwargs, and gets back an id that is later used to check progress and retrieve the results.
status(job_id) -> ENUM: this tells us whether the request is being processed, has failed, is done, etc.
results(job_id) -> list[URI]: this should be a list of URLs that can be opened with earthaccess.open or downloaded with earthaccess.download once the service is done processing the granules. If the URLs are HTTPS, they should be publicly accessible (with or without a NASA EDL bearer token). If they are S3, the service should provide an endpoint to get access keys, perhaps through an internal interface.
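The three methods above could be written down as a typed contract. The following is only a sketch under stated assumptions: the names `JobStatus`, `ServicePlugin`, `wait_for_results`, and `FakePlugin` are hypothetical, the status values are placeholders for whatever enum is agreed on, and nothing here reflects Harmony's actual client API. `FakePlugin` is an in-memory stand-in so that the polling helper can be exercised without a real service.

```python
import time
from enum import Enum
from typing import Any, Protocol, Sequence


class JobStatus(Enum):
    # Hypothetical states; a real service may report more or different ones.
    RUNNING = "running"
    COMPLETED = "completed"
    FAILED = "failed"


class ServicePlugin(Protocol):
    """The minimal contract a service plugin would satisfy."""

    def request(self, results: Sequence[Any], **kwargs: Any) -> str: ...
    def status(self, job_id: str) -> JobStatus: ...
    def results(self, job_id: str) -> list[str]: ...


def wait_for_results(
    plugin: ServicePlugin,
    job_id: str,
    poll_seconds: float = 5.0,
    timeout: float = 600.0,
) -> list[str]:
    """Poll the plugin until the job completes, fails, or times out."""
    deadline = time.monotonic() + timeout
    while True:
        state = plugin.status(job_id)
        if state is JobStatus.COMPLETED:
            return plugin.results(job_id)
        if state is JobStatus.FAILED:
            raise RuntimeError(f"job {job_id} failed")
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish in {timeout} s")
        time.sleep(poll_seconds)


class FakePlugin:
    """In-memory stand-in: every job completes on the second status poll."""

    def __init__(self) -> None:
        self._polls: dict[str, int] = {}

    def request(self, results: Sequence[Any], **kwargs: Any) -> str:
        job_id = f"job-{len(self._polls) + 1}"
        self._polls[job_id] = 0
        return job_id

    def status(self, job_id: str) -> JobStatus:
        self._polls[job_id] += 1
        done = self._polls[job_id] >= 2
        return JobStatus.COMPLETED if done else JobStatus.RUNNING

    def results(self, job_id: str) -> list[str]:
        return [f"https://example.invalid/{job_id}/output.nc"]


plugin = FakePlugin()
job_id = plugin.request(["G0000000001-EXAMPLE"])
urls = wait_for_results(plugin, job_id, poll_seconds=0)
print(urls)
```

Using a `Protocol` means a plugin would not need to inherit from an earthaccess base class; it only has to provide the three methods. Whether earthaccess wraps the polling in a job class, as suggested earlier, is left open here.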
Things to think about
How will earthaccess register the plugins? The same way xarray registers accessors, when they are imported?
Should we use earthaccess.services.<PLUGIN> as the canonical namespace for async processing plugins?
Search results (the input of the plugins): should the plugins get the whole UMM granule metadata and know how to extract what they need? Will the results be URIs? Harmony accepts concept-ids, so maybe the API accepts one of the two?
I'm probably forgetting many nuances here, but in the meantime this would be a good place to keep working on it.
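On the registration question, one option is a decorator-based registry, loosely in the spirit of how xarray accessors register themselves at import time. This is a sketch of that single option, not a decision: `register_service`, `get_service`, the `_SERVICES` dictionary, and `DemoService` are all hypothetical names.

```python
from typing import Callable, TypeVar

T = TypeVar("T")

# Hypothetical module-level registry mapping a service name to its class.
_SERVICES: dict[str, type] = {}


def register_service(name: str) -> Callable[[type[T]], type[T]]:
    """Class decorator that registers a plugin class under `name`."""

    def decorator(cls: type[T]) -> type[T]:
        if name in _SERVICES:
            raise ValueError(f"service {name!r} is already registered")
        _SERVICES[name] = cls
        return cls

    return decorator


def get_service(name: str) -> object:
    """Instantiate the plugin registered under `name`."""
    return _SERVICES[name]()


# A plugin package would run this at import time.
@register_service("demo")
class DemoService:
    def request(self, results, **kwargs) -> str:
        return "job-1"


print(get_service("demo").request([]))
```

With this shape, importing a plugin package is enough to make it available, and earthaccess.services.<PLUGIN> could resolve names through the registry.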
cc @owenlittlejohns @danielfromearth @JessicaS11 @jhkennedy @asteiker @mfisher87 @chuckwondo @itcarroll @flamingbear