Cross-array constraints for Xarray #51

@TomNicholas

Hi, we met at SciPy, this package looks very cool! I'm impressed with the amount of thought that seems to have gone into the upstream LinkML work too.

My main question after skimming all the docs is: can this system specify constraints across multiple arrays? That's necessary to support Xarray correctly.

I see you already support Zarr & HDF5, but Xarray's data model is subtly different. For example, it's possible to create a Zarr group which cannot be converted to an xarray.Dataset.

Concretely, if I want to check that the schema of a given Zarr store is compatible with xarray, I need to check that, within a group, every occurrence of a given dimension name has the same length. This is a cross-array constraint: its correctness can't be determined by looking at any one array, because it depends on all the other arrays. (In xarray parlance we would say that the arrays' dimensions are aligned.)
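To make the constraint concrete, here is a minimal sketch of the check in plain Python. It assumes the dimension names and shapes have already been read out of the group into a mapping of array name to `(dims, shape)`; that input layout and the name `find_dimension_conflicts` are hypothetical, for illustration only, and are not part of xarray, Zarr, or this package.

```python
from typing import Dict, List, Mapping, Sequence, Tuple

# Hypothetical input layout: array name -> (dimension names, shape).
ArraySpec = Tuple[Sequence[str], Sequence[int]]


def find_dimension_conflicts(
    arrays: Mapping[str, ArraySpec],
) -> Dict[str, Dict[int, List[str]]]:
    """Return {dim: {length: [array names]}} for each dimension name
    that appears with more than one length within the group."""
    seen: Dict[str, Dict[int, List[str]]] = {}
    for name, (dims, shape) in arrays.items():
        if len(dims) != len(shape):
            raise ValueError(
                f"{name!r}: {len(dims)} dimension names for {len(shape)} axes"
            )
        for dim, length in zip(dims, shape):
            # Record which arrays use this dimension name at this length.
            seen.setdefault(dim, {}).setdefault(length, []).append(name)
    # Only dimensions seen with two or more distinct lengths are conflicts.
    return {dim: lengths for dim, lengths in seen.items() if len(lengths) > 1}


# Every array agrees on the length of "time": no conflicts.
consistent_group = {
    "temperature": (("time", "x"), (10, 5)),
    "pressure": (("time",), (10,)),
}

# "time" has length 10 in one array and 12 in another.
inconsistent_group = {
    "temperature": (("time", "x"), (10, 5)),
    "pressure": (("time",), (12,)),
}

ok = find_dimension_conflicts(consistent_group)
bad = find_dimension_conflicts(inconsistent_group)
```

Each array in `inconsistent_group` is valid on its own; the problem only shows up once the lengths recorded for `"time"` are compared across arrays, which is why a per-array schema cannot express it.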

You don't need this feature for a single xarray.Variable, but you do need it for both xarray.DataArray and xarray.Dataset. xarray.DataTree effectively has even more complicated constraints, which are cross-group as well as cross-array.

Can these types of constraints be supported in your system?
