Add table_histogram endpoint by will-moore · Pull Request #677 · ome/omero-web

will-moore · 2026-06-05T10:56:45Z

We already have histogram functionality in omero-parade and now I also need it for iviewer (ome/omero-iviewer#532), so it makes sense for this to go into omero-web.

This endpoint behaves similarly to the existing OMERO.table slice endpoint, e.g. /webgateway/table/FILE_ID/slice/?columns=0&rows=0-100. It wraps table_slice() to load the data, then generates a histogram using numpy and returns the result.

By default, we use ALL the rows to generate the histogram.
Since we don't want to have to load the table twice (once to get the row count, then again passing rows=0-(row_count-1) to table_slice()), I have updated table_slice() to allow rows=* (no change to the maximum amount of data permitted).

So you can now do /webgateway/table/FILE_ID/slice/?columns=0&rows=*

The histogram endpoint supports the bins request parameter (int or string), which behaves as described at https://numpy.org/devdocs/reference/generated/numpy.histogram.html
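
As a rough illustration of the behaviour described above (not the code in this PR), one entry of the response could be built from a column of values like this. The function name `column_histogram` and the handling of a numeric string such as `"4"` are assumptions made for the sketch; only `numpy.histogram` itself is a real API, and 10 is its default bin count.

```python
import numpy as np


def column_histogram(name, values, bins=10):
    """Build one histogram entry for a single column (hypothetical helper)."""
    # Request parameters arrive as strings; treat digits as an int bin count
    # and pass anything else (e.g. "auto") through to numpy unchanged.
    if isinstance(bins, str) and bins.isdigit():
        bins = int(bins)
    counts, bin_edges = np.histogram(np.asarray(values, dtype=float), bins=bins)
    return {
        "column": name,
        "histogram": counts.tolist(),
        "bin_edges": bin_edges.tolist(),
    }


example = column_histogram("x_centroid", [1.0, 2.0, 2.5, 3.0, 10.0], bins="4")
print(example)
```

With an integer bin count, numpy returns one more bin edge than there are bins, which matches the shape of the sample response.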

Sample response to /webgateway/table/15908/histogram/?columns=2,3 on merge-ci:

{
  "histograms": [
    {
      "column": "x_centroid",
      "histogram": [1449, 2750, 2982, 3161, 3393, 3455, 3012, 2643, 1161, 400],
      "bin_edges": [
        3.757423210144043, 766.061197490692, 1528.36497177124,
        2290.6687460517883, 3052.9725203323364, 3815.2762946128846,
        4577.580068893432, 5339.8838431739805, 6102.187617454529,
        6864.491391735077, 7626.795166015625
      ]
    },
    {
      "column": "y_centroid",
      "histogram": [52, 142, 32, 1388, 3627, 3905, 4269, 4326, 4111, 2554],
      "bin_edges": [
        39.39493064880371, 614.6632842636108, 1189.9316378784179,
        1765.199991493225, 2340.4683451080323, 2915.7366987228393,
        3491.0050523376467, 4066.2734059524537, 4641.541759567261,
        5216.810113182068, 5792.078466796875
      ]
    }
  ],
  "meta": {
    "columns": ["x_centroid", "y_centroid"],
    "rowCount": 24406,
    "columnCount": 13,
    "maxCells": 1000000
  }
}
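
A minimal sketch of how a client might consume this response, assuming only the JSON shape shown above. The helper names `bin_centers` and `summarize` are hypothetical. The check that the counts add up to `rowCount` holds for this sample, where all rows were used; it would not hold for a request restricted with `rows=`.

```python
def bin_centers(bin_edges):
    """Midpoint of each bin, usable as x positions when plotting bars."""
    return [(lo + hi) / 2 for lo, hi in zip(bin_edges, bin_edges[1:])]


def summarize(response):
    """Map column name -> bin centers, counts and a row-coverage flag."""
    row_count = response["meta"]["rowCount"]
    summary = {}
    for entry in response["histograms"]:
        counts, edges = entry["histogram"], entry["bin_edges"]
        if len(edges) != len(counts) + 1:
            raise ValueError("expected one more bin edge than bins")
        summary[entry["column"]] = {
            "centers": bin_centers(edges),
            "counts": counts,
            "covers_all_rows": sum(counts) == row_count,
        }
    return summary


# x_centroid entry copied from the sample response above
sample = {
    "histograms": [
        {
            "column": "x_centroid",
            "histogram": [1449, 2750, 2982, 3161, 3393, 3455, 3012, 2643, 1161, 400],
            "bin_edges": [
                3.757423210144043, 766.061197490692, 1528.36497177124,
                2290.6687460517883, 3052.9725203323364, 3815.2762946128846,
                4577.580068893432, 5339.8838431739805, 6102.187617454529,
                6864.491391735077, 7626.795166015625,
            ],
        }
    ],
    "meta": {"columns": ["x_centroid"], "rowCount": 24406},
}

result = summarize(sample)
print(result["x_centroid"]["covers_all_rows"])
```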

knabar · 2026-06-19T09:38:54Z

The time to calculate a histogram on demand will directly depend on the number of rows in the table and likely won't be sustainable for tables with millions of rows, which we are seeing regularly now.

Our strategy is to calculate column statistics for most numeric columns at the time of table creation and store them in the table metadata (the roi column and a few others are excluded, as statistics are not meaningful there). The metadata fields created include:

<column name>.min
<column name>.max
<column name>.mean
<column name>.median
<column name>.std
<column name>.skew
<column name>.kurtosis
<column name>.histogram.count
<column name>.histogram.division

All custom metadata fields are already returned via the webgateway/table/<id>/metadata/ endpoint.
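
For illustration only, and not the code referred to in this comment, per-column statistics with these key names could be computed with numpy along the following lines. Several things here are assumptions: the helper name `column_stats`, the use of the population standard deviation, excess (Fisher) kurtosis, and the reading of `histogram.count` as bin counts and `histogram.division` as bin edges. How the values are serialized into the table metadata is not shown.

```python
import numpy as np


def column_stats(name, values, bins=10):
    """Compute summary statistics for one numeric column (hypothetical helper)."""
    x = np.asarray(values, dtype=float)
    mean = x.mean()
    std = x.std()  # population standard deviation (ddof=0), an assumption
    if std > 0:
        z = (x - mean) / std
        skew = float(np.mean(z ** 3))
        kurtosis = float(np.mean(z ** 4) - 3.0)  # excess kurtosis, an assumption
    else:
        # A constant column has no spread; report 0 rather than dividing by 0.
        skew = kurtosis = 0.0
    counts, edges = np.histogram(x, bins=bins)
    return {
        f"{name}.min": float(x.min()),
        f"{name}.max": float(x.max()),
        f"{name}.mean": float(mean),
        f"{name}.median": float(np.median(x)),
        f"{name}.std": float(std),
        f"{name}.skew": skew,
        f"{name}.kurtosis": kurtosis,
        # Assumed meaning: counts per bin and the bin edges that divide them.
        f"{name}.histogram.count": counts.tolist(),
        f"{name}.histogram.division": edges.tolist(),
    }


stats = column_stats("area", [1, 2, 3, 4, 5], bins=2)
print(stats)
```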

will-moore · 2026-06-22T16:23:52Z

@knabar Thanks, it would be great to see some sample code for how those stats are generated and saved to the table. I'm also curious as to how they are used to generate a histogram curve in the client?

While I appreciate that the histogram endpoint in this PR may not scale to all tables (if the row count is very large), I still think it is useful in situations where the table has a smaller row_count and no histogram/stats have previously been calculated.
Without this endpoint, the only way to create a histogram for a column is to load ALL the values into the browser and build a histogram in JavaScript, which will scale less well than histogram generation in Python on the web server.

Testing with a local omero-web, connecting to an idr server, using a table with 18k rows, loading a whole column with /webgateway/table/14209154/slice/?columns=2&rows=* takes 1.45 secs; the histogram for the same column takes about the same time.
The same holds for a bigger table: webgateway/table/44583133/histogram/?columns=1 with 158k rows takes about 2.3 to 2.8 secs, and loading the whole column via the slice endpoint takes a similar time.
Because I'm running omero-web locally, both cases require the whole column to be retrieved from OMERO.server to my local web server, so we don't see a performance benefit. But when the web server is close to the OMERO.server, the histogram will benefit from moving less JSON data down the wire. The whole column slice JSON is about 1.25 MB in this case.
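
To make the payload argument concrete, here is a small sketch comparing the JSON size of a whole column with the JSON size of its histogram. The data is synthetic (158,000 random values), and the column is serialized as a bare list rather than in the real slice response format, so the absolute sizes will differ from the 1.25 MB measured above. Only the ratio is the point.

```python
import json

import numpy as np

# Synthetic stand-in for one table column; not data from any real table.
rng = np.random.default_rng(0)
values = rng.uniform(0, 8000, size=158_000)

# Whole column as JSON, roughly what a client must download to bin it itself.
column_json = json.dumps(values.tolist())

# Histogram of the same column: 10 counts and 11 bin edges.
counts, bin_edges = np.histogram(values, bins=10)
histogram_json = json.dumps(
    {"histogram": counts.tolist(), "bin_edges": bin_edges.tolist()}
)

print(f"column JSON:    {len(column_json)} bytes")
print(f"histogram JSON: {len(histogram_json)} bytes")
```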

knabar · 2026-06-24T13:12:52Z

Since the histogram code relies on the data fitting into a single table slice call, I agree that the danger is probably minimal, since a client application could also just request a slice of the same size itself.

If I am reading the code correctly, though, it looks like the histogram code does not take an incomplete slice into consideration. If, say, a table has 3 million rows, MAX_TABLE_SLICE_SIZE will cause the table slice to return only the first million (by default) without raising an error, and the histogram will cover only those rows without indicating that fact.

will-moore · 2026-06-26T14:18:39Z

The histogram endpoint response includes the same meta returned by the underlying slice function, so we can use that to check the total table size compared with the max limit:

"meta": {
  "columns": [
    "area (µm)"
  ],
  "rowCount": 1596,
  "columnCount": 15,
  "maxCells": 1000000
}

I think that the limit_generator used by the _table_slice() function will raise ValueError("Too many items") if you try to create a histogram on a whole table that is bigger than MAX_TABLE_SLICE_SIZE.
So you'll then need to use e.g. &rows=0-1000000 to retrieve a partial histogram.
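
A client-side sketch of that check, using only the `rowCount` and `maxCells` fields of `meta`. It assumes the limit applies to rows multiplied by the number of requested columns, and that row ranges are inclusive (as the rows=0-(row_count-1) form in the PR description suggests). Both helper names are hypothetical.

```python
def fits_in_one_slice(meta, n_columns):
    """True if every row of the requested columns is within the cell limit."""
    return meta["rowCount"] * n_columns <= meta["maxCells"]


def rows_param(meta, n_columns):
    """Pick a rows= value: '*' for the whole table, else the largest prefix."""
    if fits_in_one_slice(meta, n_columns):
        return "*"
    max_rows = meta["maxCells"] // n_columns
    # Inclusive range assumed, so the last row index is max_rows - 1.
    return f"0-{max_rows - 1}"


small = {"rowCount": 1596, "columnCount": 15, "maxCells": 1000000}
large = {"rowCount": 3000000, "columnCount": 15, "maxCells": 1000000}

print(rows_param(small, n_columns=1))
print(rows_param(large, n_columns=1))
```

When the second form is returned, the histogram is partial, and the client can say so by comparing the requested range with `rowCount`.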

will-moore and others added 3 commits June 5, 2026 11:46

Add table_histogram endpoint

6fef8fa

[pre-commit.ci] auto fixes from pre-commit.com hooks

50bf498

for more information, see https://pre-commit.ci

Remove long comment line

1f39cd3

will-moore wants to merge 3 commits into ome:master from will-moore:table_histogram
