fix(mcp): ASCII chart crashes with NaN when dataset contains null values by aminghadersohi · Pull Request #39916 · apache/superset

aminghadersohi · 2026-05-06T22:53:54Z

SUMMARY

When a dataset contains NULL values, they are represented as float('nan') after numeric conversion. The ASCII bar chart renderers passed NaN directly into int(), raising ValueError: cannot convert float NaN to integer.

Root cause: _generate_ascii_bar_chart and _extract_time_series_data accepted float('nan') values through the isinstance(val, (int, float)) check, which returns True for NaN. The NaN then propagated into max() / normalization calculations, causing the crash at int(normalized * max_bar_width).

Three-layer fix:

Extraction-time filtering — _generate_ascii_bar_chart and _extract_time_series_data now call _is_nan_value() (already defined in the module) before accepting a numeric value, silently skipping NaN rows.
Defence-in-depth guards — _generate_horizontal_bar_chart and _generate_vertical_bar_chart check _is_nan_value(normalized) before the int() call, treating NaN bar size as 0 (no bar drawn).

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

N/A — ASCII chart output; crash prevented.

TESTING INSTRUCTIONS

Create a dataset with at least one NULL value in a numeric column.
Render a bar or line chart via the MCP get_chart_preview tool.
Before this fix: ValueError: cannot convert float NaN to integer.
After this fix: chart renders with NaN rows silently skipped.

Unit tests added in tests/unit_tests/mcp_service/chart/test_ascii_charts.py covering bar, column, line, and timeseries bar chart types with NaN and None values.

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
Introduces new feature or API
Removes existing feature or API

When a dataset contains NULL values they become float NaN after numeric conversion. The horizontal and vertical bar chart renderers passed NaN into int(), raising ValueError. Likewise the line chart extractor would include NaN in the value list, corrupting min/max calculations. Three-layer fix: 1. Filter NaN during value extraction in _generate_ascii_bar_chart and _extract_time_series_data so NaN rows are silently skipped. 2. Add _is_nan_value() guard before int() in _generate_horizontal_bar_chart (bar_length) and _generate_vertical_bar_chart (bar_height) as defence-in-depth; NaN bar size is treated as 0 (no bar drawn). Fixes: ValueError: cannot convert float NaN to integer

codecov · 2026-05-06T23:05:08Z

Codecov Report

❌ Patch coverage is 0% with 8 lines in your changes missing coverage. Please review.
✅ Project coverage is 63.88%. Comparing base (5b5dd01) to head (f80e3b3).

Files with missing lines	Patch %	Lines
superset/mcp_service/chart/ascii_charts.py	0.00%	8 Missing ⚠️

Additional details and impacted files

@@            Coverage Diff             @@
##           master   #39916      +/-   ##
==========================================
- Coverage   63.88%   63.88%   -0.01%     
==========================================
  Files        2583     2583              
  Lines      136604   136608       +4     
  Branches    31502    31504       +2     
==========================================
  Hits        87276    87276              
- Misses      47812    47816       +4     
  Partials     1516     1516

Flag	Coverage Δ
hive	`39.38% <0.00%> (-0.01%)`	⬇️
mysql	`59.05% <0.00%> (-0.01%)`	⬇️
postgres	`59.13% <0.00%> (-0.01%)`	⬇️
presto	`41.08% <0.00%> (-0.01%)`	⬇️
python	`60.57% <0.00%> (-0.01%)`	⬇️
sqlite	`58.77% <0.00%> (-0.01%)`	⬇️
unit	`100.00% <ø> (ø)`

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:

❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

Vertical bar chart (chosen when avg label length <= 8) truncates labels to 3 characters, so 'Alpha' never appears literally. Use longer labels (> 8 chars average) to force horizontal layout, where the full label is preserved up to 15 characters.

Copilot

Pull request overview

Fixes MCP ASCII chart preview rendering crashes when datasets contain NULLs that become NaN during numeric conversion, by filtering NaN values during data extraction and guarding the int() bar-size conversion paths. Adds unit tests to exercise bar/column/line/timeseries behavior with NaN and None inputs.

Changes:

Filter out NaN numeric values during bar and time-series data extraction.
Add defense-in-depth checks to avoid int(float('nan')) when computing bar sizes.
Add unit tests covering chart generation with NaN/None values.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File	Description
`superset/mcp_service/chart/ascii_charts.py`	Filters `NaN` during extraction and adds guards before converting normalized values to `int()` to prevent crashes.
`tests/unit_tests/mcp_service/chart/test_ascii_charts.py`	Adds unit tests for `NaN`/`None` handling across several ASCII chart types.

aminghadersohi · 2026-05-07T03:43:49Z

+def test_horizontal_bar_chart_nan_rows_are_skipped() -> None:
+    """NaN rows must be silently skipped; valid rows render normally."""
+    # Use labels longer than 8 chars to force horizontal layout, where full
+    # label text is preserved (vertical layout truncates to 3 chars).
+    data = [
+        {"label": "Alpha Category", "amount": 50.0},
+        {"label": "Beta Category", "amount": float("nan")},
+        {"label": "Gamma Category", "amount": 150.0},
+    ]
+    result = generate_ascii_chart(data, "bar")


Addressed. — agor claude on Amin's behalf

Assert the exact fallback message 'No numeric data found for bar chart' instead of just checking len(result) > 0, which would pass even on unrelated error strings.

- test_horizontal_bar_chart_nan_rows_are_skipped: replace weak `or` assertion with separate `assert "Alpha" in result`, `assert "Gamma" in result`, and `assert "Beta" not in result`, making the test deterministic and verifying the NaN row is actually excluded - test_bar_chart_with_all_null_values_returns_fallback: add `isinstance(result, str)` and `assert "█" not in result` to explicitly verify no bar content is rendered in the fallback path

Add `assert "Horizontal Bar Chart" in result` to make it explicit that the horizontal renderer is chosen (avg label length 14 > 8 threshold). Horizontal mode preserves full label text; vertical would truncate to 3 chars, invalidating the "Alpha"/"Gamma" presence assertions.

codeant-ai-for-open-source · 2026-05-07T02:53:34Z

+                if (
+                    isinstance(val, (int, float))
+                    and not _is_nan_value(val)
+                    and numeric_val is None
+                ):


Suggestion: This numeric extraction condition now accepts booleans as numeric values (True/False are subclasses of int) and, because of the numeric_val is None gate, it can lock onto a boolean field and ignore the real metric in the same row. In datasets where a boolean column appears before the metric column, bars will be computed from 0/1 instead of the actual values. Exclude booleans from numeric detection (for example, require not isinstance(val, bool)) before assigning numeric_val. [logic error]

Severity Level: Major ⚠️

- ⚠️ MCP get_chart_preview bar charts show 0/1 instead of metrics. - ⚠️ ASCII bar/column previews misrepresent numeric values for users.

Steps of Reproduction ✅

1. Configure any Superset chart whose `viz_type` is `"bar"`, `"column"`, or `"echarts_timeseries_bar"` (these are routed to the bar renderer in `superset/mcp_service/chart/ascii_charts.py:50-51`) using a dataset where the first column in the query result row is BOOLEAN (e.g. `is_active`) and the second is a numeric metric (e.g. `total_sales`). 2. Use the MCP `get_chart_preview` tool, which executes `ChartPreviewStrategy.generate()` in `superset/mcp_service/chart/tool/get_chart_preview.py:3-35`. That method runs a `ChartDataCommand`, obtains `data = result["queries"][0].get("data", [])` at `get_chart_preview.py:26-28`, and then calls `ascii_chart = generate_ascii_chart(data, self.chart.viz_type or "table", ...)` at `get_chart_preview.py:30-35`. 3. Inside `generate_ascii_chart` in `superset/mcp_service/chart/ascii_charts.py:34-53`, the `"bar"`/`"column"`/`"echarts_timeseries_bar"` chart types are dispatched to `_generate_ascii_bar_chart(data, width, height)` at `ascii_charts.py:50-51`. 4. In `_generate_ascii_bar_chart` (`ascii_charts.py:66-93`), each result row dict is iterated in insertion order (`for _key, val in row.items():` at line 81). For a row like `{"is_active": True, "total_sales": 100.0}`, the first `val` is the boolean `True`. The condition at lines 82-86: `if (isinstance(val, (int, float)) and not _is_nan_value(val) and numeric_val is None):` evaluates to True because `True` is an instance of `int` and `_is_nan_value(True)` (implemented at `ascii_charts.py:23-27`) returns False. This sets `numeric_val = True` (1) at line 87, and because `numeric_val` is no longer `None`, the actual metric `total_sales` is ignored. The `values` list appended at lines 91-93 thus contains 0/1 flags instead of real metric values, and the horizontal/vertical bar chart produced by `_generate_horizontal_bar_chart`/`_generate_vertical_bar_chart` uses these incorrect 0/1-based values in the MCP ASCII preview.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** superset/mcp_service/chart/ascii_charts.py **Line:** 82:86 **Comment:** *Logic Error: This numeric extraction condition now accepts booleans as numeric values (`True`/`False` are subclasses of `int`) and, because of the `numeric_val is None` gate, it can lock onto a boolean field and ignore the real metric in the same row. In datasets where a boolean column appears before the metric column, bars will be computed from 0/1 instead of the actual values. Exclude booleans from numeric detection (for example, require `not isinstance(val, bool)`) before assigning `numeric_val`. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

Addressed. — agor claude on Amin's behalf

codeant-ai-for-open-source · 2026-05-07T02:53:35Z

+                if (
+                    isinstance(val, (int, float))
+                    and not _is_nan_value(val)
+                    and numeric_val is None
+                ):


Suggestion: The same extraction logic in time-series parsing can also select boolean columns as numeric points and then stop at the first match due numeric_val is None, which can replace the intended metric series with 0/1 values when boolean fields come first in row order. This breaks line/time-series charts for valid mixed-schema datasets. Exclude booleans from the numeric branch so only true numeric measures are used. [logic error]

Severity Level: Major ⚠️

- ⚠️ MCP get_chart_preview line charts use boolean 0/1 values. - ⚠️ Time-series ASCII previews misrepresent metric trends for users.

Steps of Reproduction ✅

1. Configure a Superset chart whose `viz_type` is `"line"` or `"echarts_timeseries_line"` (routed to the line renderer in `superset/mcp_service/chart/ascii_charts.py:52-53`) using a dataset whose query returns rows with a BOOLEAN column first (e.g. `is_active`) and a numeric metric second (e.g. `total_sales`) along with a temporal label column (e.g. `date`). 2. Invoke the MCP `get_chart_preview` tool so that `ChartPreviewStrategy.generate()` in `superset/mcp_service/chart/tool/get_chart_preview.py:3-35` runs `ChartDataCommand`, populates `data = result["queries"][0].get("data", [])` at lines 26-28, and calls `generate_ascii_chart(data, self.chart.viz_type or "table", ...)` at lines 30-35. 3. In `generate_ascii_chart` (`superset/mcp_service/chart/ascii_charts.py:34-54`), when `chart_type` is `"line"` or `"echarts_timeseries_line"`, control flows to `_generate_ascii_line_chart(data, width, height)` as seen at line 52. `_generate_ascii_line_chart` then calls `values, labels = _extract_time_series_data(data)` at `ascii_charts.py:253`. 4. Inside `_extract_time_series_data` (`ascii_charts.py:275-307`), each row dict is iterated (`for key, val in row.items():` at line 286). For a row such as `{"is_active": True, "total_sales": 100.0, "date": "2024-01-01"}`, the first `val` is the boolean `True`. The numeric-branch condition at lines 287-291: `if (isinstance(val, (int, float)) and not _is_nan_value(val) and numeric_val is None):` evaluates to True because `True` is an `int` and `_is_nan_value(True)` returns False (implementation at `ascii_charts.py:23-27` uses `math.isnan(float(value))`). This sets `numeric_val = True` at line 292 and prevents later numeric fields from being considered (`numeric_val is None` gate). As a result, the `values` list appended at lines 303-305 contains 0/1 boolean flags instead of the intended metric (e.g. `total_sales`), so the line chart drawn by `_create_enhanced_line_chart` (`ascii_charts.py:310-378`) and returned to the MCP `get_chart_preview` caller has an incorrect time-series shape.

Fix in Cursor | Fix in VSCode Claude

(Use Cmd/Ctrl + Click for best experience)

Prompt for AI Agent 🤖

This is a comment left during a code review. **Path:** superset/mcp_service/chart/ascii_charts.py **Line:** 287:291 **Comment:** *Logic Error: The same extraction logic in time-series parsing can also select boolean columns as numeric points and then stop at the first match due `numeric_val is None`, which can replace the intended metric series with 0/1 values when boolean fields come first in row order. This breaks line/time-series charts for valid mixed-schema datasets. Exclude booleans from the numeric branch so only true numeric measures are used. Validate the correctness of the flagged issue. If correct, How can I resolve this? If you propose a fix, implement it and please make it concise. Once fix is implemented, also check other comments on the same PR, and ask user if the user wants to fix the rest of the comments as well. if said yes, then fetch all the comments validate the correctness and implement a minimal fix

👍 | 👎

Addressed. — agor claude on Amin's behalf

…derers bool is a subclass of int, so isinstance(True, (int, float)) returns True. Without an explicit bool guard the extractor would lock onto a boolean column (is_active, flag, etc.) and ignore the real numeric metric, producing 0/1 bars instead of actual values. Add not isinstance(val, bool) to the numeric guard in both _generate_ascii_bar_chart and _extract_time_series_data.

bito-code-review · 2026-05-07T08:32:33Z

Code Review Agent Run #f8215a

Actionable Suggestions - 0

Review Details

Files reviewed - 2 · Commit Range: 393906c..f80e3b3
- superset/mcp_service/chart/ascii_charts.py
- tests/unit_tests/mcp_service/chart/test_ascii_charts.py
Files skipped - 0
Tools
- Whispers (Secret Scanner) - ✔︎ Successful
- Detect-secrets (Secret Scanner) - ✔︎ Successful
- MyPy (Static Code Analysis) - ✔︎ Successful
- Astral Ruff (Static Code Analysis) - ✔︎ Successful

Bito Usage Guide

Commands

Type the following command in the pull request comment and save the comment.

/review - Manually triggers a full AI review.
/pause - Pauses automatic reviews on this pull request.
/resume - Resumes automatic reviews.
/resolve - Marks all Bito-posted review comments as resolved.
/abort - Cancels all in-progress reviews.

Refer to the documentation for additional commands.

Configuration

This repository uses Superset You can customize the agent settings here or contact your Bito workspace admin at evan@preset.io.

Documentation & Help

AI Code Review powered by

pull-request-size Bot added the size/L label May 6, 2026

aminghadersohi requested a review from Copilot May 6, 2026 23:43

Copilot started reviewing on behalf of aminghadersohi May 6, 2026 23:43 View session

Copilot AI reviewed May 6, 2026

View reviewed changes

test(mcp): strengthen all-NaN fallback assertion in bar chart test

436e7ae

Assert the exact fallback message 'No numeric data found for bar chart' instead of just checking len(result) > 0, which would pass even on unrelated error strings.

aminghadersohi marked this pull request as ready for review May 7, 2026 02:37

dosubot Bot added the viz:charts:bar Related to the Bar chart label May 7, 2026

aminghadersohi added 2 commits May 7, 2026 02:49

codeant-ai-for-open-source Bot reviewed May 7, 2026

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

fix(mcp): ASCII chart crashes with NaN when dataset contains null values#39916

fix(mcp): ASCII chart crashes with NaN when dataset contains null values#39916
aminghadersohi wants to merge 6 commits intoapache:masterfrom
aminghadersohi:fix/mcp-ascii-chart-nan

aminghadersohi commented May 6, 2026

Uh oh!

codecov Bot commented May 6, 2026 •

edited

Loading

Uh oh!

Copilot AI left a comment

Uh oh!

aminghadersohi May 7, 2026

Uh oh!

Uh oh!

codeant-ai-for-open-source Bot May 7, 2026

Uh oh!

aminghadersohi May 7, 2026

Uh oh!

codeant-ai-for-open-source Bot May 7, 2026

Uh oh!

aminghadersohi May 7, 2026

Uh oh!

bito-code-review Bot commented May 7, 2026 •

edited

Loading

Code Review Agent Run #f8215a

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

aminghadersohi commented May 6, 2026

SUMMARY

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

TESTING INSTRUCTIONS

ADDITIONAL INFORMATION

Uh oh!

codecov Bot commented May 6, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Codecov Report

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

aminghadersohi May 7, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

codeant-ai-for-open-source Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

aminghadersohi May 7, 2026

Choose a reason for hiding this comment

Uh oh!

codeant-ai-for-open-source Bot May 7, 2026

Choose a reason for hiding this comment

Uh oh!

aminghadersohi May 7, 2026

Choose a reason for hiding this comment

Uh oh!

bito-code-review Bot commented May 7, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Code Review Agent Run #f8215a

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

codecov Bot commented May 6, 2026 •

edited

Loading

bito-code-review Bot commented May 7, 2026 •

edited

Loading