Add test for MSVC compilation w/o CMake

vincentlaucsb · vincentlaucsb · commit 51c88e0ccccb · 2026-03-31T17:44:34.000-07:00
diff --git a/.github/workflows/cmake-multi-platform.yml b/.github/workflows/cmake-multi-platform.yml
@@ -93,3 +93,4 @@ jobs:
       run: ctest --build-config ${{ matrix.build_type }}
 
 
+
diff --git a/.github/workflows/msvc-consumer-compat.yml b/.github/workflows/msvc-consumer-compat.yml
@@ -0,0 +1,63 @@
+name: MSVC Consumer Compatibility
+
+on:
+  push:
+    branches: [ "master" ]
+    paths:
+      - "include/**"
+      - "single_header.py"
+      - "CMakeLists.txt"
+      - ".github/workflows/msvc-consumer-compat.yml"
+  pull_request:
+    branches: [ "master" ]
+    paths:
+      - "include/**"
+      - "single_header.py"
+      - "CMakeLists.txt"
+      - ".github/workflows/msvc-consumer-compat.yml"
+  workflow_dispatch:
+
+jobs:
+  windows-msvc-no-zc-cplusplus:
+    name: Windows MSVC compatibility without /Zc:__cplusplus
+    runs-on: windows-latest
+
+    steps:
+    - name: Checkout repository and submodules
+      uses: actions/checkout@v4
+      with:
+        submodules: recursive
+
+    - name: Set up MSVC developer command prompt
+      uses: ilammy/msvc-dev-cmd@v1
+
+    - name: Set up Python
+      uses: actions/setup-python@v5
+      with:
+        python-version: '3.x'
+
+    - name: Configure CMake (for single-header generation)
+      run: >
+        cmake -S ${{ github.workspace }}
+        -B ${{ github.workspace }}/build/no-zc-cplusplus
+        -DCSV_CXX_STANDARD=20
+        -DCSV_BUILD_PROGRAMS=OFF
+        -DBUILD_PYTHON=OFF
+
+    - name: Generate amalgamated single-header
+      run: cmake --build ${{ github.workspace }}/build/no-zc-cplusplus --config Release --target generate_single_header
+
+    - name: Compile generated single-header without /Zc:__cplusplus
+      shell: cmd
+      run: |
+        REM NOTE: single_include/csv.hpp is an intentional compatibility shim; compile the generated amalgamated header instead.
+        echo #include "csv.hpp" > ci_no_zc_single_header.cpp
+        echo int main() { return 0; } >> ci_no_zc_single_header.cpp
+        cl /nologo /std:c++20 /EHsc /Zc:__cplusplus- /I build\no-zc-cplusplus\single_include_generated /c ci_no_zc_single_header.cpp
+
+    - name: Compile unamalgamated header without /Zc:__cplusplus
+      shell: cmd
+      run: |
+        echo #include "csv.hpp" > ci_no_zc_unamalgamated.cpp
+        echo int main() { return 0; } >> ci_no_zc_unamalgamated.cpp
+        cl /nologo /std:c++20 /EHsc /Zc:__cplusplus- /I include /c ci_no_zc_unamalgamated.cpp
diff --git a/AGENTS.md b/AGENTS.md
@@ -0,0 +1,77 @@
+# CSV Parser - AI Agent Context
+
+Architectural overview for AI assistants working with this codebase.
+
+> **Maintenance rule:** Whenever this file is changed, `CLAUDE.md` in the same directory must be updated to reflect the changes. `CLAUDE.md` is a bullet-point summary of this file and must stay in sync.
+
+## Critical: single_include/csv.hpp Is A Shim
+
+`single_include/csv.hpp` is intentionally **non-functional** and exists only as a compatibility shim.
+
+- Do **not** compile against `single_include/csv.hpp`
+- For single-header validation, generate `build/.../single_include_generated/csv.hpp` via the `generate_single_header` target, then compile that generated file
+- For unamalgamated usage, include headers from `include/`
+
+This guard exists to prevent stale-in-repo amalgamated headers and to force use of the canonical generated distribution.
+
+## Critical: Two Independent Code Paths
+
+The `CSVReader` class has **two completely different implementations**:
+
+```cpp
+// PATH 1: Memory-mapped I/O (MmapParser)
+CSVReader reader("filename.csv");
+
+// PATH 2: Stream-based (StreamParser)
+std::ifstream infile("filename.csv", std::ios::binary);
+CSVReader reader(infile, format);
+```
+
+**Impact:** Bugs can exist in one path but not the other (see issue #281). Any test validating parsing behavior must test BOTH paths using Catch2 `SECTION`.
+
+## Threading: Worker + 10MB Chunks
+
+- Worker thread reads in 10MB chunks (`ITERATION_CHUNK_SIZE`)
+- Communicates via `ThreadSafeDeque<CSVRow>`
+- Exceptions propagate via `std::exception_ptr`
+- Critical: Fields spanning chunk boundaries must not corrupt
+
+**Testing requirement:** Use ≥500K rows to cross 10MB boundary.
+
+## Key Files
+
+| File | Contains |
+|------|----------|
+| `csv_reader.hpp` | Mmap vs stream constructors |
+| `csv_reader.cpp` | Delimiter guessing, header detection |
+| `basic_csv_parser.hpp` | Parser base class (IBasicCSVParser, MmapParser, StreamParser) |
+| `basic_csv_parser.cpp` | Chunk transitions, worker thread |
+| `raw_csv_data.hpp` | Internal parser data structures (RawCSVField, CSVFieldList, RawCSVData) |
+| `thread_safe_deque.hpp` | Producer-consumer queue for parser→main thread communication |
+| `csv_row.hpp` | Public API types (CSVField, CSVRow) |
+| `test_round_trip.cpp` | Exemplar test patterns |
+
+## Data Flow: Parser → Row API
+
+```
+Parser Thread                      Main Thread
+	↓                                  ↓
+RawCSVData (shared_ptr) ─────────────→ CSVRow
+	↓                                  ↓
+CSVFieldList → RawCSVField[]       CSVField (lazy unescaping)
+	↓
+ThreadSafeDeque<CSVRow>
+(producer-consumer queue)
+```
+
+**Thread Safety:** Parser populates `RawCSVData`, pushes `CSVRow` to `ThreadSafeDeque`, main thread pops and reads. The `CSVFieldList` uses chunked allocation (~170 fields/chunk) for cache locality. See `raw_csv_data.hpp` and `thread_safe_deque.hpp` for implementation details.
+
+## Common Pitfalls
+
+1. **Don't assume one code path:** Mmap and stream paths are different. Always test both.
+2. **Don't write tiny tests:** Need ≥500K rows to cross 10MB chunk boundary.
+3. **Don't use uniform values:** Each column needs distinct values to detect corruption.
+4. **Don't ignore async:** Worker thread means exceptions must use `exception_ptr`.
+5. **Don't change one constructor:** Likely affects both mmap and stream paths.
+
+See `tests/AGENTS.md` for test strategy, checklist, and conventions.
diff --git a/CLAUDE.md b/CLAUDE.md
@@ -1,94 +1,37 @@
-# CSV Parser - AI Agent Context
+# CSV Parser - Claude Summary
 
-Architectural overview for AI assistants working with this codebase.
+> **`AGENTS.md` is the source of truth.** This file is a bullet-point summary only. Always load and follow `AGENTS.md` — it takes precedence over anything here.
 
-## Critical: Two Independent Code Paths
+## single_include/csv.hpp
+- Non-functional shim — do **not** compile against it
+- For single-header use: generate `build/.../single_include_generated/csv.hpp` via `generate_single_header` target
+- For unamalgamated use: include from `include/`
 
-The `CSVReader` class has **two completely different implementations**:
+## Two Independent Code Paths
+- `CSVReader("file.csv")` → MmapParser
+- `CSVReader(istream, format)` → StreamParser
+- Bugs can exist in one and not the other — always test both with Catch2 `SECTION`
 
-```cpp
-// PATH 1: Memory-mapped I/O (MmapParser)
-CSVReader reader("filename.csv");
-
-// PATH 2: Stream-based (StreamParser)
-std::ifstream infile("filename.csv", std::ios::binary);
-CSVReader reader(infile, format);
-```
-
-**Impact:** Bugs can exist in one path but not the other (see issue #281). Any test validating parsing behavior must test BOTH paths using Catch2 `SECTION`.
-
-## Threading: Worker + 10MB Chunks
-
-- Worker thread reads in 10MB chunks (`ITERATION_CHUNK_SIZE`)
-- Communicates via `ThreadSafeDeque<CSVRow>`
+## Threading
+- Worker thread reads 10MB chunks (`ITERATION_CHUNK_SIZE`)
+- Communication via `ThreadSafeDeque<CSVRow>`
 - Exceptions propagate via `std::exception_ptr`
-- Critical: Fields spanning chunk boundaries must not corrupt
-
-**Testing requirement:** Use ≥500K rows to cross 10MB boundary.
-
-## Test Strategy: Use Distinct Column Values
-
-❌ **BAD:** `array{i, i, i, i, i}` - All columns identical  
-✅ **GOOD:** `array{i*5+0, i*5+1, i*5+2, i*5+3, i*5+4}` - Each column distinct
-
-**Why:** Field corruption is only detectable if columns have different values.
+- Tests must use ≥500K rows to cross chunk boundary
 
 ## Key Files
-
-| File | Contains |
-|------|----------|
-| `csv_reader.hpp` | Mmap vs stream constructors |
-| `csv_reader.cpp` | Delimiter guessing, header detection |
-| `basic_csv_parser.hpp` | Parser base class (IBasicCSVParser, MmapParser, StreamParser) |
-| `basic_csv_parser.cpp` | Chunk transitions, worker thread |
-| `raw_csv_data.hpp` | Internal parser data structures (RawCSVField, CSVFieldList, RawCSVData) |
-| `thread_safe_deque.hpp` | Producer-consumer queue for parser→main thread communication |
-| `csv_row.hpp` | Public API types (CSVField, CSVRow) |
-| `test_round_trip.cpp` | Exemplar test patterns |
-
-## Data Flow: Parser → Row API
-
-```
-Parser Thread                      Main Thread
-     ↓                                  ↓
-RawCSVData (shared_ptr) ─────────────→ CSVRow
-     ↓                                  ↓
-CSVFieldList → RawCSVField[]       CSVField (lazy unescaping)
-     ↓
-ThreadSafeDeque<CSVRow>
-(producer-consumer queue)
-```
-
-**Thread Safety:** Parser populates `RawCSVData`, pushes `CSVRow` to `ThreadSafeDeque`, main thread pops and reads. The `CSVFieldList` uses chunked allocation (~170 fields/chunk) for cache locality. See `raw_csv_data.hpp` and `thread_safe_deque.hpp` for implementation details.
+- `csv_reader.hpp` — mmap vs stream constructors
+- `basic_csv_parser.hpp` — MmapParser, StreamParser implementations
+- `basic_csv_parser.cpp` — chunk transitions, worker thread
+- `raw_csv_data.hpp` — RawCSVField, CSVFieldList, RawCSVData
+- `thread_safe_deque.hpp` — producer-consumer queue
+- `csv_row.hpp` — CSVField, CSVRow public API
 
 ## Common Pitfalls
-
-1. **Don't assume one code path:** Mmap and stream paths are different. Always test both.
-2. **Don't write tiny tests:** Need ≥500K rows to cross 10MB chunk boundary.
-3. **Don't use uniform values:** Each column needs distinct values to detect corruption.
-4. **Don't ignore async:** Worker thread means exceptions must use `exception_ptr`.
-5. **Don't change one constructor:** Likely affects both mmap and stream paths.
-
-## Test Checklist
-
-- [ ] Tests both mmap and stream paths (use `SECTION`)
-- [ ] Distinct values per column
-- [ ] ≥500K rows to cross chunk boundary
-- [ ] Documents bug it would catch
-- [ ] Lambda + SECTION pattern for code reuse
-- [ ] Test data in `tests/data/fake_data` (real data in `tests/data/real_data`)
-- [ ] Use `FileGuard` for temporary files (ensures cleanup even if test fails)
-
-**Note:** `tests/data` is a git submodule. Remember to commit changes separately.
-
-## Recent Bug Fixes
-
-| Issue | Bug | Fixed |
-|-------|-----|-------|
-| #278 | CSVFieldList move constructor dangling pointer | Feb 2026 |
-| #280 | Field corruption at chunk boundaries | PR #282 |
-| #281 | Stream-specific exception handling | PR #282 |
-| #283 | Header detection with variable-width rows | Jan 2026 |
-| #285 | Delimiter guessing overwrites `no_header()` | Feb 2026 |
-
-See inline comments in source files for implementation details.
+- Always test both mmap and stream paths
+- ≥500K rows needed to cross 10MB boundary
+- Use distinct column values to detect field corruption
+- Exceptions from worker thread need `exception_ptr`
+- Changes to one constructor likely affect both paths
+
+## Tests
+See `tests/AGENTS.md` for full test strategy, checklist, and conventions.
diff --git a/README.md b/README.md
@@ -93,6 +93,10 @@ Found a bug? Please report it! This project welcomes **genuine bug reports broug
  * ✅ Performance regressions in real-world scenarios
  * ✅ API issues that affect **practical, real-world use cases**
 
+When reporting integration or compiler issues, please state which library form you are using:
+ * Single-header
+ * Unamalgamated headers/library (`include/` with your own build system, CMake, etc.)
+
 Please keep reports grounded in real use cases—no contrived edge cases or philosophical debates about API design, thanks!
 
 **Design Note:** `CSVReader` uses `std::input_iterator_tag` for single-pass streaming of arbitrarily large files. If you need multi-pass iteration or random access, copy rows to a `std::vector` first. This is by design, not a bug.
diff --git a/tests/AGENTS.md b/tests/AGENTS.md
diff --git a/tests/CLAUDE.md b/tests/CLAUDE.md

Original file line number	Diff line number	Diff line change
`@@ -93,3 +93,4 @@ jobs:`
`93`	`93`	`run: ctest --build-config ${{ matrix.build_type }}`
`94`	`94`
`95`	`95`
	`96`	`+`