`AGENTS.md` is the source of truth. This file is a bullet-point summary only. Always load and follow `AGENTS.md`; it takes precedence over anything here.
- When `AGENTS.md` changes, update both `CLAUDE.md` and root `ARCHITECTURE.md` to keep guidance and architecture index references aligned.
- Non-functional shim — do not compile against it
- For single-header use: generate `build/.../single_include_generated/csv.hpp` via the `generate_single_header` target
- For unamalgamated use: include from `include/`
- `CSVReader("file.csv")` → MmapParser
- `CSVReader(istream, format)` → StreamParser
- Bugs can exist in one and not the other; always test both with Catch2 `SECTION`
- Worker thread reads 10MB chunks (`CSV_CHUNK_SIZE_DEFAULT`)
- Communication via `ThreadSafeDeque<CSVRow>`
- Exceptions propagate via `std::exception_ptr`
- Tests must use ≥500K rows to cross chunk boundary
- File mapping, parser data flow, and component relationships are maintained in `ARCHITECTURE.md` and `include/internal/ARCHITECTURE.md`
- Always test both mmap and stream paths
- ≥500K rows needed to cross 10MB boundary
- Use distinct column values to detect field corruption
- Exceptions from worker thread need `exception_ptr`
- Changes to one constructor likely affect both paths
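
One way to satisfy the row-count and distinct-value items in this checklist is a small generator like the one below. It is a sketch under stated assumptions: `make_large_csv`, `expected_field`, and `DEMO_CHUNK_SIZE` are hypothetical names, and the chunk size is taken as 10 * 1024 * 1024 bytes, which may not match the real `CSV_CHUNK_SIZE_DEFAULT`.

```cpp
#include <cstddef>
#include <string>

// Assumption: "10MB" is treated as 10 * 1024 * 1024 bytes. Check the real
// CSV_CHUNK_SIZE_DEFAULT before relying on this number.
constexpr std::size_t DEMO_CHUNK_SIZE = 10u * 1024u * 1024u;

// Hypothetical helper: the value expected at (row, col). Every field encodes
// its own position, so a shifted or corrupted field is detectable.
std::string expected_field(std::size_t row, std::size_t col) {
    return "r" + std::to_string(row) + "c" + std::to_string(col);
}

// Hypothetical helper: builds CSV text with a header line followed by `rows`
// data rows of `cols` distinct fields each.
std::string make_large_csv(std::size_t rows, std::size_t cols = 3) {
    std::string out;
    for (std::size_t c = 0; c < cols; ++c) {
        if (c > 0) out += ',';
        out += "col" + std::to_string(c);
    }
    out += '\n';
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < cols; ++c) {
            if (c > 0) out += ',';
            out += expected_field(r, c);
        }
        out += '\n';
    }
    return out;
}
```

The same text can be written to a temporary file for the mmap path and wrapped in a `std::istringstream` for the stream path, so both sections of a test compare against identical expected values.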
- Always update or remove incorrect comments
- Do not delete or simplify comments unless trivially obvious or factually wrong — comments encode concurrency invariants and bug history
- Compatibility macros defined in `common.hpp` MUST be referenced only after including `common.hpp`. Any macro (such as `CSV_HAS_CXX20`) that is defined in `common.hpp` must not be used or checked before `#include "common.hpp"` appears in the file. This ensures feature detection and conditional compilation work as intended across all supported compilers and build modes.
- Do not reference internal functions in public API comments: public API docs should remain user-facing; internal details belong in internal docs
- Public API docs belong on declarations in `.hpp` files: keep user-facing/Doxygen docs on the header declaration; reserve `.cpp` comments for implementation notes, concurrency invariants, performance rationale, and bug history
- Remove meaningless `@param` and `@return` docs when editing a function: if they merely restate the name or signature, delete them instead of preserving noise
- `CSVReader` is non-copyable and move-enabled: prefer explicit ownership transfer (`std::move`) or `std::unique_ptr<CSVReader>` when handing off parser ownership
- Prefer trailing underscore for private members: when touching mixed-style code, normalize the edited region toward names like `source_` and `leftover_`
- Prefer LF (`\n`) line endings for tracked source, test, CMake, and Markdown files: when touching a file with mixed endings, normalize it to LF unless a file-specific reason says otherwise
- Keep preprocessor directives flush left: `#define`, `#if`, `#ifdef`, `#else`, and `#endif` start at column 0; indent code inside multi-line macros exactly as equivalent non-macro code would be
- Prefer user-friendly API constraints: do not narrow template constraints unless required for correctness, safety, or a measured performance win; if common containers/ranges already work, keep them accepted
- Respect compile-time compatibility macros: keep constructs like `IF_CONSTEXPR` and `CONSTEXPR_VALUE` unless there is a correctness bug
- Do not rewrite compile-time logic to silence warnings: prefer tightly scoped suppression at the exact site when needed
- Opportunistic rewrites are allowed when safe and justified — avoid mixing unrelated churn into urgent compiler triage unless requested
- Explain compile-time tradeoffs explicitly — when a change affects compile-time behavior, call out impact on codegen/perf/portability/readability
- Scope guard for build fixes — if a fix grows beyond roughly 3 files or 60 changed lines, pause and confirm scope with justification
- Apply the 5/2 anti-duplication rule — if equivalent behavior exists in 2+ code paths and each copy is ~5+ meaningful lines, extract a shared helper; if duplication remains, document why; keep at least one regression test that exercises each path
- Non-trivial methods go in `.cpp` with `CSV_INLINE`: `CSV_INLINE` is `inline` in the generated single-header and empty otherwise; omitting it causes ODR violations. Exceptions: templated methods must stay in `.hpp` (`init_from_stream` is the standing example); trivial one-liner accessors may stay `inline` in the header when call overhead matters. Consolidate into a single `.hpp` only when the `.cpp` would be under ~100 lines and the split causes excessive comment duplication; consolidated definitions (free functions and methods alike) must use `inline`, not `CSV_INLINE`, to avoid ODR violations across TUs.
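
The header/source split from the last bullet can be sketched as follows. All names here are illustrative assumptions: `DEMO_INLINE` and `DEMO_SINGLE_HEADER` stand in for `CSV_INLINE` and whatever switch the real build uses, and `DemoParser` is not a library class. The two files are shown in one block only so the sketch compiles on its own.

```cpp
// Assumption: DEMO_SINGLE_HEADER would be defined only in the amalgamated
// single-header build. The real macro names and switch may differ.
#ifdef DEMO_SINGLE_HEADER
#define DEMO_INLINE inline
#else
#define DEMO_INLINE
#endif

// --- would live in demo_parser.hpp ---
class DemoParser {
public:
    explicit DemoParser(int rows) : rows_(rows) {}

    // Trivial one-liner accessor: defined in the class body, so it is
    // implicitly inline and safe to include from many translation units.
    int rows() const { return rows_; }

    // Non-trivial method: declared here, defined out of line.
    int rows_after_skip(int skipped) const;

private:
    int rows_;
};

// --- would live in demo_parser.cpp ---
// With DEMO_INLINE empty this is an ordinary definition in one translation
// unit. In a single-header build it becomes `inline`, so the same definition
// can appear in every translation unit that includes the header.
DEMO_INLINE int DemoParser::rows_after_skip(int skipped) const {
    const int remaining = rows_ - skipped;
    return remaining > 0 ? remaining : 0;
}
```

Leaving the macro off the out-of-line definition would be harmless in the split build but would produce duplicate definitions once the `.cpp` body is pasted into a header that several translation units include.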
See tests/AGENTS.md for full test strategy, checklist, and conventions.