Use GitHub Pages for Doxygen docs

vincentlaucsb · vincentlaucsb · commit 8be15671632f · 2026-02-26T22:08:49.000-07:00
diff --git a/README.md b/README.md
@@ -99,7 +99,7 @@ Please keep reports grounded in real use cases—no contrived edge cases or phil
 
 ## Documentation
 
-In addition to the [Features & Examples](#features--examples) below, a [fully-fledged online documentation](https://vincela.com/csv/) contains more examples, details, interesting features, and instructions for less common use cases.
+In addition to the [Features & Examples](#features--examples) below, a [fully-fledged online documentation](https://vincentlaucsb.github.io/csv-parser/) contains more examples, details, interesting features, and instructions for less common use cases.
 
 ## Sponsors
 If you use this library for work, please [become a sponsor](https://github.com/sponsors/vincentlaucsb). Your donation
diff --git a/docs/source/Doxy.md b/docs/source/Doxy.md
@@ -6,39 +6,95 @@ For quick examples, go to this project's [GitHub page](https://github.com/vincen
 ## Outline
 
 ### CSV Reading
+
  * csv::CSVFormat: \copybrief csv::CSVFormat
+   * csv::CSVFormat::chunk_size(): Set chunk size for files with very large rows
  * csv::CSVReader
-  * csv::CSVReader::n_rows(): \copybrief csv::CSVReader::n_rows()
-  * csv::CSVReader::utf8_bom(): \copybrief csv::CSVReader::utf8_bom()
-  * csv::CSVReader::get_format(): \copybrief csv::CSVReader::get_format()
-  * Retrieving data
-      * csv::CSVReader::iterator: Recommended
-        * csv::CSVReader::begin()
-        * csv::CSVReader::end()
-      * csv::CSVReader::read_row()
+   * csv::CSVReader::n_rows(): \copybrief csv::CSVReader::n_rows()
+   * csv::CSVReader::utf8_bom(): \copybrief csv::CSVReader::utf8_bom()
+   * csv::CSVReader::get_format(): \copybrief csv::CSVReader::get_format()
+   * Retrieving data
+     * csv::CSVReader::iterator: Recommended
+       * csv::CSVReader::begin()
+       * csv::CSVReader::end()
+     * csv::CSVReader::read_row()
  * Convenience Functions
-  * csv::parse()
-  * csv::operator ""_csv()
-  * csv::parse_no_header()
-  * csv::operator ""_csv_no_header()
+   * csv::parse()
+   * csv::operator ""_csv()
+   * csv::parse_no_header()
+   * csv::operator ""_csv_no_header()
+ * File Utilities
+   * csv::get_file_info(): Returns row/column counts and detected format for a CSV file
+   * csv::get_col_pos(): Returns the zero-based index of a named column
 
  #### See also
  [Dealing with Variable Length CSV Rows](md_docs_source_variable_row_lengths.html)
 
  #### Working with parsed data
  * csv::CSVRow: \copybrief csv::CSVRow
-  * csv::CSVRow::operator std::vector<std::string>()
-  * csv::CSVRow::iterator
-    * csv::CSVRow::begin()
-    * csv::CSVRow::end()
-  * csv::CSVRow::to_json()
-  * csv::CSVRow::to_json_array()
+   * csv::CSVRow::operator std::vector<std::string>()
+   * csv::CSVRow::iterator
+     * csv::CSVRow::begin()
+     * csv::CSVRow::end()
+   * csv::CSVRow::to_json()
+   * csv::CSVRow::to_json_array()
  * csv::CSVField
-  * csv::CSVField::get(): \copybrief csv::CSVField::get()
-  * csv::CSVField::operator==()
+   * csv::CSVField::get(): \copybrief csv::CSVField::get()
+   * csv::CSVField::operator==()
+
+### DataFrame
+
+An in-memory keyed table built from a csv::CSVReader. Supports O(1) key lookup,
+column extraction, editing, and grouping.
+
+ * csv::DataFrame: Main container class (template parameter is the key type, default `std::string`)
+   * Construction
+     * From csv::CSVReader with an explicit key column
+   * Inspection
+     * csv::DataFrame::n_rows()
+     * csv::DataFrame::n_cols()
+     * csv::DataFrame::size()
+     * csv::DataFrame::empty()
+     * csv::DataFrame::has_column()
+     * csv::DataFrame::get_col_names()
+   * Row access by key
+     * csv::DataFrame::operator[](): Access by key value (O(1))
+     * csv::DataFrame::contains(): Check if a key exists
+     * csv::DataFrame::try_get(): Non-throwing keyed lookup
+   * Row access by position
+     * csv::DataFrame::iloc(): Positional access by index (use instead of `operator[](size_t)` for integer-keyed DataFrames)
+     * csv::DataFrame::try_get(): Non-throwing positional lookup (overloaded on `size_t`)
+   * Column extraction
+     * csv::DataFrame::column(): Extract all values from a named column as `std::vector<T>`
+   * Editing
+     * csv::DataFrame::set(): Edit a cell by key and column name
+     * csv::DataFrame::set_at(): Edit a cell by position and column name
+     * csv::DataFrame::erase_row(): Remove a row by key
+     * csv::DataFrame::erase_row_at(): Remove a row by position
+   * Grouping
+     * csv::DataFrame::group_by(): Group row indices by an arbitrary key function or column name
+   * Iteration
+     * csv::DataFrame::begin()
+     * csv::DataFrame::end()
+ * csv::DataFrameRow: Proxy row object returned by DataFrame access methods
+   * csv::DataFrameRow::operator[](): Access a field by column name
+   * csv::DataFrameRow::size()
+   * csv::DataFrameRow::empty()
+   * csv::DataFrameRow::get_col_names()
+ * csv::DataFrameOptions: Configuration for DataFrame construction
+   * Duplicate key policy (`OVERWRITE` or `KEEP_FIRST`)
+   * `set_key_column()`: Specify which column to use as the key
+   * `set_throw_on_missing_key()`: Control exception behavior for missing keys
 
 ### Statistics
  * csv::CSVStat
+   * csv::CSVStat::get_mean(): Per-column means
+   * csv::CSVStat::get_variance(): Per-column variances
+   * csv::CSVStat::get_mins(): Per-column minimums
+   * csv::CSVStat::get_maxes(): Per-column maximums
+   * csv::CSVStat::get_counts(): Per-column value frequency counts
+   * csv::CSVStat::get_dtypes(): Per-column inferred data types
+   * csv::CSVStat::get_col_names()
 
 ### CSV Writing
  * csv::make_csv_writer(): Construct a csv::CSVWriter
@@ -56,16 +112,7 @@ For quick examples, go to this project's [GitHub page](https://github.com/vincen
 See "How does automatic delimiter detection work?"
 
 ### How does automatic delimiter detection work?
-First, the CSV reader attempts to parse the first 100 lines of a CSV file as if the delimiter were a pipe, tab, comma, etc.
-Out of all the possible delimiter choices, the delimiter which produces the highest number of `rows * columns` (where all rows
-are of a consistent length) is chosen as the winner.
-
-However, if the CSV file has leading comments, or has less than 100 lines, a second heuristic will be used. The CSV reader again
-parses the first 100 lines using each candidate delimiter, but tallies up the length of each row parsed. Then, the delimiter with
-the largest most common row length `n` is chosen as the winner, and the line number where the first row of length `n` occurs
-is chosen as the starting row.
-
-Because you can subclass csv::CSVReader, you can implement your own guessing hueristic. csv::internals::CSVGuesser may be used as a helpful guide in doing so.
+See the implementation in csv::internals::_guess_format() — the source is the authoritative reference and is kept up to date.
 
 ### Is the CSV parser thread-safe?
 This library already does a lot of work behind the scenes to use threads to squeeze