README.md: 1 addition & 1 deletion
@@ -99,7 +99,7 @@ Please keep reports grounded in real use cases—no contrived edge cases or phil
## Documentation
-In addition to the [Features & Examples](#features--examples) below, a [fully-fledged online documentation](https://vincela.com/csv/) contains more examples, details, interesting features, and instructions for less common use cases.
+In addition to the [Features & Examples](#features--examples) below, a [fully-fledged online documentation](https://vincentlaucsb.github.io/csv-parser/) contains more examples, details, interesting features, and instructions for less common use cases.
## Sponsors
If you use this library for work, please [become a sponsor](https://github.com/sponsors/vincentlaucsb). Your donation
* csv::CSVStat::get_counts(): Per-column value frequency counts
+* csv::CSVStat::get_dtypes(): Per-column inferred data types
+* csv::CSVStat::get_col_names()
### CSV Writing
* csv::make_csv_writer(): Construct a csv::CSVWriter
@@ -56,16 +112,7 @@ For quick examples, go to this project's [GitHub page](https://github.com/vincen
See "How does automatic delimiter detection work?"
### How does automatic delimiter detection work?
-First, the CSV reader attempts to parse the first 100 lines of a CSV file as if the delimiter were a pipe, tab, comma, etc.
-Out of all the possible delimiter choices, the delimiter which produces the highest number of `rows * columns` (where all rows
-are of a consistent length) is chosen as the winner.
-
-However, if the CSV file has leading comments, or has less than 100 lines, a second heuristic will be used. The CSV reader again
-parses the first 100 lines using each candidate delimiter, but tallies up the length of each row parsed. Then, the delimiter with
-the largest most common row length `n` is chosen as the winner, and the line number where the first row of length `n` occurs
-is chosen as the starting row.
-
-Because you can subclass csv::CSVReader, you can implement your own guessing hueristic. csv::internals::CSVGuesser may be used as a helpful guide in doing so.
+See the implementation in csv::internals::_guess_format() — the source is the authoritative reference and is kept up to date.
### Is the CSV parser thread-safe?
This library already does a lot of work behind the scenes to use threads to squeeze
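
For readers who want a feel for the first heuristic in the removed description (score each candidate delimiter by `rows * columns`, counting only candidates whose rows all have the same length), here is a minimal standalone sketch. It is not the library's code: `count_fields`, `score_delimiter`, and `guess_delimiter` are hypothetical names, quoted fields are ignored, and the candidate list is limited to the comma, pipe, and tab named in the text. The actual behavior lives in csv::internals::_guess_format().

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of the library): number of fields in `line`
// when split on `delim`. Assumption: quoted fields are ignored for brevity.
std::size_t count_fields(const std::string& line, char delim) {
    return static_cast<std::size_t>(
               std::count(line.begin(), line.end(), delim)) + 1;
}

// Hypothetical helper: rows * columns for `delim` over at most `max_lines`
// lines of `text`, or 0 if the rows do not all have the same field count.
std::size_t score_delimiter(const std::string& text, char delim,
                            std::size_t max_lines = 100) {
    std::istringstream in(text);
    std::string line;
    std::size_t rows = 0;
    std::size_t cols = 0;
    while (rows < max_lines && std::getline(in, line)) {
        const std::size_t n = count_fields(line, delim);
        if (rows == 0) {
            cols = n;        // first row fixes the expected width
        } else if (n != cols) {
            return 0;        // inconsistent row length disqualifies delim
        }
        ++rows;
    }
    return rows * cols;
}

// Hypothetical helper: pick the candidate with the highest score.
// Falls back to ',' when no candidate scores above zero.
char guess_delimiter(const std::string& text,
                     const std::vector<char>& candidates = {',', '|', '\t'}) {
    char best = ',';
    std::size_t best_score = 0;
    for (char d : candidates) {
        const std::size_t s = score_delimiter(text, d);
        if (s > best_score) {
            best_score = s;
            best = d;
        }
    }
    return best;
}
```

Calling `guess_delimiter` on the first lines of a pipe-separated file should pick `'|'`, because splitting on any other candidate yields one column per row and therefore a lower score. The second heuristic from the removed text (most common row length, used for files with leading comments or fewer than 100 lines) is not covered by this sketch.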