See [docs/LOCAL_REPRODUCIBILITY.md](docs/LOCAL_REPRODUCIBILITY.md) for details.
-
---

## Input File Formats
@@ -779,7 +777,7 @@ data = create_plot_input("results/", start=47000000, end=49000000, rank_scores="
## Benchmarks

-All benchmarks were run locally on a macOS 14 / arm64 (Apple M4 Pro) workstation with 14 logical CPUs and 48 GB RAM. Python 3.12, zarr 2.x. Full benchmark scripts are provided under [`scripts/benchmarks/`](scripts/benchmarks). Raw logs, machine specs, and TSV summaries are kept in this repository's manuscript handoff bundle (see `temp_review/hand_off/benchmark_package/`).
+All benchmarks were run locally on a macOS 14 / arm64 (Apple M4 Pro) workstation with 14 logical CPUs and 48 GB RAM. Python 3.12, zarr 2.x. Full benchmark scripts are provided under [`scripts/benchmarks/`](scripts/benchmarks).
See [`docs/BENCHMARKS.md`](docs/BENCHMARKS.md) for the full reference.
-
## Validation on Public Data
ExP Heatmap ships with two reproducible 1000 Genomes Project validation pipelines under [`scripts/validation/`](scripts/validation). Both scripts start from public Phase 3 release URLs and run end-to-end through `filter-vcf` → `prepare` → `compute` → `plot`.
@@ -838,15 +834,15 @@ The `run_1kg_chr15_slc24a5.sh` script reproduces the pigmentation-locus showcase
-|`compute`| 3034.42 s (XP-EHH, all 26 × 26 ordered pairs) |
+|`compute`| 3034.42 s (XP-EHH, all 650 ordered population pairs from 26 populations) |
|`plot` (static) | 28.43 s |
|`plot` (interactive) | 14.04 s |

Output artifacts include a 21.95 GB filtered VCF, 499.69 MB Zarr store, 2.94 GB of pairwise TSV results, and a 442 KB PNG / 21.54 MB HTML heatmap of the SLC24A5 window (chr15:47,924,019-48,924,019).
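
The 650 figure in the `compute` row follows from counting ordered pairs of distinct populations: 26 × 25 = 650, not 26 × 26 = 676, because a population is never compared with itself. A minimal sketch of that count, using the standard 1000 Genomes Phase 3 population codes; the `ordered_pairs` helper is hypothetical and is not ExP Heatmap's own pair-enumeration code:

```python
from itertools import permutations

# The 26 population codes of the 1000 Genomes Project Phase 3 release.
POPULATIONS = [
    "ACB", "ASW", "BEB", "CDX", "CEU", "CHB", "CHS", "CLM", "ESN", "FIN",
    "GBR", "GIH", "GWD", "IBS", "ITU", "JPT", "KHV", "LWK", "MSL", "MXL",
    "PEL", "PJL", "PUR", "STU", "TSI", "YRI",
]


def ordered_pairs(pops):
    """Hypothetical helper: every ordered (a, b) pair with a != b."""
    return list(permutations(pops, 2))


pairs = ordered_pairs(POPULATIONS)
print(len(POPULATIONS), len(pairs))  # 26 650
```

Ordered pairs are counted here because (CEU, YRI) and (YRI, CEU) are listed as separate comparisons; unordered pairs would give 325.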
### chr2 / LCT locus (region-scoped reconstruction)

-The `run_1kg_chr2_lct_reconstruction.sh` script reconstructs the canonical lactase-persistence locus as a region-scoped public-data run. Because the main manuscript LCT figure uses an archived author-prepared bundle, the script does not try to re-run whole-chromosome compute; instead it filters chromosome 2 to the plotted LCT window plus 1 Mb of flanking sequence on each side before preparing and computing.
+The `run_1kg_chr2_lct_reconstruction.sh` script reconstructs the canonical lactase-persistence locus as a region-scoped public-data run. The script does not try to re-run whole-chromosome compute; instead it filters chromosome 2 to the plotted LCT window plus 1 Mb of flanking sequence on each side before preparing and computing.
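
The "window plus 1 Mb of flanking sequence" step can be sketched as simple interval arithmetic. The `padded_region` helper below is hypothetical and is not taken from the script. The example coordinates are placeholders rather than the manuscript's actual LCT window, and the bare `2` contig name (no `chr` prefix) is an assumption about the input VCF:

```python
def padded_region(chrom, start, end, flank=1_000_000):
    """Hypothetical helper: widen a 1-based window by `flank` bp on each side.

    Returns a `chrom:start-end` region string; the start is clamped to 1 so
    that a window near the chromosome start never yields a non-positive
    coordinate.
    """
    if start > end:
        raise ValueError("start must not exceed end")
    padded_start = max(1, start - flank)
    padded_end = end + flank
    return f"{chrom}:{padded_start}-{padded_end}"


# Placeholder window, NOT the real plotted LCT coordinates.
print(padded_region("2", 135_000_000, 137_000_000))  # 2:134000000-138000000
```

Filtering to the padded region rather than the bare plot window leaves room for statistics that look beyond the window edges, while still avoiding a whole-chromosome run.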

| Stage | Wall-clock |
|-------|-----------:|
@@ -856,7 +852,7 @@ The `run_1kg_chr2_lct_reconstruction.sh` script reconstructs the canonical lacta
|`plot` (static) | 4.61 s |
|`plot` (interactive) | 2.11 s |

-This scales the whole-locus reconstruction to ~7 minutes of wall time end-to-end while keeping the plotted window and interpretation identical. See [`docs/VALIDATION_1KG_PIPELINES.md`](docs/VALIDATION_1KG_PIPELINES.md) for details, and [`scripts/validation/summarize_validation_run.py`](scripts/validation/summarize_validation_run.py) to regenerate the JSON/Markdown summaries from log files.
+This scales the whole-locus reconstruction to ~7 minutes of wall time end-to-end while keeping the plotted window and interpretation identical. See [`scripts/validation/summarize_validation_run.py`](scripts/validation/summarize_validation_run.py) to regenerate the JSON/Markdown summaries from log files.
## 1000 Genomes Population Reference
@@ -1002,7 +998,7 @@ The suite currently contains 28 tests covering rank-score generation with ties a
### Building documentation-facing assets

-Reproducibility scripts, benchmark drivers, and validation pipelines live under [`scripts/`](scripts) and are covered in [`docs/`](docs).
+Reproducibility scripts, benchmark drivers, and validation pipelines live under [`scripts/`](scripts).