Nathan TeBlunthuis | 8590e5f920 | fix jsonl.d output. | 2025-12-30 11:26:24 -08:00
Nathan TeBlunthuis | 93f6ed0ff5 | fix bug by truncating corrupted jsonl lines. | 2025-12-23 19:52:37 -08:00
Nathan TeBlunthuis | 3f1a9ba862 | refactor and enable jsonl output. | 2025-12-21 23:42:18 -08:00
Nathan TeBlunthuis | 6988a281dc | output parquet files in chunks to avoid memory issues with parquet. | 2025-12-20 21:45:39 -08:00
Nathan TeBlunthuis | 6a4bf81e1a | add test for two wikiq jobs in the same directory. | 2025-12-19 11:50:56 -08:00
Nathan TeBlunthuis | 006feb795c | fix interruption handling by breaking the diff loop. | 2025-12-18 18:00:30 -08:00
Nathan TeBlunthuis | 6b4f3939a5 | more work on resuming. | 2025-12-10 21:07:52 -08:00