Commit Graph

10 Commits

Author              SHA1        Message                                                  Date
Nathan TeBlunthuis  6d03cac28d  decrease batch_size.                                     2025-07-15 19:37:26 -07:00
Nathan TeBlunthuis  3a44cfd4da  increase batch size.                                     2025-07-15 19:09:36 -07:00
Nathan TeBlunthuis  0fbe788e31  use ichunked instead of chunked.                         2025-07-15 18:25:44 -07:00
Nathan TeBlunthuis  6b04791de2  reduce batch size.                                       2025-07-15 15:31:00 -07:00
Nathan TeBlunthuis  507335941d  Revert "Merge branch 'compute-diffs' into HEAD"          2025-07-15 15:23:50 -07:00
                                This reverts commit 907a35323e, reversing changes made to c40506137b.
Nathan TeBlunthuis  907a35323e  Merge branch 'compute-diffs' into HEAD                   2025-07-15 15:23:13 -07:00
Nathan TeBlunthuis  c40506137b  make wikiq memory efficient again via batch processing.  2025-07-15 15:20:17 -07:00
Nathan TeBlunthuis  e53e7ada5d  try fixing the memory problem.                           2025-07-14 18:58:27 -07:00
Nathan TeBlunthuis  76d54ae597  support partitioning output parquet by namespace.        2025-07-07 20:58:43 -07:00
Nathan TeBlunthuis  c597a6b7f4  refactor into src-layout package.                        2025-07-07 20:14:13 -07:00