6 Commits

Author SHA1 Message Date
Will Beason
390499dd90 Pin to python 3.9
Since our execution environment requires this

Signed-off-by: Will Beason <willbeason@gmail.com>
2025-06-17 11:37:20 -05:00
Nathan TeBlunthuis
5a10f59dc4 Merge branch 'parquet_support' of gitea:collective/mediawiki_dump_tools into parquet_support
2025-05-28 23:52:59 -05:00
Nathan TeBlunthuis
b8cdc82fc2 add ipython for dev
2025-05-28 23:52:37 -05:00
Nathan TeBlunthuis
2a2b611d79 Fix issue with .7z archives
Before, only fandom wiki dumps were compressed with .7z.
These archives can have several .xml files in the .7z, not just one.
So we need to have a flag for the fandom-2020 dumps.

This fixes the bug so .7z archives work in either case.
2025-05-28 21:49:11 -07:00
Nathan TeBlunthuis
39fec0820d use my version of mwxml since it fixes a bug.
2025-05-28 21:13:18 -07:00
Nathan TeBlunthuis
15e9234903 adding pyproject.toml
2025-05-28 20:59:55 -07:00