To install from git, first clone the repository::

  git clone git://projects.mako.cc/mediawiki_dump_tools

From within the repository working directory, initialize and update the
submodule::

  git submodule init
  git submodule update


Wikimedia dumps are usually distributed in a compressed format such as 7z
(most common), gz, or bz2. Wikiq reads these files by calling the
decompression programs installed on your system, so it depends on ``7za``,
``gzcat``, and ``zcat``.
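
As a rough illustration of that approach (not wikiq's actual code), the sketch
below picks a decompression command from the file suffix. The function name
``decompress_command`` is hypothetical, and the use of ``bzcat`` for bz2 files
is an assumption, since this README does not name a bz2 tool.

```python
import os


def decompress_command(path):
    """Return an argv list that writes the decompressed dump to stdout.

    Hypothetical helper for illustration; wikiq may build its
    commands differently.
    """
    suffix = os.path.splitext(path)[1].lower()
    if suffix == ".7z":
        # "x" extracts, "-so" sends the extracted data to stdout.
        return ["7za", "x", "-so", path]
    if suffix == ".gz":
        return ["zcat", path]
    if suffix == ".bz2":
        # Assumption: bzcat is not listed in this README.
        return ["bzcat", path]
    # Anything else is treated as uncompressed and passed through.
    return ["cat", path]
```

The returned list can be handed to ``subprocess.Popen(..., stdout=subprocess.PIPE)``
so that the dump is streamed rather than unpacked to disk first.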

Dependencies
----------------
These non-Python dependencies must be installed on your system for wikiq and its
associated tests to work.

- 7zip
- ffmpeg
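
A quick way to confirm these are installed is to look for their executables on
your ``PATH``. The snippet below is a minimal sketch; the helper name
``missing_tools`` is hypothetical, and the executable names passed to it
(``7za`` for 7zip, ``ffmpeg``) are assumptions that may differ on your
platform.

```python
import shutil


def missing_tools(names):
    """Return the subset of executable names not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust for your system.
    missing = missing_tools(["7za", "ffmpeg"])
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```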

Tests
-----
To run tests::

   python -m unittest test.Wikiq_Unit_Test

TODO:
_______________
1. [] Output metadata about the run. What parameters were used? What versions of deltas?
2. [] URL encoding by default