
To install this from git, first clone the repository:

git clone git://projects.mako.cc/mediawiki_dump_tools

From within the repository working directory, initialize and update the submodule:

git submodule init
git submodule update

Wikimedia dumps are usually in a compressed format such as 7z (most common), gz, or bz2. Wikiq uses your computer's compression software to read these files. Therefore wikiq depends on 7za, gzcat, and zcat.
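As an illustration of reading a dump through the system's compression tools, here is a minimal Python sketch. It is not wikiq's actual code: the names `DECOMPRESSORS`, `decompress_command`, and `open_dump` are hypothetical, and the use of `bzcat` for bz2 files is an assumption, since only 7za, gzcat, and zcat are named above.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from file extension to a command that writes the
# decompressed data to stdout. "bzcat" for .bz2 is an assumption.
DECOMPRESSORS = {
    ".7z": ["7za", "x", "-so"],
    ".gz": ["zcat"],
    ".bz2": ["bzcat"],
}


def decompress_command(path):
    """Return the argument list that streams `path` to stdout.

    Files with an unrecognized extension are passed through `cat`.
    """
    suffix = Path(path).suffix.lower()
    return DECOMPRESSORS.get(suffix, ["cat"]) + [str(path)]


def open_dump(path):
    """Start the external tool and return its stdout as a binary stream."""
    proc = subprocess.Popen(decompress_command(path), stdout=subprocess.PIPE)
    return proc.stdout
```

Streaming from a subprocess this way avoids writing the decompressed dump to disk, which matters because uncompressed dumps can be very large.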

Dependencies

These non-Python dependencies must be installed on your system for wikiq and its associated tests to work.

  • 7zip
  • ffmpeg
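One quick way to check whether these tools are available is to look them up on your PATH. The sketch below is only an illustration: `missing_tools` is a hypothetical helper, and the executable names `7za` and `ffmpeg` are assumptions that may differ on your system (7zip is sometimes installed as `7z` or `7zz`).

```python
import shutil


def missing_tools(names):
    """Return the names that have no matching executable on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them for your system.
    missing = missing_tools(["7za", "ffmpeg"])
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All dependencies found.")
```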

A new diff engine based on wikidiff2 can be used for word persistence. Wikiq can also output the diffs between each page revision. Both features require installing wikidiff2 on your system. On Debian or Ubuntu Linux this can be done via:

apt-get install php-wikidiff2

You may also have to run:

sudo phpenmod wikidiff2

Tests

To run tests:

python -m unittest test.Wikiq_Unit_Test

TODO:

  1. versions of deltas?