Will Beason 96915a074b Add call to compute diffs via local PHP server
This is inefficient as it requires an individual request per diff.

Going to try collecting the revision texts to reduce communication
overhead.

Signed-off-by: Will Beason <willbeason@gmail.com>
2025-06-23 13:09:27 -05:00
test Add call to compute diffs via local PHP server 2025-06-23 13:09:27 -05:00
.gitignore Allow specifying output file basename instead of just directory 2025-06-02 14:13:13 -05:00
.gitmodules migrate to mwpersistence. this fixes many issues. We preserve legacy persistence behavior using the --persistence-legacy. 2018-07-04 19:06:07 -07:00
.python-version Pin to python 3.9 2025-06-17 11:37:20 -05:00
pyproject.toml Merge branch 'parquet_support' into test-parquet 2025-06-17 12:20:19 -05:00
README.rst Make tests runnable from anywhere 2025-05-27 13:40:57 -05:00
tables.py Merge branch 'parquet_support' into test-parquet 2025-06-17 12:20:19 -05:00
wikiq Add call to compute diffs via local PHP server 2025-06-23 13:09:27 -05:00

To install this from git, first clone the repository::

  git clone git://projects.mako.cc/mediawiki_dump_tools

From within the repository working directory, initialize and update the
submodule::

  git submodule init
  git submodule update


Wikimedia dumps are usually distributed in a compressed format such as 7z
(most common), gz, or bz2. Wikiq reads these files using the compression tools
installed on your system, so it depends on `7za`, `gzcat`, and `zcat`.
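
As a rough illustration of reading a dump through an external tool, the sketch
below picks a decompression command from the file suffix and streams its
output. The names ``DECOMPRESSORS``, ``decompress_command``, and ``open_dump``
are hypothetical, and the command-line flags are assumptions rather than
wikiq's actual invocation. bz2 is left out because this README does not name a
tool for it.

```python
import subprocess
from pathlib import Path

# Hypothetical mapping from file suffix to a command that writes the
# decompressed dump to stdout. The flags are assumptions, not wikiq's own.
DECOMPRESSORS = {
    ".7z": ["7za", "x", "-so"],
    ".gz": ["zcat"],
}


def decompress_command(path):
    """Return the argument list that decompresses `path` to stdout."""
    suffix = Path(path).suffix.lower()
    try:
        return DECOMPRESSORS[suffix] + [str(path)]
    except KeyError:
        raise ValueError(f"unsupported dump format: {suffix!r}") from None


def open_dump(path):
    """Start the external tool and return a binary stream of its output."""
    proc = subprocess.Popen(decompress_command(path), stdout=subprocess.PIPE)
    return proc.stdout
```

Streaming through a pipe like this avoids writing the decompressed dump to
disk, which matters because full-history dumps can be very large.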

Dependencies
----------------
These non-Python dependencies must be installed on your system for wikiq and its
associated tests to work.

- 7zip
- ffmpeg
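
Before running wikiq or its tests, it can help to confirm that the external
tools are on your ``PATH``. The following is a minimal sketch; the helper name
``missing_tools`` is hypothetical, and the executable names ``7za`` and
``ffmpeg`` are assumptions about how these packages are installed on your
system.

```python
import shutil


def missing_tools(names):
    """Return the executables from `names` that are not found on PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    # Assumed executable names; adjust them for your platform.
    missing = missing_tools(["7za", "ffmpeg"])
    if missing:
        print("Missing dependencies: " + ", ".join(missing))
    else:
        print("All external dependencies found.")
```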

Tests
-----
To run tests::

   python -m unittest test.Wikiq_Unit_Test

TODO:
_______________
1. [] Output metadata about the run. What parameters were used? What versions of deltas?
2. [] URL encoding by default