
To install from git, first clone the repository::

  git clone git://projects.mako.cc/mediawiki_dump_tools

From within the repository working directory, initialize and set up the
submodule::

  git submodule init
  git submodule update


Wikimedia dumps are usually distributed in a compressed format such as 7z
(most common), gz, or bz2. Wikiq reads these files using the decompression
tools installed on your system, so it depends on ``7za``, ``gzcat``, and
``zcat``.
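
One quick way to confirm that these tools are available is to look them up on
your ``PATH`` before running wikiq. The following is a minimal sketch and is
not part of wikiq itself: the helper name ``missing_tools`` is hypothetical,
and the default tool list is simply the one named above::

  import shutil

  def missing_tools(tools=("7za", "gzcat", "zcat")):
      """Return the names in `tools` that cannot be found on PATH."""
      # shutil.which returns None when no matching executable is found.
      return [tool for tool in tools if shutil.which(tool) is None]

  if __name__ == "__main__":
      missing = missing_tools()
      if missing:
          print("Missing decompression tools: " + ", ".join(missing))
      else:
          print("All decompression tools found.")

If a tool is reported missing, install it with your system's package manager
before processing dumps in the corresponding format.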

Dependencies
----------------
These non-Python dependencies must be installed on your system for wikiq and its
associated tests to work.

- 7zip
- ffmpeg

Tests
-----
To run tests::

   python -m unittest test.Wikiq_Unit_Test

TODO:
_______________
1. [] Output metadata about the run. What parameters were used? What versions of deltas?
2. [] URL encoding by default