collective/mediawiki_dump_tools
33 Commits 10 Branches 0 Tags
2d5008113b32082eac56a67e90b8267257df3381

11 Commits

Author | SHA1 | Message | Date
groceryheist | 2d5008113b | add flag for excluding whitespace and punctuation | 2018-12-12 16:38:47 -08:00
groceryheist | 19eda6dd0e | use only a part of the sailormoon wiki | 2018-12-12 16:12:57 -08:00
groceryheist | 4089ebae92 | create state where all tests pass | 2018-12-12 16:08:00 -08:00
groceryheist | 9c5a1b18f0 | add test files | 2018-12-12 14:56:48 -08:00
groceryheist | 7cd0bf3b9e | Add parameter for selecting specific namespaces. | 2018-08-23 18:49:32 -07:00
groceryheist | f468d1a5b6 | add support for persistence with segment matching | 2018-08-20 16:08:16 -07:00
groceryheist | bf396ad366 | Prefix page titles with namespace names. | 2018-07-09 22:11:17 -07:00
groceryheist | dba793c6ac | migrate to mwxml. This completes the migration away from python-mediawiki-utilities. Except for preserving legacy persistence behavior, we can safely use the nice updates from the mediawiki-utils project. | 2018-07-05 01:16:00 -07:00
groceryheist | d77b0a4965 | migrate to mwpersistence. this fixes many issues. We preserve legacy persistence behavior using the --persistence-legacy. | 2018-07-04 19:06:07 -07:00
groceryheist | e925ac9da1 | add tests for wikipedia, malformed xml, bzip2, correct bz2 bug in wikiq. | 2018-07-04 15:08:30 -07:00
groceryheist | d2746879d0 | create baseline tests for xml dump processing | 2018-07-03 23:43:47 -07:00