Commit Graph

15 Commits

Author SHA1 Message Date
414cc5ff2d validate tests and add asserts and baselines for regex tests. 2019-11-09 12:19:55 -08:00
sohyeonhwang 4ccde84529 added regex scanner v2's dump unit test file regextest.xml.bz2 2019-11-07 14:06:15 -06:00
sohyeonhwang f147e1d899 merging pull containing revert-radius with 2nd version of regex scanner w/ unit tests 2019-11-07 13:28:17 -06:00
c84844cfb5 add unit tests for configuring revert_radius 2019-10-07 15:02:30 -07:00
7b856bec86 Merge branch 'master' into regex_scanner 2019-10-05 18:17:03 -07:00
324ccc8e26 update baseline outputs 2019-10-05 16:36:07 -07:00
sohyeonhwang 7bf4559ceb changes for regex scanner addition 2019-10-05 15:36:58 -05:00
f7f5bf8fd4 sub assertEquals assertEqual 2018-09-03 11:21:49 -07:00
7cd0bf3b9e Add parameter for selecting specific namespaces. 2018-08-23 18:49:32 -07:00
f468d1a5b6 add support for persistence with segment matching 2018-08-20 16:08:16 -07:00
bf396ad366 Prefix page titles with namespace names. 2018-07-09 22:11:17 -07:00
dba793c6ac migrate to mwxml. This completes the migration away from python-mediawiki-utilities. Except for preserving legacy persistence behavior, we can safely use the nice updates from the mediawiki-utils project. 2018-07-05 01:16:00 -07:00
d77b0a4965 migrate to mwpersistence. this fixes many issues. We preserve legacy persistence behavior using the --persistence-legacy. 2018-07-04 19:06:07 -07:00
e925ac9da1 add tests for wikipedia, malformed xml, bzip2, correct bz2 bug in wikiq. 2018-07-04 15:08:30 -07:00
d2746879d0 create baseline tests for xml dump processing 2018-07-03 23:43:47 -07:00