notmuch/test/T680-html-indexing.sh, branch master

thread-based email index, search, and tagging
https://git.notmuchmail.org/git/notmuch/atom?h=master
2017-10-20T22:52:49Z

test: use $(dirname "$0") for sourcing test-lib.sh
2017-10-20T22:52:49Z
Jani Nikula <jani@nikula.org>
2017-09-25T20:38:19Z
urn:sha1:a863de1e43ee34f6f5794a2759fdceb287e851aa

Don't assume the tests are always run from within the source tree.

lib/index: add simple html filter
2017-07-01T15:32:27Z
David Bremner <david@tethera.net>
2017-06-08T02:11:49Z
urn:sha1:6dd00d64863dfc0563877ca7899231b8c3058c49

The filter just drops all (HTML) tags. As an enabling change, pass the content type to the filter constructor so we can decide which scanner to use.

test: add known broken test for indexing html
2017-04-20T09:59:40Z
David Bremner <david@tethera.net>
2017-03-22T11:23:00Z
urn:sha1:77c9ec1fddcbe145facfc3d65eee55b11ad61fb9

'quite' on IRC reported that notmuch new was grinding to a halt during initial indexing, and we eventually narrowed the problem down to some html parts with large embedded images. These cause the number of terms added to the Xapian database to explode (the first 400 messages generated 4.6M unique terms), and of course the resulting terms are not much use for searching.

The second test is a sanity check for any "improved" indexing of HTML.
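
The idea behind the "lib/index: add simple html filter" entry (drop every HTML tag before the text is indexed) can be illustrated with a small sketch. This is not notmuch's implementation, which lives in the C/C++ indexing code; the function names `strip_tags` and `unique_terms`, the whitespace-based term splitting, and the sample input are all hypothetical, chosen only to show why an image embedded in a tag stops producing index terms once tags are dropped.

```python
def strip_tags(text):
    """Drop everything from '<' through the next '>' and keep the rest.

    Illustrative only: a '>' inside a quoted attribute value would end
    the tag early, which a real HTML scanner has to handle.
    """
    out = []
    in_tag = False
    for ch in text:
        if in_tag:
            if ch == ">":
                in_tag = False
                # Emit a space so words on either side of a tag stay separate.
                out.append(" ")
        elif ch == "<":
            in_tag = True
        else:
            out.append(ch)
    return "".join(out)


def unique_terms(text):
    """Very rough stand-in for term generation: lowercase, split on whitespace."""
    return set(text.lower().split())


# Hypothetical HTML part with an inline image; the space inside the
# base64 payload stands in for the line wrapping found in real mail.
html = '<p>hello <img src="data:image/png;base64,iVBORw0KGgo AAAANSUhEUg"> world</p>'

raw_terms = unique_terms(html)
filtered_terms = unique_terms(strip_tags(html))

print(len(raw_terms), sorted(filtered_terms))
```

Without the filter, fragments of the image data become terms of their own; with it, only the visible words remain. With a large embedded image the gap between the two counts grows with the size of the payload, which is consistent with the term explosion described in the oldest entry.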