<feed xmlns='http://www.w3.org/2005/Atom'>
<title>notmuch/test, branch debian/0.22.1-3</title>
<subtitle>thread-based email index, search, and tagging</subtitle>
<id>https://git.notmuchmail.org/git/notmuch/atom?h=debian%2F0.22.1-3</id>
<link rel='self' href='https://git.notmuchmail.org/git/notmuch/atom?h=debian%2F0.22.1-3'/>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/'/>
<updated>2016-08-14T04:27:57Z</updated>
<entry>
<title>test: make gdb even quieter</title>
<updated>2016-08-14T04:27:57Z</updated>
<author>
<name>David Bremner</name>
<email>david@tethera.net</email>
</author>
<published>2016-06-28T21:08:54Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=cf8aabdd3759519ad8c56852fe03908b1b09bc03'/>
<id>urn:sha1:cf8aabdd3759519ad8c56852fe03908b1b09bc03</id>
<content type='text'>
gdb sometimes writes warnings to stdout, which we don't need/want, and
for some reason --batch-silent isn't enough to hide them. So in this commit
we write them to a log file, which is probably better for debugging
anyway. To see an illustrative test failure before this change, run

% make
% touch notmuch-count.c
% cd test &amp;&amp; ./T060-count.sh

(cherry picked from commit f45fa5bdd397d52473f7092f7ae3e2ffb9b7aee5)
</content>
</entry>
<entry>
<title>test: don't use dump and restore in a pipeline</title>
<updated>2016-06-30T15:47:36Z</updated>
<author>
<name>David Bremner</name>
<email>david@tethera.net</email>
</author>
<published>2016-06-28T08:24:07Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=48d33532bb58c2ff61d687011dc0283e3ff536b0'/>
<id>urn:sha1:48d33532bb58c2ff61d687011dc0283e3ff536b0</id>
<content type='text'>
This has been wrong since bbbdf0478ea, but the race condition had not
previously been (often?) triggered in the tests. With the DB_RETRY_LOCK
patches, it manifests itself as a deadlock.
</content>
</entry>
<entry>
<title>complete ghost-on-removal-when-shared-thread-exists</title>
<updated>2016-04-15T10:13:49Z</updated>
<author>
<name>Daniel Kahn Gillmor</name>
<email>dkg@fifthhorseman.net</email>
</author>
<published>2016-04-09T01:54:52Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=e366bb222722d6a635b736e875b760d82b46d1f5'/>
<id>urn:sha1:e366bb222722d6a635b736e875b760d82b46d1f5</id>
<content type='text'>
To fully complete the ghost-on-removal-when-shared-thread-exists
proposal, we need to clear all ghost messages when the last active
message is removed from a thread.

Amended by db: Remove the last test of T530, as it no longer makes sense
if we are garbage collecting ghost messages.
</content>
</entry>
<entry>
<title>fix thread breakage via ghost-on-removal</title>
<updated>2016-04-15T10:07:23Z</updated>
<author>
<name>Daniel Kahn Gillmor</name>
<email>dkg@fifthhorseman.net</email>
</author>
<published>2016-04-09T01:54:48Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=604d1e0977c2ede365f87492d6b9bf9a83c3e1d3'/>
<id>urn:sha1:604d1e0977c2ede365f87492d6b9bf9a83c3e1d3</id>
<content type='text'>
Implement ghost-on-removal, the solution to T590-thread-breakage.sh
that just adds a ghost message after removing each message.

It leaks information about whether we've ever seen a given message id,
but it's a fairly simple implementation.

Note that _resolve_message_id_to_thread_id already introduces new
message_ids to the database, so I think just searching for a given
message ID may introduce the same metadata leakage.
</content>
</entry>
<entry>
<title>test thread breakage when messages are removed and re-added</title>
<updated>2016-04-15T10:07:23Z</updated>
<author>
<name>Daniel Kahn Gillmor</name>
<email>dkg@fifthhorseman.net</email>
</author>
<published>2016-04-09T01:54:47Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=92559ee3473381b0ba207ddb7db944d6ffadc2db'/>
<id>urn:sha1:92559ee3473381b0ba207ddb7db944d6ffadc2db</id>
<content type='text'>
This test (T590-thread-breakage.sh) has known-broken subtests.

If you have a two-message thread where message "B" is in-reply-to "A",
notmuch rightly sees this as a single thread.

But if you:

 * remove "A" from the message store
 * run "notmuch new"
 * add "A" back into the message store
 * re-run "notmuch new"

Then notmuch sees the messages as distinct threads.

This happens because if you insert "B" initially (before anything is
known about "A"), then a "ghost message" gets added to the database in
reference to "A" that is in the same thread, which "A" takes over when
it appears.

But if "A" is subsequently removed, no ghost message is retained, so
when "A" appears, it is treated as a new thread.
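
The sequence above can be reproduced with a toy model. This is only a
sketch: ToyIndex, its dict layout, and the thread numbering are made up
for illustration and are not notmuch's data model or API.

```python
import itertools


class ToyIndex:
    """Hypothetical index: message-id maps to {"thread": n, "ghost": bool}."""

    def __init__(self):
        self.docs = {}
        self._next_thread = itertools.count(1)

    def add(self, msg_id, in_reply_to=None):
        doc = self.docs.get(msg_id)
        if doc is not None:
            # A ghost (or earlier copy) already pins the thread.
            thread = doc["thread"]
        elif in_reply_to in self.docs:
            thread = self.docs[in_reply_to]["thread"]
        else:
            thread = next(self._next_thread)
        self.docs[msg_id] = {"thread": thread, "ghost": False}
        # Leave a ghost for a parent we have not seen yet.
        if in_reply_to is not None and in_reply_to not in self.docs:
            self.docs[in_reply_to] = {"thread": thread, "ghost": True}
        return thread

    def remove(self, msg_id):
        # The behaviour described above: no ghost is retained.
        del self.docs[msg_id]
```

Adding "B" (in reply to "A") and then "A" yields one thread; removing
"A" and adding it again hands "A" a fresh thread number.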

I see a few options to fix this:

ghost-on-removal
----------------

We could unilaterally add a ghost upon message removal.  This has a
few disadvantages: the message index would leak information about what
messages the user has ever been exposed to, and we also create a
perpetually-growing dataset -- the ghosts can never be removed.

ghost-on-removal-when-shared-thread-exists
------------------------------------------

We could add a ghost upon message removal iff there are other
non-ghost messages with the same thread ID.

We'd also need to remove all ghost messages that share a thread when
the last non-ghost message in that thread is removed.

This still has a bit of information leakage, though: the message index
would reveal that I've seen a newer message in a thread, even if I had
deleted it from my message store.
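
A minimal sketch of this rule, assuming a plain dict in place of the
real database (remove_message and the dict layout are hypothetical
names, not notmuch functions):

```python
def remove_message(docs, msg_id):
    """docs maps message-id to {"thread": n, "ghost": bool} (made-up layout)."""
    thread = docs[msg_id]["thread"]
    others_alive = any(
        d["thread"] == thread and not d["ghost"]
        for other_id, d in docs.items()
        if other_id != msg_id
    )
    if others_alive:
        # Another real message shares the thread: keep a ghost behind.
        docs[msg_id]["ghost"] = True
    else:
        # Last real message gone: drop it and every ghost in the thread.
        for other_id in [i for i, d in docs.items() if d["thread"] == thread]:
            del docs[other_id]
```

The second branch is what keeps the set of ghosts from growing forever.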

track-dependencies
------------------

Rather than a simple "ghost-message", we could store all the (A,B)
message-reference pairs internally, showing which messages A reference
which other messages B.

Then removal of message X would require deleting all message-reference
pairs (X,B), and only deleting a ghost message if no (A,X) reference
pair exists.

This requires modifying the database by adding a new and fairly weird
table that would need to be indexed by both columns.  I don't know
whether xapian has nice ways to do that.
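
As a rough illustration of "indexed by both columns", here is an
in-memory sketch. ReferenceTable is a hypothetical name, and nothing
here says how (or whether) Xapian would store such a table.

```python
class ReferenceTable:
    """Stores (A, B) pairs, meaning message A references message B."""

    def __init__(self):
        self.by_source = {}  # A maps to the set of B that A references
        self.by_target = {}  # B maps to the set of A that reference B

    def add(self, source, target):
        self.by_source.setdefault(source, set()).add(target)
        self.by_target.setdefault(target, set()).add(source)

    def remove_message(self, msg_id):
        """Delete all (msg_id, B) pairs.

        Returns True if some (A, msg_id) pair remains, i.e. a ghost
        for msg_id should be kept.
        """
        for target in self.by_source.pop(msg_id, set()):
            sources = self.by_target[target]
            sources.discard(msg_id)
            if not sources:
                del self.by_target[target]
        return bool(self.by_target.get(msg_id))
```

Both lookups are needed: by_source to delete the (X,B) pairs, and
by_target to check for a surviving (A,X) pair.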

scan-dependencies
-----------------

Without modifying the database, we could do something less efficient.

Upon removal of message X, we could scan the headers of all non-ghost
messages that share a thread with X.  If any of those messages refers
to X, we would add a ghost message.  If none of them do, then we would
just drop X entirely from the table.
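
Sketched against a plain dict, assuming each entry already carries the
set of message ids named in its headers ("refs", a made-up field; the
real thing would have to parse the headers again):

```python
def remove_by_scanning(docs, msg_id):
    """docs maps message-id to {"thread": n, "ghost": bool, "refs": set}."""
    thread = docs[msg_id]["thread"]
    referenced = any(
        msg_id in d["refs"]
        for other_id, d in docs.items()
        if other_id != msg_id and d["thread"] == thread and not d["ghost"]
    )
    if referenced:
        # Some remaining real message still points at X: keep a ghost.
        docs[msg_id] = {"thread": thread, "ghost": True, "refs": set()}
    else:
        del docs[msg_id]
    return referenced
```

The cost is a scan over the whole thread on every removal.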

---------------------

One risk of attempted fixes to this problem is that we could fail to
remove the search term indexes entirely.  This test contains
additional subtests to guard against that.

This test also ensures that the right number of ghost messages exist
in each situation; this will help us ensure we don't accumulate ghosts
indefinitely or leak too much information about what messages we've
seen or not seen, while still making it easy to reassemble threads
when messages come in out-of-order.
</content>
</entry>
<entry>
<title>test: add test-binary to print the number of ghost messages</title>
<updated>2016-04-15T10:07:23Z</updated>
<author>
<name>David Bremner</name>
<email>david@tethera.net</email>
</author>
<published>2016-04-09T01:54:46Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=f68e776617175fe77cbd7b29ce0fb2a1011117a8'/>
<id>urn:sha1:f68e776617175fe77cbd7b29ce0fb2a1011117a8</id>
<content type='text'>
This one-liner seems preferable to the complications of depending on
delve, getting the binary name right and parsing the output.
</content>
</entry>
<entry>
<title>lib: fix handling of one character long directory names at top level</title>
<updated>2016-04-12T23:40:19Z</updated>
<author>
<name>Jani Nikula</name>
<email>jani@nikula.org</email>
</author>
<published>2016-04-10T19:43:22Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=a352d9ceaa7e08b7c9de294419ec4c323b81ca15'/>
<id>urn:sha1:a352d9ceaa7e08b7c9de294419ec4c323b81ca15</id>
<content type='text'>
The code to skip multiple slashes in _notmuch_database_split_path()
skips back one character too far. This is compensated by a +1 in the
length parameter to the strndup() call. Mostly this works fine, but if
the path is to a file under a top-level directory with a one-character
name, the directory part is mistaken for part of the file name
(slash == path in code). The returned directory name will be the empty
string and the basename will be the full path, breaking the indexing
logic in notmuch new.

Fix the multiple slash skipping to keep the slash variable pointing at
the last slash, and adjust strndup() accordingly.
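
For illustration only, the intended behaviour can be re-created in
Python with index arithmetic. This is not the C code from the patch;
split_path is a hypothetical stand-in, and slash == 0 plays the role
of slash == path.

```python
def split_path(path):
    """Return (directory, basename) for a relative path."""
    if not path:
        return "", ""
    slash = len(path) - 1
    basename = path
    # Skip trailing slashes.
    while slash != 0 and path[slash] == "/":
        slash -= 1
    # Walk back to the nearest slash, tracking where the basename starts.
    while slash != 0 and path[slash] != "/":
        basename = path[slash:]
        slash -= 1
    # Collapse a run of slashes, but stop while still on a slash.
    while slash != 0 and path[slash - 1] == "/":
        slash -= 1
    if slash == 0:
        # No directory part at all.
        return "", path
    return path[:slash], basename
```

With the last loop peeking at the previous character instead of
stepping off the slash, "a/file" splits into "a" and "file" rather
than an empty directory and the full path.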

The bug was introduced in

commit e890b0cf4011fd9fd77ebd87343379e4a778888b
Author: Carl Worth &lt;cworth@cworth.org&gt;
Date:   Sat Dec 19 13:20:26 2009 -0800

    database: Store the parent ID for each directory document.

just a little over two months after the initial commit in the Notmuch
code history, making this the longest living bug in Notmuch to date.
</content>
</entry>
<entry>
<title>test: test one character long directory names at top level</title>
<updated>2016-04-12T23:37:08Z</updated>
<author>
<name>Jani Nikula</name>
<email>jani@nikula.org</email>
</author>
<published>2016-04-10T19:43:21Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=0f6b399d5b2f925a75a69d7cda38dda5c67db7a1'/>
<id>urn:sha1:0f6b399d5b2f925a75a69d7cda38dda5c67db7a1</id>
<content type='text'>
Yes, it's broken. Reported by h01ger on IRC.
</content>
</entry>
<entry>
<title>test: cope with glass backend file naming variations</title>
<updated>2016-04-12T23:21:09Z</updated>
<author>
<name>David Bremner</name>
<email>david@tethera.net</email>
</author>
<published>2016-04-09T01:49:50Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=e311aad182326a1dcb0f8512e10b0e0f0faa9e2c'/>
<id>urn:sha1:e311aad182326a1dcb0f8512e10b0e0f0faa9e2c</id>
<content type='text'>
In several places in the test suite we intentionally corrupt the Xapian
database in order to test error handling. This corruption is specific to
the on-disk organization of the database, and that changed with the
glass backend. We use the previously computed default backend to make
the tests adapt to changing names.
</content>
</entry>
<entry>
<title>configure: add test for default xapian backend</title>
<updated>2016-04-12T23:14:43Z</updated>
<author>
<name>David Bremner</name>
<email>david@tethera.net</email>
</author>
<published>2016-04-09T01:49:49Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=deb4e5567c42afe834d83868b9337277256a0d66'/>
<id>urn:sha1:deb4e5567c42afe834d83868b9337277256a0d66</id>
<content type='text'>
This is mainly for the test suite.  We already expect the tests to be
run in the same environment as configure was run, at least to get the
name of the python interpreter. So we are not really imposing a new
restriction.
</content>
</entry>
</feed>
