<feed xmlns='http://www.w3.org/2005/Atom'>
<title>notmuch/notmuch-dump.c, branch 0.15.1</title>
<subtitle>thread-based email index, search, and tagging</subtitle>
<id>https://git.notmuchmail.org/git/notmuch/atom?h=0.15.1</id>
<link rel='self' href='https://git.notmuchmail.org/git/notmuch/atom?h=0.15.1'/>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/'/>
<updated>2013-01-07T02:40:32Z</updated>
<entry>
<title>dump/restore: Use Xapian queries for batch-tag format</title>
<updated>2013-01-07T02:40:32Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@MIT.EDU</email>
</author>
<published>2013-01-06T20:22:41Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=425e2bc81263230df301c67d93c64ff9685ff840'/>
<id>urn:sha1:425e2bc81263230df301c67d93c64ff9685ff840</id>
<content type='text'>
This switches the new batch-tag format away from using a home-grown
hex-encoding scheme for message IDs in the dump to simply using Xapian
queries with Xapian quoting syntax.
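
As a rough illustration of the quoting involved, here is a minimal
Python sketch (not the code from this commit). The helper name
quote_message_id is hypothetical, and it assumes that a literal double
quote inside a quoted term is written as two double quotes.

```python
def quote_message_id(message_id):
    """Return an id: query term with the message ID in double quotes.

    Assumption: inside the quoted term, a literal double quote is
    represented by doubling it.
    """
    escaped = message_id.replace('"', '""')
    return 'id:"' + escaped + '"'


if __name__ == "__main__":
    # A message ID containing spaces stays a single term once quoted.
    print(quote_message_id("foo and bar"))
```

With this form, a message ID containing spaces reads as one quoted
term rather than as several query words.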

This has a variety of advantages beyond presenting a cleaner and more
consistent interface.  Foremost is that it will dramatically simplify
the quoting for batch tagging, which shares the same input format.
While the hex-encoding is no better or worse for the simple ID queries
used by dump/restore, it becomes onerous for general-purpose queries
used in batch tagging.  It also better handles strange cases like
"id:foo and bar", since this is no longer syntactically valid.
</content>
</entry>
<entry>
<title>dump: Disallow \n in message IDs</title>
<updated>2013-01-07T02:40:01Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@MIT.EDU</email>
</author>
<published>2013-01-06T20:22:40Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=d08c714b6a172cf0018bee4f60aff069d5508d4e'/>
<id>urn:sha1:d08c714b6a172cf0018bee4f60aff069d5508d4e</id>
<content type='text'>
When we switch to using regular Xapian queries in the dump format, \n
will cause problems, so we disallow it.  Specifically, while Xapian can
quote and parse queries containing \n without difficulty, quoted
queries containing \n still span multiple lines, which breaks the
line-orientedness of the dump format.  Strictly speaking, we could
still round-trip these, but it would significantly complicate restore
as well as scripts that deal with tag dumps.  This complexity would
come at absolutely no benefit: because of the RFC 2822 unfolding
rules, no amount of standards negligence can produce a message with a
message ID containing a line break (not even Outlook can do it!).

Hence, we simply disallow it.
</content>
</entry>
<entry>
<title>notmuch-dump: add --format=(batch-tag|sup)</title>
<updated>2012-12-08T14:40:54Z</updated>
<author>
<name>David Bremner</name>
<email>bremner@debian.org</email>
</author>
<published>2012-06-14T22:08:42Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=bfe66da4321ce63b6fb3693eddf809e2e0532888'/>
<id>urn:sha1:bfe66da4321ce63b6fb3693eddf809e2e0532888</id>
<content type='text'>
sup is the old format, and remains the default, at least until
restore is converted to parse this format.

Each line of the batch-tag format is modelled on the syntax of notmuch tag:
- "notmuch tag" is omitted from the front of the line
- The dump format only uses query strings of a single message-id.
- Each space-separated tag/message-id is 'hex-encoded' to remove
  trouble-making characters.
- It is permitted (and will be useful) for there to be no tags before
  the query.

In particular this format won't have the same problem with e.g. spaces
in message-ids or tags; they will be round-trip-able.
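
A minimal Python sketch of how such a line could be built and read
back, for illustration only. The '%' escape character, the set of
characters left unencoded, and the helper names hex_encode, hex_decode
and batch_tag_line are all assumptions, not taken from this commit.

```python
import string

# Assumption: letters, digits and a few punctuation characters pass
# through unchanged; the real safe set may differ.
SAFE = set(string.ascii_letters + string.digits + "-_.@")


def hex_encode(text):
    """Replace every byte outside SAFE with '%' plus two hex digits."""
    out = []
    for byte in text.encode("utf-8"):
        ch = chr(byte)
        if ch in SAFE:
            out.append(ch)
        else:
            out.append("%{:02x}".format(byte))
    return "".join(out)


def hex_decode(text):
    """Inverse of hex_encode."""
    raw = bytearray()
    i = 0
    while i != len(text):
        if text[i] == "%":
            raw.append(int(text[i + 1:i + 3], 16))
            i += 3
        else:
            raw.append(ord(text[i]))
            i += 1
    return raw.decode("utf-8")


def batch_tag_line(tags, message_id):
    """Build one dump line: encoded +tags, then '--', then the id: query."""
    parts = ["+" + hex_encode(tag) for tag in tags]
    parts += ["--", "id:" + hex_encode(message_id)]
    return " ".join(parts)


if __name__ == "__main__":
    print(batch_tag_line(["inbox", "to do"], "foo bar@example.com"))
```

Because spaces inside a tag or message-id are encoded, the only
literal spaces left on the line are the separators between fields.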
</content>
</entry>
<entry>
<title>notmuch-dump: tidy formatting</title>
<updated>2012-11-16T12:46:31Z</updated>
<author>
<name>David Bremner</name>
<email>bremner@debian.org</email>
</author>
<published>2012-11-15T01:33:22Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=4c38148c20cd055f47b9c6ea7858bbd885a92354'/>
<id>urn:sha1:4c38148c20cd055f47b9c6ea7858bbd885a92354</id>
<content type='text'>
More uncrustify at work.
</content>
</entry>
<entry>
<title>notmuch-dump: remove deprecated positional argument for output file</title>
<updated>2012-08-06T11:52:33Z</updated>
<author>
<name>David Bremner</name>
<email>bremner@debian.org</email>
</author>
<published>2012-08-04T02:23:11Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=760e17488e6b11299f2971cf879b109b84816d14'/>
<id>urn:sha1:760e17488e6b11299f2971cf879b109b84816d14</id>
<content type='text'>
The syntax --output=filename is a smaller change than deleting the
output argument completely, and conceivably useful e.g. when running
notmuch under a debugger.
</content>
</entry>
<entry>
<title>lib/cli: Make notmuch_database_open return a status code</title>
<updated>2012-05-05T13:11:57Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@MIT.EDU</email>
</author>
<published>2012-04-30T16:25:33Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=5fddc07dc31481453c1af186bf7da241c00cdbf1'/>
<id>urn:sha1:5fddc07dc31481453c1af186bf7da241c00cdbf1</id>
<content type='text'>
It has been a long-standing issue that notmuch_database_open doesn't
return any indication of why it failed.  This patch changes its
prototype to return a notmuch_status_t and set an out-argument to the
database itself, like other functions that return both a status and an
object.

In the interest of atomicity, this also updates every use in the CLI
so that notmuch still compiles.  Since this patch does not update the
bindings, the Python bindings test fails.
</content>
</entry>
<entry>
<title>Use notmuch_database_destroy instead of notmuch_database_close</title>
<updated>2012-04-28T12:27:33Z</updated>
<author>
<name>Justus Winter</name>
<email>4winter@informatik.uni-hamburg.de</email>
</author>
<published>2012-04-22T12:07:53Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=6f7469f54744656f90ce215f365d5731e16acd3c'/>
<id>urn:sha1:6f7469f54744656f90ce215f365d5731e16acd3c</id>
<content type='text'>
Adapt the notmuch binaries source to the notmuch_database_close split.

Signed-off-by: Justus Winter &lt;4winter@informatik.uni-hamburg.de&gt;
</content>
</entry>
<entry>
<title>notmuch-dump: convert to command-line-arguments</title>
<updated>2011-12-09T00:24:24Z</updated>
<author>
<name>David Bremner</name>
<email>bremner@debian.org</email>
</author>
<published>2011-12-02T06:08:51Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=7ced2e32d1612aa5eb154b31705236f2483f364e'/>
<id>urn:sha1:7ced2e32d1612aa5eb154b31705236f2483f364e</id>
<content type='text'>
The output file is handled via positional arguments. There are
currently no "normal" options.
</content>
</entry>
<entry>
<title>notmuch-dump.c: whitespace cleanup</title>
<updated>2011-12-04T15:28:42Z</updated>
<author>
<name>David Bremner</name>
<email>bremner@debian.org</email>
</author>
<published>2011-12-03T22:32:18Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=1c81e8f6d3aa451cec8524c171b9a64f7ecd2003'/>
<id>urn:sha1:1c81e8f6d3aa451cec8524c171b9a64f7ecd2003</id>
<content type='text'>
</content>
</entry>
<entry>
<title>dump: Don't sort the output by message id.</title>
<updated>2011-11-28T15:57:45Z</updated>
<author>
<name>Thomas Schwinge</name>
<email>thomas@schwinge.name</email>
</author>
<published>2011-11-27T18:40:53Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=3a0a7303368a515acc8e73bd211818e852b7e18c'/>
<id>urn:sha1:3a0a7303368a515acc8e73bd211818e852b7e18c</id>
<content type='text'>
Asking Xapian to sort the messages for us causes suboptimal IO patterns. Sorting
would be useful if we only wanted the first few results, but since we want
everything anyway, it is a pessimization.

On 2011-10-29, a measurement on an instance with 372981 messages showed that wall
time can be reduced from 28 minutes (sorted by Message-ID) to 15 minutes
(unsorted).

Timings on 189605 messages:

$ time notmuch.old dump
19.48user 5.83system 12:10.42elapsed 3%CPU (0avgtext+0avgdata 110656maxresident)k
3629584inputs+22720outputs (33major+7073minor)pagefaults 0swaps
$ echo 3 &gt; /proc/sys/vm/drop_caches
$ time notmuch.new
14.89user 1.20system 3:23.58elapsed 7%CPU (0avgtext+0avgdata 46032maxresident)k
1256264inputs+22464outputs (43major+1990minor)pagefaults 0swaps
</content>
</entry>
</feed>
