<feed xmlns='http://www.w3.org/2005/Atom'>
<title>notmuch/notmuch-new.c, branch 0.8</title>
<subtitle>thread-based email index, search, and tagging</subtitle>
<id>https://git.notmuchmail.org/git/notmuch/atom?h=0.8</id>
<link rel='self' href='https://git.notmuchmail.org/git/notmuch/atom?h=0.8'/>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/'/>
<updated>2011-06-29T23:10:41Z</updated>
<entry>
<title>new: Improved workaround for mistaken new directories</title>
<updated>2011-06-29T23:10:41Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@MIT.EDU</email>
</author>
<published>2011-06-29T23:00:01Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=bb2b33fbb85b88eb7d3bedf4cb261a89da20f504'/>
<id>urn:sha1:bb2b33fbb85b88eb7d3bedf4cb261a89da20f504</id>
<content type='text'>
Currently, notmuch new assumes any directory with a database mtime of
0 is new, but we don't set the mtime until after processing messages
and subdirectories in that directory.  Hence, anything that prevents
the mtime update (such as an interruption or the wall-clock logic
introduced in 8c39e8d6) will cause the next notmuch new to think the
directory is still new.

We work around this by setting the new directory's database mtime to
-1 before scanning anything in the new directory.  This also obviates
the need for the workaround used in 8c39e8d6.
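
The ordering described above can be illustrated with a minimal Python
sketch.  This is not the notmuch C code: scan_directory, the db_mtimes
dict, and the scan callback are hypothetical names, and the sketch
assumes only what the message states (a stored mtime of 0 means "new",
and -1 is written before anything is scanned).

```python
def scan_directory(db_mtimes, path, fs_mtime, scan):
    """Scan `path`; return True if it was treated as a new directory.

    db_mtimes is a dict from path to stored mtime (a hypothetical
    stand-in for the database); a missing entry counts as 0.
    """
    is_new = db_mtimes.get(path, 0) == 0
    if is_new:
        # Mark the directory as seen *before* scanning, so that an
        # interrupted scan does not leave it looking new next time.
        db_mtimes[path] = -1
    scan(path)  # may raise, simulating an interruption
    db_mtimes[path] = fs_mtime  # only reached if the scan completed
    return is_new


def interrupted_scan(path):
    raise RuntimeError("interrupted while scanning " + path)


db = {}
try:
    scan_directory(db, "inbox/cur", 1309388400, interrupted_scan)
except RuntimeError:
    pass
after_interrupt = db["inbox/cur"]  # -1: seen, but mtime never saved

# The next run no longer mistakes the directory for a new one.
second_run_was_new = scan_directory(db, "inbox/cur", 1309388400, lambda p: None)
```

Because -1 never matches a real filesystem mtime, a directory left in
this state would still be rescanned by any check that compares the
stored mtime with the filesystem's.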
</content>
</entry>
<entry>
<title>new: Don't update DB mtime if FS mtime equals wall-clock time.</title>
<updated>2011-06-29T22:26:04Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@MIT.EDU</email>
</author>
<published>2011-06-29T07:10:54Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=8c39e8d6fbc1202605494d481b27be6bcccaf500'/>
<id>urn:sha1:8c39e8d6fbc1202605494d481b27be6bcccaf500</id>
<content type='text'>
This fixes a race where multiple message deliveries in the same second
with an intervening notmuch new could result in messages being ignored
by notmuch (at least, until a later delivery forced a rescan).
Because mtimes only have second granularity, later deliveries in the
same second won't change the directory mtime, and hence won't trigger
notmuch new to rescan the directory.  This situation can only occur
when notmuch new runs in the same second as the directory's
modification time, so simply don't update the saved mtime in this
case.

This very race happens all over the test suite, and is currently
compensated for with increment_mtime (and, occasionally, luck).  With
this change, increment_mtime becomes unnecessary.
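
The rule can be sketched in a few lines of Python.  This is an
illustration only, not the notmuch C code: maybe_save_mtime and the
db_mtimes dict are hypothetical names, and timestamps are assumed to be
whole seconds.

```python
def maybe_save_mtime(db_mtimes, path, fs_mtime, now):
    """Save fs_mtime for path unless it equals the wall-clock second.

    Returns True if the stored mtime was updated.
    """
    if int(fs_mtime) == int(now):
        # A later delivery in this same second would not change the
        # directory mtime, so keep the stored value stale; the next
        # run then sees a mismatch and rescans the directory.
        return False
    db_mtimes[path] = fs_mtime
    return True


db = {"inbox/new": 100}
skipped = maybe_save_mtime(db, "inbox/new", 200, now=200)  # same second
stored_after_skip = db["inbox/new"]                        # still 100
saved = maybe_save_mtime(db, "inbox/new", 200, now=201)    # a second later
```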
</content>
</entry>
<entry>
<title>fix sum moar typos [comments in source code]</title>
<updated>2011-06-23T22:58:39Z</updated>
<author>
<name>Pieter Praet</name>
<email>pieter@praet.org</email>
</author>
<published>2011-06-20T20:14:21Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=8bb6f7869c4c98190f010d60409938b1c50c5968'/>
<id>urn:sha1:8bb6f7869c4c98190f010d60409938b1c50c5968</id>
<content type='text'>
Various typo fixes in comments within the source code.

Signed-off-by: Pieter Praet &lt;pieter@praet.org&gt;

Edited-by: Carl Worth &lt;cworth@cworth.org&gt; Restricted to just
source-code comments, (and fixed fix of "descriptios" to "descriptors"
rather than "descriptions").
</content>
</entry>
<entry>
<title>Remove some variables which were set but not used.</title>
<updated>2011-05-11T20:27:14Z</updated>
<author>
<name>Carl Worth</name>
<email>cworth@cworth.org</email>
</author>
<published>2011-05-11T19:34:13Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=2f3a76c569e5efad54520613315c0d29512ce69c'/>
<id>urn:sha1:2f3a76c569e5efad54520613315c0d29512ce69c</id>
<content type='text'>
gcc (at least as of version 4.6.0) is kind enough to point these out to us
(when given -Wunused-but-set-variable explicitly or implicitly via -Wunused
or -Wall).

One of these cases was a legitimately unused variable. Two were simply
variables (named ignored) we were assigning only to squelch a warning about
unused function return values. I don't seem to be getting those warnings
even without setting the ignored variable. And the gcc docs say that the
correct way to squelch that warning is with a cast to (void) anyway.
</content>
</entry>
<entry>
<title>new: Update comments for add_files_recursive</title>
<updated>2011-03-10T19:56:16Z</updated>
<author>
<name>Carl Worth</name>
<email>cworth@cworth.org</email>
</author>
<published>2011-03-10T19:56:16Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=61d4d89572e18eeb0e84e67a3a699289b2c7c9cf'/>
<id>urn:sha1:61d4d89572e18eeb0e84e67a3a699289b2c7c9cf</id>
<content type='text'>
The most recent commit optimized the implementation of this
function. This commit simply updates the relevant comments to match
the new implementation.
</content>
</entry>
<entry>
<title>new: read db_files and db_subdirs only if mtime changed</title>
<updated>2011-03-10T19:48:33Z</updated>
<author>
<name>Karel Zak</name>
<email>kzak@redhat.com</email>
</author>
<published>2011-02-04T21:44:31Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=b0006b6ea2357572637b0c7946dfd074cfe18178'/>
<id>urn:sha1:b0006b6ea2357572637b0c7946dfd074cfe18178</id>
<content type='text'>
The db_files and db_subdirs are unnecessary for unchanged directories.

maildir with 10000 e-mails:

old version:
	$ time ./notmuch new
	No new mail.

	real    0m0.053s
	user    0m0.028s
	sys     0m0.026s

new version:
	$ time ./notmuch new
	No new mail.

	real    0m0.032s
	user    0m0.009s
	sys     0m0.023s

Signed-off-by: Karel Zak &lt;kzak@redhat.com&gt;

Reviewed-by:  Austin Clements &lt;amdragon@mit.edu&gt;

Looks good (faster than, but provably equivalent to the original code!
notmuch_directory_get_child_* are side-effect free,
db_files/db_subdirs aren't used between where they were set in the old
code and where they are set in the new code, and db_files/db_subdirs
are initialized to NULL when declared).

Another timing data point:
Old code: ./notmuch new  0.77s user 0.28s system 99% cpu 1.051 total
New code: ./notmuch new  0.09s user 0.27s system 98% cpu 0.368 total
</content>
</entry>
<entry>
<title>new: Print progress estimates only when we have sufficient information</title>
<updated>2011-01-26T13:47:51Z</updated>
<author>
<name>Michal Sojka</name>
<email>sojkam1@fel.cvut.cz</email>
</author>
<published>2011-01-26T13:06:57Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=c58523088ac7fcbfa841187b1447269b638bfa95'/>
<id>urn:sha1:c58523088ac7fcbfa841187b1447269b638bfa95</id>
<content type='text'>
Without this patch, the remaining time or processing rate could be
calculated just after start, when nothing had been processed yet.
This resulted in division by a very small number (or zero), and the
printed information was of little value.

Instead of printing nonsense, we print only that the operation is in
progress. The estimates are printed later, once there is enough data.
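
A minimal Python sketch of this guard follows. It is not the notmuch
implementation: progress_line, the one-second min_elapsed threshold, and
the message wording are all assumptions made for illustration.

```python
def progress_line(processed, total, elapsed_seconds, min_elapsed=1.0):
    """Return a progress message, with estimates only when meaningful."""
    if processed > 0 and elapsed_seconds >= min_elapsed:
        # Enough data: neither division below can hit zero.
        rate = processed / elapsed_seconds
        remaining = (total - processed) / rate
        return "Processed %d of %d files (%.0f files/sec, %.0f s remaining)." % (
            processed, total, rate, remaining)
    # Too early for estimates: report only that work is in progress.
    return "Processed %d of %d files." % (processed, total)


early = progress_line(0, 100, 0.0)
later = progress_line(50, 100, 10.0)
```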
</content>
</entry>
<entry>
<title>new: Enhance progress reporting</title>
<updated>2011-01-26T12:10:11Z</updated>
<author>
<name>Michal Sojka</name>
<email>sojkam1@fel.cvut.cz</email>
</author>
<published>2011-01-21T09:59:37Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=90a505373ef5a8135357f2da3cdf6837e32c3a7a'/>
<id>urn:sha1:90a505373ef5a8135357f2da3cdf6837e32c3a7a</id>
<content type='text'>
notmuch new reports progress only during the "first" phase when the
files on disk are traversed and indexed. After this phase, other
operations like rename detection and maildir flags synchronization are
performed, but the user is not informed about them. Since these
operations can take significant time, we want to inform the user about
them.

This patch enhances the progress reporting facility that was already
present. The timer that triggers reporting is not stopped after the
first phase but continues to run until all operations are finished. The
rename detection and maildir flag synchronization are enhanced to report
their progress.
</content>
</entry>
<entry>
<title>new: Add all initial tags at once</title>
<updated>2011-01-26T12:05:28Z</updated>
<author>
<name>Michal Sojka</name>
<email>sojkam1@fel.cvut.cz</email>
</author>
<published>2011-01-21T09:59:36Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=7c450905e41d9bc81aca82f4593a5b42a4bb8e31'/>
<id>urn:sha1:7c450905e41d9bc81aca82f4593a5b42a4bb8e31</id>
<content type='text'>
If several tags are applied to new messages, it is beneficial
to store them in the database at once, because it saves some time,
especially when notmuch new is run for the first time.

This patch decreased the time for initial import from 1h 35m to 1h 14m.
</content>
</entry>
<entry>
<title>Do not defer maildir flag synchronization for new messages</title>
<updated>2011-01-26T11:52:54Z</updated>
<author>
<name>Austin Clements</name>
<email>amdragon@mit.edu</email>
</author>
<published>2011-01-26T11:52:54Z</published>
<link rel='alternate' type='text/html' href='https://git.notmuchmail.org/git/notmuch/commit/?id=de2acbd49c8fdb0c5bc28513283a9e12eefdaca3'/>
<id>urn:sha1:de2acbd49c8fdb0c5bc28513283a9e12eefdaca3</id>
<content type='text'>
This is a simplified version of a patch originally by Michal Sojka
&lt;sojkam1@fel.cvut.cz&gt; which is designed to have the same performance
benefits. Michal said the following:

  When notmuch new is run for the first time, it is not necessary to
  defer maildir flags synchronization to later because we already know
  that no files will be removed.

  Performing the maildir flag synchronization immediately after the
  message is added to the database has the advantage that the message
  is likely hot in the disk cache so the synchronization is faster.
  Additionally, we also save one database query for each message,
  which must be performed when the operation is deferred.

  Without this patch, the first notmuch new of 200k messages (3 GB)
  took 1h and 46m out of which 20m was maildir flags
  synchronization. With this patch, the whole operation took only 1h
  and 36m.

Unlike Michal's patch, this version skips the deferral for any new
message, rather than skipping it only on the first run of "notmuch new".
</content>
</entry>
</feed>
