notmuch - thread-based email index, search, and tagging

Age	Commit message	Author
2014-11-06	lib: bump LIBNOTMUCH_MAJOR_VERSION to 4	David Bremner
	This should have been done at the same time as the SONAME bump.
2014-10-25	lib: Remove unnecessary thread linking steps when using ghost messages	Austin Clements
	Previously, it was necessary to link new messages to children to work around some (though not all) problems with the old metadata-based approach to stored thread IDs. With ghost messages, this is no longer necessary, so don't bother with child linking when ghost messages are in use.
2014-10-25	lib: Enable ghost messages feature	Austin Clements
	This fixes the broken thread order test.
2014-10-25	lib: Implement upgrade to ghost messages feature	Austin Clements
	Somehow this is the first upgrade pass that actually does any error checking, so this also adds the bit of necessary infrastructure to handle that.
2014-10-25	lib: Implement ghost-based thread linking	Austin Clements
	This updates the thread linking code to use ghost messages instead of user metadata to link messages into threads. In contrast with the old approach, this is actually correct. Previously, thread merging updated only the thread IDs of message documents, not thread IDs stored in user metadata. As originally diagnosed by Mark Walters [1] and as demonstrated by the broken T260-thread-order test, this can cause notmuch to fail to link messages even though they're in the same thread. In principle the old approach could have been fixed by updating the user metadata thread IDs as well, but these are not indexed and hence this would have required a full scan of all stored thread IDs. Ghost messages solve this problem naturally by reusing the exact same thread ID and message ID representation and indexing as regular messages. Furthermore, thanks to this greater symmetry, ghost messages are also algorithmically simpler. We continue to support the old user metadata format, so this patch can't delete any code, but when we do remove support for the old format, several functions can simply be deleted. [1] id:8738h7kv2q.fsf@qmul.ac.uk
2014-10-25	lib: Internal support for querying and creating ghost messages	Austin Clements
	This updates the message abstraction to support ghost messages: it adds a message flag that distinguishes regular messages from ghost messages, and an internal function for initializing a newly created (blank) message as a ghost message.
2014-10-25	lib: Introduce macros for bit operations	Austin Clements
	These macros help clarify basic bit-twiddling code and are written to be robust against C undefined behavior of shift operators.
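To illustrate the kind of macro this entry describes, here is a minimal sketch. The names `BIT_TEST`, `BIT_SET`, and `BIT_CLEAR` are hypothetical and are not taken from the notmuch source. The sketch assumes the robustness concern is the shift count: in C, shifting by a negative amount, or by at least the width of the type, is undefined behavior, so every shift is guarded by a range check.

```c
#include <limits.h>

/* Hypothetical names for illustration; not the macros from the notmuch tree. */

/* Number of bits in the type that is shifted below. */
#define BITS_WIDTH (CHAR_BIT * sizeof (unsigned long long))

/* True if 'bit' is a legal shift count for unsigned long long. */
#define BIT_IN_RANGE(bit) \
    ((bit) >= 0 && (unsigned long long) (bit) < BITS_WIDTH)

/* Evaluates to 1 if 'bit' is set in 'val', and to 0 if it is clear or if
 * 'bit' is out of range. The shift is only evaluated when it is defined. */
#define BIT_TEST(val, bit) \
    (BIT_IN_RANGE (bit) ? !!((val) & (1ull << (bit))) : 0)

/* Set 'bit' in '*valp'. Out-of-range bits are silently ignored. */
#define BIT_SET(valp, bit) \
    do { if (BIT_IN_RANGE (bit)) *(valp) |= (1ull << (bit)); } while (0)

/* Clear 'bit' in '*valp'. Out-of-range bits are silently ignored. */
#define BIT_CLEAR(valp, bit) \
    do { if (BIT_IN_RANGE (bit)) *(valp) &= ~(1ull << (bit)); } while (0)
```

With these definitions, a caller can write `BIT_SET (&flags, n)` for a run-time `n` without first proving that `n` is a valid bit index.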
2014-10-25	lib: Update database schema doc for ghost messages	Austin Clements
	This describes the structure of ghost mail documents. Ghost messages are not yet implemented.
2014-10-25	lib: Add a ghost messages database feature	Austin Clements
	This will be implemented over the next several patches. The feature is not yet "enabled" (this does not add it to NOTMUCH_FEATURES_CURRENT).
2014-10-11	lib: Handle empty date value	Austin Clements
	In the interest of robustness, avoid undefined behavior of sortable_unserialise if the date value is missing. This shouldn't happen now, but ghost messages will have blank date values.
2014-10-11	lib: Refactor _notmuch_database_link_message	Austin Clements
	This moves the code to retrieve and clear the metadata thread ID out of _notmuch_database_link_message into its own function. This will simplify future changes.
2014-10-11	lib: Move message ID compression to _notmuch_message_create_for_message_id	Austin Clements
	Previously, this was performed by notmuch_database_add_message. This happens to be the only caller currently (which is why this was safe), but we're about to introduce more callers, and it makes more sense to put responsibility for ID compression in the lower-level function rather than requiring each caller to handle it.
2014-10-03	lib: Simplify close and codify aborting atomic section	Austin Clements
	In Xapian, closing a database implicitly aborts any outstanding transaction and commits changes. For historical reasons, notmuch_database_close had grown to almost, but not quite, duplicate this behavior. Before closing the database, it would explicitly (and unnecessarily) commit it. However, if there was an outstanding transaction (i.e., an atomic section), commit would throw a Xapian exception, which notmuch_database_close would unnecessarily print to stderr, even though notmuch_database_close would ultimately abort the transaction anyway when it called close. This patch simplifies notmuch_database_close to explicitly abort any outstanding transaction and then just call Database::close. This works for both read-only and read/write databases, takes care of committing changes, unifies the exception handling path, and codifies aborting outstanding transactions. This is currently the only way to abort an atomic section (and may remain so, since it would be difficult to roll back things we may have cached from rolled-back modifications).
2014-09-24	lib: actually return failures from notmuch_message_tags_to_maildir_flags	Jani Nikula
	The function takes great care to preserve the first error status it encounters, yet fails to return that status to the caller. Fix it.
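The pattern this entry refers to can be sketched as follows. This is not the notmuch function itself: `status_t`, `sample_op`, and `apply_all` are hypothetical names, and the sketch only assumes the general shape of the fix, which is to keep going after a failure, remember the first failure, and return it.

```c
/* Hypothetical status codes; not the notmuch_status_t values. */
typedef enum {
    STATUS_SUCCESS = 0,
    STATUS_FILE_ERROR,
    STATUS_OUT_OF_MEMORY
} status_t;

/* Hypothetical per-item operation: negative items fail with a file error,
 * items above 100 fail with an out-of-memory error. */
static status_t
sample_op (int item)
{
    if (item < 0)
        return STATUS_FILE_ERROR;
    if (item > 100)
        return STATUS_OUT_OF_MEMORY;
    return STATUS_SUCCESS;
}

/* Apply 'op' to every item, continuing past failures, and return the first
 * failure encountered (or STATUS_SUCCESS if there was none). */
static status_t
apply_all (status_t (*op) (int item), const int *items, int n)
{
    status_t first = STATUS_SUCCESS;

    for (int i = 0; i < n; i++) {
        status_t s = op (items[i]);
        if (s != STATUS_SUCCESS && first == STATUS_SUCCESS)
            first = s;
    }

    /* The class of bug described above is ending with an unconditional
     * "return STATUS_SUCCESS;" here, which discards 'first'. */
    return first;
}
```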
2014-09-16	lib: bump soname	Peter Wang
	Adding return values to notmuch_database_close and notmuch_database_destroy may require bumping the soname.
2014-09-13	notmuch_thread_get_authors: document match grouping with |	Gaute Hope
	As stated in thread.cc:115: /* Construct an authors string from matched_authors_array and authors_array. The string contains matched authors first, then non-matched authors (with the two groups separated by '|'). Within each group, authors are listed in date order. */ This is, however, not reflected in the public API documentation in notmuch.h:970. This patch adds a paragraph explaining how | separates the group of authors of messages matching the query from those of messages that do not match but are still contained in the thread.
2014-09-01	lib: Fix endless upgrade problem	Austin Clements
	48db8c8 introduced a disagreement between when notmuch_database_needs_upgrade returned TRUE and when notmuch_database_upgrade actually performed an upgrade. As a result, if a database had a version less than 3, but no new features were required, notmuch new would call notmuch_database_upgrade to perform an upgrade, but notmuch_database_upgrade would return immediately without updating the database version. Hence, the next notmuch new would do the same, and so on. Fix this by ensuring that the upgrade-required logic is identical between the two.
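One way to keep the two functions in agreement is to route both through a single predicate. The sketch below uses hypothetical names (`struct db`, `upgrade_required`, `db_needs_upgrade`, `db_upgrade`) and a made-up feature mask. It is an illustration of the idea, not the notmuch implementation.

```c
#include <stdbool.h>

/* Hypothetical database state and constants, for illustration only. */
struct db {
    unsigned version;
    unsigned features;
};

#define CURRENT_VERSION  3u
#define CURRENT_FEATURES 0x7u

/* Single source of truth for "does this database need an upgrade?". */
static bool
upgrade_required (const struct db *db)
{
    return db->version < CURRENT_VERSION ||
           (CURRENT_FEATURES & ~db->features) != 0;
}

static bool
db_needs_upgrade (const struct db *db)
{
    return upgrade_required (db);
}

/* Returns 1 if an upgrade was performed, 0 if none was needed. */
static int
db_upgrade (struct db *db)
{
    if (! upgrade_required (db))
        return 0;

    db->features |= CURRENT_FEATURES;
    /* Always record the new version, even if no feature bits changed;
     * otherwise the next call would ask for an upgrade again. */
    db->version = CURRENT_VERSION;
    return 1;
}
```

In this sketch, a version 2 database that already has every feature bit set is upgraded exactly once, after which `db_needs_upgrade` returns false.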
2014-08-30	lib: Update doc of notmuch_database_{needs_upgrade,upgrade}	Austin Clements
	Clients are no longer required to call these functions after opening a database in read/write mode (which is good, because almost none of them do!).
2014-08-30	lib: Return an error from operations that require an upgrade	Austin Clements
	Previously, there was no protection against a caller invoking an operation on an old database version that would effectively corrupt the database by treating it like a newer version. According to notmuch.h, any caller that opens the database in read/write mode is supposed to check if the database needs upgrading and perform an upgrade if it does. This would protect against this, but nobody (even the CLI) actually does this. However, with features, it's easy to protect against incompatible operations on a fine-grained basis. This lightweight change allows callers to safely operate on old database versions, while preventing specific operations that would corrupt the database with an informative error message.
2014-08-30	lib: Support empty header values in database	Austin Clements
	Commit 567bcbc2 introduced support for storing various headers in document values. However, doing so in a backwards-compatible way meant that genuinely empty header values could not be distinguished from the old behavior of not storing the headers at all, so these required parsing the original message. Now that we have database features, new databases can declare that all messages have header values, so if we have this feature flag, we can use the stored header value even if it's the empty string. This requires slight cleanup to notmuch_message_get_header, since the code previously couldn't distinguish between empty headers and headers that are never stored in the database (previously this distinction didn't matter).
2014-08-30	lib: Report progress for combined upgrade operation	Austin Clements
	Previously, some parts of upgrade didn't report progress and for others it was possible for the progress meter to restart at 0 part way through the upgrade because each stage was reported separately. Fix this by computing the total amount of work that needs to be done up-front and updating completed work monotonically.
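The approach described here, totalling the work first and then reporting a single running count, might look roughly like this. All names (`progress_fn`, `run_stages`, `progress_log`, `record_progress`) are hypothetical, and the per-item work is left as a placeholder.

```c
#include <stddef.h>

/* Hypothetical callback type: 'fraction' is completed work in [0, 1]. */
typedef void (*progress_fn) (void *closure, double fraction);

/* Run several stages while reporting one combined, monotonic progress value.
 * 'stage_items[i]' is the number of work items in stage i. */
static void
run_stages (const unsigned *stage_items, size_t n_stages,
            progress_fn notify, void *closure)
{
    unsigned total = 0, done = 0;

    /* Compute the total amount of work up front. */
    for (size_t i = 0; i < n_stages; i++)
        total += stage_items[i];

    for (size_t i = 0; i < n_stages; i++) {
        for (unsigned j = 0; j < stage_items[i]; j++) {
            /* ... the real per-item work would go here ... */
            done++;
            if (notify)
                notify (closure, (double) done / (double) total);
        }
    }
}

/* Small recorder used to check that progress never goes backwards. */
struct progress_log {
    int calls;
    int monotonic;
    double last;
};

static void
record_progress (void *closure, double fraction)
{
    struct progress_log *log = closure;

    if (fraction < log->last)
        log->monotonic = 0;
    log->last = fraction;
    log->calls++;
}
```

Because `done` is shared across stages, the reported fraction cannot restart at 0 when a new stage begins.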
2014-08-30	lib: Reorganize upgrade around document types	Austin Clements
	Rather than potentially making multiple passes over the same type of data in the database, reorganize upgrade around each type of data that may be upgraded. This eliminates code duplication, will make multi-version upgrades faster, and will let us improve progress reporting.
2014-08-30	lib: Use database features to drive upgrade	Austin Clements
	Previously, we had database version information hard-coded in the upgrade code. Slightly re-organize the upgrade process around the set of new database features to be enabled by the upgrade.
2014-08-30	lib: Simplify upgrade code using a transaction	Austin Clements
	Previously, the upgrade was organized as two passes -- an upgrade pass, and a separate cleanup pass -- so the database was always in a valid state. This change substantially simplifies this code by performing the upgrade in a transaction and combining both passes into one. This 1) eliminates a lot of duplicate code between the passes, 2) speeds up the upgrade process, 3) makes progress reporting more accurate, 4) eliminates the potential for stale data if the upgrade is interrupted during the cleanup pass, and 5) makes it easier to reason about the safety of the upgrade code.
2014-08-30	lib: Database version 3: Introduce fine-grained "features"	Austin Clements
	Previously, our database schema was versioned by a single number. Each database schema change had to occur "atomically" in Notmuch's development history: before some commit, Notmuch used version N, after that commit, it used version N+1. Hence, each new schema version could introduce only one change, the task of developing a schema change fell on a single person, and it all had to happen and be perfect in a single commit series. This made introducing a new schema version hard. We've seen only two schema changes in the history of Notmuch. This commit introduces database schema version 3; hopefully the last schema version we'll need for a while. With this version, we switch from a single version number to "features": a set of named, independent aspects of the database schema. Features should make backwards compatibility easier. For many things, it should be easy to support databases both with and without a feature, which will allow us to make upgrades optional and will enable "unstable" features that can be developed and tested over time. Features also make forwards compatibility easier. The features recorded in a database include "compatibility flags," which can indicate to an older version of Notmuch when it must support a given feature to open the database for read or for write. This lets us replace the old vague "I don't recognize this version, so something might go wrong, but I promise to try my best" warnings upon opening a database with an unknown version with precise errors. If a database is safe to open for read/write despite unknown features, an older version will know that and issue no message at all. If the database is not safe to open for read/write because of unknown features, an older version will know that, too, and can tell the user exactly which required features it lacks support for.
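The compatibility-flag idea can be sketched as below. Everything here is an assumption made for illustration: the feature names, the `struct stored_feature` layout, and the encoding of the flags as a string in which 'r' means "must be understood to open for read" and 'w' means "must be understood to open for write". The commit message does not spell out the on-disk format, so this should not be read as the notmuch representation.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record of one feature as stored in a database. */
struct stored_feature {
    const char *name;
    const char *compat;   /* assumed encoding: any of 'r' and 'w' */
};

/* Features this (hypothetical) library version understands. */
static const char *const supported[] = {
    "example-feature-a",
    "example-feature-b",
};

static int
is_supported (const char *name)
{
    for (size_t i = 0; i < sizeof (supported) / sizeof (supported[0]); i++)
        if (strcmp (supported[i], name) == 0)
            return 1;
    return 0;
}

/* Return the name of the first unknown feature that blocks opening the
 * database in 'mode' ('r' or 'w'), or NULL if it is safe to open. */
static const char *
first_blocking_feature (const struct stored_feature *stored, size_t n,
                        char mode)
{
    for (size_t i = 0; i < n; i++) {
        if (! is_supported (stored[i].name) &&
            strchr (stored[i].compat, mode) != NULL)
            return stored[i].name;
    }
    return NULL;
}
```

An older library can then open a database read-only without any warning when the unknown features only restrict writing, and it can name the exact feature it lacks when a write is refused.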
2014-08-16	Make parsing of References and In-Reply-To header less error prone	Michal Sojka
	According to RFC2822, References and In-Reply-To headers are supposed to contain one or more Message-IDs; however, the older RFC822 allowed almost any content. When both References and In-Reply-To headers end with something other than a Message-ID (see e.g. [1]), the thread structure presented by notmuch is incorrect. The reason is that notmuch treats this case as if the email contained no "replyto" information (see _notmuch_database_link_message_to_parents). This patch changes the parse_references() function to return the last valid Message-ID encountered rather than the NULL that results from the last hunk of text not being a Message-ID. [1] https://lkml.org/lkml/headers/2014/5/19/864
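The behavior change can be illustrated with a deliberately simplified scanner. This is not notmuch's parse_references(): the function name `last_message_id` is hypothetical, and the sketch treats any non-empty `<...>` span as a Message-ID, which is far looser than the RFC 2822 grammar. It only shows the "remember the last valid ID, ignore trailing junk" idea.

```c
#include <stdlib.h>
#include <string.h>

/* Return a newly allocated copy of the last <...> token in 'refs', without
 * the angle brackets, or NULL if there is none. The caller frees the result.
 * Simplification: any non-empty <...> span counts as a Message-ID. */
static char *
last_message_id (const char *refs)
{
    const char *best = NULL;
    size_t best_len = 0;
    const char *p = refs;

    while ((p = strchr (p, '<')) != NULL) {
        const char *end = strchr (p + 1, '>');
        if (end == NULL)
            break;                      /* unterminated token: stop */
        if (end > p + 1) {              /* remember non-empty IDs only */
            best = p + 1;
            best_len = (size_t) (end - (p + 1));
        }
        p = end + 1;
    }

    if (best == NULL)
        return NULL;

    char *copy = malloc (best_len + 1);
    if (copy == NULL)
        return NULL;
    memcpy (copy, best, best_len);
    copy[best_len] = '\0';
    return copy;
}
```

Trailing text after the final ID, such as a comment in an old RFC 822 style header, no longer causes the result to be NULL.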
2014-08-05	lib: Improve documentation of _notmuch_message_create_for_message_id	Austin Clements
	Clarify the state of the returned message when _notmuch_message_create_for_message_id returns NOTMUCH_PRIVATE_STATUS_NO_DOCUMENT_FOUND.
2014-08-04	lib: Fix slight misinformation in the database schema doc	Austin Clements
	The database schema documentation made it sound like each mail document had exactly one on-disk message file, which hasn't been true for a long time.
2014-08-04	lib: Invalidate message metadata in _notmuch_message_gen_terms	Austin Clements
	Previously, we invalidated stored message metadata in _notmuch_message_add_term and _notmuch_message_remove_term, but not in _notmuch_message_gen_terms. This doesn't currently result in any bugs because of our limited uses of _notmuch_message_gen_terms, but it could cause trouble in the future.
2014-07-13	lib: Start all function names in notmuch-private.h with _notmuch	Charles Celerier
	As noted in devel/STYLE, every private library function should start with _notmuch. This patch corrects function naming that did not adhere to this style in lib/notmuch-private.h. In particular, the old function names that now begin with _notmuch are: notmuch_sha1_of_file, notmuch_sha1_of_string, notmuch_message_file_close, notmuch_message_file_get_header, notmuch_message_file_open, notmuch_message_get_author, notmuch_message_set_author. Signed-off-by: Charles Celerier <cceleri@cs.stanford.edu>
2014-07-09	lib: add return status to database close and destroy	Jani Nikula
	notmuch_database_close may fail in Xapian ->flush() or ->close(), so report the status. Similarly for notmuch_database_destroy which calls close. This is required for notmuch insert to report error status if message indexing failed.
2014-06-18	lib: Separate all phrases indexed by _notmuch_message_gen_terms	Austin Clements
	This adds a 100 termpos gap between all phrases indexed by _notmuch_message_gen_terms. This fixes a bug where terms from the end of one header and the beginning of another header could match together in a single phrase and a separate bug where term positions of un-prefixed terms overlapped. This fix only affects newly indexed messages. Messages that are already indexed won't benefit from this fix without re-indexing, but the fix won't make things any worse for existing messages.
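The effect of the gap can be shown with a toy position allocator. The names `struct indexer`, `index_phrase`, and `adjacent` are hypothetical. The sketch assumes words are separated by single spaces and that a phrase query matches two terms only when their positions differ by exactly one.

```c
/* Gap inserted after every indexed phrase, as described above. */
#define PHRASE_GAP 100

/* Hypothetical indexer state: the next free term position. */
struct indexer {
    unsigned next_pos;
};

/* Assign consecutive positions to the space-separated words of 'text' and
 * return the position of its first word. Afterwards the next free position
 * is advanced past the phrase plus PHRASE_GAP. */
static unsigned
index_phrase (struct indexer *ix, const char *text)
{
    unsigned first = ix->next_pos;
    unsigned count = 0;
    int in_word = 0;

    for (const char *p = text; *p; p++) {
        if (*p == ' ') {
            in_word = 0;
        } else if (! in_word) {
            in_word = 1;
            count++;
        }
    }

    ix->next_pos = first + count + PHRASE_GAP;
    return first;
}

/* Assumed phrase-match rule: 'b' directly follows 'a'. */
static int
adjacent (unsigned a, unsigned b)
{
    return b == a + 1;
}
```

Without the gap, the last word of one header and the first word of the next would receive consecutive positions and could match as a phrase. With it, they end up more than 100 positions apart.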
2014-06-18	lib: Index name and address of from/to headers as a phrase	Austin Clements
	Previously, we indexed the name and address parts of from/to headers with two calls to _notmuch_message_gen_terms. In general, this indicates that these parts are separate phrases. However, because of an implementation quirk, the two calls to _notmuch_message_gen_terms generated adjacent term positions for the prefixed terms, which happens to be the right thing to do in this case, but the wrong thing to do for all other calls. Furthermore, _notmuch_message_gen_terms produced potentially overlapping term positions for the un-prefixed copies of the terms, which is simply wrong. This change indexes both the name and address in a single call to _notmuch_message_gen_terms, indicating that they should be part of a single phrase. This masks the problem with the un-prefixed terms (fixing the two known-broken tests) and puts us in a position to fix the unintentional phrases generated by other calls to _notmuch_message_gen_terms.
2014-06-13	lib: resurrect support for single-message mbox files	Jani Nikula
	This is effectively a revert of commit 6812136bf576d894591606d9e10096719054d1f9 (Author: Jani Nikula <jani@nikula.org>; Date: Mon Mar 31 00:21:48 2014 +0300; subject: "lib: drop support for single-message mbox files"). The intention was to drop support for indexing new single-message mbox files (and whether that was a good idea in the first place is arguable). However, this inadvertently broke support for reading headers from previously indexed single-message mbox files, which is far worse. Distinguishing between the two cases would require more code than simply bringing back support for single-message mbox files.
2014-04-19	build: add canonicalize_file_name to symbols exported from libnotmuch.so	David Bremner
	This is needed for our compat version of canonicalize_file_name to be used.
2014-04-05	lib: replace the header parser with gmime	Jani Nikula
	The notmuch library includes a full-blown message header parser. Yet the same message headers are parsed by gmime during indexing. Switch to gmime parsing completely. These are the main changes: * Gmime stops header parsing at the first invalid header, and presumes the message body starts from there. The current parser is quite liberal in accepting broken headers. The change means we will be much pickier about accepting invalid messages. * The current parser converts tabs used in header folding to spaces. Gmime preserves the tabs. Due to a broken python library used in mailman, there are plenty of mailing lists that produce headers with tabs in header folding, and we'll see plenty of tabs. (This change has been mitigated in preparatory patches.) * For pure header parsing, the current parser is likely faster than gmime, which parses the whole message rather than just the headers. Since we parse the message and its headers using gmime for indexing anyway, this avoids an extra header parsing round when adding new messages. In case of duplicate messages, we'll end up parsing the full message although just headers would be sufficient. All in all this should still speed up 'notmuch new'. * Calls to notmuch_message_get_header() may be slightly slower than previously for headers that are not indexed in the database, due to parsing of the whole message. Within the notmuch code base, notmuch reply is the only such user.
2014-04-05	lib: drop support for single-message mbox files	Jani Nikula
	We've supported mbox files containing a single message for historical reasons, but the support has been deprecated, with a warning message while indexing, since Notmuch 0.15. Finally drop the support, and consider all mbox files non-email.
2014-03-11	lib: make folder: prefix literal	Jani Nikula
	In Xapian terms, convert folder: prefix from probabilistic to boolean prefix, matching the paths, relative from the maildir root, of the message files, ignoring the maildir new and cur leaf directories. folder:foo matches all message files in foo, foo/new, and foo/cur. folder:foo/new does not match message files in foo/new. folder:"" matches all message files in the top level maildir and its new and cur subdirectories. This change constitutes a database change: bump the database version and add database upgrade support for folder: terms. The upgrade also adds path: terms. Finally, fix the folder search test for literal folder: search, as some of the folder: matching capabilities are lost in the probabilistic to boolean prefix change.
2014-03-11	lib: add support for path: prefix searches	Jani Nikula
	The path: prefix is a literal boolean prefix matching the paths, relative from the maildir root, of the message files. path:foo matches all message files in foo (but not in foo/new or foo/cur). path:foo/new matches all message files in foo/new. path:"" matches all message files in the top level maildir. path:foo/ matches all message files in foo and recursively in all subdirectories of foo. path: matches all message files recursively, i.e. all messages.
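The difference between the path: rules in this entry and the folder: rules in the entry above can be made concrete by deriving both values from a message filename. The helpers `path_term` and `folder_term` are hypothetical and operate on plain strings. They do not reflect how notmuch stores its terms, and the recursive forms such as path:foo/ are not covered.

```c
#include <string.h>

/* Copy the directory part of 'filename' (relative to the maildir root) into
 * 'out'. A file at the top level yields the empty string. 'out_size' must be
 * at least 1; longer results are truncated. */
static void
path_term (const char *filename, char *out, size_t out_size)
{
    const char *slash = strrchr (filename, '/');
    size_t len = slash ? (size_t) (slash - filename) : 0;

    if (len >= out_size)
        len = out_size - 1;
    memcpy (out, filename, len);
    out[len] = '\0';
}

/* Like path_term, but with a trailing maildir "new" or "cur" leaf directory
 * removed, so that foo, foo/new, and foo/cur all map to "foo". */
static void
folder_term (const char *filename, char *out, size_t out_size)
{
    path_term (filename, out, out_size);

    char *last = strrchr (out, '/');
    const char *leaf = last ? last + 1 : out;

    if (strcmp (leaf, "new") == 0 || strcmp (leaf, "cur") == 0) {
        /* Cut at the slash before the leaf, or at the start if none. */
        if (last)
            *last = '\0';
        else
            out[0] = '\0';
    }
}
```

For a file `foo/cur/msg1`, the path value is `foo/cur` while the folder value is `foo`, which is why folder:foo/new matches nothing in foo/new under these rules.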
2014-03-11	lib: refactor folder term update after filename removal	Jani Nikula
	Abstract some blocks of code for reuse. No functional changes.
2014-02-13	doc: notmuch_result_move_to_next -> notmuch_tags_move_to_next	Gaute Hope
	Fix typo in docs.
2014-01-26	lib: update documentation for notmuch_database_get_directory	David Bremner
	Clarify that using the directory after destroying the corresponding database is not permitted. This is implicit in the description of notmuch_database_destroy, but it doesn't hurt to be explicit, and we do express similar "ownership" relationships at other places in the docs.
2014-01-24	lib: make notmuch_threads_valid return FALSE when passed NULL	David Bremner
	Without this patch, the example code in the header docs crashes for certain invalid queries (see id:871u00oimv.fsf@approx.mit.edu).
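A NULL-tolerant validity check lets the usual iteration loop run zero times when the iterator could not be created. The sketch below uses a hypothetical `struct iter` over integers rather than the notmuch threads type.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical iterator over an array of ints. */
struct iter {
    const int *items;
    size_t count;
    size_t pos;
};

/* A NULL iterator is simply "not valid", so loops over it do nothing. */
static bool
iter_valid (const struct iter *it)
{
    return it != NULL && it->pos < it->count;
}

static int
iter_get (const struct iter *it)
{
    return it->items[it->pos];
}

static void
iter_next (struct iter *it)
{
    it->pos++;
}

/* The usual for-loop idiom; safe even when 'it' is NULL, for example after
 * a failed query. */
static int
sum_all (struct iter *it)
{
    int sum = 0;

    for (; iter_valid (it); iter_next (it))
        sum += iter_get (it);
    return sum;
}
```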
2014-01-18	lib: fix error handling	Tomi Valkeinen
	Currently, if a Xapian exception happens in notmuch_message_get_header, the exception is not caught, leading to a crash. In notmuch_message_get_date the exception is caught, but an internal error is raised, again leading to a crash. This patch fixes the error handling by making both functions catch the Xapian exceptions, print an error and return NULL or 0. The 'notmuch->exception_reported' is also set, as is done elsewhere, even if I don't really get the idea of that field. Signed-off-by: Tomi Valkeinen <tomi.valkeinen@iki.fi>
2014-01-18	lib: fix clang compiler warning	Jani Nikula
	With some combination of clang and talloc, not using the return value of talloc_steal() produces a warning. Ignore it, as talloc_steal() has no failure modes per documentation.
2014-01-05	lib: modify notmuch.h for automatic document generation	Jani Nikula
	Minimal changes to produce a sensible result.
2013-12-07	lib: Bump library version from 3.0.0 to 3.1.0	Austin Clements
	This version of the library introduces LIBNOTMUCH_CHECK_VERSION and the *_VERSION macros. Bumping the version number is also necessary to make the comment on LIBNOTMUCH_CHECK_VERSION no longer a lie.
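A compile-time version check of this kind is commonly written as a three-part comparison. The sketch below uses hypothetical `MYLIB_` macros with made-up version numbers so that it is self-contained. It is not copied from notmuch.h.

```c
/* Hypothetical library version; a real header would define these. */
#define MYLIB_MAJOR_VERSION 3
#define MYLIB_MINOR_VERSION 1
#define MYLIB_MICRO_VERSION 0

/* True if the library version is at least major.minor.micro. */
#define MYLIB_CHECK_VERSION(major, minor, micro)                         \
    (MYLIB_MAJOR_VERSION > (major) ||                                    \
     (MYLIB_MAJOR_VERSION == (major) && MYLIB_MINOR_VERSION > (minor)) || \
     (MYLIB_MAJOR_VERSION == (major) && MYLIB_MINOR_VERSION == (minor) && \
      MYLIB_MICRO_VERSION >= (micro)))

/* Typical use: conditional compilation against a newer API. */
#if MYLIB_CHECK_VERSION (3, 1, 0)
#define HAVE_NEW_API 1
#else
#define HAVE_NEW_API 0
#endif
```

Because the macro expands to an integer constant expression, it works both in `#if` directives and in ordinary C conditions.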
2013-12-07	lib: Replace NOTMUCH_*_VERSION with LIBNOTMUCH_*_VERSION	Austin Clements
	This makes it clear that these macros refer to the library version, and not to the notmuch application-level release. Since there are no consumers of these macros yet, this is now or never.
2013-12-07	lib: Make VERSION macros agree with soname version	Austin Clements
	We have two distinct "library version" numbers: the soname version and the version macros. We need both for different reasons: the version macros enable easy compile-time version detection (and conditional compilation), while the soname version enables runtime version detection (which includes the version checking done by things like the Python bindings). However, currently, these two version numbers are different, which is unnecessary and can lead to confusion (especially in things like Debian, which include the soname version in the package name). This patch makes them the same by bumping the version macros up to agree with the soname version. (We should probably keep the version number in just one place so they can't get out of sync, but that can be done in another patch.)
2013-11-27	util: detect byte order	David Bremner
	Unfortunately, old versions of GCC and clang do not provide byte order macros, so we re-invent them. If UTIL_BYTE_ORDER is not defined or defined to 0, we fall back to macros supported by recent versions of GCC and clang.
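As a rough illustration of byte order detection, the sketch below prefers the predefined `__BYTE_ORDER__` macros that recent GCC and clang provide, and otherwise falls back to a run-time probe. The function names are hypothetical, the run-time fallback is this sketch's own substitute for the build-time detection the commit describes, and mixed-endian platforms are not handled.

```c
#include <stdint.h>

/* Run-time probe: look at the first byte of a known 32-bit value. */
static int
is_little_endian_runtime (void)
{
    const uint32_t probe = 1;

    return *(const unsigned char *) &probe == 1;
}

/* Prefer the compiler-provided macros when they exist; otherwise use the
 * run-time probe. */
static int
is_little_endian (void)
{
#if defined (__BYTE_ORDER__) && defined (__ORDER_LITTLE_ENDIAN__)
    return __BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__;
#else
    return is_little_endian_runtime ();
#endif
}
```

On a platform with a single consistent byte order, both functions should agree.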