path: root/lib
Age  Commit message  Author
2019-05-29  indexing: record protected subject when indexing cleartext  Daniel Kahn Gillmor
When indexing the cleartext of an encrypted message, record any protected subject in the database, which should make it findable and visible in search. Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-25  lib/database: index user headers.  David Bremner
This essentially involves calling _notmuch_message_gen_terms once for each user defined header.
2019-05-25  lib: support user prefix names in term generation  David Bremner
This should not change the indexing process yet as nothing calls _notmuch_message_gen_terms with a user prefix name. On the other hand, it should not break anything either. _notmuch_database_prefix does a linear walk of the list of (built-in) prefixes, followed by a logarithmic time search of the list of user prefixes. The latter is probably not really noticeable.
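The two-stage lookup described above can be sketched roughly as follows. This is not the notmuch implementation: prefix_t, find_prefix and the table contents are hypothetical names and values chosen for illustration, and the user prefixes are assumed to be held in an array sorted by name so that bsearch() can stand in for the logarithmic-time search.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical entry mapping a UI prefix name to a database prefix. */
typedef struct {
    const char *name;
    const char *db_prefix;
} prefix_t;

/* Illustrative built-in table, scanned linearly (order does not matter). */
static const prefix_t builtin_prefixes[] = {
    { "from", "XFROM" },
    { "to", "XTO" },
    { "subject", "XSUBJECT" },
};

/* bsearch() comparator: the key is a name, the element is a prefix_t. */
static int
prefix_cmp (const void *key, const void *entry)
{
    return strcmp ((const char *) key, ((const prefix_t *) entry)->name);
}

/* Look up `name`: linear walk of the built-ins first, then a binary
 * search of the user prefixes, which must be sorted by name.
 * Returns NULL if the name is unknown. */
const char *
find_prefix (const prefix_t *user, size_t n_user, const char *name)
{
    size_t i;
    const prefix_t *hit;

    assert (name != NULL);

    for (i = 0; i < sizeof (builtin_prefixes) / sizeof (builtin_prefixes[0]); i++)
        if (strcmp (builtin_prefixes[i].name, name) == 0)
            return builtin_prefixes[i].db_prefix;

    if (n_user == 0)
        return NULL;

    hit = bsearch (name, user, n_user, sizeof (prefix_t), prefix_cmp);
    return hit ? hit->db_prefix : NULL;
}
```

With only a handful of built-in prefixes the linear walk is cheap, which is consistent with the remark that the cost of the second stage is unlikely to be noticeable.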
2019-05-25  lib: cache user prefixes in database object  David Bremner
This will be used to avoid needing a database access to resolve a db prefix from the corresponding UI prefix (e.g. when indexing). Arguably the setup of the separate header map does not belong here, since it is about indexing rather than querying, but we currently don't have any other indexing setup to do.
2019-05-25  lib: setup user headers in query parser  David Bremner
These tests will need to be updated if the Xapian query print/debug format changes.
2019-05-23  n_m_remove_indexed_terms: reduce number of Xapian API calls.  David Bremner
Previously this function scanned every term attached to a given Xapian document. It turns out we know how to read only the terms we need to preserve (and we might have already done so). This commit replaces many calls to Xapian::Document::remove_term with one call to ::clear_terms, and a (typically much smaller) number of calls to ::add_term. Roughly speaking this is based on the assumption that most messages have more text than they have tags. According to the performance test suite, this yields a roughly 40% speedup on "notmuch reindex '*'".
2019-05-10  lib/message-file: close stream in destructor  David Bremner
Without this, $ make time-test OPTIONS=--small leads to fatal errors from too many open files. Thanks to st-gourichon-fid for bringing this problem to my attention in IRC.
2019-05-03  lib/message_file: open gzipped files  David Bremner
Rather than storing the lower level stdio FILE object, we store a GMime stream. This allows both transparent decompression, and passing the stream into GMime for parsing. As a side effect, we can let GMime close the underlying OS stream (indeed, that stream isn't visible here anymore). This change is enough to get notmuch-{new,search} working, but there is still some work required for notmuch-show, to be done in a following commit.
2019-05-03  gmime-cleanup: pass NULL as default GMimeParserOptions  Daniel Kahn Gillmor
This is a functional change, not a straight translation, because we are no longer directly invoking g_mime_parser_options_get_default(), but the GMime source has indicated that the options parameter for g_mime_parser_construct_message() is "nullable" since upstream commit d0ebdd2ea3e6fa635a2a551c846e9bc8b6040353 (which itself precedes GMime 3.0). Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: pass NULL arguments explicitly where GMime 3.0 expects it  Daniel Kahn Gillmor
Several GMime 2.6 functions sprouted a change in the argument order in GMime 3.0. We had a compatibility layer here to be able to handle compiling against both GMime 2.6 and 3.0. Now that we're using 3.0 only, rip out the compatibility layer for those functions with changed argument lists, and explicitly use the 3.0 argument lists. Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: use GMime 3.0 function names  Daniel Kahn Gillmor
Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: drop unused gmime #defines and simplify g_mime_init ()  Daniel Kahn Gillmor
Several of these #defines were not actually used in the notmuch codebase any longer. And as of GMime 3.0, g_mime_init takes no arguments, so we can also drop the bogus RFC2047 argument that we were passing and then #defining away. signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: drop all arguments unused in GMime 3  Daniel Kahn Gillmor
This means dropping GMimeCryptoContext and notmuch_config arguments. All the argument changes are to internal functions, so this is not an API or ABI break. We also get to drop the #define for g_mime_3_unused. signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: drop g_mime_2_6_unref  Daniel Kahn Gillmor
signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: always support session keys  Daniel Kahn Gillmor
Our minimum version of GMime 3.0 always supports good session key handling. signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: remove GMime 2.6 variant codeblocks  Daniel Kahn Gillmor
signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-05-03  gmime-cleanup: drop unused gmime 2.6 content_type from _index_encrypted_mime_part  Daniel Kahn Gillmor
In _index_mime_part, we don't need to extract the content-type from the part until just before we use it, so we also defer it lazily. Signed-off-by: Daniel Kahn Gillmor <dkg@fifthhorseman.net>
2019-04-17  lib: add 'body:' field, stop indexing headers twice.  David Bremner
The new `body:` field (in Xapian terms) or prefix (in slightly sloppier notmuch terms) allows matching terms that occur only in the body. Unprefixed query terms should continue to match anywhere (header or body) in the message. This follows a suggestion of Olly Betts to use the facility (since Xapian 1.0.4) to add the same field with multiple prefixes. The double indexing of previous versions is thus replaced with a query time expansion of unprefixed query terms to the various prefixed equivalents. Reindexing will be needed for 'body:' searches to work correctly; otherwise they will also match messages where the term occurs in headers (demonstrated by the new tests in T530-upgrade.sh).
2019-03-31  lib: update commentary about path/folder terms  David Bremner
We missed this when we changed to binary fields.
2019-03-31  lib: add clarification about the use of "prefix" in the docs.  David Bremner
2019-03-31  lib: drop comment about only indexing one file.  David Bremner
Although the situation is complicated by the value fields (which are taken from a single file), this comment is now more false than true.
2019-03-28  lib: use phrase search for anything not ending in '*'  David Bremner
Anything that does not look like a wildcard should be safe to quote. This should fix the problem searching for Xapian keywords.
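As a rough illustration of the rule stated above (quote everything except terms ending in '*'), a helper might look like the sketch below. The function name quote_unless_wildcard and its buffer-based interface are assumptions made for this example, not the code in lib/; it also does not escape double quotes embedded in the term.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Write the form of `term` to hand to the query parser into `out`:
 * unchanged if it ends in '*' (a wildcard), wrapped in double quotes
 * (a phrase) otherwise. Returns 0 on success, -1 if `out` is too small. */
int
quote_unless_wildcard (const char *term, char *out, size_t out_size)
{
    size_t len;
    int n;

    assert (term != NULL && out != NULL);
    len = strlen (term);

    if (len > 0 && term[len - 1] == '*')
        n = snprintf (out, out_size, "%s", term);
    else
        n = snprintf (out, out_size, "\"%s\"", term);

    return (n >= 0 && (size_t) n < out_size) ? 0 : -1;
}
```

Under this rule a search term such as not or and is passed on as a quoted phrase, so the query parser is not expected to treat it as an operator.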
2019-03-11  Prepend regerror() messages with "regexp error: "  Luis Ressel
The exact error messages returned by regerror() aren't standardized; relying on them isn't portable. Thus, add a prefix to make clear that the subsequent message is a regexp parsing error, and only look for this prefix in the test suite, ignoring the rest of the message.
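A minimal sketch of the idea, using the POSIX regcomp() and regerror() calls. The wrapper names compile_with_prefixed_error and has_regexp_error_prefix are hypothetical and exist only for this example; the prefix string is the one named in the commit subject.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>
#include <string.h>

#define REGEXP_ERROR_PREFIX "regexp error: "

/* Compile `pattern`. On failure, write the fixed prefix followed by the
 * implementation-defined regerror() text into `msg`.
 * Returns 0 on success, the non-zero regcomp() code on failure. */
int
compile_with_prefixed_error (regex_t *regex, const char *pattern,
                             char *msg, size_t msg_size)
{
    char detail[256];
    int rc;

    assert (regex != NULL && pattern != NULL && msg != NULL);

    rc = regcomp (regex, pattern, REG_EXTENDED | REG_NOSUB);
    if (rc != 0) {
        regerror (rc, regex, detail, sizeof (detail));
        snprintf (msg, msg_size, "%s%s", REGEXP_ERROR_PREFIX, detail);
    }
    return rc;
}

/* What a portable test can check: only the prefix, not the detail text. */
int
has_regexp_error_prefix (const char *msg)
{
    return strncmp (msg, REGEXP_ERROR_PREFIX, strlen (REGEXP_ERROR_PREFIX)) == 0;
}
```

A caller that gets 0 back owns the compiled regex and should release it with regfree().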
2019-03-06  Merge branch 'release'  David Bremner
Changes from 0.28.3
2019-03-05  lib/string_map: fix return type of string_cmp  David Bremner
I can't figure out how checking the sign of a bool ever worked. The following program demonstrates the problem (i.e. for me it prints 1).

    #include <stdio.h>
    #include <stdbool.h>

    int main(int argc, char **argv) {
        bool x;
        x = -1;
        printf("x = %d\n", x);
    }

This seems to be mandated by the C99 standard 6.3.1.2.
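To make the consequence for a comparison function concrete, here is a small sketch contrasting the two return types. The names cmp_as_bool and cmp_as_int are made up for this example and are not the functions in lib/string_map.c.

```c
#include <stdbool.h>
#include <string.h>

/* Broken: the strcmp() result is converted to bool on return, so every
 * non-zero value (negative or positive) collapses to 1 and the sign,
 * which callers need for ordering, is lost. */
bool
cmp_as_bool (const char *a, const char *b)
{
    return strcmp (a, b);
}

/* Fixed: returning int preserves "less than", "equal" and "greater than". */
int
cmp_as_int (const char *a, const char *b)
{
    return strcmp (a, b);
}
```

Any caller that tests the result with < 0 or > 0, as a binary search would, needs the int version.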
2019-01-25  docs: Use correct call to notmuch_query_search_threads in usage example  rhn
Amended by db: simplify (subjectively) the example.
2019-01-25  lib: Explicitly state when replies will be destroyed  rhn
Without an explicit guarantee, it's not clear how to use the reference.
2018-10-21  index: explicitly follow GObject conventions  Daniel Kahn Gillmor
Use explicit labels for GTypeInfo member initializers, rather than relying on comments and ordering. This is both easier to read, and harder to screw up. This also makes it clear that we're mis-casting GObject class initializers for gcc. Without this patch, g++ 8.2.0-7 produces this warning:

    CXX -g -O2 lib/index.o
    lib/index.cc: In function ‘GMimeFilter* notmuch_filter_discard_non_term_new(GMimeContentType*)’:
    lib/index.cc:252:23: warning: cast between incompatible function types from ‘void (*)(NotmuchFilterDiscardNonTermClass*)’ {aka ‘void (*)(_NotmuchFilterDiscardNonTermClass*)’} to ‘GClassInitFunc’ {aka ‘void (*)(void*, void*)’} [-Wcast-function-type]
        (GClassInitFunc) notmuch_filter_discard_non_term_class_init,
                         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The definition of GClassInitFunc in /usr/include/glib-2.0/gobject/gtype.h suggests that this function will always be called with the class_data member of the GTypeInfo. We set that value to NULL in both GObject definitions in notmuch. So we mark it as explicitly unused. There is no functional change here, just code cleanup.
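The two points above (labelled initializers, and a class initializer that takes the second class_data argument) can be illustrated without GLib. In the sketch below type_info_t, class_init_func and filter_class_t are stand-ins invented for this example; they mirror only a few fields of the real GTypeInfo and are not its definition.

```c
#include <stddef.h>

/* Stand-in for the two-argument class initializer callback type. */
typedef void (*class_init_func) (void *klass, void *class_data);

/* Stand-in for a GTypeInfo-like struct (illustrative subset of fields). */
typedef struct {
    size_t class_size;
    class_init_func class_init;
    void *class_data;
    size_t instance_size;
} type_info_t;

/* Hypothetical class structure for a filter type. */
typedef struct {
    int initialized;
} filter_class_t;

/* Declared with both arguments, so it matches the callback type and
 * needs no function-pointer cast. */
static void
filter_class_init (void *klass, void *class_data)
{
    (void) class_data; /* NULL in this sketch, so explicitly unused */
    ((filter_class_t *) klass)->initialized = 1;
}

/* Labelled (designated) initializers: each value names its member, so the
 * meaning does not depend on comments or on field order. */
const type_info_t filter_info = {
    .class_size = sizeof (filter_class_t),
    .class_init = filter_class_init,
    .class_data = NULL,
    .instance_size = 0,
};
```

Designated initializers are standard in C99; in C++ sources of that era they are a compiler extension, so how they are spelled in lib/index.cc may differ from this sketch.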
2018-09-06  lib: change parent strategy to use In-Reply-To if it looks sane  David Bremner
As reported by Sean Whitton, there are mailers (in particular the Debian Bug Tracking System) that have sensible In-Reply-To headers, but un-useful-for-notmuch References (in particular with the BTS, the oldest reference is last). I looked at a sample of about 200K messages, and only about 0.5% of these had something other than a single message-id in In-Reply-To. On this basis, if we see a single message-id in In-Reply-To, consider that as authoritative.
2018-09-06  lib: add _notmuch_message_id_parse_strict  David Bremner
The idea is that if a message-id parses with this function, the MUA generating it was probably sane, and in particular it's probably safe to use the result as a parent from In-Reply-To.
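One way to picture a "strict" parse is a function that accepts a header value only if it consists of exactly one <...> token with nothing but blanks around it. The sketch below is an assumption about what such a check could look like; parse_single_message_id is a hypothetical name and the exact rules of _notmuch_message_id_parse_strict may differ.

```c
#include <stdlib.h>
#include <string.h>

/* Return a malloc'ed copy of the id between '<' and '>' if `header`
 * holds exactly one bracketed message-id surrounded only by whitespace;
 * return NULL otherwise. The caller frees the result. */
char *
parse_single_message_id (const char *header)
{
    const char *start, *end, *rest;
    size_t len;
    char *id;

    if (header == NULL)
        return NULL;

    /* Skip leading blanks; the id must open with '<'. */
    start = header + strspn (header, " \t\r\n");
    if (*start != '<')
        return NULL;
    start++;

    /* The id must be non-empty, free of blanks and brackets, and closed. */
    end = start + strcspn (start, "<> \t\r\n");
    if (*end != '>' || end == start)
        return NULL;

    /* Nothing but blanks may follow, which rules out a second id. */
    rest = end + 1;
    rest += strspn (rest, " \t\r\n");
    if (*rest != '\0')
        return NULL;

    len = (size_t) (end - start);
    id = malloc (len + 1);
    if (id == NULL)
        return NULL;
    memcpy (id, start, len);
    id[len] = '\0';
    return id;
}
```

A NULL result would then mean "do not trust this In-Reply-To", leaving the caller to fall back to References.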
2018-09-06  lib/thread: change _resolve_thread_relationships to use depths  David Bremner
We (finally) implement the XXX comment. It requires a bit of care not to reparent all of the possible toplevel messages. _notmuch_messages_has_next is not ready to be a public function yet, since it punts on the mset case. We know in the one case it is called, the notmuch_messages_t is just a regular list / iterator.
2018-09-06  lib/thread: rewrite _parent_or_toplevel to use depths  David Bremner
This is part 1/2 of changing the reparenting of alleged toplevel messages to use a "deep" reference rather than just the first one found.
2018-09-06  lib: calculate message depth in thread  David Bremner
This will be used in reparenting messages without useful In-Reply-To, but with useful References.
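As a sketch of how a depth could drive the choice of parent, assume each message records its parent and that the preferred fallback parent is the referenced message sitting deepest in the thread. The message_t layout and both function names below are hypothetical, and the code assumes the parent links contain no cycles.

```c
#include <stddef.h>

/* Hypothetical message node; `parent` is NULL for a top-level message. */
typedef struct message {
    const struct message *parent;
} message_t;

/* Depth is the number of ancestors: 0 for a top-level message. */
size_t
message_depth (const message_t *message)
{
    size_t depth = 0;

    while (message != NULL && message->parent != NULL) {
        depth++;
        message = message->parent;
    }
    return depth;
}

/* Among the referenced messages, pick the deepest one as the candidate
 * parent; on ties the earliest entry wins. Returns NULL if n is 0. */
const message_t *
deepest_reference (const message_t *const *references, size_t n)
{
    const message_t *best = NULL;
    size_t best_depth = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        size_t depth = message_depth (references[i]);

        if (best == NULL || depth > best_depth) {
            best = references[i];
            best_depth = depth;
        }
    }
    return best;
}
```

Choosing by depth rather than by position in the header is what makes the result independent of mailers that list references in an unusual order.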
2018-09-06  lib/thread: initial use of references for fallback parenting  David Bremner
This is mainly to lay out the structure of the final code. The problem isn't really solved yet, although some very simple cases are better (hence the fixed test). We need two passes through the messages because we need to be careful not to re-parent too many messages and end up without any toplevel messages.
2018-09-06  use EMPTY_STRING in _parent_via_in_reply_to  David Bremner
This is a review suggestion [1] of Tomi. I decided not to squash it so that the code movement remains clear. [1]: id:m2pnxxgf5q.fsf@guru.guru-group.fi
2018-09-06  lib/thread: refactor in_reply_to test  David Bremner
This is not a complete win in code-size, but it makes the code (which is about to get more complicated) easier to follow.
2018-09-06  lib: add _notmuch_message_list_empty  David Bremner
There is no public notmuch_message_list_t interface, so this is added to the private API. We use it immediately in thread.cc; future commits will use it further.
2018-09-06  lib/thread: add macro for debug printing of threading  David Bremner
This is analogous to DEBUG_DATABASE_SANITY, and is intended to help debugging and to help users submit bug reports.
2018-09-06  lib: read reference terms into message struct.  David Bremner
The plan is to use these in resolving threads.
2018-09-06  lib/thread: sort sibling messages by date  David Bremner
For non-root messages, this should not change anything currently, as the messages are already added in date order. In the future we will add some non-root messages in a second pass out of order and the sorting will be useful. It does fix the order of multiple root messages (although it is overkill for that).
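Sorting siblings by date can be sketched with qsort(). The sibling_t record and the array representation are assumptions made for this example (the library keeps its own message list type); the comparator deliberately returns an int whose sign carries the ordering.

```c
#include <stdlib.h>
#include <time.h>

/* Hypothetical sibling record: an id and the message date. */
typedef struct {
    const char *id;
    time_t date;
} sibling_t;

/* Oldest first. The subtraction-free form avoids overflow and always
 * yields -1, 0 or 1. */
static int
cmp_by_date (const void *a, const void *b)
{
    time_t da = ((const sibling_t *) a)->date;
    time_t db = ((const sibling_t *) b)->date;

    return (da > db) - (da < db);
}

/* Sort the replies of one parent (or the set of root messages) by date. */
void
sort_siblings_by_date (sibling_t *siblings, size_t n)
{
    if (n > 1)
        qsort (siblings, n, sizeof (siblings[0]), cmp_by_date);
}
```

When the input is already in date order, as the commit message says is currently the case for non-root messages, the call leaves the order of distinct dates unchanged.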
2018-05-26  lib: make notmuch_message_get_database() take a const notmuch_message_t*  Daniel Kahn Gillmor
This is technically an API change, but it is not an ABI change, and it's merely a statement that limits what the library can do. This is in parallel to notmuch_query_get_database(), which also takes a const pointer.
2018-05-26  properties: add notmuch_message_count_properties  Daniel Kahn Gillmor
The user can already do this manually, of course, but (a) it's nice to have a convenience function, and (b) exposing this interface means that someone more clever with a _notmuch_string_map_t than i am can write a more efficient version if they like, and it will just accelerate the users of the convenience function.
2018-05-26  lib: bump minor version  David Bremner
This recognizes the addition of (at least) notmuch_message_get_database to the API.
2018-05-26  lib: expose notmuch_message_get_database()  Daniel Kahn Gillmor
We've had _notmuch_message_database() internally for a while, and it's useful. It turns out to be useful on the other side of the library interface as well (i'll use it later in this series for "notmuch show"), so we expose it publicly now.
2018-05-14  drop use of register keyword  David Bremner
The performance benefits are dubious, and it's deprecated in C++11.
2018-05-07  lib: define specialized get_thread_id for use in thread subquery  David Bremner
The observation is that we are only using the messages to get their thread_id, which is kind of a pessimal access pattern for the current notmuch_message_get_thread_id.
2018-05-07  lib: add thread subqueries.  David Bremner
This change allows queries of the form

    thread:{from:me} and thread:{from:jian} and not thread:{from:dave}

This is still somewhat brute-force, but it's a big improvement over both the shell script solution and the previous proposal [1], because it does not build the whole thread structure just to generate a query. A further potential optimization is to replace the calls to notmuch with more specialized Xapian code; in particular it's not likely that reading all of the message metadata is a win here. [1]: id:20170820213240.20526-1-david@tethera.net
2018-05-03  move more http -> https  Daniel Kahn Gillmor
Correct URLs that have crept into the notmuch codebase with http:// when https:// is possible. As part of this conversion, this changeset also indicates the current preferred upstream URLs for both gmime and sup. The new URLs are https-enabled, the old ones are not. This also fixes T310-emacs.sh, thanks to Bremner for catching it.
2018-04-26  Merge branch 'release'  David Bremner
minimal mset fix, for 0.26.2
2018-04-26  lib: work around xapian bug with get_mset(0,0, x)  David Bremner
At least Fedora 28 triggers this Xapian bug due to some toolchain change; see https://bugzilla.redhat.com/show_bug.cgi?id=1546162 for details. The underlying bug is fixed in Xapian commit f92e2a936c1592, and should be fixed in Xapian 1.4.6.