Tomi Ollila [Wed, 18 Feb 2015 14:54:01 +0000 (16:54 +0200)]
configure: consistent command -v usage
When the shell builtin `command -v` operates normally, it either
prints the path of the arg given to it and returns zero -- or it
returns nonzero and prints nothing.
In abnormal situations something might be printed to stderr and
in that case we want to know about it; therefore the used
command -v stderr redirections to /dev/null have been removed.
The `hash` (builtin) command in ksh returns zero even the arg
given to is is not found in path. For that and for consistency
the one appearance of it has been converted to `command -v`.
notmuch-mutt: support for messages that lack Message-ID headers
For those messages, compute a synthetic Message-ID based on the SHA1
of the whole message, in the same way that notmuch would do. See:
http://git.notmuchmail.org/git/notmuch/blob/HEAD:/lib/sha1.c
To do the above, rewrite get_message_id() to accumulate header lines,
parse them to check for Message-ID, and fallback to SHA1 computation
if it is not present.
Thanks to:
- Jan N. Klug for preliminary versions of this patch
- Tomi Ollila for suggesting an elegant implementation
Austin Clements [Sat, 24 Jan 2015 21:17:02 +0000 (16:17 -0500)]
emacs: Rewrite content ID handling
Besides generally cleaning up the code and separating the general
content ID handling from the w3m-specific code, this fixes several
problems.
Foremost is that, previously, the code roughly assumed that referenced
parts would be in the same multipart/related as the reference.
According to RFC 2392, nothing could be further from the truth:
content IDs are supposed to be globally unique and globally
addressable. This is nonsense, but this patch at least fixes things
so content IDs can be anywhere in the same message.
As a side-effect of the above, this handles multipart/alternate
content-IDs more in line with RFC 2046 section 5.1.2 (not that I've
ever seen this in the wild). This also properly URL-decodes cid:
URLs, as per RFC 2392 (the previous code did not), and applies crypto
settings from the show buffer (the previous code used the global
crypto settings).
Austin Clements [Sat, 24 Jan 2015 21:16:59 +0000 (16:16 -0500)]
emacs: Return unibyte strings for binary part data
Unibyte strings are meant for representing binary data. In practice,
using unibyte versus multibyte strings affects *almost* nothing. It
does happen to matter if we use the binary data in an image descriptor
(which is, helpfully, not documented anywhere and getting it wrong
results in opaque errors like "Not a PNG image: <giant binary spew
that is, in fact, a PNG image>").
Austin Clements [Sat, 24 Jan 2015 21:16:58 +0000 (16:16 -0500)]
emacs: Remove broken `notmuch-get-bodypart-content' API
`notmuch-get-bodypart-content' could do two very different things,
depending on conditions: for text/* parts other than text/html, it
would return the part content as a multibyte Lisp string *after*
charset conversion, while for other parts (including text/html), it
would return binary part content without charset conversion.
This commit completes the split of `notmuch-get-bodypart-content' into
two different and explicit APIs: `notmuch-get-bodypart-binary' and
`notmuch-get-bodypart-text'. It updates all callers to use one or the
other depending on what's appropriate.
Austin Clements [Sat, 24 Jan 2015 21:16:57 +0000 (16:16 -0500)]
emacs: Create an API for fetching parts as undecoded binary
The new function, `notmuch-get-bodypart-binary', replaces
`notmuch-get-bodypart-internal'. Whereas the old function was really
meant for internal use in `notmuch-get-bodypart-content', it was used
in a few other places. Since the difference between
`notmuch-get-bodypart-content' and `notmuch-get-bodypart-internal' was
unclear, these other uses were always confusing and potentially
inconsistent. The new call clearly requests the part as undecoded
binary.
This is step 1 of 2 in separating `notmuch-get-bodypart-content' into
two APIs for retrieving either undecoded binary or decoded text.
David Bremner [Sun, 18 Jan 2015 12:59:29 +0000 (13:59 +0100)]
doc: add details about Xapian search syntax
Questions related to the way that probabilistic prefixes and phrases
are handled come up quite often and it is nicer to have the documentation self contained. Hopefully putting it in subsections prevents it from being overwhelming.
Todd [Thu, 22 Jan 2015 23:43:40 +0000 (17:43 -0600)]
Update documentation
Adds new entry to the NEWS file, and updates the search terms section
of the man page. The search terms section needs to be updated again
once the new section in the documentation covering probablistic terms
has been committed.
Todd [Thu, 22 Jan 2015 23:43:38 +0000 (17:43 -0600)]
Add indexing for the mimetype term
This adds the indexing support for the "mimetype:" term and removes
the broken test flag. The indexing is probablistic in Xapian terms,
which gives a better experience to end users. Standard content-types
of the form "foo/bar" are automatically interpreted as phrases in
Xapian due to the embedded slash.
Assume, separate messages with application/pdf and application/x-pdf
are indexed, then:
- mimetype:application/x-pdf will find only the application/x-pdf
- mimetype:application/pdf will find only the application/pdf
- mimetype:pdf will find both of the messages
Todd [Thu, 22 Jan 2015 23:43:37 +0000 (17:43 -0600)]
Add the NOTMUCH_FEATURE_INDEXED_MIMETYPES database feature
This feature will exist in all newly created databases, but there is
no upgrade provided for it. If this flag exists, it indicates that
the database was created after the indexed MIME-types feature was
added.
Todd [Thu, 22 Jan 2015 23:43:36 +0000 (17:43 -0600)]
test: Add failing unit tests for indexed mime types
Adds three failing unit tests for searching of mime-types.
An attempt was made at adding a negative test (i.e. searching for a
non-existent mime-type and ensuring it didn't return a message), but
that test would always pass making it pointless.
David Bremner [Thu, 22 Jan 2015 08:37:32 +0000 (09:37 +0100)]
emacs: escape % in header line format
We set header-line-format to the message subject, but if the subject
contains percents, the next character is interpreted as a formatting
control, which is not desired.
Tomi Ollila [Sun, 21 Sep 2014 18:06:20 +0000 (21:06 +0300)]
test: prepare test-lib.sh for possible test system debug session
When something in tests fails one possibility to test is to run
the test script as `bash -x TXXX-testname.sh`. As stderr (fd 2) was
redirected to separate file during test execution also this set -x
(xtrace) output would also go there.
test-lib.sh saves the stderr to fd 7 from where it can be restored,
and bash has BASH_XTRACEFD variable, which is now given the same value
7, making bash to output all xtrade information (consistently) there.
This lib file used to save fd's 1 & 2 to 6 & 7 (respectively) in
test_begin_subtest(), but as those needs to be set *before* XTRACEFD
variable is set those are now saved at the beginning of the lib (once).
This is safe and simple thing to do.
To make xtrace output more verbose PS4 variable was set to contain the
source file, line number and if execution is in function, that function
name. Setting this variable has no effect when not xtracing.
As it is known that fd 6 is redirected stdout, printing status can now
use that fd, instead of saving stdout to fd 5 and use it.
Todd [Sat, 17 Jan 2015 15:51:46 +0000 (09:51 -0600)]
lib: Fix use after free
_thread_set_subject_from_message sometimes replaces the subject, making the
cur_subject point to free'd memory
==6550== ERROR: AddressSanitizer: heap-use-after-free on address 0x601a0000bec0 at pc 0x4464a4 bp 0x7fffa40be910 sp 0x7fffa40be908
READ of size 1 at 0x601a0000bec0 thread T0
#0 0x4464a3 in _thread_add_matched_message /home/todd/.apps/notmuch/lib/thread.cc:369
#1 0x443c2c in notmuch_threads_get /home/todd/.apps/notmuch/lib/query.cc:496
#2 0x41d947 in do_search_threads /home/todd/.apps/notmuch/notmuch-search.c:131
#3 0x40a3fe in main /home/todd/.apps/notmuch/notmuch.c:345
#4 0x7f4e535b4ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#5 0x40abe6 in _start ??:?
0x601a0000bec0 is located 96 bytes inside of 134-byte region [0x601a0000be60,0x601a0000bee6)
freed by thread T0 here:
#0 0x7f4e54e6933a in __interceptor_free ??:?
#1 0x7f4e54482fab in _talloc_free ??:?
previously allocated by thread T0 here:
#0 0x7f4e54e6941a in malloc ??:?
#1 0x7f4e54485b5d in talloc_strdup ??:?
Todd [Sat, 17 Jan 2015 15:51:45 +0000 (09:51 -0600)]
lib: Fix potential invalid read past an empty string
==22884== ERROR: AddressSanitizer: heap-buffer-overflow on address 0x601600008291 at pc 0x7ff6295680e5 bp 0x7fff4ab9aa40 sp 0x7fff4ab9aa08
READ of size 1 at 0x601600008291 thread T0
#0 0x7ff6295680e4 in __interceptor_strcmp ??:?
#1 0x44763b in _thread_add_message /home/todd/.apps/notmuch/lib/thread.cc:255
#2 0x4459e8 in notmuch_threads_get /home/todd/.apps/notmuch/lib/query.cc:496
#3 0x41e2a7 in do_search_threads /home/todd/.apps/notmuch/notmuch-search.c:131
#4 0x40a408 in main /home/todd/.apps/notmuch/notmuch.c:345
#5 0x7ff627cb9ec4 in __libc_start_main /build/buildd/eglibc-2.19/csu/libc-start.c:287
#6 0x40abf3 in _start ??:?
0x601600008291 is located 0 bytes to the right of 97-byte region [0x601600008230,0x601600008291)
allocated by thread T0 here:
#0 0x7ff62956e41a in malloc ??:?
#1 0x7ff628b8ab5d in talloc_strdup ??:?
Michal Sojka [Fri, 19 Sep 2014 18:16:40 +0000 (20:16 +0200)]
Emacs: Display a message when generating address completion candidates
The TAB-initiated address completion generates completion candidates
synchronously, blocking the UI. Since this can take long time, it is
better to let the use know what's happening.
Jesse Rosenthal [Wed, 29 Oct 2014 20:51:44 +0000 (16:51 -0400)]
test-lib: Add dummy subject to force empty subject
At the moment, the test-lib fills in any missing headers. This makes
it impossible to test our handling of empty subjects. This will
allow us to use a special dummy subject -- `@FORCE_EMPTY` -- to force
the subject to remain empty.
Jesse Rosenthal [Wed, 29 Oct 2014 20:51:43 +0000 (16:51 -0400)]
thread.cc: Avoid empty thread names if possible.
Currently the thread is named based on either the oldest or newest
matching message (depending on the search order). If this message has
an empty subject, though, the thread will show up with an empty
subject in the search results. (See the thread starting with
`id:1412371140-21051-1-git-send-email-david@tethera.net` for an
example.)
This changes the behavior so it will use a non-empty name for the
thread if possible. We name threads based on (a) non-empty matches for
the query, and (b) the search order. If the search order is
oldest-first (as in the default inbox) it chooses the oldest matching
non-empty message as the subject. If the search order is newest-first
it chooses the newest one.
David Edmondson [Fri, 31 Oct 2014 08:05:20 +0000 (08:05 +0000)]
emacs: More flexible washed faces.
The faces used when washing messages should be notmuch specific and
inherit from the underlying emacs face rather than using it
directly. This allows the washed face to be modified without requiring
the modification of the underlying face.
David Bremner [Sat, 3 Jan 2015 12:58:19 +0000 (13:58 +0100)]
configure: add check for python interepreter name
Currently we hardcode "python" in several places. This makes things
hard for people who have only commands called python3 and/or
python2. We also add the name to sh.config to eventually replace the
current workaround in the test suite.
The approach of this commit avoids modifying the python module path,
which is arguably preferable since it avoids potentially accidentally
importing a module from the wrong place.
David Bremner [Sun, 28 Dec 2014 10:45:08 +0000 (11:45 +0100)]
lib: another iterator-temporary/stale-pointer bug
Tamas Szakaly points out [1] that the bug fixed in 51b073c still
exists in at least one place. This change follows the suggestion of
[2] and creates a block scope temporary std::string to avoid the rules
of iterators temporaries.
Jani Nikula [Sun, 23 Nov 2014 11:15:12 +0000 (13:15 +0200)]
lib: drop the deprecation message for single-message mbox files
We generally do not support mbox files, but for historical reasons
we've supported single-message mbox files, with a deprecation
message. We've tried dropping the support altogether, but backed out
of it because we'd need to stop indexing them, while keeping support
for previously indexed files. This would be more complicated than
simply supporting single-message mbox files. Therefore, drop the
deprecation message, and just silently accept single-message mboxes.
Jesse Rosenthal [Sat, 22 Nov 2014 13:17:16 +0000 (08:17 -0500)]
lib: Use email address instead of empty real name.
Currently, if a From-header is of the form:
"" <address@example.com>
the empty string will be treated as a valid real-name, and the entry
in the search results will be empty.
The new behavior here is that we treat an empty real-name field as if
it were null, so that the email address will be used in the search
results instead.
David Edmondson [Tue, 18 Nov 2014 07:03:17 +0000 (07:03 +0000)]
emacs: `with-current-notmuch-show-message' should not leak `coding-system-for-read'
`with-current-notmuch-show-message' applies a `no-conversion' coding
system when reading a raw message from notmuch. That coding system
should _not_ be applied when the body of the macro is evaluated, as it
can cause file operations used during that evaluation to incorrectly
apply the `no-conversion' coding system.
This was discovered when a user's .signature file contained non-ASCII
characters. When a message is forwarded, the `no-conversion' coding
system was applied to the reading of the .signature file, resulting in
raw rather than UTF-8 interpretation of the data.
David Bremner [Thu, 13 Nov 2014 20:59:43 +0000 (21:59 +0100)]
NEWS: deprecate notmuch deliver
notmuch-deliver has no commits for about 2.5 years. notmuch-insert has
all the features that deliver does, and as far as I understand the
error handling has now caught up.
Jani Nikula [Mon, 10 Nov 2014 19:27:52 +0000 (21:27 +0200)]
NEWS: notmuch insert, search updates
News for
- cli: add support for notmuch search --duplicate=N with --output=messages
- cli/insert: add post-insert hook
- cli/insert: require succesful message indexing for success statu
Michal Sojka [Wed, 5 Nov 2014 00:25:56 +0000 (01:25 +0100)]
cli: search: Convert --output to keyword argument
Now, when address related outputs are in a separate command, it makes
no sense to combine multiple --output options in search command line.
Using switch statement to handle different outputs is more readable
than a series of if statements.
Michal Sojka [Wed, 5 Nov 2014 00:25:55 +0000 (01:25 +0100)]
cli: Introduce "notmuch address" command
This moves address-related functionality from search command to the
new address command. The implementation shares almost all code and
some command line options.
Options --offset and --limit were intentionally not included in the
address command, because they refer to messages numbers, which users
do not see in the output. This could confuse users because, for
example, they could see more addresses in the output that what was
specified with --limit. This functionality can be correctly
reimplemented for address subcommand later.
Also useless values of --exclude flag were not included in the address
command.
Michal Sojka [Wed, 5 Nov 2014 00:25:52 +0000 (01:25 +0100)]
cli: search: Convert ctx. to ctx->
In the next commit, notmuch_search_command will be refactored to
several smaller functions. In order to simplify the next commit to
verbatim move of several lines to new functions with search_context_t*
argument, we convert all references to this structure to pointer
dereferences. To do so we rename the context variable and use the
original name ctx as the pointer to the renamed structure.
Michal Sojka [Wed, 5 Nov 2014 00:25:51 +0000 (01:25 +0100)]
cli: search: Move more variables into search_context_t
In order to share some command line options between search and address
subcommands we need to add corresponding variables to the context
structure. While we are at it, we also add notmuch_database_t to unify
parameters of all do_search_* functions and to simplify subsequent
commits.
Tomi Ollila [Mon, 4 Aug 2014 17:39:32 +0000 (20:39 +0300)]
devel: make man-to-mdwn.pl to work with generated manual pages
The new manual pages converted from rst using sphinx or rst2man
has somewhat different syntax. man-to-mdwn.pl is now adjusted
to produce even better output from this syntax. The changes also
include using utf-8 locale (e.g. for tables and generated hypens)
and and quite a few bugs fixes.
This tool still produces better results than just using the
html pages generated using sphinx / rst2html. For example those
tools don't create inter-page hyperlinks -- and the preformatted
pages written by man-to-mdwn.pl just works well with manual page
content.
Jesse Rosenthal [Fri, 31 Oct 2014 17:33:25 +0000 (13:33 -0400)]
test: Make gen-threads work with python3
python3 doesn't allow dictionaries to be initialized with non-string
keywords. This presents problems on systems in which "python" means
"python3". We instead initalize the dictionary using the dict
comprehension and then update it with the values from the tree. This
will work with both python2 and python3.
Tomi Ollila [Sat, 1 Nov 2014 09:39:04 +0000 (11:39 +0200)]
configure: move make {,install} instructions to the end
There was theorical possibility that writing the config files could
have skipped (by interruption) after the instructions how to make
notmuch was printed out.
Michal Sojka [Fri, 31 Oct 2014 21:53:58 +0000 (22:53 +0100)]
cli: search: Add --output={sender,recipients}
The new outputs allow printing senders, recipients or both of matching
messages. To print both, the user can use --output=sender and
--output=recipients simultaneously.
Currently, the same address can appear multiple times in the output.
The next commit will change this. For this reason, tests are
introduced there.
We use mailbox_t rather than InternetAddressMailbox because we will
need to extend it in a following commit.