| Age | Commit message | Author |
|
It's obviously an innocent-enough message, and the right thing is
so easy to do.
|
|
This allows for indexing an arbitrary number of messages with a
single invocation rather than just a single message on the command
line.
|
|
This is a step toward having a program that will index many messages
with a single invocation.
|
|
We're basically matching sup now! (As long as one uses sup with my
special notmuch_index.rb file.)
|
|
Pull the "constant" source_id value out from among several calls
that set a value based on the Message ID.
|
|
Getting closer to sup results all the time.
|
|
We identify it based on a trailing ':' on the line before a quote
begins.
At this point the database-dump diff between sup and notmuch is
getting very, very small, (at least for our one test message).
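
A minimal sketch of the attribution rule described above, assuming the
message body has already been split into lines and that a quote is any
line starting with '>'. The helper names (ends_with_colon, is_quote_line,
is_attribution) are hypothetical and are not taken from the notmuch source.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* True if the line ends in ':' once trailing whitespace is ignored. */
static bool
ends_with_colon (const char *line)
{
    size_t len = strlen (line);

    while (len > 0 && (line[len - 1] == ' ' || line[len - 1] == '\t' ||
                       line[len - 1] == '\r' || line[len - 1] == '\n'))
        len--;

    return len > 0 && line[len - 1] == ':';
}

/* Assumption: a quoted line is one whose first character is '>'. */
static bool
is_quote_line (const char *line)
{
    return line[0] == '>';
}

/* Hypothetical helper: lines[i] is an attribution when it ends in ':'
 * and the very next line begins a quote. */
static bool
is_attribution (const char *const *lines, size_t n_lines, size_t i)
{
    if (i + 1 >= n_lines)
        return false;

    return ends_with_colon (lines[i]) && is_quote_line (lines[i + 1]);
}
```

An indexer could call is_attribution() on each line and skip the ones
for which it returns true, though whether to skip them at all is a
policy choice rather than something this sketch decides.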
|
|
At this point, we're achieving a result that is *very* close to
what sup does. The only difference is that we are still indexing
the "excerpts from message ..." line, and we are not yet indexing
references.
|
|
This one is complex enough to deserve its own treatment.
|
|
Most of this code is fairly clean and works well. One part is
fairly painful---namely extracting the body of an email message
from libgmime. Currently, I'm just extracting the offset to
the end of the headers, and then separately opening the message.
Surely there's a better way.
Anyway, with that the results are looking very similar to sup-sync
now, (as verified by xapian-dump). The only substantial difference
I'm seeing now is that sup does not seem to index quoted portions
of messages nor signatures. I'm not actually sure whether I want
to follow sup's lead in that or not.
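
For illustration only, here is one way to get at the body once the
message file is opened separately. It is not what the commit does (the
commit takes the end-of-headers offset from libgmime); this sketch
instead scans for the first empty line, which is what separates headers
from body in an RFC 5322 message. The function name find_body_offset is
hypothetical.

```c
#include <stdio.h>

/* Return the byte offset of the first body byte in an already-open
 * message, i.e. the position just past the first empty line, or -1
 * if the message contains no empty line. */
static long
find_body_offset (FILE *fp)
{
    int c;
    int at_line_start = 1;

    while ((c = fgetc (fp)) != EOF) {
        if (c == '\r')
            continue;           /* tolerate CRLF line endings */
        if (c == '\n') {
            if (at_line_start)
                return ftell (fp);
            at_line_start = 1;
        } else {
            at_line_start = 0;
        }
    }

    return -1;
}
```

A caller would fseek() to the returned offset and read the body text
from there.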
|
|
In preparation for actually creating a Xapian index from the
message, (not that we're doing that quite yet).
|
|
Just to make it easier to visually identify where one document ends
and the next begins.
|
|
At the same time, I've started hacking up sup with a new NotmuchIndex
class in the place of the previous XapianIndex class. The new class
stores only the source_info field in the document data, (rather than
a serialized ruby hash with a bunch of data that can be found in the
original message).
Eventually, I plan to replace source_info with a relative filename for
the message, (or even a list of filenames for when multiple messages
in the database share a common message ID).
|
|
The interface for this is cheesy, (bare integer value numbers on the
command line indicating that unserialization is desired for those
value numbers). But this at least lets us print sup databases with
human-readable output for the date values.
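
A rough sketch of how bare integers on the command line might be told
apart from other arguments. The helper name parse_value_number is
hypothetical, and the actual unserialization step (done through Xapian
in the real tool) is deliberately left out.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: parse a bare non-negative integer argument.
 * Returns 1 and stores the number on success, 0 otherwise. */
static int
parse_value_number (const char *arg, unsigned long *value_no)
{
    char *end;

    if (arg[0] < '0' || arg[0] > '9')
        return 0;       /* reject "", signs, and leading whitespace */

    errno = 0;
    *value_no = strtoul (arg, &end, 10);

    /* Accept only if the whole argument was consumed without overflow. */
    return errno == 0 && *end == '\0';
}
```

Arguments for which this returns 0 would be treated as something other
than a value number, for example the database path.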
|
|
It's not a complete tool yet, but it at least does something now.
|
|
Compiling with -Wall considered useful.
|
|
This will (when it is finished) make a much more reliable way to
ensure that notmuch's sync program behaves identically to sup-sync.
It doesn't actually do anything yet.
|
|
What I've done here is to instrument sup-sync to print the text
and terms objects it constructs just before indexing a message.
Then I've made my g_mime_test program achieve (nearly) identical
output for an example email message, (just missing the body
text). Next we can start shoving this data into a Xapian index.
|
|
Basically just playing with some simple code using libgmime to parse
an email message.
|