-## Notmuch Email Corpus
+[[!img notmuch-logo.png alt="Notmuch logo" class="left"]]
+# Notmuch Email Corpus
-A corpus of about 108k messages is available for performance testing of
+A corpus of about 209k messages is available for performance testing of
notmuch (or other uses).
The contents are as follows:
- `Mail/enron`: selected data from the EDRM v2 enron data set
- CC Attribution: "ZL Technologies, Inc. (http://www.zlti.com)"
-
+
- Downloaded via bittorrent
http://www.searchdaimon.com/community/dataset/
-
+
- massaged with scripts/unpack-enron.sh (in the corpus tarball)
+- `Mail/lkml`: lkml messages 1000000 to 1100000 from the gmane archive
+
The corpus is gpg signed by David Bremner with key fingerprint:
- 815B 6398 2A79 F8E7 C727 86C4 762B 57BB 7842 06AD
+ 7A18 807F 100A 4570 C596 8420 7E4E 65C8 720B 706B
You can download the corpus from
-- [notmuchmail.org](http:///notmuchmail.org/releases/notmuch-email-corpus-0.2.tar.xz) [signature](http:///notmuchmail.org/releases/notmuch-email-corpus-0.2.tar.xz.asc)
-- [UNB](http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.2.tar.xz) [signature](http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.2.tar.xz.asc)
-- [Corpus 0.3](http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.3.tar.xz) [signature](http://tesseract.cs.unb.ca/notmuch/notmuch-email-corpus-0.3.tar.xz.asc)
-
-
+- [notmuchmail.org](https://notmuchmail.org/releases/notmuch-email-corpus-0.5.tar.xz) [signature](https://notmuchmail.org/releases/notmuch-email-corpus-0.5.tar.xz.asc)
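
Checking the downloaded tarball against the fingerprint listed above could look roughly like the following. This is a minimal sketch, not part of the corpus or of notmuch: the helper names (`normalize_fingerprint`, `valid_signature_fingerprints`, `verify_corpus`) are hypothetical, and it assumes that `gpg` is on the `PATH` and that the signing key has already been imported into the local keyring. It relies on GnuPG's machine-readable `VALIDSIG` status line, whose first argument is the fingerprint of the key that made the signature and whose last argument, when present, is the primary key fingerprint.

```python
import re
import subprocess

# Fingerprint as published on this page for corpus 0.5.
EXPECTED_FINGERPRINT = "7A18 807F 100A 4570 C596 8420 7E4E 65C8 720B 706B"


def normalize_fingerprint(fpr):
    """Remove whitespace and upper-case, so spaced and unspaced forms compare equal."""
    return re.sub(r"\s+", "", fpr).upper()


def valid_signature_fingerprints(status_text):
    """Collect 40-hex-digit fingerprints from VALIDSIG lines of `gpg --status-fd` output."""
    found = set()
    for line in status_text.splitlines():
        fields = line.split()
        if len(fields) >= 3 and fields[0] == "[GNUPG:]" and fields[1] == "VALIDSIG":
            # First argument: signing (sub)key; last argument: primary key, if reported.
            for candidate in (fields[2], fields[-1]):
                if re.fullmatch(r"[0-9A-Fa-f]{40}", candidate):
                    found.add(candidate.upper())
    return found


def verify_corpus(tarball, signature, expected=EXPECTED_FINGERPRINT):
    """Return True if gpg accepts the signature and it was made by the expected key.

    Assumption: the key is already in the local keyring; this sketch does not fetch it.
    """
    result = subprocess.run(
        ["gpg", "--status-fd", "1", "--verify", signature, tarball],
        capture_output=True,
        text=True,
    )
    fingerprints = valid_signature_fingerprints(result.stdout)
    return result.returncode == 0 and normalize_fingerprint(expected) in fingerprints
```

With both files downloaded into the current directory, `verify_corpus("notmuch-email-corpus-0.5.tar.xz", "notmuch-email-corpus-0.5.tar.xz.asc")` would be the call to make. Comparing the full fingerprint rather than a short key ID is deliberate, since short IDs are easy to collide.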