[PATCH] parsearchive: Support maildirs

Stephen Finucane stephen at that.guru
Mon Apr 10 03:14:03 AEST 2017


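At present, the 'parsearchive' command only supports parsing of mboxes.
Expand this to support maildirs. This allows us to rewrite the
'parsemail-batch' script to call 'parsearchive' once for the whole
directory instead of invoking 'parsemail' once per mail, which should
deliver much improved performance.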

Signed-off-by: Stephen Finucane <stephen at that.guru>
Suggested-by: Daniel Axtens <dja at axtens.net>
---
This is an alternative to [1] that avoids adding yet another
management command.

[1] https://patchwork.ozlabs.org/patch/731892/
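
For anyone who wants to try the mbox/maildir selection outside of
Patchwork, here is a minimal standalone sketch using only the standard
library 'mailbox' module. The names 'open_archive', 'make_message' and
'demo' are hypothetical and not part of this patch, and passing
'create=False' to 'mailbox.Maildir' is an extra assumption of the sketch
(the patch itself relies on the existing path check instead).

```python
import mailbox
import os
import tempfile
from email.message import EmailMessage


def open_archive(path):
    """Open an mbox if 'path' is a regular file, else treat it as a maildir.

    Hypothetical helper mirroring the check added to parsearchive.
    """
    if os.path.isfile(path):
        return mailbox.mbox(path)
    # Assumption of this sketch: refuse to create a maildir that is missing.
    return mailbox.Maildir(path, create=False)


def make_message(subject):
    """Build a tiny placeholder mail for the demo."""
    msg = EmailMessage()
    msg["From"] = "dev@example.com"
    msg["Subject"] = subject
    msg.set_content("body\n")
    return msg


def demo():
    """Create one mbox and one maildir, then count mails via open_archive."""
    with tempfile.TemporaryDirectory() as tmp:
        mbox_path = os.path.join(tmp, "archive.mbox")
        box = mailbox.mbox(mbox_path)
        box.add(make_message("[PATCH 1/2] first"))
        box.add(make_message("[PATCH 2/2] second"))
        box.close()  # flushes pending mails to disk

        maildir_path = os.path.join(tmp, "archive-maildir")
        maildir = mailbox.Maildir(maildir_path, create=True)
        maildir.add(make_message("[PATCH] only"))
        maildir.close()

        counts = {}
        for path in (mbox_path, maildir_path):
            archive = open_archive(path)
            # len() works the same way for both mailbox types
            counts[type(archive).__name__] = len(archive)
            archive.close()
        return counts


if __name__ == "__main__":
    print(demo())
```

Both mailbox classes share the 'mailbox.Mailbox' interface, which is why
the rest of the command (the 'len(mbox)' call and the iteration over
mails) should not need to change.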
---
 patchwork/bin/parsemail-batch.sh              | 27 +++++++++------------------
 patchwork/management/commands/parsearchive.py |  7 ++++++-
 2 files changed, 15 insertions(+), 19 deletions(-)

diff --git a/patchwork/bin/parsemail-batch.sh b/patchwork/bin/parsemail-batch.sh
index d42712e..3d3725c 100755
--- a/patchwork/bin/parsemail-batch.sh
+++ b/patchwork/bin/parsemail-batch.sh
@@ -1,7 +1,7 @@
 #!/bin/sh
 #
 # Patchwork - automated patch tracking system
-# Copyright (C) 2008 Jeremy Kerr <jk at ozlabs.org>
+# Copyright (C) 2017 Stephen Finucane <stephen at that.guru>
 #
 # This file is part of the Patchwork package.
 #
@@ -20,25 +20,16 @@
 # Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA  02111-1307  USA
 
 BIN_DIR=$(dirname "$0")
+PATCHWORK_BASE=$(readlink -e "$BIN_DIR/../..")
 
-if [ $# -lt 1 ]; then
-    echo "usage: $0 <dir> [options]" >&2
-    exit 1
+if [ -z "$PW_PYTHON" ]; then
+    PW_PYTHON=python2
 fi
 
-mail_dir="$1"
-
-echo "dir: $mail_dir"
-
-if [ ! -d "$mail_dir" ]; then
-    echo "$mail_dir should be a directory"? >&2
-    exit 1
+if [ -z "$DJANGO_SETTINGS_MODULE" ]; then
+    DJANGO_SETTINGS_MODULE=patchwork.settings.production
 fi
 
-shift
-
-find "$mail_dir" -maxdepth 1 |
-while read -r line; do
-    echo "$line"
-    "$BIN_DIR/parsemail.sh" "$@" < "$line"
-done
+PYTHONPATH="${PATCHWORK_BASE}:${PATCHWORK_BASE}/lib/python:$PYTHONPATH" \
+    DJANGO_SETTINGS_MODULE="$DJANGO_SETTINGS_MODULE" \
+    "$PW_PYTHON" "$PATCHWORK_BASE/manage.py" parsearchive "$@"
diff --git a/patchwork/management/commands/parsearchive.py b/patchwork/management/commands/parsearchive.py
index 40b2cc0..a3c8360 100644
--- a/patchwork/management/commands/parsearchive.py
+++ b/patchwork/management/commands/parsearchive.py
@@ -69,7 +69,12 @@ class Command(BaseCommand):
             self.stdout.write('Invalid path: %s' % path)
             sys.exit(1)
 
-        mbox = mailbox.mbox(path)
+        # assume if <infile> is a directory, then we're passing a maildir
+        if os.path.isfile(path):
+            mbox = mailbox.mbox(path)
+        else:
+            mbox = mailbox.Maildir(path)
+
         count = len(mbox)
 
         logger.info('Parsing %d mails', count)
-- 
2.9.3


