[PATCH v2 3/9] tools/scripts: split a mbox N ways

Daniel Axtens dja at axtens.net
Sun Feb 25 01:50:14 AEDT 2018


To test parallel loading of mail, it's handy to be able to split
an existing mbox file into N mbox files in an alternating pattern
(e.g. 1 2 1 2, or 1 2 3 4 1 2 3 4, etc.).
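
As a rough illustration of the alternating pattern (a sketch, not the
script in this patch): message i goes to output number (i % N) + 1.
The function name split_round_robin and the "%s-%d" file naming are
assumptions made for this example; it also closes each output mbox
explicitly so pending writes reach disk.

```python
import mailbox
import os
import tempfile


def split_round_robin(in_path, out_prefix, n):
    """Copy message i of the mbox at in_path into <out_prefix>-(i % n + 1).

    Hypothetical helper for illustration only. Returns the number of
    messages written to each output mbox.
    """
    src = mailbox.mbox(in_path)
    outs = [mailbox.mbox("%s-%d" % (out_prefix, i + 1)) for i in range(n)]
    counts = [0] * n
    try:
        for i, msg in enumerate(src):
            outs[i % n].add(msg)
            counts[i % n] += 1
    finally:
        for box in outs:
            box.close()  # flush pending writes and release the file
        src.close()
    return counts


if __name__ == "__main__":
    # Build a throwaway 5-message mbox and split it two ways.
    tmpdir = tempfile.mkdtemp()
    src_path = os.path.join(tmpdir, "in.mbox")
    box = mailbox.mbox(src_path)
    for k in range(5):
        box.add("From: a@example.com\nSubject: msg %d\n\nbody\n" % k)
    box.close()
    print(split_round_robin(src_path, os.path.join(tmpdir, "out"), 2))
```

With five input messages and N = 2, messages 0, 2 and 4 land in
out-1 and messages 1 and 3 land in out-2.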

Introduce tools/scripts as a place to put things like this.

Reviewed-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
Signed-off-by: Daniel Axtens <dja at axtens.net>

--

v2: address Andrew's review comments
    for full pep8 compliance, add to tox.ini testing
---
 tools/scripts/split_mail.py | 80 +++++++++++++++++++++++++++++++++++++++++++++
 tox.ini                     |  2 +-
 2 files changed, 81 insertions(+), 1 deletion(-)
 create mode 100755 tools/scripts/split_mail.py

diff --git a/tools/scripts/split_mail.py b/tools/scripts/split_mail.py
new file mode 100755
index 000000000000..d1e3b06fdf85
--- /dev/null
+++ b/tools/scripts/split_mail.py
@@ -0,0 +1,80 @@
+#!/usr/bin/python3
+# Patchwork - automated patch tracking system
+# Copyright (C) 2018 Daniel Axtens <dja at axtens.net>
+#
+# This file is part of the Patchwork package.
+#
+# Patchwork is free software; you can redistribute it and/or modify
+# it under the terms of the GNU General Public License as published by
+# the Free Software Foundation; either version 2 of the License, or
+# (at your option) any later version.
+#
+# Patchwork is distributed in the hope that it will be useful,
+# but WITHOUT ANY WARRANTY; without even the implied warranty of
+# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+# GNU General Public License for more details.
+
+import sys
+import os
+import mailbox
+
+usage = """Split a maildir or mbox into N mboxes
+in an alternating pattern
+
+Usage: ./split_mail.py <input> <mbox prefix> <N>
+
+ <input>: input mbox file or Maildir
+ <mbox prefix>: prefix for the output mboxes;
+    <mbox prefix>-1 ... <mbox prefix>-N must not exist
+ <N>: N-way split (at least 2)"""
+
+
+if len(sys.argv) != 4:
+    print(usage)
+    sys.exit(1)
+
+in_name = sys.argv[1]
+out_name = sys.argv[2]
+
+try:
+    n = int(sys.argv[3])
+except ValueError:
+    print("N must be an integer.")
+    print(" ")
+    print(usage)
+    sys.exit(1)
+
+if n < 2:
+    print("N must be at least 2")
+    print(" ")
+    print(usage)
+    sys.exit(1)
+
+if not os.path.exists(in_name):
+    print("No input at", in_name)
+    print(" ")
+    print(usage)
+    sys.exit(1)
+
+print("Opening", in_name)
+if os.path.isdir(in_name):
+    inmail = mailbox.Maildir(in_name)
+else:
+    inmail = mailbox.mbox(in_name)
+
+out = []
+for i in range(n):
+    if os.path.exists(out_name + "-" + str(i + 1)):
+        print("mbox already exists at", out_name + "-" + str(i + 1))
+        print(" ")
+        print(usage)
+        sys.exit(1)
+
+    out.append(mailbox.mbox(out_name + '-' + str(i + 1)))
+
+print("Copying messages")
+
+for (i, msg) in enumerate(inmail):
+    out[i % n].add(msg)
+
+print("Done")
diff --git a/tox.ini b/tox.ini
index 09505f78e157..345f7fe2e15a 100644
--- a/tox.ini
+++ b/tox.ini
@@ -37,7 +37,7 @@ commands =
 [testenv:pep8]
 basepython = python2.7
 deps = flake8
-commands = flake8 {posargs} patchwork patchwork/bin/pwclient
+commands = flake8 {posargs} patchwork patchwork/bin/pwclient tools/scripts/split_mail.py
 
 [flake8]
 ignore = E129, F405
-- 
2.14.1


