[PATCH 3/3] parsearchive: Handle duplicates
Stephen Finucane
stephen.finucane at intel.com
Tue Dec 22 03:21:43 AEDT 2015
The parsearchive tool can be used to load missing messages from a
mailman archive or another source. In this use case, it is likely
that at least some of the messages found in the archive are already
stored in patchwork. Handle this case by skipping these duplicates
and logging how many were found.
Signed-off-by: Stephen Finucane <stephen.finucane at intel.com>
---
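Note for reviewers: below is a minimal standalone sketch of the idea, for
illustration only. It swaps the Django ORM for sqlite3 from the standard
library, and the `message` table, `store_messages()` and `make_msg()` are
hypothetical names, not patchwork code. It assumes, as the patch does, that
a duplicate shows up as an integrity error raised by a uniqueness
constraint on the stored message.

```python
import sqlite3
from email.message import Message


def store_messages(conn, messages):
    """Insert messages keyed on Message-ID; return (processed, duplicates)."""
    processed = duplicates = 0
    for msg in messages:
        processed += 1
        try:
            # "with conn" commits on success and rolls back on error,
            # so a rejected insert does not affect later ones.
            with conn:
                conn.execute(
                    'INSERT INTO message (msgid, subject) VALUES (?, ?)',
                    (msg['Message-ID'], msg['Subject']))
        except sqlite3.IntegrityError:
            # UNIQUE constraint hit: this Message-ID is already stored.
            duplicates += 1
    return processed, duplicates


def make_msg(msgid, subject):
    """Build a small in-memory message (hypothetical test helper)."""
    msg = Message()
    msg['Message-ID'] = msgid
    msg['Subject'] = subject
    return msg


conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE message (msgid TEXT UNIQUE NOT NULL, subject TEXT)')

# The third message repeats the first Message-ID, as a re-imported
# archive entry would.
archive = [make_msg('<1@example.com>', 'first'),
           make_msg('<2@example.com>', 'second'),
           make_msg('<1@example.com>', 'first')]
result = store_messages(conn, archive)
print('Processed %d messages, %d duplicates' % result)
```

Relying on the database constraint, rather than querying for the message
first, keeps the check in one place and avoids a lookup per message; the
trade-off is that any other integrity error would also be counted as a
duplicate.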
 patchwork/bin/parsearchive.py | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

diff --git a/patchwork/bin/parsearchive.py b/patchwork/bin/parsearchive.py
index f879215..33cb5cb 100755
--- a/patchwork/bin/parsearchive.py
+++ b/patchwork/bin/parsearchive.py
@@ -31,6 +31,8 @@ import django
 
 from patchwork.bin import parsemail
 
+LOGGER = logging.getLogger(__name__)
+
 VERBOSITY_LEVELS = {
     'debug': logging.DEBUG,
     'info': logging.INFO,
@@ -42,8 +44,14 @@ VERBOSITY_LEVELS = {
 
 def parse_mbox(path, list_id):
     mbox = mailbox.mbox(path)
+    duplicates = 0
     for msg in mbox:
-        parsemail.parse_mail(msg, list_id)
+        try:
+            parsemail.parse_mail(msg, list_id)
+        except django.db.utils.IntegrityError:
+            duplicates += 1
+    LOGGER.info('Processed %d messages, %d duplicates',
+                len(mbox), duplicates)
 
 
 def main():
--
2.0.0