[PATCH v2 4/9] tools/scripts: parallel_parsearchive - load archives in parallel

Stephen Finucane stephen at that.guru
Sun Feb 25 23:16:29 AEDT 2018


On Sun, 2018-02-25 at 01:50 +1100, Daniel Axtens wrote:
> If you have multiple archives, you quickly tire of typing stuff like
> python3 manage.py parsearchive --list-id=patchwork.ozlabs.org foo-1 &
> python3 manage.py parsearchive --list-id=patchwork.ozlabs.org foo-2 &
> python3 manage.py parsearchive --list-id=patchwork.ozlabs.org foo-3 &
> python3 manage.py parsearchive --list-id=patchwork.ozlabs.org foo-4 &
> and having to copy and paste it - or retype it! - each time you reset
> the database.
> 
> Instead, this patch allows you to do
> tools/scripts/parallel_parsearchive.sh --list-id=patchwork.ozlabs.org -- foo-*
> 
> Much easier, especially when you are doing it a dozen times.
> 
> Reviewed-by: Andrew Donnellan <andrew.donnellan at au1.ibm.com>
> Signed-off-by: Daniel Axtens <dja at axtens.net>
> 
> --
> v2: Include example, thanks Andrew
>     Allow python to be overridden by the PW_PYTHON variable,
>     defaulting to Python 3.
> ---

This works, but it's less obvious than I'd like. I realise threads
won't give us real parallelism here, thanks to the GIL. However, I have
used multiprocessing in the past to solve similar problems, and there
is prior art for using it in management commands [1]. Any reason we
can't do the same here?
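
For illustration, a minimal sketch of the multiprocessing approach might
look like the following. The names load_archive() and load_archives()
are hypothetical and not part of Patchwork; load_archive() only counts
the messages in an mbox as a stand-in for the per-archive work that
parsearchive does.

```python
import mailbox
import multiprocessing


def load_archive(path):
    """Hypothetical stand-in for the per-archive work of parsearchive.

    This only counts the messages in an mbox file; the real command
    would parse each message and store it in the database.
    """
    mbox = mailbox.mbox(path, create=False)
    try:
        return path, len(mbox)
    finally:
        mbox.close()


def load_archives(paths, processes=None):
    """Run load_archive() for each path in a pool of worker processes.

    Each worker is a separate interpreter process, so the workers do
    not contend for a single GIL. With processes=None the pool size
    defaults to the number of CPUs rather than one process per archive.
    """
    with multiprocessing.Pool(processes=processes) as pool:
        # map() blocks until every archive is done and keeps input order.
        return pool.map(load_archive, paths)
```

A management command could then call load_archives() with the archive
paths it was given, guarded by the usual `if __name__ == "__main__":`
check when run as a script. Unlike the backgrounded shell jobs, this
waits for all archives to finish. In a real Django command each worker
would presumably also need its own database connection rather than one
inherited from the parent.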

Stephen

[1] https://brobin.me/blog/2017/05/mutiprocessing-in-python-django-management-commands/

>  tools/scripts/parallel_parsearchive.sh | 61 ++++++++++++++++++++++++++++++++++
>  1 file changed, 61 insertions(+)
>  create mode 100755 tools/scripts/parallel_parsearchive.sh
> 
> diff --git a/tools/scripts/parallel_parsearchive.sh b/tools/scripts/parallel_parsearchive.sh
> new file mode 100755
> index 000000000000..f03875b85d6a
> --- /dev/null
> +++ b/tools/scripts/parallel_parsearchive.sh
> @@ -0,0 +1,61 @@
> +#!/bin/bash
> +# Patchwork - automated patch tracking system
> +# Copyright (C) 2018 Daniel Axtens <dja at axtens.net>
> +#
> +# This file is part of the Patchwork package.
> +#
> +# Patchwork is free software; you can redistribute it and/or modify
> +# it under the terms of the GNU General Public License as published by
> +# the Free Software Foundation; either version 2 of the License, or
> +# (at your option) any later version.
> +#
> +# Patchwork is distributed in the hope that it will be useful,
> +# but WITHOUT ANY WARRANTY; without even the implied warranty of
> +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
> +# GNU General Public License for more details.
> +
> +set -euo pipefail
> +
> +usage() {
> +    cat <<EOF
> +parallel_parsearchive.sh - load archives in parallel
> +Usage:
> +  parallel_parsearchive.sh [parsearchive options] -- <archives>
> +  The -- is mandatory.
> +  As many processes as there are archives will be spun up.
> +
> +Example:
> +  tools/scripts/parallel_parsearchive.sh --list-id=patchwork.ozlabs.org -- foo-*
> +EOF
> +    exit 1
> +}
> +
> +if [ $# -eq 0 ] || [[ $1 == "-h" ]]; then
> +    usage;
> +fi
> +
> +PARSEARCHIVE_OPTIONS=""
> +while [[ $1 != "--" ]]; do
> +    PARSEARCHIVE_OPTIONS="$PARSEARCHIVE_OPTIONS $1"
> +    shift
> +    if [ $# -eq 0 ]; then
> +        usage;
> +    fi
> +done
> +shift
> +
> +if [ $# -eq 0 ]; then
> +    usage;
> +fi
> +
> +set +u
> +if [ -z "$PW_PYTHON" ]; then
> +    PW_PYTHON=python3
> +fi
> +set -u
> +
> +for x in "$@"; do
> +    echo "Starting $x"
> +    "$PW_PYTHON" manage.py parsearchive $PARSEARCHIVE_OPTIONS "$x" &
> +done
> +echo "Processes started in the background."


