[PATCH 0/9] powerpc: delete duplicated words

Randy Dunlap rdunlap at infradead.org
Mon Jul 27 05:08:08 AEST 2020


On 7/26/20 10:49 AM, Joe Perches wrote:
> On Sun, 2020-07-26 at 10:23 -0700, Randy Dunlap wrote:
>> On 7/26/20 7:29 AM, Christophe Leroy wrote:
>>> Randy Dunlap <rdunlap at infradead.org> wrote:
>>>
>>>> Drop duplicated words in arch/powerpc/ header files.
>>>
>>> How did you detect them? Do you have some script for that, or did you just read all the comments?
>>
>> Yes, it's a script that finds lots of false positives, so I have to check
>> each and every one of them for validity.
> 
> And it's a lot of work too. (thanks Randy)
> 
> It could be something like:
> 
> $ grep-2.5.4 -nrP --include=*.[ch] '\b([A-Z]?[a-z]{2,}\b)[ \t]*(?:\n[ \t]*\*[ \t]*|)\1\b' * | \
>   grep -vP '\b(?:struct|enum|union)\s+([A-Z]?[a-z]{2,})\s+\*?\s*\1\b' | \
>   grep -vP '\blong\s+long\b' | \
>   grep -vP '\b([A-Z]?[a-z]{2,})(?:\t+| {2,})\1\b'

Hi Joe,

(What is grep-2.5.4?)

It looks like you tried a few iterations of this -- since it drops things
like "long long".  There are lots of data types that are repeated & valid.
And many struct names, like "struct kref kref", "struct completion completion",
and "struct mutex mutex".  I handle (ignore) those manually, although that
could be added to the Perl script.
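
For illustration, here is a minimal sketch of what such a filter might look
like. It is written in Python rather than Perl and is not taken from
find_dup_words.pl; the names DECL_RE, VALID_REPEATS and is_valid_repeat are
hypothetical, and the set of valid repeats is surely incomplete.

```python
import re

# Hypothetical filter, not part of find_dup_words.pl.
# Matches declarations such as "struct kref kref" or "struct mutex *mutex",
# where the type name and the variable name are legitimately identical.
DECL_RE = re.compile(r'\b(?:struct|enum|union)\s+(\w+)\s+\*?\s*\1\b')

# Words that may validly appear twice in a row in C (assumed, incomplete list).
VALID_REPEATS = {"long"}


def is_valid_repeat(line: str, word: str) -> bool:
    """Return True if the doubled `word` on `line` looks like valid C, not a typo."""
    if word in VALID_REPEATS:
        return True
    m = DECL_RE.search(line)
    return bool(m and m.group(1) == word)
```

Like the grep pipeline quoted above, this also hides a doubled "long" inside a
comment, so it trades a few missed typos for fewer false positives.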

v0.1 of this script also found lots of repeated numbers and strings of
special characters (ASCII art etc.), so now it ignores duplicated numbers
or special characters -- since it is really looking for duplicate words.
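
The "words only" rule can be sketched roughly as follows. This is a Python
illustration under my own assumptions, not the attached Perl script; DUP_RE
and find_dup_words are hypothetical names, and the sketch only looks within
a single line and is case-sensitive.

```python
import re

# Two identical alphabetic words separated only by whitespace.
# Requiring [A-Za-z]{2,} skips repeated numbers ("0 0") and runs of
# special characters ("---- ----"), which are common in ASCII art.
DUP_RE = re.compile(r'\b([A-Za-z]{2,})\s+\1\b')


def find_dup_words(text: str):
    """Yield (line_number, word) for each doubled word found in `text`."""
    for lineno, line in enumerate(text.splitlines(), start=1):
        for m in DUP_RE.finditer(line):
            yield lineno, m.group(1)
```

Every hit still needs a manual check, since the pattern cannot tell a typo
from a valid repeat.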

Anyway, I might as well attach it. It's no big deal.
And if someone else wants to tackle using it, go for it.

-- 
~Randy

-------------- next part --------------
A non-text attachment was scrubbed...
Name: find_dup_words.pl
Type: application/x-perl
Size: 2958 bytes
Desc: not available
URL: <http://lists.ozlabs.org/pipermail/linuxppc-dev/attachments/20200726/e134a7be/attachment-0001.bin>

