Description
The LICENSE file of text-unidecode is incorrectly identified as: 'gpl-2.0-plus AND gpl-2.0 AND (gpl-1.0-plus OR gpl-2.0-plus OR artistic-1.0) AND artistic-perl-1.0'
I think that the text that causes the problem is:
text-unidecode is a free software; you can redistribute
it and/or modify it under the terms of either:
* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
* Artistic License - see below:
The matches (of the text above) are:
gpl-2.0-plus matched by text-unidecode is a free software; you can redistribute\nit and/or modify it under the terms of either:\n\n* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
gpl-2.0 matched by text-unidecode is a free software; you can redistribute\nit and/or modify it under the terms of either:\n\n* GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or
gpl-1.0-plus OR gpl-2.0-plus OR artistic-1.0 matched by * GPL or GPLv2+ (see https://www.gnu.org/licenses/license-list.html#GNUGPL), or\n* Artistic License - see below:
Is scancode matching the same text sections too many times?
Note: artistic-perl-1.0 is identified from the license text in the LICENSE file referred to as "Artistic License"
How To Reproduce
$ curl -LJO https://files.pythonhosted.org/packages/ab/e2/e9a00f0ccb71718418230718b3d900e71a5d16e701a3dae079a21e9cd8f8/text-unidecode-1.3.tar.gz
$ tar zxvf text-unidecode-1.3.tar.gz
$ scancode -clipe --license-text --license-text-diagnostics --classify --license-clarity-score --todo --license-diagnostics --summary -n 16 --json-pp text-unidecode-1.3-scan.json text-unidecode-1.3
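To check whether several matches cover the same lines of the LICENSE file, the JSON output can be inspected with a small script. The following is a minimal sketch, not part of scancode: the helper names `license_matches` and `overlapping` are hypothetical, and the key names (`files`, `path`, `license_detections`, `matches`, `license_expression`, `start_line`, `end_line`) are my assumption about the layout of the 32.x JSON output, so adjust them if your scan file differs.

```python
import json
from itertools import combinations


def license_matches(scan, path_suffix="LICENSE"):
    """Collect (expression, start_line, end_line) for files whose path ends with path_suffix.

    Assumes the scan dict has the layout files -> license_detections -> matches.
    """
    found = []
    for entry in scan.get("files", []):
        if not entry.get("path", "").endswith(path_suffix):
            continue
        for detection in entry.get("license_detections", []):
            for match in detection.get("matches", []):
                start = match.get("start_line")
                end = match.get("end_line")
                # Skip matches without line information; they cannot be compared.
                if start is None or end is None:
                    continue
                found.append((match.get("license_expression"), start, end))
    return found


def overlapping(matches):
    """Return pairs of matches whose line ranges intersect."""
    return [
        (a, b)
        for a, b in combinations(matches, 2)
        if a[1] <= b[2] and b[1] <= a[2]
    ]
```

To use it, load `text-unidecode-1.3-scan.json` with `json.load`, pass the result to `license_matches`, and pass that list to `overlapping`. Any pair it returns is two license expressions matched on intersecting line ranges.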
Note that other files in the package are also identified with an AND between the licenses, but that is because of the following classifiers:
Classifier: License :: OSI Approved :: Artistic License
Classifier: License :: OSI Approved :: GNU General Public License (GPL)
Classifier: License :: OSI Approved :: GNU General Public License v2 or later (GPLv2+)
... which scancode really can't do much about. The safe interpretation of a list of licenses like this is to add an AND between them.
System configuration
OS: Ubuntu 24.04
Scancode-toolkit: 32.5.0 (using pip)