This is a channel focused on ScanCode support and not as noisy as the main discuss channel
@lf32 I do not know about it, so I have no opinion... though I think we are already using ace: https://github.com/nexB/scancode.io/blob/main/scanpipe/templates/scanpipe/resource_detail.html#L78
@lf32 returning the question, what do you think of it?
2022-09-09 21:52:44.04 Pipeline [scan_codebase] starting
2022-09-09 21:52:44.15 Step [copy_inputs_to_codebase_directory] starting
2022-09-09 21:52:44.24 Step [copy_inputs_to_codebase_directory] completed in 0.09 seconds
2022-09-09 21:52:44.25 Step [extract_archives] starting
2022-09-09 21:52:45.00 Step [extract_archives] completed in 0.76 seconds
2022-09-09 21:52:45.01 Step [collect_and_create_codebase_resources] starting
2022-09-09 21:52:49.75 Step [collect_and_create_codebase_resources] completed in 4.74 seconds
2022-09-09 21:52:49.75 Step [tag_empty_files] starting
2022-09-09 21:52:49.81 Step [tag_empty_files] completed in 0.06 seconds
2022-09-09 21:52:49.81 Step [scan_for_application_packages] starting
2022-09-09 21:53:17.12 Step [scan_for_application_packages] completed in 27.30 seconds
2022-09-09 21:53:17.13 Step [scan_for_files] starting
2022-09-09 21:53:20.83 Pipeline failed
Task output
A process in the process pool was terminated abruptly while the future was running or pending.
Traceback:
  File "/app/scanpipe/pipelines/__init__.py", line 115, in execute
    step(self)
  File "/app/scanpipe/pipelines/scan_codebase.py", line 99, in scan_for_files
    scancode.scan_for_files(self.project)
  File "/app/scanpipe/pipes/scancode.py", line 310, in scan_for_files
    _scan_and_save(resource_qs, scan_file, save_scan_file_results)
  File "/app/scanpipe/pipes/scancode.py", line 297, in _scan_and_save
    scan_results, scan_errors = future.result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 439, in result
    return self.__get_result()
  File "/usr/local/lib/python3.9/concurrent/futures/_base.py", line 391, in __get_result
    raise self._exception
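For anyone reading this traceback later: the message in the task output is the one Python's `concurrent.futures` raises as `BrokenProcessPool` when a worker process dies (for example, killed by the OS), and `future.result()` re-raises it in the parent. A minimal sketch of catching it per file, assuming a hypothetical `collect_results` helper and simulated futures (this is not how scanpipe handles it, just an illustration of the mechanism):

```python
from concurrent.futures import Future
from concurrent.futures.process import BrokenProcessPool


def collect_results(futures_by_path):
    """Split futures into successful results and paths lost to a dead worker.

    `futures_by_path` is a hypothetical mapping of file path -> Future.
    """
    results, lost = {}, []
    for path, future in futures_by_path.items():
        try:
            results[path] = future.result()
        except BrokenProcessPool:
            # The worker process went away, so no result exists for this path.
            lost.append(path)
    return results, lost


# Simulated futures: no real worker processes are started here.
ok = Future()
ok.set_result(({"licenses": []}, []))
dead = Future()
dead.set_exception(
    BrokenProcessPool("A process in the process pool was terminated abruptly")
)

results, lost = collect_results({"a.py": ok, "b.bin": dead})
print(lost)  # prints ['b.bin']
```

Knowing which paths were lost at least narrows down which input made the worker die.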
`go mod vendor` to populate the `./vendor` directory, and then run scancode, pointing it at the vendor directory, and it scans all the files. Thus to scan source code deps, I can't just point it at a cloned version of the repo, but I'm required to first prep the repo by pulling in all the source files of all the deps, using the appropriate package manager for each. Is my understanding right? I was looking at using https://github.com/pivotal/LicenseFinder, and see that it works with the package managers for you to do this prep step automatically. Is this something that scancode supports and I'm just not seeing it? Does scancode.io work differently than the scancode-toolkit in this case?
Thus to scan source code deps, I can't just point it at a cloned version of the repo, but I'm required to first prep the repo by pulling in all the source files of all the deps, using the appropriate package manager for each. Is my understanding right?
correct... though we are eventually building a series of dependency resolvers, the first being https://github.com/nexB/python-inspector and https://github.com/nexB/nuget-inspector ... and more to come for all the main package ecosystems
I was looking at using https://github.com/pivotal/LicenseFinder, and see that it works with the package managers for you to do this prep step automatically. Is this something that scancode supports and I'm just not seeing it? Does scancode.io work differently than the scancode-toolkit in this case?
Neither the toolkit nor scancode.io runs the package management tools for you. Instead, you need to run your build first for now, but as mentioned above, that's definitely on the roadmap.
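The "prep first, then scan" flow described above could be scripted roughly like this. It is only a sketch: the helper names and the `scan.json` output name are made up for illustration, and it assumes both `go` and the `scancode` CLI are on the PATH. The `--license`, `--package` and `--json-pp` options are regular scancode-toolkit options.

```python
import subprocess


def vendor_and_scan_commands(output_json, vendor_dir="vendor"):
    """Return the two commands to run, in order, from the repository root."""
    # Step 1: let the Go toolchain copy dependency sources into ./vendor.
    go_cmd = ["go", "mod", "vendor"]
    # Step 2: scan the vendored sources for licenses and package manifests.
    scan_cmd = [
        "scancode", "--license", "--package",
        "--json-pp", output_json, vendor_dir,
    ]
    return [go_cmd, scan_cmd]


def run_all(repo_dir, output_json="scan.json"):
    """Run both steps inside repo_dir, stopping at the first failure."""
    for cmd in vendor_and_scan_commands(output_json):
        subprocess.run(cmd, cwd=repo_dir, check=True)


for cmd in vendor_and_scan_commands("scan.json"):
    print(" ".join(cmd))
```

Calling `run_all("/path/to/repo")` would write `scan.json` inside the repository, since both commands run with `repo_dir` as the working directory.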
When a package contains a README.md, and that file contains the following string:
See [LICENSE](https://github.com/someuser/mypackage/blob/master/LICENSE) for details.
then scancode produces the following license entry:
{
  "key": "unknown-license-reference",
  "score": 100.0,
  "name": "Unknown License file reference",
  "short_name": "Unknown License reference",
  "category": "Unstated License",
  "is_exception": false,
  "is_unknown": true,
  "owner": "Unspecified",
  "homepage_url": null,
  "text_url": "",
  "reference_url": "https://scancode-licensedb.aboutcode.org/unknown-license-reference",
  "scancode_text_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.LICENSE",
  "scancode_data_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.yml",
  "spdx_license_key": "LicenseRef-scancode-unknown-license-reference",
  "spdx_url": "https://github.com/nexB/scancode-toolkit/tree/develop/src/licensedcode/data/licenses/unknown-license-reference.LICENSE",
  "start_line": 76,
  "end_line": 76,
  "matched_rule": {
    "identifier": "unknown-license-reference_see-license_1.RULE",
    "license_expression": "unknown-license-reference",
    "licenses": [
      "unknown-license-reference"
    ],
    "referenced_filenames": [
      "LICENSE"
    ],
    "is_license_text": false,
    "is_license_notice": false,
    "is_license_reference": true,
    "is_license_tag": false,
    "is_license_intro": false,
    "has_unknown": true,
    "matcher": "2-aho",
    "rule_length": 2,
    "matched_length": 2,
    "match_coverage": 100.0,
    "rule_relevance": 100
  }
}
Is that by design? In this case, the LICENSE is MIT, and the license expression is correctly identified as this:
"license_expressions": [
  "mit",
  "unknown-license-reference"
],
Can I somehow ignore this? It would be nice if it just detected that this package was MIT.
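One possible workaround while waiting for an answer is to post-process the scan JSON yourself. The sketch below is not a scancode option, and the function name is made up. It assumes the reference really does point at the license file that was detected (here MIT), which the code does not verify.

```python
UNKNOWN_REFERENCE = "unknown-license-reference"


def drop_resolved_unknown_references(license_expressions):
    """Drop the unknown-reference entry when a concrete license is also present.

    If the unknown reference is the only expression, it is kept so that the
    file still shows up as needing review.
    """
    concrete = [e for e in license_expressions if e != UNKNOWN_REFERENCE]
    return concrete or list(license_expressions)


print(drop_resolved_unknown_references(["mit", "unknown-license-reference"]))
# prints ['mit']
print(drop_resolved_unknown_references(["unknown-license-reference"]))
# prints ['unknown-license-reference']
```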
`files` to `packages` in the scancode output JSON? In my scan output, I see the `files` array, and an entry for a file with `path` = `/vendor/github.com/aws/aws-sdk-go/LICENSE.txt`, and it correctly identifies the `apache-2.0` license, but I'm wondering if it should automatically identify this file as mapped to a package? I see there are two other empty fields in the `file` entity: "package_data" and "for_packages", but wasn't sure how these should get filled in. Also note that I can see the deps in the `packages` array, correctly identified from my `go.mod` file.
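To see at a glance which files a scan did or did not attach to a package, the `for_packages` field mentioned above can be grouped in a few lines. This is a sketch only: the sample data and the package identifier string are invented for illustration, and it simply treats an empty `for_packages` list as "unassigned".

```python
from collections import defaultdict


def group_files_by_package(scan):
    """Map each package identifier to the paths of the files that list it."""
    grouped = defaultdict(list)
    for resource in scan.get("files", []):
        # Files with an empty or missing for_packages go into one bucket.
        package_uids = resource.get("for_packages") or ["(unassigned)"]
        for uid in package_uids:
            grouped[uid].append(resource["path"])
    return dict(grouped)


# Invented sample shaped like the fields discussed above.
sample = {
    "files": [
        {"path": "/vendor/github.com/aws/aws-sdk-go/LICENSE.txt", "for_packages": []},
        {"path": "go.mod", "for_packages": ["hypothetical-package-uid"]},
    ]
}

for uid, paths in group_files_by_package(sample).items():
    print(uid, paths)
```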