Q: Why does the first request using the CLI take so much longer?
A: The tool needs a list of licenses that it can load for the similarity analysis. When started for the first time, it downloads a curated package of license texts that we maintain on GitHub.
Q: We plan to spin up a new pipeline for every build. Is there a way to skip the data download when using the CLI?
A: Sure. You may provide the downloaded data - residing in a `/temp` directory after the first run - together with the tool. Then you will not have to wait for the download.
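
As a rough illustration of the idea, a pipeline step could copy a previously saved data directory into place before the tool runs. This is only a sketch: the function name `restore_license_data` and both paths are hypothetical, and the exact location DeepScan expects has to be taken from your own first run.

```python
import shutil
from pathlib import Path


def restore_license_data(saved_copy: Path, expected_location: Path) -> bool:
    """Copy pre-downloaded license data into place if it is missing.

    Both paths are assumptions for this sketch; use the directory that
    your own first run of the tool actually created.
    Returns True if data was copied, False if the target already existed.
    """
    if expected_location.exists():
        # Data is already there, nothing to do.
        return False
    shutil.copytree(saved_copy, expected_location)
    return True
```

Calling this at the start of each build, with `saved_copy` pointing at a directory stored alongside the tool, would leave the data in place before the CLI starts.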
Q: Is it possible to speed up the scanning?
A: It depends on what you need. DeepScan supports three scanning modes: single file, license only, and license & copyright.
The first mode - SINGLE FILE - verifies a single file. The result shows the similarity between the file and the license identified. Our tests showed that a similarity below 80% should be treated as an indication to assess the file. Most likely you will find a modified license.
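
The 80% rule of thumb can be expressed as a simple check. The sketch below assumes the similarity is available as a fraction between 0 and 1; the names `REVIEW_THRESHOLD` and `needs_review` are illustrative and not part of DeepScan.

```python
# Threshold taken from the FAQ text: below 80% the file should be assessed.
REVIEW_THRESHOLD = 0.80


def needs_review(similarity: float, threshold: float = REVIEW_THRESHOLD) -> bool:
    """Return True if the similarity is low enough to warrant a closer look.

    `similarity` is assumed to be a fraction in [0, 1]; how the tool
    actually reports the value may differ.
    """
    return similarity < threshold


# Hypothetical results, for illustration only.
results = {"LICENSE": 0.97, "COPYING.txt": 0.72}
flagged = [name for name, score in results.items() if needs_review(score)]
```

With the made-up values above, only `COPYING.txt` ends up in `flagged`.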
The second mode - LICENSE ONLY - allows quick repo scans for license indications. Run it _before_ a new component is added to a project. It searches the complete directory and assesses all types of texts or files that could bear license indications.
The third mode - LICENSE & COPYRIGHT - is intended for clearance and for providing notice files at a later stage. It requires the longest processing time, so in general it should be run only once per commit. The shared version benefits from a persistent context: a file hash identifies files that have already been assessed, which shortens the processing time.
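
The hash-based shortcut can be pictured as follows. This is a minimal sketch of the general technique, not DeepScan's implementation: the choice of SHA-256, the in-memory dictionary standing in for the persistent context, and the names `file_digest` and `scan_with_cache` are all assumptions.

```python
import hashlib
from pathlib import Path
from typing import Callable, Dict


def file_digest(path: Path) -> str:
    """Return the SHA-256 hex digest of a file's content."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # Read in chunks so large files do not have to fit in memory.
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def scan_with_cache(
    path: Path,
    cache: Dict[str, str],
    scan: Callable[[Path], str],
) -> str:
    """Scan a file only if its content hash has not been seen before.

    `cache` stands in for a persistent context; `scan` is a placeholder
    for the expensive license and copyright analysis.
    """
    key = file_digest(path)
    if key not in cache:
        cache[key] = scan(path)
    return cache[key]
```

Because the key is derived from the content rather than the file name, identical files in different locations are analysed only once.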