Philippe Ombredanne
@pombredanne
it uses roughly one GB per process (and this is mostly static usage for the index, but unfortunately not memory-mapped hence not shared between processes)
and it then needs RAM to assemble the final output
This part may be the most memory hungry
which output format do you use?
the jsonlines format has been designed for a smaller footprint, as things do not need to be all loaded in memory to create the output
Can you paste your scan cli args details?
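To illustrate why a line-oriented format keeps the footprint small, here is a minimal sketch of a consumer that reads a JSON Lines file one record at a time, so only a single line is held in memory at once. The record layout (a `headers` line followed by lines carrying a `files` list) and the helper names are assumptions made for the example, not a description of ScanCode's exact output.

```python
import io
import json
from typing import Iterator, TextIO


def iter_records(stream: TextIO) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines stream."""
    for line in stream:
        line = line.strip()
        if line:
            yield json.loads(line)


def count_file_entries(stream: TextIO) -> int:
    """Count entries under the assumed "files" key, one record at a time."""
    total = 0
    for record in iter_records(stream):
        # "files" is an assumed key name; records without it contribute zero.
        total += len(record.get("files", []))
    return total


# Tiny hand-written sample standing in for a real scan result.
sample = io.StringIO(
    '{"headers": [{"tool_name": "example"}]}\n'
    '{"files": [{"path": "a.c"}]}\n'
    '{"files": [{"path": "b.c"}]}\n'
)
print(count_file_entries(sample))
```

The same loop could forward each record to another system as it is read, instead of building one large document first.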
Philippe Ombredanne
@pombredanne
@salt:sal.td I am intrigued by your research projects too! tell me more :)
Salt
@salt:sal.td
@pombredanne: must be the final output that is crashing things then. I'm using json-pp and sending that to elasticsearch. I keep upping the ram to a virtual machine, but it's crashing out at 6gb. Will give cli details and such later, in a meeting but wanted to respond :)
Philippe Ombredanne
@pombredanne
@salt:sal.td sure thing. If you want, you could file an issue so we can track this in detail there
Abhishek Kumar
@Abhishek-Dev09
@/all 👋
Philippe Ombredanne
@pombredanne
@Abhishek-Dev09 hey :wave:
'sup?
Abhishek Kumar
@Abhishek-Dev09
@pombredanne Hi, how is it going?
Ayan Sinha Mahapatra
@AyanSinhaMahapatra
:wave:
Philippe Ombredanne
@pombredanne
@Abhishek-Dev09 doing great and you?
dwdanielo
@dwdanielo
Hello everyone,
sorry if this is not the right place to ask this, but I'm having some trouble with scancode while excluding some files. I've tried both --ignore-author and --ignore-copyright-holder as in this example:
https://scancode-toolkit.readthedocs.io/en/latest/cli-reference/output-filters-and-control.html?highlight=ignore
but none of the above worked for me.
How can I skip one folder in my repository? Is there a way to do that by adding a parameter when executing the command, or should I use a separate config file to specify the particular directories I want to scan? I'll just add that it's not about skipping the unwanted files in post-scan activities: I need to exclude the unwanted folder from scanning, as that will save 10h(!) of my time.
Thanks in advance, any help will be appreciated!
Philippe Ombredanne
@pombredanne
@dwdanielo the --ignore "glob pattern" option should be what you need
@dwdanielo this is the right place BTW... and welcome! :wave:
@dwdanielo the doc surely could be improved... so please come back here to tell if this worked for you
dwdanielo
@dwdanielo

Thanks so much for the answer :) Unfortunately I still can't exclude the unwanted folder. I generated the glob pattern as in the photo below using https://regex101.com/ but nothing changed: scancode scanned all the content from the path.
Just to show better what I've tried, here is my structure:

C/
├─ workspace/
│ ├─ UNWANTED/
│ ├─ folder_1/
│ ├─ folder_2/
│ ├─ folder_n/
│ ├─ file_1
│ ├─ file_2
│ ├─ file_n

and the command:
C:\workspace>scancode --ignore "./UNWANTED/." -l --html C:/scan_log.html C:/workspace

I've also tried the --ignore "./UNWANTED/." as the last parameter, and also with r before the glob pattern, but nothing changed...
Maybe I'm missing something basic?

image.png
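A note on the pattern in the command above: regex101 builds regular expressions, while `--ignore` is described as taking a glob pattern, and the two syntaxes differ. The sketch below uses Python's standard `fnmatch` module purely to illustrate glob semantics. Whether ScanCode's own matcher behaves identically is an assumption, and the glob `*/UNWANTED/*` is a hypothetical pattern that has not been tested against ScanCode.

```python
import fnmatch
import re

# Sample relative paths modelled on the directory tree in the message.
paths = [
    "workspace/UNWANTED/a.txt",
    "workspace/folder_1/b.txt",
    "workspace/file_1",
]

literal_pattern = "./UNWANTED/."   # the pattern as typed in the command
glob_pattern = "*/UNWANTED/*"      # hypothetical glob: "*" matches any run of characters
regex_pattern = r".*/UNWANTED/.*"  # regular-expression spelling of the same idea

# In a glob, "." is an ordinary character, so a pattern without wildcards
# only matches a path that is exactly equal to the pattern text.
literal_hits = [p for p in paths if fnmatch.fnmatchcase(p, literal_pattern)]

# With "*" wildcards the glob matches anything under an UNWANTED directory.
glob_hits = [p for p in paths if fnmatch.fnmatchcase(p, glob_pattern)]

# The regular expression needs ".*" to express what "*" means in a glob.
regex_hits = [p for p in paths if re.fullmatch(regex_pattern, p)]

print(literal_hits)
print(glob_hits)
print(regex_hits)
```

Under `fnmatch` rules the literal pattern selects nothing, while the wildcard glob and the regular expression both select only the file under `UNWANTED`.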
Philippe Ombredanne
@pombredanne
@dwdanielo let me try this locally
dwdanielo
@dwdanielo
I have dealt with it a different way: I created a Python script that scans every folder separately (except UNWANTED). The result is that the script generates many reports, but then I merge every HTML report into one. Just wanted to share my solution ;)
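The script itself was not posted, so the following is only a rough sketch of the per-folder approach described above, not the author's code. The function names, the `reports` directory and the one-report-per-folder naming are hypothetical; the `scancode -l --html <report> <path>` invocation is copied from the command shown earlier in the conversation. Merging the HTML reports is left out.

```python
import subprocess
from pathlib import Path
from typing import Iterable, List


def build_scan_commands(root: Path, skip: Iterable[str], report_dir: Path) -> List[List[str]]:
    """Build one scancode command per direct subfolder of root, except skipped names."""
    skipped = set(skip)
    commands = []
    for child in sorted(root.iterdir()):
        if child.is_dir() and child.name not in skipped:
            # Hypothetical naming scheme: one HTML report per scanned folder.
            report = report_dir / f"{child.name}.html"
            commands.append(["scancode", "-l", "--html", str(report), str(child)])
    return commands


def run_scans(commands: List[List[str]]) -> None:
    """Run each command in turn; requires scancode to be installed and on PATH."""
    for command in commands:
        subprocess.run(command, check=True)


if __name__ == "__main__":
    # Example paths are placeholders; adjust them before running.
    for command in build_scan_commands(Path("workspace"), ["UNWANTED"], Path("reports")):
        print(" ".join(command))
```

Note that this sketch only visits subfolders, so files sitting directly in the root folder would need a separate scan.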
Philippe Ombredanne
@pombredanne
@dwdanielo ah :) thanks!
Aditya Sangave
@adii21-Ux
nexB/scancode-toolkit#2872 @pombredanne added a comment
Rahul Surwade
@RahulSurwade08
Hey Everyone!
My name is Rahul Surwade. I am a Cloud Security Engineer by profession with knowledge of AWS, GCP, Linux, Docker, Terraform and Kubernetes. I am new to open source contribution. I am very excited to contribute to your GSoC 2022 projects and learn a lot during my tenure here. I am currently working my way towards learning Go and GitOps.
I am interested in contributing to the container-inspector project.
Please feel free to reach out on Linkedin : www.linkedin.com/in/rahul-surwade
Philippe Ombredanne
@pombredanne
@RahulSurwade08 welcome!
lf32
@lf32
Hi
lf32
@lf32
I have a question
do I have to register for something to attend the meet?
I have got some questions to ask
Philippe Ombredanne
@pombredanne
@lf32 this is open to everyone and no registration is required :)
lf32
@lf32
Is this a good ui? I also managed to change the table size to 25 according to issue 413
img
lf32
@lf32
I didn't get one thing
Will that app (the new Django app to detect text) be a separate project, or an app inside scancode.io that specifically does one job, i.e. identifying the license from an input?
??
Philippe Ombredanne
@pombredanne

@lf32 re:

Is this a good ui?

Can you paste this in an issue in ScanCode.io? This way we can gather feedback from a larger base

Will that app (the new Django app to detect text) be a separate project, or an app inside scancode.io that specifically does one job, i.e. identifying the license from an input?

IMHO: inside scancode.io, as an app that specifically does one job, i.e. identifying the license from an input

Is this a good ui? I also managed to change the table size to 25 according to issue 413

This looks good at first, but let's use an issue to collect feedback :)

lf32
@lf32
How do you sort a pipeline?
i mean based on what?
Philippe Ombredanne
@pombredanne

@lf32 re:

How do you sort a pipeline?
i mean based on what?

I do not understand your question

lf32
@lf32
In #413 sorting based on pipeline
Philippe Ombredanne
@pombredanne
@lf32 you meant this nexB/scancode.io#413
@lf32 I think that @adii21-Ux may already be working on this?
lf32
@lf32
Ok then, no worries
Philippe Ombredanne
@pombredanne
:)
@lf32 there are plenty of other things to sink your teeth into otherwise :)