Philippe Ombredanne
@pombredanne
@salt:sal.td hey :wave:
How much ram do you have on hand?
it uses roughly one GB per process (and this is mostly static usage for the index, but unfortunately not memory-mapped hence not shared between processes)
and it then needs RAM to assemble the final output
This part may be the most memory hungry
which output format do you use?
the jsonlines format has been designed for a smaller footprint, as things do not need to be all loaded in memory to create the output
Can you paste your scan cli args details?
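To illustrate the smaller-footprint point about JSON Lines output: a minimal sketch, assuming the scan results were written as one JSON object per line to a file. The function name `iter_json_lines` and the file name used below are hypothetical and not part of ScanCode.

```python
import json
from typing import Iterator


def iter_json_lines(path: str) -> Iterator[dict]:
    """Yield one parsed JSON object per non-empty line of a JSON Lines file.

    Hypothetical helper: only the current line is held in memory,
    so the whole scan result never needs to be loaded at once.
    """
    with open(path, encoding="utf-8") as lines:
        for line in lines:
            line = line.strip()
            if line:  # skip blank lines
                yield json.loads(line)
```

A consumer could then loop with `for record in iter_json_lines("scan.jsonl"):` and forward each record downstream one at a time, rather than parsing a single large pretty-printed JSON document.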
Philippe Ombredanne
@pombredanne
@salt:sal.td I am intrigued by your research projects too! tell me more :)
Salt
@salt:sal.td
@pombredanne: must be the final output that is crashing things then. I'm using json-pp and sending that to elasticsearch. I keep upping the RAM on the virtual machine, but it's crashing out at 6 GB. Will give CLI details and such later; I'm in a meeting but wanted to respond :)
Philippe Ombredanne
@pombredanne
@salt:sal.td sure thing. If you want, you could file an issue so we can track this in detail there
Abhishek Kumar
@Abhishek-Dev09
@/all 👋
Philippe Ombredanne
@pombredanne
@Abhishek-Dev09 hey :wave:
'sup?
Abhishek Kumar
@Abhishek-Dev09
@pombredanne Hi, how is it going?
Ayan Sinha Mahapatra
@AyanSinhaMahapatra
:wave:
Philippe Ombredanne
@pombredanne
@Abhishek-Dev09 doing great and you?
dwdanielo
@dwdanielo
Hello everyone,
sorry if this is not the right place to ask this, but I'm having some trouble excluding some files with scancode. I've tried both --ignore-author and --ignore-copyright-holder like in this example:
https://scancode-toolkit.readthedocs.io/en/latest/cli-reference/output-filters-and-control.html?highlight=ignore
but neither worked for me.
How can I skip one folder in my repository? Is there a way to do that by adding a parameter when executing the command, or should I use a separate config file to specify the particular directories I want to scan? I'll just add that it's not about skipping the unwanted files in post-scan activities: I need to exclude the unwanted folder from scanning, as it will save 10h(!) of my time.
Thanks in advance, any help will be appreciated!
Philippe Ombredanne
@pombredanne
@dwdanielo the --ignore "glob pattern" option should be what you need
@dwdanielo this is the right place BTW... and welcome! :wave:
@dwdanielo the doc surely could be improved... so please come back here to tell if this worked for you
dwdanielo
@dwdanielo

Thanks so much for the answer :) Unfortunately I still can't exclude the unwanted folder. I generated the glob pattern as in the screenshot below using https://regex101.com/ but nothing changed: scancode scanned all the content from the path.
Just to show better what I've tried, here is my structure:

C/
├─ workspace/
│ ├─ UNWANTED/
│ ├─ folder_1/
│ ├─ folder_2/
│ ├─ folder_n/
│ ├─ file_1
│ ├─ file_2
│ ├─ file_n

and the command:
C:\workspace>scancode --ignore "./UNWANTED/." -l --html C:/scan_log.html C:/workspace

I've also tried --ignore "./UNWANTED/." as the last parameter, and also with r before the glob pattern, but nothing changed...
Maybe I'm missing something basic?

image.png
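One possible explanation, offered as a guess rather than a confirmed diagnosis: a pattern built on regex101 follows regular-expression rules, while --ignore expects a glob pattern. The sketch below uses Python's standard `fnmatch` module to show the difference under shell-style glob semantics. ScanCode's own matching may differ in details, and the `is_ignored` helper and sample paths are hypothetical.

```python
from fnmatch import fnmatchcase


def is_ignored(path: str, patterns: list[str]) -> bool:
    """Return True if the path matches any shell-style glob pattern.

    Hypothetical helper for illustration; not ScanCode's implementation.
    """
    return any(fnmatchcase(path, pattern) for pattern in patterns)


# In a glob, "." is a literal dot and only "*" or "?" act as wildcards,
# so "./UNWANTED/." only matches that exact string.
print(is_ignored("workspace/UNWANTED/a.c", ["./UNWANTED/."]))
# A glob with "*" wildcards matches any path containing the folder name.
print(is_ignored("workspace/UNWANTED/a.c", ["*UNWANTED*"]))
print(is_ignored("workspace/folder_1/b.c", ["*UNWANTED*"]))
```

Under these assumptions, a wildcard pattern such as "*UNWANTED*" would be the glob-style way to express "anything under UNWANTED".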
Philippe Ombredanne
@pombredanne
@dwdanielo let me try this locally
dwdanielo
@dwdanielo
I have dealt with it a different way: I created a Python script that scans every folder separately (except UNWANTED). The result is that the script generates many reports, but then I merge every HTML report into one. Just wanted to share my solution ;)
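The script itself was not shared, so the following is only a sketch of the approach described above, assuming one scancode run per top-level folder with the same -l and --html options as the earlier command. The function name `build_scan_commands` and the report naming scheme are hypothetical, and merging the HTML reports is left out.

```python
from pathlib import Path


def build_scan_commands(root, skip, report_dir):
    """Build one scancode command per top-level folder under root.

    Hypothetical helper: folders whose name is in `skip` are left out,
    and top-level files are not covered by this sketch.
    """
    commands = []
    for entry in sorted(Path(root).iterdir()):
        if not entry.is_dir() or entry.name in skip:
            continue
        # One HTML report per scanned folder, named after the folder.
        report = Path(report_dir) / f"{entry.name}.html"
        commands.append(["scancode", "-l", "--html", str(report), str(entry)])
    return commands
```

Each returned command could then be passed to `subprocess.run(command, check=True)`. Building the commands as lists first makes it easy to inspect them before starting a long scan.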
Philippe Ombredanne
@pombredanne
@dwdanielo ah :) thanks!
Aditya Sangave
@adii21-Ux
nexB/scancode-toolkit#2872 @pombredanne added a comment
Rahul Surwade
@RahulSurwade08
Hey Everyone!
My name is Rahul Surwade. I am a Cloud Security Engineer by profession with knowledge of AWS, GCP, Linux, Docker, Terraform and Kubernetes. I am new to open source contribution. I am very excited to contribute to your GSoC 2022 projects and learn a lot during my tenure here. I am currently working my way towards learning Go and GitOps.
I am interested in contributing to the container-inspector project.
Please feel free to reach out on LinkedIn: www.linkedin.com/in/rahul-surwade
Philippe Ombredanne
@pombredanne
@RahulSurwade08 welcome!
lf32
@lf32
Hi
lf32
@lf32
I have a question
do I have to register for something to attend the meet?
I have got some questions to ask
Philippe Ombredanne
@pombredanne
@lf32 this is open to everyone and no registration is required :)
lf32
@lf32
Is this a good ui? I also managed to change the table size to 25 according to issue 413
img
lf32
@lf32
I didn't get one thing
Will that app (the new Django app to detect text) be a separate project, or an app inside scancode.io which specifically does one job, i.e. identifying a license from input?
Philippe Ombredanne
@pombredanne

@lf32 re:

Is this a good ui?

Can you paste this in an issue in ScanCode.io? This way we can gather feedback from a larger base

Will that app (the new Django app to detect text) be a separate project, or an app inside scancode.io which specifically does one job, i.e. identifying a license from input?

IMHO: inside scancode.io, which specifically does one job, i.e. identifying a license from input

Is this a good ui? I also managed to change the table size to 25 according to issue 413

This looks good at first, but let's use an issue to collect feedback :)

lf32
@lf32
How do you sort a pipeline?
i mean based on what?
Philippe Ombredanne
@pombredanne

@lf32 re:

How do you sort a pipeline?
i mean based on what?

I do not understand your question

lf32
@lf32
In #413 sorting based on pipeline
Philippe Ombredanne
@pombredanne
@lf32 you meant this nexB/scancode.io#413
@lf32 I think that @adii21-Ux may already be working on this?
lf32
@lf32
Ok then, no worries
Philippe Ombredanne
@pombredanne
:)