These are chat archives for coala/coala-bears

14th
Apr 2018
John Vandenberg
@jayvdb
Apr 14 2018 00:58

My bear classes do not end with 'Bear'. Random names.

@DopamineGamer_twitter , that might be your problem. We have a bug in coala/coala about this, and it has a PR, possibly even merged, as it looked OK when I last saw it

John Vandenberg
@jayvdb
Apr 14 2018 01:06
@DopamineGamer_twitter , what is your github username so I can invite you to @coala org?
John Vandenberg
@jayvdb
Apr 14 2018 01:12
@harshhx17, if you are stuck, yes push so others can help
John Vandenberg
@jayvdb
Apr 14 2018 01:26
@DopamineGamer_twitter coala/coala#5379
John Vandenberg
@jayvdb
Apr 14 2018 01:55
@anctartica coala/coala-bears#2414
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 03:10
@jayvdb Thanks for creating the issue for me. My GitHub username is kuthulas. Haha, I like this username-check app.

My bear classes do not end with 'Bear'. Random names.

@DopamineGamer_twitter , that might be your problem. We have a bug in coala/coala about this, and it has a PR, possibly even merged, as it looked OK when I last saw it
Ah, I see. Thanks for letting me know. Let me rename my bears to see if it helps.

@li-boxuan Thanks. It's a bit less elegant than I'd have hoped for, but I'll keep it in mind.
John Vandenberg
@jayvdb
Apr 14 2018 03:15
@DopamineGamer_twitter , sent invite to org
I suggest you start with an easier task, e.g. coala/coala#5366
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 03:16
@jayvdb Gracias!
John Vandenberg
@jayvdb
Apr 14 2018 03:16
we have a bit of a process for newcomers, which helps you and us avoid wasting time
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 03:16
I'll be sure to do that. Thanks.
John Vandenberg
@jayvdb
Apr 14 2018 03:17
if we get an awesome feature patch, but it doesn't comply with our guidelines, we will not merge it and it will get dropped.
so the newcomer tasks are about letting you see the guidelines in action on a trivial patch, so that (if you're not into guidelines) we only discard a trivial patch rather than an awesome feature
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 03:19
I understand. I'll run it by the team here before I start doing anything large/new.
John Vandenberg
@jayvdb
Apr 14 2018 03:21
I've assigned you to newcomer issue coala/coala#5375 as it is unallocated and not time sensitive
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 05:53
Guys, what's the optimal way to write and organize bears? Should rules (each rule yields a different result type) be grouped into bears so that they're optimized for performance, or should a bear yield only one type of result? I'm wondering if there's a recommended best practice. For instance, there would be overhead in each bear separately iterating on the file, but at the same time, one should be able to disable and enable bears as desired.
John Vandenberg
@jayvdb
Apr 14 2018 06:13
usually grouped by the analysis type that the user wants to enable. then they disable parts of the bear that they don't want, or configure it to suit their needs
for performance, we have bears which provide data to other user-facing bears.
e.g. URLHeadBear (fetches status, reports offline URLs) gets its data from URLBear (extracts URLs from files)
same with Clang AST bears - one bear gets the Clang AST, multiple bears use that info to do analysis
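The producer/consumer split described here can be sketched in plain Python. This is only an illustration of the pattern, not coala's actual bear API; the function names and the reachability callback are made up:

```python
import re

def extract_urls(lines):
    """Producer pass (URLBear's role): scan the file for URLs once."""
    pattern = re.compile(r"https?://\S+")
    return [(lineno, url)
            for lineno, line in enumerate(lines, start=1)
            for url in pattern.findall(line)]

def report_dead_urls(url_results, is_reachable):
    """Consumer pass (URLHeadBear's role): reuse the extracted URLs
    instead of re-scanning the file."""
    return [(lineno, url) for lineno, url in url_results
            if not is_reachable(url)]

lines = ["see https://example.com", "docs at http://coala.io"]
urls = extract_urls(lines)  # the expensive scan runs once
dead = report_dead_urls(urls, lambda u: u != "http://coala.io")
```

The point is that the extraction cost is paid once, and any number of consumer passes can share the result.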
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 06:20
@jayvdb That is valuable information. Thanks.
One more question about the clangbear. Would you know the difference between libclang-py3 and clang packages?
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 06:28
Both support python3, and I think that the official clang should be anyone’s choice. I could be wrong.
John Vandenberg
@jayvdb
Apr 14 2018 07:31
tough question. We've had lots of problems with libclang-py3. My understanding is that the python wrapper version is tied to the clang version
we have only had success with one clang version, working with its libclang-py3
note that this limitation is partly because we already have 3 analysis routines, and they expect certain clang functions to exist and work.
when we change clang versions, we get core dumps on Windows, on x86-64 iirc
Ankit Joshi
@MacBox7
Apr 14 2018 07:38
@jayvdb according to the GitLab docs https://docs.gitlab.com/ee/api/issues.html, every API call must be authenticated. Thus, when I tried to run IGitt without a token it threw a 401. How should I deal with this? Should the token be input as a setting (not recommended as per our last discussion)?
Or should we limit issue checking to only GitHub for now?
John Vandenberg
@jayvdb
Apr 14 2018 07:46
the answers don't change; and you keep trying
Ankit Joshi
@MacBox7
Apr 14 2018 07:47
But without auth token, it can't be done.
The only option is to support this feature for GitHub only, as it allows making requests without a token
Ankit Joshi
@MacBox7
Apr 14 2018 07:52
Well, the problem is not with IGitt, it's the way GitHub and GitLab are designed. The GitHub API allows access to public repos without a token, but for GitLab you need a token, that's it.
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 08:31

same with Clang AST bears - one bear gets the Clang AST, multiple bears use that info to do analysis

@jayvdb I couldn’t find an example where the AST itself is obtained from another bear. Can the translation unit be part of a dependency result?

I was experimenting, but (as expected?) I keep getting ValueError: ctypes objects containing pointers cannot be pickled. I guess each bear runs in its own process, and sharing a libclang object is not possible. Am I right? Is it possible to force all dependent bears to run within the same process?
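The pickling failure is easy to reproduce without coala at all: results crossing a process boundary have to be pickled, and ctypes objects containing pointers refuse that by design:

```python
import ctypes
import pickle

value = ctypes.c_int(42)
ptr = ctypes.pointer(value)  # a ctypes object containing a pointer

try:
    # Anything sent between worker processes must survive pickling;
    # a raw libclang cursor (which wraps C pointers) cannot.
    pickle.dumps(ptr)
except ValueError as err:
    print(err)  # "ctypes objects containing pointers cannot be pickled"
```

So the error is expected whenever a pointer-backed object is handed to another process.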
Vaibhav Rai
@RaiVaibhav
Apr 14 2018 08:36

@jayvdb I agree with @MacBox7
The link which sils is trying to query, i.e. https://gitlab.com/api/v4/projects/coala%2Fcoala-utils/issues, will not work if we try it in incognito; it responds with this message

{"message":"401 Unauthorized"}

but for GitHub, if we query this link https://api.github.com/repos/coala/coala/issues/3206, it responds with valid JSON data (in incognito; this is the reason, I think, why it is not working as mentioned by sils).
I don't know whether %2F represents 2-factor or something else here.

John Vandenberg
@jayvdb
Apr 14 2018 08:49
%2F = /
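That is plain percent-encoding of the project path, which the stdlib shows directly (a small illustration; the project name is just the one from the GitLab URL above):

```python
from urllib.parse import quote, unquote

# GitLab's API takes a URL-encoded "namespace/project" as one path segment,
# so the "/" inside the name must be escaped as %2F.
project = "coala/coala-utils"
encoded = quote(project, safe="")
print(encoded)         # coala%2Fcoala-utils
print(unquote("%2F"))  # /
```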
@sangamcse , are you available?
Sangam Kumar
@sangamcse
Apr 14 2018 08:50
Yes
John Vandenberg
@jayvdb
Apr 14 2018 08:54
ok, well W504 is entirely against PEP8
as is W503
which is why they are not enabled by default
they are a very poor quality check of a complex rule in PEP8
Sangam Kumar
@sangamcse
Apr 14 2018 08:55
Exactly, that's what I thought
John Vandenberg
@jayvdb
Apr 14 2018 08:55
not really easy to do in pycodestyle as it doesn't have an AST, and so they have a half-arsed version which likely makes enough people happy
Sangam Kumar
@sangamcse
Apr 14 2018 08:56
I think first we should resolve coala/coala-bears#2431
John Vandenberg
@jayvdb
Apr 14 2018 08:59
no matter what happens with coala/coala-bears#2431 , the coala repo needs a long explicit ignore list
there is a feature in there, which is allowing += in settings to inherit from the bear, but that isn't an easy win
coala/coala-bears#2431 can percolate a little, and worst case be reverted before release
@DopamineGamer_twitter, processing ... that is a question better aimed at @Makman2 .
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:06
:point_up: April 14, 2018 1:31 AM @Makman2 Hi. Do you have a comment on this?
@jayvdb Thanks. Sorry to bother too much, but I have another question for you. How are patch conflicts across bears handled?
John Vandenberg
@jayvdb
Apr 14 2018 09:07
@DopamineGamer_twitter the bears you want to look at are in bears/c_languages/codeclone_detection/
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:08
I did look at them, but they don’t seem to share the AST (cursor or the translation unit). Some share static lists etc.
John Vandenberg
@jayvdb
Apr 14 2018 09:10
right you are.
so you'll need to wait for Makman to see if it is possible
we do have two GSoC projects this year that will be doing AST sharing, but likely AST using native python objects of foreign languages, not cursors
John Vandenberg
@jayvdb
Apr 14 2018 09:17
likewise Makman will be the best person to give exact answers regarding conflicts between patches. my limited understanding of that area is that there is a break (pause) after each section of .coafile, and it is assumed that within the section there are no conflicts; the user chooses to apply or discard patches at the end of each section, so the internals reset again before processing the next section if the same files appear in multiple sections.
at some point, the modified file from one bear is streamed into the next bear. I'm not sure whether that is within a section, or when crossing sections of .coafile
hard questions today.
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:21

likewise Makman will be the best person to give exact answers regarding conflicts between patches. my limited understanding of that area is that there is a break (pause) after each section of .coafile, and it is assumed that within the section there are no conflicts; the user chooses to apply or discard patches at the end of each section, so the internals reset again before processing the next section if the same files appear in multiple sections.

Wow, I wonder how one can assume that there wouldn't be conflicts within a section. A lot of design decisions will depend on this. It would be great if @Makman2 could shed some light.

Thanks for your comments today. I’ll let you go now :)
John Vandenberg
@jayvdb
Apr 14 2018 09:23
@DopamineGamer_twitter what happens within a section is controlled by which bears are selected
so it is always possible to reason about that, and put some bears in a later section to avoid conflicts.
but I could be out of date, or completely off base; @Makman2 will be thrilled to chat about it at length with you, until you are bored out of your brain and he has written up 50 new features
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:26
@jayvdb That is true, but a bear can do anything it wants. One bear can look at code comments, and another just at function declarations. But if both happen on the same line, then there would be a patch conflict. Anticipating this and separating these two bears into two sections doesn't seem right.
By the way, can I change my username here? It’s weird.
John Vandenberg
@jayvdb
Apr 14 2018 09:28
ya, you can log in using the github integration in order to switch username
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:29
Ah, ok. But I’ll wait for Makman’s reply. I don’t want to hit a Gitter bug, and lose the messages :)
John Vandenberg
@jayvdb
Apr 14 2018 09:30
oh , ya , you have two accounts - twitter and github ... there is no way to have the two merged
Sangam Kumar
@sangamcse
Apr 14 2018 09:30
@jayvdb, here we go coala/coala#5381
John Vandenberg
@jayvdb
Apr 14 2018 09:30
but @Makman2 will find you, no problem
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 09:31
@jayvdb Okay. And, the AST sharing is not a major problem for me, but patch conflicts seem like one. Not sure what to do about it, especially when coala doesn’t give me a combined diff. I wonder how existing coala users handle this.
Sangam Kumar
@sangamcse
Apr 14 2018 09:31
Next we have to create 2 issues to change the coala code according to the new pycodestyle checks and remove them from the ignore list
John Vandenberg
@jayvdb
Apr 14 2018 09:33
@sangamcse , yup; thx. also look at the other codes in the list and look for an existing issue, or create a new one, about any you think are unambiguous code style problems which we've avoided previously
we might have some old patches stuck in WIP
Sangam Kumar
@sangamcse
Apr 14 2018 09:34
Ok.
John Vandenberg
@jayvdb
Apr 14 2018 09:34
if you're not sure about an ignored code, ping me
Sangam Kumar
@sangamcse
Apr 14 2018 09:34
Yes, obviously. :smiley:
Sangam Kumar
@sangamcse
Apr 14 2018 10:01
@jayvdb, PR PyCQA/pycodestyle#747 is merged, but W605 is still not checking the position correctly in our repo. So do they need to release a new version for those changes?
Also, it is related to coala/coala-bears#2425
John Vandenberg
@jayvdb
Apr 14 2018 10:06
yes, that will need to wait for a release
wrong position is not critical
and I am about 95% sure my bug is about a different position problem ... as it is the wrong line, due to how pycodestyle is mostly a logical-line tool and has difficulty with physical line numbers in many cases, and def is one of them
John Vandenberg
@jayvdb
Apr 14 2018 10:29
@sangamcse , if you are bored, there are two new pycodestyle violations appearing on https://travis-ci.org/coala/coala-bears/jobs/366449315#L4164 which are not expected error codes
there are others on the log which I am addressing, but E301 & E501 at least were not on the list to expect
Sangam Kumar
@sangamcse
Apr 14 2018 10:32
Sorry @jayvdb, but tomorrow is my exam. So I thought I should go and revise for the exam :stuck_out_tongue:
John Vandenberg
@jayvdb
Apr 14 2018 10:36
oh of course
Mischa Krüger
@Makman2
Apr 14 2018 12:33

I guess each bear runs in its own process, and sharing a libclang object is not possible.

Not quite. We have a process pool with as many processes as cores on a machine by default.

Is it possible to force all dependent bears to run within the same process?

Not sure about the current core, but the nextgen-core is able to do that (sadly not yet integrated). However, then you lose parallel processing of bears.

we do have two GSoC projects this year that will be doing AST sharing, but likely AST using native python objects of foreign languages, not cursors

That is the main issue, that the libclang bindings use those cursors directly which appear to be C-objects. @DopamineGamer_twitter the solution would be to translate the AST you get from clang to an AST made up of Python objects which can be pickled.
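A minimal sketch of that translation, assuming a cursor-like object with `.kind`, `.spelling` and `.get_children()` (the real libclang cursor exposes these, but the `ASTNode` class and `translate` helper here are hypothetical):

```python
import pickle
from dataclasses import dataclass, field
from typing import List

@dataclass
class ASTNode:
    """Plain-Python mirror of a clang cursor; picklable, unlike ctypes cursors."""
    kind: str
    spelling: str
    children: List["ASTNode"] = field(default_factory=list)

def translate(cursor):
    """Recursively copy a cursor-like object into ASTNode objects,
    which can safely cross process boundaries."""
    return ASTNode(str(cursor.kind), cursor.spelling,
                   [translate(child) for child in cursor.get_children()])
```

Only the information the consumer bears actually need should be copied; the full clang AST is large, and the copy is the price of making it shareable between processes.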

at some point, the modified file from one bear is streamed into the next bear. im not sure whether that is within a section, or when crossing sections of .coafile

I'm also not sure of the exact internals of the old core, but I know that modified files are only streamed across sections, not across bears in the same section (that's why you can get conflicts when applying patches in coala, and coala will reject applying a conflicting patch).

Wow, I wonder how one can assume that there wouldn’t be conflicts within a section

We knew that there would be conflicts, but it's in general a problem when you do parallel processing. But usually they don't appear that often in common setups, and we have also improved our diff handling by being able to merge changes inside a single line. There's always the possibility of a conflict; it's not possible to avoid them 100%.

Currently I believe it's best to even further improve our diffs regarding conflict handling, maybe you have some more ideas? We already merge changes within a line though^^
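One way to picture "merging changes inside a single line" is to treat each edit as a column range and refuse overlapping ones. This is a toy sketch, not coala's actual Diff implementation:

```python
def merge_line_edits(original, edits):
    """Apply several (start, end, replacement) column edits to one line.
    Edits are applied right to left so earlier offsets stay valid;
    overlapping edits are a genuine conflict and raise."""
    edits = sorted(edits, key=lambda e: e[0])
    for (_, end1, _), (start2, _, _) in zip(edits, edits[1:]):
        if start2 < end1:
            raise ValueError("conflicting edits on the same line")
    for start, end, replacement in reversed(edits):
        original = original[:start] + replacement + original[end:]
    return original

# Two bears touching disjoint parts of the same line can both be applied:
merged = merge_line_edits("import  os,sys", [(6, 8, " "), (10, 11, ", ")])
print(merged)  # import os, sys
```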
Harsh Kumar Bhartiya
@harshhx17
Apr 14 2018 14:28
@jayvdb @alphadose
I need some help with coala/coala-bears#2312
Can you please give me some insight on how I should proceed now...
I am sort of stuck
Anish Mukherjee
@alphadose
Apr 14 2018 14:50
let me have a look
Harsh Kumar Bhartiya
@harshhx17
Apr 14 2018 15:38
thanks @alphadose
Sangam Kumar
@sangamcse
Apr 14 2018 17:11

@jayvdb, about https://travis-ci.org/coala/coala-bears/jobs/366399908#L4343. I don't know whether the E301 check is valid there or not, but what I found is that removing line 55 from https://github.com/coala/coala-bears/blob/master/tests/ruby/RuboCopBearTest.py#L55 resolves that issue.

there are others on the log which I am addressing, but E301 & E501 at least were not on the list to expect

Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 18:43

@Makman2 Thanks for taking time to respond.

We have a process pool with as many processes as cores on a machine by default.

Is the pool size user-configurable? (to 1 :sparkles: )

We knew that there would be conflicts, but it's in general a problem when you do parallel processing.

Yeah, I understand.

Currently I believe it's best to even further improve our diffs regarding conflict handling, maybe you have some more ideas?

How about this? Within a section, after a bear completes its analysis and is about to yield a result, if a diff is part of the result then it performs a dry run of the patching. If it finds that there would be a conflict, then this bear refreshes the file content and performs its analysis once again (just once; not an infinite loop, of course). Now, this logic could be enabled/disabled by the user. If conflicts are minimal in common setups, as you say, then this overhead might be tolerable?

This is probably not a great idea. I just got up from bed, but what do you think?
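For what it's worth, the dry-run check itself can be cheap if each patch records which lines it touches. A toy sketch with a made-up minimal diff representation (a dict of changed line numbers), not coala's Diff class:

```python
def touched_lines(diff):
    """Made-up minimal diff: a dict mapping changed line numbers to new text."""
    return set(diff)

def would_conflict(new_diff, applied_diffs):
    """Dry run: the new patch conflicts if it touches a line that an
    already-accepted patch in this section has modified."""
    applied = set()
    for diff in applied_diffs:
        applied |= touched_lines(diff)
    return bool(touched_lines(new_diff) & applied)
```

With line sets like this, the check is a set intersection, so the overhead of the dry run would be small compared to re-running the bear's analysis.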

Mischa Krüger
@Makman2
Apr 14 2018 19:01

Is the pool size user-configurable?

@DopamineGamer_twitter yes it is, even in the current core :) (see the --jobs flag)

If it finds that there would be a conflict, then this Bear refreshes file content, and performs its analysis once again (just once; not infinite loop, of course).

Interestingly, we have an issue proposing exactly that :) However, the issue originally had in mind handling user changes in files while coala has been running.

The idea is worth considering, however it's practically a bit hard to implement. The workaround would be to run coala once again after you've applied the first patch. Feel free to file an issue so we can discuss it in more detail :)

One thing to note: We plan to execute all sections in parallel (this is possible with the nextgen-core), so we have to be careful regarding that too (so that this stays possible) ;)

however it's practically a bit hard to implement.

Okay, that's actually not quite right; it just needs to be decided whether the overhead is bearable. Implementation could actually be quite easy: if there's a conflict, the coala CLI could schedule a new coala run with the same bears again. Due to caching this could actually be quite fast, but still not negligible.

Mischa Krüger
@Makman2
Apr 14 2018 19:08
And another very important fact: consider two bears doing the same thing (not intended, but some bears can check for the same problems), and each proposes a patch such that applying one would change the result of the other bear.
That's conceptually hard.
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 20:00

@Makman2

that when you apply one, would change the result in the other bear.

Like forward chaining in inference engines, yes. I wonder if it would help if we could let the user construct the sequence/workflow of bear execution (order and concurrency), like tasks in Celery. Could it be the nextgen bear dependency management for coala? I'm just thinking aloud.

Mischa Krüger
@Makman2
Apr 14 2018 20:11
In the end you can invoke coala multiple times :/ Not a super convenient workaround though.

Could it be the nextgen bear dependency management for coala?

Hm

just thinking how to do this...
the core itself shall ideally only be responsible for running bears. It lets them start running as quickly as possible.
doesn't fit that nicely with that idea ... hmm...
hmm...
actually, I'm thinking about your previous idea again to rerun coala
we could have some "clean-run-mode", which only exits coala once no more issues are found. It will keep asking the user as long as results are lying around.

@DopamineGamer_twitter your opinions?

As you seem to have some specific scenarios in mind, would you mind filing an issue? I think this should be officially discussed with a comprehensive pros/cons list

Mischa Krüger
@Makman2
Apr 14 2018 20:16
/ with other ideas to address such conflict problems in coala in general.
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 20:29

@Makman2

the core itself shall ideally only be responsible for running bears. It lets them start running as quickly as possible.

Then we need an orchestrator module :smile: coala now takes advantage of all the cores in the system. With bears being tasks in a Celery-like environment, the multiprocessing could be extended to workers running anywhere. But is it really worth doing for a task that coala handles? I'm not sure.

It will ask the user as long as results are lying around.

I'm not sure about this. Both my idea of running it again and this clean-run-mode feel less elegant and bumpy. Let me think again about the set of issues I've encountered so far, and file an issue.

is it really worth doing for a task that coala handles? I’m not sure.

Perhaps yes in a large enterprise/projects deployment.

On a side note, is there a libclang expert here in our team, that you know of? I have more questions around that.
Mischa Krüger
@Makman2
Apr 14 2018 20:32
I don't know Celery that well, but it seems to be a framework for distributed computing. And that is actually officially encouraged in the new core: just exchange the executor with a "CeleryExecutor" :) I currently plan to support distributed workers on Kubernetes (or Mesos, but Kubernetes would get us more potential users).

I’m not sure about this. It feels like both my idea of running it again, and this clean-run-mode feels less elegant and bumpy. Let me think again, about the set of issues I’ve encountered so far, and file an issue.

Yeah I'm also not that confident, we have to collect more ideas :)

On a side note, is there a libclang expert here in our team, that you know of? I have more questions around that.

@sils has written the CodeCloneDetectionBear, so I guess he's the expert here ;)

ping @sils :point_up: April 14, 2018 10:29 PM
@DopamineGamer_twitter *
Dopamine Gamer
@DopamineGamer_twitter
Apr 14 2018 20:40
Thank you!
Mischa Krüger
@Makman2
Apr 14 2018 20:42
No problem ;) If you've anything left, just ping ;)
(I try to answer as fast as possible, but I'm recently working 40h/week so I became less active here in coala (I wanna change that again though :P))
John Vandenberg
@jayvdb
Apr 14 2018 23:17

I don't know whether the E301 check is valid there or not, but what I found is that removing line 55 from https://github.com/coala/coala-bears/blob/master/tests/ruby/RuboCopBearTest.py#L55 resolves that issue.

ok, that sounds better. and there is definitely a bug in there somewhere, as E301 isn't descriptive of whatever it is trying to report