These are chat archives for coala/coala-bears

9th Apr 2018
Ankit Joshi
@MacBox7
Apr 09 2018 06:17
@jayvdb I have filed some issues upstream regarding coala/coala-bears#1342. If they are resolved quickly enough, we could use urlextract for URLBear. I have already worked out a way to fix the false positive that came up when parsing HTML files.
I also searched for other tools that can extract URLs, but couldn't find anything that's perfect. IMO this is a very challenging task, and there is no well-defined way to do it. I think we can only resort to a heuristic approach to extract URLs.
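A minimal sketch of what such a heuristic might look like, for illustration only. It is not what urlextract or URLBear does; the pattern, the `TRAILING` set and the `extract_urls` name are all assumptions made for this example.

```python
import re

# Heuristic: an http(s) scheme followed by characters that are not
# whitespace or common delimiters. This is an assumption for the sketch,
# not a complete URL grammar.
URL_PATTERN = re.compile(r'https?://[^\s<>"\'`]+')

# Punctuation that usually belongs to the surrounding sentence rather
# than to the URL itself.
TRAILING = '.,;:!?)]}\'"'


def extract_urls(text):
    """Return URL-like substrings found in free-form text."""
    return [match.group(0).rstrip(TRAILING)
            for match in URL_PATTERN.finditer(text)]
```

Stripping trailing punctuation is itself a trade-off: it removes the full stop after a URL at the end of a sentence, but it also truncates URLs that legitimately end in `)`, which is the kind of false positive a heuristic cannot fully avoid.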
John Vandenberg
@jayvdb
Apr 09 2018 11:43
did you look at scrapy?
Ankit Joshi
@MacBox7
Apr 09 2018 12:18
@jayvdb I didn't find any function that simply returns the URLs in a plain text file. However, if it is an HTML page, it can be done quite easily.
John Vandenberg
@jayvdb
Apr 09 2018 12:25
what about feeding the text file into their html engine?
it is just text ;-)
Ankit Joshi
@MacBox7
Apr 09 2018 12:27
Yes, but in our text file the URL can be present in any format. For a general HTML case, scrapy would check all anchor, source, etc. tags to fetch URLs. But such tags won't be present in our case.
Anyway, let me dig a bit deeper to see if there is any other way we could use scrapy to extract URLs.
John Vandenberg
@jayvdb
Apr 09 2018 12:30
I think scrapy can be configured to look for raw links, not just anchors
Ankit Joshi
@MacBox7
Apr 09 2018 12:31
I asked on their IRC channel; they said no :smile_cat:
scrapy.png
John Vandenberg
@jayvdb
Apr 09 2018 12:38
ok
thx for investigating
Ankit Joshi
@MacBox7
Apr 09 2018 12:39
I have posted issues on urlextract; the maintainer said he would work on them, but I don't know how long it will take.
John Vandenberg
@jayvdb
Apr 09 2018 12:50
could you add a note about scrapy; no need to copy IRC logs (typically assumed to be private), just summarise outcome
i.e. tell me I am wrong! ;-)
Ankit Joshi
@MacBox7
Apr 09 2018 12:51
Yup, will do.
I have a question regarding coala/coala-bears#2416. Should I also add a setting to accept authentication details? The Git API requires this for private repos.
John Vandenberg
@jayvdb
Apr 09 2018 12:53
IMO private repos are out of scope to begin with.
Ankit Joshi
@MacBox7
Apr 09 2018 12:54
Why? I don't understand.
John Vandenberg
@jayvdb
Apr 09 2018 12:54
that will hold back the initial release of the feature
what setting will you use?
and settings will be bad. needs to be env var
we don't have bears using env vars
new territory ... needs consensus ... that takes time
I would immediately reject any bear-specific code to do that
Ankit Joshi
@MacBox7
Apr 09 2018 12:55
In my implementation I have created one more setting called remote. Will this also be a problem?
John Vandenberg
@jayvdb
Apr 09 2018 12:56
why can't the remote be detected?
Ankit Joshi
@MacBox7
Apr 09 2018 12:56
There could be multiple remotes.
John Vandenberg
@jayvdb
Apr 09 2018 12:56
asking people to provide settings always makes coala more difficult to use
sensible defaults. 80/20 rule
Ankit Joshi
@MacBox7
Apr 09 2018 12:56
For example, I have 2 remotes, origin and upstream,
for my coala-bears repo.
John Vandenberg
@jayvdb
Apr 09 2018 12:57
yup and I might have 'origin' (meaning upstream) and 'me'
I get the need.
Ankit Joshi
@MacBox7
Apr 09 2018 12:58
It is necessary for handling short urls.
John Vandenberg
@jayvdb
Apr 09 2018 12:58
I don't like handballing the problem to the user
each user could have different remote names for the same thing, so that value cannot be a .coafile setting value
Ankit Joshi
@MacBox7
Apr 09 2018 12:59
The thing is, I don't need the remote if it is a full URL, but it is necessary to get the repository and owner name for a short URL like #3213.
Otherwise, we can provide this support only for full URLs,
because from a full URL I can extract all the parameters required to query the Git API.
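To illustrate the difference between the two cases, here is a hedged sketch of pulling the owner, repository and issue number out of a full URL. It assumes GitHub-style URLs like the pull request link that appears later in this log; `ISSUE_URL` and `parse_issue_url` are hypothetical names, not code from the bear.

```python
import re

# Assumes GitHub-style URLs of the form
# https://github.com/<owner>/<repo>/issues/<number> (or /pull/<number>).
ISSUE_URL = re.compile(
    r'https?://github\.com/(?P<owner>[^/\s]+)/(?P<repo>[^/\s]+)/'
    r'(?:issues|pull)/(?P<number>\d+)')


def parse_issue_url(url):
    """Return (owner, repo, number), or None if the text is not a full URL."""
    match = ISSUE_URL.search(url)
    if match is None:
        return None
    return (match.group('owner'), match.group('repo'),
            int(match.group('number')))
```

A short reference such as `#3213` carries only the number, so this function returns `None` for it; the owner and repository would have to come from somewhere else, which is why the remote comes up at all.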
John Vandenberg
@jayvdb
Apr 09 2018 13:00
I've added reviewers who can tell you about how to solve this.
Ankit Joshi
@MacBox7
Apr 09 2018 13:01
ok :+1:
Nalin Bhardwaj
@nalinbhardwaj
Apr 09 2018 14:02
@MacBox7 : I'm not sure, why can't we just ping every remote's corresponding issue URL? No sane use case should involve too many of them.
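A rough sketch of how the "every remote" idea could be set up, under stated assumptions: the remotes point at GitHub, the input is the text printed by `git remote -v`, and `candidate_issue_urls` is a hypothetical helper. It only builds the candidate URLs; it does not make any network request.

```python
import re

# Matches both https://github.com/owner/repo(.git) and
# git@github.com:owner/repo(.git). GitHub hosting is an assumption.
REMOTE_URL = re.compile(
    r'github\.com[:/](?P<owner>[^/\s]+)/(?P<repo>[^/\s]+?)(?:\.git)?$')


def candidate_issue_urls(remote_output, issue_number):
    """Build one candidate issue URL per distinct GitHub remote."""
    urls = []
    for line in remote_output.splitlines():
        # Each line looks like: "<name>\t<url> (fetch)" or "... (push)".
        parts = line.split()
        if len(parts) < 2:
            continue
        match = REMOTE_URL.search(parts[1])
        if match is None:
            continue
        url = 'https://github.com/{}/{}/issues/{}'.format(
            match.group('owner'), match.group('repo'), issue_number)
        if url not in urls:
            # Fetch and push entries usually repeat the same URL.
            urls.append(url)
    return urls
```

With remotes named `origin` and `upstream`, this yields two candidates for the same number, and the remote names themselves never need to appear in a setting.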
Ankit Joshi
@MacBox7
Apr 09 2018 14:05
Two different remotes can have different issues with the same issue_id. How will you deal with that? :smile:
John Vandenberg
@jayvdb
Apr 09 2018 14:09
is that a likely scenario ?
and if the bear breaks because of it, whose fault is it?
Anish Mukherjee
@alphadose
Apr 09 2018 14:10
@jayvdb I noticed that in https://github.com/coala/coala-bears/pull/2312/files there were a lot of sudo commands
is there no other way to add the dependencies?
also pyenv was used, which is no longer necessary with the new CircleCI
John Vandenberg
@jayvdb
Apr 09 2018 14:13
you're the reviewer, not me :P
Anish Mukherjee
@alphadose
Apr 09 2018 14:14
will try my best to help :P
John Vandenberg
@jayvdb
Apr 09 2018 14:15
be bold. it is better to make suggestions and be wrong than say nothing and the patch goes nowhere
Anish Mukherjee
@alphadose
Apr 09 2018 14:15
:+1:
Ankit Joshi
@MacBox7
Apr 09 2018 14:44
@jayvdb I am saying that this would be a problem if we followed @nalinbhardwaj's approach. The current approach is correct, except obviously that the remote has to be specified by the user.
John Vandenberg
@jayvdb
Apr 09 2018 14:48
but is it really a problem? what is the real scenario where that could happen? is that scenario avoidable/detectable, and is it an error anyway?
Ankit Joshi
@MacBox7
Apr 09 2018 15:05
@jayvdb replied here coala/coala-bears#2369
John Vandenberg
@jayvdb
Apr 09 2018 15:13
I'm gonna stay out of this one and see if you GSoC wannabes can figure this one out :P
Nalin Bhardwaj
@nalinbhardwaj
Apr 09 2018 16:54
@MacBox7 : @jayvdb's (and my) point was that if there are multiple remotes with the same issue number, the very fact that the user used shorthand notation for the issue makes it ambiguous. Generally no fork has issues attached to it; therefore, if any remote of a user has an issue with the same number as the user's notation, it is fine to assume it is the correct one.
@MacBox7 : To elaborate further, having coala config dependent on a user's git tastes is a very bad idea, since it almost automatically prevents different people (working on the same repo) from using the same config, which is almost definitely much less desirable.
Ankit Joshi
@MacBox7
Apr 09 2018 17:09
I totally agree with you on the second point. But the first point relies entirely on our assumptions. What we can do for now is support this functionality only if the full URL is given, and extend it to short URLs when coala starts supporting env variables (a good feature enhancement).
Viresh Gupta
@virresh
Apr 09 2018 17:11
@MacBox7
I would suggest making the setting optional, and in case it isn't provided, we should go with the assumption
Short URLs are used in quite a few repos, and it would be desirable to have the issue-open check for them too
Ankit Joshi
@MacBox7
Apr 09 2018 17:11
I have no problem with pinging every remote, but there is a chance that it might give false positives. So it all depends on whether we should allow that or not.
Viresh Gupta
@virresh
Apr 09 2018 17:16

env variables

I don't think we would want the user to mess with their environment variables so that they can use coala

For example, as a user of coala, I would want coala to interfere with my main project's requirements as little as possible. One environment variable doesn't hurt, but it could potentially mean something else to other dependencies / parts of the core project

Ankit Joshi
@MacBox7
Apr 09 2018 17:17
@virresh if that's the case, then I will use the brute-force approach and add a warning in the implementation (false positives if issues are created in a forked repo). Then we can change it after env variables are implemented.
Viresh Gupta
@virresh
Apr 09 2018 17:18
:+1:
Ankit Joshi
@MacBox7
Apr 09 2018 17:21
Yeah, the naming convention for the env var can be a topic for future discussion :wink:
Viresh Gupta
@virresh
Apr 09 2018 17:24
Also if you really are in favour of an additional setting
There can be a middle ground: an optional setting for the user to give their main upstream repository link instead of a remote name
Handling a manually entered link comes with its own complexities though
Ankit Joshi
@MacBox7
Apr 09 2018 17:26
@virresh adding a remote or an upstream repo link is the same thing. I am using the remote to get the link itself :stuck_out_tongue_winking_eye:
Viresh Gupta
@virresh
Apr 09 2018 17:27
That's the point @MacBox7
Different people can have different remote names for the same url
Ankit Joshi
@MacBox7
Apr 09 2018 17:29
Yup. Now I get you. :+1:
Viresh Gupta
@virresh
Apr 09 2018 17:30
:smile: