These are chat archives for coala/coala-bears

19th
Apr 2018
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 00:29
@Makman2 Thanks for explaining.
Mischa Krüger
@Makman2
Apr 19 2018 08:46
@DopamineGamer_twitter I don't want to turn down ideas, but this is really not a trivial topic. If you believe your approach delivers, then feel free to write something up; I guess a cEP (our PEP equivalent) or some prototypes would be good :)
What I can say for now is that the new core works pretty well, so for me that's enough :3 I mean, that thing really cost me some time to finally implement :D But that doesn't mean I want to stop there. Though I'd need a bit more proof before modifying the concept again ;)
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:05
@Makman2 :tired_face: I’m mostly confused at this point. When I find time I’ll get to the code, and get back to you.
But could you just answer me this: as it stands now, can coala be configured to always run all the bears sequentially (no dependencies; random order is fine) on a file, and can multiple files be processed concurrently? Yes or no? :smile:

Currently we already parallelize on a per-file basis (if possible)

And please tell me a bit about this. What does ‘if possible’ mean?

Ishan Srivastava
@ishanSrt
Apr 19 2018 20:19
@DopamineGamer_twitter From what I can see in the code, a complete file dict is made of all the files, and the bears operate on that in parallel
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:23
@ishanSrt Thanks for that. From the file dict, how are individual files dispatched to the bears? Is there parallelization there?
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:25

a complete file dict is made of all the files

so the bears run through one file after the other, line by line

Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:28
I see. I still don’t get why it was designed this way, but I understand from you that this is the current state. Thank you!
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:31

fileA:

import foo
blah()

fileB:

import blah
foo()

File_dict:

{'fileA': ('import foo\n', 'blah()\n'), 'fileB': ('import blah\n', 'foo()\n')}
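(Editorial sketch, not coala's actual loader: a file dict like the one above can be built by reading every file once into a tuple of lines, so the bears never have to touch the disk themselves.)

```python
def build_file_dict(filenames):
    """Read each file once into an immutable tuple of lines."""
    file_dict = {}
    for filename in filenames:
        with open(filename, encoding='utf-8') as fl:
            # Tuples are immutable, so no bear can mutate the shared contents.
            file_dict[filename] = tuple(fl.readlines())
    return file_dict
```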
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:36
I see. Would bears see the same content in file and open(filename).read() at all times?
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:36
No, changes during the run are not reflected
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:37
Is there a rule that says bears should only operate on the file content passed to them, and not attempt to open and read the file themselves?
When a result prompt appears on command line, and I apply a patch, the file gets changed instantly. What are these changes during the run?
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:39

the file gets changed instantly

no it doesn't

Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:39
It does. I just verified.
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:40
did you try a single patch (one of many) by a single bear?
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:41
Ah, no.
Are you saying that all the patches from a single bear are applied together when the result generator stops?
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:42
Yes, it seems like it, test it yourself!
Dopamine Gamer
@DopamineGamer_twitter
Apr 19 2018 20:43
Oh, okay. I will.
Ishan Srivastava
@ishanSrt
Apr 19 2018 20:49

Is there a rule that says bears should only operate on the file content passed to them, and not attempt to open and read the file themselves?

I don’t think that would be a good idea

Multiple processes have separate shares of resources which (I presume) get merged in the end. What do you think will happen when each process has a different version of the file and they all try to change it?
It's better to get the conflict in the file dict than to end up with a Python-level discrepancy
^^ Not sure about that though; needs confirming by Makman. There's something about processes locking files that I don't understand
Ishan Srivastava
@ishanSrt
Apr 19 2018 21:03
yeah, just confirmed: only one of the processes can write to the file at a time using locking, so there's no net gain
file_dicts are clearly the better option
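(Illustrative sketch, not coala's actual mechanism: the advantage of keeping everything in a shared in-memory dict is that workers only *propose* new contents instead of writing to disk, so the main process can spot conflicting patches before any file is touched. `run_workers` and the worker callables here are hypothetical names.)

```python
def run_workers(file_lines, workers):
    """Each worker returns a proposed replacement for the in-memory lines.

    Nothing is written to disk; conflicts surface as an exception instead
    of as a silently corrupted file.
    """
    proposals = [worker(file_lines) for worker in workers]
    # Only proposals that actually change something can conflict.
    distinct = {p for p in proposals if p != file_lines}
    if len(distinct) > 1:
        raise ValueError('conflicting patches for the same file')
    return distinct.pop() if distinct else file_lines
```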
Vaibhav Rai
@RaiVaibhav
Apr 19 2018 21:58
One doubt:
whenever a patch gets applied, file_dict doesn't change but file_diff_dict does (it contains the results' Diff objects). Are these diff objects used to change the original file contents?
Mischa Krüger
@Makman2
Apr 19 2018 22:07

Is there a rule that says bears should only operate on the file content passed to them, and not attempt to open and read the file themselves?

I don’t think that would be a good idea

It would mean more needless file I/O (that's why we load file contents into the file dict before a run), and yeah, discrepancies between files would be way worse to get right. So yeah, it's highly discouraged to manually reload the file from disk.

something related to locking of files by processes I don’t understand

Not on Linux; you can't lock a file there in a way that prevents another process from writing to it. Concurrent writes usually mess files up. File locking is supported on Windows though ^^

Are these diff objects used to change the original file contents?

Yes. When coala exits, those diffs are applied to the real files.
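(Hedged sketch of the behaviour Makman2 describes, with file_diff_dict simplified to a plain mapping of filename to replacement lines rather than coala's real Diff objects: during the run only the dict is updated, and the real files are rewritten once, at exit.)

```python
def apply_diffs_on_exit(file_diff_dict):
    """Write every accepted patch back to disk in one final pass."""
    for filename, new_lines in file_diff_dict.items():
        with open(filename, 'w', encoding='utf-8') as fl:
            fl.writelines(new_lines)
```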

Ishan Srivastava
@ishanSrt
Apr 19 2018 22:10
so is it safe when multiple processes write to a single file?
speaking in terms of errors other than messed-up files
Mischa Krüger
@Makman2
Apr 19 2018 22:16

so is it safe when multiple processes write to a single file?

yeah, your program won't crash

Answer from SO:

What you're doing seems perfectly OK, provided you're using the POSIX "raw" IO syscalls such as read(), write(), lseek() and so forth.

If you use C stdio (fread(), fwrite() and friends) or some other language runtime library which has its own userspace buffering, then the answer by "Tilo" is relevant, in that due to the buffering, which is to some extent outside your control, the different processes might overwrite each other's data.

Wrt OS locking, while POSIX states that writes or reads less than of size PIPE_BUF are atomic for some special files (pipes and FIFO's), there is no such guarantee for regular files. In practice, I think it's likely that IO's within a page are atomic, but there is no such guarantee. The OS only does locking internally to the extent that is necessary to protect its own internal data structures. One can use file locks, or some other interprocess communication mechanism, to serialize access to files. But, all this is relevant only of you have several processes doing IO to the same region of a file. In your case, as your processes are doing IO to disjoint sections of the file, none of this matters, and you should be fine.

so writing to a file is thread-safe, but what data comes out in the end:
¯\_(ツ)_/¯
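(The SO answer's point about disjoint regions can be sketched with raw positional I/O: Python's os.pwrite wraps the POSIX pwrite() syscall, so there is no userspace buffering to overwrite a neighbour's data. The offsets and helper name here are illustrative only; this is Unix-only.)

```python
import os

def write_regions(path, regions):
    """Write (offset, bytes) pairs to disjoint byte ranges of one file."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT)
    try:
        for offset, data in regions:
            # pwrite does not move the file offset, so writes to disjoint
            # ranges cannot clobber each other even across processes.
            os.pwrite(fd, data, offset)
    finally:
        os.close(fd)
```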
Mischa Krüger
@Makman2
Apr 19 2018 22:41
corobo invite RachaelNantale
ah, forgot that doesn't work in this room...