Hi Michal Klinka, I'm afraid there is no comprehensive doc on this problem .. but it would be cool to have :)
The topic is quite complex and depends in part on the Galaxy config, e.g. how requirements are resolved (conda/containers), and on the tool, e.g. whether there are dynamically discovered datasets or which profile version the tool uses.
I think the most important part is the generation of the job's working dir. In particular the tool_script.sh (which is derived from interpreting the Cheetah code in the tool's command block) and galaxy_JOBID.sh, which sets up the environment. The latter is the shell script that is executed (how exactly again depends on Galaxy's job configuration).
Maybe you can start by taking a look at these scripts for an example job? I guess we can help in case of questions.
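If it helps with that first look, here is a minimal sketch for dumping both scripts of one job. It assumes both files sit directly in the directory you pass in; where they actually live depends on the Galaxy version and job configuration. The function names are made up for illustration and are not part of Galaxy.

```python
from pathlib import Path


def find_job_scripts(working_dir):
    """Return the job scripts found directly in working_dir, keyed by role.

    Assumption: tool_script.sh and galaxy_JOBID.sh are both in this
    directory. Adjust the path if your Galaxy setup places them elsewhere.
    """
    root = Path(working_dir)
    scripts = {}

    tool_script = root / "tool_script.sh"
    if tool_script.is_file():
        scripts["tool_script"] = tool_script

    # JOBID differs per job, so match the name by pattern.
    wrappers = sorted(root.glob("galaxy_*.sh"))
    if wrappers:
        scripts["job_script"] = wrappers[0]

    return scripts


def show_job_scripts(working_dir):
    """Print each script that was found, with a small header."""
    for role, path in find_job_scripts(working_dir).items():
        print(f"===== {role}: {path.name} =====")
        print(path.read_text())
```

Calling show_job_scripts on the directory of a finished example job prints the rendered tool command first and the environment/wrapper script second.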
Then, after the job has run, the results are copied to the final destination and the DB needs some updates .. but I can't tell you many details about this part.
<command> section. The tool also comes with a helpful --all-versions option that lists its own version as well as those of core dependencies.
<version_command>, however, I'd be getting the versions that were in effect before the updates happening in the <command> block? Anything that could be done here?
<version_command> block so there's no way to tell whether updating should be done or not (depends on user settings).
which might update its environment as part of its <command> section
The conda env? That sounds wrong to me.
utils\localization.js file in 100+ of our Vue & JS Galaxy files? Is this necessary, since there are only 3 files in the codebase where language can be changed by the user? (e.g. Home Page, Login Page, and Preferences Page)
@bgruening:matrix.org for this particular use case, I was wondering if the NCBI taxonomy data table could be reused.
manual merge and deploy? If this happens more often, it would be cool to have a way to disable size checks for certain paths .. maybe an additional file like
in the long run progress with galaxyproject/galaxy#13495 :)
was just wondering if we could use the data on the CVMFS for this case (use cached in the tool's conditional). But it's kind of circular, since we would need the data manager, or at least the data table + data on CVMFS, before accepting the tool. Also, CI would need to be adapted to set the --tool_data_table parameter of
Just checked recent weekly CI errors. Seems that samtools sort output may depend on the number of CPUs used (the sorting is correct, just the order of alignments with the same mapping position may differ). Wondering how to deal with this ..
Just check for the number of lines and maybe some regex (does this work for bam?)?
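To make the "same sorting, different tie order" point concrete, here is a small sketch of a comparison that accepts both outputs. It is not how Galaxy's test framework compares files. It assumes the alignments have already been extracted from the BAM (e.g. via samtools view) into tuples of (reference index, position, read name); that tuple layout and the function names are made up for illustration.

```python
from collections import Counter
from itertools import groupby


def sort_key(record):
    """Coordinate key of a record: (reference index, position)."""
    return (record[0], record[1])


def is_coordinate_sorted(records):
    """True if the keys never decrease from one record to the next."""
    keys = [sort_key(r) for r in records]
    return all(a <= b for a, b in zip(keys, keys[1:]))


def equal_up_to_ties(expected, observed):
    """True if both lists are sorted and hold the same records per key.

    The order of records that share a key is ignored, which is exactly
    the part that may vary with the number of threads.
    """
    if not (is_coordinate_sorted(expected) and is_coordinate_sorted(observed)):
        return False

    def grouped(records):
        # Inputs are sorted, so equal keys are adjacent and groupby suffices.
        return [(key, Counter(group)) for key, group in groupby(records, key=sort_key)]

    return grouped(expected) == grouped(observed)
```

For example, two runs that only swap r1 and r2 at the same position compare as equal here, while a plain line-by-line diff would flag them.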
mapper | samtools sort -@2 ... | samtools -@\$GALAXY_SLOTS.. which also solves the problem that parallel sorting and mapping may use too many CPUs