    Michael Barton
    @michaelbarton
    It makes it harder for a user because then they have to manage all the databases too.
    pbelmann
    @pbelmann
    All they would have to do is download the database (I added the ftp link) and reference it in the yaml. I'm not sure if that is such a big problem. I'm more worried that it is not really interchangeable, or at least only to a certain degree.
    Michael Barton
    @michaelbarton
    Yes
    You won’t be able to swap out the bioboxes.
    pbelmann
    @pbelmann
    exactly
    Michael Barton
    @michaelbarton
    If they use different databases, you’d have to change the YAML.
    pbelmann
    @pbelmann
    yes
    Michael Barton
    @michaelbarton
    What do you think?
    pbelmann
    @pbelmann

    I would continue to work on the spec (actually I started today with the validator) and we have to admit that there are some tools that allow bioboxes to be interchangeable only to a certain degree. It does not mean that we should stop following the aim of creating interchangeable tools, but maybe it means that we have to find a way in the long term to make it as easy as possible to use such tools, e.g.:

    • Find a way for the tools to report in yaml which databases they need (I think in the initial proposal you wrote that the tools could report their types, for example.)

      ... we can instead specify a list of morphisms and each container can list which of those they implement. ...

    • Maybe create a container that checks if another container needs a database, downloads it and places it somewhere (something like an adapter). I mean, if you want to use binning tools that use such databases you would have to download them anyway.

    I'm sure for profiling tools we will have the same problems.
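
The two ideas above (a container declaring the databases it needs, and an adapter that fetches whatever is missing) can be sketched roughly as follows. This is only an illustration: the declaration layout, the field names `databases`, `id` and `url`, and the example URLs are assumptions made for the sketch, not part of any bioboxes spec.

```python
# Hypothetical declaration a container might publish. The field names
# ("databases", "id", "url") and the URLs are assumptions for illustration.
DECLARED = {
    "databases": [
        {"id": "taxonomy", "url": "ftp://example.org/taxonomy.tar.gz"},
        {"id": "markers", "url": "ftp://example.org/markers.tar.gz"},
    ]
}


def missing_databases(declared, provided):
    """Return the declared databases the user has not supplied a path for."""
    return [db for db in declared["databases"] if db["id"] not in provided]


def plan_downloads(declared, provided):
    """Map each missing database id to the URL an adapter would fetch."""
    return {db["id"]: db["url"] for db in missing_databases(declared, provided)}


# The user already has one database locally; the adapter would fetch the rest.
user_config = {"taxonomy": "/data/taxonomy"}
print(plan_downloads(DECLARED, user_config))
```

With a declaration like this, swapping one binning container for another would still mean different databases, but a tool could at least tell the user what is missing before the run starts.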

    pbelmann
    @pbelmann
    What do you think?
    Michael Barton
    @michaelbarton
    I agree, this is a good path forward.
    We can continue to evolve the spec as we have done with the assembler.
    We are discussing putting Docker/bioboxes in production here at the JGI.
    pbelmann
    @pbelmann
    That's good to hear. :+1:
    Michael Barton
    @michaelbarton
    And so this could help identify issues with the specs, however it would mean that developers here would write bioboxes.
    pbelmann
    @pbelmann
    Wow, that would be great
    Michael Barton
    @michaelbarton
    We are mostly interested in the preprocessors and assemblers, as we have standard preprocessing and assembly pipelines.
    This is still longer term though.
    The sys admins are experimenting with how to run Docker on the shared super computer cluster.
    pbelmann
    @pbelmann
    Yes, I would really like to help with the preprocessing containers. I think they are not that difficult, right?
    Michael Barton
    @michaelbarton
    No, they should be simple.
    pbelmann
    @pbelmann
    ah ok
    Michael Barton
    @michaelbarton
    I have to manage some other responsibilities here at the JGI so I have to juggle my time.
    Also my laptop was stolen so I can’t work at home for the time being either.
    pbelmann
    @pbelmann
    Oh no, really?
    Michael Barton
    @michaelbarton
    However I did start to experiment with parsing the signature into the json spec - https://github.com/michaelbarton/bioboxes-signature-validator
    If this works we would not have to write the spec documents ourselves.
    This would simplify development, I think.
    We would only enforce that each container provides the default signature.
    Anyway.
    I have to go.
    pbelmann
    @pbelmann
    Ok. see you
    I'll take a look at it
    Michael Barton
    @michaelbarton
    We can discuss priorities for the next month at the meeting on Thursday.
    Ok
    pbelmann
    @pbelmann
    ok
    Michael Barton
    @michaelbarton
    @pbelmann Sorry I didn’t get a chance to look at #147 today.
    pbelmann
    @pbelmann
    no problem Michael
    Christian Frech
    @Gig77
    Bioboxes are awesome and could make all our lives easier. The one thing I worry about is the need for Yaml files, because they generate quite some overhead for both users and developers. Why not stick to good ol' Linux command line parameter syntax (see git, samtools, etc. for good examples)? To keep bioboxes exchangeable a spec could still define required and optional parameters for each class of tools that could even be enforced/validated. So running an assembler could be as easy as 'docker run velvet --input-fastq=in.fastq -o contigs.fa'. Other assembler? 'docker run ray --input-fastq=in.fastq -o contigs.fa'. Proper volume mounts would stay the responsibility of the user so that files can be found. Another advantage of this would be that piping is still possible, e.g. 'cat in.fastq | docker run velvet | gzip > out.fa.gz'. I think that would be closer to the hearts of the creators of GNU/Linux and Docker. Thoughts?
    Christian Frech
    @Gig77
    What about Yaml being an option only for Bioboxes that require complex input data types like assemblers (e.g. via 'docker run velvet --yaml inputs.yaml'), instead of making Yaml mandatory for all Bioboxes?
    pbelmann
    @pbelmann
    @Gig77 We originally started with environment variables and it became quite complicated when you want to, for example, assign different insert sizes to multiple fasta files.
    But mixing different interfaces might work, yes.
    I'm not sure if we could still integrate piping in bioboxes even with the current yaml based interface.
    We had our longest discussion regarding interfaces in issue #61. Everything in Bioboxes is open for discussion, so feel free to create issues with a proposal for mixing interfaces or passing a yaml with the commandline.
    Michael Barton
    @michaelbarton
    We could consider a simpler command line interface over the top of the YAML API. This might be a script that takes a fastq file and then takes care of mounting the files and generating the bioboxes.yml file.
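
A minimal sketch of such a wrapper, under stated assumptions: the config layout (`arguments` / `fastq` / `id` / `value`), the `/bbx/input` and `/bbx/output` mount points, and the `default` task name are placeholders chosen for illustration and are not taken from this discussion. The config is serialised as JSON, which is also valid YAML, so no third-party YAML library is needed.

```python
import argparse
import json
import os

# Assumed mount points inside the container (hypothetical for this sketch).
INPUT_MOUNT = "/bbx/input"
OUTPUT_MOUNT = "/bbx/output"


def build_config(fastq_path):
    """Build a config dict pointing at the FASTQ file as seen in the container."""
    container_path = f"{INPUT_MOUNT}/{os.path.basename(fastq_path)}"
    return {"arguments": [{"fastq": [{"id": "lib0", "value": container_path}]}]}


def docker_command(image, fastq_path, output_dir, task="default"):
    """Return the docker invocation that mounts the input and output directories."""
    input_dir = os.path.dirname(os.path.abspath(fastq_path))
    return [
        "docker", "run",
        f"--volume={input_dir}:{INPUT_MOUNT}:ro",
        f"--volume={os.path.abspath(output_dir)}:{OUTPUT_MOUNT}:rw",
        image, task,
    ]


def plan_run(argv):
    """Parse a simple command line and return (config_text, docker_command)."""
    parser = argparse.ArgumentParser(description="Simple CLI over the YAML interface")
    parser.add_argument("image")
    parser.add_argument("--input-fastq", required=True)
    parser.add_argument("-o", "--output-dir", default="output")
    args = parser.parse_args(argv)
    config_text = json.dumps(build_config(args.input_fastq), indent=2)
    return config_text, docker_command(args.image, args.input_fastq, args.output_dir)
```

A caller would write `config_text` to the config file next to the FASTQ file and pass the returned list to `subprocess.run`. For example, `plan_run(["velvet", "--input-fastq", "reads/in.fastq", "-o", "out"])` covers roughly the 'docker run velvet --input-fastq=in.fastq' usage suggested earlier, while the container itself still only sees the YAML interface.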
    pbelmann
    @pbelmann
    @michaelbarton I agree
    Michael Barton
    @michaelbarton
    @Gig77 Would you be interested in implementing a simpler CLI over the existing YAML one?
    ecerami
    @ecerami
    hello, bioboxes people. I had a few newbie questions for you...
    Michael Barton
    @michaelbarton
    Yes, I’ll try to help if I can.
    You can ask any questions you have.
    ecerami
    @ecerami
    hi. Sorry, stepped away. I am basically wondering which doc to read that explains how bioboxes is distinct from docker.
    Michael Barton
    @michaelbarton
    Bioboxes is a standard for docker containers. We make suggestions for how certain types of docker containers should respond to different inputs and outputs.
    For example, the short read assembler spec describes how docker containers for this kind of software should accept input and give output.