    Michael Barton
    @michaelbarton
    I agree, this is a good path forward.
    We can continue to evolve the spec as we have done with the assembler.
    We are discussing putting Docker/bioboxes in production here at the JGI.
    pbelmann
    @pbelmann
    That's good to hear. :+1:
    Michael Barton
    @michaelbarton
    This could help identify issues with the specs, though it would mean that developers here would write bioboxes.
    pbelmann
    @pbelmann
    Wow, that would be great
    Michael Barton
    @michaelbarton
    We are mostly interested in the preprocessors and assemblers, as we have standard preprocessing and assembly pipelines.
    This is still longer term though.
    The sysadmins are experimenting with how to run Docker on the shared supercomputer cluster.
    pbelmann
    @pbelmann
    Yes, I would really like to help with the preprocessing containers. I think they are not that difficult, right?
    Michael Barton
    @michaelbarton
    No, they should be simple.
    pbelmann
    @pbelmann
    ah ok
    Michael Barton
    @michaelbarton
    I have to manage some other responsibilities here at the JGI so I have to juggle my time.
    Also my laptop was stolen so I can’t work at home for the time being either.
    pbelmann
    @pbelmann
    Oh no, really?
    Michael Barton
    @michaelbarton
    However I did start to experiment with parsing the signature into the json spec - https://github.com/michaelbarton/bioboxes-signature-validator
    If this works we would not have to write the spec documents ourselves.
    This would simplify development, I think.
    We would only enforce that each container provides the default signature.
    Anyway.
    I have to go.
    pbelmann
    @pbelmann
    Ok. see you
    I'll take a look at it
    Michael Barton
    @michaelbarton
    We can discuss priorities for the next month at the meeting on Thursday.
    pbelmann
    @pbelmann
    ok
    Michael Barton
    @michaelbarton
    Ok
    Michael Barton
    @michaelbarton
    @pbelmann Sorry I didn’t get a chance to look at #147 today.
    pbelmann
    @pbelmann
    no problem Michael
    Christian Frech
    @Gig77
    Bioboxes are awesome and could make all our lives easier. The one thing I worry about is the need for Yaml files, because they generate quite some overhead for both users and developers. Why not stick to good ol' Linux command line parameter syntax (see git, samtools, etc. for good examples)? To keep bioboxes exchangeable a spec could still define required and optional parameters for each class of tools that could even be enforced/validated. So running an assembler could be as easy as 'docker run velvet --input-fastq=in.fastq -o contigs.fa'. Other assembler? 'docker run ray --input-fastq=in.fastq -o contigs.fa'. Proper volume mounts would stay the responsibility of the user so that files can be found. Another advantage of this would be that piping is still possible, e.g. 'cat in.fastq | docker run velvet | gzip > out.fa.gz'. I think that would be closer to the hearts of the creators of GNU/Linux and Docker. Thoughts?
    Christian Frech
    @Gig77
    What about Yaml being an option only for Bioboxes that require complex input data types like assemblers (e.g. via 'docker run velvet --yaml inputs.yaml'), instead of making Yaml mandatory for all Bioboxes?
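A minimal sketch of the interface proposed above, assuming a hypothetical "assembler" class whose entry point accepts either `--input-fastq` or `--yaml`, plus an optional `-o`. The flag names come from the examples in the messages; the function name `build_parser` and the rule that exactly one input source is required are assumptions for illustration, not part of any bioboxes spec.

```python
import argparse


def build_parser(tool_class: str) -> argparse.ArgumentParser:
    """Build a parser for one (hypothetical) class of tools."""
    parser = argparse.ArgumentParser(prog=tool_class)

    # Assumption: a container takes either a plain FASTQ file or a
    # YAML file describing more complex inputs, but not both.
    source = parser.add_mutually_exclusive_group(required=True)
    source.add_argument("--input-fastq", help="path to a FASTQ file")
    source.add_argument("--yaml", help="path to a YAML file describing the inputs")

    # Optional parameter with a default, as in '-o contigs.fa'.
    parser.add_argument("-o", "--output", default="contigs.fa",
                        help="where to write the contigs")
    return parser


if __name__ == "__main__":
    args = build_parser("assembler").parse_args(
        ["--input-fastq=in.fastq", "-o", "contigs.fa"]
    )
    print(args.input_fastq, args.output)
```

Because every container of the same class would share one parser definition, a missing required parameter is rejected the same way regardless of which assembler sits behind it.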
    pbelmann
    @pbelmann
    @Gig77 We originally started with environment variables, and it became quite complicated when you want, for example, to assign different insert sizes to multiple fasta files.
    But mixing different interfaces might work, yes.
    I'm not sure if we could still integrate piping in bioboxes even with the current yaml based interface.
    We had our longest discussion regarding interfaces in issue #61. Everything in Bioboxes is open for discussion, so feel free to create issues with a proposal for mixing interfaces or passing a yaml with the commandline.
    Michael Barton
    @michaelbarton
    We could consider a simpler command line interface over the top of the YAML API. This might be a script that takes a fastq file and then takes care of mounting the files and generating the bioboxes.yml file.
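A rough sketch of what such a wrapper might look like: it takes a fastq file, generates the YAML text, and builds the `docker run` command with the mounts. The YAML layout, the `/bbx/input` and `/bbx/output` mount points, the `default` task name, and the helper names `biobox_yaml` and `docker_command` are all assumptions for illustration and should be checked against the current spec.

```python
import os


def biobox_yaml(fastq_name: str, read_type: str = "paired") -> str:
    """Return YAML text pointing at the fastq file inside the container.

    The layout below is an assumed shape, not taken from the spec.
    """
    return (
        "---\n"
        'version: "0.9.0"\n'
        "arguments:\n"
        "  - fastq:\n"
        '    - id: "reads_0"\n'
        f'      value: "/bbx/input/{fastq_name}"\n'
        f"      type: {read_type}\n"
    )


def docker_command(image: str, fastq_path: str, yaml_path: str,
                   output_dir: str) -> list:
    """Build the 'docker run' argument list with the volume mounts."""
    fastq = os.path.abspath(fastq_path)
    name = os.path.basename(fastq)
    return [
        "docker", "run",
        # Read-only mounts for the inputs (container paths are assumed).
        f"--volume={fastq}:/bbx/input/{name}:ro",
        f"--volume={os.path.abspath(yaml_path)}:/bbx/input/biobox.yaml:ro",
        # Writable mount for the results.
        f"--volume={os.path.abspath(output_dir)}:/bbx/output:rw",
        image, "default",
    ]


if __name__ == "__main__":
    print(biobox_yaml("in.fastq"))
    print(" ".join(docker_command("velvet", "in.fastq", "bioboxes.yml", "out")))
```

The wrapper only prints the command here; a real script would write the YAML to disk and pass the list to `subprocess.run`.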
    pbelmann
    @pbelmann
    @michaelbarton I agree
    Michael Barton
    @michaelbarton
    @Gig77 Would you be interested in implementing a simpler CLI over the existing YAML one?
    ecerami
    @ecerami
    hello, bioboxes people. I had a few newbie questions for you...
    Michael Barton
    @michaelbarton
    Yes, I’ll try to help if I can.
    You can ask any questions you have.
    ecerami
    @ecerami
    hi. Sorry, stepped away. I am basically wondering which doc to read that explains how bioboxes is distinct from docker.
    Michael Barton
    @michaelbarton
    Bioboxes is a standard for docker containers. We make suggestions for how certain types of docker containers should respond to different inputs and outputs.
    For example, the short read assembler spec describes how docker containers of this software should accept input and give output.
    The aim is to make them interchangeable, all with the same interface.
    ecerami
    @ecerami
    ok, thanks. which document explains this though? I am looking at https://github.com/bioboxes/rfc. this link also appears broken: http://bioboxes.org/getting-started/. I am just looking for the best starting point for documentation. thanks.
    Michael Barton
    @michaelbarton
    The website http://bioboxes.org has the most recent documentation. Do you find that this site doesn’t really explain how bioboxes relates to docker?
    pbelmann
    @pbelmann
    @ecerami the links on https://github.com/bioboxes/rfc are fixed now. You can find the user guide here: http://bioboxes.org/guide/user/
    ecerami
    @ecerami
    thanks, everyone. for a complete newbie, yes I found the documentation a bit hard to follow. for example, this page: https://github.com/bioboxes/rfc is a good intro, but it's not obvious where the actual meat of the RFC is, or whether Assembly, Binning, and Profiling are starting points for specific types of applications, or what I would do if I wanted to create an application that did not fall into one of these three categories. Anyway, I will read more. thanks.
    pbelmann
    @pbelmann
    @ecerami I agree it does not directly lead to the interfaces.
    We want to display the github rfc on bioboxes.org so that a developer/user does not have to switch between bioboxes.org and github. But for now we should maybe just reference it from here: https://github.com/bioboxes/rfc, as you stated.
    Could you create an issue in github for everything you think could be improved or even better provide a pull request?
    Michael Barton
    @michaelbarton
    The next bioboxes review meeting is set for July 02, the issue is #159.
    In the last meeting we agreed to have more focused milestones to help organise development goals. The milestone for the next three months will be increasing usage of bioboxes, and to do this we will start tracking downloads - #157.
    @Gig77 In response to your comments and those of others in a similar vein, we will start developing a simpler interface to allow using bioboxes in development workflows - #152.
    @pbelmann Do you need the binning validator set up for download from EC2?