    ecerami
    @ecerami
    thanks, everyone. for a complete newbie, yes I found the documentation a bit hard to follow. for example, this page: https://github.com/bioboxes/rfc is a good intro, but it's not obvious where the actual meat of the RFC is, or whether Assembly, Binning, and Profiling are starting points for specific types of applications, or what I would do if I wanted to create an application that did not fall into one of these three categories. Anyway, I will read more. thanks.
    pbelmann
    @pbelmann
    @ecerami I agree it does not directly lead to the interfaces.
    We want to display the GitHub RFC on bioboxes.org so that a developer/user does not have to switch between bioboxes.org and GitHub. But for now we should maybe just reference it from here: https://github.com/bioboxes/rfc as you stated.
    Could you create an issue in github for everything you think could be improved or even better provide a pull request?
    Michael Barton
    @michaelbarton
    The next bioboxes review meeting is set for July 02, the issue is #159.
    In the last meeting we agreed to have more focused milestones to help organise development goals. The milestone for the next three months will be increasing usage of biobox and to do this we will start tracking downloads - #157.
    @Gig77 In response to your comments and that of others in a similar vein, we will start developing a simpler interface to allow using bioboxes in development workflows #152.
    @pbelmann Do you need the binning validator set up for download from EC2?
    pbelmann
    @pbelmann
    Yes that would be great.
    and assembly benchmark validator too
    Michael Barton
    @michaelbarton
    Ok
    pbelmann
    @pbelmann
    thanks michael
    Michael Barton
    @michaelbarton
    I’ve created a docker container repository which should simplify this.
    The circle ci server still needs the EC2 parameters however.
    So it still requires manual work.
    pbelmann
    @pbelmann
    ok
    Michael Barton
    @michaelbarton
    Perhaps AWS code pipeline might be useful - http://aws.amazon.com/codepipeline/
    It’s still in beta
    pbelmann
    @pbelmann
    But you would still have to provide EC2 keys ?
    Michael Barton
    @michaelbarton
    Yes, hopefully it would allow us to create a deployment template or something like that. At the moment I basically have to copy and paste a set of commands into circle ci each time.
    pbelmann
    @pbelmann
    ah ok
    A container runtime too - http://blog.docker.com/2015/06/runc
    Open containers project - https://www.opencontainers.org/
    pbelmann
    @pbelmann
    I think appc was already a great start for a container runtime. I hope they will reuse most of it.
    Michael Barton
    @michaelbarton
    Metrics page is now live on the site - http://bioboxes.org/metrics/
    Michael Barton
    @michaelbarton
    My suggestion for the biobox command line interface
    biobox short-read-assemble bioboxes/velvet -i FASTQ -o CONTIGS
    Michael Barton
    @michaelbarton
    Bioboxes data file - https://github.com/bioboxes/data
    Johannes Dröge
    @fungs
    I think the syntax is clear and just what I was thinking of. I'd suggest to make it a bit more abstract for further extension and then simplify via shortcuts/alias names like:
    biobox run --container docker://bioboxes/velvet --specification biobox.yaml --arguments -i FASTQ -o CONTIGS
    or shorthand:
    biobox run docker://bioboxes/velvet -i FASTQ -o CONTIGS
    docker:// is the container runtime backend
    run is analogous to docker run (further commands to be added),
    the specification can be passed as a file but if the biobox command can link the container id and the spec itself via metadata, then there is no need to pass it.
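A minimal sketch of how the shorthand form above might be parsed, using only Python's standard argparse module. The function names and the layout of the returned dictionary are assumptions made for illustration and do not come from any existing biobox tool. Anything argparse does not recognise is passed through untouched as arguments for the container.

```python
import argparse


def split_container_uri(uri, default_backend="docker"):
    """Split 'docker://bioboxes/velvet' into ('docker', 'bioboxes/velvet').

    A URI without a '<backend>://' prefix falls back to default_backend
    (this fallback is an assumption, not something agreed in the chat).
    """
    backend, sep, image = uri.partition("://")
    if not sep:
        return default_backend, uri
    return backend, image


def build_parser():
    parser = argparse.ArgumentParser(prog="biobox")
    commands = parser.add_subparsers(dest="command", required=True)
    run = commands.add_parser("run", help="run a biobox container")
    run.add_argument("container", help="e.g. docker://bioboxes/velvet")
    run.add_argument("--specification", default=None,
                     help="optional path to a biobox.yaml file")
    return parser


def parse(argv):
    # parse_known_args keeps unrecognised options (-i, -o, ...) so they
    # can be forwarded to the biobox itself.
    args, passthrough = build_parser().parse_known_args(argv)
    backend, image = split_container_uri(args.container)
    return {
        "command": args.command,
        "backend": backend,
        "image": image,
        "specification": args.specification,
        "biobox_args": passthrough,
    }


if __name__ == "__main__":
    print(parse(["run", "docker://bioboxes/velvet", "-i", "FASTQ", "-o", "CONTIGS"]))
```

Keeping the backend inside the URI means a different runtime could be added later without changing the command-line shape.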
    Johannes Dröge
    @fungs
    IMO, there should also be an option to pass the YAML file itself (I believe it is better to let the YAML file point to valid data on the local system and to let the biobox wrapper transform it to paths according to the container-internal mount points before passing it to the container).
    just my two cents...
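The path translation described above could be sketched roughly as follows. This is only an illustration: the mount root `/bbx/mnt`, the function name, and the numbering scheme for mounted directories are all assumptions, and a plain list of paths stands in for whatever the parsed YAML file would contain.

```python
import os
import posixpath


def map_to_container(host_paths, mount_root="/bbx/mnt"):
    """Rewrite host file paths to paths under a container mount root.

    Returns (volumes, rewritten):
      volumes   - dict of host directory -> container directory to mount
      rewritten - container-side path for each input, in the same order
    """
    volumes = {}
    rewritten = []
    for path in host_paths:
        host_dir, name = os.path.split(os.path.abspath(path))
        if host_dir not in volumes:
            # One mount point per distinct host directory (hypothetical scheme).
            volumes[host_dir] = posixpath.join(mount_root, str(len(volumes)))
        rewritten.append(posixpath.join(volumes[host_dir], name))
    return volumes, rewritten


if __name__ == "__main__":
    vols, paths = map_to_container(
        ["/data/a/reads_1.fq", "/data/a/reads_2.fq", "/data/b/ref.fa"]
    )
    print(vols)
    print(paths)
```

The wrapper would then mount each entry in `volumes` and hand the container a copy of the YAML with the rewritten paths, so the user's file never needs to know about container-internal locations.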
    Johannes Dröge
    @fungs
    this is some module which can create a python argparse (option parsing) object directly from a YAML or JSON file: https://github.com/fmenabe/python-clg
    it might be useful
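To illustrate the general idea of building an option parser from a declarative file, here is a small standard-library sketch. It does not use python-clg and does not reproduce its file format or API; the JSON layout (`prog`, `options`, `flags`) is made up for this example.

```python
import argparse
import json

# Hypothetical declarative description of the command-line options.
SPEC = """
{
  "prog": "biobox",
  "options": [
    {"flags": ["-i", "--input"],  "help": "FASTQ reads",             "required": true},
    {"flags": ["-o", "--output"], "help": "destination for contigs", "required": true}
  ]
}
"""


def parser_from_spec(spec):
    """Build an argparse parser from a dict shaped like SPEC above."""
    parser = argparse.ArgumentParser(prog=spec["prog"])
    for option in spec["options"]:
        # Everything except 'flags' is forwarded as add_argument keywords.
        kwargs = {key: value for key, value in option.items() if key != "flags"}
        parser.add_argument(*option["flags"], **kwargs)
    return parser


if __name__ == "__main__":
    parser = parser_from_spec(json.loads(SPEC))
    print(parser.parse_args(["-i", "reads.fq", "-o", "contigs.fa"]))
```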
    Michael Barton
    @michaelbarton
    Sounds like Peter’s talk at BOSC went well - https://twitter.com/GigaScience/status/619823935933284352
    Andreas Bremges
    @abremges
    Yes, from what I've heard from Scott Edmunds & Peter himself
    Definitely the right place to present bioboxes
    Michael Barton
    @michaelbarton
    I have got a prototype branch working on the biobox cli.
    This allows the following command: biobox short_read_assembler bioboxes/megahit --verify
    This verifies that a short_read_assembler biobox conforms to the spec.
    I’m most excited about this because building and testing new biobox docker containers is not easy, but this makes it much easier.
    Albert Vilella
    @avilella
    I may be late in finding out about this DinD approach but here it goes: https://twitter.com/ProfParmer/status/651835246158024704
    Paolo Di Tommaso
    @pditommaso
    It looks like this approach has a serious security problem
    Michael Barton
    @michaelbarton
    I agree with the blog post, providing access to the docker socket is still a security concern. It would be nice if there was a better way to support Docker in Docker or at least some variant.
    NERSC have created ‘shifter’ which seems to provide a safer way to do this in a HPC environment - https://www.nersc.gov/news-publications/nersc-news/nersc-center-news/2015/shifter-makes-container-based-hpc-a-breeze/
    Paolo Di Tommaso
    @pditommaso
    It looks like an interesting project, and I suppose that this kind of "hybrid" approach (i.e. Docker + custom tools) is the most likely to take hold in HPC
    Daniel Schober
    @DSchober
    Hi there, at the moment I see very few bioboxes available and their presentation as a simple list might be sufficient for the moment, but if this repository grows, you will need a way to search for particular boxes. Is there anything planned yet to group/classify and annotate biobox functionalities in a standardized way?
    Johannes Dröge
    @fungs
    @DSchober you are right, currently we are only maintaining a plain list. Bioboxes tries to follow the approach "by users for users" and thus we will have to come up with a more accessible and detailed index data structure which users can contribute to. Since our central hub currently is the GitHub project, I think that YAML/JSON files (over HTTPS with clients) will be a good start.
    Michael Barton
    @michaelbarton
    @fungs There is a YAML document created by @pbelmann here - https://github.com/bioboxes/data/blob/master/images.yml
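As a rough sketch of the searchable index discussed above, the snippet below filters a small JSON list by task. The field names (`image`, `tasks`) and the `example/binner` entry are hypothetical and are not taken from images.yml; only the image names bioboxes/velvet and bioboxes/megahit and the task name short_read_assembler appear in this conversation.

```python
import json

# Hypothetical index format; the real images.yml may be organised differently.
INDEX = json.loads("""
[
  {"image": "bioboxes/velvet",  "tasks": ["short_read_assembler"]},
  {"image": "bioboxes/megahit", "tasks": ["short_read_assembler"]},
  {"image": "example/binner",   "tasks": ["binning"]}
]
""")


def find_by_task(index, task):
    """Return the image names whose entry lists the given task."""
    return [entry["image"] for entry in index if task in entry["tasks"]]


if __name__ == "__main__":
    print(find_by_task(INDEX, "short_read_assembler"))
```

A flat file like this stays easy to edit through pull requests, while clients can fetch it over HTTPS and filter locally.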