    Michael Barton
    @michaelbarton
    Thanks again for solving this. Switching out both nuitka and validictory was a very good solution.
    pbelmann
    @pbelmann
    Thanks, but it was a few hours' work.
    I noticed it when I was working on the new binning spec
    pbelmann
    @pbelmann

    the new file-validator problem is that the check_mounted_files method
    https://github.com/bioboxes/file-validator/blob/master/validate_biobox_file/main.py#L46-L52
    does not work with fasta types when the values are not defined in a list:

    version: 0.9.0
    arguments:
      - fasta:
          id: "id"
          type: "type"
          value: "value"

    And I need this for binning.

    Michael Barton
    @michaelbarton
    I see
    Because it iterates over a list
    I can take a look at this now.
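
The failure mode described above can be sketched as follows. This is a minimal illustration, not the actual file-validator code: it assumes each entry under `arguments` maps a type name such as `fasta` to either a single mapping or a list of mappings, and the helper names `as_entries` and `collect_values` are hypothetical.

```python
def as_entries(argument_value):
    """Return the entries for one argument as a list.

    Assumption: a value is either one mapping (as in the fasta example
    above) or a list of such mappings.
    """
    if isinstance(argument_value, dict):
        return [argument_value]
    return list(argument_value)


def collect_values(arguments):
    """Collect every 'value' field from a parsed 'arguments' section."""
    values = []
    for argument in arguments:
        for entries in argument.values():
            for entry in as_entries(entries):
                values.append(entry["value"])
    return values


# The same fasta argument, written once as a mapping and once as a list.
single = [{"fasta": {"id": "id", "type": "type", "value": "value"}}]
listed = [{"fasta": [{"id": "id", "type": "type", "value": "value"}]}]

print(collect_values(single))
print(collect_values(listed))
```

Normalizing to a list before iterating means both spellings of the argument go through the same loop, which is one possible way to handle the binning case.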
    pbelmann
    @pbelmann
    thanks Michael, this works
    Michael Barton
    @michaelbarton
    @pbelmann is the docker seminar series still in progress, or have the students finished now?
    pbelmann
    @pbelmann
    We now have a megahit biobox (still have to add/fork it to bioboxes). Three further bioboxes should be available next week.
    Michael Barton
    @michaelbarton
    @pbelmann Do you have a url for megahit?
    In issue #134 the commentator is one of the maintainers of megahit. If you shared your megahit container with him, he might be able to provide some feedback on the spec.
    pbelmann
    @pbelmann
    No, not at the moment. But he used the nucleotides container and is now working on abyss, I think. I will get his email tomorrow, I hope.
    Ah ok
    Michael Barton
    @michaelbarton
    Ok
    pbelmann
    @pbelmann
    Yes, I will contact him as soon as I am in contact with the student.
    Michael Barton
    @michaelbarton
    Ok
    It’s good that he’s interested in bioboxes.
    I hope that we can get developers interested and on board with the project.
    pbelmann
    @pbelmann
    I think he likes the idea of bioboxes, but I'm not sure if he is interested in helping us after the seminar ends. :smile:
    But I can ask him.
    Michael Barton
    @michaelbarton
    I mean voutcn in the issue. It’s good that assembler developers are interested.
    But if the student is interested, that’s good too.
    pbelmann
    @pbelmann
    Yes, I hope we get more bioinformatics tool developers helping bioboxes to evolve. For assemblers we have many examples, like your nucleotides containers, but for other categories it will not be that easy.
    Michael Barton
    @michaelbarton
    Yes, I am a little worried about that.
    If there are complex interfaces it could become more difficult.
    How are you finding developing the binning container spec?
    pbelmann
    @pbelmann
    :smile:
    I'm working on the binning spec, and I think in the end we can say that a binning biobox will be standardized but interchangeable only to a certain degree.
    Example:
    If I have a binning tool that references the refseq database in the yaml, I cannot just reuse this yaml for another binning tool, because the next tool might be using blastdb or any other database. You could only reuse the yaml if you had all databases on your local machine and all database entries in the yaml. And this will never be the case.
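
The reuse problem in this example can be sketched as a simple set comparison. This is only an illustration: the function `missing_databases` and the flat lists of database names are hypothetical and do not reflect the layout of the binning spec.

```python
def missing_databases(required, declared):
    """Return the databases a tool needs that the yaml does not declare."""
    return sorted(set(required) - set(declared))


# Hypothetical yaml written for a tool that uses refseq.
declared = ["refseq"]

print(missing_databases(["refseq"], declared))   # same tool: nothing missing
print(missing_databases(["blastdb"], declared))  # another tool: blastdb missing
```

The yaml is only reusable when this result is empty for every tool, which would require declaring every database each tool might want.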
    Michael Barton
    @michaelbarton
    I see.
    The problem is that they may all want a different database?
    pbelmann
    @pbelmann
    yes
    Michael Barton
    @michaelbarton
    And providing all databases takes too much space.
    pbelmann
    @pbelmann
    yes
    But at the moment I have kraken and metabat as examples and they don't have to use a database.
    Michael Barton
    @michaelbarton
    That’s good.
    I think for the user, not having a database is simpler, and so they might end up being used more.
    pbelmann
    @pbelmann
    Yes, that's true. But another problem is that kraken has a custom database, and custom databases are the worst case for bioboxes because there is no standard. So the kraken developers offer a mini kraken database of about 4 GB that will always be downloaded before kraken starts. I introduced a cache parameter so that a database might be reused, but that is still not a nice solution.
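
The idea behind a cache parameter can be sketched like this. It is a minimal sketch under stated assumptions, not the kraken biobox implementation: `ensure_database`, the cache directory layout, and the `download` callable are all hypothetical, and a small placeholder file stands in for the real database.

```python
import os
import tempfile


def ensure_database(cache_dir, name, download):
    """Return the path to a cached database, downloading it only if missing.

    `download` is a hypothetical callable that writes the database to the
    path it is given; it stands in for the real fetch of about 4 GB.
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, name)
    if not os.path.exists(path):
        download(path)
    return path


calls = []


def fake_download(path):
    """Record the call and write a small placeholder file."""
    calls.append(path)
    with open(path, "w") as handle:
        handle.write("placeholder database contents\n")


cache = tempfile.mkdtemp()
first = ensure_database(cache, "minikraken", fake_download)
second = ensure_database(cache, "minikraken", fake_download)

# The second call finds the cached file, so the download runs only once.
print(first == second, len(calls))
```

This avoids the repeated download, but the user still has to manage a cache directory outside the container, which matches the point that it is not a nice solution.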
    Michael Barton
    @michaelbarton
    Yes, that’s not ideal either.
    @fungs mentioned converting them to fasta, would that help with standardization?
    I guess that kraken needs it to be in its own special format.
    pbelmann
    @pbelmann
    I'm not sure.
    Even if it is in fasta
    you don't want to add 100+ entries in the yaml.
    Michael Barton
    @michaelbarton
    Yes
    I agree
    That’s tricky.