Tim Head
Daniel Standage
that python setup was trying to run gcc-4.2 to compile some of the code
must've been a default, because it wasn't specified anywhere in our setup.py or anything
brew install python followed by brew link python did the trick
...aaaaaand then I removed gcov
make coverage.xml (or whatever it's called) was segfaulting
not a long term solution, but...
Tim Head
I think I've gotten to the bottom of why the coverage has dropped: https://github.com/gcovr/gcovr/issues/140#issuecomment-243157655. Currently my diagnosis is that at some point Jenkins stopped using a special version of gcovr.
I mean, there is GCOVRURL in the Makefile which looks like it should be used by pip to install gcovr but isn't actually used ever and from the branch name it sounds like it does something with unreachable branches, which is the commandline argument that seems to "fix" things
maybe not the most coherent sentence ...
Daniel Standage
khmer has definitely installed custom package versions before, hosted at ci.oxli.org
maybe it was something like that
Tim Head
Kevin Murray
Our manuscript for kWIP is finally out. Thanks to all here for your help getting khmer to play nice! http://biorxiv.org/content/early/2016/09/16/075481
Philipp Schiffer

Hi guys! (sorry found this room only after asking this by email)

After a long while I am returning to khmer for a new genome, but find myself a bit confused about the best approach to digital normalisation (I want ~100x) with your pipeline v.2.0.
One thing is that filter-abund.py apparently needs the kmer hash table, but normalize-by-median.py appears to lack the --savehash option now.
Would/could you maybe point me to the most recent example of the workflow? That would be very helpful.
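
A rough sketch of one possible two-step workflow for the question above, written as a small Python helper that only builds the command lines and does not run them. It assumes that the old --savehash option is now spelled -s / --savegraph in khmer 2.x, that normalize-by-median.py writes its output to <input>.keep in the working directory, and that filter-abund.py takes the saved graph followed by the reads; the file names reads.fq and reads.ct are made up. Please check each script's --help before relying on any of this.

```python
import os
import shlex


def diginorm_commands(reads, coverage=100, graph="reads.ct"):
    """Build (but do not run) normalize + filter command lines.

    Flags other than -C are assumptions about khmer 2.x; verify with --help.
    """
    normalize = [
        "normalize-by-median.py",
        "-C", str(coverage),  # target median k-mer coverage (~100x here)
        "-s", graph,          # assumed replacement for the old --savehash
        reads,
    ]
    # Assumed default output name: basename of the input plus ".keep".
    kept = os.path.basename(reads) + ".keep"
    # filter-abund.py is assumed to take the graph first, then the reads.
    filter_abund = ["filter-abund.py", graph, kept]
    return [normalize, filter_abund]


if __name__ == "__main__":
    for cmd in diginorm_commands("reads.fq"):
        print(shlex.join(cmd))  # shlex.join needs Python 3.8+
```

Printing the commands first makes it easy to compare them against the installed scripts' help text before running anything on real data.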



Hello, can anyone please explain the file dib-lab/khmer/data/100k-filtered.fa? What does the data represent?
I am collecting some DNA sequence data for personal testing and would like to use this data. Thank you!
Daniel Standage
Hi @hmyan90! The provenance of this data is not documented. Given the read ID, I would have assumed it's real data from an Illumina sequencer, and I wouldn't be surprised if it's from something like E. coli. But I can't confirm that. Really, the purpose of this file as far as khmer is concerned is to make sure the software handles Fastq and Fasta files correctly.
Hope this helps!
Hi, thank you for replying. I have figured out that it is from the NCBI database, and NCBI has exactly the data I need. @standage
Hello, does anyone know if the latest version of khmer still has a functioning filter-below-abund.py script in the sandbox directory? I am trying to run the khmer/sandbox/filter-below-abund.py script in order to trim off high-abundance kmers for a metagenome assembly.

And this is the error I receive:
Traceback (most recent call last):
  File "/local/cluster/khmer-legacy/sandbox/filter-below-abund.py", line 49, in <module>
  File "/local/cluster/khmer-legacy/sandbox/filter-below-abund.py", line 22, in main
    ht = khmer.load_counting_hash(counting_ht)
AttributeError: 'module' object has no attribute 'load_counting_hash'

I am concerned that I am getting this error because the filter-below-abund.py script is no longer part of the khmer pipeline.

The newest installed khmer in our Linux /local/cluster/bin is version 2.0+103.g8300de0, but the filter-below-abund.py script did not show up after the installation.

The script I used came from an older installation of khmer in /local/cluster.

(The python version I am using: Python 2.7.14, and the OS Version:
Linux 3.10.0-693.11.6.el7.x86_64 x86_64)

I wanted to know if anyone would know why I am getting this error, if the filter-below-abund.py script should be included in installations of the latest khmer version, and if this script is still functioning.

Daniel Standage
Hi @MessyaszA. The error message makes me think that there is some kind of conflict or confusion regarding the two different khmer versions installed on the cluster.
Perhaps the filter-abund.py or the filter-abund-single.py scripts from the newer version can satisfy your needs?
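
One way to check for the kind of version mix-up described above is to ask Python which khmer it actually imports. This is only a diagnostic sketch: describe_module is a made-up helper, and load_countgraph is listed only as a guess at what the newer loader may be called (load_counting_hash is the name from the traceback).

```python
import importlib


def describe_module(name, attrs=("load_counting_hash", "load_countgraph")):
    """Report where a module was imported from and which attributes it has.

    The default attribute names are candidates to probe, not a statement
    about what any particular khmer release provides.
    """
    module = importlib.import_module(name)
    return {
        "file": getattr(module, "__file__", None),
        "version": getattr(module, "__version__", None),
        "has": {attr: hasattr(module, attr) for attr in attrs},
    }


if __name__ == "__main__":
    try:
        print(describe_module("khmer"))
    except ImportError as exc:
        print("could not import khmer: %s" % exc)
```

If the reported file path points at the new installation while the script comes from the old one (or the other way around), that mismatch would explain the AttributeError. The sketch should run under the same Python 2.7 interpreter that produced the traceback.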
@standage I'm not sure if filter-abund.py and filter-abund-single.py would satisfy my needs. Those scripts trim low abundance kmers, but for metagenome assembly I see that the opposite is recommended - trimming high abundance kmers. When looking at the newer version I don't see any commands in those scripts that would allow me to trim high abundance kmers rather than low abund. kmers. I'm also wondering if anyone has a recommendation for metagenome assembly that would either skip this step or use a different method.
Daniel Standage
I know my colleagues have used the variable coverage trimming options on transcriptome and metagenome data. That is not something I have worked much with. Let me ping @ctb and see what he has to say.
C. Titus Brown
hi @MessyaszA we recommended doing two things in the past but things have changed
so the two old things were -
  • trim low abundance k-mers with variable coverage approach
  • trim high abundance k-mers b/c of partitioning etc
with newer assemblers like megahit, I would say
  • assemble with megahit if you can! it will work except for really really big metagenomes.
C. Titus Brown
if you really want to do k-mer trimming to reduce memory requirements prior to assembly, then the instructions here https://peerj.com/preprints/890/ are what I would suggest. we use this a lot, but not for assembly specifically.
it basically comes down to using either 'filter-abund.py -V' or 'trim-low-abund.py -V' from khmer.
but honestly I don't think you need to do any trimming prior to metagenome assembly unless you are trying to lower memory requirements, which megahit probably won't need.
If you want to ask some more questions, please just file an issue at github.com/dib-lab/khmer and we will help you there! I don't get notifications from gitter :(
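
For reference, a minimal sketch of how the 'trim-low-abund.py -V' suggestion above could be assembled from Python. Only -V comes from this conversation; the -k and -M options (k-mer size and memory limit) and the file name reads.fq are assumptions to check against trim-low-abund.py --help. Nothing is executed here.

```python
def trim_low_abund_cmd(reads, ksize=None, max_memory=None):
    """Build a variable-coverage trimming command line as a list of arguments."""
    cmd = ["trim-low-abund.py", "-V"]  # -V: variable coverage mode
    if ksize is not None:
        cmd += ["-k", str(ksize)]       # assumed k-mer size option
    if max_memory is not None:
        cmd += ["-M", str(max_memory)]  # assumed memory limit option
    return cmd + list(reads)


if __name__ == "__main__":
    # Hypothetical input file and settings, shown for illustration only.
    print(" ".join(trim_low_abund_cmd(["reads.fq"], ksize=20, max_memory="4e9")))
```

The returned list can be passed to subprocess.run once the flags have been confirmed for the installed version.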
@standage @ctb thank you for the advice!
Brad Langhorst
I’d like to get a list of the most abundant kmers in my sample set… what’s the right script for this? find-knots?
i should mention… i’ve already created countgraphs with load-into-counting and prepared abundance histograms...
I want to find out the identity of the sequences in that 31k+ category.
(these are 6-mers)
Brad Langhorst
i’m trying partition_graph -> find_knots now… please let me know if that’s the wrong path… thanks!
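
Since these are 6-mers, there are only 4^6 = 4096 possible k-mers, so one option that sidesteps khmer entirely is to count them exactly in plain Python. The sketch below is not a khmer script or API; top_kmers is a made-up helper that takes an iterable of sequence strings. It collapses each k-mer with its reverse complement, on the assumption that this matches how the countgraph counted them, and skips k-mers containing anything other than A, C, G, T.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")  # Python 3 only


def top_kmers(sequences, k=6, n=10, canonical=True):
    """Return the n most common k-mers as (kmer, count) pairs."""
    counts = Counter()
    for seq in sequences:
        seq = seq.upper()
        for i in range(len(seq) - k + 1):
            kmer = seq[i:i + k]
            if not set(kmer) <= set("ACGT"):
                continue  # skip k-mers with N or other ambiguity codes
            if canonical:
                revcomp = kmer.translate(_COMPLEMENT)[::-1]
                kmer = min(kmer, revcomp)  # one key per strand pair
            counts[kmer] += 1
    return counts.most_common(n)


if __name__ == "__main__":
    # Toy reads for illustration; real input would come from a FASTA/FASTQ parser.
    for kmer, count in top_kmers(["ACGTACGTACGT", "TTTTTTTT"], n=5):
        print(kmer, count)
```

The k-mers whose counts land in the high-abundance bin of the histogram are then simply the top entries of this list.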
Hi everyone - very very new user here and doing an assignment for class with khmer. Currently trying to install khmer with "pip3 install khmer" and running into an error: "ERROR: Command errored out with exit status 1: /Users/haleyhuston/miniconda3/bin/python -u -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/private/var/folders/rc/3h5px1p12yx0tbs36jtfb_4h0000gn/T/pip-install-evflawz6/khmer/setup.py'"'"'; file='"'"'/private/var/folders/rc/3h5px1p12yx0tbs36jtfb_4h0000gn/T/pip-install-evflawz6/khmer/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(file);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, file, '"'"'exec'"'"'))' install --record /private/var/folders/rc/3h5px1p12yx0tbs36jtfb_4h0000gn/T/pip-record-g1my_5bd/install-record.txt --single-version-externally-managed --compile --install-headers /Users/haleyhuston/miniconda3/include/python3.8/khmer Check the logs for full command output." Would anyone be able to advise?