    Kasra
    @kkasra12
    Is there any built-in command to interleave two FASTQ files?
    mans
    @mansoureh_mans_twitter
    Hi Mark, thank you very much for your response. The PATH is exported correctly, because when I execute Seqc -help, all the options and relevant info are displayed correctly. However, whenever I try seqc xxx.seq I see absolutely no output. Nothing happens and the cursor jumps to the next line as if no command had been executed at all.
    A. R. Shajii
    @arshajii
    Hi @mansoureh_mans_twitter what is the content of xxx.seq?
    @kkasra12 Are you trying to produce an interleaved FASTQ? You can do that by reading two FASTQs simultaneously (e.g. with zip as in Python) then printing the two records.
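The zip-based approach suggested above can be sketched in plain Python (not Seq) as follows. This is a minimal illustration, assuming uncompressed four-line FASTQ records; the helper names `read_fastq` and `interleave` and the in-memory demo reads are hypothetical.

```python
import io
from itertools import islice

def read_fastq(handle):
    """Yield one FASTQ record (4 lines) at a time as a tuple of strings."""
    while True:
        record = [line.rstrip("\n") for line in islice(handle, 4)]
        if not record:
            return
        if len(record) != 4:
            raise ValueError("truncated FASTQ record")
        yield tuple(record)

def interleave(handle_r1, handle_r2, out):
    """Write records alternately (R1, R2, R1, R2, ...) and return the pair count."""
    count = 0
    # zip stops at the shorter input, so unpaired trailing reads are dropped.
    for rec1, rec2 in zip(read_fastq(handle_r1), read_fastq(handle_r2)):
        out.write("\n".join(rec1) + "\n")
        out.write("\n".join(rec2) + "\n")
        count += 1
    return count

# Tiny in-memory demo with made-up reads.
r1 = io.StringIO("@read1/1\nACGT\n+\nIIII\n@read2/1\nTTTT\n+\nIIII\n")
r2 = io.StringIO("@read1/2\nGGCC\n+\nIIII\n@read2/2\nAAAA\n+\nIIII\n")
merged = io.StringIO()
n_pairs = interleave(r1, r2, merged)
```

With real files, the two handles would come from `open(...)` on the R1 and R2 files and `out` would be the output file handle.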
    mans
    @mansoureh_mans_twitter
    Hi A.R. It is fixed now. Apparently I didn't execute Seqc xxx.seq in the folder where xxx.seq exists. That's why Seq was not able to find the file and moved to the next line without any output.
    Ibrahim Numanagić
    @inumanag
    Oh, that is a bug then! Thanks for reporting it
    Gert Hulselmans
    @ghuls
        open_func: Union[function[gzFile,str,str], function[File,str,str]] = open
        open_func = open
        open_mode = 'w'
        # Define open function and open mode to the correct setting, depending on the fact
        # that the output file is gzipped or not.
        if fastq_filename.endswith('.gz'):
            open_func = gzip.open
            open_mode = 'wb6'
        else:
            open_func = open
            open_mode = 'w'
    
        with open_func(fastq_filename, open_mode) as fastq_fh:
    Is there a way to get open_func behave like I want (gzip.open and open are different types)?
    A. R. Shajii
    @arshajii
    Are you using the types branch, @ghuls ?
    Gert Hulselmans
    @ghuls
    no, develop, but with the precompiled seq binary
    A. R. Shajii
    @arshajii
    Oh ok, just checking since we're in the process now of incorporating union types :P
    You can do something like this:
    def process_file(fastq_fh):
        ...
    
    if fastq_filename.endswith('.gz'):
        with gzip.open(fastq_filename, 'wb6') as fastq_fh: process_file(fastq_fh)
    else:
        with open(fastq_filename, 'w') as fastq_fh: process_file(fastq_fh)
    process_file is a generic function -- I believe the APIs for normal files and for gzip'd files should be basically identical
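For comparison, the same branch-then-call pattern can be written in plain Python (not Seq). This is only a sketch: the record written by `process_file` and the wrapper name `write_fastq` are hypothetical, and Python's `gzip.open` takes the compression level as a separate `compresslevel` argument rather than as part of the mode string.

```python
import gzip
import os
import tempfile

def process_file(fastq_fh):
    """Write one hypothetical FASTQ record to any text-mode file handle."""
    fastq_fh.write("@read1\nACGT\n+\nIIII\n")

def write_fastq(fastq_filename):
    # Branch once on the extension; both handles expose the same write() API.
    if fastq_filename.endswith(".gz"):
        with gzip.open(fastq_filename, "wt", compresslevel=6) as fastq_fh:
            process_file(fastq_fh)
    else:
        with open(fastq_filename, "w") as fastq_fh:
            process_file(fastq_fh)

# Demo: write the same record both ways and read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    plain_path = os.path.join(tmp_dir, "reads.fastq")
    gz_path = os.path.join(tmp_dir, "reads.fastq.gz")
    write_fastq(plain_path)
    write_fastq(gz_path)
    with open(plain_path) as fh:
        plain_text = fh.read()
    with gzip.open(gz_path, "rt") as fh:
        gz_text = fh.read()
```

Because the processing logic lives in one function, neither branch needs a variable that holds "either kind of open function".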
    Gert Hulselmans
    @ghuls
    Ah, in that case I will wait a bit. Any timeframe for a new release?
    Is there a way to process reads of a FASTQ file in parallel, but to write the modified reads in the original input order?
    Gert Hulselmans
    @ghuls
    It would also be nice if seq allowed writing to a stream that is read by a streaming command-line tool. Right now I have to write to stdout and then compress the output with pigz (parallel gzip). I was using the built-in gzip functionality (with gzip level 6) before, but then my script takes 35 minutes to run on that FASTQ file. When just writing to a file or stdout, it takes less than 8 minutes. Compressing that plain-text output with gzip -6 takes 15 minutes (the same time as running the script and piping the output directly to gzip -6). When piping the plain-text output (the uncompressed modified FASTQ file) of my script to pigz -p 4 -6, it takes less than 8 minutes too. So the built-in gzip compression seems to be quite slow. Being able to write directly to the stdin of pigz (or other tools) would be great. AWK supports output redirection to commands: https://www.gnu.org/software/gawk/manual/gawk.html#Redirection
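To illustrate the kind of redirection being requested, here is a sketch in plain Python (not Seq) that streams lines into the stdin of an external command using the standard `subprocess` module. The function name `write_through_command` and its parameters are hypothetical, and the pigz invocation in the comment is an assumption about how one might call it.

```python
import subprocess

def write_through_command(command, lines, out_path):
    """Stream text lines to a command's stdin; its stdout goes to out_path.

    Returns the command's exit code.
    """
    with open(out_path, "wb") as out_fh:
        proc = subprocess.Popen(
            command, stdin=subprocess.PIPE, stdout=out_fh, text=True
        )
        try:
            for line in lines:
                proc.stdin.write(line)
        finally:
            # Closing stdin signals end of input so the command can finish.
            proc.stdin.close()
        return proc.wait()

# Assumed usage with pigz (4 processes, level 6, compressed data on stdout):
# write_through_command(["pigz", "-p", "4", "-6", "-c"], fastq_lines, "out.fastq.gz")
```

The compressor runs as a separate process, so compression can overlap with the work that produces the reads instead of blocking it.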
    A. R. Shajii
    @arshajii
    We're just finalizing the new type system now and should hopefully be on track to do a major release in the coming couple weeks
    There's no easy way to do this kind of FASTQ processing as far as I know, but one approach would be the following:
    • For each block, create an empty list the size of the block (e.g. [s''] * N)
    Hm actually scratch this; it will still be out of order w.r.t blocks
    Need to think about this one more
    Also thanks for the note about gzip, will look more into it as well
    Gert Hulselmans
    @ghuls
    having an option to get results back in order when using parallel processing would be really helpful
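As a point of reference for what "results back in order" could look like, here is a sketch in plain Python (not Seq) using `concurrent.futures`, whose `Executor.map` returns results in input order regardless of which worker finishes first. The per-read function `trim_read` and the demo records are hypothetical. Note that `map` submits every task up front, so a real FASTQ file would need to be fed in chunks, and CPython threads do not speed up CPU-bound work.

```python
from concurrent.futures import ThreadPoolExecutor

def trim_read(record):
    """Hypothetical per-read edit: keep only the first 3 bases and qualities."""
    name, seq, plus, qual = record
    return (name, seq[:3], plus, qual[:3])

def process_in_order(records, func, workers=4):
    """Apply func to each record in parallel; results keep the input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(func, records))

# Demo with made-up reads.
records = [("@r%d" % i, "ACGTACGT", "+", "IIIIIIII") for i in range(100)]
processed = process_in_order(records, trim_read)
```

Writing `processed` out sequentially then reproduces the original read order in the output file.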
    Gert Hulselmans
    @ghuls
    zstd compression support might also be interesting to look at, at some point. With 1 thread it compresses the uncompressed modified FASTQ file from the seq script in real time (7 minutes) and the final size is even a bit smaller than the one from pigz -p 4 -6.
    John Leung
    @fuzzthink
    FYI, I used seq earlier this year both to try out the language (looking for a faster/typed/more functional pypy) and to learn genomics algorithms. I probably won't use seq again until I find a need for it, so I'm posting my repo of solutions up to Lesson 3, week 2 of the Coursera Bioinformatics / Stepik Genome Sequencing course here, hoping it can help those who are evaluating the language. https://github.com/fuzzthink/seq-genomics
    John Leung
    @fuzzthink
    Side note, my thoughts about Python's broken lambda still stand (https://gitter.im/seq-lang/Seq?at=5e450f2155b6b04bf6ac4b72). Since seq is a new language, there's no real need to continue with the poor design choices of the past.
    Ivan Perez
    @ivanpmartell
    What's the best and fastest way to do matrix operations in seq (e.g. dot product)? Should I use numpy?
    A. R. Shajii
    @arshajii
    Hey @ivanpmartell, integrating numpy into Seq is a big item on our TODO list. For now though if you are doing a lot of numpy-based computations I'd suggest doing them in Python and interfacing with Seq (https://docs.seq-lang.org/python.html).
    Ivan Perez
    @ivanpmartell
    How can I download a file over https or ftp with seq? I also tried importing Python and using urllib or requests, and I get an error that https is not supported. I think it's the Python that seq uses that's not configured for https.
    Ivan Perez
    @ivanpmartell
    When importing ssl, I get the following error: PyError: /usr/lib/python3.6/lib-dynload/_ssl.cpython-36m-x86_64-linux-gnu.so: undefined symbol: PyExc_OSError
    Ibrahim Numanagić
    @inumanag
    You can't yet
    If your htslib is compiled with curl support, maybe, but we haven't tested it yet
    we use htslib w/o curl exactly because of the massive dependency issues with curl and ssl
    YusufCakan
    @YusufCakan
    Hi, I am trying to install seq from source. However, it requires llvm6, which is not available from brew. Does anyone know how I can install it on macOS?
    A. R. Shajii
    @arshajii
    Hi @YusufCakan -- the easiest way to do this is probably to use the deps.sh script for building all dependencies
    We actually use our own fork of LLVM 6 anyway; this script downloads and builds that
    Should just be CC=clang CXX=clang++ ./deps.sh to run it. You can also pass it an argument to use >1 core (e.g. deps.sh 2 for 2 cores)
    (It will take a while to build all the dependencies though.)
    Once it's built you can pass the generated deps folder path to cmake: cd build && cmake .. -DSEQ_DEP=/path/to/deps
    Lmk if this works for you
    YusufCakan
    @YusufCakan
    Hi, thanks. I followed the instructions you gave and it does run without errors, but no executable appears in build/bin.
    The output when I run cmake is:

    cmake .. -DSEQ_DEP=../deps -DLLVM_DIR=llvm-config --cmakedir -DHTS_LIB=htslib-1.9/libhts.a -DGC_LIB=gc-8.0.4/release/lib/libgc.a CPATH=gc-8.0.4/release/include:htslib-1.9 cmake --build .
    -- Dependency directory: ../deps
    -- Found LLVM 6.0.0
    -- Using LLVMConfig.cmake in: /Users/yusufcakan/Dropbox/6_Project_Files/1_Personal_Projects/FX/fx_seq_orig/deps/lib/cmake/llvm
    -- Found zlib: ../deps/lib/libz.a
    -- Found bdwgc: ../deps/lib/libgc.a
    -- Found OCaml: /Users/yusufcakan/Dropbox/6_Project_Files/1_Personal_Projects/FX/fx_seq_orig/deps/lib/ocaml
    -- Found Menhir: ../deps/share/menhir
    -- Found OpenMP: ../deps/lib/libomp.dylib
    CMake Deprecation Warning at CMakeLists.txt:1 (cmake_minimum_required):
    Compatibility with CMake < 2.8.12 will be removed from a future version of
    CMake.

    Update the VERSION argument <min> value or use a ...<max> suffix to tell
    CMake that the project does not need compatibility with older versions.

    -- Configuring done
    -- Generating done
    -- Build files have been written to: /Users/yusufcakan/Dropbox/6_Project_Files/1_Personal_Projects/FX/fx_seq_orig/build/googletest-download
    [ 11%] Performing update step for 'googletest'
    [ 22%] No patch step for 'googletest'
    [ 33%] No configure step for 'googletest'
    [ 44%] No build step for 'googletest'
    [ 55%] No install step for 'googletest'
    [ 66%] No test step for 'googletest'
    [ 77%] Completed 'googletest'
    [100%] Built target googletest
    -- Configuring done
    -- Generating done
    -- Build files have been written to: /Users/yusufcakan/Dropbox/6_Project_Files/1_Personal_Projects/FX/fx_seq_orig/build

    What am I doing wrong?
    Mark Henderson
    @markhend
    I didn't try to recreate your steps, and I don't know what is wrong exactly, but I'm curious, did you run cmake --build . as a separate command? I have very little experience with cmake, but what you pasted above looks odd to me.
    A. R. Shajii
    @arshajii
    Hey @YusufCakan -- you shouldn't need to specify those extra args to cmake: cmake .. -DSEQ_DEP=../deps alone should do the trick
    Lmk if this works
    Oh yeah and don't forget to actually build: cmake --build build