These are chat archives for dereneaton/ipyrad

5th
Nov 2018
AliceLedent
@AliceLedent
Nov 05 2018 13:39
Hi, I'm now running cutadapt before ipyrad (filter_adapter =0) so that I can specify the adapter sequences I want to trim. I then run ipyrad on my paired-end reads from step 1 to 7. The first step works perfectly and the second step seems to work as well (no error in the log file), but the third step crashes with this error: "Invalid line 10716930 in FASTQ file: Unexpected end of file\n'))". Indeed, even though I didn't get an error, the two paired files output by step 2 don't have the same length. I also think the "_edits" folder contains some temporary files, as if step 2 did not complete (96,1_derep.fastq and 96,1merged.fastq). Interestingly, the s2_rawedit_stats.txt file seems normal!
I checked, and the cutadapt output files are fine, as are the output files of step 1. Could this be linked to the param "filter_adapter =0"? Before, I ran ipyrad on the same data directly on the reads from the Illumina sequencer with the param "filter_adapter =2" and it worked perfectly. I don't see any other change in my params files or .sh file that could explain why it used to work and no longer does.
Could this be because the cluster is overloaded, so the program stops writing the output files? How could I know that? I don't receive any alert from the Slurm system.
Thank you very much in advance!
Alice
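One way to confirm the mismatch described above is to count the records in each file of a pair. The following is only a minimal sketch in Python, assuming standard 4-line FASTQ records; the function names are made up for illustration and are not part of ipyrad or cutadapt.

```python
import gzip


def count_fastq_records(path):
    """Count records in a FASTQ file, assuming 4 lines per record."""
    # Transparently handle gzip-compressed files (assumption: ".gz" suffix).
    opener = gzip.open if str(path).endswith(".gz") else open
    with opener(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    if n_lines % 4:
        # A line count that is not a multiple of 4 suggests a truncated write.
        raise ValueError(
            f"{path}: {n_lines} lines is not a multiple of 4 (truncated file?)"
        )
    return n_lines // 4


def pairs_match(r1_path, r2_path):
    """Return True if both files of a pair hold the same number of records."""
    return count_fastq_records(r1_path) == count_fastq_records(r2_path)
```

Calling `pairs_match()` on the R1 and R2 files of one sample (paths of your own choosing) either returns False when the counts differ or raises a ValueError when one file ends mid-record, which would be consistent with an "Unexpected end of file" message.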
heather340
@heather340
Nov 05 2018 15:31

Hello! I'm attempting to run Tetrad using my .snps.phy and .snps.map outfiles from ipyrad, but am running into difficulty with an engine dying. I've checked my core settings with a colleague who uses the same HPC system for Tetrad and everything checks out with her, and I just installed toytree and any other associated packages. For context:
Enabling debug mode
tetrad instance: 4_85_a_snps
loading seq array [45 taxa x 24792 bp]
max unlinked SNPs per quartet (nloci): 4453
inferring 148995 quartet tree sets
establishing parallel connection:
host compute node: [24 cores] on o0412.ten.osc.edu
[####################] 100% generating q-sets | 0:00:06 |
[# ] 9% initial tree | 0:00:40 |
Unknown exception encountered: EngineError(Engine '4ee7d881-820a7c5f437ce7eb1a6c9293' died while running task u'1934ea0d-e1b1e110ae8c42105728f3b2')
warning: error during shutdown:

[Errno 3] No such process

Resources requested:
nodes=1:ppn=24

mem=109704mb

Resources used:
cput=00:03:25
walltime=00:01:56
mem=1.792GB
vmem=1.290GB
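
A side note on the log above: the "148995 quartet tree sets" figure equals the number of ways to choose 4 taxa out of 45, which suggests the run is set to infer every possible quartet. A quick check in Python (the function name is just illustrative, not a tetrad call):

```python
from math import comb  # available in Python 3.8+


def n_quartets(n_taxa):
    """Number of distinct 4-taxon subsets that can be drawn from n_taxa."""
    return comb(n_taxa, 4)


print(n_quartets(45))  # 148995, matching the log line above
```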

arminf
@arminf82
Nov 05 2018 17:28
In Mainz @Mogon we haven't been able to run ipyrad for around 7 months because of engines dying. We tried everything from reinstalling, ipcluster etc...
@isaacovercast When people show their ABBA-BABA results there are never more than a couple of tests in the plots. I followed the cookbook and get at least 170k tests with my GBS data, which of course I can't plot on the topology with Toyplot. Why is my data producing so many tests?
Deren Eaton
@dereneaton
Nov 05 2018 19:48
@bioballs, The functions to "generate all tests" can lead to lots and lots of tests for large trees, definitely too many to plot. You can either run only the tests you are interested in, or run all tests and then select only the most interesting for plotting. Here is an example notebook: http://nbviewer.jupyter.org/github/dereneaton/Canarium-GBS/blob/master/nb-6-abba-baba.ipynb
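The "run all tests, then select the most interesting" route could look roughly like the sketch below. It is plain Python and not ipyrad's API: the results are assumed to be a list of dicts with a "Z" key holding each test's Z-score, and the cutoff and the number of tests kept are arbitrary illustrative values.

```python
def select_top_tests(results, z_cutoff=3.0, max_tests=20):
    """Keep the tests with the largest |Z|, up to max_tests of them.

    `results` is assumed to be a list of dicts, each with a "Z" entry;
    that layout is hypothetical and only serves this example.
    """
    # Drop tests whose absolute Z-score falls below the cutoff.
    significant = [r for r in results if abs(r["Z"]) >= z_cutoff]
    # Rank the remaining tests from strongest to weakest signal.
    significant.sort(key=lambda r: abs(r["Z"]), reverse=True)
    return significant[:max_tests]


# Made-up example values, not real data.
example = [
    {"test": "t1", "Z": 0.4},
    {"test": "t2", "Z": -5.2},
    {"test": "t3", "Z": 3.1},
]
top = select_top_tests(example, z_cutoff=3.0, max_tests=2)
print([r["test"] for r in top])  # ['t2', 't3']
```

Only the handful of tests returned this way would then be passed on to plotting.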