These are chat archives for dereneaton/ipyrad

23rd
Jul 2018
Alice
@Alice85809572_twitter
Jul 23 2018 08:55
Hi,
I've been using ipyrad for a while, but I now have a problem.
I've sequenced 5 pairgbs libraries of about 96 individuals from 10 different species.
I used ipyrad to demultiplex each of the 5 libraries separately (because I can't mix different individuals that share the same barcode in steps 1 and 2). Now I want to run steps 3-7 for each species. The problem is that each species is split across the 5 libraries, so I need to specify 5 paths to the demultiplexed data instead of 1, and I don't know how to do that; the program doesn't allow it. I also tried making a new directory, copying into it the demultiplexed files for one species from all 5 libraries, and running ipyrad -b ... -p ... from there, but the program doesn't allow that either. I tried copying the .json file, but there are 5 different .json files, so how do I choose one? Is there a way to concatenate the 5 .json files into one big .json file, so that the program would proceed as if I had demultiplexed all the reads for the species at once rather than in 5 runs?
Thanks in advance!
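A minimal sketch of the "one directory per species" idea described above, using symlinks instead of copies. The directory names, the sample list, and the file-naming pattern (`<sample>_R1_.fastq.gz` / `<sample>_R2_.fastq.gz`) are assumptions for illustration and should be checked against the actual demultiplexed output. `gather_species_files` is a hypothetical helper, not part of ipyrad.

```python
from pathlib import Path


def gather_species_files(library_dirs, sample_names, dest_dir):
    """Symlink the demultiplexed files of the given samples into dest_dir.

    Returns the list of links created. Raises if two libraries would
    produce the same file name, so nothing is silently overwritten.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    wanted = set(sample_names)
    links = []
    for lib in map(Path, library_dirs):
        for fq in sorted(lib.glob("*.fastq.gz")):
            # Assumed naming: <sample>_R1_.fastq.gz / <sample>_R2_.fastq.gz
            sample = fq.name.rsplit("_R", 1)[0]
            if sample not in wanted:
                continue
            link = dest / fq.name
            if link.exists() or link.is_symlink():
                raise FileExistsError(f"{fq.name} found in more than one library")
            link.symlink_to(fq.resolve())
            links.append(link)
    return links
```

The resulting directory could then be given as `sorted_fastq_path` in a fresh params file for that species, on the assumption that ipyrad loads already-demultiplexed files from that path in step 1. ipyrad's command-line help also describes a merge option (`-m`) for combining assemblies, which may be the more direct route; `ipyrad -h` shows what the installed version supports.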
tommydevitt
@tommydevitt
Jul 23 2018 14:12
@isaacovercast That seems to have worked. Thanks! One question - how do I know if the analysis is running? I see that the input files were created successfully. All I see in the "Running" tab of the Jupyter notebook though are a terminal and the notebook.
Robin K Bagley
@rkbagley_twitter
Jul 23 2018 16:51

Hi! I am working with a collaborator to run some (already demultiplexed) 3RAD data through ipyrad. We get through steps 1 and 2 okay, and through step 3 for some of our samples, but when we hit one particular sample we get a "list index out of range" error similar to the one in a previous ticket (#291 I think).

Here is some of the text from the debug log near the error:
2018-07-20 18:19:12,506 pid=25591 [cluster_within.py] INFO INSIDE derep ME574
2018-07-20 18:19:12,506 pid=25591 [util.py] DEBUG Entering merge_pairs()
2018-07-20 18:19:12,511 pid=25591 [util.py] INFO gunzipping pairs
2018-07-20 18:19:29,664 pid=25591 [util.py] DEBUG merge cmd: /Users/ahippee/miniconda2/lib/python2.7/site-packages/bin/vsearch-linux-x86_64 --fastq_mergepairs /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/ME574.trimmedR1.fastq.tmp1 --reverse /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/ME574.trimmedR2.fastq.tmp2 --fastqout /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/ME574merged.fastq --fastqout_notmerged_fwd /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/tmpZoVpeC_nonmergedR1.fastq --fastqout_notmerged_rev /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/tmpPHRTPW_nonmergedR2.fastq --fasta_width 0 --fastq_minmergelen 35 --fastq_maxns 5 --fastq_minovlen 20 --fastq_maxdiffs 4 --label_suffix _m1 --fastq_qmax 1000 --threads 2 --fastq_allowmergestagger
2018-07-20 18:20:24,451 pid=25591 [cluster_within.py] INFO Entering declone_3rad - ME574
2018-07-20 18:20:24,664 pid=25591 [cluster_within.py] INFO INSIDE derep AH514
2018-07-20 18:20:24,665 pid=25591 [util.py] DEBUG Entering merge_pairs()
2018-07-20 18:20:24,669 pid=25591 [util.py] INFO gunzipping pairs
2018-07-20 18:20:31,695 pid=25591 [util.py] DEBUG merge cmd: /Users/ahippee/miniconda2/lib/python2.7/site-packages/bin/vsearch-linux-x86_64 --fastq_mergepairs /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/AH514.trimmedR1.fastq.tmp1 --reverse /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/AH514.trimmedR2.fastq.tmp2 --fastqout /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/AH514merged.fastq --fastqout_notmerged_fwd /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/tmpi7OtY5_nonmergedR1.fastq --fastqout_notmerged_rev /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/tmpsh95x9_nonmergedR2.fastq --fasta_width 0 --fastq_minmergelen 35 --fastq_maxns 5 --fastq_minovlen 20 --fastq_maxdiffs 4 --label_suffix _m1 --fastq_qmax 1000 --threads 2 --fastq_allowmergestagger
2018-07-20 18:20:52,834 pid=25591 [cluster_within.py] INFO Entering declone_3rad - AH514
2018-07-20 18:20:52,982 pid=25591 [cluster_within.py] WARNING Bad derephandle - /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/JA401A_derep.fastq
2018-07-20 18:20:53,027 pid=25447 [assembly.py] ERROR IPyradError( Caught error while decloning 3rad data - list index out of range)
2018-07-20 18:20:54,087 pid=25447 [assembly.py] INFO interrupted engine 0 w/ SIGINT to 25591
2018-07-20 18:20:55,088 pid=25447 [assembly.py] INFO shutting down engines
2018-07-20 18:20:55,116 pid=25447 [assembly.py] INFO finished shutdown
2018-07-20 18:20:55,187 pid=25447 [init.py] INFO debugging turned off

Any advice for resolving this would be helpful! Thanks!
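When a debug log like the one above gets long, it can help to pull out only the WARNING and ERROR lines and see which sample or file they name. This is a small sketch that assumes every line follows the `timestamp pid=N [module] LEVEL message` layout seen in the excerpt. `problem_lines` is a hypothetical helper, not an ipyrad function.

```python
import re

# Matches lines such as:
# 2018-07-20 18:20:52,982 pid=25591 [cluster_within.py] WARNING Bad derephandle - ...
LOG_LINE = re.compile(
    r"^(?P<time>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3})\s+"
    r"pid=(?P<pid>\d+)\s+"
    r"\[(?P<module>[^\]]+)\]\s+"
    r"(?P<level>[A-Z]+)\s+"
    r"(?P<message>.*)$"
)


def problem_lines(lines, levels=("WARNING", "ERROR")):
    """Return parsed log records whose level is in `levels`."""
    hits = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") in levels:
            hits.append(match.groupdict())
    return hits


if __name__ == "__main__":
    # Two lines copied from the excerpt, plus one INFO line for contrast.
    excerpt = [
        "2018-07-20 18:20:52,834 pid=25591 [cluster_within.py] INFO Entering declone_3rad - AH514",
        "2018-07-20 18:20:52,982 pid=25591 [cluster_within.py] WARNING Bad derephandle - /Users/ahippee/3RAD/pyrad/q20/run5/str_lib1_q20_edits/JA401A_derep.fastq",
        "2018-07-20 18:20:53,027 pid=25447 [assembly.py] ERROR IPyradError( Caught error while decloning 3rad data - list index out of range)",
    ]
    for rec in problem_lines(excerpt):
        print(rec["level"], rec["module"], rec["message"])
```

In the excerpt, the warning just before the error names `JA401A_derep.fastq` rather than the AH514 sample logged on the preceding line, so that file may be worth inspecting first.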

Isaac Overcast
@isaacovercast
Jul 23 2018 17:28
@tommydevitt If you look at top you should see running ipcluster engines.
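For checking the same thing without watching top, here is a rough sketch that counts engine-like processes in `ps` output. The match strings (`ipengine`, `ipyparallel.engine`) are assumptions about how the engine processes are named and may differ between ipyparallel versions. Both functions are hypothetical helpers, and the `ps` call assumes a POSIX system.

```python
import subprocess


def count_matching(ps_output, patterns=("ipengine", "ipyparallel.engine")):
    """Count process lines whose command contains any of the patterns."""
    return sum(
        any(p in line for p in patterns)
        for line in ps_output.splitlines()
    )


def list_processes():
    """Return `ps` output with one 'pid command' line per process (POSIX)."""
    result = subprocess.run(
        ["ps", "-A", "-o", "pid,args"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print("engine-like processes:", count_matching(list_processes()))
```

Passing `patterns=("ipcluster",)` counts the controlling ipcluster process instead of the engines.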
tommydevitt
@tommydevitt
Jul 23 2018 17:36
@isaacovercast Thanks. Looks like there's one instance of ipcluster running. Should there be one for each core specified, or just one total?
Amanda Haponski
@ahaponski_twitter
Jul 23 2018 17:49
Hi, I have a question about differences between the number of retained loci listed in the .stats file and the number in the .u.snps.phy file. For some of my datasets the two files match, and both report 1,000 loci/unlinked SNPs (just an example). For other datasets, the stats file may say 1,000 loci while there are only 900 in the .u.snps.phy. I see the same pattern with my .vcf files. I'm curious why there would be this difference. I'm assuming that those 100 loci are invariant and so not included, but I wanted to double check.