These are chat archives for dereneaton/ipyrad

17th
Nov 2016
R2C2.lab
@R2C2_Lab_twitter
Nov 17 2016 11:07
Hi @edgardomortiz; many thanks. Do I have to run all steps again if I update to version 0.5.2?
Edgardo M. Ortiz
@edgardomortiz
Nov 17 2016 14:36
@R2C2_Lab_twitter I don't think so, just rerun step 6
LinaValencia85
@LinaValencia85
Nov 17 2016 17:16
Hi @dereneaton @isaacovercast I have successfully run ipyrad from s1-s6, but I am now getting an error in s7:

loading Assembly: nwm-s456
from saved path: /scratch/02202/lmv498/ddRAD/JA16240/SA16089/iPYRAD/new_iPYRAD/new_iPYRAD/iPYRAD_140bp/iPYRAD/nwm-s456.json
host compute node: [24 cores] on nid00221

Step 7: Filter and write output files for 91 Samples
[####################] 100% filtering loci | 0:00:54
[####################] 100% building loci/stats | 0:00:03

Empty varcounts array. Probably no samples passed filtering.

Encountered an unexpected error (see ./ipyrad_log.txt)
Error message is below -------------------------------
max() arg is an empty sequence

I have tried changing the output file types I specify and I still get the same error. Any idea why? Thanks!
Amely Martins
@amelymartins
Nov 17 2016 17:43

Hi @dereneaton @isaacovercast. I'm having the same problem that @LinaValencia85 mentioned.
My ipyrad_log.txt has:

2016-11-17 10:59:45,663 pid=25289 [assembly.py] ERROR IPyradWarningExit: error in vcf build chunk 0: MemoryError()

Deren Eaton
@dereneaton
Nov 17 2016 17:48
I changed the vcf settings recently, which should reduce memory use. Are you using the latest version? @LinaValencia85 you probably need to lower the min_samples_locus param if all loci are being filtered. @isaacovercast can you look into the vcf error?
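
A minimal Python sketch of how an over-strict min_samples_locus could lead to the "empty sequence" error above. The per-locus counts and the function name filter_loci are hypothetical and this is not ipyrad's code; the only real behavior shown is that the built-in max() raises ValueError on an empty sequence (the exact message wording varies between Python versions).

```python
def filter_loci(sample_counts, min_samples_locus):
    """Keep loci that have data for at least min_samples_locus samples."""
    return [n for n in sample_counts if n >= min_samples_locus]


# Hypothetical number of samples with data at each locus.
sample_counts = [3, 5, 2, 4]

# A threshold higher than any locus can reach filters everything out.
kept_strict = filter_loci(sample_counts, min_samples_locus=10)
# A lower threshold lets some loci through.
kept_loose = filter_loci(sample_counts, min_samples_locus=4)

try:
    max(kept_strict)
    strict_error = None
except ValueError as err:
    # max() has nothing to compare when every locus was filtered.
    strict_error = str(err)

print("strict:", kept_strict, "->", strict_error)
print("loose:", kept_loose, "-> max is", max(kept_loose))
```

If the strict case matches what you see, lowering the threshold until at least some loci pass is the first thing to try.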
Amely Martins
@amelymartins
Nov 17 2016 17:55
@dereneaton Yes, I'm using the latest. I was re-running with the -f option, but without removing the outfiles folder. I'll try reducing the number of cores, as @edgardomortiz suggested to me.
LinaValencia85
@LinaValencia85
Nov 17 2016 18:12
Hi @dereneaton I have reduced the number of samples and it worked, but I get a new error:

loading Assembly: nwm-s456
from saved path: /scratch/02202/lmv498/ddRAD/JA16240/SA16089/iPYRAD/new_iPYRAD/new_iPYRAD/iPYRAD_140bp/iPYRAD/nwm-s456.json
host compute node: [24 cores] on nid00064

Step 7: Filter and write output files for 91 Samples
[####################] 100% filtering loci | 0:00:54
[####################] 100% building loci/stats | 0:00:09
[####################] 100% building vcf file | 0:00:40

Encountered an error, see ./ipyrad_log.txt.
error in vcf build chunk 0: MemoryError()

Which is weird, as I have not specified v in my output formats.
Isaac Overcast
@isaacovercast
Nov 17 2016 18:44
@sborstein I recently pushed a new version that does a better job handling errors in samples in step 3. Pull down the new version and try it again: conda install -c ipyrad ipyrad (should be version 0.5.4). You should see that all the samples that succeed will move on to step 3 (a few low quality samples won't hold back analysis).
Sam Borstein
@sborstein
Nov 17 2016 18:45
Awesome! I'll give it a shot. Thanks @isaacovercast.
Isaac Overcast
@isaacovercast
Nov 17 2016 19:21
@LinaValencia85 Can you post the contents of your params file?
LinaValencia85
@LinaValencia85
Nov 17 2016 19:22
@isaacovercast Here they are:

------- ipyrad params file (v.0.5.1)--------------------------------------------
nwm-s456 ## [0] [assembly_name]: Assembly name. Used to name output directories for assembly steps
/scratch/02202/lmv498/ddRAD/JA16240/SA16089/iPYRAD/new_iPYRAD/new_iPYRAD/iPYRAD_140bp/iPYRAD ## [1] [project_dir]: Project dir (made $
Merged: sample2_sub, sample3, sample4, sample5, sample6, sample8, sample9, sample10, sample11, sample12, sample13, sample14, sample15$
Merged: sample2_sub, sample3, sample4, sample5, sample6, sample8, sample9, sample10, sample11, sample12, sample13, sample14, sample15$
Merged: sample2_sub, sample3, sample4, sample5, sample6, sample8, sample9, sample10, sample11, sample12, sample13, sample14, sample15$
denovo ## [5] [assembly_method]: Assembly method (denovo, reference, denovo+reference, denovo-reference)
                           ## [6] [reference_sequence]: Location of reference sequence file
pairddrad ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.
                ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)
7 ## [9] [max_low_qual_bases]: Max low quality base calls (Q<20) in a read
26 ## [10] [phred_Qscore_offset]: phred Q score offset (33 is default and very standard)
6 ## [11] [mindepth_statistical]: Min depth for statistical base calling
6 ## [12] [mindepth_majrule]: Min depth for majority-rule base calling
10000 ## [13] [maxdepth]: Max cluster depth within samples
0.85 ## [14] [clust_threshold]: Clustering threshold for de novo assembly
0 ## [15] [max_barcode_mismatch]: Max number of allowable mismatches in barcodes
2 ## [16] [filter_adapters]: Filter for adapters/primers (1 or 2=stricter)
35 ## [17] [filter_min_trim_len]: Min length of reads after adapter trim
2 ## [18] [max_alleles_consens]: Max alleles per site in consensus sequences
8, 8 ## [19] [max_Ns_consens]: Max N's (uncalled bases) in consensus (R1, R2)
6, 6 ## [20] [max_Hs_consens]: Max Hs (heterozygotes) in consensus (R1, R2)
4 ## [21] [min_samples_locus]: Min # samples per locus for output
100, 100 ## [22] [max_SNPs_locus]: Max # SNPs per locus (R1, R2)
100, 100 ## [23] [max_Indels_locus]: Max # of indels per locus (R1, R2)
1.0 ## [24] [max_shared_Hs_locus]: Max # heterozygous sites per locus (R1, R2)
5, 4 ## [25] [edit_cutsites]: Edit cut-sites (R1, R2) (see docs)
0, 0, 0, 0 ## [26] [trim_overhang]: Trim overhang (see docs) (R1>, <R1, R2>, <R2)
l, s, p, u, k ## [27] [output_formats]: Output formats (see docs)
                           ## [28] [pop_assign_file]: Path to population assignment file
I'm sorry, I don't know how to paste them more neatly.
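
Each line of the params file pasted above follows the pattern "value ## [index] [name]: description". A minimal Python sketch for pulling those fields out when checking a setting such as min_samples_locus or output_formats; parse_params_line is a hypothetical helper written for illustration, not part of ipyrad.

```python
import re

# Pattern for one params line: value ## [index] [name]: description
PARAM_LINE = re.compile(
    r"^(?P<value>.*?)\s*##\s*\[(?P<index>\d+)\]\s*\[(?P<name>\w+)\]:\s*(?P<desc>.*)$"
)


def parse_params_line(line):
    """Return (index, name, value) for a params line, or None if it is not one."""
    match = PARAM_LINE.match(line)
    if match is None:
        return None
    return int(match.group("index")), match.group("name"), match.group("value").strip()


# A few lines copied from the params file above (descriptions shortened).
lines = [
    "------- ipyrad params file (v.0.5.1)----",
    "denovo ## [5] [assembly_method]: Assembly method",
    "     ## [6] [reference_sequence]: Location of reference sequence file",
    "4 ## [21] [min_samples_locus]: Min # samples per locus for output",
    "l, s, p, u, k ## [27] [output_formats]: Output formats (see docs)",
]

params = {}
for line in lines:
    parsed = parse_params_line(line)
    if parsed is not None:
        index, name, value = parsed
        params[name] = value

print(params)
```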
Isaac Overcast
@isaacovercast
Nov 17 2016 19:40
@LinaValencia85 There was a bug that forced creation of the vcf file, even if it wasn't requested. I fixed this, will push it soon. How much memory is there on the system you're doing the assembly on?
LinaValencia85
@LinaValencia85
Nov 17 2016 19:42
@isaacovercast Great! It's 64 GB.
Isaac Overcast
@isaacovercast
Nov 17 2016 19:44
Should be more than enough for 91 samples...
I'm working on it.
LinaValencia85
@LinaValencia85
Nov 17 2016 19:45
Ok! @isaacovercast I will wait on the new version with the fixed bug and try to run it. THANKS!
Isaac Overcast
@isaacovercast
Nov 17 2016 19:48
@LinaValencia85 Actually, can you rerun step 7 with -f -d at the end of the ipyrad command, then email me the ipyrad_log.txt that it creates? I PM'd you my email.
LinaValencia85
@LinaValencia85
Nov 17 2016 20:05
@isaacovercast just sent it!
Isaac Overcast
@isaacovercast
Nov 17 2016 20:59
@amelymartins let me know if reducing the number of cores fixes that problem. I think it probably will, but it'll be good to know for sure.
Amely Martins
@amelymartins
Nov 17 2016 21:04
Hi @isaacovercast. No, it didn't work. I think my problem is the same as @LinaValencia85's. I always get the same error when ipyrad tries to generate the vcf file, even when I don't include vcf in the list of output formats.
I've run s7 with the -f -d flags. Would you like to see the ipyrad_log.txt too?
Isaac Overcast
@isaacovercast
Nov 17 2016 21:09
Hi @amelymartins. Can we try one more thing? Will you try running like this:
ipyrad -p params-yourparams.txt -s 7 -f -c 1
This will just run it on ONE core. It might take a little longer, but if it doesn't work then it will definitely tell us the problem is not what I think it is.
I pm'd you my email if you want to send the ipyrad_log.txt
Amely Martins
@amelymartins
Nov 17 2016 21:12

Yes, I can run this.
Should I use the following in my launcher:

ipcluster start --n=1 --daemonize --profile=ipyrad
sleep 60
ipyrad -p params-merged1-nwm-85.txt -s7 -f -c 1 --ipcluster &>s7.log

Isaac Overcast
@isaacovercast
Nov 17 2016 21:14
Looks good
Amely Martins
@amelymartins
Nov 17 2016 21:15
Ok. I'll run.
Isaac Overcast
@isaacovercast
Nov 17 2016 21:18
v.0.5.5 is up on conda. Will now write other output formats even if vcf crashes, and will also not create vcf if you don't include the 'v' flag in output formats.
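
Roughly, the two changes behave like the Python sketch below. This is an illustration only and not ipyrad's actual code: write_outputs, the writers table, and the stand-in writer functions are all hypothetical. It shows a format being written only when its letter was requested, and a crash in one writer not blocking the others.

```python
def write_outputs(output_formats, writers):
    """Run only the requested writers; record failures instead of stopping."""
    requested = [fmt.strip() for fmt in output_formats.split(",") if fmt.strip()]
    written, failed = [], {}
    for fmt in requested:
        writer = writers.get(fmt)
        if writer is None:
            continue  # unknown format letter, nothing to do
        try:
            writer()
            written.append(fmt)
        except MemoryError as err:
            # Keep going so the remaining formats are still written.
            failed[fmt] = repr(err)
    return written, failed


def ok_writer():
    """Stand-in for a writer that succeeds."""


def crashing_vcf_writer():
    """Stand-in for a vcf writer that runs out of memory."""
    raise MemoryError()


# Hypothetical mapping from format letter to writer function.
writers = {fmt: ok_writer for fmt in "lspuk"}
writers["v"] = crashing_vcf_writer

# No 'v' requested: the vcf writer is never called.
written_without_vcf, failed_without_vcf = write_outputs("l, s, p, u, k", writers)
# 'v' requested and crashing: the other formats are still written.
written_with_vcf, failed_with_vcf = write_outputs("l, v, s", writers)

print(written_without_vcf, failed_without_vcf)
print(written_with_vcf, failed_with_vcf)
```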
LinaValencia85
@LinaValencia85
Nov 17 2016 21:21
@isaacovercast I tried running as you suggested and it still crashes. I will update ipyrad and see if that might solve the problem.
Amely Martins
@amelymartins
Nov 17 2016 21:23
@isaacovercast It didn't work like that, but this time the error was different.

I've used the following instead:

ipcluster start --n=12 --daemonize --profile=ipyrad
sleep 60
ipyrad -p params-merged1-nwm-85.txt -s7 -f -c 1 --ipcluster &>s7.log

But got the same error as before.

LinaValencia85
@LinaValencia85
Nov 17 2016 22:32
@isaacovercast it worked with the new version! Thanks!!!
Isaac Overcast
@isaacovercast
Nov 17 2016 22:33
With -c 1? Did it create the vcf file successfully?