These are chat archives for dereneaton/ipyrad

2nd
Aug 2016
RobertaDamasceno
@RobertaDamasceno
Aug 02 2016 13:29

Hi! I'm having the same error in step 7 as reported by @Cycadales:

yeti:damasceno Yeti$ ipyrad -p data2-p12_12_p13_1000_P14_09_P21_45_P22_20_P24_025.txt -s 67


ipyrad [v.0.3.25]

Interactive assembly and analysis of RAD-seq data

loading Assembly: data2
from saved path: ~/Documents/damasceno/ipyrad_test_2/data2.json
ipyparallel setup: Local connection to 24 Engines

Step6: Clustering across 91 samples at 0.9 similarity
[####################] 100% concat/shuffle input | 0:06:37
[####################] 100% clustering across | 0:14:01
[####################] 100% building clusters | 0:06:45
[####################] 100% aligning clusters | 0:12:12
[####################] 100% indexing clusters | 1:21:35
[####################] 100% building database | 12:53:36

Step7: Filter and write output files for 91 Samples
[####################] 100% filtering loci | 0:00:19

Caught unknown exception - ValueError(invalid literal for int() with base 10: '')

Traceback (most recent call last):
  File "/Users/Yeti/miniconda2/bin/ipyrad", line 9, in <module>
    load_entry_point('ipyrad==0.3.25', 'console_scripts', 'ipyrad')()
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/main.py", line 457, in main
    data.run(steps=steps, force=args.force, preview=args.preview)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/core/assembly.py", line 1417, in run
    self.step7(force=force)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/core/assembly.py", line 1385, in step7
    self._clientwrapper(self._step7func, [samples, force], 45)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/core/assembly.py", line 859, in _clientwrapper
    stepfunc(*args)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/core/assembly.py", line 1196, in _step7func
    assemble.write_outfiles.run(self, samples, force, ipyclient)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/assemble/write_outfiles.py", line 70, in run
    filter_all_clusters(data, samples, ipyclient)
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyrad/assemble/write_outfiles.py", line 355, in filter_all_clusters
    [i.get() for i in results]
  File "/Users/Yeti/miniconda2/lib/python2.7/site-packages/ipyparallel/client/asyncresult.py", line 166, in get
    raise self.exception()
ipyparallel.error.RemoteError: ValueError(invalid literal for int() with base 10: '')

I have conda 4.1.11, and I noticed that Isaac fixed this bug and that the new version is in conda 0.3.5. Does that mean I need to switch to that version (instead of conda 4.1.11)? Thanks! (Oh, BTW, I was able to run steps 1-6 with my RAD-seq dataset of 91 samples in ~24 hours! It's amazingly fast! Thanks!)
Deren Eaton
@dereneaton
Aug 02 2016 17:40
@danielyao12: nothing available yet. For now I would probably cite the original pyRAD paper and/or the ipyrad GitHub page as a reference. But I'm hoping we can finish a manuscript/preprint to put up on bioRxiv within the next month, and then that would be the proper citation.
Deren Eaton
@dereneaton
Aug 02 2016 17:49
@RobertaDamasceno It looks like you might have a bad parameter setting. Can you copy your params file?
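
One way a parameter setting can lead to this exact message: Python's `int()` raises it when it is handed an empty string. The snippet below is a minimal sketch, assuming some value was left blank and later converted with `int()`. `parse_int_setting` is a hypothetical helper written for illustration and is not part of ipyrad.

```python
# Sketch only: reproduces the exception text seen in the traceback.
# `parse_int_setting` is a hypothetical helper, not an ipyrad function.

def parse_int_setting(raw):
    """Convert a raw params-file value to int, with a clearer error for blanks."""
    text = raw.strip()
    try:
        return int(text)
    except ValueError as err:
        raise ValueError("could not read integer setting %r: %s" % (raw, err))


# An empty string is not a valid integer literal, so int() raises ValueError.
try:
    int("")
except ValueError as err:
    message = str(err)

print(message)  # invalid literal for int() with base 10: ''
```

Wrapping the conversion, as the hypothetical helper does, keeps the original message but adds the offending raw value, which can make a blank field easier to spot.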
RobertaDamasceno
@RobertaDamasceno
Aug 02 2016 19:16
Hi, @dereneaton ! Thanks! Here's my params file:

------- ipyrad params file (v.0.3.25)-------------------------------------------
data2 ## [0] [assembly_name]: Assembly name. Used to name output directories for assembly steps
ipyrad_test_2 ## [1] [project_dir]: Project dir (made in curdir if not present)
       ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files
./ipyrad_barcodes.txt ## [3] [barcodes_path]: Location of barcodes file
./ipyrad_test_1/data1_fastqs/*.fastq.gz ## [4] [sorted_fastq_path]: Location of demultiplexed/sorted fastq files
denovo ## [5] [assembly_method]: Assembly method (denovo, reference, denovo+reference, denovo-reference)
       ## [6] [reference_sequence]: Location of reference sequence file
rad ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.
TGCAGG, ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)
5 ## [9] [max_low_qual_bases]: Max low quality base calls (Q<20) in a read
33 ## [10] [phred_Qscore_offset]: phred Q score offset (only alternative=64)
12 ## [11] [mindepth_statistical]: Min depth for statistical base calling
12 ## [12] [mindepth_majrule]: Min depth for majority-rule base calling
1000 ## [13] [maxdepth]: Max cluster depth within samples
0.9 ## [14] [clust_threshold]: Clustering threshold for de novo assembly
1 ## [15] [max_barcode_mismatch]: Max number of allowable mismatches in barcodes
1 ## [16] [filter_adapters]: Filter for adapters/primers (1 or 2=stricter)
80 ## [17] [filter_min_trim_len]: Min length of reads after adapter trim
2 ## [18] [max_alleles_consens]: Max alleles per site in consensus sequences
5 ## [19] [max_Ns_consens]: Max N's (uncalled bases) in consensus (R1, R2)
10 ## [20] [max_Hs_consens]: Max Hs (heterozygotes) in consensus (R1, R2)
45 ## [21] [min_samples_locus]: Min # samples per locus for output
20 ## [22] [max_SNPs_locus]: Max # SNPs per locus (R1, R2)
5 ## [23] [max_Indels_locus]: Max # of indels per locus (R1, R2)
0.25 ## [24] [max_shared_Hs_locus]: Max # heterozygous sites per locus (R1, R2)
TGCAGG, ## [25] [edit_cutsites]: Edit cut-sites (R1, R2) (see docs)
       ## [26] [trim_overhang]: Trim overhang (see docs) (R1>, <R1, R2>, <R2)
* ## [27] [output_formats]: Output formats (see docs)
       ## [28] [pop_assign_file]: Path to population assignment file
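
Several fields in the params file above are blank. A quick way to list them is sketched below, assuming each line follows the `value ## [index] [name]: description` layout shown in the paste. The regular expression and the `blank_params` helper are illustrative assumptions, not ipyrad's own parser, and a blank field is not necessarily an error: steps 1-6 ran here with `raw_fastq_path` left empty.

```python
import re

# Assumed layout of each params line: "<value> ## [<index>] [<name>]: <description>"
PARAM_LINE = re.compile(
    r"^(?P<value>.*?)\s*##\s*\[(?P<index>\d+)\]\s*\[(?P<name>\w+)\]"
)


def blank_params(lines):
    """Return (index, name) for every parameter whose value field is empty."""
    blanks = []
    for line in lines:
        match = PARAM_LINE.match(line)
        # Lines that do not look like parameters (e.g. the header) are skipped.
        if match and not match.group("value").strip():
            blanks.append((int(match.group("index")), match.group("name")))
    return blanks


# A few lines copied from the params file above.
sample = [
    "------- ipyrad params file (v.0.3.25)-------------------------------------------",
    "data2 ## [0] [assembly_name]: Assembly name. Used to name output directories",
    "       ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files",
    "45 ## [21] [min_samples_locus]: Min # samples per locus for output",
    "       ## [26] [trim_overhang]: Trim overhang (see docs) (R1>, <R1, R2>, <R2)",
]

print(blank_params(sample))  # [(2, 'raw_fastq_path'), (26, 'trim_overhang')]
```

To check a whole file, the same function can be given the lines of the params file, for example `blank_params(open(path).read().splitlines())`, where `path` is wherever the file lives.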
