These are chat archives for dereneaton/ipyrad

1st
Nov 2017
tommydevitt
@tommydevitt
Nov 01 2017 02:35

Hi @tommydevitt. The command job.ready() returns whether the job is done running or not, but does not tell you whether it was successful or not. If the job is finished but there was an error, you can check for the error using job.result(). One reason that an error might be raised is if it cannot find the bpp binary.
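
The difference between "finished" and "successful" can be sketched with Python's standard-library concurrent.futures, used here only as a stand-in for the job object discussed above: done() plays the role described for job.ready(), and result() re-raises whatever error the job hit. The names run_bpp_stub and describe, and the error text, are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor


def run_bpp_stub():
    """Hypothetical stand-in for a job that fails because a binary is missing."""
    raise OSError("bpp binary not found")


def describe(future):
    """Say whether a job is still running, failed, or succeeded."""
    if not future.done():
        return "still running"
    try:
        future.result()  # re-raises the job's exception, if there was one
    except Exception as err:
        return "finished with error: {}".format(err)
    return "finished successfully"


with ThreadPoolExecutor(max_workers=1) as pool:
    failed = pool.submit(run_bpp_stub)
    passed = pool.submit(lambda: 42)

# Leaving the `with` block waits for both jobs, so done() is True for each.
status_failed = describe(failed)
status_passed = describe(passed)
print(status_failed)
print(status_passed)
```

Both jobs report done, but only asking for the result reveals that one of them raised.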

@dereneaton the error is because glibc isn't installed: ("bpp: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by bpp)\n", None). Tried installing it via conda like so

conda install -c dan_blanchard glibc

but no dice.

JStarrett
@JStarrett
Nov 01 2017 16:06
Hello. I got the following error when running step 2, and am wondering if anyone has encountered this before: [rawedit.py] ERROR error in run_cutadapt(): AttributeError('NoneType' object has no attribute 'replace')
Isaac Overcast
@isaacovercast
Nov 01 2017 19:45
@rfolkert I've never tried this. It could work, but it might be goofy. I think the clustering won't work very well. You used the same enzymes? Probably get better results if you trim the 150bp samples, but I can see how you might not want to throw away that data.
@JStarrett Can you run it again and include the -d flag and post the last 20ish lines of the ipyrad_log.txt file?
JStarrett
@JStarrett
Nov 01 2017 19:55
@JStarrett just for more detail, I demultiplexed my data with a different program (mr_demuxy), and ran step 1 with the sorted fastqs with no problems. I couldn't demultiplex with ipyrad due to an error in the sequencing that occurs in one of the overhang sequences. Here is some cutting and pasting from the log file, cutting some of the redundant messages:
Using args {'preview': False, 'force': False, 'threads': 2, 'results': False, 'quiet': False, 'merge': None, 'ipcluster': None, 'cores': 80, 'params': 'params-3RAD_plate2Atypoides_mr_NBV.txt', 'branch': None, 'steps': '2', 'debug': True, 'new': None, 'MPI': True}
Platform info: ('Linux', 'node031', '2.6.32-504.16.2.el6.x86_64', '#1 SMP Wed Apr 22 06:48:29 UTC 2015', 'x86_64')
2017-11-01 00:59:20,649 pid=23280 [load.py] DEBUG skipping: no svd results present in old assembly
2017-11-01 00:59:20,952 pid=23280 [parallel.py] INFO ['ipcluster', 'start', '--daemonize', '--cluster-id=ipyrad-cli-23280', '--engines=MPI', '--profile=default', '--n=80', '--ip=*']
2017-11-01 00:59:33,009 pid=23280 [rawedit.py] INFO [3205128.0, 3028761.0, 2315237.0, 2304217.0, 2248322.0, 2101959.0, 1944071.0, 1834957.0, 1627486.0, 1622290.0, 1601731.0, 1572983.0, 1551643.0, 1540534.0, 1518355.0, 1510765.0, 1503988.0, 1499927.0, 1440798.0, 1426360.0, 1423094.0, 1420293.0, 1413401.0, 1409020.0, 1387904.0, 1385944.0, 1373875.0, 1371257.0, 1370451.0, 1363393.0, 1359257.0, 1353482.0, 1349207.0, 1337232.0, 1331501.0, 1307429.0, 1249374.0, 1243548.0, 1231013.0, 1212922.0, 1196646.0, 1180184.0, 1180027.0, 1129929.0, 1097354.0, 1076918.0, 1074824.0, 1074147.0, 1062544.0, 1050349.0, 1047437.0, 1044996.0, 1013744.0, 1001094.0, 960109.0, 958924.0, 923554.0, 913900.0, 885235.0, 876491.0, 872121.0, 841406.0, 824280.0, 807423.0, 795302.0, 755475.0, 737441.0, 718912.0, 708018.0, 672091.0, 651171.0, 641723.0, 603595.0, 574762.0, 563098.0, 559241.0, 508550.0, 472658.0, 455462.0, 409343.0, 405405.0, 390528.0, 368898.0, 359684.0, 302790.0, 293831.0, 275750.0, 133755.0]
2017-11-01 00:59:34,779 pid=23345 [rawedit.py] DEBUG Entering cutadaptit_pairs - MY4251
2017-11-01 00:59:34,885 pid=29978 [rawedit.py] DEBUG Entering cutadaptit_pairs - AR78
2017-11-01 00:59:38,854 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): AttributeError('NoneType' object has no attribute 'replace')
2017-11-01 00:59:38,855 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): AttributeError('NoneType' object has no attribute 'replace')
2017-11-01 00:59:38,855 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): AttributeError('NoneType' object has no attribute 'replace')
2017-11-01 00:59:38,856 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): AttributeError('NoneType' object has no attribute 'replace')
2017-11-01 00:59:39,076 pid=23280 [assembly.py] INFO shutting down engines
2017-11-01 00:59:39,354 pid=23280 [assembly.py] INFO finished shutdown
2017-11-01 00:59:39,361 pid=23280 [init.py] INFO debugging turned off
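
When a log repeats the same message many times, as this one does, it can help to tally the ERROR lines by module and message instead of reading them one by one. The following is a minimal sketch that assumes every log line follows the "date time pid=N [module] LEVEL message" layout seen in the paste above. The names LOG_LINE and tally_errors are made up for illustration and are not part of ipyrad.

```python
import re
from collections import Counter

# Assumed layout, inferred from the pasted log:
# 2017-11-01 00:59:38,854 pid=23280 [rawedit.py] ERROR message...
LOG_LINE = re.compile(
    r"^(?P<date>\d{4}-\d{2}-\d{2}) (?P<time>[\d:,]+) pid=(?P<pid>\d+) "
    r"\[(?P<module>[^\]]+)\] (?P<level>[A-Z]+) (?P<message>.*)$"
)


def tally_errors(lines):
    """Count ERROR lines, keyed by (module, message); other lines are ignored."""
    counts = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("level") == "ERROR":
            counts[(match.group("module"), match.group("message"))] += 1
    return counts


sample = [
    "2017-11-01 00:59:34,779 pid=23345 [rawedit.py] DEBUG Entering cutadaptit_pairs - MY4251",
    "2017-11-01 00:59:38,854 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): "
    "AttributeError('NoneType' object has no attribute 'replace')",
    "2017-11-01 00:59:38,855 pid=23280 [rawedit.py] ERROR error in run_cutadapt(): "
    "AttributeError('NoneType' object has no attribute 'replace')",
    "2017-11-01 00:59:39,076 pid=23280 [assembly.py] INFO shutting down engines",
]

errors = tally_errors(sample)
for (module, message), count in errors.items():
    print("{} x{}: {}".format(module, count, message))
```

To run it on a real file, pass an open file handle (for example `open("ipyrad_log.txt")`) in place of `sample`.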
Isaac Overcast
@isaacovercast
Nov 01 2017 20:03
@tommydevitt It's not because glibc isn't installed (it almost certainly is), but rather that the bpp binary in the ipyrad conda repo was built on a system with a newer version of glibc than the one on your machine. It's pretty common for cluster systems to have older versions of glibc. You can fix this by running a conda build on the conda.recipe/bpp recipe from the github repo (easiest to just check out the github repo and install the conda build tools). Sorry that's a little annoying.
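
One way to confirm this kind of mismatch is to compare the glibc version named in the loader error with the version the machine actually provides. The following is a rough sketch using Python's standard-library platform.libc_ver(). The helper names required_glibc and system_glibc are hypothetical, and the error string is the one posted earlier in this chat.

```python
import platform
import re


def required_glibc(error_text):
    """Extract the (major, minor) glibc version named in a loader error, or None."""
    match = re.search(r"GLIBC_(\d+)\.(\d+)", error_text)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def system_glibc():
    """Return this system's glibc version as (major, minor), or None if unknown."""
    lib, version = platform.libc_ver()
    match = re.match(r"(\d+)\.(\d+)", version)
    if lib != "glibc" or match is None:
        return None
    return int(match.group(1)), int(match.group(2))


error = "bpp: /lib64/libc.so.6: version `GLIBC_2.14' not found (required by bpp)"
needed = required_glibc(error)
have = system_glibc()

if have is None:
    print("could not determine the system glibc version")
elif have < needed:
    print("system glibc {}.{} is older than the required {}.{}".format(*(have + needed)))
else:
    print("system glibc {}.{} satisfies the required {}.{}".format(*(have + needed)))
```

Versions are compared as integer tuples so that, for example, 2.9 sorts before 2.14.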
@JStarrett Can you post the results of ipyrad -p params-whatever.txt -r? This will print the results of the assembly. I want to look at the formatting of the names and the number of raw reads per sequence.
JStarrett
@JStarrett
Nov 01 2017 20:23

Thank you for your reply! Here are those results:

Enabling debug mode

Summary stats of Assembly Atypoides_Plate2_mr

    state  reads_raw
AR14 1 1044996
AR17 1 1243548
AR27 1 1373875
AR31 1 1510765
AR32 1 1387904
AR37 1 508550
AR51 1 1196646
AR52 1 1601731
AR56 1 3028761
AR63 1 1349207
AR64 1 1499927
AR75 1 1426360
AR78 1 3205128
AR79 1 2304217
AR82 1 1413401
AR83 1 1013744
AR84 1 1420293
AR85 1 1551643
AR87 1 960109
AR88 1 1440798
AR90 1 1359257
AR91 1 1180184
AR92 1 1518355
AR93 1 1540534
AR97 1 1337232
AR98 1 2101959
MY1014 1 1363393
MY1033 1 1627486
MY1034 1 1834957
MY1038 1 1409020
MY1039 1 1371257
MY1046 1 1622290
MY1056 1 1944071
MY1057 1 841406
MY1058 1 302790
MY1059 1 603595
MY1060 1 807423
MY1148 1 1180027
MY1149 1 1076918
MY1157 1 574762
MY1158 1 368898
MY1160 1 1370451
MY1161 1 563098
MY1163 1 1074824
MY1164 1 923554
MY1172 1 672091
MY1173 1 651171
MY1317 1 755475
MY1318 1 1050349
MY1360 1 708018
MY1361 1 958924
MY394 1 1503988
MY399 1 718912
MY405 1 559241
MY409 1 1074147
MY4203 1 737441
MY4204 1 1097354
MY4208 1 1249374
MY4209 1 1047437
MY4210 1 405405
MY4211 1 133755
MY4213 1 824280
MY4217 1 1307429
MY4218 1 1062544
MY4221 1 885235
MY4225 1 472658
MY4226 1 795302
MY4230 1 1385944
MY4231 1 1129929
MY4234 1 1423094
MY4235 1 275750
MY4238 1 455462
MY4239 1 876491
MY4246 1 1001094
MY4250 1 641723
MY4251 1 2248322
MY4258 1 872121
MY4262 1 390528
MY4265 1 1231013
MY4366 1 293831
MY4371 1 913900
MY4678 1 359684
MY975 1 1353482
MY977 1 2315237
MY978 1 1331501
MY979 1 409343
MY985 1 1572983
MY986 1 1212922

Full stats files

step 1: ./Atypoides_Plate2_mr_s1_demultiplex_stats.txt
step 2: ./Atypoides_Plate2_mr_edits/s2_rawedit_stats.txt
step 3: None

Isaac Overcast
@isaacovercast
Nov 01 2017 20:44
Is it paired end? Did you merge the reads prior to importing? Can you paste your params file?
JStarrett
@JStarrett
Nov 01 2017 20:59

The data is paired-end. The reads are not merged, but the R1 and R2 fastq files are all in the same folder, and the R1 and R2 fastq files are listed for each sample under "items" under "fastqs" and "concat" in the json file. Here is the params file:
------- ipyrad params file (v.0.6.11)-------------------------------------------
Atypoides_Plate2_mr ## [0] [assembly_name]: Assembly name. Used to name output directories for assembly steps
./ ## [1] [project_dir]: Project dir (made in curdir if not present)
                           ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files
/home/jrs0129/3RAD_data/AtypoidesJune17/FastqData_Atypoidesjune2017b/ipyrad/barcodes/3RAD_Plate2barcodes_Atypoides.txt ## [3] [barcodes_path]: Location of barcodes file
/home/jrs0129/3RAD_data/AtypoidesJune17/FastqData_Atypoidesjune2017b/mrdemuxy_demultiplexed/demultiplexed_Atypoides_Plate2/*.fastq ## [4] [sorted_fastq_path]: Location of d$
denovo ## [5] [assembly_method]: Assembly method (denovo, reference, denovo+reference, denovo-reference)
                           ## [6] [reference_sequence]: Location of reference sequence file
pair3rad ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.
ATCGG,TAATTC ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)
5 ## [9] [max_low_qual_bases]: Max low quality base calls (Q<20) in a read
33 ## [10] [phred_Qscore_offset]: phred Q score offset (33 is default and very standard)
6 ## [11] [mindepth_statistical]: Min depth for statistical base calling
6 ## [12] [mindepth_majrule]: Min depth for majority-rule base calling
10000 ## [13] [maxdepth]: Max cluster depth within samples
0.90 ## [14] [clust_threshold]: Clustering threshold for de novo assembly
1 ## [15] [max_barcode_mismatch]: Max number of allowable mismatches in barcodes
0 ## [16] [filter_adapters]: Filter for adapters/primers (1 or 2=stricter)
40 ## [17] [filter_min_trim_len]: Min length of reads after adapter trim
2 ## [18] [max_alleles_consens]: Max alleles per site in consensus sequences
5, 5 ## [19] [max_Ns_consens]: Max N's (uncalled bases) in consensus (R1, R2)
8, 8 ## [20] [max_Hs_consens]: Max Hs (heterozygotes) in consensus (R1, R2)
45 ## [21] [min_samples_locus]: Min # samples per locus for output
20, 20 ## [22] [max_SNPs_locus]: Max # SNPs per locus (R1, R2)
8, 8 ## [23] [max_Indels_locus]: Max # of indels per locus (R1, R2)
0.5 ## [24] [max_shared_Hs_locus]: Max # heterozygous sites per locus (R1, R2)
5, 0, 6, 0 ## [25] [trim_reads]: Trim raw read edges (R1>, <R1, R2>, <R2) (see docs)
0, 0, 0, 0 ## [26] [trim_loci]: Trim locus edges (see docs) (R1>, <R1, R2>, <R2)
p, s, v ## [27] [output_formats]: Output formats (see docs)
                           ## [28] [pop_assign_file]: Path to population assignment file
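
For spotting empty or unexpected settings in a pasted params file, a small parser can help. The following is a minimal sketch, assuming every parameter line has the "value ## [index] [name]: description" layout shown above. PARAM_LINE and parse_params are illustrative names, not ipyrad functions, and the sample lines are copied from the paste.

```python
import re

# Assumed layout: "<value> ## [<index>] [<name>]: <description>"
PARAM_LINE = re.compile(
    r"^(?P<value>.*?)\s*## \[(?P<index>\d+)\] \[(?P<name>\w+)\]: (?P<doc>.*)$"
)


def parse_params(lines):
    """Map parameter name to its raw string value; lines that do not match are skipped."""
    params = {}
    for line in lines:
        match = PARAM_LINE.match(line.rstrip("\n"))
        if match:
            params[match.group("name")] = match.group("value").strip()
    return params


sample = [
    "------- ipyrad params file (v.0.6.11)-------------------------------------------",
    "Atypoides_Plate2_mr ## [0] [assembly_name]: Assembly name. Used to name output directories for assembly steps",
    "                           ## [2] [raw_fastq_path]: Location of raw non-demultiplexed fastq files",
    "pair3rad ## [7] [datatype]: Datatype (see docs): rad, gbs, ddrad, etc.",
    "ATCGG,TAATTC ## [8] [restriction_overhang]: Restriction overhang (cut1,) or (cut1, cut2)",
    "0 ## [16] [filter_adapters]: Filter for adapters/primers (1 or 2=stricter)",
]

params = parse_params(sample)
overhangs = [seq.strip() for seq in params["restriction_overhang"].split(",")]
empty = sorted(name for name, value in params.items() if value == "")
print(params["datatype"], overhangs, empty)
```

Values are kept as strings on purpose, since the sketch makes no claim about how ipyrad itself interprets them.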
tommydevitt
@tommydevitt
Nov 01 2017 21:09
@isaacovercast thanks Isaac. So do conda install conda-build?
Isaac Overcast
@isaacovercast
Nov 01 2017 21:10
@tommydevitt yes
@JStarrett Are you actually using this version v.0.6.11? If so I would highly recommend updating to the newest.
So if this is 3RAD then I assume you have multiplexed barcodes? Can I see the first handful of lines of your barcodes file?
tommydevitt
@tommydevitt
Nov 01 2017 21:14
@isaacovercast hmm. . . /home1/miniconda2/bin/python: error while loading shared libraries: __vdso_time: invalid mode for dlopen(): Invalid argument
JStarrett
@JStarrett
Nov 01 2017 21:19
No, I'm using ipyrad [v.0.7.8]; the params file is just from an older version. This is 3RAD. Here are some lines from the barcode file:
AR14 CCGAAT TCGGTAC
AR17 TTAGGCA TCGGTAC
AR27 AACTCGTC TCGGTAC
AR79 GGTCTACGT TCGGTAC
AR32 GATACC TCGGTAC
MY4231 AGCGTTG TCGGTAC
AR51 CTGCAACT TCGGTAC
AR52 TCATGGTCA TCGGTAC
AR56 CCGAAT GATCGTTG
AR63 TTAGGCA GATCGTTG
MY394 AACTCGTC GATCGTTG
AR75 GGTCTACGT GATCGTTG
AR78 GATACC GATCGTTG
AR31 AGCGTTG GATCGTTG
AR82 CTGCAACT GATCGTTG
AR90 TCATGGTCA GATCGTTG
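
A quick sanity check on a two-barcode file like this is to confirm that every line has three fields and that no sample name or barcode pair appears twice. The following is a minimal sketch, assuming whitespace-separated "name barcode1 barcode2" lines as pasted above. parse_barcodes and duplicated are hypothetical helper names, and the sample uses four lines from the paste.

```python
from collections import Counter


def parse_barcodes(lines):
    """Split each non-empty line into a (name, barcode1, barcode2) tuple."""
    records = []
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 3:
            raise ValueError("expected 'name barcode1 barcode2', got: {!r}".format(line))
        records.append(tuple(fields))
    return records


def duplicated(items):
    """Return the sorted items that occur more than once."""
    return sorted(item for item, count in Counter(items).items() if count > 1)


sample = [
    "AR14 CCGAAT TCGGTAC",
    "AR17 TTAGGCA TCGGTAC",
    "AR56 CCGAAT GATCGTTG",
    "AR63 TTAGGCA GATCGTTG",
]

records = parse_barcodes(sample)
dup_names = duplicated(name for name, _, _ in records)
dup_pairs = duplicated((bc1, bc2) for _, bc1, bc2 in records)
print("duplicate names:", dup_names)
print("duplicate barcode pairs:", dup_pairs)
```

Individual barcodes are expected to repeat across samples in this design, so only the pair is checked for uniqueness.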