These are chat archives for dereneaton/ipyrad

24th
Mar 2017
Jenny Archibald
@jenarch
Mar 24 2017 01:26

@dereneaton @isaacovercast Thanks for the additional advice. So far the new run is still at 0% clustering on step 6, and the s6_cluster_stats.txt is empty (but I don't know when that changes). As for running without MPI, that's actually what I was doing before today.
The previous run was:

#MSUB -N ipyCH1mar15
#MSUB -l nodes=1:ppn=8,mem=125gb,walltime=168:00:00
#MSUB -m abe
#MSUB -d /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -e /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -o /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -j oe

module purge
export PATH=/home/jkarch/miniconda2/bin:$PATH
ipyrad -p params-m04c90.txt -s 67 -c 8

Today I continued it with these changes:

#MSUB -N ipyCH1mar23
#MSUB -l nodes=4:ppn=8,mem=125gb,walltime=168:00:00
#MSUB -m abe
#MSUB -d /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -e /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -o /panfs/pfs.local/scratch/bi/jkarch/cam/ch1
#MSUB -j oe

module purge
export PATH=/home/jkarch/miniconda2/bin:$PATH
module load OpenMPI
ipyrad -p params-m04c90.txt -s 67 -c 32 --MPI

Would you suggest additional changes (such as restarting with a different filter_adapters setting), or do I need to wait and see? I saw your post about long jobs; these jobs have been running for weeks on 8 cores.
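
For reference, the `-c 32` in the second script matches the resource request (4 nodes x 8 processors per node). A minimal sketch of deriving that value rather than hard-coding it is below; `total_cores` is a hypothetical helper, not part of ipyrad or Moab, and the script only prints the command it would run.

```shell
#!/usr/bin/env bash
# Hypothetical helper: multiply nodes by processors-per-node to get the -c value.
total_cores() {
    echo $(( $1 * $2 ))
}

nodes=4   # from "#MSUB -l nodes=4:ppn=8" in the job script above
ppn=8

# Print (rather than run) the resulting ipyrad command line.
echo "ipyrad -p params-m04c90.txt -s 67 -c $(total_cores "$nodes" "$ppn") --MPI"
```

Keeping the two numbers in one place avoids asking ipyrad for more cores than the scheduler granted if the `nodes` or `ppn` request changes later.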

R2C2.lab
@R2C2_Lab_twitter
Mar 24 2017 10:50
@isaacovercast here is the link for the raw data (symbB.fasta is the reference) https://z3ma8m.s.cld.pt
R2C2.lab
@R2C2_Lab_twitter
Mar 24 2017 11:20
@dereneaton I changed the params file following your suggestions and got this message:

ipyrad [v.0.6.10]

Interactive assembly and analysis of RAD-seq data

loading Assembly: 19samples-reference-transcriptome-montipora-trimgalore
from saved path: ~/19samples_reference/19samples-reference-transcriptome-montipora-trimgalore.json
host compute node: [20 cores] on ccmar-r2c2-01

Step 2: Filtering reads
[####################] 100% processing reads | 0:03:55
No reads passed filtering in Sample: MC-RB-3-YOP
No reads passed filtering in Sample: L27
No reads passed filtering in Sample: L26

I repeated the analysis using less stringent parameters and got the same message in ipyrad_log.txt at step 3 that I posted before (Fatal error: More reverse reads than forward reads).
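
One quick diagnostic for an error like this is to check whether each sample's forward and reverse files still contain the same number of reads after trimming. The sketch below assumes standard 4-line FASTQ records; `count_reads` and `check_pair` are hypothetical helper names and the file names in the usage comment are placeholders, not the actual files from this run.

```shell
#!/usr/bin/env bash
# Count reads in a FASTQ file (plain or gzipped), assuming 4 lines per record.
count_reads() {
    local f="$1"
    if [[ "$f" == *.gz ]]; then
        gzip -dc "$f" | awk 'END { print NR / 4 }'
    else
        awk 'END { print NR / 4 }' "$f"
    fi
}

# Compare the R1 and R2 read counts for one sample.
# Prints "OK <n>" when they match; otherwise prints both counts and returns 1.
check_pair() {
    local r1 r2
    r1=$(count_reads "$1")
    r2=$(count_reads "$2")
    if [[ "$r1" == "$r2" ]]; then
        echo "OK $r1"
    else
        echo "MISMATCH R1=$r1 R2=$r2"
        return 1
    fi
}

# Example usage (placeholder file names):
#   check_pair sampleA_R1.fastq.gz sampleA_R2.fastq.gz
```

A mismatch would suggest that the trimming step dropped reads from one file without dropping their mates from the other; a non-integer count would suggest a truncated or malformed file.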
draheem
@draheem
Mar 24 2017 17:09
@isaacovercast Thanks. I have 30 samples for a phylogenetic study (single-end RADseq data, 250 bp reads, R1).
(1) Nucleotide diversity: I'm not really sure how to check this, but I experimented with varying param 22 (max_SNPs_locus) with the other parameters held constant (clustering at 85%). When I raised param 22 from 20 to 40, the total number of SNPs increased from about 126,000 to 312,000 (total loci went up from 8,300 to 13,000). I also tried settings of 60, 80, and 100, but the increase in total SNPs and loci plateaued, e.g. only a 13% increase when param 22 was increased from 40 to 60.
(2) I checked all the samples with FastQC: some stats, such as per base sequence quality, are good for all samples, while other stats are abnormal (e.g. per base sequence content, sequence duplication levels). I also looked at the actual sequence data for a few samples: in some, quality scores decrease towards the end of the reads but are still generally above a Q score of 20.
My question is: given the length of the reads, is it advisable to increase the settings above the default values for parameters 22 (max_SNPs_locus) and 23 (max_indels_locus)?
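
To put the param 22 comparison on one scale, the approximate totals quoted above can be turned into percent increases and mean SNPs per locus. This is only a sketch of the arithmetic on the rounded figures from the message; `pct_increase` and `per_locus` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Percent increase from an old value to a new value, to one decimal place.
pct_increase() {
    awk -v old="$1" -v new="$2" 'BEGIN { printf "%.1f\n", (new - old) / old * 100 }'
}

# Mean SNPs per locus, to one decimal place.
per_locus() {
    awk -v snps="$1" -v loci="$2" 'BEGIN { printf "%.1f\n", snps / loci }'
}

# max_SNPs_locus raised from 20 to 40 (approximate totals from the message above)
pct_increase 126000 312000   # SNPs:  147.6
pct_increase 8300 13000      # loci:  56.6
per_locus 126000 8300        # SNPs per locus at 20: 15.2
per_locus 312000 13000       # SNPs per locus at 40: 24.0
```

On these rounded figures, total SNPs grow much faster than total loci when the cap is raised from 20 to 40, so the loci recovered at the higher setting carry more SNPs on average.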