These are chat archives for dereneaton/ipyrad

9th
Jun 2017
Deren Eaton
@dereneaton
Jun 09 2017 14:41
Hi @NebulousNic_twitter, my guess is that the file did not finish downloading, is not actually gzipped, or is corrupted in some way. You can check this with the Linux command gzip -t, as in this post (https://stackoverflow.com/questions/21524643/php-how-to-check-if-gz-file-is-corrupt). Let me know if this is not the case and we can look further.
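For anyone who would rather run the same integrity check from Python than from the shell, here is a minimal sketch using only the standard library. The function name gzip_is_intact and the chunk size are illustrative choices, not part of ipyrad. It simply tries to decompress the whole file and reports whether that succeeded, which is roughly what gzip -t does.

```python
import gzip
import zlib


def gzip_is_intact(path, chunk_size=1 << 20):
    """Return True if the whole file decompresses cleanly as gzip.

    Hypothetical helper for illustration; not an ipyrad function.
    """
    try:
        with gzip.open(path, "rb") as handle:
            # Read to the end in chunks so large FASTQ files
            # are never held in memory all at once.
            while handle.read(chunk_size):
                pass
    except (OSError, EOFError, zlib.error):
        # OSError covers "not a gzipped file", EOFError covers a
        # truncated download, zlib.error covers corrupted data.
        return False
    return True
```

Calling gzip_is_intact on the raw fastq.gz file returns False for a truncated, corrupted, or non-gzip file.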
toczydlowski
@toczydlowski
Jun 09 2017 16:22
@dereneaton Am I correct that there is no option to write output in TreeMix format? Is there an easy way to convert one of the output formats ipyrad does produce into TreeMix input? Thanks!
Isaac Overcast
@isaacovercast
Jun 09 2017 19:01
@jaecan808_twitter The easiest way to diagnose this problem is to use the file command to show you what type of file your raw fastqs are. Like this:
file /Volumes/WorkDrive/ipyrad/ipyrad/tests/ipsimdata/rad_example_R1_.fastq.gz 
/Volumes/WorkDrive/ipyrad/ipyrad/tests/ipsimdata/rad_example_R1_.fastq.gz: gzip compressed data, was "rad_example_R1_.fastq", last modified: Thu Sep 29 15:11:25 2016, max compression
@jaecan808_twitter My guess is your data is either compressed with some other compression algorithm, or it's not compressed at all and it just has the .gz extension.
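If the file command is not available, the same question can be answered by looking at the first few bytes of the file. The sketch below is only an illustration: the function name sniff_compression and the returned labels are made up here, and it recognizes just a handful of common formats by their well-known magic bytes.

```python
def sniff_compression(path):
    """Guess how a file is compressed from its leading magic bytes.

    Hypothetical helper for illustration; not an ipyrad function.
    """
    with open(path, "rb") as handle:
        head = handle.read(6)
    if head.startswith(b"\x1f\x8b"):
        return "gzip"
    if head.startswith(b"BZh"):
        return "bzip2"
    if head.startswith(b"\xfd7zXZ\x00"):
        return "xz"
    if head.startswith(b"PK\x03\x04"):
        return "zip"
    if head.startswith(b"@"):
        # FASTQ records begin with '@', so this is most likely
        # plain text that merely carries a .gz extension.
        return "uncompressed fastq"
    return "unknown"
```

Anything other than "gzip" for a file named .fastq.gz points to one of the two cases described above.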
LinaValencia85
@LinaValencia85
Jun 09 2017 23:30
Hi @dereneaton @isaacovercast I am analyzing 150bp PE ddRAD data from one primate species for population genetic analyses. After running steps 1-7 I noticed that many of my loci show a weird pattern, where ipyrad calls a SNP at every site in the last ~40bp of the locus. After looking at the data, I realized this is due to incomplete digestion by the restriction enzyme in some samples. Because my enzyme cuts very frequently, not all reads are cut at the same site, and for some loci I see two RE sites very close together, which leads to a bad alignment at the end of the locus and thus to many false SNPs. Is there any way in ipyrad to trim loci adaptively, so that the last bp of each locus are removed not by a fixed value (using trim_loci) but, as in pyrad, by trimming each locus to the shortest read? Thanks!!
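To make "trimmed to the shortest read" concrete, here is a small sketch of the idea applied to a single aligned locus. It is not how ipyrad or pyrad implement trimming; the function name trim_to_shortest and the input layout (a list of equal-length aligned strings whose missing right ends are padded with "-" or "N") are assumptions made for illustration.

```python
def trim_to_shortest(aligned, pad="-N"):
    """Trim the right edge of one aligned locus to the shortest read.

    Hypothetical helper for illustration; not an ipyrad function.
    `aligned` is assumed to be a list of equal-length strings in
    which a read that ends early is right-padded with '-' or 'N'.
    Note that genuine trailing N base calls are treated as padding too.
    """
    if not aligned:
        return []
    # Position just past the last real base in each sequence.
    ends = [len(seq.rstrip(pad)) for seq in aligned]
    # Keep only the columns that every sample actually covers.
    keep = min(ends)
    return [seq[:keep] for seq in aligned]
```

For example, trim_to_shortest(["ACGTACGT", "ACGTAC--", "ACGTNNNN"]) keeps only the first four columns of each sequence, since the third read has no real bases beyond that point.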