These are chat archives for dereneaton/ipyrad

Aug 2017
Zac Forsman
Aug 24 2017 06:54
@dereneaton Wow Deren! Thanks for that! I think the problem might be either the reference sequence being one big contig or the way I formatted the data in the file. I ran some real data in --preview mode against a de novo reference and it worked! This approach might work for some of the really massive datasets our lab has been generating. I can't afford to wait a month for libraries to cluster de novo, and we have good transcriptomes and de novo loci generated by dDocent (runs fast). I'll try this out and let you know how it works. I'm excited to give it a shot! -Zac
@Cycadales_twitter Thanks James! I might have stumbled upon a shortcut by using a reference assembly method (with the reference coming either from a transcriptome or from a de novo assembly generated in the dDocent pipeline). The dDocent pipeline is very fast (usually runs overnight), but it is geared towards population genetics and lacks a phylogenetic angle and some of the plug-and-play downstream tools. I just ran a test in --preview mode and it seemed to work. I'll let you know if it works on some of the more massive datasets we have (hopefully soon!) -Zac
Aug 24 2017 08:13

Hi @dereneaton, I reran the analysis from scratch, but this time I divided my whole dataset into three parts to try to get it running faster and to test whether one of the samples was problematic. I ran -s 123 on the three datasets with identical parameters. This finished without raising an error. Now I wanted to merge them for the rest of the analysis, so I did:

ipyrad -m all_2017 params-asian_marinum.txt params-capense.txt params-secalinum.txt

which worked. But then I got this error:

bash-4.2$ ipyrad -p params-all_2017.txt -s 4567 -c 20 -t 4

  ipyrad [v.0.7.11]
  Interactive assembly and analysis of RAD-seq data
  loading Assembly: all_2017
  from saved path: /filer-5/user/brassac/Secalinum_NGS/GBS_2/raw_reads/new/all_2017.json
  establishing parallel connection:
  host compute node: [20 cores] on

  Step 4: Joint estimation of error rate and heterozygosity
  [####################] 100%  inferring [H, E]      | 0:03:33
ERROR:ipyrad.assemble.util:  Sample secalinum_JoB2014_002A failed with error IOError([Errno 2] No such file or directory: '/filer-5/user/brassac/Secalinum_NGS/GBS_2/raw_reads/new/asian_marinum_clust_0.85/secalinum_JoB2014_002A.clustS.gz')
ERROR:ipyrad.core.assembly:The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

  Encountered an unexpected error (see ./ipyrad_log.txt)
  Error message is below -------------------------------
The truth value of an array with more than one element is ambiguous. Use a.any() or a.all()

What does this mean? Why would the program search for secalinum_JoB2014_002A in '../asian_marinum_clust_0.85/' when it is actually in '../secalinum_clust_0.85'?

Aug 24 2017 14:49
@joqb Did you look in the .json file that ipyrad creates to see what file paths are written out for the location of this sample?
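
One rough way to check this without reading the whole file by eye is sketched below. It assumes only that the saved .json is ordinary JSON that Python's json module can load; it makes no assumption about how ipyrad lays out the file and simply walks every nested value, listing any string that mentions the sample name. The helper names find_strings and report are made up for this sketch, and it needs Python 3.

```python
import json


def find_strings(obj, needle, trail=()):
    """Yield (trail, value) for every string value that contains `needle`.

    `trail` is the tuple of keys/indices leading to the value, so you can
    see where in the file a path is stored. Hypothetical helper, not part
    of ipyrad.
    """
    if isinstance(obj, dict):
        for key, val in obj.items():
            yield from find_strings(val, needle, trail + (str(key),))
    elif isinstance(obj, list):
        for i, val in enumerate(obj):
            yield from find_strings(val, needle, trail + (str(i),))
    elif isinstance(obj, str) and needle in obj:
        yield trail, obj


def report(json_path, sample):
    """Print every stored string in the JSON file that mentions `sample`."""
    with open(json_path) as fh:
        data = json.load(fh)
    for trail, value in find_strings(data, sample):
        print("/".join(trail), "->", value)


# Example call (adjust the path to where your merged assembly was saved):
# report("all_2017.json", "secalinum_JoB2014_002A")
```

If the printed paths for that sample point into the asian_marinum_clust_0.85 directory, then the wrong location is recorded in the merged .json itself rather than being worked out at run time.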