These are chat archives for dereneaton/ipyrad

30th
Sep 2016
Edgardo M. Ortiz
@edgardomortiz
Sep 30 2016 03:20
Question: for parameter [9] max_low_qual_bases, is the total of Ns allowed in a read-pair or in r1 and r2 independently?
Deren Eaton
@dereneaton
Sep 30 2016 03:36
It is the total number, which I think we decided on because when the reads are clustered they are compared as a single concatenated sequence, and so it seems like just the total number is relevant. But I suppose when mapping to a reference the pairs are mapped separately, so maybe it would make sense to allow separate values.
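A minimal sketch of the distinction, not ipyrad's actual code: the function names `count_n`, `passes_total`, and `passes_each` are hypothetical, and the example reads and the limit of 5 are made up for illustration. It contrasts a limit applied to the combined N count of a pair (the behavior described above) with a limit applied to R1 and R2 separately.

```python
def count_n(seq):
    """Count uncalled bases (N) in a single read."""
    return seq.upper().count("N")


def passes_total(r1, r2, max_n):
    """Pair-level filter: the limit applies to the Ns of R1 and R2 combined."""
    return count_n(r1) + count_n(r2) <= max_n


def passes_each(r1, r2, max_n):
    """Alternative filter: the limit applies to R1 and R2 independently."""
    return count_n(r1) <= max_n and count_n(r2) <= max_n


# Hypothetical read pair with 3 Ns in each read (6 in total).
r1 = "ACGTNNNACGT"
r2 = "NNNTTGACCA"

print(passes_total(r1, r2, max_n=5))  # False: 6 Ns in the pair exceeds 5
print(passes_each(r1, r2, max_n=5))   # True: each read has only 3 Ns
```

With the same limit, the pair-level rule is the stricter of the two, so a pair can fail the combined check while each read passes on its own.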
Robin Sleith
@R0cknRobin_twitter
Sep 30 2016 14:03
Hello, I am still having a problem with a broadcasting error in step 6 similar to the issue @SheaML was having. Were you able to determine what the issue was? The error does not appear when I use small subsets of my dataset, so I am not sure if individual readpools are causing the issue or if it is an issue of clustering across many readpools. Thanks! ('error in step 6 %s', <Remote[1]:ValueError(could not broadcast input array from shape (210) into shape (197))>)
Deren Eaton
@dereneaton
Sep 30 2016 14:29
Hi @R0cknRobin_twitter, we're doing some final testing and then we'll push v.0.4.0 today. This will fix the problem and run step 6 much faster.
Deren Eaton
@dereneaton
Sep 30 2016 14:35

@jlmcdaniel A few quick questions and/or tips:

  1. The error you received is due to the python call in your bash script

  2. If you're switching from pyrad to ipyrad you have to restart from step 1. Have you run steps 1 and 2 with ipyrad already?

  3. The installation instructions for ipyrad include the command source ~/.bashrc after you install miniconda and ipyrad. This command re-loads the file .bashrc which is usually automatically called when you open a new terminal. Once it is 'sourced' you don't have to type the full path to the programs anymore, just the name. So instead of the long path /mnt/gluster/jlmcdaniel/miniconda2/bin/ipyrad, you can simply type ipyrad. If you type ipyrad and it says that the command is not recognized, then your system must not automatically source .bashrc (though this is rare), but you can fix this by running source ~/.bashrc.

  4. Once ipyrad is callable, you can simply call it with a single line of code as follows:

    ipyrad -p params.txt -s 123
  5. If you want to make a submission script like in your example that can be run on condor I would make it like the following: https://gist.github.com/dereneaton/6c6bfe6e487eec49cb0731bc9c3565ac

  6. The ipcluster code in your example is used to set up an advanced cluster configuration, usually involving many nodes. If you are connecting to CPUs on a single node you are unlikely to need it.

Does that make things more clear? Thanks for the questions.
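As a rough sketch of the check behind tip 3, the snippet below tests whether a program can be found by name on the current PATH and, if not, looks in a fallback directory. The helper `find_program` is hypothetical, and `~/miniconda2/bin` is only an assumed install location; substitute wherever miniconda was actually installed.

```python
import os
import shutil


def find_program(name, extra_dirs=()):
    """Return the full path to `name` from PATH or from extra_dirs, else None."""
    found = shutil.which(name)  # searches the directories listed in PATH
    if found:
        return found
    for directory in extra_dirs:
        candidate = os.path.join(os.path.expanduser(directory), name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            return candidate
    return None


# "~/miniconda2/bin" is an assumed location, not necessarily yours.
path = find_program("ipyrad", extra_dirs=["~/miniconda2/bin"])
if path is None:
    print("ipyrad not found; try `source ~/.bashrc` or check the install path")
else:
    print("ipyrad found at", path)
```

If the program is only found through the fallback directory, the shell has probably not loaded the PATH change that the installer wrote to .bashrc.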

James McDaniel
@jlmcdaniel
Sep 30 2016 15:27
@dereneaton Thanks for the clear explanation! I was working on the error last night and realized that I installed ipyrad through an interactive job and didn't set the correct path for the installation as well. I am going to start from the beginning with a clean install, and I'll let you know how it goes. I appreciate you and Isaac being so willing to help with computing issues!
Isaac Overcast
@isaacovercast
Sep 30 2016 16:50
v.0.4.0 now available on conda.
  • Significantly faster steps 6 and 7!
  • Better handling of filtering and trimming raw data in step 2!
Shea Lambert
@SheaML
Sep 30 2016 18:34
Hi @dereneaton and @isaacovercast, just tried v0.4.0 on my university HPC but ran into this error at step 2:
IPyradError((' error in %s, %s', ['cutadapt', '--quality-cutoff', '20,20', '-u', '0', '-U', '0', '--trim-n', '--quality-base', '33', '--max-n', '15', '--minimum-length', '35', '-o',
'Traceback (most recent call last):\n File "/home/u26/slambert1/miniconda2/bin/cutadapt", line 4, in <module>\n __import__(\'pkg_resources\').run_script(\'cutadapt==1.11\', \'cutadapt\')\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 726, in run_script\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1491, in run_script\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/cutadapt-1.11-py2.7-linux-x86_64.egg/EGG-INFO/scripts/cutadapt", line 9, in <module>\n \n File "/12kx/gsfs2/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/cutadapt-1.11-py2.7-linux-x86_64.egg/cutadapt/scripts/cutadapt.py", line 62, in <module>\n File "/12kx/gsfs2/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/cutadapt-1.11-py2.7-linux-x86_64.egg/cutadapt/__init__.py", line 11, in check_importability\n File "/12kx/gsfs2/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/cutadapt-1.11-py2.7-linux-x86_64.egg/cutadapt/_align.py", line 7, in <module>\n File "/12kx/gsfs2/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/cutadapt-1.11-py2.7-linux-x86_64.egg/cutadapt/_align.py", line 4, in __bootstrap__\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1152, in resource_filename\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1696, in get_resource_filename\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1726, in _extract_resource\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1219, in get_cache_path\n File "/home/u26/slambert1/miniconda2/lib/python2.7/site-packages/setuptools-20.3-py2.7.egg/pkg_resources/__init__.py", line 1199, in 
extraction_error\npkg_resources.ExtractionError: Can\'t extract file(s) to egg cache\n\nThe following error occurred while trying to extract file(s) to the Python egg\ncache:\n\n [Errno 17] File exists: \'/home/u26/slambert1/.python-eggs\'\n\nThe Python egg cache directory is currently set to:\n\n /home/u26/slambert1/.python-eggs\n\nPerhaps your account does not have write access to this directory? You can\nchange the cache directory by setting the PYTHON_EGG_CACHE environment\nvariable to point to an accessible directory.\n\n'))
Isaac Overcast
@isaacovercast
Sep 30 2016 18:45
Hi @SheaML, did you run step 1 with v0.4.0?
Shea Lambert
@SheaML
Sep 30 2016 18:51
@isaacovercast I ran "-f -s 1234567" -- Trying again after deleting the .json and all previous outputs.
Isaac Overcast
@isaacovercast
Sep 30 2016 18:53
let me know how it goes.
Shea Lambert
@SheaML
Sep 30 2016 18:59
looks like we made it through this time - I'll avoid re-running step 1 with the '-f' argument after updates in the future. Thanks!
Isaac Overcast
@isaacovercast
Sep 30 2016 19:02
Hm. Well, glad it's back on track.
Not sure what was happening...
Deren Eaton
@dereneaton
Sep 30 2016 19:14
@isaacovercast @SheaML Looks like it was an error with importing cutadapt.
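The traceback itself suggests one possible workaround: pointing the PYTHON_EGG_CACHE environment variable at a directory the account can write to. The sketch below does that from Python before any subprocess is launched. The helper name `use_private_egg_cache` is hypothetical, and whether this would have resolved the failure reported here is an assumption; in this thread the rerun simply succeeded.

```python
import os
import tempfile


def use_private_egg_cache(env=None):
    """Point PYTHON_EGG_CACHE at a fresh, writable directory and return its path."""
    env = os.environ if env is None else env
    # mkdtemp creates a new directory that only the current user can access.
    cache_dir = tempfile.mkdtemp(prefix="python-eggs-")
    env["PYTHON_EGG_CACHE"] = cache_dir
    return cache_dir


cache_dir = use_private_egg_cache()
print("egg cache:", cache_dir)
```

Child processes started afterwards inherit the modified environment, so a tool launched from the same Python session would see the new cache location.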