These are chat archives for dereneaton/ipyrad

9th
Aug 2017
Bohao Fang
@fangbohao_twitter
Aug 09 2017 12:44
@dereneaton That would be great, Thank you!
Deren Eaton
@dereneaton
Aug 09 2017 15:12

Notes on recent updates:
Version 0.7.9
Some new features include:

  • cleaner ctrl-c interrupt
  • bug fix to bpp API code for random sampling of loci
  • compatibility fix for baba plot code with newer toyplot
  • better support for --ipyclient flag in CLI when using a profile name for client.
  • merge paired reads with reference mapping now working again (no problem with merging for denovo) though it's a bit slower.
  • added hacker option to skip pair merging.

Version 0.7.10

  • bugfix to bucky ipa tool -- do not remove nex files
  • bugfix to bpp ipa tool -- randomize_order and seed options fixed
  • improved API analysis tool design.
joqb
@joqb
Aug 09 2017 18:02
Hi @dereneaton, unfortunately I don't have the error on screen anymore and I couldn't find anything related to this error in the ipyrad_log, but I thought at first it may be due to disk space. So I moved files around to free up some space, but it seemed that the bus error would come back faster each time.
Deren Eaton
@dereneaton
Aug 09 2017 18:49

@joqb did you update pysam since we made a fix for the GLIBC error that arose back in v.0.7.2? If not, then you will want to run conda install pysam -c ipyrad -f. If you happened to have the bad pysam version installed before, it might not have been overwritten/updated if you only updated ipyrad with conda. If you update pysam explicitly then you should see the new version of pysam look like this:

The following NEW packages will be INSTALLED:

    pysam: 0.10.0-py27h0af7445_3 ipyrad

I'm not sure if that is related to the problem you're seeing, but it might be.
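
One way to confirm which pysam build is actually installed is to read it from the output of conda list. The sketch below assumes the usual conda list column order (name, version, build, channel); pysam_build is a hypothetical helper name, not part of conda or ipyrad.

```shell
# Sketch: print "version-build" for pysam from `conda list`-style rows.
# pysam_build is a hypothetical helper; it only parses text on stdin.
pysam_build() {
    # columns assumed: name, version, build, channel
    awk '$1 == "pysam" { print $2 "-" $3 }'
}
```

Usage would look like conda list | pysam_build, and the result can be compared against the build string shown above.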

James Clugston
@Cycadales_twitter
Aug 09 2017 18:51

@dereneaton got a similar problem
```
Traceback (most recent call last):
  File "/home/jclugston/miniconda2/bin/ipyrad", line 11, in <module>
    load_entry_point('ipyrad==0.7.10', 'console_scripts', 'ipyrad')()
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 565, in load_entry_point
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 2598, in load_entry_point
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 2258, in load
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/setuptools-27.2.0-py2.7.egg/pkg_resources/__init__.py", line 2264, in resolve
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/__init__.py", line 20, in <module>
    from . import load as _load
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/load/__init__.py", line 14, in <module>
    from .load import test_assembly
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/load/load.py", line 13, in <module>
    from ipyrad.assemble.util import *
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/assemble/__init__.py", line 7, in <module>
    from . import cluster_within
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/assemble/cluster_within.py", line 31, in <module>
    from refmap import *
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/ipyrad/assemble/refmap.py", line 15, in <module>
    import pysam
  File "/home/jclugston/miniconda2/lib/python2.7/site-packages/pysam/__init__.py", line 5, in <module>
    from pysam.libchtslib import *
ImportError: /home/jclugston/miniconda2/lib/./libcom_err.so.3: symbol k5_strerror_r, version krb5support_0_MIT not defined in file libkrb5support.so.0 with link time reference
```

Deren Eaton
@dereneaton
Aug 09 2017 18:54
@Cycadales_twitter can you also try the update above?
James Clugston
@Cycadales_twitter
Aug 09 2017 18:54
Yea I just did
@dereneaton yea I just did and got the same message
@dereneaton I will try and force install ipyrad and see if that helps
Deren Eaton
@dereneaton
Aug 09 2017 18:56
OK, let me know. I was pretty sure we have this pysam bug fixed, so hopefully any errors are just bad older versions still lingering around.
James Clugston
@Cycadales_twitter
Aug 09 2017 18:57
@dereneaton so same error
@dereneaton These are the commands I tried to run:

    320 conda update conda
    321 conda update ipyrad -c ipyrad
    322 cd /data/jclugston/ubuntu/Carm
    323 history
    324 ipyrad -p params-Carm.txt -s 34567 -c 52 -r >& Carm1 &
    325 jobs
    326 kill %1
    327 jobs
    328 htop
    329 killall ipyrad
    330 ipyrad -p params-Carm.txt -s 34567 -c 52 -r >& Carm1 &
    331 jobs
    332 disown
    333 tail -f Carm1
    334 killall ipyrad
    335 jobs
    336*
    337 kill %1
    338 ipyrad -p params-Carm.txt -s 1234567 -c 52 -r >& Carm1 &
    339 jobs
    340 killall jobs
    341 ipyrad -p params-Carm.txt -s 12 -c 52 -r -f
    342 conda install pysam -c ipyrad -f
    343 ipyrad -p params-Carm.txt -s 12 -c 52 -r -f
    344 conda install ipyrad -c ipyrad -f
    345 ipyrad -p params-Carm.txt -s 12 -c 52 -r -f

@dereneaton these are the versions I am running:

    ipyrad  0.7.10  0               ipyrad
    pysam   0.10.0  py27h0af7445_3  ipyrad
Deren Eaton
@dereneaton
Aug 09 2017 19:04
hmm... did you install conda into your home directory (i.e., somewhere that should have no permission errors at all)?
James Clugston
@Cycadales_twitter
Aug 09 2017 19:05
@dereneaton no, it's installed in my user folder where I have access
Deren Eaton
@dereneaton
Aug 09 2017 19:05
I found this online as a fix for the problem: export LD_LIBRARY_PATH="", which I think was related to some other problem you had as well...
does that help?
@isaacovercast would know more about the nuts and bolts of pysam, maybe he can chime in.
James Clugston
@Cycadales_twitter
Aug 09 2017 19:10
@dereneaton ok that worked! I have been having a problem with a run taking forever and just getting stuck at 98%, even after 18 days using 52 cores. It's a lot of data: over 200 samples, 150 bp PE.
Deren Eaton
@dereneaton
Aug 09 2017 19:11
I'm not sure why your LD_LIBRARY_PATH keeps getting set to some other value. You could try adding the export command to your ~/.bashrc to ensure it is always cleared when you login.
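
A minimal sketch of adding that export line to a startup file without duplicating it on repeated runs. The helper name add_ld_clear is hypothetical (it is not part of ipyrad or conda), and it assumes ~/.bashrc is the file your login shell reads.

```shell
# Sketch: append the export line to a shell startup file only once.
# add_ld_clear is a hypothetical helper name used for illustration.
add_ld_clear() {
    rcfile="${1:-$HOME/.bashrc}"
    line='export LD_LIBRARY_PATH=""'
    # create the file if it is missing, then append only if the
    # exact line is not already present
    touch "$rcfile"
    grep -qxF "$line" "$rcfile" || printf '%s\n' "$line" >> "$rcfile"
}
```

Calling add_ld_clear with no argument targets ~/.bashrc; open a new shell or source the file afterwards for it to take effect.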
which step or substep specifically does it get stuck on?
James Clugston
@Cycadales_twitter
Aug 09 2017 19:13
@dereneaton it's always in step three, during clustering. I have been trying for months; last time it ran for over 22 days and was just stuck at 98%. This was before the updates, running version 0.7.2.
Deren Eaton
@dereneaton
Aug 09 2017 19:14
is it denovo or reference method?
James Clugston
@Cycadales_twitter
Aug 09 2017 19:14
@dereneaton denovo, ezRAD, running pairgbs
Deren Eaton
@dereneaton
Aug 09 2017 19:15
ah, I remember now.
James Clugston
@Cycadales_twitter
Aug 09 2017 19:15
@dereneaton one dataset has gone through but this one is just being very strange.
@dereneaton it's 154 samples exactly, all 150 bp PE ezRAD. It is usually quite slow, but this one just doesn't seem to finish; it seems as though one sample gets stuck.
Deren Eaton
@dereneaton
Aug 09 2017 19:19
Yeah, the clustering substep in step 3 is a bit strange b/c we don't let it progress until all samples have finished clustering. That just makes balancing the parallelization a bit easier; otherwise a clustering job might be competing against millions of little aligning jobs that other samples would be doing if they were ahead of it.
One easy way to reduce the size of the problem would be to identify the troublesome sample and split it into a separate branch, then run that branch with more resources available to it (e.g., use the -t flag to tell it to use more threads concurrently while clustering). Then merge the finished assemblies back together.
James Clugston
@Cycadales_twitter
Aug 09 2017 19:22
@dereneaton well, the question is how do I find out which samples are being problematic?
Deren Eaton
@dereneaton
Aug 09 2017 19:23
Oh, look for which sample does not have a .clustfile in the {name}_clust_0.85/ folder. That will be the one that didn't finish clustering. It will probably be the sample with the most reads.
James Clugston
@Cycadales_twitter
Aug 09 2017 19:24
@dereneaton ahh ok I will check that one now then
Deren Eaton
@dereneaton
Aug 09 2017 19:24
Or use ls -lthr to look at that directory in time-sorted order, and it will probably be the sample that last had data written to disk.
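
The first check (which sample has no finished cluster file) can be sketched as a small loop. Everything named here is an assumption for illustration: find_unclustered is a hypothetical helper, and the suffix argument should be whatever extension the finished samples in your clust folder actually carry.

```shell
# Sketch: print each sample name that has no file "<sample><suffix>"
# in the given directory. find_unclustered is a hypothetical helper;
# the suffix is an assumption to be matched to your own folder.
find_unclustered() {
    clustdir="$1"
    suffix="$2"
    shift 2
    for sample in "$@"; do
        [ -e "$clustdir/$sample$suffix" ] || printf '%s\n' "$sample"
    done
}
```

It would be called with the clust folder, the suffix, and then the sample names; any name it prints is a sample that has not finished.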
James Clugston
@Cycadales_twitter
Aug 09 2017 19:48
@dereneaton found the problem: it was two samples, each with over 3 GB of data...
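
Oversized samples like these can also be spotted up front by sorting the input files by size. This is only a sketch: largest_files is a hypothetical helper, and it assumes one file per sample in a flat directory.

```shell
# Sketch: list the n largest files in a directory, biggest first.
# largest_files is a hypothetical helper name.
largest_files() {
    dir="$1"
    n="${2:-5}"
    # ls -S sorts entries by file size, largest first
    ls -S "$dir" | head -n "$n"
}
```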
Deren Eaton
@dereneaton
Aug 09 2017 20:04
Cool. They'll cluster much faster if allowed to use more threads in step 3. For example, if you ran just those two samples and you had 20 cores available, you could tell it something like this and it would allow up to 10 threads during clustering in each sample:
ipyrad -p params-bigones.txt -c 20 -t 10 -s 3
James Clugston
@Cycadales_twitter
Aug 09 2017 20:26
@dereneaton OK, I will give that a try. At the moment I have removed them, as I need to get the analysis out of the way, and then I will go back to them when I have more time to wait on the job. Plus, I have never tried merging jobs yet.
Deren Eaton
@dereneaton
Aug 09 2017 21:14
@Cycadales_twitter It's in the documentation, though maybe it should be highlighted more. It's pretty simple though:
## create initial assembly
ipyrad -n alldata

## run some steps on that assembly
ipyrad -p params-alldata.txt -s 12

## branch to exclude samples A and B (note the minus symbol)
ipyrad -p params-alldata.txt -b data1 - A B

## branch to include only samples A and B (note the lack of minus symbol)
ipyrad -p params-alldata.txt -b data2 A B

## run step 3 on the data set with small samples
ipyrad -p params-data1.txt -c 20 -s 3 

## run step 3 on the data set with just 2 big samples (use high threading)
ipyrad -p params-data2.txt -c 20 -t 10 -s 3

## merge the assemblies back together
ipyrad -m merged params-data1.txt params-data2.txt

## run remaining steps
ipyrad -p params-merged.txt -s 4567
James Clugston
@Cycadales_twitter
Aug 09 2017 21:32
@dereneaton thanks Deren, I will try that then and see how it works. That does look very clear and easy to follow.