These are chat archives for dereneaton/ipyrad

7th
Sep 2018
Andrew Hipp
@andrew-hipp
Sep 07 2018 10:42
@eaton-lab Fantastic! I look forward to trying it out!
Andrew Hipp
@andrew-hipp
Sep 07 2018 10:55
@alx552 On your "No files ready" error ... did you get it solved? I had the same problem, and it was a path length problem. Take a look here: https://groups.google.com/forum/#!topic/pyrad-users/d48TCd6PSMs
Ollie White
@Ollie_W_White_twitter
Sep 07 2018 17:14
image.png
Ollie White
@Ollie_W_White_twitter
Sep 07 2018 17:24

Hello, I was just wondering how the p-values for D-statistics are calculated. I have calculated the p-values myself, and mine seem to be less conservative for tests with borderline significance. On that note, am I correct in assuming that significance is denoted by the orange or turquoise/green coloration of the D-statistic distribution? See the attached screen grab of the plot I mean.

I have taken the output tables produced in the jupyter notebook such as this:

n  dstat  bootmean  bootstd  Z      ABBA     BABA     nloci
0  0.054  0.057     0.057    0.941  149.283  134.056  2690
1  0.147  0.146     0.053    2.783  151.011  112.235  2834

I have tried to calculate the p-values following the Pedicularis paper: converting the Z-score to a two-tailed p-value, correcting with Holm–Bonferroni, and using a 0.01 cutoff. I did this in R (sorry, I'm still learning Python...) using the following code, in case it's helpful:

# read in table

d <- read.table(file = "D-stats-results.txt", header = TRUE, sep = "\t")

# convert Z-score into two-tailed p-value

p <- 2*(pnorm(-abs(d$Z)))

# correct for multiple comparisons using Holm-Bonferroni

cor.p <- p.adjust(p, method = "holm")

# add p-values to data frame

d <- data.frame(d, cor.p)

# write table

write.table(d, file = "D-stats-results-with-p-value.txt", sep = "\t", col.names = TRUE, row.names = FALSE)
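
For anyone who wants to stay in the notebook, the same steps can be roughly sketched in Python with only the standard library. This is a minimal sketch mirroring the R code above, not a description of how ipyrad computes its own p-values. The function names `two_tailed_p` and `holm_adjust` are made up for illustration, and the Z-scores are just the two rows from the table above.

```python
import math


def two_tailed_p(z):
    """Two-tailed p-value for a standard-normal Z-score."""
    # Same quantity as 2 * pnorm(-abs(z)) in R.
    return math.erfc(abs(z) / math.sqrt(2.0))


def holm_adjust(pvals):
    """Holm-Bonferroni adjusted p-values, returned in the input order."""
    m = len(pvals)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # Scale the (rank + 1)-th smallest p-value by the number of
        # hypotheses not yet handled, cap at 1, and keep values monotone.
        candidate = min(1.0, (m - rank) * pvals[idx])
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted


# Z-scores copied from the two example rows (illustration only).
z_scores = [0.941, 2.783]
raw_p = [two_tailed_p(z) for z in z_scores]
cor_p = holm_adjust(raw_p)

for z, p, q in zip(z_scores, raw_p, cor_p):
    print(f"Z={z:.3f}  p={p:.4f}  holm_p={q:.4f}")
```

Note that the Holm correction depends on how many tests go into it, so adjusting over only these two rows is for illustration; the full table would need to be passed in together.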

There's probably an easier way of doing this; I was just wondering why I am getting less conservative results.

Hope this makes sense

Cheers
Ollie