Notes on q-value for many simultaneous tests

The q value is used instead of the p value when many tests are run simultaneously. The q value is essentially the false discovery rate across many tests, rather than the false positive rate of a single test. The q values for a set of features (genes, peptides, etc.) can be calculated from the p values using false discovery rate procedures such as Benjamini-Hochberg. (Bonferroni is often listed alongside these, but it controls the family-wise error rate rather than the false discovery rate, so it gives adjusted p values rather than q values.) Two ways of calculating these q values are to use the p.adjust function in R (see example R session 5-23-12) or the QVALUE program in R. Once the q values are obtained, one can select the features whose false discovery rate is below a chosen level, or simply see what the false discovery rate is for features with certain p values. One can also estimate how many samples would be needed to bring the false discovery rate below a given threshold, assuming that the effect sizes and standard deviations stay about the same. This calculation can be done by trial and error in Excel.
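As a rough illustration of what a Benjamini-Hochberg adjustment does, here is a minimal Python sketch. The function name bh_adjust and the four p values are made up for this example; it is meant to mirror the usual Benjamini-Hochberg step-up rule (each sorted p value is multiplied by m / rank, then forced to be non-decreasing), not to reproduce the referenced R session. The QVALUE approach additionally estimates the proportion of true nulls, which this sketch does not do.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    # Indices of the p values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, so each adjusted value is the
    # minimum of p * m / rank over this rank and all larger ranks.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical p values for four features (illustration only).
p_values = [0.01, 0.04, 0.03, 0.005]
q_values = bh_adjust(p_values)

# Select the features whose adjusted value is at or below a chosen level.
fdr_level = 0.03
selected = [i for i, q in enumerate(q_values) if q <= fdr_level]
print(q_values, selected)
```

With these made-up inputs, the first and last features are the ones kept at the 0.03 level.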

"L:\storage\CIM Research Folder\DR\2012\5-23-12\example determination of multi test sample size 5-23-12.xlsx"

Other information

These sites look like they might have some good information:
http://viiia.org/fdrFigs/?l=en-us
http://www.nonlinear.com/support/progenesis/samespots/faq/pq-values.aspx
 * Another way to look at the difference is that a p-value threshold of 0.05 implies that 5% of all tests with no real effect will result in false positives. An FDR-adjusted p-value (or q-value) threshold of 0.05 implies that 5% of the tests called significant will be false positives. The latter is clearly a far smaller quantity.
 * If you order the p-values used to calculate the q-values, then the q-values will also be ordered.
 * The q-value is a little greater at 0.0141, which means we should expect 1.41% of all the spots with q-value less than this to be false positives.
 * In this way, a threshold of 0.05 has meaning across the entire experiment.
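
The difference between the two thresholds in the first bullet can be put into numbers with a small Python sketch. The counts below (1000 tests, 200 of them called significant) are hypothetical and chosen only to make the arithmetic easy to follow.

```python
# Hypothetical experiment size (illustration only).
n_tests = 1000          # features tested
n_significant = 200     # features with q <= 0.05
threshold = 0.05

# p-value threshold: if no feature had a real effect, about 5% of ALL
# tests would still fall below the threshold by chance.
expected_false_pos_p = threshold * n_tests

# q-value threshold: about 5% of the features CALLED SIGNIFICANT are
# expected to be false positives.
expected_false_pos_q = threshold * n_significant

print(expected_false_pos_p, expected_false_pos_q)
```

Under these assumed counts the p-value threshold allows for about 50 false positives across the experiment, while the q-value threshold corresponds to about 10 among the 200 selected features.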