dinucleotideFrequencyTest {Biostrings} | R Documentation |
Pearson's chi-squared Test and G-tests for String Position Dependence
Description
Performs Pearson's chi-squared test, G-test, or Williams' corrected G-test to determine dependence between two nucleotide positions.
Usage
dinucleotideFrequencyTest(x, i, j, test = c("chisq", "G", "adjG"),
simulate.p.value = FALSE, B = 2000)
Arguments
x
  A DNAStringSet or RNAStringSet object.
i, j
  Single integer values giving the positions to test for dependence.
test
  One of "chisq" (Pearson's chi-squared test), "G" (G-test), or "adjG" (Williams' corrected G-test). See Details.
simulate.p.value
  A logical indicating whether to compute p-values by Monte Carlo simulation.
B
  An integer specifying the number of replicates used in the Monte Carlo test.
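As a rough illustration of what simulate.p.value and B control, the sketch below uses base R's chisq.test, which accepts arguments of the same names, on a small made-up table of base counts. It is an assumption that dinucleotideFrequencyTest simulates its p-values in a comparable way; the counts are hypothetical and are not taken from any Biostrings data set.

```r
# Hypothetical 2 x 2 table of base counts at two positions (made-up data)
counts <- matrix(c(12, 3, 4, 11), nrow = 2,
                 dimnames = list(pos_i = c("A", "T"), pos_j = c("C", "G")))

set.seed(1)  # make the simulated p-value reproducible

# Asymptotic chi-squared p-value (no continuity correction)
asym <- chisq.test(counts, correct = FALSE)

# Monte Carlo p-value based on B simulated tables with the same margins
mc <- chisq.test(counts, simulate.p.value = TRUE, B = 2000)

asym$p.value
mc$p.value
```

A larger B reduces the Monte Carlo error of the simulated p-value at the cost of run time.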
Details
The null and alternative hypotheses for this function are:
- H0: positions i and j are independent
- H1: otherwise
Let O and E be the observed and expected probabilities for base pair combinations at positions i and j, respectively. Then the test statistics are calculated as:
- test="chisq": stat = sum(abs(O - E)^2/E)
- test="G": stat = 2 * sum(O * log(O/E))
- test="adjG": stat = 2 * sum(O * log(O/E))/q, where q = 1 + ((df - 1)^2 - 1)/(6*length(x)*(df - 2))
Under the null hypothesis, these test statistics are approximately distributed chi-squared(df = ((distinct bases at i) - 1) * ((distinct bases at j) - 1)).
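The chi-squared and G statistics above can be sketched in base R as follows. This is a minimal illustration, not the Biostrings implementation: the sequences are made up, O and E are taken to be observed and expected counts (an assumption, chosen so that the chi-squared statistic agrees with chisq.test), and cells with zero observed count are assumed to contribute nothing to G. The adjusted statistic would divide the G statistic by the q defined above and is not computed here.

```r
# Hypothetical aligned sequences (made-up data, not HNF4alpha)
seqs <- c("ACGT", "ACGA", "AGGT", "TCGT", "TGCA", "AGCT", "TCCA", "ACGT")
i <- 1
j <- 2

# Bases observed at positions i and j
bi <- substr(seqs, i, i)
bj <- substr(seqs, j, j)

# Observed counts, and expected counts under independence
O <- table(bi, bj)
E <- outer(rowSums(O), colSums(O)) / sum(O)

# Degrees of freedom: (distinct bases at i - 1) * (distinct bases at j - 1)
df <- (nrow(O) - 1) * (ncol(O) - 1)

# Pearson's chi-squared statistic
chisq_stat <- sum((O - E)^2 / E)

# G statistic; zero-count cells are skipped (assumed to contribute 0)
nz <- O > 0
g_stat <- 2 * sum(O[nz] * log(O[nz] / E[nz]))

# Approximate p-values from the chi-squared distribution
p_chisq <- pchisq(chisq_stat, df = df, lower.tail = FALSE)
p_g <- pchisq(g_stat, df = df, lower.tail = FALSE)
```

With only eight sequences the chi-squared approximation is poor; the toy data serve only to show the arithmetic.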
Value
An htest object. See help(chisq.test) for more details.
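For readers unfamiliar with htest objects, the sketch below inspects one returned by base R's chisq.test on a made-up table. It assumes the object returned by dinucleotideFrequencyTest exposes the usual htest components in the same way; the counts are hypothetical.

```r
# Hypothetical 2 x 2 table of base counts (made-up data)
counts <- matrix(c(12, 3, 4, 11), nrow = 2,
                 dimnames = list(pos_i = c("A", "T"), pos_j = c("C", "G")))

res <- chisq.test(counts, correct = FALSE)

class(res)       # "htest"
res$statistic    # value of the test statistic
res$parameter    # degrees of freedom
res$p.value      # p-value of the test
res$method       # description of the test performed
```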
Author(s)
P. Aboyoun
References
Ellrott, K., Yang, C., Sladek, F.M., Jiang, T. (2002) "Identifying transcription factor binding sites through Markov chain optimization", Bioinformatics, 18 (Suppl. 2), S100-S109.
Sokal, R.R., Rohlf, F.J. (2003) "Biometry: The Principles and Practice of Statistics in Biological Research", W.H. Freeman and Company, New York.
Tomovic, A., Oakeley, E. (2007) "Position dependencies in transcription factor binding sites", Bioinformatics, 23, 933-941.
Williams, D.A. (1976) "Improved Likelihood ratio tests for complete contingency tables", Biometrika, 63, 33-37.
See Also
nucleotideFrequencyAt, XStringSet-class, chisq.test
Examples
data(HNF4alpha)
dinucleotideFrequencyTest(HNF4alpha, 1, 2)
dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "G")
dinucleotideFrequencyTest(HNF4alpha, 1, 2, test = "adjG")