Compute and transform relative frequencies for a qualitative trait in a germplasm collection by the following methods (Balakrishnan and Suresh 2001):
Square root-proportion
Log-frequency
Usage
prop.adj(x, method = c("none", "log", "sqrt"), size.count = NULL)
Arguments
- x
Data of a qualitative trait for accessions in a collection as a vector of type factor.
- method
The method for transformation. Either "none" for no transformation, "log" for log-frequency transformation, or "sqrt" for square root-proportion transformation.
- size.count
A positive integer specifying the target size of the core collection. The sum of frequencies allocated across the levels of each qualitative trait will not exceed this value. When supplied, it also serves as the upper bound for iterative proportion clamping. If NULL, no clamping is performed and the adjusted proportions are returned as-is.
Details
If \(p_{i}\) is the relative frequency of the \(i\)th descriptor state for a qualitative trait in a collection, then the square root-proportion transformed relative frequency \(q_{i}\) is computed as
\[q_{i} = \frac{\sqrt{p_{i}}}{\sum_{i=1}^{s}\sqrt{p_{i}}}\]
where \(s\) is the number of possible descriptor states for the qualitative trait in the collection.
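The square root-proportion formula above can be sketched in base R as follows. This is a minimal illustration and not the package's implementation; the vector `counts` and its values are hypothetical and chosen only to show how the transformation shifts weight from common states to rare ones.

```r
# Hypothetical absolute frequencies for four descriptor states (illustrative only)
counts <- c(A = 50, B = 30, C = 15, D = 5)

# Relative frequencies p_i
p <- counts / sum(counts)

# Square root-proportion transformed relative frequencies q_i
q <- sqrt(p) / sum(sqrt(p))

# Compare raw and transformed proportions side by side
round(rbind(p = p, q = q), 4)
```

The transformed values still sum to 1, but the most frequent state receives a smaller share and the rarest state a larger share than under the raw proportions.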
Similarly, the log-frequency transformed relative frequency \(q_{i}\) is computed as
\[q_{i} = \frac{\log(F_{i} + k)}{\sum_{i=1}^{s}\log(F_{i} + k)}\]
where \(F_{i}\) is the absolute frequency of the \(i\)th
descriptive state for a qualitative trait in a collection. It is incremented
by a constant \(k = 0.000001\) prior to log transformation. This ensures
that singleton descriptor states (where \(F_{i} = 1\)) yield a small but
non-zero proportion rather than being assigned a zero proportion due to
\(\log(1) = 0\), which would otherwise exclude all accessions of that
descriptor state from core selection irrespective of size.count.
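The effect of the constant \(k\) on singleton states can be checked with a few lines of base R. This is a sketch under stated assumptions, not the package's code: the `counts` vector is hypothetical, and `q_without_k` is computed only to show what would happen if the constant were omitted.

```r
# Hypothetical absolute frequencies; state D is a singleton (F_i = 1)
counts <- c(A = 50, B = 30, C = 15, D = 1)

# Constant added before the log transformation, as described above
k <- 0.000001

# Log-frequency transformed proportions with the constant
q_with_k <- log(counts + k) / sum(log(counts + k))

# Same computation without the constant, for comparison only
q_without_k <- log(counts) / sum(log(counts))

# The singleton state gets exactly zero without k, and a tiny positive value with k
q_without_k[["D"]] == 0
q_with_k[["D"]] > 0
```

With the constant in place the singleton state keeps a very small positive proportion, while the proportions of the other states are practically unchanged.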
When size.count is supplied, the transformed proportions
\(q_{i}\) are subject to iterative clamping to ensure that the implied
frequency \(q_{i} \times n\) for any descriptor state \(i\) does
not exceed its actual count in the collection, where \(n\) is
size.count. Excess proportion from clamped states is redistributed
proportionally among unclamped states and the process repeats until no state
exceeds its maximum allowable proportion \(F_{i} / n\).
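One possible reading of the clamping procedure is sketched below. The function `clamp_props()`, its `tol` argument, and the example counts are all hypothetical; the sketch assumes that excess proportion is redistributed in proportion to the current values of the unclamped states, and it may differ in detail from what `prop.adj()` does internally.

```r
# Hypothetical helper: clamp proportions q so that q_i * n <= F_i for every state
clamp_props <- function(q, counts, n, tol = 1e-12) {
  cap <- counts / n                      # maximum allowable proportion F_i / n
  clamped <- rep(FALSE, length(q))       # states already fixed at their cap
  repeat {
    over <- !clamped & (q > cap + tol)   # unclamped states that exceed their cap
    if (!any(over)) break
    excess <- sum(q[over] - cap[over])   # proportion to hand back
    q[over] <- cap[over]
    clamped <- clamped | over
    free <- !clamped
    if (!any(free)) break                # no state left to absorb the excess
    # Redistribute the excess proportionally among unclamped states
    q[free] <- q[free] + excess * q[free] / sum(q[free])
  }
  q
}

# Hypothetical counts and target core size (illustrative only)
counts <- c(A = 80, B = 15, C = 4, D = 1)
n <- 30

p <- counts / sum(counts)
q <- sqrt(p) / sum(sqrt(p))              # square root-proportion transformation
q_adj <- clamp_props(q, counts, n)

# Implied frequencies before and after clamping, next to the actual counts
round(rbind(before = q * n, after = q_adj * n, available = counts), 3)
```

In this example the unclamped proportion for state D implies more than one accession in a core of size 30 although only one exists, so D is fixed at \(F_{i} / n\) and the remainder is shared among A, B and C.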
References
Balakrishnan R, Suresh KK (2001). “Strategies for developing core collections of safflower (Carthamus tinctorius L.) germplasm-part II. Using an information measure for obtaining a core sample with pre-determined diversity levels for several descriptors simultaneously.” Indian Journal of Plant Genetic Resources, 14(1), 32–42.
Examples
suppressPackageStartupMessages(library(EvaluateCore))
# Get data from EvaluateCore
data("cassava_EC", package = "EvaluateCore")
# Data of 'Colour of unexpanded apical leaves' qualitative trait
CUAL <- as.factor(cassava_EC$CUAL)
# Raw relative frequencies
prop.adj(CUAL, method = "none")
#>   Dark green        Green Green purple  Light green       Purple
#>  0.190617577  0.001187648  0.527909739  0.028503563  0.251781473
# Square root-proportion transformed relative frequencies
prop.adj(CUAL, method = "sqrt")
#>   Dark green        Green Green purple  Light green       Purple
#>   0.23369439   0.01844636   0.38890779   0.09036836   0.26858311
# Log-frequency transformed relative frequencies
prop.adj(CUAL, method = "log")
#>   Dark green        Green Green purple  Light green       Purple
#>   0.24903071   0.02990848   0.29298448   0.16703764   0.26103868