Significant variations of n-gram models exist, for example smoothing (n-grams that never appear in the corpus are assigned a small non-zero probability instead of zero). See the sketch below.
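A minimal sketch of one common smoothing scheme, add-one (Laplace) smoothing, for a bigram model. The toy corpus here is purely illustrative; any tokenized text would work the same way.

```python
from collections import Counter

# Illustrative toy corpus (assumption: any tokenized text would do).
corpus = ["the", "cat", "sat", "on", "the", "mat"]
vocab = set(corpus)

# Count observed bigrams and the contexts (previous words) they condition on.
bigram_counts = Counter(zip(corpus, corpus[1:]))
context_counts = Counter(corpus[:-1])

def smoothed_prob(prev_word, word):
    """P(word | prev_word) with add-one smoothing: unseen bigrams
    receive a small non-zero probability instead of zero."""
    return (bigram_counts[(prev_word, word)] + 1) / (
        context_counts[prev_word] + len(vocab)
    )

print(smoothed_prob("the", "cat"))  # seen bigram
print(smoothed_prob("cat", "mat"))  # unseen bigram, still non-zero
```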
Seems like you need some notion of uncertainty when estimating from data: you would obtain counts of word transitions, then estimate transition probabilities from those counts, with uncertainty attached to each estimated transition probability. A sketch of this idea follows.
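A minimal sketch of this count-then-estimate idea, using a symmetric Dirichlet prior to attach uncertainty to each transition probability. The corpus and the prior strength `alpha` are illustrative assumptions, not anything fixed by the discussion above.

```python
from collections import Counter, defaultdict

# Illustrative toy corpus and prior strength (both assumptions).
corpus = ["the", "cat", "sat", "on", "the", "mat"]
vocab = sorted(set(corpus))
alpha = 1.0  # pseudo-count per word in the symmetric Dirichlet prior

# Count observed word transitions: transitions[prev][next] = count.
transitions = defaultdict(Counter)
for prev_word, word in zip(corpus, corpus[1:]):
    transitions[prev_word][word] += 1

def transition_estimate(prev_word, word):
    """Posterior mean and standard deviation of P(word | prev_word)
    under a symmetric Dirichlet prior; the standard deviation reflects
    how few transitions out of prev_word were actually observed."""
    counts = transitions[prev_word]
    total = sum(counts.values()) + alpha * len(vocab)
    mean = (counts[word] + alpha) / total
    var = mean * (1 - mean) / (total + 1)  # Beta marginal variance
    return mean, var ** 0.5

mean, std = transition_estimate("the", "cat")
print(f"P(cat | the) ≈ {mean:.2f} ± {std:.2f}")
```

With more data, the counts dominate the prior and the standard deviation shrinks, which matches the intuition that the uncertainty on a transition probability comes from having only finitely many observed transitions.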
Do you have different n-gram models per text?