Construct necessary information for insertions and deletions (indels) that will be used in create_variants.

indels(rate, max_length = 10, a = NULL, rel_rates = NULL)

Arguments

rate

Single number specifying the overall indel rate among all lengths.

max_length

Maximum length of indels. Defaults to 10.

a

Extra parameter necessary for generating rates from a Lavalette distribution. See Details for more info. Defaults to NULL.

rel_rates

A numeric vector of relative rates for each indel length from 1 to the maximum length. If provided, all arguments other than rate are ignored, and the maximum length is the length of this vector. Defaults to NULL.

Value

An indel_rates object, which is a wrapper around a numeric vector. You can access the rates vector for an indel_rates object x by running as.numeric(x).

Details

Both insertions and deletions require the rate parameter, which specifies the overall insertion/deletion rate among all lengths. The rate parameter is ultimately combined with a vector of relative rates among the different lengths of insertions/deletions from 1 to the maximum possible length. There are three different ways to specify/generate relative-rate values.

  1. Assume that rates are proportional to exp(-L) for indel length L from 1 to the maximum length (Albers et al. 2011). This method is used when only the following arguments are provided (a and rel_rates are both left as NULL):

    • rate

    • max_length

  2. Generate relative rates from a Lavalette distribution (Fletcher and Yang 2009), where the rate for length L is proportional to {L * max_length / (max_length - L + 1)}^(-a). This method is used when the following arguments are provided (and rel_rates is left as NULL):

    • rate

    • max_length

    • a

  3. Directly specify values by providing a numeric vector of relative rates for each insertion/deletion length from 1 to the maximum length. This method will be used if the following arguments are provided:

    • rate

    • rel_rates
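The three options above can be sketched in plain R without calling indels itself. This is only an illustration: the helper names rel_rates_exp, rel_rates_lavalette, and scale_rates are hypothetical and not part of the package. The page says only that rate is "combined" with the relative rates, so the sketch assumes the relative rates are normalized to sum to 1 and then multiplied by rate.

```r
# Hypothetical helpers for illustration only; not part of the package.

# Method 1: relative rates proportional to exp(-L) for L = 1..max_length.
rel_rates_exp <- function(max_length) {
  L <- seq_len(max_length)
  exp(-L)
}

# Method 2: Lavalette distribution, using the formula given above:
# {L * max_length / (max_length - L + 1)}^(-a)
rel_rates_lavalette <- function(max_length, a) {
  L <- seq_len(max_length)
  (L * max_length / (max_length - L + 1))^(-a)
}

# ASSUMPTION: the overall rate is split among lengths in proportion to the
# relative rates, so the per-length rates sum to `rate`. The real function
# may combine them differently.
scale_rates <- function(rate, rel_rates) {
  rate * rel_rates / sum(rel_rates)
}

r1 <- scale_rates(0.1, rel_rates_exp(5))                  # method 1
r2 <- scale_rates(0.2, rel_rates_lavalette(10, a = 1.1))  # method 2
r3 <- scale_rates(0.2, rep(1, 100))                       # method 3: direct
```

Under this assumption, each resulting vector sums to the overall rate, and both the exponential and Lavalette options (for a > 0) give shorter indels higher rates than longer ones.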

References

Albers, C. A., G. Lunter, D. G. MacArthur, G. McVean, W. H. Ouwehand, and R. Durbin. 2011. Dindel: accurate indel calls from short-read data. Genome Research 21:961–973.

Fletcher, W., and Z. Yang. 2009. INDELible: a flexible simulator of biological sequence evolution. Molecular Biology and Evolution 26:1879–1888.

Examples

# relative rates are proportional to `exp(-L)` for indel
# length `L` from 1 to 5:
indel_rates1 <- indels(0.1, max_length = 5)

# relative rates are proportional to Lavalette distribution
# for length from 1 to 10:
indel_rates2 <- indels(0.2, max_length = 10, a = 1.1)

# relative rates are all the same for lengths from 1 to 100:
indel_rates3 <- indels(0.2, rel_rates = rep(1, 100))