Construct the information on insertions and deletions (indels) that is used by create_haplotypes.

indels(rate, max_length = 10, a = NULL, rel_rates = NULL)

Arguments

rate

Single number specifying the overall indel rate among all lengths.

max_length

Maximum length of indels. Defaults to 10.

a

Extra parameter necessary for generating rates from a Lavalette distribution. See Details for more info. Defaults to NULL.

rel_rates

A numeric vector of relative rates for each indel length from 1 to the maximum length. If provided, all arguments other than rate are ignored. Defaults to NULL.

Value

An indel_info object, which is an instance of an R6 class that wraps the information needed by the create_haplotypes function. The object does not allow the user to directly manipulate the information inside; to change it, create a new object using this function. You can use the rates() method to view the indel rates by size.

Details

All indels require the rate parameter, which specifies the overall indel rate among all lengths. The rate parameter is ultimately combined with a vector of relative rates among the different lengths of indels from 1 to the maximum possible length. There are three different ways to specify or generate the relative-rate values.

  1. Assume that rates are proportional to exp(-L) for indel length L from 1 to the maximum length (Albers et al. 2011). This method is used when a and rel_rates are both left as NULL, so that only the following arguments apply:

    • rate

    • max_length

  2. Generate relative rates from a Lavalette distribution (Fletcher and Yang 2009), where the rate for length L is proportional to {L * max_length / (max_length - L + 1)}^(-a). This method will be used if the following arguments are provided:

    • rate

    • max_length

    • a

  3. Directly specify values by providing a numeric vector of relative rates for each insertion/deletion length from 1 to the maximum length. This method will be used if the following arguments are provided:

    • rate

    • rel_rates
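The three methods above can be sketched in base R. This is only an illustration, not the package's implementation: the helper names rel_rates_exp, rel_rates_lavalette, and scale_rates are hypothetical, and the sketch assumes that the overall rate is divided among lengths in proportion to the relative rates.

```r
# Hypothetical helpers (not part of the package) that mirror the three
# ways of specifying relative rates described in Details.

# Method 1: relative rates proportional to exp(-L) for L = 1..max_length
rel_rates_exp <- function(max_length) {
    L <- seq_len(max_length)
    exp(-L)
}

# Method 2: Lavalette distribution, proportional to
# {L * max_length / (max_length - L + 1)}^(-a)
rel_rates_lavalette <- function(max_length, a) {
    L <- seq_len(max_length)
    (L * max_length / (max_length - L + 1))^(-a)
}

# ASSUMPTION: the overall rate is split among lengths in proportion to
# the relative rates, so the per-length rates sum to `rate`.
scale_rates <- function(rate, rel_rates) {
    rate * rel_rates / sum(rel_rates)
}

rates1 <- scale_rates(0.1, rel_rates_exp(5))                  # method 1
rates2 <- scale_rates(0.2, rel_rates_lavalette(10, a = 1.1))  # method 2
rates3 <- scale_rates(0.2, rep(1, 100))                       # method 3
```

Under that assumption, each vector sums to the rate that was supplied, and for methods 1 and 2 the rate decreases as indel length increases.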

References

Albers, C. A., G. Lunter, D. G. MacArthur, G. McVean, W. H. Ouwehand, and R. Durbin. 2011. Dindel: accurate indel calls from short-read data. Genome Research 21:961–973.

Fletcher, W., and Z. Yang. 2009. INDELible: a flexible simulator of biological sequence evolution. Molecular Biology and Evolution 26:1879–1888.

Examples

# relative rates are proportional to `exp(-L)` for indel
# length `L` from 1 to 5:
indel_rates1 <- indels(0.1, max_length = 5)

# relative rates follow a Lavalette distribution
# for lengths from 1 to 10:
indel_rates2 <- indels(0.2, max_length = 10, a = 1.1)

# relative rates are all the same for lengths from 1 to 100:
indel_rates3 <- indels(0.2, rel_rates = rep(1, 100))