Creates variant haplotypes from a reference genome using one of several methods. See haps_functions for the available methods.

create_haplotypes(
  reference,
  haps_info,
  sub = NULL,
  ins = NULL,
  del = NULL,
  epsilon = 0.03,
  n_threads = 1,
  show_progress = FALSE
)

Arguments

reference

A ref_genome object from which to generate haplotypes. This argument is required.

haps_info

Output from one of the haps_functions, which organize higher-level information for use here. See haps_functions for brief descriptions and links to each method. If this argument is NULL, all arguments other than reference are ignored and a haplotypes object containing no haplotypes is returned. This is intended for cases where you'd like to add mutations manually: you can use the blank object's add_haps method to add haplotypes yourself.
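The NULL case can be sketched as follows. This is a minimal illustration, assuming the jackalope package is loaded and that add_haps accepts a character vector of new haplotype names; the names "hap_a" and "hap_b" are arbitrary.

```r
library(jackalope)

# Small reference genome: 3 chromosomes with a mean length of 100 bp
ref <- create_genome(3, 100)

# haps_info = NULL: every argument other than `reference` is ignored,
# and the returned haplotypes object contains no haplotypes
blank <- create_haplotypes(ref, NULL)

# Add two haplotypes by name
# (assumption: add_haps takes a character vector of new names)
blank$add_haps(c("hap_a", "hap_b"))
```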

sub

Output from one of the sub_models functions that organizes information for the substitution models. See sub_models for more information on these models and their required parameters. This argument is ignored if you are using a VCF file to create haplotypes. Passing NULL to this argument results in no substitutions. Defaults to NULL.

ins

Output from the indels function that specifies rates of insertions by length. This argument is ignored if you are using a VCF file to create haplotypes. Passing NULL to this argument results in no insertions. Defaults to NULL.

del

Output from the indels function that specifies rates of deletions by length. This argument is ignored if you are using a VCF file to create haplotypes. Passing NULL to this argument results in no deletions. Defaults to NULL.

epsilon

Error control parameter for the "tau-leaping" approximation to the Doob–Gillespie algorithm, which is used for the indel portion of the simulations. Smaller values give a closer approximation; larger values are faster but less exact. Values must be >= 0 and < 1. If epsilon is 0, the simulation reverts to the exact Doob–Gillespie algorithm. For more information on the approximation, see Cao et al. (2006) and Wieder et al. (2011), listed below. Defaults to 0.03.
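A hedged sketch of how epsilon might be varied. It assumes that indels() accepts rate and max_length arguments; the rate and length values are arbitrary and chosen only for illustration.

```r
library(jackalope)

ref <- create_genome(3, 1000)

# Insertion and deletion rates by length
# (assumption: indels() takes `rate` and `max_length`; values are arbitrary)
ins <- indels(rate = 0.01, max_length = 5)
del <- indels(rate = 0.01, max_length = 5)

# Default tau-leaping approximation for the indel portion
h_approx <- create_haplotypes(ref, haps_theta(0.001, 5), sub_JC69(0.1),
                              ins = ins, del = del, epsilon = 0.03)

# epsilon = 0 reverts to the exact Doob-Gillespie algorithm
h_exact <- create_haplotypes(ref, haps_theta(0.001, 5), sub_JC69(0.1),
                             ins = ins, del = del, epsilon = 0)
```

Because epsilon only affects the indel portion of the simulations, it should matter only when ins or del is supplied.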

n_threads

Number of threads to use for parallel processing. This argument is ignored if OpenMP is not enabled. Threads are spread across chromosomes, so there is no benefit to supplying more threads than there are chromosomes in the reference genome. Defaults to 1.
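One way to keep the thread count within the number of chromosomes is sketched below. It assumes the ref_genome object exposes an n_chroms() method returning the number of chromosomes; the cap of 2 threads is arbitrary.

```r
library(jackalope)

ref <- create_genome(4, 1000)

# Cap the thread count at the number of chromosomes, since threads
# are spread across chromosomes
# (assumption: ref$n_chroms() returns the number of chromosomes)
threads <- min(2, ref$n_chroms())

haps <- create_haplotypes(ref, haps_theta(0.001, 5), sub_JC69(0.1),
                          n_threads = threads)
```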

show_progress

Logical indicating whether to show a progress bar during processing. Defaults to FALSE.

Value

A haplotypes object.

References

Cao, Y., D. T. Gillespie, and L. R. Petzold. 2006. Efficient step size selection for the tau-leaping simulation method. The Journal of Chemical Physics 124(4): 044109.

Doob, J. L. 1942. Topics in the theory of Markoff chains. Transactions of the American Mathematical Society 52(1): 37–64.

Gillespie, D. T. 1976. A general method for numerically simulating the stochastic time evolution of coupled chemical reactions. Journal of Computational Physics 22(4): 403–434.

Wieder, N., R. H. Fink, and F. von Wegner. 2011. Exact and approximate stochastic simulation of intracellular calcium dynamics. Journal of Biomedicine and Biotechnology 2011: 572492.

Examples

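# Reference genome: 10 chromosomes with a mean length of 1000 bp
r <- create_genome(10, 1000)
# Haplotypes from a random 5-tip coalescent tree, with JC69 substitutions
v_phylo <- create_haplotypes(r, haps_phylo(ape::rcoal(5)), sub_JC69(0.1))
# 5 haplotypes from a theta value of 0.001, with K80 substitutions
v_theta <- create_haplotypes(r, haps_theta(0.001, 5), sub_K80(0.1, 0.2))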