Note: This class wraps a pointer to a C++ object, so do NOT change fields in this class directly. Doing so will cause your R session to do bad things. (Ever seen the bomb popup on RStudio? Manually mess with these fields and you surely will.) For safe ways of manipulating the variants' information, see the "Methods" section.

variants

## Format

An R6Class generator object

## Value

An object of class variants.

## Fields

genome

An externalptr to a C++ object storing the sequences representing the genome.

reference

An externalptr to a C++ object storing the sequences representing the reference genome. This field is private, so you can't view it, but I'm listing it here so that I can provide a few extra notes about it:

• This point is the most important. Since it's a pointer, any changes you make to the reference genome that it points to will also show up in the variants object. For example, if you make a variants object named V based on an existing ref_genome object named R, then you merge sequences in R, V will now have merged sequences. If you've already started adding mutations to V, then the indices used to store those mutations will no longer be accurate, so anything you do with V later will crash your R session or produce errors.

• If a ref_genome object is used to create a variants object, deleting the ref_genome object won't cause issues with the variants object. However, the variants class doesn't provide methods to edit sequences, so only remove the ref_genome object when you're done editing the reference genome.
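The two notes above can be illustrated without the package. The sketch below is only an analogy: it uses a plain R environment as a stand-in for the externalptr, since both are passed by reference rather than copied. The names `ref`, `make_toy_variants`, and `v` are hypothetical and are not part of this class.

```r
# Toy stand-in for a reference genome. An environment is passed by reference
# in R, much like an externalptr, so assigning it never copies its contents.
ref <- new.env()
ref$seqs <- c(seq1 = "AACCGGTT", seq2 = "TTGGCCAA")

# Hypothetical helper: a "variants-like" list that keeps a handle to the
# reference rather than a copy of its sequences.
make_toy_variants <- function(reference) {
    list(reference = reference,
         n_seqs = function() length(reference$seqs))
}

v <- make_toy_variants(ref)
n_before <- v$n_seqs()

# Merging the sequences in `ref` is also visible through `v`, because both
# handles lead to the same underlying object.
ref$seqs <- c(merged = paste(ref$seqs, collapse = ""))
n_after <- v$n_seqs()

# Removing the name `ref` does not destroy the data `v` still points to.
rm(ref)
n_after_rm <- v$n_seqs()
```

In this toy version the only symptom is a changed sequence count. In the real class, stored mutation positions would also stop matching the merged sequences, which is why editing the reference after creating variants is unsafe.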

## Methods

Viewing information:

n_seqs()

View the number of sequences.

n_vars()

View the number of variants.

sizes(var_ind)

View vector of sequence sizes for a given variant.

seq_names()

View vector of sequence names.

var_names()

View vector of variant names.

sequence(var_ind, seq_ind)

View a sequence string based on indices for the sequence (seq_ind) and variant (var_ind).

gc_prop(var_ind, seq_ind, start, end)

View the GC proportion for a range within a variant sequence.

nt_prop(nt, var_ind, seq_ind, start, end)

View the proportion of a range within a variant sequence that is of nucleotide nt.
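To make the quantity that gc_prop and nt_prop report concrete, here is a minimal base R sketch on a plain character string. It is not the package's implementation: `toy_nt_prop` is a hypothetical helper, and it assumes that `start` and `end` are 1-based and inclusive.

```r
# Toy helper (not the package's implementation): proportion of characters in
# positions start..end of `dna` that match any nucleotide in `nts`.
# Assumes 1-based, inclusive `start` and `end`.
toy_nt_prop <- function(dna, nts, start, end) {
    stopifnot(start >= 1, end <= nchar(dna), start <= end)
    chars <- strsplit(substr(dna, start, end), "")[[1]]
    mean(chars %in% nts)
}

dna <- "AACCGGTTAT"
gc_range <- toy_nt_prop(dna, c("G", "C"), 3, 6)  # range is "CCGG"
a_whole  <- toy_nt_prop(dna, "A", 1, 10)         # 3 of 10 positions are "A"
```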

Editing information:

set_names(new_names)

Set names for all variants. new_names is a character vector of the new names, and it must be the same length as the number of variants.

add_vars(new_names)

Add new, named variant(s) to the object. These variants will have no mutations. If you want to add new variants with mutations, either re-run create_variants or use the dup_vars method to duplicate existing variants.

dup_vars(var_names, new_names = NULL)

Duplicate existing variant(s) based on their name(s). You can optionally specify the names of the duplicates (using new_names). Otherwise, their names are auto-generated.

rm_vars(var_names)

Remove one or more variants based on names in the var_names vector.

add_sub(var_ind, seq_ind, pos, nt)

Manually add a substitution for a given variant (var_ind), sequence (seq_ind), and position (pos). The reference nucleotide will be changed to nt, which should be a single character.

add_ins(var_ind, seq_ind, pos, nts)

Manually add an insertion for a given variant (var_ind), sequence (seq_ind), and position (pos). The nucleotide(s) nts will be inserted after the designated position.

add_del(var_ind, seq_ind, pos, n_nts)

Manually add a deletion for a given variant (var_ind), sequence (seq_ind), and position (pos). The designated number of nucleotides to delete (n_nts) will be deleted starting at pos, unless pos is near the sequence end and doesn't have n_nts nucleotides to remove; it simply stops at the sequence end in this case.
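The positional rules described for add_sub, add_ins, and add_del can be sketched on a plain character string. These helpers are hypothetical and are not how the class stores mutations; they only mirror the descriptions above, under the assumption that positions are 1-based.

```r
# Toy helpers (not the package's implementation) acting on a plain string.
# Positions are assumed to be 1-based.

# Substitution: replace the nucleotide at `pos` with the single character `nt`.
toy_sub <- function(dna, pos, nt) {
    stopifnot(nchar(nt) == 1, pos >= 1, pos <= nchar(dna))
    substr(dna, pos, pos) <- nt
    dna
}

# Insertion: place `nts` immediately after position `pos`.
toy_ins <- function(dna, pos, nts) {
    stopifnot(pos >= 1, pos <= nchar(dna))
    paste0(substr(dna, 1, pos), nts, substr(dna, pos + 1, nchar(dna)))
}

# Deletion: remove up to `n_nts` nucleotides starting at `pos`, stopping at
# the end of the sequence if fewer than `n_nts` remain.
toy_del <- function(dna, pos, n_nts) {
    stopifnot(pos >= 1, pos <= nchar(dna), n_nts >= 1)
    last <- min(pos + n_nts - 1, nchar(dna))
    paste0(substr(dna, 1, pos - 1), substr(dna, last + 1, nchar(dna)))
}

dna <- "AACCGGTT"
after_sub     <- toy_sub(dna, 3, "T")    # "AATCGGTT"
after_ins     <- toy_ins(dna, 2, "GGG")  # "AAGGGCCGGTT"
after_del     <- toy_del(dna, 3, 2)      # "AAGGTT"
after_del_end <- toy_del(dna, 7, 5)      # "AACCGG" (only 2 nucleotides left to delete)
```

The last call shows the truncation case: a request to delete 5 nucleotides at position 7 of an 8-nucleotide sequence removes only the final 2.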

## See Also

create_variants