Conditionals and Logic
Included in PopGen.jl are some functions to help discriminate your data a bit more. Like all conditionalsℹ️, these return true
or false
depending on the test.
By Julia's design, conditionals on missing
values return missing
. For
indexing and subsetting reasons, ishom
and ishet
return false
on
missing values, however unexported methods _ishom
and _ishet
return
missing
as per the standard convention. These unexported methods are critical
for calculations where missing
values should absolutely not be treated as false
.
Homozygosity
ishom(locus::Genotype)
ishom(locus::GenoArray)
This will return true
if a genotype is homozygous. The GenoaArray
version
just broadcasts it across all the genotypes in an array, returning a vector
of true
or false
.
Example
julia> cats = @nancycats ;
julia> subset = cats[1:10, :genotype]
10-element Vector{Union{Missing, Tuple{Int16, Int16}}}:
missing
missing
(135, 143)
(133, 135)
(133, 135)
(135, 143)
(135, 135)
(135, 143)
(137, 143)
(135, 135)
julia> ishom(subset[3])
false
julia> ishom(subset)
10-element Vector{Bool}:
0
0
0
0
0
0
1
0
0
1
If you want to avoid missing
genotypes, you can use skipmissing
to ignore them. This also works for ishet
.
julia> ishom(skipmissing(subset))
8-element Vector{Bool}:
0
0
0
0
1
0
0
1
Another option is to check if a genotype is homozygous for a specific allele. To
do that, we exploit Julia's multiple dispatch and use ishom
again, but with
different arguments.
ishom(geno::Genotype, allele::Signed)
ishom(genos::GenoArray, allele::Signed)
This will return true
if the geno
(or genos
) is/are homozygous for the specified allele
. Notice that when we query a genotype that doesn't contain that allele, it returns false
.
Example
julia> ishom(subset[3], 135)
false
julia> ishom(subset[10], 135)
true
julia> ishom(subset[9], 135)
false
julia> ishom(subset, 135)
10-element Vector{Bool}:
0
0
0
0
0
0
1
0
0
1
Heterozygosity
ishet(locus::Genotype)
ishet(locus::GenoArray)
This is the exact opposite of ishom
, returning true
if the genotype (or genotypes) is/are heterozygous.
Example
julia> cats = @nancycats ;
julia> subset = cats[1:10, :genotype]
10-element Vector{Union{Missing, Tuple{Int16, Int16}}}:
missing
missing
(135, 143)
(133, 135)
(133, 135)
(135, 143)
(135, 135)
(135, 143)
(137, 143)
(135, 135)
julia> ishet(subset[3])
true
julia> ishet(subset)
10-element Vector{Bool}:
0
0
1
1
1
1
0
1
1
0
We likewise have the option to check if a locus is heterozygous for a specific
allele. To do that, we again exploit Julia's multiple dispatch and use ishet
,
but with different arguments.
ishet(geno::Genotype, allele::Signed)
ishet(genos::GenoArray, allele::Signed)
This will return true
if the geno
(or genos
) is/are heterozygous for the specified allele
. Notice that when we query a genotype that doesn't contain that allele, it returns false
.
Example
julia> ishet(subset[3], 135)
true
julia> ishet(subset[10], 135)
false
julia> ishet(subset[9], 135)
true
julia> ishet(subset, 135)
10-element Vector{Bool}:
0
0
1
1
1
1
0
1
0
0
Biallelic data
Some analyses are restricted to work exclusively on biallelic data (e.g. Hudson pairwise FST), so it may help to know if things are biallelic.
isbiallelic(data::GenoArray)
Returns true
if the GenoArray
is biallelic, false
if not.
isbiallelic(data::PopData)
Returns true
all the loci in the PopData
are biallelic, false
if not.
Example
julia> sharks = @gulfsharks ;
julia> isbiallelic(sharks)
false
julia> drop_multiallelic!(sharks)
[ Info: Removing 258 multialleic loci
julia> isbiallelic(sharks)
true