What's New
v0.8
v0.8.0
⚠️ Breaking Changes
- dropped support for Julia <v1.6
  - the new `count` methods we use aren't supported by previous versions
✨ New Things
- k-means clustering using k-means++ via `kmeans()`
- Principal Component Analysis via `pca()`
- Jason and Pavel both completed their doctorates!
⚡ Improvements
- allele matrix creation methods (internal) have >50% fewer LOC and are >2x faster!
🐛 Bug fixes
- none, I think
v0.7
v0.7.0
⚠️ Breaking Changes
- all PopData functionality moved to separate package PopGenCore.jl
- PopGen.jl reexports functions from PopGenCore.jl for familiar functionality
- `.meta` and `.loci` have been renamed `.metadata` and `.genodata`
- `.metadata` is no longer a DataFrame and instead a new `PopDataInfo` type
- latitude and longitude columns no longer mandatory and omitted in cases where not used
⚡ Improvements
- PopData can be indexed like a DataFrame and it will return a brand new PopData!
- `PopDataInfo` is self-updating (in most cases)
- preliminary plink .bed file importing (not writing, yet)
- `show` for PopData is now smaller and cleaner
- INFO text for data importing now elides absolute paths longer than the terminal width
- VCF/BCF support no longer lazy loaded
- VCF/BCF uses VariantCallFormat.jl now (instead of GeneticVariation.jl)
- VCF/BCF uses different GZ library for decompression
- `try...catch` blocks used in file IO for faster file reading and fewer lines of code
🐛 Bug fixes
- super slow structure io on larger files
v0.6
v0.6.5
Summary of changes from 0.6.2-5
⚡ Improvements
- Bumped compat for `DataFrames.jl` to 1.0
- VCF/BCF importing now naturally sorts the loci names
  - includes new `NaturalSort.jl` dep
- file import `INFO` text consolidated somewhat
- `PopData` show method information consolidated somewhat
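For readers unfamiliar with natural sorting, the following is a minimal Base-only sketch of the idea. It is not how NaturalSort.jl or PopGen.jl implement it: `chunks` and `natural_lt_sketch` are hypothetical names, and the locus names are made up for illustration.

```julia
# Split a name into alternating digit and non-digit chunks,
# e.g. "locus_10" becomes ["locus_", "10"].
chunks(s::AbstractString) = [m.match for m in eachmatch(r"\d+|\D+", s)]

# Hypothetical "less than" for natural ordering: digit chunks are compared
# as integers, everything else as plain strings.
function natural_lt_sketch(a::AbstractString, b::AbstractString)
    for (x, y) in zip(chunks(a), chunks(b))
        x == y && continue
        if all(isdigit, x) && all(isdigit, y)
            nx, ny = parse(Int, x), parse(Int, y)
            nx == ny || return nx < ny
        else
            return x < y
        end
    end
    # All shared chunks matched; the shorter name sorts first.
    return length(a) < length(b)
end

loci = ["locus_10", "locus_2", "locus_1"]
sort(loci)                          # lexicographic: locus_1, locus_10, locus_2
sort(loci, lt = natural_lt_sketch)  # natural:       locus_1, locus_2, locus_10
```

With lexicographic sorting, `locus_10` lands before `locus_2`; comparing the numeric chunks as integers restores the order most people expect.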
🐛 Bug fixes
- Hudson fst works as expected
- `isbiallelic` returns correct answer when used on `PopData` object
- [internal] conditional functions moved to `Conditionals.jl` file
- `keep` and `keep!` are exported
v0.6.1
✨ New Features
- Hudson pairwise FST & Permutation
  - adds the Hudson et al. 1992 method
- `isbiallelic`
  - adds boolean test if a `PopData` object has only biallelic loci
  - adds boolean test if a `GenoArray` is biallelic
- `drop_multiallelic`
  - mutating and non-mutating methods to remove non-biallelic loci from a `PopData` object
⚡ Improvements
- `drop_monomorphic` now uses the same logic as `drop_multiallelic`, which should make it faster and leaner
🐛 Bug fixes
- `vcf` and `bcf` kwarg `rename_loci` now consistent in functions and docstrings
- `generate_meta` now uses a comprehension rather than deprecated `map(fn, groupeddataframe)` method
v0.5
v0.5.2
⚡ Improvements
- a rewrite of nei and weir-cockerham fst methods to be matrix-based (faster!)
✨ New Features
- fully implements permutation testing for both pairwise fst methods
- adds method for `avg_allele_freq` to accommodate new `pairwise_nei`
- extends `pairwise_fst` to include `iterations` keyword to activate permutation testing
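As a rough illustration of what permutation testing means here, the sketch below shuffles group labels and asks how often a statistic is at least as extreme as the observed one. Everything in it is an assumption for illustration: `stat_sketch` is a hypothetical stand-in (a difference in group means, not an FST estimator), `permutation_pvalue_sketch` is not part of PopGen.jl, and the data are made up.

```julia
using Random      # stdlib: shuffle, MersenneTwister
using Statistics  # stdlib: mean

# Hypothetical statistic standing in for a pairwise FST estimator.
stat_sketch(vals, labels) = abs(mean(vals[labels .== 1]) - mean(vals[labels .== 2]))

function permutation_pvalue_sketch(vals, labels; iterations::Int = 999,
                                   rng = MersenneTwister(1))
    observed = stat_sketch(vals, labels)
    # Count permutations whose statistic is at least as extreme as the observed one.
    hits = count(_ -> stat_sketch(vals, shuffle(rng, labels)) >= observed, 1:iterations)
    # Add one to both counts so the observed labelling is included.
    return (hits + 1) / (iterations + 1)
end

vals   = [1.0, 1.1, 0.9, 5.0, 5.1, 4.9]  # made-up measurements
labels = [1, 1, 1, 2, 2, 2]              # made-up group membership
p = permutation_pvalue_sketch(vals, labels)
```

More iterations give a finer-grained p-value at the cost of recomputing the statistic that many times.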
v0.5.1
✨ New features
- `pairwise_fst` is now available for Weir & Cockerham (1984) and Nei (1987) methods
  - check out the benchmarks!
- added `skipinf`, `skipnan`, and `skipinfnan` methods (unexported) to `Utils.jl`
- dropped `safemean` because the skip___ methods are a lot faster and slimmer
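To give a sense of what such skip helpers do, here is a minimal sketch built on lazy filtering from Base. The `_sketch` functions are hypothetical stand-ins; the unexported PopGen.jl versions may be implemented differently.

```julia
using Statistics  # stdlib: mean

# Hypothetical stand-ins: lazily skip unwanted values without copying the data.
skipnan_sketch(itr)    = Iterators.filter(!isnan, itr)
skipinf_sketch(itr)    = Iterators.filter(!isinf, itr)
skipinfnan_sketch(itr) = Iterators.filter(isfinite, itr)

vals = [0.25, NaN, 0.75, Inf]
mean(skipinfnan_sketch(vals))  # averages only 0.25 and 0.75
```

Because the filters are lazy, reductions such as `mean` or `sum` can consume them directly without allocating an intermediate array.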
v0.5.0
This release fixes a critical bug in all the file importing functions that returned nothing when dropping monomorphic loci. Other changes include
- Dropping `JLD2.jl` support due to its version-to-version instability. Two fewer dependencies!
  - As a result, `datasets()` now reads nancycats and gulfsharks directly from their source data files
  - To maintain all of the information, gulfsharks reads from a delimited file rather than a genepop file
- `populations` type signature and behavior has been changed:
  - the default returns an array of the unique population names
  - the keyword `listall::Bool` has been replaced with `counts::Bool`, which now returns a dataframe of the number of samples per population
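The two behaviors described for `populations` can be pictured with plain Julia collections. This is only a sketch under stated assumptions: the labels are made up, `population_counts_sketch` is a hypothetical helper, and it returns a `Dict` rather than the dataframe the real function returns.

```julia
# Made-up per-sample population labels.
pops = ["Cape Canaveral", "Georgia", "Georgia", "Cape Canaveral", "Georgia"]

# Analogous to the default behavior: the unique population names.
unique(pops)

# Analogous to counts = true: number of samples per population.
function population_counts_sketch(labels)
    counts = Dict{String,Int}()
    for p in labels
        counts[p] = get(counts, p, 0) + 1
    end
    return counts
end

population_counts_sketch(pops)
```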
v0.4
v0.4.5
This release builds off of 0.4.3 and does a better job with the VCF loading logic. Along with that, vcf and bcf exist in the namespace before loading in GeneticVariation.jl, meaning you can always view the docstrings. These stripped-down methods in the namespace will give helpful errors to remind you to load in GeneticVariation.jl and/or GZip.jl.
v0.4.3
This release fixes and simplifies the under-the-hood `allele_freq`, `geno_freq`, and `geno_count_xxx` functions. They are faster now, and they infer types, so the output has the expected type behavior.
Changes
- You no longer need to import both `GeneticVariation.jl` and `GZip.jl` to have the `vcf` and `bcf` functions work. The reason is that if your file isn't gzipped, then why load in an unnecessary library? Therefore, if your file is gzipped, you'll need to load in `GZip.jl` too; otherwise you just need `GeneticVariation.jl`.
- `avg_allele_freq` now has a different method, where the second positional argument is `power`, which will raise the calculated frequencies to the given value (default = `1`). This simplifies having to do things like square the values of the resulting `Dict`.
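The effect of a `power` argument can be sketched as follows. This is an illustration, not PopGen.jl's code: `avg_freq_sketch` is a hypothetical helper, the frequencies are made up, and treating an allele that is absent from a population as frequency 0 is an assumption.

```julia
# Hypothetical helper: average each allele's frequency across populations,
# then raise the average to `power`.
function avg_freq_sketch(freqs::Vector{Dict{Int,Float64}}, power::Int = 1)
    n = length(freqs)
    sums = Dict{Int,Float64}()
    for f in freqs, (allele, freq) in f
        # Alleles missing from a population contribute 0 to the sum (assumption).
        sums[allele] = get(sums, allele, 0.0) + freq
    end
    return Dict(allele => (total / n)^power for (allele, total) in sums)
end

# Made-up allele frequencies for two populations at one locus.
freqs = [Dict(1 => 0.5, 2 => 0.5), Dict(1 => 1.0)]

avg_freq_sketch(freqs)     # allele 1 => 0.75,   allele 2 => 0.25
avg_freq_sketch(freqs, 2)  # allele 1 => 0.5625, allele 2 => 0.0625
```

Passing `2` as the second argument replaces a separate pass over the resulting `Dict` to square each value.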
v0.4.0
This release adds a slew of relatedness estimators, which can be bootstrapped and are performed in parallel. Paired with release of PopGenSims.jl v0.0.2.
⚠️ Breaking changes
- CategoricalArrays replaced with PooledArrays
- VCF/BCF now lazy load and require `GeneticVariation.jl` and `GZip.jl` separately
✨ New features
- relatedness estimators (see blog for tutorial)
- internal functions:
  - `loci_dataframe`
  - `loci_matrix`
  - `nonmissings`
  - `pairwise_pairs`
- `pairwiseidentical()` to compare percent identical loci
- `phase()` method
- Structure/fastStructure file IO
⚡ Improvements
- some internal function locations moved around (housekeeping)
- `nancycats()` and `gulfsharks()` are being phased out in favor of `@nancycats` and `@gulfsharks`. (You will see a deprecation warning.)
- documentation (Docusaurus) upgrades
- edit button now correctly works on blog posts
- B/VCF reader rewritten (see docs)