Genepop
Import a genepop file as PopData
genepop(infile; kwargs...)
Arguments
infile::String
: path to genepop file, in quotes
Keyword Arguments
digits::Integer
: number of digits denoting each allele (default: 3)
popsep::String
: word that separates populations in infile (default: "POP")
diploid::Bool
: whether samples are diploid for parsing optimizations (default: true)
silent::Bool
: whether to suppress the file information printed during import (default: false)
By default, the file reader assigns numbers as population IDs (as Strings) in order of appearance in the genepop file. Use the populations! function to rename these with your own population IDs.
Example
julia> wasp_data = genepop("/data/wasp_hive.gen", digits = 3, popsep = "POP")
Format
Files must follow standard Genepop formatting:
- First line is a comment (and skipped)
- Loci are listed after the first line, either one per line without commas or in a single comma-separated row
- A line with a particular and consistent keyword must delimit populations
  - it must be the same word each time and not a unique population name
- File is tab delimited or space delimited, but not both

genepop w/loci stacked vertically:

Wasp populations in New York
Locus1
Locus2
Locus3
POP
Oneida_01, 250230 564568 110100
Oneida_02, 252238 568558 100120
Oneida_03, 254230 564558 090100
POP
Newcomb_01, 254230 564558 080100
Newcomb_02, 000230 564558 090080
Newcomb_03, 254230 000000 090100
Newcomb_04, 254230 564000 090120

genepop w/loci stacked horizontally:

Wasp populations in New York
Locus1,Locus2,Locus3
POP
Oneida_01, 250230 564568 110100
Oneida_02, 252238 568558 100120
Oneida_03, 254230 564558 090100
POP
Newcomb_01, 254230 564558 080100
Newcomb_02, 000230 564558 090080
Newcomb_03, 254230 000000 090100
Newcomb_04, 254230 564000 090120
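
To make the formatting rules concrete, here is a minimal sketch in Base Julia of how a file laid out like the examples above could be read. It is not PopGen.jl's actual parser (which relies on CSV.jl). The names parse_genotype and read_genepop_lines are hypothetical. The sketch assumes that the popsep keyword matches exactly, that populations are numbered in order of appearance, and that a genotype consisting only of zeros is missing.

```julia
# Hypothetical helper (not part of PopGen.jl): split one genotype string
# into integer alleles, `digits` characters per allele.
# Assumption: a genotype made up entirely of zeros is treated as missing.
function parse_genotype(geno::AbstractString, digits::Integer)
    all(==('0'), geno) && return missing
    n = length(geno) ÷ digits
    return ntuple(i -> parse(Int, geno[(i - 1) * digits + 1 : i * digits]), n)
end

# Hypothetical reader: takes the lines of a genepop file and returns the
# locus names, sample names, population IDs, and genotypes.
function read_genepop_lines(lines; digits = 3, popsep = "POP")
    loci = String[]
    names = String[]
    populations = String[]
    genotypes = Vector{Any}[]
    popcount = 0
    for line in lines[2:end]              # the first line is a comment
        line = strip(line)
        isempty(line) && continue
        if line == popsep                 # assumption: exact keyword match
            popcount += 1
        elseif popcount == 0
            # locus names: one per line, or a single comma-separated row
            append!(loci, strip.(split(line, ',')))
        else
            name, genos = split(line, ','; limit = 2)
            push!(names, strip(name))
            push!(populations, string(popcount))   # numeric IDs stored as Strings
            push!(genotypes, [parse_genotype(g, digits) for g in split(genos)])
        end
    end
    return (loci = loci, names = names, populations = populations, genotypes = genotypes)
end

lines = [
    "Wasp populations in New York",
    "Locus1,Locus2,Locus3",
    "POP",
    "Oneida_01, 250230 564568 110100",
    "POP",
    "Newcomb_03, 254230 000000 090100",
]
res = read_genepop_lines(lines)
res.populations        # ["1", "2"]
res.genotypes[1][1]    # (250, 230)
res.genotypes[2][2]    # missing
```

The same function handles the vertical layout, because each locus line simply contributes one name before the first POP line is reached.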
Writing to a Genepop file
All file writing options can be performed using PopGen.write(), which calls genepop when writing to a Genepop file.
genepop(data::PopData; filename::String = "output.gen", digits::Int = 3, format::String = "vertical", miss::Int = 0)
Writes a PopData object to a Genepop-formatted file.
Arguments
data
: the PopData object you wish to convert to a Genepop file
Keyword arguments
filename::String
: the output filename
digits::Integer
: how many digits to format each allele
- e.g. digits = 3 will turn (1, 2) into 001002
format::String
: the way loci should be formatted
- vertically ("v" or "vertical")
- horizontally ("h" or "horizontal")
- isolation-by-distance ("ibd") where each sample is a population with coordinate data prepended
miss::Integer
: how you would like missing values written
- 0 : as a genotype represented as a number of zeroes equal to digits × ploidy, like 000000 (default)
- -9 : as a single value -9
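
The digits and miss options can be illustrated with a short sketch in Base Julia. The function format_genotype and its ploidy keyword are hypothetical, introduced only for illustration. They are not part of PopGen.jl, and the package's own writer may be implemented differently.

```julia
# Hypothetical helper (not part of PopGen.jl): format one genotype the way
# the `digits` and `miss` keyword arguments are described.
function format_genotype(geno, digits::Integer; ploidy::Integer = 2, miss::Integer = 0)
    if ismissing(geno)
        # miss = -9 writes a single -9; miss = 0 writes digits × ploidy zeros
        return miss == -9 ? "-9" : "0"^(digits * ploidy)
    end
    # left-pad every allele with zeros to `digits` characters and concatenate
    return join(lpad(string(allele), digits, '0') for allele in geno)
end

format_genotype((1, 2), 3)               # "001002"
format_genotype(missing, 3)              # "000000"
format_genotype(missing, 3; miss = -9)   # "-9"
```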
Example
julia> cats = @nancycats;
julia> fewer_cats = omit(cats, name = samplenames(cats)[1:10]);
julia> genepop(fewer_cats, filename = "filtered_nancycats.gen", digits = 3, format = "h")
Acknowledgements
The original implementations of the importing parser were written using only Base Julia. The speed was fantastic, but the memory footprint seemed unusually high (~650 MB of RAM to parse gulfsharks, which is only 3.2 MB in size). Thanks to the efforts of the CSV.jl developers, we now use that package to preserve the speed while reducing the memory footprint considerably.