Skip to main content

Read/Write data

PopGen.jl (via PopGenCore.jl) provides a handful of file readers and writers with which to create PopData. Each of the file types have their own file reader denoted simply by the file type:

File FormatExtensionsDocstringReadWrite
delimited.csv, .txt, .tsv?delimited👍👍
genepop.gen, .genepop?genepop👍👍
structure.str, .structure?structure👍👍
plink (ped).ped?plink👍👍
variant call format (vcf).vcf, .vcf.gz?vcf👍
variant call format (bcf).bcf, .bcf.gz?bcf👍
baypass.baypass?baypass👍

Read in data

You're encouraged to use these functions, but PopGen.jl also provides you with an all-encompassing wrapper PopGen.read(). Given the ubiquity of the function name, it is not exported. If using PopGenCore.jl directly, you will need to call it with PopGenCore.read.

Windows users

Make sure to change the backslashes \ in your file path to double-backslashes \\ or forward slashes /

monomorphic loci

By default, the file reading methods drop monomorphic loci and inform you which were removed, so do not be alarmed if the number of loci in your PopData is different from the source data. You can disable this behavior with the argument allow_monomorphic = true. Monomorphic loci are removed by default because they can give spurious/misleading results for some analyses, such as kinship estimators.

PopGen.read()

PopGen.read(infile::String; kwargs...)

where infile is a String of your filename (in quotes) and kwargs are the corresponding keyword arguments associated with your file type. The function PopGen.read() uses all the same keyword arguments as do the commands specific to their file types, therefore you should have a look at those commands (usually the defaults suffice).

PopGen.read() infers the file type from the file extension, so for it to work properly your file must end with the extensions permitted below (case insensitive). If you're feeling particularly rebellious and your file does not conform to these extensions (such as a genepop file with a .gen.final.v2.seriously extension), then feel free to use the specific file importers, since they use the same exact syntax, there is zero difference in performance, and ignore file extensions. Ultimately, what crazy extensions you give your files is your business, and we love that about you.

Examples

salmon = PopGen.read("o_mykiss.gen", digits = 3, popsep = "SALMON")
ginko = PopGen.read("g_biloba.txt", delim = ",", digits = 2, silent = true)

Write PopData to file

PopGen.write(data::PopData; filename::String, kwargs...)

To complement PopGen.read(), PopGen.jl offers PopGen.write(), which writes PopData to different file formats. Like the file reader, PopGen.write() will infer the correct output file type from the output filename's extensions. Given the ubiquity of the function name, it is not exported. If using PopGenCore.jl directly, you will need to call it with PopGenCore.write.

Additional keyword arguments kwargs... are specific to the intended file type, and are listed in the docstrings of the specific file writer with the format ?filetype like shown above. For example, to find the appropriate keywords for a conversion to Genepop format, call up the docstring to genepop with ?genepop.

Examples

cats = @nancycats;
fewer_cats = omit(cats, names = samplenames(cats)[1:10]);
PopGen.write(fewer_cats, filename = "filtered_nancycats.gen", digits = 3, format = "horizontal")
PopGen.write(fewer_cats, filename = "filtered_nancycats.txt", digits = 4, format = "tidy", delim = ",")