Population data
Needless to say, population information is crucial for population genetics, so there are several handy tools for dealing with that information.
If you need to see the population for every sample, then use sampleinfo(popdata) to retrieve the dataframe containing sample information.
View unique population names
populations(data::PopData; counts::Bool = false)
If counts = false, returns a Vector of the unique populations present in the PopData. If counts = true, returns a
table of sample counts per population.
Return a vector of the unique populations.
julia> populations(sharks)
7-element Array{String,1}:
 "CapeCanaveral"
 "Georgia"
 "SouthCarolina"
 "FloridaKeys"
 "MideastGulf"
 "NortheastGulf"
 "SoutheastGulf"
Return a table of the populations and their counts.
julia> populations(sharks, counts = true)
7×2 DataFrame
 Row │ population     count 
     │ String         Int64 
─────┼──────────────────────
   1 │ CapeCanaveral     21
   2 │ Georgia           30
   3 │ SouthCarolina     28
   4 │ FloridaKeys       65
   5 │ MideastGulf       28
   6 │ NortheastGulf     20
   7 │ SoutheastGulf     20
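To make the two return modes concrete, here is a minimal sketch in plain Julia of what they report. It works on a toy vector of population labels rather than a PopData object, and it is not the package's implementation. The `pops` vector and the first-appearance ordering are assumptions made for illustration.

```julia
# Toy stand-in for the population column of sampleinfo(popdata); not real data.
pops = ["CapeCanaveral", "Georgia", "Georgia",
        "FloridaKeys", "FloridaKeys", "FloridaKeys"]

# Unique names, in order of first appearance (the counts = false view).
unique_pops = unique(pops)

# Number of samples carrying each name (the counts = true view).
pop_counts = [count(==(p), pops) for p in unique_pops]

# Print a small two-column summary.
for (p, n) in zip(unique_pops, pop_counts)
    println(rpad(p, 15), n)
end
```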
Rename populations
populations!(data::PopData, rename::Dict)
populations!(data::PopData, rename::Vector{String})
populations!(data::PopData, samples::Vector{String}, populations::Vector{String})
There are a handful of methods for altering PopData population names, so you can use whichever you find most convenient. Each method starts with populations!() and differs only in its inputs, which is why this function has an uncharacteristically long docstring. All of the populations! methods edit PopData in place. The three approaches are:
- with a Dictionary
- with a Vector of names
- reassign by sample
populations!(data::PopData, rename::Dict)
Recommended for renaming existing populations.
Rename existing population IDs of PopData using a Dict of
population_name => replacement pairs.
# create a dictionary of name conversions
julia> new_popnames = Dict(
  "CapeCanaveral" => "Atlantic",
  "Georgia" => "Atlantic",
  "SouthCarolina" => "Atlantic",
  "FloridaKeys" => "Gulf",
  "MideastGulf" => "Gulf",
  "NortheastGulf" => "Gulf",
  "SoutheastGulf" => "Gulf"
);  
julia> populations!(sharks, new_popnames)
julia> populations(sharks, counts = true)
2×2 DataFrame
 Row │ population  count 
     │ String      Int64 
─────┼───────────────────
   1 │ Atlantic       79
   2 │ Gulf          133
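The Dict-based renaming is essentially a lookup-and-replace over the population labels. The sketch below shows that idea on a plain vector using Base's `replace`. It is an illustration, not the internals of populations!. The `"Unsampled"` label is hypothetical, and leaving names that are absent from the Dict unchanged is an assumption of this sketch.

```julia
# Toy population labels; "Unsampled" is a hypothetical name absent from the Dict.
pops = ["CapeCanaveral", "Georgia", "FloridaKeys", "Georgia", "Unsampled"]

new_popnames = Dict(
    "CapeCanaveral" => "Atlantic",
    "Georgia"       => "Atlantic",
    "FloridaKeys"   => "Gulf",
)

# Splatting the Dict passes each old => new pair to Base.replace.
# Labels without a matching key are left as they are.
renamed = replace(pops, new_popnames...)

println(renamed)
```

Because several old names can map to the same new name, this is also a convenient way to merge populations, as in the Atlantic/Gulf example above.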
The Vector method can be used in two ways, labeled (1) and (2) below. Both are available, but the Dict method is recommended instead of (1), and the reassign-by-sample method is recommended instead of (2).
populations!(data::PopData, rename::Vector{String})
- (1) rename the unique populations
 - condition: length(rename) == length(unique(populations))
 - rename is a vector of new unique population names in the order that they appear in sampleinfo(popdata).
- (2) rename the population association for every sample
 - condition: length(rename) == length(samplenames(data))
 - rename is a vector of new population names for the samples in the order that they appear in sampleinfo(popdata).
julia> new_popnames = ["Atlantic", "Atlantic", "Atlantic", "Gulf", "Gulf", "Gulf", "Gulf"] ;
julia> populations!(sharks, new_popnames)
julia> populations(sharks, counts = true)
2×2 DataFrame
 Row │ population  count 
     │ String      Int64 
─────┼───────────────────
   1 │ Atlantic       79
   2 │ Gulf          133
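The two length conditions for the Vector method can be sketched as a small dispatch on `length(rename)`. The function below is a hypothetical helper (`rename_by_vector` is not part of the package) that operates on a plain vector of labels and returns a new vector instead of editing PopData in place.

```julia
# Hypothetical helper illustrating the two length conditions; not package code.
function rename_by_vector(pops::Vector{String}, rename::Vector{String})
    current = unique(pops)  # unique names in order of first appearance
    if length(rename) == length(current)
        # One new name per unique population.
        lookup = Dict(zip(current, rename))
        return [lookup[p] for p in pops]
    elseif length(rename) == length(pops)
        # One new name per sample.
        return copy(rename)
    else
        throw(ArgumentError("rename must have one entry per unique population or one per sample"))
    end
end

pops = ["A", "A", "B", "C"]
per_population = rename_by_vector(pops, ["X", "Y", "Y"])
per_sample     = rename_by_vector(pops, ["P", "Q", "R", "S"])
println(per_population)
println(per_sample)
```

Both uses depend on ordering, which is easy to get wrong; that is the practical reason the Dict and reassign-by-sample methods are the recommended ones.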
Recommended for assigning population IDs for specific samples.
populations!(data::PopData, samples::Vector{String}, populations::Vector{String})
You may want to outright overwrite the current population information. This is particularly useful when importing from VCF format, where population information is not provided. This method replaces the population names of the given samples regardless of what they currently are.
This method takes a vector of sample names and a vector of the new population names of the samples in the order that they appear in the name-vector.
# creating a vector of sample names
julia> ch_names = samplenames(sharks)[1:5]
5-element Array{String,1}:
 "cc_001"
 "cc_002"
 "cc_003"
 "cc_005"
 "cc_007"
We then create the vector of these samples' new population names:
julia> popnames = ["North Cape", "North Cape", "North Cape", "South Cape", "South Cape"] ;
Now we can combine them with populations! to rename the first 5 Cape Canaveral samples.
julia> populations!(sharks, ch_names, popnames)
julia> sampleinfo(sharks)[1:6,:]
6×5 DataFrame
 Row │ name     population     ploidy  longitude  latitude 
     │ String7  String         Int8    Float64    Float64  
─────┼─────────────────────────────────────────────────────
   1 │ cc_001   North Cape          2    28.3062  -80.5993
   2 │ cc_002   North Cape          2    28.3079  -80.5995
   3 │ cc_003   North Cape          2    28.3023  -80.5996
   4 │ cc_005   South Cape          2    28.6123  -80.4225
   5 │ cc_007   South Cape          2    27.8666  -80.3578
   6 │ cc_008   CapeCanaveral       2    27.8666  -80.3579
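Reassigning by sample amounts to matching each sample name to its row and overwriting only that row's population. The sketch below shows this on two parallel plain vectors. It is an illustration under that assumption, not the package's implementation, and the `sample_ids` and `pops` vectors are hypothetical stand-ins for the name and population columns.

```julia
# Hypothetical stand-ins for the name and population columns of sampleinfo.
sample_ids = ["cc_001", "cc_002", "cc_003", "cc_005", "cc_007", "cc_008"]
pops       = fill("CapeCanaveral", length(sample_ids))

ch_names = sample_ids[1:5]
popnames = ["North Cape", "North Cape", "North Cape", "South Cape", "South Cape"]

# Pair each sample with its new population and overwrite only that row.
for (s, p) in zip(ch_names, popnames)
    i = findfirst(==(s), sample_ids)
    i === nothing && error("sample $s not found")
    pops[i] = p
end

println(pops)
```

Samples that are not listed, such as cc_008 here, keep their existing population, which matches the sixth row of the output above.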