Cluster SNPs with HDBSCAN and identify haplotypes
Source: R/run_hdbscan_haplotyping.R
run_hdbscan_haplotyping() performs HDBSCAN clustering of SNPs in a region of interest to identify marker groups. Individuals are then classified by haplotype combination, based on shared combinations of marker group alleles. The function returns a comprehensive haplotyping object (HapObject), which can be visualized with reference to phenotype and metadata using crosshap_viz(). When calling crosshap_viz() on this object, set epsilon to 1 as a dummy value.
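The workflow described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: it assumes the crosshap package is installed and that the example objects vcf, LD, pheno and metadata bundled with it are available (substitute your own inputs otherwise). The values MGmin = 30 and minHap = 9 and the name hap_obj are illustrative choices only.

```r
# Sketch only: assumes crosshap is installed and that its bundled example
# objects (vcf, LD, pheno, metadata) are available after library(crosshap).
if (requireNamespace("crosshap", quietly = TRUE)) {
  library(crosshap)

  # Cluster SNPs into marker groups and build the haplotyping object.
  # MGmin and minHap are illustrative values, not recommendations.
  hap_obj <- run_hdbscan_haplotyping(
    vcf = vcf,
    LD = LD,
    pheno = pheno,
    MGmin = 30,
    minHap = 9,
    hetmiss_as = "allele",
    metadata = metadata,
    keep_outliers = FALSE
  )

  # Visualize the result; epsilon = 1 is the dummy value noted above.
  crosshap_viz(hap_obj, epsilon = 1)
}
```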
Usage
run_hdbscan_haplotyping(
  vcf,
  LD,
  pheno,
  MGmin,
  minHap = 5,
  hetmiss_as = "allele",
  metadata = NULL,
  keep_outliers = FALSE
)
Arguments
- vcf
- Input VCF for region of interest. 
- LD
- Pairwise correlation matrix of SNPs in region (e.g. from PLINK). 
- pheno
- Input numeric phenotype data for each individual. 
- MGmin
- Minimum number of SNPs in a marker group; used as the MinPts parameter for HDBSCAN. 
- minHap
- Minimum number of individuals in a haplotype combination. 
- hetmiss_as
- How to treat heterozygous-missing SNPs ('./N'). If hetmiss_as = "allele", they are recoded as 'N/N'; if hetmiss_as = "miss", the site is recoded as missing. 
- metadata
- Metadata input (optional). 
- keep_outliers
- When FALSE, marker group smoothing is performed to remove outliers.
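To make the hetmiss_as rule concrete, the sketch below applies it to a toy genotype vector in base R. The helper recode_hetmiss() is hypothetical and is not part of crosshap; it only mimics the documented behaviour. It assumes unphased genotypes separated by "/" and assumes that "recoded as missing" means the genotype "./.".

```r
# Hypothetical helper (not a crosshap function): mimics the documented
# hetmiss_as rule on a character vector of unphased genotypes.
recode_hetmiss <- function(gt, hetmiss_as = c("allele", "miss")) {
  hetmiss_as <- match.arg(hetmiss_as)

  # Genotypes with exactly one missing allele: "./N" or "N/."
  is_hetmiss <- grepl("^(\\./[^.]+|[^.]+/\\.)$", gt)

  # The single called allele in each heterozygous-missing genotype
  known <- gsub("[./]", "", gt[is_hetmiss])

  gt[is_hetmiss] <- if (hetmiss_as == "allele") {
    paste(known, known, sep = "/")  # './N' becomes 'N/N'
  } else {
    "./."                           # assumption: missing is written './.'
  }
  gt
}

gt <- c("0/0", "./1", "0/.", "./.", "0/1")
as_allele <- recode_hetmiss(gt, "allele")  # "0/0" "1/1" "0/0" "./." "0/1"
as_miss   <- recode_hetmiss(gt, "miss")    # "0/0" "./." "./." "./." "0/1"
```

Fully missing genotypes and fully called genotypes are left unchanged under both settings; only the './N' and 'N/.' cases differ.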