Format and hierarchically cluster a data.frame. If hclust cannot be computed normally (usually because some features have no samples in common), the matrix is padded with zeros so that the distance can still be calculated.

hclust_order(
  df,
  feature_pk,
  sample_pk,
  value_var,
  cluster_dim,
  distance_measure = "dist",
  hclust_method = "ward.D2"
)

Arguments

df

data.frame to cluster

feature_pk

variable uniquely defining a feature (row)

sample_pk

variable uniquely defining a sample

value_var

variable holding the abundance values to use with hclust

cluster_dim

dimension(s) to cluster: rows, columns, or both (given as a string, e.g. "rows")

distance_measure

measure to use for computing dissimilarity:

corr

Pearson correlation

dist

Euclidean distance

hclust_method

method from stats::hclust to use for clustering
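
As a rough illustration of how distance_measure and hclust_method relate, the sketch below computes both kinds of dissimilarity on a small feature-by-sample matrix in base R and clusters one of them. It is not the package's internal code: `mat` is a made-up matrix, and converting Pearson correlation to a dissimilarity as 1 - r is an assumption.

```r
# Hypothetical 4-feature x 5-sample matrix (rows = features, columns = samples)
set.seed(1)
mat <- matrix(
  rnorm(20), nrow = 4,
  dimnames = list(paste0("f", 1:4), paste0("s", 1:5))
)

# "dist": Euclidean distance between rows
d_euclid <- stats::dist(mat, method = "euclidean")

# "corr": Pearson correlation between rows, turned into a dissimilarity.
# Using 1 - r is an assumption, not taken from the package source.
r <- stats::cor(t(mat), method = "pearson")
d_corr <- stats::as.dist(1 - r)

# Cluster with the default method named in the usage above
hc <- stats::hclust(d_euclid, method = "ward.D2")

# Row labels in dendrogram order
row_order <- hc$labels[hc$order]
```

Any method accepted by stats::hclust (for example "average" or "complete") could be substituted for "ward.D2" in the same call.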

Value

a list with an element for the rows and/or columns (e.g. $rows), each holding the labels in hierarchically clustered order

Examples


library(dplyr)

# noise is drawn at random without a seed, so the order below will differ between runs
df <- tidyr::crossing(letters = LETTERS, numbers = 1:10) %>%
  mutate(noise = rnorm(n()))
hclust_order(df, "letters", "numbers", "noise", "rows")
#> $rows
#>  [1] "K" "M" "P" "G" "Q" "T" "F" "J" "C" "R" "D" "H" "S" "N" "U" "B" "O" "X" "L"
#> [20] "I" "V" "W" "Z" "Y" "A" "E"
#>
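
The zero-padding fallback mentioned in the description can be sketched in base R as follows. This is only an illustration under stated assumptions, not the function's actual implementation: the data.frame `long` and its column names are hypothetical, and checking the distance object for NA values is one plausible way to detect that hclust would fail.

```r
# Hypothetical long-format data: feature "c" shares no samples with "a" or "b"
long <- data.frame(
  feature = c("a", "a", "b", "b", "c", "c"),
  sample  = c("s1", "s2", "s1", "s2", "s3", "s4"),
  value   = c(1.0, 2.0, 1.5, 2.5, 3.0, 0.5)
)

# Reshape to a feature x sample matrix; missing combinations become NA
mat <- tapply(long$value, list(long$feature, long$sample), FUN = sum)

# Distances between features with no shared samples come out as NA,
# which stats::hclust cannot handle
d <- stats::dist(mat)
if (anyNA(d)) {
  # Assumed fallback: pad the missing cells with zeros and recompute
  mat[is.na(mat)] <- 0
  d <- stats::dist(mat)
}

hc <- stats::hclust(d, method = "ward.D2")

# Feature labels in clustered order
feature_order <- hc$labels[hc$order]
```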