gneiss.cluster.rank_linkage
gneiss.cluster.rank_linkage(r, method='average')
Hierarchical clustering on feature ranks.
The hierarchy is built from the rank values of the features, given an input vector r of ranks. The distance between two features \(x\) and \(y\) is defined as
\[d(x, y) = (r(x) - r(y))^2\]
where \(r(x)\) is the rank of feature \(x\). Hierarchical clustering is then performed using \(d(x, y)\) as the distance metric.
This can be useful for constructing principal balances.
Parameters:
- r (pd.Series) – Continuous vector representing some ordering of the features in X.
- method (str) – Clustering method. (default='average')
Returns: Tree for constructing principal balances.
Return type: skbio.TreeNode
Examples
>>> import pandas as pd
>>> from gneiss.cluster import rank_linkage
>>> ranks = pd.Series([1, 2, 4, 5],
...                   index=['o1', 'o2', 'o3', 'o4'])
>>> tree = rank_linkage(ranks)
>>> print(tree.ascii_art())
                    /-o1
          /y1------|
         |          \-o2
-y0------|
         |          /-o3
          \y2------|
                    \-o4
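The distance \(d(x, y)\) defined above can also be worked through by hand. The following is a minimal sketch, not gneiss's own implementation: it assumes NumPy and SciPy are available, builds the squared rank differences for the same four ranks, and passes them to SciPy's average linkage. The variable names (`dist`, `condensed`, `Z`, `labels`) are illustrative only.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Same ranks as in the example above (o1, o2, o3, o4).
ranks = np.array([1.0, 2.0, 4.0, 5.0])

# Pairwise distances d(x, y) = (r(x) - r(y))^2 as a square matrix.
dist = (ranks[:, None] - ranks[None, :]) ** 2

# SciPy's linkage expects the condensed (upper-triangle) form.
condensed = squareform(dist)

# Average linkage on the precomputed distances (assumed to mirror
# method='average'; this is a sketch, not the library's code path).
Z = linkage(condensed, method='average')

# Cut the hierarchy into two clusters to inspect the top-level split.
labels = fcluster(Z, t=2, criterion='maxclust')
print(labels)
```

Under these assumptions, o1 and o2 fall in one cluster and o3 and o4 in the other, which matches the top-level split of the tree printed above.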