gneiss.cluster.proportional_linkage
gneiss.cluster.proportional_linkage(X, method='ward')
Principal balance analysis using hierarchical clustering based on proportionality.
The hierarchy is built from the proportionality between each pair of features. Specifically, the proportionality between two features \(x\) and \(y\) is measured by
\[p(x, y) = \operatorname{var}\left(\ln \frac{x}{y}\right)\]
where the variance is taken across samples. If \(p(x, y)\) is very small, then \(x\) and \(y\) are said to be highly proportional. Hierarchical clustering is then performed using this proportionality as the distance metric.
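To make the measure concrete, here is a minimal sketch of \(p(x, y)\) using only the Python standard library. The helper name `proportionality` and the feature values are hypothetical, and the use of population variance is an assumption; gneiss may normalize differently.

```python
import math
from itertools import combinations
from statistics import pvariance


def proportionality(x, y):
    """Variance of ln(x / y) across samples (population variance assumed)."""
    return pvariance([math.log(a / b) for a, b in zip(x, y)])


# Hypothetical strictly positive abundances: one list per feature, one entry per sample.
features = {
    "a": [1.0, 2.0, 4.0, 8.0],
    "b": [3.0, 6.0, 12.0, 24.0],  # b = 3 * a, so the ratio a / b is constant
    "c": [8.0, 1.0, 4.0, 2.0],    # ratio to a changes from sample to sample
}

# Pairwise proportionality for every unordered pair of features.
p = {
    (u, v): proportionality(features[u], features[v])
    for u, v in combinations(features, 2)
}
```

A constant ratio gives a log ratio with zero variance, so `p[("a", "b")]` is zero, while `p[("a", "c")]` is clearly positive. The logarithm also requires strictly positive values, so zeros in a table have to be replaced before the measure can be computed.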
Parameters:
- X (pd.DataFrame) – Contingency table where the samples are rows and the features are columns.
- method (str) – Clustering method (default: 'ward').
Returns: Tree generated from principal balance analysis.
Return type: skbio.TreeNode
References
[1] Pawlowsky-Glahn V, Egozcue JJ, Tolosana-Delgado R. Principal Balances (2011).
Examples
>>> import pandas as pd
>>> from gneiss.cluster import proportional_linkage
>>> table = pd.DataFrame([[1, 1, 0, 0, 0],
...                       [0, 1, 1, 0, 0],
...                       [0, 0, 1, 1, 0],
...                       [0, 0, 0, 1, 1]],
...                      columns=['s1', 's2', 's3', 's4', 's5'],
...                      index=['o1', 'o2', 'o3', 'o4']).T
>>> tree = proportional_linkage(table + 0.1)
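For readers who want to see the two steps described above (log-ratio variance, then hierarchical clustering) spelled out, the following is a rough sketch with NumPy and SciPy. It is not the gneiss implementation: the input values are hypothetical, population variance is assumed, and the result is a SciPy linkage matrix, not an skbio.TreeNode.

```python
import numpy as np
from scipy.cluster.hierarchy import fcluster, linkage
from scipy.spatial.distance import squareform

# Hypothetical strictly positive table: 4 samples (rows) x 4 features (columns).
# Column 1 is 3x column 0, and column 3 is half of column 2.
X = np.array([
    [1.0,  3.0, 8.0, 4.0],
    [2.0,  6.0, 1.0, 0.5],
    [4.0, 12.0, 4.0, 2.0],
    [8.0, 24.0, 2.0, 1.0],
])

log_x = np.log(X)
# log_ratios[s, i, j] = ln(x_i / x_j) in sample s
log_ratios = log_x[:, :, None] - log_x[:, None, :]
# dm[i, j] = var(ln(x_i / x_j)) across samples (population variance assumed)
dm = log_ratios.var(axis=0)
np.fill_diagonal(dm, 0.0)

# Cluster the features using the proportionality matrix as the distance.
lm = linkage(squareform(dm, checks=False), method="ward")

# Cut the hierarchy into two flat clusters to inspect the grouping.
labels = fcluster(lm, t=2, criterion="maxclust")
```

In this sketch the proportional pairs (columns 0 and 1, columns 2 and 3) merge first, so cutting the hierarchy into two clusters separates them into two groups.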