Metrics

tmplot.entropy(phi: ndarray, max_probs: bool = False)

Renyi entropy calculation routine [1].

Renyi entropy can be used to estimate the optimal number of topics: fit several models with different numbers of topics and choose the one with the lowest Renyi entropy.

Parameters:

  • phi (np.ndarray) – Topics vs words probabilities matrix (T x W).

  • max_probs (bool) – Use maximum probabilities of terms per topic instead of all probability values. Defaults to False.

Returns:

renyi (double) – Renyi entropy value.

References

Example

>>> import tmplot as tmp
>>> # Preprocessing step
>>> # ...
>>> # Model fitting step
>>> # model = ...
>>> # phi = ...
>>> # Entropy calculation
>>> entropy = tmp.entropy(phi)
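
The selection procedure described above (fit several models, keep the one with minimal Renyi entropy) could be sketched as follows. `pick_num_topics` is a hypothetical helper written for illustration and is not part of tmplot. It assumes you have already fitted the models yourself and collected their phi matrices in a mapping keyed by number of topics. The scoring function is passed in as an argument, so `tmp.entropy` can be supplied directly.

```python
from typing import Any, Callable, Dict, Mapping, Tuple


def pick_num_topics(
    phis: Mapping[int, Any],
    entropy_fn: Callable[[Any], float],
) -> Tuple[int, Dict[int, float]]:
    """Return the topic count with the lowest entropy, plus all scores.

    Hypothetical helper, not a tmplot function.
    `phis` maps a candidate number of topics to the phi matrix (T x W)
    of a model fitted with that many topics.
    `entropy_fn` is the scoring function, for example `tmplot.entropy`.
    """
    if not phis:
        raise ValueError("phis must contain at least one fitted model")
    # Score every candidate model once.
    scores = {n_topics: float(entropy_fn(phi)) for n_topics, phi in phis.items()}
    # Pick the candidate whose entropy is minimal.
    best = min(scores, key=scores.get)
    return best, scores


# Usage sketch (phi_5, phi_10, phi_20 come from models you fitted):
# import tmplot as tmp
# best, scores = pick_num_topics({5: phi_5, 10: phi_10, 20: phi_20}, tmp.entropy)
```

Returning the full score dictionary alongside the best value makes it easy to inspect how entropy changes across the candidate topic counts rather than relying on the minimum alone.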