label_entropy
- label_entropy(domain, label_name, population=None, include_boundaries=None, exclude_boundaries=None, boundary_exclude_distance=0)
Calculate the Shannon entropy of a specified label within a domain. The Shannon entropy measures the uncertainty or disorder of a system as the expected information content of an observation drawn from the probability distribution describing the data. Label values are normalised to form probabilities pk, and the Shannon entropy is calculated as H = -sum(pk * log(pk)). If the label is continuous, values are binned to form a histogram representing a discrete sample from a continuous probability space. If the label is categorical, pk is the probability of observing category k in the domain.
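The formula H = -sum(pk * log(pk)) can be illustrated with a minimal NumPy sketch. This is not the library's implementation: the helper names `categorical_entropy` and `continuous_entropy` are hypothetical, and the natural logarithm and the default of 10 bins are assumptions, since the description above does not state the log base or the binning rule.

```python
import numpy as np


def categorical_entropy(labels):
    """Shannon entropy of categorical labels (natural log assumed)."""
    # Count how often each category occurs, then normalise to probabilities pk.
    _, counts = np.unique(np.asarray(labels), return_counts=True)
    pk = counts / counts.sum()
    return float(-np.sum(pk * np.log(pk)))


def continuous_entropy(values, bins=10):
    """Shannon entropy of continuous labels after histogram binning.

    The number of bins is an illustrative assumption.
    """
    counts, _ = np.histogram(np.asarray(values, dtype=float), bins=bins)
    # Empty bins contribute nothing to the sum (p * log(p) -> 0 as p -> 0),
    # so drop them to avoid evaluating log(0).
    pk = counts[counts > 0] / counts.sum()
    return float(-np.sum(pk * np.log(pk)))


# Two equally likely categories give the maximum entropy for two outcomes, log(2).
h_two = categorical_entropy(["a", "a", "b", "b"])

# A single category carries no uncertainty, so the entropy is zero.
h_one = categorical_entropy(["a", "a", "a", "a"])
```

Under these assumptions, entropy is lowest when one category or bin dominates and highest when observations are spread evenly across categories or bins.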
- Parameters:
- domain : object
A muspan domain object.
- label_name : str
The name of the label to calculate entropy for.
- population : array-like, optional
The population of objects to consider, by default None.
- include_boundaries : list, np.ndarray, query-like or None, optional
Boundaries to include or None to include all.
- exclude_boundaries : list, np.ndarray, query-like or None, optional
Boundaries to exclude or None to exclude none.
- boundary_exclude_distance : int, optional
Distance to exclude from boundaries, by default 0.
- Returns:
- float
The entropy of the specified label.
- Raises:
- ValueError
If the specified label is not found in the domain labels.