quadrat_correlation_matrix

quadrat_correlation_matrix(domain, label_name, population=None, include_boundaries=None, exclude_boundaries=None, boundary_exclude_distance=0, regions_collection_name=None, regions_label_name=None, region_method='quadrats', region_kwargs={}, keep_regions_as_objects=False, verbose=False, max_iters_for_singular_matrix_encounter=1000, alpha=0.05, transform_counts=None, low_observation_bound=15, visualise_output=False, visualise_correlation_matrix_kwargs={})

The Quadrat Correlation Matrix (QCM) identifies statistically significant co-occurrences between objects with different labels (e.g., cell types) within regions of interest (ROIs). It compares the observed correlation between pairs of labels against the expected correlation under random label assignment.

Regions can be generated using quadrats, hexagonal grids, or pre-defined labels in the domain (specified via the regions_label_name parameter). If no regions are provided, quadrats with a specified side length (via region_kwargs) will be generated.
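The comparison described above can be illustrated with a minimal NumPy sketch. It is not muspan's implementation: it assumes square quadrats on a rectangular extent, plain Pearson correlation of per-quadrat counts, and a naive label shuffle as the null model. The helper names `quadrat_counts` and `ses_by_relabelling` are hypothetical.

```python
import numpy as np


def quadrat_counts(points, labels, categories, side, extent):
    """Count the objects of each category inside each square quadrat."""
    n_x = int(np.ceil(extent[0] / side))
    n_y = int(np.ceil(extent[1] / side))
    # Grid indices of the quadrat containing each point (clamped to the grid).
    ix = np.minimum((points[:, 0] // side).astype(int), n_x - 1)
    iy = np.minimum((points[:, 1] // side).astype(int), n_y - 1)
    quadrat_id = ix * n_y + iy
    counts = np.zeros((n_x * n_y, len(categories)))
    for j, category in enumerate(categories):
        counts[:, j] = np.bincount(quadrat_id[labels == category], minlength=n_x * n_y)
    return counts


def ses_by_relabelling(points, labels, side, extent, n_shuffles=200, seed=0):
    """Observed count correlation compared with correlations under shuffled labels."""
    rng = np.random.default_rng(seed)
    categories = np.unique(labels)

    def correlation(current_labels):
        counts = quadrat_counts(points, current_labels, categories, side, extent)
        return np.corrcoef(counts, rowvar=False)

    observed = correlation(labels)
    # Null distribution: same positions, labels randomly reassigned (assumed null model).
    null = np.stack([correlation(rng.permutation(labels)) for _ in range(n_shuffles)])
    with np.errstate(divide="ignore", invalid="ignore"):
        ses = (observed - null.mean(axis=0)) / null.std(axis=0)
    np.fill_diagonal(ses, np.nan)  # self-correlation carries no information
    return ses, categories


# Synthetic data: A and B share the left half of the extent, C occupies the right half.
demo_rng = np.random.default_rng(1)
xy_a = np.column_stack([demo_rng.uniform(0, 50, 300), demo_rng.uniform(0, 100, 300)])
xy_b = np.column_stack([demo_rng.uniform(0, 50, 300), demo_rng.uniform(0, 100, 300)])
xy_c = np.column_stack([demo_rng.uniform(50, 100, 600), demo_rng.uniform(0, 100, 600)])
points = np.vstack([xy_a, xy_b, xy_c])
labels = np.array(["A"] * 300 + ["B"] * 300 + ["C"] * 600)

ses, categories = ses_by_relabelling(points, labels, side=10, extent=(100, 100))
print(categories)
print(np.round(ses, 1))
```

In this synthetic layout the A-B entry comes out positive (co-occurrence) and the A-C entry negative (segregation), matching the sign convention of the SES matrix described under Returns.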

Parameters:
domain : object

A muspan domain object.

label_name : str

The name of the label used to calculate the QCM.

population : query-like, optional

Specifies the population of objects to include in the QCM. Can be a list or array of object indices, a muspan query, or None to include all objects. Default is None.

include_boundaries : query-like, optional

Specifies external boundaries within which to perform the QCM. Can be a list or array of object indices, a muspan query, or None to use the entire domain. Default is None.

exclude_boundaries : query-like, optional

Specifies internal boundaries (excluded regions within the shapes defined by include_boundaries). Can be a list or array of object indices, a muspan query, or None. Default is None.

boundary_exclude_distance : float, optional

Buffer distance to exclude objects located near boundaries. Default is 0.

regions_collection_name : str, optional

Name of the regions collection to use. Default is None.

regions_label_name : str, optional

Name of the label used to assign objects to regions. Default is None.

region_method : str, optional

Method to generate regions if no collection or label name is provided. Options are 'hexgrid' or 'quadrats'. Default is 'quadrats'.

region_kwargs : dict, optional

Additional keyword arguments for region generation. Default is an empty dictionary.

keep_regions_as_objects : bool, optional

Whether to keep regions as objects if they are constructed by this function. Default is False.

verbose : bool, optional

Whether to print verbose output during execution. Default is False.

max_iters_for_singular_matrix_encounter : int, optional

Maximum number of attempts made when a singular matrix is encountered; a RuntimeError is raised once this limit is exceeded. Default is 1000.

alpha : float, optional

Significance level for statistical tests. Default is 0.05.

transform_counts : str, optional

Transformation to apply to counts. Options are 'arcsinh', 'log', 'sqrt', or None. Default is None.

low_observation_bound : int, optional

Labels appearing fewer than this number of times in the population will be excluded from analysis. Default is 15.

visualise_output : bool, optional

Whether to visualise the correlation matrix. Default is False.

visualise_correlation_matrix_kwargs : dict, optional

Additional keyword arguments for visualising the correlation matrix. Default is an empty dictionary.
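To make the count-related parameters above more concrete, the sketch below shows what the transform_counts options and the low_observation_bound filter might look like on toy data. It uses NumPy only and is an illustration, not muspan's code. In particular, mapping 'log' to log(1 + x) is an assumption made here so that empty regions stay finite; the library may define it differently.

```python
import numpy as np

# Toy per-region counts for one label.
counts = np.array([0, 1, 4, 25, 400], dtype=float)

# Candidate variance-stabilising transforms, keyed like the transform_counts options.
transforms = {
    "arcsinh": np.arcsinh,
    "log": np.log1p,       # assumption: log(1 + x), not necessarily muspan's definition
    "sqrt": np.sqrt,
    None: lambda x: x,     # no transformation
}
transformed = {name: func(counts) for name, func in transforms.items()}
for name, values in transformed.items():
    print(name, np.round(values, 2))

# Dropping rare labels, in the spirit of low_observation_bound=15:
# labels seen fewer than 15 times are excluded.
labels = np.array(["A"] * 40 + ["B"] * 20 + ["C"] * 5)
categories, n_observations = np.unique(labels, return_counts=True)
kept = categories[n_observations >= 15]
print(kept)
```

All three transforms compress large counts, so a handful of very dense regions has less influence on the correlation between labels.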

Returns:
SES : numpy.ndarray

The standardised effect size (SES) matrix. Positive values indicate increased correlation compared to random relabelling, while negative values indicate decreased correlation.

A : numpy.ndarray

The filtered quadrat correlation matrix, containing only those entries of the SES matrix that are statistically significant after Benjamini-Hochberg correction.

label_categories : numpy.ndarray

The label categories corresponding to the rows and columns of the SES and A matrices.
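The relationship between the SES and A matrices can be sketched as a Benjamini-Hochberg filter over the pairwise p-values. The example below is a generic NumPy illustration with made-up numbers, not muspan's code: the helper `benjamini_hochberg_mask` is hypothetical, and marking non-significant entries as NaN is an assumption about the output format.

```python
import numpy as np


def benjamini_hochberg_mask(p_values, alpha=0.05):
    """Return a boolean mask of the hypotheses rejected by the BH procedure."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    # The i-th smallest p-value is compared against alpha * i / m.
    passed = p[order] <= alpha * np.arange(1, m + 1) / m
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        largest = np.max(np.nonzero(passed)[0])
        reject[order[: largest + 1]] = True
    return reject


# Made-up symmetric SES matrix and matching p-values for three labels.
ses = np.array([[np.nan, 4.1, -0.3],
                [4.1, np.nan, -3.2],
                [-0.3, -3.2, np.nan]])
p = np.array([[1.0, 0.001, 0.70],
              [0.001, 1.0, 0.02],
              [0.70, 0.02, 1.0]])

# Test each unordered pair once (upper triangle), then mirror the result.
upper = np.triu_indices_from(ses, k=1)
significant = benjamini_hochberg_mask(p[upper], alpha=0.05)
A = np.full_like(ses, np.nan)
A[upper] = np.where(significant, ses[upper], np.nan)
A[(upper[1], upper[0])] = A[upper]
print(A)
```

Here the first-second and second-third pairs survive the correction, while the weak first-third association is filtered out.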

Raises:
RuntimeError

If the label name is not found in the domain’s labels.

RuntimeError

If the label is not categorical.

RuntimeError

If too many singular matrices are encountered during computation.

Notes

  • Any shapes can be used as regions for cell counts. To implement this, use a pre-defined collection of regions containing label_name labels and pass this collection using regions_collection_name.
