quadrat_correlation_matrix

quadrat_correlation_matrix(domain, label_name, population=None, include_boundaries=None, exclude_boundaries=None, boundary_exclude_distance=0, regions_collection_name=None, regions_label_name=None, region_method='quadrats', region_kwargs={}, keep_regions_as_objects=False, verbose=False, max_iters_for_singular_matrix_encounter=1000, alpha=0.05, transform_counts=None, low_observation_bound=15, visualise_output=False, visualise_correlation_matrix_kwargs={})

The Quadrat Correlation Matrix (QCM) describes correlations between the counts of different cell types within squares, or ‘quadrats’, of a specified edge length. The QCM identifies statistically significant co-occurrences within a region of interest (ROI) between objects with different labels (e.g., cells of different types) by comparing the observed correlation between each pair of labels against the correlation expected if labels were assigned randomly. Regions are either generated as square quadrats or a hexagonal grid, or taken from existing labels in the domain via the regions_label_name parameter.

If no regions are supplied, this function generates them with region_method, using the settings in region_kwargs (for example, the quadrat side length).
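As a rough illustration of the idea described above, the following NumPy sketch counts points per label in square quadrats, correlates those counts between labels, and standardises the observed correlations against correlations obtained after randomly shuffling the labels. It is not muspan's implementation: the helper names (`quadrat_counts`, `relabelling_ses`), the use of a plain Pearson correlation, the number of shuffles, and the synthetic point pattern are all assumptions made for this example.

```python
import numpy as np


def quadrat_counts(points, label_idx, n_labels, side_length):
    """Count objects per label in each square quadrat tiling the bounding box."""
    cells = np.floor((points - points.min(axis=0)) / side_length).astype(int)
    n_cols, n_rows = cells.max(axis=0) + 1
    quadrat_id = cells[:, 1] * n_cols + cells[:, 0]
    counts = np.zeros((n_rows * n_cols, n_labels), dtype=int)
    np.add.at(counts, (quadrat_id, label_idx), 1)  # unbuffered accumulation
    return counts


def relabelling_ses(points, labels, side_length, n_shuffles=200, seed=0):
    """Standardise observed count correlations against random relabelling."""
    rng = np.random.default_rng(seed)
    categories, label_idx = np.unique(labels, return_inverse=True)
    n_labels = len(categories)

    def correlation(idx):
        counts = quadrat_counts(points, idx, n_labels, side_length)
        return np.corrcoef(counts, rowvar=False)

    observed = correlation(label_idx)
    # Null model: object positions stay fixed, labels are shuffled among them.
    null = np.stack(
        [correlation(rng.permutation(label_idx)) for _ in range(n_shuffles)]
    )
    with np.errstate(divide="ignore", invalid="ignore"):
        ses = (observed - null.mean(axis=0)) / null.std(axis=0)
    np.fill_diagonal(ses, np.nan)  # self-correlation is not informative
    return ses, categories


# Synthetic pattern (hypothetical data): 'A' only in the left half,
# 'B' only in the right half, 'C' spread over the whole square.
rng = np.random.default_rng(1)
left = rng.uniform([0, 0], [500, 1000], size=(400, 2))
right = rng.uniform([500, 0], [1000, 1000], size=(400, 2))
everywhere = rng.uniform([0, 0], [1000, 1000], size=(400, 2))
points = np.vstack([left, right, everywhere])
labels = np.repeat(["A", "B", "C"], 400)

ses, categories = relabelling_ses(points, labels, side_length=100)
```

Because 'A' and 'B' never share a quadrat in this synthetic pattern, the entry of `ses` for that pair should be strongly negative, which corresponds to the "decreased correlation" reading of the SES matrix.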

Parameters:
domain : object

A muspan domain.

label_name : str

The name of the label to use to calculate the QCM.

population : query-like, optional

Query-like specifying the population of objects to include in the QCM. Can be a list or array of object indices, a muspan query, or None to include all objects. Default is None.

include_boundaries : query-like, optional

Query specifying the external boundaries within which to perform the QCM. Can be a list or array of object indices, a muspan query, or None to use the entire domain. Default is None.

exclude_boundaries : query-like, optional

Query specifying the internal boundaries, i.e., regions inside the shapes defined by include_boundaries that are excluded from the QCM. Can be a list or array of object indices, a muspan query, or None. Default is None.

boundary_exclude_distance : float, optional

Width of the exclusion buffer: objects located within this distance of a boundary are excluded. Defaults to 0.

regions_collection_name : str, optional

The name of the regions collection, by default None.

regions_label_name : str, optional

The name of the label used to assign objects to regions, by default None.

region_method : str, optional

The method used to generate regions if neither a regions collection nor a regions label name is provided, by default ‘quadrats’. Options are ‘hexgrid’ or ‘quadrats’.

region_kwargs : dict, optional

Additional keyword arguments for region generation, by default {}.

keep_regions_as_objects : bool, optional

Whether to keep regions as objects if regions are constructed with this function, by default False.

verbose : bool, optional

Whether to print verbose output, by default False.

max_iters_for_singular_matrix_encounter : int, optional

Maximum number of tries when encountering a singular matrix, by default 1000.

alpha : float, optional

Significance level for statistical tests, by default 0.05.

transform_counts : str, optional

Transformation to apply to counts, by default None. Options are ‘arcsinh’, ‘log’, ‘sqrt’, or None.

low_observation_bound : int, optional

If a label appears fewer than low_observation_bound times in the population, it will be excluded from analysis. Default is 15.

visualise_output : bool, optional

Whether to visualise the correlation matrix, by default False.

visualise_correlation_matrix_kwargs : dict, optional

Additional keyword arguments for visualising the correlation matrix, by default {}.

Returns:
SES : array

The standardised effect size (SES) matrix. Values above 0 indicate increased correlation compared to random relabelling; values below 0 indicate decreased correlation.

A : array

The quadrat correlation matrix: a filtered subset of the SES in which only statistically significant associations are shown (after Benjamini-Hochberg correction for multiple comparisons).

label_categories : array

The label categories associated with the rows/columns of SES and A.
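To make the relationship between SES and A more concrete, here is a minimal sketch of a Benjamini-Hochberg filter applied to the unique label pairs of a symmetric matrix. How muspan obtains its p-values and how it marks non-significant entries in A are not stated on this page, so the p-value matrix, the NaN masking, and the function names (`benjamini_hochberg`, `filter_significant`) are assumptions for illustration only, and the numbers are made up.

```python
import numpy as np


def benjamini_hochberg(p_values, alpha=0.05):
    """Return a boolean mask of hypotheses rejected by the BH step-up rule."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)
    thresholds = alpha * np.arange(1, m + 1) / m
    passed = p[order] <= thresholds
    reject = np.zeros(m, dtype=bool)
    if passed.any():
        # Reject every hypothesis up to the largest rank that passes.
        k = np.nonzero(passed)[0].max()
        reject[order[: k + 1]] = True
    return reject


def filter_significant(ses, p_matrix, alpha=0.05):
    """Keep SES entries whose pair survives BH correction; others become NaN."""
    ses = np.asarray(ses, dtype=float)
    p_matrix = np.asarray(p_matrix, dtype=float)
    upper = np.triu_indices_from(ses, k=1)  # each label pair tested once
    keep = np.zeros(ses.shape, dtype=bool)
    keep[upper] = benjamini_hochberg(p_matrix[upper], alpha)
    keep |= keep.T  # mirror the decision to the lower triangle
    return np.where(keep, ses, np.nan)


# Made-up values for three labels.
ses_demo = np.array([[0.0, 4.1, -0.3], [4.1, 0.0, 1.9], [-0.3, 1.9, 0.0]])
p_demo = np.array([[1.0, 0.0001, 0.76], [0.0001, 1.0, 0.057], [0.76, 0.057, 1.0]])
filtered = filter_significant(ses_demo, p_demo, alpha=0.05)
```

Testing only the upper triangle avoids counting each symmetric pair twice when the correction is applied.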

Raises:
RuntimeError

If the label name is not in the list of generated labels.

RuntimeError

If the label is not categorical.

RuntimeError

If too many singular matrices are encountered.

Notes

  • For more information see Morueta-Holme et al. 2016: https://doi.org/10.1111/ecog.01892.

  • Shapes of any kind can be used as the regions in which cells are counted. To do this, create a predefined collection of regions that contain label_name labels and pass it using regions_collection_name.
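The effect of the low_observation_bound and transform_counts parameters on a table of per-region counts can be pictured with the short sketch below. It is an approximation under stated assumptions rather than muspan's code: the helper names are hypothetical, the counts are made up, and ‘log’ is implemented as log(1 + x) so that empty regions stay finite, which may differ from what the library does.

```python
import numpy as np

# Assumed mapping from each transform_counts option to a NumPy function.
# 'log' uses log1p here (an assumption) so that zero counts remain finite.
TRANSFORMS = {
    None: lambda c: c,
    "arcsinh": np.arcsinh,
    "log": np.log1p,
    "sqrt": np.sqrt,
}


def drop_rare_labels(counts, categories, low_observation_bound=15):
    """Keep only labels observed at least low_observation_bound times in total."""
    counts = np.asarray(counts)
    keep = counts.sum(axis=0) >= low_observation_bound
    return counts[:, keep], np.asarray(categories)[keep]


def transform(counts, transform_counts=None):
    """Apply the chosen transformation to a regions-by-labels count table."""
    if transform_counts not in TRANSFORMS:
        raise ValueError(f"Unknown transform_counts option: {transform_counts!r}")
    return TRANSFORMS[transform_counts](np.asarray(counts, dtype=float))


# Rows are regions, columns are labels 'A', 'B', 'C' (made-up counts).
counts = np.array([[12, 0, 3], [7, 1, 9], [0, 2, 10]])
kept, kept_categories = drop_rare_labels(counts, ["A", "B", "C"])
transformed = transform(kept, "sqrt")
```

Here label 'B' appears only three times in total, so it falls below the default bound of 15 and is dropped before the transformation is applied.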

Examples

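>>> import muspan as ms
>>> # a_domain is an existing muspan domain containing the label 'some_label_name'
>>> SES, A, label_categories = ms.region_based.quadrat_correlation_matrix(
...     domain=a_domain,
...     label_name='some_label_name',
...     region_kwargs={'side_length': 100},
... )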