Cropping a domain#

MuSpAn allows you to take a large domain and crop out a subset of it to form a new domain. This may be very useful when, for example, you want to restrict your focus to a specific area of your data, or if you want to see how some spatial metric varies across a large sample.

In this tutorial, we’ll explore how we can trim a large domain down into a smaller one (or many smaller ones).

Let’s start by loading in a domain, and making sure that it has some points and some shapes in it. We’ll make the shapes using alpha shapes, so that our ‘crypts’ of A and B cells are represented with a boundary shape.

[1]:

import muspan as ms
import matplotlib.pyplot as plt

# Load in a domain and make some ring shapes in it
domain = ms.datasets.load_example_domain('Synthetic-Points-Architecture')
type_A_cells = ms.query.query(domain,('label','Celltype'),'is','A')
type_B_cells = ms.query.query(domain,('label','Celltype'),'is','B')
rings = ms.query.query_container(type_A_cells,'OR',type_B_cells)
new_IDs = domain.convert_objects(rings,collection_name='New object',return_IDs=True,object_type='shape',conversion_method='alpha shape',conversion_method_kwargs=dict(alpha=13))

ms.visualise.visualise(domain,'Celltype',marker_size=5)

MuSpAn domain loaded successfully. Domain summary:
Domain name: Architecture
Number of objects: 5991
Collections: ['Cell centres']
Labels: ['Celltype']
Networks: []
Distance matrices: []

[1]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_1_2.png

Let’s use one of those new IDs to make a new domain containing all the points inside a specific crypt. We’ll pick out one ID (object 5991, the first of the alpha shapes that we’ve just added) and see how it looks in the whole domain.

[2]:

ms.visualise.visualise(domain,objects_to_plot=[new_IDs[0]],show_boundary=True)
plt.title(f'Object ID {new_IDs[0]}')

[2]:

Text(0.5, 1.0, 'Object ID 5991')

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_3_1.png

Great, we know which object ID 5991 is now. Now let’s crop the domain down to that object.

[3]:

outputs = ms.helpers.crop_domain(domain, [5991])
tiny_domain = outputs[5991]
ms.visualise.visualise(tiny_domain,'Celltype',show_boundary=True)

[3]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_5_1.png

There we have it - a brand new domain whose boundary matches the outside of the shape we just passed it. Notice a couple of things here. First, the labels in tiny_domain have been inherited from the big domain - we were able to colour the points by 'Celltype' without any difficulty.

The other, potentially more misleading thing, is that crop_domain didn’t immediately return a new domain - instead, it returned a dictionary that we called outputs, and we pulled our new domain out using the key 5991. This might seem a strange syntax when you’re cropping out a single domain, but makes much more sense if we crop out many different domains at once:

[4]:

outputs = ms.helpers.crop_domain(domain, new_IDs)
print(outputs)
#ms.visualise.visualise(tiny_domain,'Celltype',show_boundary=True)

{5991: <muspan.domain.domain object at 0x32176d010>, 5992: <muspan.domain.domain object at 0x3213cc450>, 5993: <muspan.domain.domain object at 0x321746e90>, 5994: <muspan.domain.domain object at 0x111023b10>, 5995: <muspan.domain.domain object at 0x321911090>, 5996: <muspan.domain.domain object at 0x320a8f1d0>, 5997: <muspan.domain.domain object at 0x32379b050>, 5998: <muspan.domain.domain object at 0x321913ed0>, 5999: <muspan.domain.domain object at 0x323a37090>, 6000: <muspan.domain.domain object at 0x321752350>, 6001: <muspan.domain.domain object at 0x323a94110>, 6002: <muspan.domain.domain object at 0x323435dd0>, 6003: <muspan.domain.domain object at 0x32375e610>, 6004: <muspan.domain.domain object at 0x31c971610>, 6005: <muspan.domain.domain object at 0x321780c10>, 6006: <muspan.domain.domain object at 0x323c4c350>, 6007: <muspan.domain.domain object at 0x320a7dfd0>, 6008: <muspan.domain.domain object at 0x321727f90>, 6009: <muspan.domain.domain object at 0x3217376d0>, 6010: <muspan.domain.domain object at 0x323a94150>, 6011: <muspan.domain.domain object at 0x3231b5cd0>}

outputs now contains a dictionary of 21 domains, corresponding to the 21 IDs in the list of new_IDs. In general, we don’t have to pass specific object IDs here - any query like object that returns some shapes will result in a new domain for each shape.

[5]:

fig, axes = plt.subplots(nrows=3,ncols=7,figsize=(21,7))
i, j = (0,0)
for key in outputs:
    this_domain = outputs[key]
    ms.visualise.visualise(this_domain,'Celltype',ax=axes[j,i],show_boundary=True)
    axes[j,i].set_title(key)
    i = i + 1
    if i > 6:
        i = 0
        j = j + 1

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_9_0.png

A situation where we might be interested in cropping the domain is when we’re using a lattice, for example. Let’s add a hexagonal lattice to our original domain, and use it to crop out a single hexagon.

[6]:

hex_IDs = ms.region_based.generate_hexgrid(domain,150)
ms.visualise.visualise(domain,'ROI')

[6]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_11_1.png

generate_hexgrid has created hexagons with the label ‘ROI’. Let’s use a query to get the hexagon with ROI 14, and crop out a new domain with that as the boundary. Before we do, note that every object in ROI 14 has also been assigned this as a label - if we just get all the objects with ROI equal to ‘ROI_14’, we’ll find a lot of objects.

[7]:

ms.visualise.visualise(domain,'ROI',objects_to_plot=('ROI','ROI 14'))

[7]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_13_1.png

As a result, we’ll use a more complex query to get the ID of the hexagon we’re interested in.

[8]:

q14 = ms.query.query(domain,'ROI','is','ROI 14')
qHex = ms.query.query(domain,('collection',),'is','Hexgrid')
our_hexagon = q14 & qHex
ms.visualise.visualise(domain,'ROI',objects_to_plot=our_hexagon,show_boundary=True)

[8]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_15_1.png

When cropping to this hexagon, we have a choice to make - the hexagon intersects with some of our crypt shapes. Do we want these shapes to be excluded from the new domain? Do we want to crop them? Do we want to expand the hexagon so that they’re kept intact? Fortunately, MuSpAn allows us to choose which option we like best.

[9]:

options = ['exclude','clip','expand']

fig, axes = plt.subplots(ncols=3,figsize=(21,7))
for i, option in enumerate(options):
    new_domain_dict = ms.helpers.crop_domain(domain, our_hexagon, handle_intersecting_objects=option)
    the_key = list(new_domain_dict.keys())[0]
    ms.visualise.visualise(new_domain_dict[the_key],'Celltype',ax=axes[i],show_boundary=True)
    axes[i].set_title(f'Method = {option}')
    print(f'Method {option} - Area: {new_domain_dict[the_key].boundary.area}')
    print(f'Method {option} - Number of objects: {new_domain_dict[the_key].n_objects}')

Method exclude - Area: 58456.71475544251
Method exclude - Number of objects: 252
Method clip - Area: 58456.71475544251
Method clip - Number of objects: 254
Method expand - Area: 69181.8370033665
Method expand - Number of objects: 254

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_17_1.png

Notice the key differences between the three options. exclude is the strictest, and will completely discard any shape which intersects with the object boundary. By contrast, clip leads to a domain with the same area as exclude, but any objects intersecting with the boundary will be clipped (note that this may turn one object into many, if the clipped shape is no longer connected). Finally, expand is the most tolerant method, ensuring that all our shapes remain unaltered at the expense of increasing the domain area. The best method to choose will depend on your use case.

Finally, there is one more way of cropping to this hexagon. Notice that in the very red plot a few cells up (where we visualised all objects with the label ROI equal to ROI 14), only one of the two crypt shapes shown here was shown. This is because when we generate a hexgrid, objects are assigned a single label based on the region that their centroid is contained in. We can crop the domain to these objects immediately by specifying crop_method='label' in crop_domain.

[10]:

new_domain_dict = ms.helpers.crop_domain(domain,crop_method='label',label_name='ROI')
print(new_domain_dict)

{'ROI 0': <muspan.domain.domain object at 0x3240736d0>, 'ROI 1': <muspan.domain.domain object at 0x32586a350>, 'ROI 10': <muspan.domain.domain object at 0x3259250d0>, 'ROI 11': <muspan.domain.domain object at 0x32588e910>, 'ROI 12': <muspan.domain.domain object at 0x324178510>, 'ROI 13': <muspan.domain.domain object at 0x3230befd0>, 'ROI 14': <muspan.domain.domain object at 0x323d05d90>, 'ROI 15': <muspan.domain.domain object at 0x325d45890>, 'ROI 16': <muspan.domain.domain object at 0x3259ab4d0>, 'ROI 17': <muspan.domain.domain object at 0x3234f7a90>, 'ROI 18': <muspan.domain.domain object at 0x325852090>, 'ROI 19': <muspan.domain.domain object at 0x32409e5d0>, 'ROI 2': <muspan.domain.domain object at 0x32336cf10>, 'ROI 20': <muspan.domain.domain object at 0x3236a79d0>, 'ROI 22': <muspan.domain.domain object at 0x32336efd0>, 'ROI 23': <muspan.domain.domain object at 0x325cf1f90>, 'ROI 24': <muspan.domain.domain object at 0x3235d7d50>, 'ROI 25': <muspan.domain.domain object at 0x3233c5750>, 'ROI 26': <muspan.domain.domain object at 0x321747410>, 'ROI 3': <muspan.domain.domain object at 0x325a9aa10>, 'ROI 4': <muspan.domain.domain object at 0x323c0d710>, 'ROI 5': <muspan.domain.domain object at 0x32407d210>, 'ROI 6': <muspan.domain.domain object at 0x325cbc4d0>, 'ROI 7': <muspan.domain.domain object at 0x3243e01d0>, 'ROI 8': <muspan.domain.domain object at 0x325be3750>, 'ROI 9': <muspan.domain.domain object at 0x323111910>}

This has used the ROI label to crop a new domain based on every possible value of the label ROI. This is in general substantially faster than repeatedly iterating over shapes and cropping to them. Cropping by label will find all shapes with the same value of that label, and create a new domain which encloses all of them. The new domain boundary is the convex hull around the points/shapes contained within that label.

[11]:

ms.visualise.visualise(new_domain_dict['ROI 14'],'Celltype',show_boundary=True)

[11]:

(<Figure size 1000x800 with 2 Axes>, <Axes: >)

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_21_1.png

In this example, because we generated a hexgrid which added new shapes to the domain, we’ve picked up a giant hexagon within our new domain! If we wanted, we could now delete this (remember that the hexagon is in a different collection to the other objects, which would make it straightforward to remove). This cropping functionality is extremely useful in situations where we have local neighbourhoods - for instance, if we had assigned each cell centre a neighbourhood ID (see, for instance, the ‘Paper examples’ tutorial for making Figure 5 of the MuSpAn paper). It means that we can crop objects out of our domain even without having a shape that defines the boundary we wish to crop to. As an example, let’s make new domains which contains only points with the same cell type.

[12]:

new_domain_dict = ms.helpers.crop_domain(domain,crop_method='label',label_name='Celltype')
print(new_domain_dict)

fig, axes = plt.subplots(ncols=4,figsize=(21,7))
for i, option in enumerate(['A','B','C','D']):
    ms.visualise.visualise(new_domain_dict[option],'Celltype',ax=axes[i],show_boundary=True)

{'A': <muspan.domain.domain object at 0x324621810>, 'B': <muspan.domain.domain object at 0x325acdf90>, 'C': <muspan.domain.domain object at 0x32643e7d0>, 'D': <muspan.domain.domain object at 0x32640cf10>}

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_23_1.png

Just for fun, let’s see one of these plots again!

[13]:

options = ['expand']
for i, option in enumerate(options):
    new_domain_dict = ms.helpers.crop_domain(domain, our_hexagon, handle_intersecting_objects=option)
    the_key = list(new_domain_dict.keys())[0]
    ms.visualise.visualise(new_domain_dict[the_key],'Celltype',show_boundary=True)
    plt.gca().set_title(f'Method = {option}')
    print(f'Method {option} - Area: {new_domain_dict[the_key].boundary.area}')
    print(f'Method {option} - Number of objects: {new_domain_dict[the_key].n_objects}')

Method expand - Area: 69181.8370033665
Method expand - Number of objects: 254

../../_images/_collections_workflows_Workflows_-_1_-_Cropping_a_domain_25_1.png