Graph model
- class comseg.model.ComSegGraph(df_spots_label, selected_genes, dict_co_expression, dict_scale={'x': 0.103, 'y': 0.103, 'z': 0.3}, mean_cell_diameter=15, k_nearest_neighbors=10, edge_max_length=None, gene_column='gene', prior_name='in_nucleus', disable_tqdm=False)
Bases: object
Class to generate the graph of RNA spots from a CSV file/image. This class is in charge of:
create the graph
apply community detection / graph partitioning
compute the community expression vector \(V_c\)
add to the communities the labels/cell types computed by the clustering of \(V_c\) by the InSituClustering() class
add the centroid of the cells in the graph
associate RNAs to cells
compute the cell-by-gene matrix of the input sample
- __init__(df_spots_label, selected_genes, dict_co_expression, dict_scale={'x': 0.103, 'y': 0.103, 'z': 0.3}, mean_cell_diameter=15, k_nearest_neighbors=10, edge_max_length=None, gene_column='gene', prior_name='in_nucleus', disable_tqdm=False)
- Parameters:
df_spots_label (pd.DataFrame) – dataframe of the spots with columns x, y, z, gene and, optionally, the prior label column
selected_genes (list[str]) – list of genes to consider
dict_scale (dict) – dictionary containing the pixel/voxel size of the images in µm; default is {"x": 0.103, "y": 0.103, "z": 0.3}
mean_cell_diameter (float) – the expected mean cell diameter in µm default is 15µm
k_nearest_neighbors (int) – number of nearest neighbors to consider for the graph construction default is 10
edge_max_length (float) – maximum length of an edge in the graph; if None, the default is mean_cell_diameter / 4
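To make the units concrete, here is a minimal sketch of how a spot given in pixel/voxel coordinates could be converted to µm with dict_scale, and how the documented default for edge_max_length follows from mean_cell_diameter. The helper `to_micrometers` is hypothetical and is not part of ComSeg. The sketch only illustrates the arithmetic implied by the parameter descriptions above.

```python
# Illustrative only: `to_micrometers` is a hypothetical helper, not a ComSeg function.
dict_scale = {"x": 0.103, "y": 0.103, "z": 0.3}  # µm per pixel/voxel (documented default)
mean_cell_diameter = 15  # µm (documented default)


def to_micrometers(spot, scale):
    """Convert a spot {'x', 'y', 'z'} from pixel/voxel units to µm."""
    return {axis: spot[axis] * scale[axis] for axis in ("x", "y", "z")}


spot_px = {"x": 100, "y": 200, "z": 10}
spot_um = to_micrometers(spot_px, dict_scale)

# Documented default used when edge_max_length is None.
edge_max_length = mean_cell_diameter / 4
```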
- create_graph()
Create the graph of the RNA nodes. All graph generation parameters are set in the __init__() function.
- Returns:
a graph of the RNA spots
- Return type:
nx.Graph
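The construction can be pictured as a k-nearest-neighbor graph with a cap on edge length. The sketch below is a dependency-free illustration under that assumption, not ComSeg's implementation: it stores edges in a plain dict instead of an nx.Graph, uses brute-force distances, and the function name `knn_edges` is hypothetical.

```python
import math


def knn_edges(coords_um, k, edge_max_length):
    """Link each spot to its k nearest neighbors, dropping edges longer
    than edge_max_length. Returns {(i, j): distance} with i < j."""
    edges = {}
    for i, p in enumerate(coords_um):
        # Distances from spot i to every other spot, nearest first.
        neighbors = sorted(
            (math.dist(p, q), j) for j, q in enumerate(coords_um) if j != i
        )
        for d, j in neighbors[:k]:
            if d <= edge_max_length:
                edges[(min(i, j), max(i, j))] = d
    return edges


# Toy spot coordinates in µm (hypothetical data).
coords_um = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 2.0, 0.0), (10.0, 10.0, 0.0)]
edges = knn_edges(coords_um, k=2, edge_max_length=3.75)
```

In this toy input the fourth spot is farther than edge_max_length from every other spot, so it ends up with no edges.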
- community_vector(clustering_method='with_prior', seed=None)
Partition the graph into communities/sets of RNAs, then compute and store the “community expression vector” in the community_anndata class attribute
- Parameters:
clustering_method (str) – one of [“with_prior”, “louvain”]. “with_prior” is our graph partitioning / community detection method, modified from Louvain, which takes into account prior knowledge from landmarks such as nuclei or cell masks.
seed (int) – (optional) seed for the graph partitioning initialization
- Returns:
a graph with a new node attribute “community” with the community detection vector
- Return type:
nx.Graph
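As an illustration of what a community expression vector \(V_c\) contains, the sketch below counts, for each community, how many RNAs of each selected gene it holds. It assumes raw counts ordered as in selected_genes; whether ComSeg applies any normalization is not stated here. The function name `community_expression` and the toy data are hypothetical.

```python
from collections import Counter

selected_genes = ["A", "B", "C"]
# Hypothetical toy data: one (gene, community) pair per RNA spot.
spots = [("A", 0), ("A", 0), ("B", 0), ("C", 1), ("B", 1), ("B", 1)]


def community_expression(spots, selected_genes):
    """Return {community: [count of each selected gene]}."""
    counts = {}
    for gene, community in spots:
        counts.setdefault(community, Counter())[gene] += 1
    # A Counter returns 0 for genes absent from a community.
    return {c: [cnt[g] for g in selected_genes] for c, cnt in counts.items()}


vectors = community_expression(spots, selected_genes)
```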
- add_cluster_id_to_graph(dict_cluster_id, clustering_method='leiden_merged')
add transcriptional cluster id to each RNA molecule in the graph
- Parameters:
dict_cluster_id (dict) – dict {index_commu : cluster_id}
clustering_method (str) – clustering method used to get the community
- Returns:
- Return type:
nx.Graph
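The mapping step can be sketched as follows, assuming each node already carries a “community” attribute and that the cluster id is stored under a node attribute named after clustering_method. Both are assumptions made for illustration: the node store is a plain dict, not an nx.Graph, and `add_cluster_id` is a hypothetical stand-in for the method.

```python
def add_cluster_id(nodes, dict_cluster_id, clustering_method="leiden_merged"):
    """Copy the cluster id of each node's community onto the node."""
    for attrs in nodes.values():
        community = attrs.get("community")
        # Assumption: nodes whose community has no cluster id stay unlabeled.
        if community in dict_cluster_id:
            attrs[clustering_method] = dict_cluster_id[community]
    return nodes


# Hypothetical toy nodes: {node_id: attributes}.
nodes = {
    0: {"gene": "A", "community": 0},
    1: {"gene": "B", "community": 1},
    2: {"gene": "C", "community": 2},
}
dict_cluster_id = {0: "3", 1: "7"}  # {index_commu: cluster_id}
nodes = add_cluster_id(nodes, dict_cluster_id)
```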
- classify_centroid(dict_cell_centroid, n_neighbors=15, dict_in_pixel=True, max_dist_centroid=None, key_pred='leiden_merged', distance='ngb_distance_weights')
Classify cell centroids based on the labels of their neighboring RNAs, as assigned by add_cluster_id_to_graph()
- Parameters:
dict_cell_centroid (dict) – dict of centroid coordinate {cell : {z:,y:,x:}}
n_neighbors (int) – number of neighbors to consider for the classification of the centroid (default 15)
dict_in_pixel (bool) – if True, the centroids are given in pixels and are rescaled; if False, the centroids are in µm (default True)
max_dist_centroid (int) – maximum distance to consider for the centroid, if None it is set to mean_cell_diameter / 2
key_pred (str) – key-name of the node attribute containing the cluster id (default “leiden_merged”)
distance (str) – leave it to “ngb_distance_weights” (default “ngb_distance_weights”)
- Returns:
self.G
- Return type:
nx.Graph
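One way to picture the classification is a distance-weighted vote among the nearest labeled RNAs. The sketch below assumes inverse-distance weights and Euclidean distances in µm. This is a guess at what “ngb_distance_weights” means and is not confirmed by this page. The function `classify_one_centroid` and the toy data are hypothetical.

```python
import math
from collections import defaultdict


def classify_one_centroid(centroid, rna_nodes, n_neighbors=15, max_dist_centroid=7.5):
    """Return the label with the largest inverse-distance vote among the
    n_neighbors nearest RNAs within max_dist_centroid, or None if there are none.

    rna_nodes: list of ((x, y, z) in µm, cluster_id).
    """
    nearest = sorted(
        (math.dist(centroid, xyz), label) for xyz, label in rna_nodes
    )[:n_neighbors]
    votes = defaultdict(float)
    for d, label in nearest:
        if d <= max_dist_centroid:
            votes[label] += 1.0 / (d + 1e-9)  # small offset avoids division by zero
    return max(votes, key=votes.get) if votes else None


rna_nodes = [
    ((1.0, 0.0, 0.0), "a"),
    ((0.0, 2.0, 0.0), "a"),
    ((0.0, 0.0, 1.5), "b"),
    ((20.0, 0.0, 0.0), "b"),
]
label_near = classify_one_centroid((0.0, 0.0, 0.0), rna_nodes)
label_far = classify_one_centroid((100.0, 100.0, 100.0), rna_nodes)
```

The default of 7.5 µm mirrors the documented fallback of mean_cell_diameter / 2 with a 15 µm mean cell diameter.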
- associate_rna2landmark(key_pred='leiden_merged', distance='distance', max_cell_radius=100)
Associate RNAs to cells based on both the transcriptomic landscape and the distance between the RNAs and the centroid of the cell
- Parameters:
key_pred (str) – key of the node attribute containing the cluster id (default “leiden_merged”)
prior_name (str) – prior_name attribute used in community_vector
max_cell_radius (float) – maximum distance between a cell centroid and an RNA to be associated (default 100)
- Returns:
self.G
- Return type:
nx.Graph
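The idea of combining transcriptomic labels with distance can be sketched in a simplified form: each RNA goes to the nearest centroid that shares its cluster id, provided that centroid lies within a maximum radius. This is a deliberately simplified stand-in that uses straight-line Euclidean distance. The actual method may measure distance differently, for example along graph edges. The function `associate` and the toy data are hypothetical.

```python
import math


def associate(rnas, centroids, max_cell_radius=100):
    """Assign each RNA to the nearest same-label centroid within max_cell_radius.

    rnas: list of ((x, y, z), cluster_id)
    centroids: {cell_id: ((x, y, z), cluster_id)}
    Returns a list of cell ids (None when no centroid qualifies).
    """
    assignment = []
    for xyz, label in rnas:
        best_cell, best_d = None, max_cell_radius
        for cell, (c_xyz, c_label) in centroids.items():
            d = math.dist(xyz, c_xyz)
            # Keep the closest centroid that shares the RNA's cluster id.
            if c_label == label and d <= best_d:
                best_cell, best_d = cell, d
        assignment.append(best_cell)
    return assignment


centroids = {1: ((0.0, 0.0, 0.0), "a"), 2: ((10.0, 0.0, 0.0), "b")}
rnas = [
    ((1.0, 0.0, 0.0), "a"),
    ((9.0, 0.0, 0.0), "b"),
    ((6.0, 0.0, 0.0), "a"),    # closer to cell 2, but its label matches cell 1
    ((500.0, 0.0, 0.0), "a"),  # beyond max_cell_radius of every centroid
]
assignment = associate(rnas, centroids)
```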
- get_anndata_from_result(key_cell_pred='cell_index_pred', return_polygon=False, alpha=0.5, min_rna_per_cell=5, allow_disconnected_polygon=False)
Generate an anndata storing the estimated expression vector and their spots coordinates
- Parameters:
key_cell_pred (str) – key of the cell prediction in the graph (default cell_index_pred)
return_polygon (bool) – if True return the polygon of the cells
alpha (float) – alpha parameter used to compute the alphashape polygon: https://pypi.org/project/alphashape/. alpha is between 0 and 1; 1 corresponds to the convex hull of the cell
min_rna_per_cell (int) – minimum number of RNA to consider a cell
allow_disconnected_polygon (bool) – if True, allow disconnected polygons
- Returns: