In-situ clustering
- class comseg.clustering.InSituClustering(anndata, selected_genes)
Bases: object
The in situ clustering class takes as attribute an anndata object containing the community expression vectors \(V_c\) of RNA partitions/communities from one or more images. This class is responsible for identifying the single-cell transcriptomic clusters present in the dataset.
- __init__(anndata, selected_genes)
- Parameters:
anndata (anndata object) – anndata object containing the community expression vectors. It can be the concatenation of several anndata objects from different ComSeg instances.
selected_genes (list[str]) – list of genes to take into account for the clustering. The order of this list defines the order of the genes in the expression vector.
- compute_normalization_parameters(debug_path=None, sample_size=10000)
Compute the SCTransform normalization parameters from the anndata class attribute.
- Parameters:
debug_path
- Returns:
- cluster_rna_community(size_commu_min=3, norm_vector=True, n_pcs=15, n_comps=15, clustering_method='leiden', n_neighbors=20, resolution=1, n_clusters_kmeans=4, palette=None, plot_umap=False)
Cluster the RNA partition/community expression vectors to identify the single-cell transcriptomic clusters present in the dataset.
- Parameters:
size_commu_min (int) – minimum number of RNAs in a community for it to be considered for the clustering
norm_vector (bool) – if True, the expression vectors are normalized using the SCTransform normalization parameters
n_pcs (int) – number of principal components to compute for the clustering; set to 0 to skip PCA
n_comps (int) – number of components to compute for the clustering; set to 0 to skip PCA
clustering_method (str) – one of [“leiden”, “kmeans”, “louvain”]
n_neighbors (int) – number of neighbors used to build the similarity graph
resolution (float) – resolution parameter for the Leiden/Louvain clustering
n_clusters_kmeans (int) – number of clusters for the k-means clustering
palette (list[str]) – color palette for the clusters, given as a list of HEX colors
plot_umap (bool) – if True, plot the UMAP of the clusters
- Returns:
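The role of size_commu_min can be illustrated with a minimal, standard-library sketch. It is not ComSeg's implementation: the helper name split_by_size is hypothetical, and it assumes that the size of a community is the sum of its expression vector and that the threshold is inclusive.

```python
def split_by_size(expression_vectors, size_commu_min=3):
    """Split community indices into 'large' and 'small' groups.

    Assumption (not taken from ComSeg): the size of a community is the
    total RNA count of its expression vector, and a community is kept
    for clustering when that count is >= size_commu_min.
    """
    large, small = [], []
    for idx, vec in enumerate(expression_vectors):
        if sum(vec) >= size_commu_min:
            large.append(idx)   # enough RNAs: used for clustering
        else:
            small.append(idx)   # too few RNAs: left unclassified for now
    return large, small


# Toy expression vectors: one row per community, one column per gene.
toy_vectors = [[2, 1, 0], [1, 0, 0], [0, 0, 5], [1, 1, 0]]
large_idx, small_idx = split_by_size(toy_vectors, size_commu_min=3)
```

Under these assumptions, the communities left in the small group are the ones that classify_small_community later tries to assign to a cluster.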
- merge_cluster(nb_min_cluster=0, min_merge_correlation=0.8, cluster_column_name='leiden', plot=True)
Merge clusters based on the correlation of their centroids
- Parameters:
nb_min_cluster (int) – minimum number of clusters to merge
min_merge_correlation (float) – minimum correlation between centroids for two clusters to be merged
cluster_column_name (str) – name of the column holding the cluster labels, i.e. the clustering method used (e.g. 'leiden')
plot
- Returns:
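The idea of merging clusters whose centroids are highly correlated can be sketched in pure Python as follows. This is an illustration under stated assumptions, not ComSeg's code: the function names are hypothetical, Pearson correlation is assumed as the correlation measure, the most correlated pair is merged greedily until no pair reaches the threshold, and nb_min_cluster is deliberately not modeled.

```python
from math import sqrt


def pearson(u, v):
    """Pearson correlation between two equal-length vectors."""
    n = len(u)
    mean_u, mean_v = sum(u) / n, sum(v) / n
    cov = sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v))
    std_u = sqrt(sum((a - mean_u) ** 2 for a in u))
    std_v = sqrt(sum((b - mean_v) ** 2 for b in v))
    # A constant vector has no defined correlation; treat it as 0.
    return cov / (std_u * std_v) if std_u > 0 and std_v > 0 else 0.0


def centroids(vectors, labels):
    """Mean expression vector of each cluster."""
    groups = {}
    for vec, lab in zip(vectors, labels):
        groups.setdefault(lab, []).append(vec)
    return {lab: [sum(col) / len(vecs) for col in zip(*vecs)]
            for lab, vecs in groups.items()}


def merge_correlated_clusters(vectors, labels, min_merge_correlation=0.8):
    """Greedily merge the most correlated pair of cluster centroids."""
    labels = list(labels)
    while True:
        cents = centroids(vectors, labels)
        names = sorted(cents)
        best = None  # (correlation, label kept, label dropped)
        for i, a in enumerate(names):
            for b in names[i + 1:]:
                r = pearson(cents[a], cents[b])
                if r >= min_merge_correlation and (best is None or r > best[0]):
                    best = (r, a, b)
        if best is None:
            return labels  # no remaining pair is correlated enough
        _, keep, drop = best
        labels = [keep if lab == drop else lab for lab in labels]


toy_vectors = [[10, 1, 0], [9, 2, 0], [20, 3, 1], [0, 1, 10], [1, 0, 9]]
toy_labels = ["0", "0", "1", "2", "2"]
merged = merge_correlated_clusters(toy_vectors, toy_labels)
```

In this toy data, clusters "0" and "1" have the same expression profile at different depths, so their centroids are strongly correlated and they are merged, while cluster "2" stays separate.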
- classify_small_community(key_pred='leiden_merged', classify_mode='pca', min_proba_small_commu=0)
Assign unclassified RNA community expression vectors to clusters using a KNN classifier and the already classified communities
- Parameters:
key_pred – leave default
unorm_vector_key – leave default
classify_mode – either ‘pca’ or ‘euclidien’; the classifier operates in PCA space or in Euclidean space, respectively
min_proba_small_commu – minimum probability, as given by the KNN classifier, required to classify a small community
- Returns:
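How a KNN classifier combined with a minimum probability can leave ambiguous communities unassigned is sketched below with the standard library only. The function knn_classify, the parameter k, and the use of the neighbor vote fraction as the "probability" are assumptions made for illustration; ComSeg's actual classifier and its settings may differ.

```python
from collections import Counter
from math import dist


def knn_classify(train_vectors, train_labels, query, k=3, min_proba=0.0):
    """Classify `query` by majority vote among its k nearest neighbors.

    Returns (label, proba), where proba is the fraction of neighbors that
    voted for the winning label. The label is None when proba is below
    min_proba, i.e. the community is left unclassified.
    """
    # Rank the already classified vectors by Euclidean distance to the query.
    order = sorted(range(len(train_vectors)),
                   key=lambda i: dist(train_vectors[i], query))
    neighbors = order[:k]
    votes = Counter(train_labels[i] for i in neighbors)
    label, count = votes.most_common(1)[0]
    proba = count / len(neighbors)
    return (label, proba) if proba >= min_proba else (None, proba)


# Already classified communities (toy 2-D vectors, e.g. in PCA space).
train_vectors = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11]]
train_labels = ["a", "a", "a", "b", "b"]

confident = knn_classify(train_vectors, train_labels, [0.5, 0.5])
ambiguous = knn_classify(train_vectors, train_labels, [6, 6], min_proba=0.9)
```

With min_proba left at 0, every small community receives the label of its neighborhood majority; raising it trades coverage for confidence.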