Error in sc.tl.dendrogram: The truth value of a Index is ambiguous.

I think that, after a recent NumPy or pandas update, an if clause reached through sc.tl.dendrogram no longer works properly. The following reproduces the error:

    import numpy as np
    import pandas as pd
    import scanpy as sc

    # Use pbmc3k dataset
    adata = sc.datasets.pbmc3k()
    sc.pp.filter_genes(adata, min_counts=1)
    sc.pp.log1p(adata)
    sc.pp.normalize_total(adata)
    sc.pp.highly_variable_genes(adata)
    sc.tl.pca(adata)
    sc.pp.neighbors(adata)
    sc.tl.leiden(adata)
    sc.tl.rank_genes_groups(adata, groupby='leiden')

    # Save the ranks.
    results_dict = dict()
    for cluster_i in adata.uns['rank_genes_groups']['names'].dtype.names:
        # print(cluster_i)
        # Get keys that we want from the dataframe.
        data_keys = list(
            set(['names', 'scores', 'logfoldchanges', 'pvals', 'pvals_adj']) &
            set(adata.uns['rank_genes_groups'].keys())
        )
        # Build a table using these keys.
        key_i = data_keys.pop()
        results_dict[cluster_i] = pd.DataFrame(
            row[cluster_i] for row in adata.uns['rank_genes_groups'][key_i]
        )
        results_dict[cluster_i].columns = [key_i]
        for key_i in data_keys:
            results_dict[cluster_i][key_i] = [
                row[cluster_i] for row in adata.uns['rank_genes_groups'][key_i]
            ]
        results_dict[cluster_i]['cluster'] = cluster_i
    marker_df = pd.concat(results_dict, ignore_index=True)

    marker_df = marker_df.sort_values(by=['scores'], ascending=False)
    # Make dataframe of the top 3 markers per cluster
    marker_df_plt = marker_df.groupby('cluster').head(3)
    
    # sc.tl.dendrogram (triggered by dendrogram=True) fails here
    _ = sc.pl.dotplot(
        adata,
        var_names=marker_df_plt['names'],
        groupby='leiden',
        dendrogram=True,
        use_raw=False,
        show=False,
        color_map='Blues',
        save='{}.png'.format('test')
    )

/lib/python3.6/site-packages/scanpy/tools/_dendrogram.py in dendrogram(adata, groupby, n_pcs, use_rep, var_names, use_raw, cor_method, linkage_method, optimal_ordering, key_added, inplace)
    130         corr_matrix, method=linkage_method, optimal_ordering=optimal_ordering
    131     )
--> 132     dendro_info = sch.dendrogram(z_var, labels=categories, no_plot=True)
    133
    134     # order of groupby categories

/lib/python3.6/site-packages/scipy/cluster/hierarchy.py in dendrogram(Z, p, truncate_mode, color_threshold, get_leaves, orientation, labels, count_sort, distance_sort, show_leaf_counts, no_plot, no_labels, leaf_font_size, leaf_rotation, leaf_label_func, show_contracted, link_color_func, ax, above_threshold_color)
   3275                          "'bottom', or 'right'")
   3276
-> 3277     if labels and Z.shape[0] + 1 != len(labels):
   3278         raise ValueError("Dimensions of Z and labels must be consistent.")
   3279

/lib/python3.6/site-packages/pandas/core/indexes/base.py in __nonzero__(self)
   2148     def __nonzero__(self):
   2149         raise ValueError(
-> 2150             f"The truth value of a {type(self).__name__} is ambiguous. "
   2151             "Use a.empty, a.bool(), a.item(), a.any() or a.all()."
   2152         )

ValueError: The truth value of a Index is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
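
The traceback points at the `if labels and ...` test, where `labels` is a pandas Index. Below is a minimal sketch of that failure mode, assuming only pandas is installed. The `categories` Index is a made-up stand-in for the group labels passed down from scanpy, and the helper `truthiness_error` is hypothetical, introduced only for illustration.

```python
import pandas as pd

# Stand-in for the cluster labels (assumption: the real labels are a
# pandas Index of category names, as the traceback suggests).
categories = pd.Index(["0", "1", "2"])


def truthiness_error(obj):
    """Return the message raised by bool(obj), or None if bool(obj) succeeds."""
    try:
        bool(obj)
    except ValueError as err:
        return str(err)
    return None


# Evaluating a pandas Index in a boolean context raises ValueError.
index_error = truthiness_error(categories)

# A plain list of the same labels has ordinary truthiness (non-empty is True).
list_error = truthiness_error(list(categories))

print("Index:", index_error)
print("list: ", list_error)
```

Because a plain list does not raise, converting the labels with `list(...)` before they reach the truth test, or testing `labels is not None` instead, looks like a plausible workaround. Whether either is how the libraries actually resolved this is not confirmed here.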

Versions:

scanpy==1.5.1 anndata==0.7.3 umap==0.4.4 numpy==1.17.5 scipy==1.5.0 pandas==1.0.5 scikit-learn==0.23.1 statsmodels==0.11.1 python-igraph==0.8.2 leidenalg==0.8.1

Conda environment is attached (environment.txt).