"InvalidIndexError: Reindexing only valid with uniquely valued Index objects" when running st.tl.cci.run with custom LR set

[liana_consensus_db.csv](https://github.com/user-attachments/files/16233718/liana_consensus_db.csv)
Hi,

I am trying to run CCI with a custom LR database from liana (see above).

I load it like this:

```
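# CCI
import numpy as np
import pandas as pd

# Load the custom LR pairs from the liana consensus database
df = pd.read_csv('liana_consensus_db.csv')
lrs = np.array(df['x'], dtype='<U18')  # fixed-width unicode, 18 characters per entry
print(len(lrs))
lrs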
```
and that gives me this output: 
```
3998
array(['Dll1_Notch1', 'Dll1_Notch2', 'Dll1_Notch4', ..., 'Serpina1c_Lrp1',
       'Serpina1d_Lrp1', 'Serpina1e_Lrp1'], dtype='<U18')
```
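
One thing that may be worth checking on an array like this is whether it contains repeated names. The sketch below uses made-up pair names (not taken from the liana file) and only illustrates the check: a fixed-width dtype such as `'<U18'` cuts every string to 18 characters, so longer names can collapse into identical entries, and names that already repeat in the source stay repeated. Whether either case applies to the attached CSV is an assumption that would need to be verified on the real data.

```python
import numpy as np
import pandas as pd

# Hypothetical pair names (not from the liana file); two are longer than 18 characters.
names = pd.Series([
    "Dll1_Notch1",
    "Dll1_Notch1",
    "LigandWithLongName_Rec1",
    "LigandWithLongName_Rec2",
])

fixed = np.array(names, dtype="<U18")  # same conversion as above, capped at 18 characters
full = np.array(names, dtype=str)      # NumPy picks a width that fits the longest name

# Count names that exceed the cap and duplicated entries under each conversion.
n_too_long = int((names.str.len() > 18).sum())
n_dup_fixed = int(pd.Index(fixed).duplicated().sum())
n_dup_full = int(pd.Index(full).duplicated().sum())

print("names longer than 18 characters:", n_too_long)
print("duplicates with dtype='<U18':", n_dup_fixed)
print("duplicates with dtype=str:", n_dup_full)
```

Running the same three counts on `df['x']` would show whether the custom list has duplicates before it is passed to `st.tl.cci.run`.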

But when I run this:

```
st.tl.cci.run(adata, lrs,
                  min_spots = 20, #Filter out any LR pairs with no scores for less than min_spots
                  distance=None, # None defaults to spot+immediate neighbours; distance=0 for within-spot mode
                  n_pairs=10000 # Number of random pairs to generate, recommend ~10,000
                  #n_cpus=4, # Number of CPUs for parallel. If None, detects & use all available.
                  )
```
I get a long traceback showing that `pd.concat` of `lr_features` and `quant_df` fails inside `get_lr_features` in perm_utils.py:

```
---------------------------------------------------------------------------
InvalidIndexError                         Traceback (most recent call last)
/tmp/ipykernel_4189510/1130224459.py in <module>
      2                   min_spots = 20, #Filter out any LR pairs with no scores for less than min_spots
      3                   distance=None, # None defaults to spot+immediate neighbours; distance=0 for within-spot mode
----> 4                   n_pairs=10000 # Number of random pairs to generate, recommend ~10,000
      5                   #n_cpus=4, # Number of CPUs for parallel. If None, detects & use all available.
      6                   )

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/stlearn/tools/microenv/cci/analysis.py in run(adata, lrs, min_spots, distance, n_pairs, n_cpus, use_label, adj_method, pval_adj_cutoff, min_expr, save_bg, neg_binom, verbose)
    347         verbose,
    348         save_bg=save_bg,
--> 349         neg_binom=neg_binom,
    350     )
    351 

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/stlearn/tools/microenv/cci/permutation.py in perform_spot_testing(adata, lr_scores, lrs, n_pairs, neighbours, het_vals, min_expr, adj_method, pval_adj_cutoff, verbose, save_bg, neg_binom, quantiles)
     57     ####### Quantiles to select similar gene to LRs to gen. rand-pairs #######
     58     lr_expr = adata[:, lr_genes].to_df()
---> 59     lr_feats = get_lr_features(adata, lr_expr, lrs, quantiles)
     60     l_quants = lr_feats.loc[
     61         lrs, [col for col in lr_feats.columns if "L_" in col]

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/stlearn/tools/microenv/cci/perm_utils.py in get_lr_features(adata, lr_expr, lrs, quantiles)
    323     ]
    324     quant_df = pd.DataFrame(lr_quants, columns=lr_cols, index=lrs)
--> 325     lr_features = pd.concat((lr_features, quant_df), axis=1)
    326     adata.uns["lrfeatures"] = lr_features
    327 

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/pandas/util/_decorators.py in wrapper(*args, **kwargs)
    309                     stacklevel=stacklevel,
    310                 )
--> 311             return func(*args, **kwargs)
    312 
    313         return wrapper

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/pandas/core/reshape/concat.py in concat(objs, axis, join, ignore_index, keys, levels, names, verify_integrity, sort, copy)
    305     )
    306 
--> 307     return op.get_result()
    308 
    309 

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/pandas/core/reshape/concat.py in get_result(self)
    526                     obj_labels = obj.axes[1 - ax]
    527                     if not new_labels.equals(obj_labels):
--> 528                         indexers[ax] = obj_labels.get_indexer(new_labels)
    529 
    530                 mgrs_indexers.append((obj._mgr, indexers))

~/anaconda3/envs/stlearn_Test/lib/python3.7/site-packages/pandas/core/indexes/base.py in get_indexer(self, target, method, limit, tolerance)
   3440 
   3441         if not self._index_as_unique:
-> 3442             raise InvalidIndexError(self._requires_unique_msg)
   3443 
   3444         if not self._should_compare(target) and not is_interval_dtype(self.dtype):

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
```
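
For reference, the same pandas error can be reproduced outside stlearn. This is a minimal sketch, assuming the trigger is a repeated label in the index of one of the two frames; the frame contents, column names, and labels are hypothetical stand-ins, not stlearn's actual data.

```python
import pandas as pd
from pandas.errors import InvalidIndexError

# Hypothetical stand-ins for the two frames named in the traceback.
lr_features = pd.DataFrame({"nonzero": [5, 7, 9]}, index=["A_B", "C_D", "E_F"])
quant_df = pd.DataFrame(
    {"L_q": [0.1, 0.1, 0.3]},
    index=["A_B", "A_B", "C_D"],  # "A_B" appears twice
)

# Column-wise concat has to align the rows, which fails on the repeated label.
try:
    pd.concat((lr_features, quant_df), axis=1)
    error_message = None
except InvalidIndexError as err:
    error_message = str(err)
print(error_message)

# Dropping repeated labels first lets the same concat go through.
deduped = quant_df[~quant_df.index.duplicated(keep="first")]
combined = pd.concat((lr_features, deduped), axis=1)
print(combined.shape)
```

If this is what happens here, the `lrs` passed as `index=lrs` at line 324 of perm_utils.py would contain at least one repeated name.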
Using `lrs = st.tl.cci.load_lrs(['connectomeDB2020_lit'], species='mouse')` works, but I do not see any difference in formatting between that output and the liana dataset. Can you help me?

Thank you!

"InvalidIndexError: Reindexing only valid with uniquely valued Index objects" when running st.tl.cci.run with custom LR set #305