chatomics! Scanpy and Seurat marker gene log2 fold changes show a big discrepancy! Do you understand log2 fold change in single-cell RNA-seq data? https://lnkd.in/ePiUKxF4 I hope you've found this post helpful. Follow me for more. Subscribe to my FREE newsletter https://lnkd.in/erw83Svn
🎯 Ming "Tommy" Tang Thank you for such an insightful post on the nuances of log2 fold change in scRNA-seq data! I found the discussion about the discrepancies between Seurat and Scanpy particularly interesting. In your opinion, how significant are these differences in practice? Do they often lead to biologically different conclusions, or are the variations usually minor? Additionally, do you think there’s a need for a standardized log2 fold change calculation method in scRNA-seq, or does the flexibility across tools serve a valuable purpose depending on the dataset and context? Looking forward to hearing your thoughts!
Are the discrepancies mostly in very lowly or sparsely expressed genes?
Honestly, I’m more concerned about getting the initial steps right—filtering out low counts, low variation, and mitochondrial genes. The real question is: how do we strike the right balance between keeping things simple and precise when a scatter plot alone can’t capture the full complexity of these early steps?
Antonino Zito, we noticed this before, right? Can you explain? It was something in the calculation of the mean. Here is a thread: https://github.com/satijalab/seurat/issues/6701
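For anyone following along, here is a minimal sketch of how "the calculation of the mean" alone can move a log2 fold change. It is not copied from either codebase. The two functions are hypothetical and only loosely modeled on the idea that one approach averages in linear space before taking the log, while the other averages the log1p values first and back-transforms afterwards. The pseudocounts (1.0 and 1e-9) and the toy expression values are assumptions for illustration; check the linked thread and each tool's source for what they actually do.

```python
import math

def log2fc_mean_then_log(group, rest, pseudocount=1.0):
    """Back-transform each cell with expm1, average, then take log2 (assumed variant A)."""
    def linear_mean(xs):
        return sum(math.expm1(x) for x in xs) / len(xs)
    return math.log2(linear_mean(group) + pseudocount) - math.log2(linear_mean(rest) + pseudocount)

def log2fc_log_then_mean(group, rest, eps=1e-9):
    """Average the log1p values first, then back-transform the mean (assumed variant B)."""
    def log_mean(xs):
        return sum(xs) / len(xs)
    ratio = (math.expm1(log_mean(group)) + eps) / (math.expm1(log_mean(rest)) + eps)
    return math.log2(ratio)

# Hypothetical log1p-normalized expression of one sparse gene (most cells are zero).
group = [math.log1p(v) for v in [0, 0, 0, 8, 12]]
rest = [math.log1p(v) for v in [0, 0, 0, 0, 1]]

fc_a = log2fc_mean_then_log(group, rest)
fc_b = log2fc_log_then_mean(group, rest)
print(f"mean-then-log: {fc_a:.2f}  log-then-mean: {fc_b:.2f}")
```

Same cells, same gene, but the two orderings disagree by more than one log2 unit in this toy case, and the gap comes entirely from where the averaging happens and how large the pseudocount is.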
The question is which one to take: the values from Scanpy, the ones from Seurat, or an average of both?
While analysing data and picking out genes of interest, I have had a very specific dilemma for a while. I have seen people rank differential genes by decreasing logFC, while others prefer to rank them by increasing adjusted p-value. Oftentimes, outlier cells in a cluster inflate the logFC, so I generally trust the adjusted p-value more. Do you have any input on this dilemma, or is there a better metric you would suggest to pick out the most significant differential genes? 🎯 Ming "Tommy" Tang
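To make the outlier point concrete, here is a small sketch with made-up numbers. The helper names, the pseudocount of 1.0, and the toy counts are all assumptions for illustration, not any tool's implementation. It computes a mean-based log2 fold change for a cluster with and without one extreme cell, next to the fraction of cells in which the gene is detected.

```python
import math

def log2fc(group, rest, pseudocount=1.0):
    """log2 fold change of mean normalized counts, with a pseudocount (illustrative)."""
    def mean(xs):
        return sum(xs) / len(xs)
    return math.log2(mean(group) + pseudocount) - math.log2(mean(rest) + pseudocount)

def fraction_detected(cells):
    """Fraction of cells with a non-zero value for the gene."""
    return sum(1 for c in cells if c > 0) / len(cells)

# Hypothetical normalized counts for one gene (linear scale).
rest = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]
cluster = [0, 1, 0, 0, 1, 0, 0, 0, 1, 0]   # same distribution as the rest
cluster_outlier = cluster[:-1] + [200]     # one extreme cell replaces a zero

fc_clean = log2fc(cluster, rest)
fc_outlier = log2fc(cluster_outlier, rest)
print(f"log2FC without outlier: {fc_clean:.2f}")
print(f"log2FC with one outlier: {fc_outlier:.2f}")
print(f"detected fraction: {fraction_detected(cluster):.1f} vs {fraction_detected(cluster_outlier):.1f}")
```

In this toy case a single cell drives the whole fold change while the detected fraction barely moves, which is one reason to look at the logFC together with other columns rather than ranking on it alone.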