🎯 Ming "Tommy" Tang’s Post

Director of Bioinformatics | Cure Diseases with Data | Author of From Cell Line to Command Line | Educator YouTube @chatomics

chatomics! Scanpy and Seurat report very different log2 fold changes for the same marker genes! Do you understand how log2 fold change is calculated in single-cell RNA-seq data? https://lnkd.in/ePiUKxF4 I hope you've found this post helpful. Follow me for more. Subscribe to my FREE newsletter https://lnkd.in/erw83Svn

Chiranjit Das

Bioinformatic Analyst @ University of Birmingham || MSc Bioinformatics @ University of Glasgow 2023 || NIT Rourkela '22

2mo

While analysing data and picking out genes of interest, I have had a very specific dilemma for a while. I have seen some people rank differential genes by decreasing logFC, while others prefer to rank them by increasing adjusted p-value. Outlier cells in a cluster often inflate the logFC, so I generally trust the adjusted p-value more. Do you have any input on this dilemma, or is there a better metric you would suggest for picking out the most significant differential genes? 🎯 Ming "Tommy" Tang
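To make the outlier concern above concrete, here is a minimal sketch in plain Python. It assumes a simple mean-based log2 fold change with a pseudocount of 1 and uses made-up expression values; the function names and the 0.25 cut-off are hypothetical and not taken from Seurat or Scanpy. It shows how one extreme cell can produce a large logFC even though almost no cells in the cluster express the gene, which is one reason people often check the fraction of expressing cells alongside logFC and the adjusted p-value.

```python
import math

def log2fc(group, rest, pseudocount=1.0):
    """Mean-based log2 fold change (illustrative formula, not a specific tool's)."""
    mean_group = sum(group) / len(group)
    mean_rest = sum(rest) / len(rest)
    return math.log2(mean_group + pseudocount) - math.log2(mean_rest + pseudocount)

def pct_expressing(cells):
    """Fraction of cells with non-zero expression."""
    return sum(1 for value in cells if value > 0) / len(cells)

# Made-up normalized expression: 99 silent cells plus one outlier cell.
cluster = [0.0] * 99 + [500.0]
others = [0.0] * 100

fc = log2fc(cluster, others)    # driven entirely by the single outlier
pct = pct_expressing(cluster)   # only 1 cell in 100 expresses the gene

# Hypothetical filter: require a decent logFC AND broad expression in the cluster.
MIN_LOG2FC = 1.0
MIN_PCT = 0.25
passes = fc >= MIN_LOG2FC and pct >= MIN_PCT

print(f"log2FC={fc:.2f}, pct expressing={pct:.2f}, passes filter={passes}")
```

In this toy case the gene clears the logFC threshold but fails the expression-fraction check, so ranking on logFC alone would have promoted it.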

Maryam A.

Bioinformatician | PhD Candidate in Medical Oncology | Lung Cancer | Epigenetics | Biomarkers | MicroRNA

2mo

🎯 Ming "Tommy" Tang Thank you for such an insightful post on the nuances of log2 fold change in scRNA-seq data! I found the discussion about the discrepancies between Seurat and Scanpy particularly interesting. In your opinion, how significant are these differences in practice? Do they often lead to biologically different conclusions, or are the variations usually minor? Additionally, do you think there’s a need for a standardized log2 fold change calculation method in scRNA-seq, or does the flexibility across tools serve a valuable purpose depending on the dataset and context? Looking forward to hearing your thoughts!

Dean Lee

Figure One Lab: A Gateway Computational Biology Experience | 1829 Code-Enabled Biologists and Counting

2mo

Are the discrepancies mostly in very lowly or sparsely expressed genes?

Carlos Buss, PhD

Bioinformatician | Single-cell & Multi-Omics | Biomarker Expert & Drug Discovery

2mo

Honestly, I’m more concerned about getting the initial steps right—filtering out low counts, low variation, and mitochondrial genes. The real question is: how do we strike the right balance between keeping things simple and precise when a scatter plot alone can’t capture the full complexity of these early steps?

Ivo Kwee

Co-founder and CTO of BigOmics Analytics

2mo

Antonino Zito, we noticed this before, right? Can you explain? It has something to do with how the mean is calculated. Here is a thread: https://github.com/satijalab/seurat/issues/6701
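The point about the mean can be sketched in a few lines of plain Python. This is an illustration under stated assumptions, not either tool's actual code: it assumes a Scanpy-style calculation averages the log1p-normalized values first and then applies expm1 (with a tiny epsilon of 1e-9), while a Seurat-style calculation applies expm1 to each cell first and then averages (with a pseudocount of 1). The exact formulas vary between tool versions, and the function names and toy values here are hypothetical.

```python
import math

def log2fc_mean_then_expm1(group, rest, eps=1e-9):
    """Average the log1p values, then undo the log (assumed Scanpy-style)."""
    mean_group = sum(group) / len(group)
    mean_rest = sum(rest) / len(rest)
    return math.log2((math.expm1(mean_group) + eps) / (math.expm1(mean_rest) + eps))

def log2fc_expm1_then_mean(group, rest, pseudocount=1.0):
    """Undo the log per cell, then average (assumed Seurat-style)."""
    mean_group = sum(math.expm1(x) for x in group) / len(group)
    mean_rest = sum(math.expm1(x) for x in rest) / len(rest)
    return math.log2(mean_group + pseudocount) - math.log2(mean_rest + pseudocount)

# Toy log1p-normalized expression for one sparsely expressed gene.
group = [math.log1p(v) for v in [0, 0, 0, 0, 0, 0, 0, 0, 20, 20]]
rest = [math.log1p(v) for v in [0, 0, 0, 0, 0, 0, 0, 0, 0, 2]]

scanpy_like = log2fc_mean_then_expm1(group, rest)
seurat_like = log2fc_expm1_then_mean(group, rest)

print(f"mean-then-expm1: {scanpy_like:.2f}")
print(f"expm1-then-mean: {seurat_like:.2f}")
```

Because expm1 is non-linear, the order of averaging and back-transforming matters, and the different pseudocounts widen the gap further. The effect is strongest for sparse genes like the toy one here, where most cells are zero.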

Dr. Sadman S.

Senior Genomics Scientist | AI-Driven Innovation in Life Science Enthusiast | Neuroscientist | Community Builder & Career Mentor | Views Are My Own

2mo

The question is which one to use: the values from Scanpy, the values from Seurat, or the average of both?
