omit realizing transcripts #43

@HelenaLC

Hi there. I noticed that the read_parquet call here realizes the transcripts in memory as a data.frame. Instead, I would propose something like the following, which realizes only i) the cell IDs and ii) the unique FOV entries of the cell IDs present in the data. This also avoids costly dplyr computations (group_by, select, distinct, left_join) on a data.frame that could contain millions of rows, which defeats the purpose of having a .parquet file to begin with.

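library(arrow) # read_parquet
library(dplyr) # pull
mol <- metadata(xen)$transcripts # location of the transcripts .parquet file
mol <- read_parquet(mol, as_data_frame=FALSE) # this is important: keeps an Arrow Table, not a data.frame
idx <- pull(mol, "cell_id") # realize only the cell ID column
idx <- match(xen$cell_id, idx) # first transcript row of each cell in xen
fov <- pull(mol[idx, ], "fov") # FOV entry of each of those rows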
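To illustrate the indexing logic on its own, here is a minimal base R sketch on made-up vectors. The names mol_cell_id and mol_fov and all values are hypothetical stand-ins for the transcript columns, not taken from the package. It also shows that match() yields NA for a cell without any transcripts, which may need handling before subsetting.

```r
# Hypothetical stand-ins for the "cell_id" and "fov" transcript columns
mol_cell_id <- c("a", "a", "b", "c", "c", "c")
mol_fov <- c("F1", "F1", "F1", "F2", "F2", "F2")

# Cell IDs as they might appear in the object; "d" has no transcripts
cell_id <- c("c", "a", "d")

# match() returns the position of the first matching transcript per cell,
# or NA when a cell ID does not occur among the transcripts
idx <- match(cell_id, mol_cell_id)

# Subsetting by these positions gives one FOV entry per cell
fov <- mol_fov[idx]

print(idx)
print(fov)
```

Only one row per cell is looked up, so no grouping or joining over the full transcript table is needed.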

Metadata

Labels

enhancement (New feature or request)
