omit realizing transcripts #43

Hi there. I noticed that the `read_parquet` call [here](https://github.com/drighelli/SpaceTrooper/blob/790891ef75b30680e0e8d6b33fa98e4a8f014fed/R/readXenium.R#L188) realizes all transcripts into memory as a `data.frame`. Instead, I would propose something like the following, which realizes only i) the cell IDs and ii) one FOV entry for each cell ID present in the data. This also avoids costly `dplyr` operations (`group_by`, `select`, `distinct`, `left_join`) on a `data.frame` that could contain millions of rows. Realizing everything up front defeats the purpose of having a .parquet file to begin with.

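```
library(arrow) # read_parquet()
library(dplyr) # pull()

mol <- metadata(xen)$transcripts # transcripts .parquet file, as stored in the object's metadata
mol <- read_parquet(mol, as_data_frame=FALSE) # this is important: keeps an Arrow Table instead of a data.frame
idx <- pull(mol, "cell_id") # extract only the cell ID column
idx <- match(xen$cell_id, idx) # first transcript row for each cell in xen
fov <- pull(mol[idx, ], "fov") # FOV entries for those rows only
```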
