
perf(backend): optimize drive/files query for old root content#16514

Open
caipira113 wants to merge 2 commits into misskey-dev:develop from caipira113:perf/optimize-drive-files-query

Conversation

Contributor

@caipira113 caipira113 commented Sep 3, 2025

What

Resolves #14217

This PR speeds up the drive/files endpoint for users who have old content in the root folder by adding a partial index on drive_file (userId, folderId IS NULL, id DESC). There are no API or behavior changes; it only adds a migration.
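The index described above can be sketched in plain SQL roughly as follows; the index name and exact DDL here are illustrative, since the actual change ships as a migration and may differ in detail:

```sql
-- Partial index matching the query shape of drive/files for the root folder:
--   WHERE "userId" = $1 AND "folderId" IS NULL ORDER BY "id" DESC LIMIT n
-- CONCURRENTLY avoids blocking writes while the index builds.
CREATE INDEX CONCURRENTLY "IDX_drive_file_root"  -- hypothetical name
  ON "drive_file" ("userId", "id" DESC)
  WHERE "folderId" IS NULL;
```

Because the predicate and key order match the query's WHERE and ORDER BY exactly, the planner can walk the index from the highest id for that user and stop as soon as the LIMIT is satisfied, instead of scanning the primary key backward and filtering.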

Why

Some users were hitting statement timeouts because PostgreSQL chose the primary key for ORDER BY id DESC, leading to very wide scans when their root files were old. The new index matches the query shape so the planner can return the latest matching rows immediately.

Additional info (optional)

On an affected account, the query improved from about 50 seconds to about 30-40 ms.

[ TEST Query ]

EXPLAIN (ANALYZE, BUFFERS) SELECT "file"."id", "file"."userId", "file"."userHost", "file"."md5", "file"."name", "file"."type", "file"."size", "file"."comment", "file"."blurhash", "file"."properties", "file"."storedInternal", "file"."url", "file"."thumbnailUrl", "file"."webpublicUrl", "file"."webpublicType", "file"."accessKey", "file"."thumbnailAccessKey", "file"."webpublicAccessKey", "file"."uri", "file"."src", "file"."folderId", "file"."isSensitive", "file"."maybeSensitive", "file"."maybePorn", "file"."isLink", "file"."requestHeaders", "file"."requestIp" FROM "drive_file" "file" WHERE "file"."userId" = '8zzt247op3' AND "file"."folderId" IS NULL ORDER BY "file"."id" DESC LIMIT 31;

[ Before ]

QUERY PLAN
Limit  (cost=0.43..6225.04 rows=31 width=1997) (actual time=211.896..51068.746 rows=22 loops=1)
  Buffers: shared hit=1702945 read=701470
  ->  Index Scan Backward using "PK_43ddaaaf18c9e68029b7cbb032e" on drive_file file  (cost=0.43..1091916.33 rows=5438 width=1997) (actual time=211.893..51068.629 rows=22 loops=1)
        Filter: (("folderId" IS NULL) AND (("userId")::text = '8zzt247op3'::text))
        Rows Removed by Filter: 4547669
        Buffers: shared hit=1702945 read=701470
Planning:
  Buffers: shared hit=358 read=68
Planning Time: 19.364 ms
Execution Time: 51068.991 ms
(10 rows)

2025-09-03.11.52.56-1.mov

[ After ]

QUERY PLAN
Limit  (cost=0.43..118.09 rows=31 width=1997) (actual time=6.629..30.655 rows=22 loops=1)
  Buffers: shared hit=3814
  ->  Index Scan Backward using "IDX_a76118b66adb3228e0ee69c281" on drive_file file  (cost=0.43..25190.51 rows=6637 width=1997) (actual time=6.627..30.648 rows=22 loops=1)
        Index Cond: (("userId")::text = '8zzt247op3'::text)
        Filter: ("folderId" IS NULL)
        Rows Removed by Filter: 4898
        Buffers: shared hit=3814
Planning:
  Buffers: shared hit=463 dirtied=13
Planning Time: 18.885 ms
Execution Time: 30.709 ms
(11 rows)

2025-09-03.11.54.19-1.mov

Checklist

  • Read the contribution guide
  • Test working in a local environment
  • (If needed) Add a Storybook story
  • (If needed) Update CHANGELOG.md
  • (If possible) Add tests

@dosubot dosubot Bot added the size:S This PR changes 10-29 lines, ignoring generated files. label Sep 3, 2025
@github-actions github-actions Bot added the packages/backend Server side specific issue/PR label Sep 3, 2025
Contributor

github-actions Bot commented Sep 3, 2025

api.json diff introduced by this PR
No differences.
Get diff files from Workflow Page


codecov Bot commented Sep 3, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 46.06%. Comparing base (e98252a) to head (c0b380b).
⚠️ Report is 925 commits behind head on develop.

Additional details and impacted files
@@             Coverage Diff             @@
##           develop   #16514      +/-   ##
===========================================
+ Coverage    42.56%   46.06%   +3.49%     
===========================================
  Files         1685     1771      +86     
  Lines       170650   182361   +11711     
  Branches      4223     5402    +1179     
===========================================
+ Hits         72643    84009   +11366     
- Misses       97546    98323     +777     
+ Partials       461       29     -432     

☔ View full report in Codecov by Sentry.

Contributor

eternal-flame-AD commented Sep 3, 2025

Is this because of the ASC/DESC ordering?

It seems the current index is (folderId ASC, id ASC), and that might cause bad query plans when the request is id DESC and some awkward filters are applied. The query planner may assume the data is well distributed and underestimate the cost of scanning the whole table.

If the reason is index order, we should probably create an index on (..., folderId, id DESC) instead of a partial one, because descending queries on subfolders are legal API calls too.

Or maybe simply change IDX_860fa6f6c7df5bb887249fba22 ("userId") to ("userId", "id") (which would also help with the "list all files for a user in chronological order" case, useful for moderation)? This will 100% be chosen over a PK scan because it has the same depth and is more specific.
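The suggested swap could be sketched as follows; the DDL is illustrative (in Misskey this would land as a migration), and the index name is the existing auto-generated one from the comment above:

```sql
-- Replace the single-column index on "userId" with a composite ("userId", "id"),
-- so a query filtering on userId and ordering by id can be served by one
-- backward index scan instead of a filtered scan over the primary key.
DROP INDEX CONCURRENTLY "IDX_860fa6f6c7df5bb887249fba22";
CREATE INDEX CONCURRENTLY "IDX_860fa6f6c7df5bb887249fba22"
  ON "drive_file" ("userId", "id");
```

In production one would typically build the new index under a temporary name first and drop the old one afterwards, to avoid a window in which no index on "userId" exists.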

Contributor Author

Thanks for the comment @eternal-flame-AD.

I re-ran the queries on the problematic user and a control user to validate the hypotheses. Changing the composite index from (userId, folderId, id) to use DESC on id did not fix the root-folder case; on this dataset the planner continued to prefer a backward scan on the primary key, and the query still took on the order of tens of seconds. By contrast, replacing the single-column index on userId with a composite (userId, id) consistently made the planner choose that path and brought the worst-case root-folder query down from ~43 s to ~40 ms. The "all files for a user in reverse chronological order" path also benefits and becomes sub-millisecond, while subfolder queries remain well served by the existing (userId, folderId, id) composite.

Contributor

Thanks! I wonder why Postgres refuses to use the original index then… but yes, I think (userId, id) should do, because there is a practical limit on how many files a single user can create without abusing the service, so the worst-case performance of an index scan over all files owned by a specific user in ID order should be good enough.

It doesn't seem like there is currently a role limit on the number of drive files, but I would say it is more useful to add a limit than to try to optimize the query for that case, because that would require the file storage logic to not rely on the file system (which is basically another index) as well.

@caipira113 caipira113 force-pushed the perf/optimize-drive-files-query branch from 0d5a916 to fcfbf4a on September 8, 2025 at 12:44
@caipira113 caipira113 force-pushed the perf/optimize-drive-files-query branch from fcfbf4a to c0b380b on September 8, 2025 at 12:46

Labels

packages/backend Server side specific issue/PR size:S This PR changes 10-29 lines, ignoring generated files.

Projects

Development

Successfully merging this pull request may close these issues.

ドライブが開かないユーザーがいる (Some users cannot open their drive)

2 participants