Skip to content

refactor(metro-file-map): Pre-resolve symlink targets and store normal posix paths#1687

Open
kitten wants to merge 6 commits into
facebook:mainfrom
kitten:@kitten/refactor/metro-file-map/preresolve-symlinks-and-normalize
Open

refactor(metro-file-map): Pre-resolve symlink targets and store normal posix paths#1687
kitten wants to merge 6 commits into
facebook:mainfrom
kitten:@kitten/refactor/metro-file-map/preresolve-symlinks-and-normalize

Conversation

@kitten
Copy link
Copy Markdown
Contributor

@kitten kitten commented Apr 11, 2026

Summary

Note

This is part of the same pick/patch-set from an experiment on the Expo repo,
like #1676, #1677, and #1686, for metro-file-map (conflicts are expected, since changes were pulled out into separate PRs for clarity)

The symlinks have two oddities to them compared to other file data operations and entries:

  • a separate WeakMap cache is kept for resolved symlink targets that have been converted
  • the values stored for symlink targets may be absolute paths and are unnormalised system paths

The WeakMap (in theory) is inefficient especially for projects with a larger amount of symlinks. It's also not necessary since the values can be trivially pre-resolved to a normal path. We can additionally store this value as a posix normal path, to make sure that it's identical between systems, which may help testing.

Edit: A primitive (LLM-generated) benchmark that shows that performance remains within margins with this change can be found at kitten@eee7a82

Changelog: [Internal] Pre-resolve symlinks and store normalized symlink targets in file map

Test plan

  • Unit tests were added to demonstrate the changes
  • This can further be tested E2E in the linked Expo PR, although not in isolation (if necessary I can create a new, isolated branch with a pnpm patch for these changes)

kitten added 4 commits April 11, 2026 01:37
The WeakMap is likely more expensive and this can leave unnormalized
absolute paths in the file map cache.

NOTE: This bumps the `CACHE_BREAKER` value!
This keeps the metro-file-map value system-stable. Before they could
store absolute paths, and system paths, however, this is now normalized
to posix normal paths.
@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 11, 2026
@facebook-github-tools facebook-github-tools Bot added the Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team. label Apr 11, 2026
const cachedResult = this.#cachedNormalSymlinkTargets.get(symlinkNode);
if (cachedResult != null) {
return cachedResult;
}
Copy link
Copy Markdown
Contributor

@robhogan robhogan Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On the face of it, this looks like it could be a perf regression, because we're introducing a new allocation and some string manipulation every time the same symlink is traversed, vs once per symlink.

I'd be interested to see a benchmark on a symlink-heavy project. Weakmap lookups should generally be very fast.

Copy link
Copy Markdown
Contributor Author

@kitten kitten Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this was admittedly tied to the on-demand file system change, which necessitated changes here anyway. It then made sense to pre-normalise values which ended up being a net neutral change (performance-wise) across all combined changes (!)

I can give benchmarking this in isolation a go next week 👍

Together with the lazy symlink resolution I would expect this to be mainly a neutral change since the old code paths assumed a low amount of symlinks. But I don't have an intuition of what this would look like in a project with many symlinks where the cost isn't negligible. My intuition overall with this change was that the cost of WeakMap lookups can add up faster than computing the path once, provided symlink resolution is lazier than it is now

Copy link
Copy Markdown
Contributor Author

@kitten kitten Apr 22, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to show that it's mostly neutral: kitten@eee7a82

Intuitively, the same amount of work happens (at least for macOS), since the path is prenormalised and stored in the file tree data structure, which means that the work (which was negligible to begin with) is now happening earlier and makes the WeakMap obsolete.

I didn't test this on Windows, where presumably there's a small additional cost due to the path normalisation (win -> posix -> win), which keeps the file map data structure portable. We could simply skip this and keep this denormalised, but the parity is nice imo

Comment on lines +1223 to +1225
ancestorOfRootIdx: this.#pathUtils.getAncestorOfRootIdx(symlinkTarget),
normalPath: symlinkTarget,
startOfBasenameIdx: symlinkTarget.lastIndexOf(path.sep) + 1,
Copy link
Copy Markdown
Contributor

@robhogan robhogan Apr 19, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we're not caching, we may be better off computing some of this lazily (and eg, just return the normalPath here).

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Removed the intermediate data structure in 0a64351

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. Shared with Meta Applied via automation to indicate that an Issue or Pull Request has been shared with the team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants