Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor: use path.join from stdlib instead of custom join #1861

Merged

Conversation

scolladon
Copy link
Contributor

@scolladon scolladon commented Jan 26, 2024

Implementation refactoring

The goal of this PR is to propose a way to deal with path join that

  • uses as much as possible the node API
  • still validates the corner cases
  • execute fast

Discussion

I also tried to use path.join directly but it does not fulfill the corner cases where the output should differ according to the spec. I found it harder to deal/read/maintain with those corner cases just with path.join than just replacing normalizePath with path.normalize and just deal with the remaining special case.

I don't know why the library need to deal differently than path.join for some cases. It would require a lot of spiking and I actually do not really know where to start the investigation.
In order to remove the internal join we need to be sure we can get rid of those special behavior first, and make sure webpack has a path.posix.join implementation.

closes #1852

@scolladon scolladon changed the title test: parameterized internal join refactor: internal join to use path.normalize Jan 26, 2024
@scolladon scolladon changed the title refactor: internal join to use path.normalize refactor: internal join to use path.normalize Jan 26, 2024
@jcubic
Copy link
Contributor

jcubic commented Jan 26, 2024

This fails two tests, the bundle is more than 300KB. And it fails tree shaking. I think it bundles the whole path module from NodeJS for the browser.


export function join(...parts) {
return normalizePath(parts.map(normalizePath).join('/'))
let normalizedPath = normalize(parts.map(normalize).join('/'))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it's necessary to normalize each part passed in as normalize is already going to do that on the final result...so it's redundant.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with this assumption as well at first.
But it fails the tests (both nominal cases and special cases)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd be interested to know why. That seems really odd to me. Unless the .join('/') is too naive and causing problems.

btw, in Node.js, path.join already calls normalize, so perhaps we could leverage that as well.

(sometimes the abstractions in isomorphic-git just seem way overdone).

Copy link
Contributor Author

@scolladon scolladon Jan 26, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the difference with node path.join algorithm is that isomorphic-git "normalize" every parameters before joining them altogether and normalize the result, whereas node path.join normalize only after joining every parts.

Pass all the tests (I'll commit that):

let normalizedPath = path.join(...parts.map(path.normalize))

Does not pass the when "internal join" generates paths differently from "path.join" tests:

let normalizedPath = path.join(...parts)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why isomorphic-git would rely on logic that normalizes each segment first. That just strikes me as odd. Maybe there is a good reason. I'd like to know which test case in particular is failing so we can understand why the behavior is different, and why it is relying on that behavior.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Me too, I don't understand why isomorphic-git need this extra special cases... I don't see the use cases.

All the tests case when "internal join" generates paths differently from "path.join" using this implementation:

let normalizedPath = path.join(...parts)

Screenshot 2024-01-27 at 09 53 19
Screenshot 2024-01-27 at 09 59 02
Screenshot 2024-01-27 at 09 59 11
Screenshot 2024-01-27 at 09 59 20

Those "special" test case are a bit surprising because they fail mostly by adding a / in front or in the end of the path, the front / is not an issue in the nominal cases and the trailing / could easily be trimmed.
The first two cases dealing with ./ can be treated the same as removing trailing /.
If those special cases can be avoided then we could write it this way:

let normalizedPath = path.join(...parts)

  if (normalizedPath.endsWith('/')) {
    normalizedPath = normalizedPath.slice(0, -1)
  }

  return normalizedPath

And if trailing / is not important then we could use path.join directly imho

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is just to satisfy unit tests, then I'd say the tests are overzealous. From the comment at the top of join.js, it seems to me like the intention was to replace path.posix. If that's the case, it's path.posix to which we should be aligning, NOT this custom implementation.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I totally agree on the principle, I did not want to force this kind of design as I assumed it was there for a good reason even if I don't know it.

Anyway, I just tried using path.join with my application and it works.
So I'll update this PR so the code uses the src/utils/path.js implementation

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Using the actual src/utils/path.js is not treeshakeable.
It is detected by agadoo.
It was not detected before because treeshakeability test is done after rollup, and, as src/utils/path.js was not used it was not included in index.js then it was not a problem for agadoo.
Now it is a problem because we use it instead of join...

I tried

  • to make the export of path work, without success ❌
  • upgrading rollup, without success ❌
  • upgrading agadoo, without success ❌
  • using another lib that behave always the same instead of path (upath), success ✅

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I assumed it was there for a good reason even if I don't know it.

I looked at the history and there really doesn't seem to be a good reason other than to avoid double slash, which all these implementation options do.

src/utils/join.js Outdated Show resolved Hide resolved
@scolladon scolladon changed the title refactor: internal join to use path.normalize refactor: use path.posix.join instead of "internal" join Jan 27, 2024
@scolladon scolladon force-pushed the refactor/internal-path-join branch 4 times, most recently from abb1032 to 98c4689 Compare January 27, 2024 15:05
@scolladon scolladon changed the title refactor: use path.posix.join instead of "internal" join refactor: use upath.joinSafe instead of "internal" join Jan 27, 2024
@scolladon scolladon force-pushed the refactor/internal-path-join branch 2 times, most recently from 39a7db2 to ba26099 Compare January 27, 2024 17:11
@scolladon
Copy link
Contributor Author

What needs to be done for this PR to move forward ?
What should I do ? @mojavelinux @jcubic

@mojavelinux
Copy link
Contributor

mojavelinux commented Feb 12, 2024

I have reservations about this solution. First, I don't think renaming join to joinPath everywhere is the right thing to do. That exposes too much about the implementation chosen. The new function should be remapped to the name join so the actual function's name only appears where it is required/imported. (That also reduces the severe number files that are being changed by this PR).

Second, I strongly dislike adding dependencies, especially ones that are basic utilities that can be easily rewritten (and done so efficiently). The dependency you added is also written in CoffeeScript, which makes it highly suspect (a completely unnecessary abstraction for what it does, IMO). If we're not going to use the path library from stdlib, then the code for join should be in this repository (it's just not that complicated).

On a side note, I find it completely unfathomable that the compiler does not understand how to inline the join function from Node's stdlib. What's the point of the compiler if it can't handle such a primordial function? I'll admit, I don't really get the isomorphic aspect of this library, so I'm a bit biased towards Node.js here.

@jcubic
Copy link
Contributor

jcubic commented Feb 12, 2024

Another issue is that tests are still failing because of Android 10 / Chrome 120. So even if everyone agrees that this is the way to go, this can't be merged until the tests are fixed.

@scolladon
Copy link
Contributor Author

Ok then maybe we are fine with the actual implementation ?
I'm ok to close this PR without merging it

@mojavelinux
Copy link
Contributor

I'm not fine with the actual implementation.

I also don't think it's appropriate to be holding up all fixes because Android tests are failing. I think that profile should be removed until such time someone figures out why it is failing. Android is not the core audience of this library and blocking releases because of an infrastructure problem is not a good idea.

@jcubic
Copy link
Contributor

jcubic commented Feb 12, 2024

@mojavelinux I will disable the tests tomorrow maybe.

@scolladon scolladon force-pushed the refactor/internal-path-join branch 2 times, most recently from 303927c to 2367afc Compare February 15, 2024 09:56
@scolladon
Copy link
Contributor Author

@mojavelinux in the last version I let webpack use browserify to generate path dependency.
I removed the tests that are checking for cases outside the join classic implementation and let this implementation in the utils/join as it serves now as a OS Abstract Layer (OSAL).
I can remove this file and the tests related to it as it is not needed anymore, or we can keep the OSAL.

@mojavelinux
Copy link
Contributor

That seems reasonable to me.

@scolladon scolladon changed the title refactor: use upath.joinSafe instead of "internal" join refactor: use webpack path.join instead of "internal" join Feb 16, 2024
@scolladon
Copy link
Contributor Author

That seems reasonable to me.

What should I do to make this PR move forward ? @jcubic @mojavelinux

@jcubic
Copy link
Contributor

jcubic commented Mar 19, 2024

@mojavelinux does the code ok now. You did the code review. If it's ok then you can merge, and if there is something other to be done. Maybe change something. The tests are passing now. The code looks good to me.

@mojavelinux
Copy link
Contributor

That's great news about the tests!

I now approve this implementation.

I'm not sure the subject of the PR is correct though. Should it say "refactor: use path.join from stdlib instead of custom join" or something like that? Once I get confirmation, I'll update it and move forward with the merge.

@scolladon
Copy link
Contributor Author

I agree with you @mojavelinux, the title should be changed and your suggestion fits perfectly.
I'll change the title

Thank you guys for your help and time 🙏

@scolladon scolladon changed the title refactor: use webpack path.join instead of "internal" join refactor: use path.join from stdlib instead of custom join Mar 20, 2024
@scolladon
Copy link
Contributor Author

Hello here @mojavelinux @jcubic !
Could it be merged ?

@jcubic
Copy link
Contributor

jcubic commented May 29, 2024

@mojavelinux you did the review, if you think it's ok, you can merge.

@mojavelinux
Copy link
Contributor

I'll go ahead an merge.

@mojavelinux mojavelinux merged commit bbcdda7 into isomorphic-git:main May 31, 2024
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Improve normalizePath performance
3 participants