Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: use lstat bigint=true for local environments to fix #1245 #1778

Open
wants to merge 14 commits into
base: main
Choose a base branch
from

Conversation

jayree
Copy link
Contributor

@jayree jayree commented Jun 18, 2023

I'm fixing a bug or typo

  • if this is your first time contributing, run npm run add-contributor and follow the prompts to add yourself to the README
  • Browser: what should be returned by SecondsNanoseconds if nanoseconds === undefined
  • squash merge the PR with commit message "fix: [Description of fix]"

this fix uses lstat with bigint: true to be able to compare mtimeNs with mtime nanosecond from git index

@jayree jayree force-pushed the bigint-fix branch 2 times, most recently from 3cbf517 to ce73521 Compare June 18, 2023 18:17
@dead-end
Copy link
Contributor

Hi, I just saw that you commit a fix for #1245. I tried this a few weeks ago and we had to revert my fix. The issue was that there were small differences in the timestamp in the index file and what lstat returned although the content did not changed. The result was that the sha1 had to be computed for files that did not change. For a large repo the time to get the statusmatix took minutes.

I do not know where the small differences in the timestamp came from, but I found that the whole issue is known as race git, which is described in the article.

Are you sure that your fix works?

@jcubic
Copy link
Contributor

jcubic commented Jun 18, 2023

The tests are not passing, it seems that the pipeline run for way too long as it uses it. I'll try to restart the pipeline if it still fails it's probably an issue with the PR.

@jayree
Copy link
Contributor Author

jayree commented Jun 19, 2023

I do not know where the small differences in the timestamp came from,

@dead-end these small differences in nanoseconds occur because the exact nanoseconds cannot be calculated from milliseconds const nanoseconds = (milliseconds - seconds * 1000) * 1000000. I'm using mtimeNs and ctimeNs (https://nodejs.org/api/fs.html#statsctimens) but this fs option doesn't seem to be compatbile with the browser @jcubic

@jcubic
Copy link
Contributor

jcubic commented Jun 19, 2023

Can you write a unit test for the use case in #1245? When the author is writing the number to the file. If it works we will be able to close that issue.

__tests__/test-status.js Outdated Show resolved Hide resolved
@jayree jayree marked this pull request as draft June 19, 2023 15:37
@jayree
Copy link
Contributor Author

jayree commented Jun 19, 2023

@jcubic before merging, we need to evaluate if the changes SecondsNanoseconds break something else. normalizeStats is also used in other cases!

@jayree jayree force-pushed the bigint-fix branch 8 times, most recently from 0ad5b73 to 4c2d6f1 Compare June 19, 2023 23:11
@jayree jayree force-pushed the bigint-fix branch 2 times, most recently from 8b22ec6 to a2aaf2f Compare June 19, 2023 23:43
@jayree jayree marked this pull request as ready for review June 20, 2023 07:58
@dead-end
Copy link
Contributor

@jayree: Sorry, I am still thinking about the differences in the timestamps.

We want to check if a file has changed and for this we compare the lastmodified time of the file with the lastmodified time in the index file. In isomorphic-git both are computetd with lstat so both should be the same if the file did not change. If there is a precision issue than this issue is on both sides.

So I think the issue occurs if users use their repository with git and isomorphic-git. If you stage a file with git and then call statusMatrix with isomorphic-git then you compare the git lstat with the isomorphic-git lstat and there can be differences. And if you have differences then you throw away the cache and get a performance issue.

I think it would be a good idea to write a test that addresses this issue. We can create a repository with git that contains a file and add it to the test repos and then compare the isomorphic-git lstat values with the values in the index file.

@@ -222,6 +222,12 @@ export class FileSystem {
async lstat(filename) {
try {
const stats = await this._lstat(filename)
// For non-browser (local) scenarios regular git 'add' command might be used to write the index. Therefore we need to have the exact nanoseconds (see. normalizeStats.js SecondsNanoseconds ).
Copy link
Contributor Author

@jayree jayree Jun 20, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I think the issue occurs if users use their repository with git and isomorphic-git. If you stage a file with git and then call statusMatrix with isomorphic-git then you compare the git lstat with the isomorphic-git lstat and there can be differences. And if you have differences then you throw away the cache and get a performance issue.

@dead-end thats right, and is mentioned here. if you use native git the timestamp (nanoseconds) is more accurate than the timestamp ('faked' nanoseconds by using ms * 1000000) in isomorphic-git. (this is also the reason why the git index is broken after isomorphic-git tries to rebuild it, becuase it uses the 'faked' nanoseconds)

Thats why we need to differentiate between browser and local. With bigint: true isomorphic-git is able to create the same timestamp (nanoseconds) as native git does but this option is not supported by browsers.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayree You have written:

(this is also the reason why the git index is broken after isomorphic-git tries to rebuild it, becuase it uses the 'faked' nanoseconds)

Can you explain how the faked nanos lead to the broken index file? We have a lot of issues with this and I would like to understand it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dead-end with the old behaviour nanoseconds were calculated based on ms.

From a file with mtimeMs: 1683216016493.005
you'll get 1683216016s and 493.0048828125ms --> 493004882.8125ns
Due to the numeric data types, floating point values are used, which is not the case in the BigInt data structure.
If you use the mtimeNs: 1683216016493004997n (bigint) value
you'll get 1683216016s and 493004997ns. These are the vaules git uses in the index.

The old logic fixing the #1245 compared 493004882.8125ns with 493004997ns which caused Isomorphic-git to think the index is broken telling isomorphic-git to rebuild the index, but with 1683216016s 493004882.8125ns wich doesn't match the real ns timestamp of the file.

@@ -117,4 +117,15 @@ describe('GitIndex', () => {
expect(index.entriesMap.get('c').stages.length).toBe(1)
})
})

it('ensure nanoseconds are always present', async () => {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it would be a good idea to write a test that addresses this issue. We can create a repository with git that contains a file and add it to the test repos and then compare the isomorphic-git lstat values with the values in the index file.

@dead-end We can try to enhance this one but I don't know if we can hardcode timestamps to test them the way you want.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you create a repo with git, then this should have an index file with the timestamps. Call GitIndex.fromBuffer to get the index content and then call lstat for each entry and compare the values. I think this should be sufficient.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dead-end tried it with this approach: e6c277a but as 'nano.txt' is checked out on the build server, the timestamp changes, so the test will always fail.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jayree Sorry, I did not think of that. The only possibilities I can think of are calling git from node or put the nano.txt in a zipfile and unzip the file during the test run (this should preserve the timestamps). I do not know if this is possible.
@jcubic

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dead-end I'm simply using the isomorphic-git repo now

@jayree jayree force-pushed the bigint-fix branch 3 times, most recently from 38c7626 to 0fa17a2 Compare June 20, 2023 12:57
@jayree jayree changed the title fix: use lstat bigint=true to fix #1245 fix: use lstat bigint=true for local environments to fix #1245 Jun 20, 2023
@jayree
Copy link
Contributor Author

jayree commented Jun 25, 2023

@jcubic The same question should isomorphic-git and native Git be able to coexist? Should isomorphic-git be able to read/write to an git index created with git add and vice versa?

@jayree jayree force-pushed the bigint-fix branch 3 times, most recently from 0fa17a2 to 195ec89 Compare June 26, 2023 05:05
@jcubic
Copy link
Contributor

jcubic commented Jun 26, 2023

@jayree yes, I think so, isomorphic-git is stateless everything happens inside the filesystem (in .git directory).

@jayree jayree force-pushed the bigint-fix branch 4 times, most recently from ce1fda3 to 8df074a Compare June 26, 2023 14:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants