#23 Gin Windows client freezes when committing lots of files

Open
opened 4 years ago by seharak · 9 comments

Versions

  • gin client for Windows, version 1.8 build 001390
  • git version 2.21.0.windows.1
  • git-annex version 7.20190508-g8b44548d0

What I experienced

Please refer to the screenshots below if they are helpful.

  • Managing the repository on the C drive
  • Added ~300 files at once (~35 MB/file), and tried to commit
  • The client logged :: Adding file changes, and listed all the files being registered
  • The commit process did not go any further
  • Waited 3 days to see if anything changed, but nothing further happened
  • Pressing Ctrl+C aborted the process.
  • Running gin ls afterwards also froze the console window; I aborted it again.
  • Retrying gin commit seemed to resume the process.
  • A later gin ls also succeeded.

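screenshot 1: https://gin.g-node.org/attachments/20409c27-c968-4ebd-b97e-2c7ac19fb213

screenshot 2: https://gin.g-node.org/attachments/d538cc8c-765e-4d12-97aa-d4ab0286830b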

You may be able to see that I aborted the process around line 5 of the second screenshot.

Can you track down what went wrong? It was difficult for me to tell which step is problematic.

Achilleas Koutsou commented 4 years ago
Owner

Hi Keisuke Sehara, thanks for the report.

I haven't managed to reproduce this issue consistently, though I have seen it pop up from time to time. I'm going to try and reproduce your situation (almost) exactly, with the same number and size of files. I'll leave this issue open and update it with anything I find and I'll let you know when it's fixed.

Keisuke Sehara commented 4 years ago
Poster

Thanks, Achilleas!

Achilleas Koutsou commented 4 years ago
Owner

Update on the current state:

There are a couple of candidates for the root cause of this issue. I've eliminated the one that I believe was the most likely and that was definitely causing the client to hang at the end of a gin commit: the client was performing a git add immediately after finishing a git annex add. This isn't a problem in itself, but it does mean that the adding was done twice. It still doesn't explain why the commit wasn't completing even after 3 days, but it is possible that the second add was putting the client in a hung state.

I created a test repository to match your description (300 files, ~35 MiB each) and tested doing a commit on Windows. After the change, it seems to be running much faster. There is still a point after the files are added and before the commit where the client will appear to hang. During this time, the client is adding metadata to the annex about the names of the files that have been added. I intend to make this run concurrently with the add command so that the user doesn't have to wait as long, and to print appropriate info so the user knows what's going on.

If you want to test the current state, you can get a dev version of the client here: https://gin.g-node.org/achilleas/gin-cli-builds. If you decide to use a dev build for testing, please set up a testing repository and don't use your primary data repository.

Keisuke Sehara commented 4 years ago
Poster

Hi,

thanks for the update! I don't know when I can actually test, but I will keep this in mind.

I agree that, although the add process works flawlessly, the client does not seem to respond after the process completes. On my side, I have changed my strategy these days, and I abort the process as soon as I feel that the client has hung. Every time I aborted and re-ran gin commit, the number of files being checked in got smaller, and eventually the commit completed (after 6-7 runs). So, even though the client seems to hang, some of the concurrent processes are completing behind the scenes. For the moment, I will tell my colleagues about this "symptomatic solution".

Thanks again for your efforts.

Achilleas Koutsou commented 4 years ago
Owner

I'm still working to make this as painless as possible. I'm certain now that the real issue is the metadata writing. I could remove the whole thing, but that would mean making some other changes, so I'm not ready to finalise it yet. I'll let you know when there's an update.

Achilleas Koutsou commented 4 years ago
Owner

It seems that part of this freezing is unavoidable when adding new files to a repository. The delay occurs when there are a lot of files being added that don't go into the annex but instead are checked into git. I don't know if there is a way around this right now, or if there will be in the near future. For now, I'll see if I can add some useful output during that operation so that it doesn't look like it's frozen.

All that said, it still shouldn't be running for 3 days, so I'll keep looking into what else might be going on.

Keisuke Sehara commented 4 years ago
Poster

Thanks for working on it!

> All that said, it still shouldn't be running for 3 days, so I'll keep looking into what else might be going on.

Just to make sure: it just hangs without any real work going on behind the scenes. It seems that the gin process can be readily killed after ~1 min or so.

After some "run-and-kill"s, all the subprocesses (or child threads / goroutines?) will have finished, and only then will gin commit finish properly.
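The "run-and-kill" workaround described above could be automated roughly as follows. This is only a sketch under assumptions: it needs a POSIX shell with the GNU coreutils timeout command, the function name retry_with_timeout is made up for illustration, and timeout sends SIGTERM rather than the Ctrl+C interrupt used manually, so it is not confirmed that gin behaves the same way when stopped like this.

```shell
#!/bin/sh
# retry_with_timeout LIMIT MAX_TRIES CMD [ARGS...]   (hypothetical helper)
# Runs CMD. If it exits non-zero, or is still running after LIMIT seconds
# (in which case `timeout` terminates it), CMD is started again,
# up to MAX_TRIES attempts in total.
# Returns 0 as soon as one attempt succeeds, 1 if every attempt failed.
retry_with_timeout() {
  limit=$1
  max_tries=$2
  shift 2
  try=1
  while [ "$try" -le "$max_tries" ]; do
    if timeout "$limit" "$@"; then
      return 0
    fi
    echo "attempt $try did not finish cleanly; retrying" >&2
    try=$((try + 1))
  done
  return 1
}
```

For example, `retry_with_timeout 60 10 gin commit` would give each attempt about one minute and stop after ten attempts; both numbers are guesses based on the "~1 min" and "6-7 runs" mentioned in this thread.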

Achilleas Koutsou commented 4 years ago
Owner

Yeah, I've been experimenting with a few situations and this does happen as well. It's possible the underlying shell doesn't return properly if it's been running for a while, or it does and gin-cli just doesn't close it. I'm going through a bit of a restructuring of the gin-cli internals and I'm hoping this might shed some light on this particular bug.

Keisuke Sehara commented 3 years ago
Poster

Now I have more info on this issue (sorry that I do things quite slowly).

I think the gin commit command needs to add all files through git annex or git, instead of splitting the jobs into goroutines (please see below for more details).

GIN command line client 1.11 Build 001459 (103e923620de08ff77cf290e0b840a6d66cebfe1)
  git: 2.28.0
  git-annex: 8.20200908

Test case

I have lots (500–1000) of medium-sized (100–400 KB) files, and I checked them into git-annex all at once.

Method 1

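I thought serializing all the procedures could prevent accidental hangs, so I did something like the following:

    cp 'files/located/somewhere/else' "$DIR"

    # Add the files one at a time: one `gin git add` call per regular file.
    # -print0 / read -d '' keeps paths with spaces intact (requires bash).
    find "$DIR" -type f -print0 | while IFS= read -r -d '' FILE; do
      gin git add -v "$FILE"
    done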

Observations

I saw an exponential increase in the time it took to check in each file. At first, a file was checked in every 0.3 s or so, but after 500–1000 files the latency had grown to more than 30 s per file. The total time for one commit was more than 12 hours.
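To reproduce this kind of measurement, a small logger along the following lines could be used. It is a sketch, not what was actually run here: the function name log_add_latency is hypothetical, it only has whole-second resolution because it relies on date +%s (available with GNU and BSD date), and it does not handle file names containing newlines.

```shell
#!/bin/sh
# log_add_latency DIR CMD [ARGS...]   (hypothetical helper)
# Calls CMD once per regular file under DIR, with the file path appended
# as the last argument, and prints "<elapsed seconds> <path>" for each call.
log_add_latency() {
  dir=$1
  shift
  find "$dir" -type f | while IFS= read -r file; do
    start=$(date +%s)
    # stdin is redirected so CMD cannot swallow the file list from the pipe
    "$@" "$file" >/dev/null </dev/null
    end=$(date +%s)
    echo "$((end - start)) $file"
  done
}
```

A call such as `log_add_latency "$DIR" gin git add -v` would then show whether the elapsed time per file climbs as more files are added.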

Speculation

I suspect that, when one calls git add 'A/B/file', the tree objects for directories A and B are rebuilt from scratch every time, with the already-existing/indexed child objects being added to them again. That would be a reasonable explanation for the exponential increase in the latency.
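As a rough illustration of this speculation, here is a toy cost model. It is not a measurement of git or git-annex, and the "one unit of work per already-indexed entry" rule and the function names are assumptions made only for the example. Under that rule, the k-th single-file add costs k units, so n separate calls cost 1 + 2 + ... + n = n(n+1)/2 units in total, whereas a single call over all n files costs n units. Strictly speaking that growth is quadratic rather than exponential, but it would still make each add slower than the one before.

```shell
#!/bin/sh
# Toy cost model. Assumption: every add call does one unit of work
# for each entry that is in the index once the call has finished.

# Total units when files are added with one call per file:
# call k touches k entries, so the total is 1 + 2 + ... + n.
cost_one_call_per_file() {
  n=$1
  total=0
  k=1
  while [ "$k" -le "$n" ]; do
    total=$((total + k))
    k=$((k + 1))
  done
  echo "$total"
}

# Total units when all n files are added in a single call:
# each entry is touched exactly once.
cost_single_call() {
  echo "$1"
}
```

With 1000 files, `cost_one_call_per_file 1000` prints 500500 while `cost_single_call 1000` prints 1000, which is at least the same direction as the gap between Method 1 and Method 2.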

Method 2

Next, I used the following:

    cp 'files/located/somewhere/else' "$DIR"
    # Add the whole directory in a single call
    gin git add -v "$DIR"

With this approach, the latency stayed constant for all files from beginning to end. The total time for a commit was less than 30 minutes.
