#82 more complex spikeglx folder

Merged
sprenger merged 8 commits from NeuralEnsemble/spikeglx_extended into [3]s 1 year ago

New dataset from Graham Findlay. See issue #81.

See also https://github.com/SpikeInterface/spikeinterface/issues/628

This is needed for a patch in neo.

sprenger commented 2 years ago
Owner

Hi @samuelgarcia, thanks for taking care of the upload. Some open questions / comments are:

  • `spikeglx/README.txt`: the sentence `* have a more in the sub folder complete README.md` needs to be completed
  • if spikeglx has an internal format versioning, it would be good to use this as the main folder label instead of `sample_data_v2` (or maybe the spikeglx version v.20201103)
  • this dataset contains a lot (264) of tiny files summing up to 33 MB in total. Many of the tiny files seem to be duplicates, e.g. `/SpikeGLX/5-19-2022-CI5/5-19-2022-CI5_g0/` containing 8 `.bin` files. Is this duplication essential for the features you want to test, or would it be possible to
    • remove some of the duplicate files
    • make the files duplicates on the git-annex level (keeping different filenames that all link to the same content)
Samuel Garcia commented 2 years ago
Owner

Hi Julia, I will fix the naming and readme.

These little `.bin` files are not duplicates. They are 10 ms recordings covering several cases of the acquisition system: single/multiple gates and single/multiple triggers, with overlapping or non-overlapping chunks.

In neo, this will make the segment index a bit more complicated; I am working on it.

I know that it increases the dataset size, but I think it is needed.

@grahamfindlay: any comments?
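To illustrate the gate/trigger cases mentioned above, here is a minimal sketch of how the files could be grouped by gate and trigger index. It assumes the usual SpikeGLX naming scheme `<run>_g<gate>_t<trigger>.<stream>.bin`; the helper names `parse_name` and `group_by_gate_trigger` and the example filenames are hypothetical, and this is not how neo actually builds its segment index.

```python
import re
from collections import defaultdict

# Assumed naming scheme: <run>_g<gate>_t<trigger>.<stream>.<bin|meta>
# e.g. "run_g0_t1.imec0.ap.bin" (hypothetical example name).
PATTERN = re.compile(
    r"^(?P<run>.+)_g(?P<gate>\d+)_t(?P<trigger>\d+)"
    r"\.(?P<stream>.+)\.(?P<ext>bin|meta)$"
)


def parse_name(filename):
    """Return run, gate, trigger, stream and extension, or None if no match."""
    m = PATTERN.match(filename)
    if m is None:
        return None
    return {
        "run": m["run"],
        "gate": int(m["gate"]),
        "trigger": int(m["trigger"]),
        "stream": m["stream"],
        "ext": m["ext"],
    }


def group_by_gate_trigger(filenames):
    """Map each (gate, trigger) pair to the .bin files recorded under it."""
    groups = defaultdict(list)
    for name in filenames:
        info = parse_name(name)
        if info is not None and info["ext"] == "bin":
            groups[(info["gate"], info["trigger"])].append(name)
    # Sort keys so gates and triggers come out in acquisition order.
    return dict(sorted(groups.items()))
```

Each (gate, trigger) key would then correspond to one candidate segment; files that do not follow the assumed pattern are simply skipped.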

sprenger commented 2 years ago
Owner

@samuelgarcia Ok, but for the bin files that have exactly the same size you don't really care about the values of the samples in there as these only contain signal samples and no metadata, right? So I could replace the content of all bin files of identical size with the content of a single file.

sprenger commented 2 years ago
Owner

Note: I added a commit to lock the files.

Samuel Garcia commented 2 years ago
Owner

You mean with a symbolic link?

sprenger commented 2 years ago
Owner

With a symbolic link when the files are locked, but when unlocked the files will be independent, just containing the identical content.

Samuel Garcia commented 2 years ago
Owner

How can we do that in gin?


Yes, if you don't care about the content of the .bin files, it would be fine to replace their values with the content of a single file.

Caveats:

  • .meta files cannot be consolidated in this way.
  • You may care about the contents of the .bin files if you wish to write tests confirming that they were concatenated/loaded properly, especially in the case of overlapping t-segments.
  • .meta files contain information like hashes for the .bin files, which will obviously no longer be accurate.
  • Although I requested that the acquisition system give me files of consistent duration, there may be some variability in the actual number of samples per file. If you truly make all these `.bin` filenames point to the same underlying data, meta fields like `fileTimeSecs` and `fileSizeBytes` may be inaccurate.
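The last two caveats can be checked mechanically. The sketch below reads a `.meta` file as plain `key=value` lines and compares the recorded size and hash with the actual `.bin` content. The key name `fileSHA1` and the assumption that it holds a hex SHA-1 digest of the `.bin` file are not confirmed in this thread; `read_meta` and `check_bin_against_meta` are hypothetical helper names.

```python
import hashlib
from pathlib import Path


def read_meta(meta_path):
    """Parse a key=value text file into a dict of strings."""
    meta = {}
    for line in Path(meta_path).read_text().splitlines():
        if "=" in line:
            key, value = line.split("=", 1)
            meta[key.strip()] = value.strip()
    return meta


def check_bin_against_meta(bin_path, meta_path):
    """Report whether the size and hash stored in .meta still match the .bin."""
    meta = read_meta(meta_path)
    data = Path(bin_path).read_bytes()
    report = {}
    if "fileSizeBytes" in meta:
        report["size_ok"] = int(meta["fileSizeBytes"]) == len(data)
    # Assumption: "fileSHA1" stores a hex SHA-1 digest of the .bin content.
    if "fileSHA1" in meta:
        digest = hashlib.sha1(data).hexdigest()
        report["sha1_ok"] = meta["fileSHA1"].lower() == digest
    return report
```

After replacing `.bin` contents with a single shared file, any `False` entry in the report marks a `.meta` field that no longer describes its `.bin` file.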
sprenger commented 2 years ago
Owner

@samuelgarcia: if two files have identical content, git-annex will automatically store the content only once. So you could (e.g. using `gin-cli`):

  • unlock all bin files
  • replace the content of all files with identical size by only a single version
  • commit the files again
  • lock the files again
  • upload the locked version of the files
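The second step of the list above (replacing the content of same-size files) could be sketched in Python as follows. It assumes the `.bin` files have already been unlocked, so that they are regular writable files rather than symlinks into the annex; unlocking, committing, locking and uploading are left to `gin-cli`. The function name `unify_same_size_bins` is hypothetical.

```python
import shutil
from collections import defaultdict
from pathlib import Path


def unify_same_size_bins(root):
    """Overwrite every .bin file with the first file of the same size.

    Assumes the files are unlocked (regular files, not annex symlinks).
    Returns the list of paths whose content was replaced.
    """
    by_size = defaultdict(list)
    for path in sorted(Path(root).rglob("*.bin")):
        by_size[path.stat().st_size].append(path)

    replaced = []
    for paths in by_size.values():
        reference = paths[0]  # content of this file is kept as-is
        for other in paths[1:]:
            shutil.copyfile(reference, other)
            replaced.append(other)
    return replaced
```

Because the filenames are kept and only the bytes change, git-annex should then see identical content under several names and store it once, as described above. Note that this discards the original sample values, so it is only appropriate if no test depends on them.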
Samuel Garcia commented 1 year ago
Owner

@sprenger: can we merge this? I already merged your PR into that branch.

sprenger commented 1 year ago
Owner

@samuelgarcia: It's merged. Can you confirm again the merged version works for your tests?

Samuel Garcia commented 1 year ago
Owner
Tests seem to pass!! https://github.com/NeuralEnsemble/python-neo/pull/1125
This pull request has been merged successfully!