Feature Request
Problem
Right now, hashing a large file in a store can take quite a long time. For example, on a ~200 GB raw data file, the hash took >200 seconds, which results in a long delay before data from that file can be used.
Requirements
Ideally the system would store the hash, the size, and the modification date, and would first check the size and date for a match. A matching size and date seems sufficient to ensure that it is the same file, so the stored hash could be reused without re-reading the file. This could also be a .config option for the store.
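A minimal sketch of the proposed check, assuming metadata previously recorded for the store is available as a dict. The names (`stored_meta`, `get_hash`, the `skip_rehash` flag) are illustrative only and are not part of the current DataJoint API; the point is that a cheap `os.stat` call replaces a full read of a ~200 GB file whenever size and modification time are unchanged.

```python
import hashlib
import os


def file_fingerprint(path):
    """Return (size, mtime) for a file -- cheap, via a single stat call."""
    stat = os.stat(path)
    return stat.st_size, stat.st_mtime


def checksum(path, chunk_size=1 << 20):
    """Full MD5 of the file contents -- expensive for very large files."""
    md5 = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
    return md5.hexdigest()


def get_hash(path, stored_meta, skip_rehash=True):
    """Reuse the stored hash when size and modification date are unchanged.

    `stored_meta` is assumed to look like
    {"hash": ..., "size": ..., "mtime": ...} as previously recorded for the store;
    `skip_rehash` stands in for the proposed .config option.
    """
    size, mtime = file_fingerprint(path)
    if (
        skip_rehash
        and stored_meta is not None
        and stored_meta["size"] == size
        and stored_meta["mtime"] == mtime
    ):
        return stored_meta["hash"]  # fast path: no re-read of the file
    return checksum(path)  # slow path: size or date changed, recompute
```

The full checksum then runs only on first ingest or when the size or modification time differ, which is the scenario the long delay is currently paid for on every access.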
Justification
Provide the key benefits of making this a supported feature. Ex. Adding support for this feature would ensure [...]
Alternative Considerations
Do you currently have a work-around for this? Provide any alternative solutions or features you've considered.
Related Errors
Add any errors as a direct result of not exposing this feature.
Please include steps to reproduce provided errors as follows:
- OS (WIN | MACOS | Linux)
- Python Version OR MATLAB Version
- MySQL Version
- MySQL Deployment Strategy (local-native | local-docker | remote)
- DataJoint Version
- Minimum number of steps to reliably reproduce the issue
- Complete error stack as a result of evaluating the above steps
Screenshots
If applicable, add screenshots to help explain your feature.
Additional Research and Context
Add any additional research or context gathered in creating this feature request.
For example:
- Related GitHub issues and PRs, either within this repository or in other relevant repositories.
- Links to specific lines or areas of focus within the source code.
- Relevant summaries of maintainer development meetings, milestones, projects, etc.
- Any additional supplemental web references or links that would further justify this feature request.