Releases: coldint/coldint_validator
Version 1.0.7
Summary of the most important changes:
- Slowly rotating pool of cached samples:
  - Up to 20k samples and losses are cached for all available models.
  - Much faster step times: models in the pool typically don't have to be evaluated over and over on new samples.
  - Refresh rate: 500 samples per 4 hours.
- Increase samples/step to 1500 (was 800) for more fine-grained model comparison
- Remove logging of selected sample pages, because the sample pool should remain private to the validator.
- Adjusted model re-evaluation criteria:
  - All models: once every 48 hours (unless marked as non-competitive, which happens when wins and abs wins are <2%)
  - Top models: once every 8 hours
  - Top model criterion: at least one of the top 10 validators assigns >=0.05 weight (was: 0.1)
- Duplicate commitments of the same repo are filtered out
- Model content duplications are filtered out
- Sync/download thread obeys a TTL so that top and new models have priority over re-evaluation of other models
- Improved logging about which uids/models are visited
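The rotating sample pool described above can be pictured roughly as follows. This is only a minimal sketch under stated assumptions, not the validator's actual code: the class `RotatingSamplePool`, its methods, and the `fetch_new` and `compute_loss` callbacks are hypothetical names chosen for illustration. Only the numbers (20k samples, 500 new samples per 4 hours) come from the notes above.

```python
import itertools
import random


class RotatingSamplePool:
    """Hypothetical pool of cached samples and per-model losses."""

    def __init__(self, max_size=20_000, refresh_count=500, refresh_interval=4 * 3600):
        self.max_size = max_size
        self.refresh_count = refresh_count
        self.refresh_interval = refresh_interval  # seconds
        self.samples = []        # sample ids, oldest first
        self.losses = {}         # model_id -> {sample_id: cached loss}
        self.last_refresh = None

    def maybe_refresh(self, fetch_new, now):
        """Add new samples once per interval and evict the oldest overflow."""
        if self.last_refresh is not None and now - self.last_refresh < self.refresh_interval:
            return False
        self.samples.extend(fetch_new(self.refresh_count))
        overflow = max(0, len(self.samples) - self.max_size)
        evicted, self.samples = self.samples[:overflow], self.samples[overflow:]
        # Drop cached losses that belong to evicted samples.
        for cache in self.losses.values():
            for sample_id in evicted:
                cache.pop(sample_id, None)
        self.last_refresh = now
        return True

    def evaluate(self, model_id, n, compute_loss, rng=random):
        """Return losses on n pooled samples, computing only the uncached ones."""
        chosen = rng.sample(self.samples, min(n, len(self.samples)))
        cache = self.losses.setdefault(model_id, {})
        computed = 0
        for sample_id in chosen:
            if sample_id not in cache:
                cache[sample_id] = compute_loss(model_id, sample_id)
                computed += 1
        return [cache[s] for s in chosen], computed
```

In this sketch a model that has already been scored on most of the pool needs new forward passes only for recently added samples, which is where the faster step times would come from.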
Version 1.0.4
- overall reduction of logging noise
- add btlite.py that replaces bittensor lib e.g. for weight-setting with better logging and error handling
- add btlite_test.py to test weight setting, including 'havoxy' (a really bad proxy that intentionally injects issues)
- lots of weight setting improvements
- reworked model eval loop completely (too many things to describe in detail here)
- added disk space management: larger disk means less model re-downloads!
- proper handling of SIGINT (i.e. ctrl-C)
- improve logging and exception handling for subprocesses
- work around huggingface_hub file locking issue
Please see Discord for additional background and details.
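As an illustration of what proper SIGINT handling can look like in a long-running Python loop, here is a minimal sketch. It uses only the standard `signal` and `threading` modules. The names `stop_requested`, `install_handler` and `run_loop` are hypothetical and do not describe how the validator itself is structured.

```python
import signal
import threading

stop_requested = threading.Event()


def handle_sigint(signum, frame):
    # Only record the request; the main loop decides when it is safe to stop.
    stop_requested.set()


def install_handler():
    # Must be called from the main thread, as required by signal.signal.
    signal.signal(signal.SIGINT, handle_sigint)


def run_loop(step, cleanup):
    """Run step() until a stop is requested, then always run cleanup()."""
    steps_done = 0
    try:
        while not stop_requested.is_set():
            step()
            steps_done += 1
    finally:
        cleanup()
    return steps_done
```

Calling `install_handler()` once at startup means that ctrl-C lets the current step finish and then triggers cleanup, instead of raising `KeyboardInterrupt` at an arbitrary point.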
Version 1.0.2
This release should fix several issues:
- subtensor connection stability: sometimes metagraph syncing and/or weight setting fails, which seems to be caused by a defunct subtensor connection. The connection is now recreated when these errors are detected
- bittensor library pinned to v3.7.1, to prevent problems relating to bt.logging
- Removed a bogus error message about competitions-uid
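The reconnect-on-error approach mentioned above can be sketched generically. This is an assumption-laden illustration, not the validator's implementation: `ReconnectingClient`, `connect` and `operation` are hypothetical names, and no real bittensor calls are used.

```python
class ReconnectingClient:
    """Hypothetical wrapper that recreates a connection after a failure."""

    def __init__(self, connect, retry_on=(ConnectionError, TimeoutError), max_attempts=3):
        self._connect = connect          # factory returning a fresh connection
        self._retry_on = retry_on        # exception types treated as "defunct connection"
        self._max_attempts = max_attempts
        self._conn = connect()

    def call(self, operation):
        """Run operation(conn); on a connection error, reconnect and retry."""
        for attempt in range(1, self._max_attempts + 1):
            try:
                return operation(self._conn)
            except self._retry_on:
                if attempt == self._max_attempts:
                    raise  # give up and surface the last error
                self._conn = self._connect()
```

Which exception types indicate a defunct connection depends on the client library, so `retry_on` is left configurable here.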
Version 1.0.1
This release contains many small and some large improvements:
- sliced Llama, Phi and Phi3 evaluation, unlocking virtually unlimited model size
- 30% speedup of model evaluation
- periodic re-evaluation of all models
- sync model/metagraph rewrite
- subprocess logging improvement
- improved error handling and reporting
- reduction/prevention of excessive errors/logging
- logging of more statistics
- even more runtime configurable parameters
- improved competitions.json format
- fixes for tokenizer hacks
- allow for a limited number of tokenization issues (true fix under investigation)
- various clarifications and simplifications
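One common way to evaluate a model that does not fit in memory is to process it a few layers at a time. The toy sketch below assumes that this is roughly what "sliced" evaluation refers to; the notes do not spell out the mechanism. The function `sliced_forward` and the toy layers are hypothetical, and plain Python numbers stand in for tensors.

```python
def sliced_forward(load_slice, num_layers, slice_size, batches):
    """Run every batch through the model one slice of layers at a time."""
    activations = list(batches)
    for start in range(0, num_layers, slice_size):
        stop = min(start + slice_size, num_layers)
        layers = load_slice(start, stop)  # only this slice is resident
        for i, x in enumerate(activations):
            for layer in layers:
                x = layer(x)
            activations[i] = x  # keep intermediate activations between slices
        del layers  # release the slice before loading the next one
    return activations


# Toy "model" for illustration: layer k maps x to 2 * x + k.
def make_layer(k):
    return lambda x: 2 * x + k


def toy_load_slice(start, stop):
    return [make_layer(k) for k in range(start, stop)]
```

The result is the same as running all layers at once, but peak memory is bounded by the slice size plus the stored activations rather than by the full model.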
Version 1.0.0
This release consists of many improvements and introduces several new features:
- Competitions. The validator now supports multiple competitions, with settings fetched from the coldint/sn29 repo (competitions.json).
- Include tokenizer with model (in upload and download). Competitions can either have a fixed tokenizer, or load the one included with the model on huggingface.
- Many constants are now properties of the competition.
- Loss sum is computed instead of average, although both are displayed in several places. The sum is more meaningful when comparing different tokenizers; this is a first step in exploring that comparison.
- Internal state reworked to add competition support and make things somewhat simpler.
- Option to save json output which can be used for a leaderboard (our leaderboard will be included in the codebase soon)
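A small worked example may help show why the loss sum is more meaningful than the average when tokenizers differ. The per-token losses below are made-up numbers for illustration only. The two lists represent the same text split into 10 tokens by one tokenizer and into 20 tokens by another.

```python
def summarize(token_losses):
    """Return (sum, average) of per-token losses for one text."""
    total = sum(token_losses)
    return total, total / len(token_losses)


# Same text, two hypothetical tokenizers with different granularity.
coarse = [2.0] * 10   # 10 tokens, higher loss per token
fine = [1.25] * 20    # 20 tokens, lower loss per token

coarse_sum, coarse_avg = summarize(coarse)
fine_sum, fine_avg = summarize(fine)

print("coarse: sum=%.2f avg=%.2f" % (coarse_sum, coarse_avg))
print("fine:   sum=%.2f avg=%.2f" % (fine_sum, fine_avg))
```

The average favours the fine-grained tokenizer simply because it spreads the loss over more tokens, while the sum, which is the total negative log-likelihood of the text, favours the coarse one.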