onnx: workaround pow with negative base #2439
Conversation
candle-onnx/src/eval.rs (Outdated)
// HACK: current implementation of (broadcast_)pow cannot handle negative base;
// this is rarely an issue, but blocked support of silero-vad which uses x^2.
// Rather than fix pow for all negative base, we choose to handle just x^2.
if let Ok(2.0) = (|| input1.flatten_all()?.to_scalar::<f32>())() {
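For context, a minimal sketch of what this special case amounted to, reconstructed from the PR description rather than copied from the merged diff (the `input0`/`input1` bindings and the surrounding ONNX `Pow` handling are assumed):

```rust
// Sketch only: if the exponent tensor is the constant 2.0, evaluate x^2 as x*x,
// avoiding the CPU backend's e^(2.0 * ln(x)), which yields NaN for negative x.
let output = if let Ok(2.0) = (|| input1.flatten_all()?.to_scalar::<f32>())() {
    input0.mul(input0)? // x * x is well defined for a negative base
} else {
    input0.broadcast_pow(input1)? // pre-existing general path
};
```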
Rather than handling the case 2.0 specifically (which indeed seems very hacky), why not handle any "constant" power with `powf`? This would feel less like a hack and would handle a lot more cases, e.g. square root etc.
That would require looping through all the tensor values (a sort of inline implementation of `broadcast_pow`). I guess it's still kind of a hack, but if that's preferable I can try that.
edit: ah, not `f32::powf` but rather candle's `Tensor::powf`.
ah, it also requires switching on the `DType` of the tensor, as `i32`, `u32` don't have `powf(self, n: f32)` defined, but rather `pow(self, exp: u32)`.
edit: see above
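For reference, a small standalone example of the signature mismatch being described here, using only the Rust standard library:

```rust
fn main() {
    // Float types have powf(self, n: Self), which handles a negative base
    // (integer-valued exponents give the expected result, others give NaN).
    let x = (-3.0_f32).powf(2.0); // 9.0
    // Integer types instead provide pow(self, exp: u32), so an element-wise
    // fallback would have to switch on the tensor's DType.
    let y = (-3_i32).pow(2); // 9
    println!("{x} {y}");
}
```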
Yep exactly, it would use `Tensor::powf` and so work for exponents that are not 2 (and no longer feel like a hack but rather like an optimization for this case).
I still left a note in case anyone looks at this in the future thinking "why not just use `broadcast_pow` for both cases?". As for the optimization, I'm not sure whether it is faster or not; this could end up a performance pessimization if `powf` isn't using acceleration but `broadcast_pow` is.
Rather than fully defining pow in the cpu backend (as in #2318), this implements a much smaller change which is sufficient to evaluate silero-vad onnx models. Specifically, it checks whether pow is run with a 2.0 exponent, and if so evaluates it as simply `x*x` instead of the cpu backend's `e^(2.0 * ln(x))`.
powf correctly handles a negative base.
Thanks!
* Bump the version to 0.6.1. (huggingface#2438)
* onnx: workaround pow with negative base (huggingface#2439)
  * onnx: workaround pow with negative base — rather than fully defining pow in the cpu backend (as in huggingface#2318), this implements a much smaller change which is sufficient to evaluate silero-vad onnx models. Specifically, checking if pow is run with 2.0 exponent, and if so evaluate as simply `x*x` instead of the cpu backend of `e^(2.0 * ln(x))`.
  * PR: use Tensor::powf instead — powf correctly handles a negative base.
* onnx: support negative index in Gather (huggingface#2440) — index_select does not support negative indexing, but this change adds just enough workarounds in onnx to allow evaluating silero-vad models (which make use of negative indices).
* silero-vad v5 example (huggingface#2321)
  * silero-vad v5 example — this change adds an example of how to run silero-vad v5
  * PR: rename 'vad' to 'silero-vad'
  * Update README.md
  Co-authored-by: Laurent Mazare <[email protected]>
* Fix for parler-tts, do not add the last slice of padding tokens. (huggingface#2442)
  * Fix for parler-tts, do not add the last slice of padding tokens.
  * Support for the mini model.
* Add FastViT model. (huggingface#2444)
* fix: qwen2 lm_head loading huggingface#2443 (huggingface#2445)
  Co-authored-by: Yi Xu <[email protected]>
* Update cudarc to 0.12. (huggingface#2451)
  * Update cudarc to 0.12.
  * Some cudnn tweaks.
* FastViT fixes. (huggingface#2452)
  * correct optional SE layer dimensions.
  * head_dim instead of num_heads is 32.
  * update test example output.
* MobileCLIP models S1 and S2 (huggingface#2454)
  * Allow loading images with given std and mean
  * OpenCLIP text encoder component
  * Two MobileCLIP models
  * Clippy fixes.
  Co-authored-by: Laurent <[email protected]>
* Fix FLUX.1 weights (huggingface#2457)
  * fix FLUX.1 weights
  * added flux1-dev.safetensors
* Clippy fixes for 1.81.0. (huggingface#2461)
  * Clippy fixes for 1.81.0.
  * Another fix.
* Make Error::msg more in line with anyhow::Error::msg
  * Add context trait
  * Even more flexible
  * Format

Co-authored-by: Laurent Mazare <[email protected]>
Co-authored-by: shua <[email protected]>
Co-authored-by: Jani Monoses <[email protected]>
Co-authored-by: ilookee <[email protected]>
Co-authored-by: Yi Xu <[email protected]>
Co-authored-by: Eugene Hauptmann <[email protected]>
* onnx: workaround pow with negative base — rather than fully defining pow in the cpu backend (as in huggingface#2318), this implements a much smaller change which is sufficient to evaluate silero-vad onnx models. Specifically, checking if pow is run with 2.0 exponent, and if so evaluate as simply `x*x` instead of the cpu backend of `e^(2.0 * ln(x))`.
* PR: use Tensor::powf instead — powf correctly handles a negative base.
Rather than fully defining pow in the cpu backend (as in #2318), this implements a much smaller change which is sufficient to evaluate silero-vad onnx models. Specifically, it checks whether we can use `Tensor::powf`, which is defined for a negative base.
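A minimal sketch of the approach the PR converged on, assuming candle's `Tensor::powf(&self, e: f64)` and `broadcast_pow` APIs and the same scalar extraction as the snippet quoted above; the exact merged code in `candle-onnx/src/eval.rs` may differ:

```rust
// Sketch: if the exponent is a single constant scalar, use Tensor::powf, which
// correctly handles a negative base; otherwise keep the broadcast_pow path.
let output = match (|| input1.flatten_all()?.to_scalar::<f32>())() {
    Ok(exp) => input0.powf(exp as f64)?,
    Err(_) => input0.broadcast_pow(input1)?,
};
```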