Do you know if n-gram speculation is working? I think that would be even more impactful and simpler to handle, since a lot of structured tasks are rewrites.
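For context on why n-gram speculation suits rewrite-style tasks, here is a minimal sketch of the idea, not the implementation of any particular server. The function name `ngram_draft` and its parameters are hypothetical. It proposes draft tokens by looking up the trailing n-gram earlier in the sequence and copying what followed it, which tends to hit often when the output repeats parts of the input.

```python
from typing import List, Sequence


def ngram_draft(tokens: Sequence[int], ngram_size: int = 3, max_draft: int = 5) -> List[int]:
    """Propose draft tokens by copying what followed the most recent
    earlier occurrence of the trailing n-gram (hypothetical helper)."""
    if len(tokens) < ngram_size:
        return []
    tail = list(tokens[-ngram_size:])
    # Scan backwards over earlier positions, skipping the trailing n-gram itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if list(tokens[start:start + ngram_size]) == tail:
            follow = start + ngram_size
            return list(tokens[follow:follow + max_draft])
    # No earlier match: no draft, so decoding falls back to one token at a time.
    return []
```

The drafted tokens still have to be verified by the base model, so a wrong draft costs speed but not correctness.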
What behavior of the library made you think about the improvement?
As of now, Medusa generates hallucinations because its speculative multi-head decoding does not support the outline decoding grammar.
How would you like it to behave?
Support speculative decoding together with grammar-constrained generation, for performance reasons.
Note: for now only TGI supports Medusa. vLLM does not yet, but support is planned.
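To make the request concrete, the desired behavior could be sketched roughly as follows. This is an illustration under stated assumptions, not Medusa's, TGI's, or this library's API: `accept_draft`, `allowed_next`, and `verify_next` are hypothetical names. `allowed_next` stands in for the grammar (it returns the set of token ids permitted after a given context), and `verify_next` stands in for the base model's choice under that constraint. A real server would verify all draft positions in one batched forward pass; the loop here is sequential only for clarity.

```python
from typing import Callable, List, Sequence, Set


def accept_draft(
    prefix: Sequence[int],
    draft: Sequence[int],
    allowed_next: Callable[[List[int]], Set[int]],
    verify_next: Callable[[List[int]], int],
) -> List[int]:
    """Return the longest prefix of `draft` that is both grammar-valid and
    confirmed by the verifier (all names here are hypothetical)."""
    accepted: List[int] = []
    for tok in draft:
        context = list(prefix) + accepted
        # Reject the draft as soon as a token violates the grammar.
        if tok not in allowed_next(context):
            break
        # Reject the draft as soon as the verifier would pick another token.
        if tok != verify_next(context):
            break
        accepted.append(tok)
    return accepted
```

With a check like this, a speculative head that proposes tokens outside the grammar only loses its speed-up for that step instead of emitting invalid output.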