Issues: predibase/lorax
Improve warmup checking for max new tokens when using speculative decoding
bug
Something isn't working
good first issue
Good for newcomers
#474
opened May 17, 2024 by
tgaddair
Bug Report: lorax-launcher failed with --source "s3" for model_id "mistralai/Mistral-7B-Instruct-v0.2"
#473
opened May 17, 2024 by
donjing
1 of 4 tasks
Ensure api_token is not included in the response on error
bug
Something isn't working
#469
opened May 15, 2024 by
tgaddair
AutoTokenzier.from_pretrains needs setting with trust_remote_code inside load_module_map
#466
opened May 13, 2024 by
thincal
1 of 4 tasks
Add all launcher args as optional in the Helm charts
enhancement
New feature or request
#465
opened May 9, 2024 by
tgaddair
Retrieve all lora models from Huggingface hub by base model setting.
#463
opened May 8, 2024 by
svjack
Improve async load for adapters to avoid main thread lockups in server
enhancement
New feature or request
#457
opened May 3, 2024 by
tgaddair
Batch inference endpoint (OpenAI compatible)
enhancement
New feature or request
#448
opened Apr 30, 2024 by
tgaddair
Fallback to Flash Attention v1 for pre-Ampere GPUs
enhancement
New feature or request
good first issue
Good for newcomers
#440
opened Apr 26, 2024 by
tgaddair
Improve the latency of load_batched_adapter_weights
enhancement
New feature or request
#433
opened Apr 22, 2024 by
thincal
Inference with AWQ quantized base model + compile enabled results in the <unk> tokens
#426
opened Apr 19, 2024 by
thincal
4 tasks
Error: Warmup(Generation("'bool' object has no attribute 'dtype'"))
#422
opened Apr 18, 2024 by
KrisWongz
1 of 4 tasks
Can't run Mistral quantized on T4
enhancement
New feature or request
#417
opened Apr 16, 2024 by
emillykkejensen
2 of 4 tasks
LoRAX server with 2 GPUs and multiple adapters becomes permanently faster in swapping ONLY after parallel execution of requests.
#395
opened Apr 8, 2024 by
lighteternal
1 of 4 tasks
In Structured Output, a JSON schema with a date string format will yield invalid JSON
#392
opened Apr 5, 2024 by
oscarjohansson94
2 of 4 tasks