Currently, examples/server should provide an OpenAI-API-compatible layer. If you encounter specific issues, we welcome your feedback.
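As a sketch of what such a compatibility layer accepts, the snippet below assembles a request in the OpenAI chat-completions wire format. The host, port, endpoint path, and model name are assumptions for a locally running server instance, not values confirmed in this thread.

```python
import json
import urllib.request

# Hypothetical local endpoint; the actual examples/server defaults may differ.
BASE_URL = "http://127.0.0.1:8080"

def build_chat_request(prompt, model="local-model", temperature=0.7):
    """Assemble a payload in the OpenAI chat-completions format."""
    return {
        "model": model,
        "temperature": temperature,
        "messages": [{"role": "user", "content": prompt}],
    }

payload = build_chat_request("Hello, PowerInfer!")
print(json.dumps(payload, indent=2))

# To actually send it (requires a running server):
# req = urllib.request.Request(
#     BASE_URL + "/v1/chat/completions",
#     data=json.dumps(payload).encode(),
#     headers={"Content-Type": "application/json"},
# )
# with urllib.request.urlopen(req) as resp:
#     print(json.load(resp))
```

Because the request shape follows the OpenAI format, existing OpenAI client libraries can usually be pointed at the local base URL unchanged.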
For parallel inference with the same prompt, you can use examples/batched. For different prompts, the --cont-batching option in examples/server may help, though we do not recommend it yet because testing has shown significant errors. We have recently optimized operations for batch sizes greater than 1, falling back to dense computation at very large batch sizes, and we expect throughput to be at least on par with dense models across a range of conditions.
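To illustrate the concurrent-requests pattern that continuous batching is meant to serve, here is a minimal client-side fan-out sketch. The `send` callable is a placeholder for whatever actually submits a prompt (e.g. an HTTP POST to a server started with --cont-batching); nothing here is a confirmed PowerInfer API.

```python
from concurrent.futures import ThreadPoolExecutor

def fan_out(prompts, send, max_workers=4):
    """Send several prompts concurrently and return replies in input order.

    `send` is any callable taking a prompt string and returning a reply;
    in practice it would issue a request to the serving endpoint, and the
    server's batching layer would coalesce the in-flight requests.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of `prompts` in its results.
        return list(pool.map(send, prompts))

# Example with a stand-in sender (replace with a real request function):
replies = fan_out(["a", "b", "c"], lambda p: p.upper())
print(replies)  # ['A', 'B', 'C']
```

The point of the sketch is only that concurrency lives on the client side; whether the server coalesces those requests efficiently is exactly what --cont-batching governs.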
We have released the adapted Mistral-7B model, named Bamboo. Restoring model performance required retraining and further pretraining on approximately 200B tokens, which would be infeasible for an individual.
Yes, we are working on adapting both more models and more platforms; for details, please see our Kanban. As model availability and the inference framework's level of support mature, we will actively build out and participate in the upstream and downstream ecosystem, with PowerInfer as the starting point.
1. Is there a plan to provide OpenAI-API-style compatibility?
2. Can it handle concurrent requests, or is there a plan to support concurrency?
3. How long does it take to train the adapted Mistral-7B model? Would it be difficult or costly for an individual to train such an adapted model?
4. Are there plans to grow the ecosystem around this inference framework?