Speed up index construction by converting vocabulary types while loading the model #768
Labels
optimization
structured generation
Because we use Numba to compile the index, we need to convert the vocabulary types, which takes a non-negligible amount of time every time the script is run. A simple way to work around this is to execute the conversion in a separate thread while the model is being loaded. We may also be able to make Numba cache the JIT-compiled functions (e.g. via `cache=True`) by compiling the index for a trivial regex at import time.
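A minimal sketch of the threading idea, assuming hypothetical `convert_vocabulary` and `load_model` callables (the actual function names in the codebase will differ) — the conversion runs in a worker thread while the main thread loads the model, and we only block on the result once both are needed:

```python
from concurrent.futures import ThreadPoolExecutor


def convert_vocabulary(vocabulary):
    # Placeholder for the real type conversion Numba requires;
    # here we just normalize tokens to str and ids to int.
    return {str(token): int(token_id) for token, token_id in vocabulary.items()}


def load_model_and_vocabulary(load_model, vocabulary):
    with ThreadPoolExecutor(max_workers=1) as executor:
        # Kick off the vocabulary conversion in the background...
        future = executor.submit(convert_vocabulary, vocabulary)
        # ...while the (typically I/O-bound) model load runs on the main thread.
        model = load_model()
        converted = future.result()  # block only once the conversion is needed
    return model, converted
```

Since model loading is largely I/O-bound, the GIL should not prevent the two tasks from overlapping meaningfully.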