[NEW] Valkey Modules Bundling #408
Comments
Does option 1 require a change to the module build process? We currently build each module into its own .so file. Does option 2 affect only the distribution, or does it imply that all modules distributed along with the core will be loaded automatically?
I think there's an option #4 too, which is an extension to #3: Valkey as it is now, in a container, with some blessed group of modules installed, and suitable valkey.conf settings to ensure those modules are automatically loaded at startup. That would be one definition of "bundling" anyway, one that doesn't require any new code, makefile or other linkage changes to Valkey itself. This option provides the benefits of all three of the listed options, I think...? And it's what all the cloud vendors need (a container), and it can form the basis of future work on a Valkey Kubernetes operator.
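As a rough illustration of the "suitable valkey.conf settings" mentioned above, a container image could ship a config along these lines. This is only a sketch: `loadmodule` is the existing config directive for loading a module at startup, but the module paths and file names below are hypothetical and depend on how the image is built.

```
# valkey.conf inside the container image
# (paths and module file names are assumptions for illustration)
loadmodule /usr/lib/valkey/modules/json.so
loadmodule /usr/lib/valkey/modules/bloom.so
```

A user who wants a leaner setup could mount their own valkey.conf with fewer `loadmodule` lines, which keeps the choice of modules in the user's hands.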
IMO, I do not like option 2, because a module is similar to a plugin. As more and more modules appear, the binary file could become very big. We should allow clients to choose which module(s) they want to use. There are thousands of plugins for Eclipse and VS Code. I prefer option 3. Option 1 is a little bit weird.
Definitely don't like 2. We would like to have a lean distribution of Valkey. I'm also leaning towards 3 over 1.
It would, yeah. We would probably have a separate build repository that builds everything together with special flags. I will say it makes it much more difficult for end users to build, as you'll probably need to check out all the sub-modules and build them all individually. We've basically defined three distribution channels: direct downloads, containers, and Linux distros. I think this approach only really helps the first and third since you have everything built, but I have a suspicion that both 1 and 3 would be okay with downloading the modules separately and having to do a little assembly on their end.
I think this makes a lot of sense. This is the easiest way to "download" and try out the functionality.
Option 3 is a subset of option 1; the only issue I see with it is that the admin needs to perform some additional management operations to vet/load the module(s) on the node(s).
How much risk is there that putting multiple modules written by various authors into a single binary will cause identifier collisions leading to undefined behavior? Hmmmm, this is making me think about the difference between the valkey binary and the shared libraries. Won't options 1 and 2 require code changes to allow non-loading of the modules that the user doesn't want? How would we even do that, given that the existing code searches the library for a function named XXXModule_OnLoad, and calls it for each module? What am I missing?
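One conceivable answer to the "how would we even do that" question, sketched in C. This is not how Valkey works today and nothing in it is an existing Valkey API. It assumes each statically linked module gets a uniquely named entry point (here the hypothetical `json_onload` and `bloom_onload`) instead of the shared `XXXModule_OnLoad` symbol, and that the core keeps a table of bundled modules and only calls the entry points the user has enabled.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical entry-point type: returns 0 on success. */
typedef int (*bundled_onload_fn)(void);

typedef struct {
    const char *name;          /* name the user would put in the config */
    bundled_onload_fn onload;  /* uniquely named entry point */
    int loaded;                /* 1 once onload has succeeded */
} bundled_module;

/* Stand-ins for real module entry points. In a real bundle these would
 * live in separate source files; unique names avoid duplicate symbols. */
static int json_onload(void)  { return 0; }
static int bloom_onload(void) { return 0; }

static bundled_module bundled_modules[] = {
    {"json",  json_onload,  0},
    {"bloom", bloom_onload, 0},
};

#define BUNDLED_COUNT (sizeof(bundled_modules) / sizeof(bundled_modules[0]))

/* Call the entry point of every bundled module named in `enabled`.
 * Modules that are not listed are never initialized.
 * Returns the number of modules newly loaded by this call. */
int load_enabled_modules(const char *const *enabled, size_t n_enabled) {
    int count = 0;
    for (size_t i = 0; i < BUNDLED_COUNT; i++) {
        bundled_module *m = &bundled_modules[i];
        for (size_t j = 0; j < n_enabled; j++) {
            if (strcmp(m->name, enabled[j]) == 0 && !m->loaded &&
                m->onload() == 0) {
                m->loaded = 1;
                count++;
            }
        }
    }
    return count;
}

/* Report whether a bundled module has been initialized. */
int is_loaded(const char *name) {
    for (size_t i = 0; i < BUNDLED_COUNT; i++) {
        if (strcmp(bundled_modules[i].name, name) == 0)
            return bundled_modules[i].loaded;
    }
    return 0;
}
```

With this shape, calling `load_enabled_modules` with only `"bloom"` in the list initializes Bloom and leaves JSON untouched. It handles the entry-point collision only; other global identifiers shared between modules would still need separate treatment (for example, symbol visibility controls at link time).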
For iterative features, I think this makes sense (Option 3). But if we ever wanted to take functionality out of the core and put it into a module - it would probably break a lot of users who now would need to do If we did go with Option 1, I think we should be selective about what gets automatically bundled, it should only be widely-used production-ready modules.
I'd like to avoid bundling modules, so option 3 (or maybe even looser). I don't like option 1. It picks winners. Say I create a module that uppercases a string (silly example). It gets accepted in Valkey Plus. Then someone else comes along and creates a better uppercase module with a different API. Then the TSC has to make a gross decision: have two incompatible uppercase modules (yuck), drop one for the other (breakage), or let the better module wallow in disuse because it's not in Valkey Plus (yuck). I don't like option 2, and it makes me wonder what Valkey even is. It has all the problems of option 1 AND why even have a core and modules then? To a user, they'll see commands and not differentiate between core commands and module commands, meaning the project will probably need to treat them the same way for versioning, maintenance, etc. Frankly, I'd love to keep modules outside of Valkey. Maybe create a registry and an easy way to install modules from the registry. Any way you cut it, the user should be in control and the bundling should minimize situations where the bundle picks winners.
This seems like an important observation. There are many references to moving core features into modules. A very significant recent suggestion is to move the consensus algorithm into a module in cluster V2. Presumably this really means creating module-API functions that allow the default implementation of a feature to be overridden by means of the new module-API functions. I'd very much like to hear other people's thoughts about what "move it into a module" means.
@murphyjacob4 that's the idea with option 1. I believe the TSC/maintainers are the best set of folks to make the decision on behalf of the users and reduce some of the administrator pain. And if in the future there is a better alternative (spec/performance/feature/memory usage), a drop-in replacement can be performed without impacting the users.
I am also for option 3. I like the idea of official modules (and user-developed modules). But I think it's better to have separate releases for Valkey core vs. each module. Each module will have its own feature set, engineers, and hence timelines. Eventually, if a module becomes super popular with everyone using it, then we could move that code into the Valkey code, but then it's no longer a module but part of core.
I think we are a bit off topic, since the proposal here is about the "true" modules, for lack of a better word, such as JSON/Bloom filter/etc. I am partially guilty of this, given that I used "modules" a lot in the cluster V2 discussion. @daniel-house, agreed, the "modularization" idea that we were discussing in the cluster V2 thread needs further clarification/deep-dive. At least from my end, I have been using "modules" in a very loose way in the context of the cluster V2 discussion. Let me expand my thoughts a bit more to help distinguish the two types of "modules" so we can focus back on the "true" modules on this thread and continue the "modularization" discussion separately. First of all, I am fully aware of the operational convenience of only having to deal with a single binary today. I'm not saying we should never break away from it, but I think there is an extremely high bar that should be met before we start introducing a collection of binaries. On the other hand, I don't like that there is no clear layering nor strict contracts between the logical components in the engine, such as clustering, persistence, and replication. The recent refactoring of cluster.c is not helping either, IMO. There needs to be a clean contract/mechanism that allows us to abstract away the low-level implementation from the rest of the engine. This mechanism could be based on the existing module APIs or a further extension of them, or it could be something totally different. The keyword IMO is "abstraction", and this is what I had in mind when I wrote "modules" or "modularization" in the cluster V2 thread. And to be clear, I am not advocating we create separate binaries for cluster (v1 or v2). There should still be one
Valkey supports dynamic loading of modules, which expands its capabilities by allowing users to add functionality beyond the core data structures at runtime. This feature enables users to enhance the core engine with custom modules developed independently. Bundling popular modules such as Bloom filters, JSON, Search, Timeseries, etc., along with the core engine enables adoption of Valkey for users seeking these features and also simplifies transitioning from Redis to Valkey.
Valkey can pursue one of the following options regarding bundling of Modules for each release version:
Personally, I prefer option 1, as it gives users the flexibility to choose modules according to their needs and avoids the complexity of loading modules separately if they need any of the module-based features.
Ref: #407