Improve performance and authoring experience of `mkdocs serve` (#3695)
Are there public examples of large repositories that take up to 30 minutes to build? I tried locally with 10K dummy files and ran out of memory before the site was built 😅 With 1K files, the template rendering seems to be the most costly part.
Users have mentioned this on multiple occasions, but I'm having a hard time finding it due to GitHub's rather mediocre issue search. Here's what I could gather from a quick search:

The fact is that 30 minutes is a worst-case scenario. Even a repeated build that takes 1 minute is too slow to be useful. Running out of memory is another problem that should be fixed, as already discussed in #2669.
This is a project with 3,400 files and a very limited set of plugins (i.e., IMHO, not many plugins), and the social plugin, which I wrote, employs caching, which means repeated builds are much cheaper thanks to cached images. I've built the project on my machine, an M2 MacBook Pro:

First build

Repeated build (social plugin cached)

It's infeasible to make edits on this project without
I tested the repository mentioned above on my Ryzen 3600 Windows 10 PC, with mkdocs==1.6.0 and mkdocs-material==9.5.20.

Repeated build: I used my performance_debug hook. PLUGINS_PER_EVENTS:

```
on_post_page|mkdocs_minify_plugin.plugin.MinifyPlugin: 958.97267  # The main culprit of the long build time
on_page_context|material.plugins.search.plugin.SearchPlugin: 23.14142  # Expected given the amount of files
on_config|material.plugins.social.plugin.SocialPlugin: 10.61701  # on_config shouldn't be affected by the amount of files; is it always this slow?
on_page_markdown|material.plugins.social.plugin.SocialPlugin: 1.20333  # magic of concurrency
...
on_post_build|material.plugins.social.plugin.SocialPlugin: 0.00389  # magic of concurrency
```

Currently, the MARKDOWN_PER_CLASSES:

```
pymdownx.superfences.SuperFencesBlockPreprocessor: 11.09541
markdown.treeprocessors.InlineProcessor: 10.98559
```

I'm surprised those Markdown values are so low, as last time I checked with GMC (~190 files) the same classes took ~6 seconds each. Perhaps the complexity of the Markdown or the number of code blocks has a bigger impact than I thought. Still, 3k files vs. 200 files with only a 2x time increase seems odd, hmm.

Template rendering took ~270 seconds:

```
TEMPLATE_ROOTS:
main.html|sum: 267.45968
```

This time gets repeated each re-serve without caching for later builds, so I would like to see some sort of on-demand loading; I guess this would require a fork in
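The per-event figures above come from wrapping plugin events with a timer. A minimal sketch of such a timing hook, with illustrative event and plugin names (the real performance_debug hook is more elaborate):

```python
import time
from collections import defaultdict

# Cumulative wall-clock time per "event|plugin" key, mirroring the
# PLUGINS_PER_EVENTS report above.
TIMINGS = defaultdict(float)

def timed(key):
    """Wrap an event callback so its cumulative runtime is recorded."""
    def decorator(fn):
        def wrapper(*args, **kwargs):
            start = time.perf_counter()
            try:
                return fn(*args, **kwargs)
            finally:
                TIMINGS[key] += time.perf_counter() - start
        return wrapper
    return decorator

# Illustrative stand-in for a real plugin event.
@timed("on_page_markdown|example.Plugin")
def on_page_markdown(markdown):
    return markdown.upper()

for _ in range(3):
    on_page_markdown("hello")

# Report, most expensive first.
for key, total in sorted(TIMINGS.items(), key=lambda kv: -kv[1]):
    print(f"{key}: {total:.5f}")
```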
Like this one https://github.com/monosans/mkdocs-minify-html-plugin? Could you build once with it and see whether you just spared 950 seconds or so 😛? Apparently it only minifies HTML files (but still the CSS and JS within them).
Also, solid work @kamilkrzyskow 👍 Thanks for making and sharing all this!
Ah, nice, I didn't know about the minify-html plugin! I'll check it out and probably switch to it. Offloading pure string processing to Rust makes a lot of sense.
The issue is that the site navigation requires the entire pages collection to be available for any one page to be rendered. This is where caching and/or concurrency would likely be helpful. For that matter, the pages don't all need to be fully rendered, but they all do need to be read and processed to a certain extent to determine the page title, etc., for the nav.

And then there are those scenarios where a page's content consists of the pages collection (either by means of a plugin or as a static template). In that case, to render that page (even if the nav is excluded), the entire pages collection is needed.

Ultimately, it has been the above two issues which have thus far prevented a better solution from being developed. Work out a way to address those, and then we may have a workable solution.
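To illustrate the "read and processed to a certain extent" point: determining a page title for the nav only needs a cheap pre-pass, not a full render. A sketch under that assumption (MkDocs' real title resolution also considers page meta-data and the nav config):

```python
import pathlib
import re

# First level-one ATX heading in the file, if any.
TITLE_RE = re.compile(r"^#\s+(.+?)\s*$", re.MULTILINE)

def page_title(path):
    """Cheap title extraction: first H1 in the source, else the file stem."""
    path = pathlib.Path(path)
    match = TITLE_RE.search(path.read_text(encoding="utf-8"))
    return match.group(1) if match else path.stem
```

A nav built from such a pre-pass could then defer full rendering of each page until it is actually requested.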
Quick thought: what if plugins informed MkDocs whether each of their hooks could be executed concurrently, or only sequentially? I'm imagining some utilities to build a "pipeline" of things to run depending on whether they support concurrency or not. Quick flowchart which doesn't quite make sense but illustrates the idea:

```mermaid
flowchart TD
    p1f["plugin1.on_files"]
    p2f["plugin2.on_files"]
    p3f["plugin3.on_files"]
    p1n["plugin1.on_nav"]
    p2n["plugin2.on_nav"]
    p3n["plugin3.on_nav"]
    p1pm["plugin1.on_page_markdown"]
    p2pm["plugin2.on_page_markdown"]
    p3pm["plugin3.on_page_markdown"]
    start --> p1f & p2f
    p1f & p2f --> p3f
    p3f --> p1n & p2n & p3n
    p1n & p2n & p3n --> p1pm
    p1pm --> p2pm & p3pm
```
EDIT: hmm, I suppose there's another possible layer of concurrency on files/pages themselves. The transformation pipeline would likely be quite complex. I'm sick and have a fever today, so please be indulgent 😂
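A minimal sketch of that batching idea, assuming a hypothetical per-hook concurrency-safe flag (nothing like this exists in the current MkDocs plugin API): adjacent safe hooks run in a thread pool, unsafe ones run alone, and order is otherwise preserved.

```python
from concurrent.futures import ThreadPoolExecutor

def run_event(hooks):
    """Run (callable, concurrency_safe) pairs in order, batching
    adjacent concurrency-safe hooks into one parallel step."""
    results = []
    batch = []

    def flush():
        if batch:
            with ThreadPoolExecutor() as pool:
                # map() preserves input order in its results
                results.extend(pool.map(lambda fn: fn(), batch))
            batch.clear()

    for fn, safe in hooks:
        if safe:
            batch.append(fn)
        else:
            flush()  # everything queued so far must finish first
            results.append(fn())
    flush()
    return results
```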
This is exactly what Sphinx does. Each extension defines whether it's safe for parallel reading and/or parallel writing. See https://www.sphinx-doc.org/en/master/extdev/index.html#extension-metadata — I haven't checked how it works internally, but it's probably something to explore a little more to see if there are some ideas that can be reused.
I would like to be able to use parallel build. It has been stated in #1900 that the benefit is not so high. However, I have lots of jupyter-notebooks to convert (the execute step consumes most of the time). I ended up executing all notebooks concurrently in advance. |
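Pre-executing notebooks concurrently can be done entirely outside MkDocs; a sketch that shells out to the standard `jupyter nbconvert` CLI (the worker count and the injectable `worker` parameter are illustrative):

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor

def execute_notebook(path):
    """Execute one notebook in place via nbconvert (assumes jupyter on PATH)."""
    subprocess.run(
        ["jupyter", "nbconvert", "--to", "notebook", "--execute",
         "--inplace", str(path)],
        check=True,
    )

def execute_all(paths, worker=execute_notebook, max_workers=4):
    """Run the expensive execute step for many notebooks concurrently.
    Threads suffice because nbconvert spawns its own kernel process."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() forces completion and re-raises any worker exception
        list(pool.map(worker, paths))
```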
Okay, so I've been working on this, and I've got enough to demo now: https://github.com/mkdocs/sketch/tree/main

That's a work-in-progress of "how could MkDocs look" that properly deals with this issue. Specifically, the

I needed to do a bit of poking to make this work with the terraform example above (since it doesn't include a

There are other aspects that I'm looking to address as part of that work; I'm just getting things into shape so that I've got a coherent body of work to start sharing here.

* search indexes aren't in there just yet. Yes, they would require a full-site build, but we can use HTML
Nice work @tomchristie!
In the case of mkdocstrings and its cross-references capability,
This looks really promising! Really excited to see how this will work with more complex setups. I guess there are still things to be worked out (I haven't checked the implementation), but it's a great start! 👏
> [!NOTE]
> As the maintainer of Material for MkDocs, I'd like to open a discussion on how we can collaborate to enhance MkDocs. This initiative is inspired by Tom Christie's recent reflections on the future development of MkDocs. I believe that through collective efforts, we can identify and implement improvements that will benefit our users significantly.
The `mkdocs serve` command provides a powerful write-build-check-repeat loop that is integral to documentation projects, setting MkDocs apart from many static site generators that lack live preview functionality. This feature greatly enhances the efficiency and accuracy of developing and refining documentation, allowing for immediate feedback and iterative improvement.

## Startup time of `mkdocs serve`
Unfortunately, there are significant issues with the `mkdocs serve` command, particularly when working with large documentation projects that consist of thousands of pages. Currently, `mkdocs serve` requires a full build of the documentation before it becomes interactive. This process can take an extensive amount of time, ranging from 30 to 40 minutes for large projects. This delay significantly impedes the ability to use `mkdocs serve` effectively for previewing changes.

The need for a preview is crucial, especially given that Material for MkDocs integrates with the Python Markdown Extensions, a powerful set of Markdown extensions geared toward technical writing, adding features like content tabs via Tabbed and enhanced indent detection through SuperFences. Unfortunately, editor support for these syntaxes is limited, if not non-existent. This lack of support means that authors must rely on `mkdocs serve` to preview changes. Given the current build times on large projects, authors face considerable difficulty in efficiently making and reviewing changes, essentially working 'blind' without this functionality. Performance is in fact one of the major critiques of MkDocs.

## Problems with the `--dirtyreload` flag

The `--dirtyreload` flag in MkDocs offers a partial solution to speed up the rebuild process during a documentation project's development by not rebuilding the entire site with each change. However, this flag only affects subsequent builds and does not improve the initial build time. Moreover, it introduces issues such as incorrect navigation and incomplete metadata, which can disrupt the functionality of plugins, like the blog plugin, which struggles to correctly update archive and category indexes under `--dirtyreload`. Consequently, plugins must be designed to specifically work around these limitations, complicating their development and integration.

## Conclusion
To significantly enhance the editing experience with MkDocs and reduce the environmental impact by saving thousands of build minutes daily, we need to focus on two critical improvements:
1. **Reducing the initial preview load time:** The time it takes from starting the live server to when the preview is first available needs to be substantially decreased. This change would make MkDocs more usable, especially for large projects.
2. **Speeding up live preview updates:** After making edits, the time to see these changes in the preview should be minimized. This improvement will support a more efficient and iterative documentation process.
Potential strategies to achieve these improvements include implementing more sophisticated caching mechanisms and exploring the possibility of parallelizing the build process. These changes would address both the initial and subsequent build times, making `mkdocs serve` a more robust tool for documentation development.