Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat(treesitter): add ability to reload parser #28715

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

DanilaMihailov
Copy link

@DanilaMihailov DanilaMihailov commented May 12, 2024

Problem:
There is no way to reload current parser for buffer. This is needed when
vim.opt.rtp changes and new paths have treesitter queries. For example
when loading plugins lazily.

Solution:
Add options reload to vim.treesitter.get_parser(), that will
remove previous parser and create new one, effectively loading new
queries from runtimepath

@clason
Copy link
Member

clason commented May 12, 2024

Do we really need a new method for this? Can't we instead add a reload or force parameter (name TBD) to vim.treesitter.query.get()?

(Also needs tests.)

@DanilaMihailov
Copy link
Author

Do we really need a new method for this? Can't we instead add a reload or force parameter (name TBD) to vim.treesitter.query.get()?

(Also needs tests.)

Not sure about vim.treesitter.query.get(). The problem is I want to update LanguageTree.__injection_query field in parser, but it is a private field. Maybe vim.treesitter.get_parser(bufnr, lang, {force = true})? That would remove current parser and create new one with vim.treesitter._create_parser() and effectively reload _injection_query.

I would love to add tests, but is there an example how to test function that depends on vim.opt.rtp changes? As I understand I would need to create parser, than add new paths to vim.opt.rtp and then reload quieries. Should I store queries somewhere or just create files on the fly and then remove?

@DanilaMihailov

This comment was marked as outdated.

@clason
Copy link
Member

clason commented May 12, 2024

Maybe vim.treesitter.get_parser(bufnr, lang, {force = true})? That would remove current parser and create new one with vim.treesitter._create_parser() and effectively reload _injection_query.

Yes, that is exactly what I was thinking of. What I don't like about it is that it's specific to injections, but you (sometimes) want to reload the main queries. Even if that is the actual effect, the API is misleading then, hurting discoverability.

I would love to add tests, but is there an example how to test function that depends on vim.opt.rtp changes? As I understand I would need to create parser, than add new paths to vim.opt.rtp and then reload quieries. Should I store queries somewhere or just create files on the fly and then remove?

Neovim already ships with queries in runtimepath, so, yes, creating a temporary file and adding that directory to runtimepath could work. The best thing would be to look through the test/functional/treesitter tests and see if there is one that loads queries from a file, then adapt it.

@DanilaMihailov
Copy link
Author

Yes, that is exactly what I was thinking of. What I don't like about it is that it's specific to injections, but you (sometimes) want to reload the main queries. Even if that is the actual effect, the API is misleading then, hurting discoverability.

Yeah, more general "reload parser" is what I was looking for initially. So I would do it this way. But there is a lot happening with callbacks, that I don't fully grasp. Do I need to do any specific clean up for this:

  api.nvim_buf_attach(
    source,
    false,
    { on_bytes = bytes_cb, on_detach = detach_cb, on_reload = reload_cb, preview = true }
  )

@clason
Copy link
Member

clason commented May 12, 2024

I don't think so; tests will show.

Problem:
There is no way to reload current parser for buffer. This is needed when
`vim.opt.rtp` changes and new paths have treesitter queries. For example
when loading plugins lazily.

Solution:
Add options `reload` to `vim.treesitter.create_parser()`, that will
remove previous parser and create new one, effectively loading new
queries from `runtimepath`
@DanilaMihailov DanilaMihailov marked this pull request as draft May 12, 2024 10:20
@DanilaMihailov DanilaMihailov changed the title feat(treesitter): add method to reload queries feat(treesitter): add ability to reload parser May 12, 2024
Comment on lines +112 to +118
parsers[bufnr]:destroy()
parsers[bufnr] = M._create_parser(bufnr, lang, opts)

if M.highlighter.active[bufnr] then
M.highlighter.active[bufnr]:destroy()
M.highlighter.new(parsers[bufnr])
end
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why is this different than LanguageTree:invalidate() ? in fact we have something called _on_reload that currently calls invalidate():

function LanguageTree:_on_reload()
self:invalidate(true)
end

Would be very helpful to determine which semantics are correct and tighten things up

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalidate() does not reload LanguageTree._inject_query which is the reason I want to reload the parser. If there is different way it should be done, I am happy to do it.

What happens there is basically what is going to happen if :edit executed. When :edit executed fires event "detach" for buffer, which triggers detach_cb and this deletes parser. Then vim.treesitter.start() called by ft plugin and creates new parser.

local function detach_cb(_, ...)
if parsers[bufnr] == self then
parsers[bufnr] = nil
end
self:_on_detach(...)
end

Right now my goal is to make parser read queries from filesystem again, because I change vim.opt.rtp where I add more queries from plugin.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking, maybe a better idea would be refreshing queries in invalidate? Something like this

function LanguageTree:invalidate(reload, refresh_queries)
  self._valid = false

  if refresh_queries then
    if not self._opts.injections then
      query._reset_cache(self:lang(), 'injections')
      self._injection_query = query.get(self:lang(), 'injections')
    end
    self:_do_callback('changedquery', self:lang())
  end
    -- ...

and then highlighter.lua would register callback

 -- ...
    on_changedquery = function(lang)
      -- just remove queires, they are loaded later on demand
      query._reset_cache(lang, 'highlights')
      self._queries[lang] = nil
    end,

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalidate is only supposed to be called when the tracked state of the Languagetree is not valid against the parse tree in tree-sitter, and is called fairly often. I expect clearing file system caches will kill performance.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

invalidate is only supposed to be called when the tracked state of the Languagetree is not valid against the parse tree in tree-sitter, and is called fairly often

I attempted to mention this in the docstring of invalidate() #28747

@DanilaMihailov DanilaMihailov marked this pull request as ready for review May 13, 2024 05:35
@lewis6991
Copy link
Member

There is no way to reload current parser for buffer. This is needed when
vim.opt.rtp changes and new paths have treesitter queries. For example
when loading plugins lazily.

Which plugins are we talking about? Is the lazy loading being done meaningfully or just blindly? And why can't the plugin be improved so it doesn't need to be lazy loaded (assuming it does).

It's still up in the air whether we want to support rtp changes, since there is so much state that needs to be reset and will get more difficult if we want to add more caches. I'm not sure if the added complexity is worth it for the 0.2ms startup time you get from avoiding a specific plugins plugin/ dir.

@DanilaMihailov
Copy link
Author

Which plugins are we talking about? Is the lazy loading being done meaningfully or just blindly? And why can't the plugin be improved so it doesn't need to be lazy loaded (assuming it does).

This one https://github.com/S1M0N38/love2d.nvim , I actually added the queries there and thats when I hit the problem with rtp changes.

It's still up in the air whether we want to support rtp changes, since there is so much state that needs to be reset and will get more difficult if we want to add more caches. I'm not sure if the added complexity is worth it for the 0.2ms startup time you get from avoiding a specific plugins plugin/ dir.

In my case I load this plugin only in specific projects, so it is not about speed, but more about per project configuration.

There is similar idea in another PR #27392 that may solve my problem.

@lewis6991
Copy link
Member

love2d.nvim creates two user commands: https://github.com/S1M0N38/love2d.nvim/blob/main/plugin/love2d.lua

Is having this enabled only for specific projects really gaining you anything?

@DanilaMihailov
Copy link
Author

DanilaMihailov commented May 14, 2024

love2d.nvim creates two user commands: https://github.com/S1M0N38/love2d.nvim/blob/main/plugin/love2d.lua

Is having this enabled only for specific projects really gaining you anything?

It adds love2d library to lua_ls path as well. But sure, this plugin can be loaded eagerly without a problem. I will have some irrelevant functions in autocomplete, but it is not a big deal.

Before creating PR I tried to solve this with available apis, but had to use private field

local parser = vim.treesitter.get_parser()
parser._injection_query = vim.treesitter.query.get("lua", "injections")
parser:invalidate(true)
parser:parse(true)

Weirdly never hit cache with this call, so it worked, but using private field bothers me :)

@DanilaMihailov
Copy link
Author

It's still up in the air whether we want to support rtp changes, since there is so much state that needs to be reset and will get more difficult if we want to add more caches.

The whole parser already getting dropped and new one created when executing :edit command for example

local function detach_cb(_, ...)
if parsers[bufnr] == self then
parsers[bufnr] = nil
end
self:_on_detach(...)
end

elseif parsers[bufnr] == nil or parsers[bufnr]:lang() ~= lang then
assert(lang, 'lang should be valid')
parsers[bufnr] = M._create_parser(bufnr, lang, opts)
end

I am doing the same, when calling vim.treesitter.get_parser(..., {reload = true}), so no state from previous parser is kept or managed in any way. The only thing is I have to clear cache for queries, which does not seem like a big deal. But you saying there might be added more query files and then more cache is needed to be cleared?

@DanilaMihailov
Copy link
Author

If explicitly dropping query cache is a problem (or could become a problem in the future) how about just having an api to remove parser from a buffer?

function M.drop_parser(bufnr)
  parsers[bufnr]:destroy()
  parsers[bufnr] = nil
end

Or making parsers available outside of module? Like in vim.treesitter.highliger.active[bufnr]?

Or even adding parameter to vim.treesitter.stop(bufnr, {drop = true}) that will delete parser?

This will enable me to write something like this (probably in a loop for all lua buffers)

vim.treesitter.stop(bufnr, {drop = true})
collectgarbage() -- maybe clear query cache
vim.treesitter.start(bufnr, "lua") -- create new parser with fresh queries

@clason
Copy link
Member

clason commented May 15, 2024

Remember that the highlighter is not the only "module" that is using treesitter parsers (which are shared!) and queries.

@DanilaMihailov
Copy link
Author

Remember that the highlighter is not the only "module" that is using treesitter parsers (which are shared!) and queries.

Yeah, but they already have to handle buffer detaching

local function detach_cb(_, ...)
if parsers[bufnr] == self then
parsers[bufnr] = nil
end
self:_on_detach(...)
end

So calling self:_on_detach() should be enough to handle this.

function M.drop_parser(bufnr)
  if parsers[bufnr] then
    parsers[bufnr]:_on_detach()
    parsers[bufnr] = nil
  end
end

Regarding queries, query cache already drops if no one is referencing given query. So by removing parser this way I try to create situation where no one references queries/lua/injections.scm, so it will be dropped from cache.

@DanilaMihailov
Copy link
Author

@lewis6991 please, let me know if this is a good idea. This is the only needed change. Not touching query cache. Just an ability to optionally destroy a parser.

-- vim/treesitter.lua
function M.stop(bufnr, opts)
  opts = opts or {}
  -- ...
  if opts.destroy and parsers[bufnr] then
    parsers[bufnr]:_on_detach()
    parsers[bufnr]:destroy()
    parsers[bufnr] = nil
  end
  -- ...
end

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants