Copilot Labs features #79

Open
rockofox opened this issue Nov 13, 2022 · 17 comments

Comments

@rockofox
Contributor

I don't know how feasible or hard this would be to implement, but features like "Explain code" or "Translate code" could certainly be interesting.

https://githubnext.com/projects/copilot-labs/

@zbirenbaum
Owner

This likely won't be possible, at least not for a long time. I've looked into it before, and checked again just now to confirm that nothing has changed (it hasn't). There are a few issues:

  1. The Copilot Labs extension is not an LSP server. It is very strange, in that it appears to handle tons of NLP and connects to the normal Copilot extension, but the normal Copilot extension doesn't have handlers for something like "explain" or "translate" code (I checked the packed JS file for this).
  2. My best guess is that the Copilot Labs extension does some extensive text manipulation and parsing, and then passes the output to something like getCompletions with tuned parameters. The copilot-labs files contain a massive amount of code that just lists keywords, along with treesitter calls and language structure definitions, which supports that hypothesis. This is made even stranger by the fact that the OpenAI Codex models have a specific endpoint for this functionality, which I have actually written prototype handlers for, so I can't think of a reason they would be doing it this way.

The only way to possibly do this would be to use something like Wireshark or Fiddler to intercept the outbound requests and see what copilot-labs sends and to which endpoint. Then, by modifying the code you run it on, you could try to work out roughly what the transformation mechanism is, what the general parameters are, and whether they change. I don't currently have the time to do that, but if anyone would like to do some basic testing and post it here, I'd be happy to review the results. If it turns out my hypothesis above is incorrect and all the language processing and AST stuff in those files is pointless, it would be pretty easy to just send the raw code off to an endpoint like they do.

@MunifTanjim
Collaborator

Copilot Labs has a separate agent.js (just like copilot/dist/agent.js). If somebody can figure out the endpoints from that, it would be fairly easy to do the rest.

If somebody's willing to do it, you can download the extension archive from here: https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-labs

@zbirenbaum
Owner

Copilot Labs has a separate agent.js (just like copilot/dist/agent.js). If somebody can figure out the endpoints from that, it would be fairly easy to do the rest.

If somebody's willing to do it, you can download the extension archive from here: https://marketplace.visualstudio.com/items?itemName=GitHub.copilot-labs

Is that the old copilot-labs extension or the new one? It gets really confusing since they have a few of them. I got the most up to date one and there was no agent.js.

If the old one has it though, I can reverse engineer it pretty easily.

@MunifTanjim
Collaborator

Is that the old copilot-labs extension or the new one?

Ah dang, there are multiple? 😂 I had no idea. I only played around with it for a few hours a month ago. That's it.

Where do I get the latest one?

@MunifTanjim
Collaborator

MunifTanjim commented Nov 14, 2022

btw, it's not really agent.js. But it has an extension.js file. Guessing all the magic happens there.


@zbirenbaum
Owner

btw, it's not really agent.js. But it has an extension.js file. Guessing all the magic happens there.


Ahh yeah, I looked through the worker.js and extension.js, but as far as I could tell it doesn't have any endpoints we could use like agent.js does.

I just checked and that is the most recent one. Maybe I was thinking of Copilot Nightly, oops.

@zbirenbaum
Owner

zbirenbaum commented Nov 14, 2022

@MunifTanjim I thought about this a bit more, and one idea would be to try using the same endpoint that the Codex model uses for the Copilot API. The only trouble would be with authorizing our requests on the LSP side.

I don't have a ton of experience with OAuth stuff. Since you wrote the authentication code here, I was wondering if you knew how hard it might be to do that?

The added benefit of getting our own auth server working would be that I could write an LSP server to replace agent.js and open source it. I'm familiar enough with the key parts of the Copilot agent and with writing LSP servers that I'm pretty confident I could do it, though it wouldn't be a quick project to implement. We would likely need to check with Microsoft/GitHub before doing any of that though, since it would be a big endeavor and I wouldn't want it to get shut down after putting a bunch of work into it. Unfortunately, I'm not entirely sure who to contact.

@zbirenbaum
Owner

zbirenbaum commented Nov 16, 2022

If anyone coming across this would like to see such features implemented, please upvote this discussion post so Microsoft/GitHub won't just ignore it like they did every past discussion on the topic: community/community#39278

@MunifTanjim
Collaborator

I don't have a ton of experience with OAuth stuff. Since you wrote the authentication code here, I was wondering if you knew how hard it might be to do that?

I saw a bunch of auth code in that extension.js file. With some effort, it would be possible to do it. But I guess it depends on whether MS would allow using Copilot Labs with Neovim or not. It would be a shame if a bunch of effort went into it only to get shut down.

@zbirenbaum
Owner

zbirenbaum commented Nov 16, 2022

I don't have a ton of experience with OAuth stuff. Since you wrote the authentication code here, I was wondering if you knew how hard it might be to do that?

I saw a bunch of auth code in that extension.js file. With some effort, it would be possible to do it. But I guess it depends on whether MS would allow using Copilot Labs with Neovim or not. It would be a shame if a bunch of effort went into it only to get shut down.

I asked specifically about it getting shut down in the discussion here: https://github.com/orgs/community/discussions/39278. Every discussion post on a similar topic has been ignored, so I'm hoping that some people here will upvote it so it gains some traction.

@zbirenbaum
Owner

zbirenbaum commented Nov 16, 2022

Well, I got a response on that discussion post, but they only addressed "Will you offer a public endpoint", to which the answer was "No", and ignored "Will it get shut down if I implement an endpoint", then closed and locked the thread. I unmarked it as an answer and edited my question, but I doubt the devs would really be making that call or would know.

My bet is that they probably wouldn't shut it down directly. They may just add some annoying checks to try to prevent it. Typically that's how Microsoft has gone about stuff like this in the past, like with Pylance. Unlike Pylance, the Copilot license (at least currently) is entirely different, and there is nothing in it that would outright imply that making our own endpoint wouldn't be acceptable. The only reason it's a question is that, for some unknown reason, they don't want or plan to supply their own API. @MunifTanjim

The Copilot agent.js license actually explicitly states that redistribution in its original or modified form is allowed. The general Copilot license is under Creative Commons, which also allows modification and distribution.

Honestly, I don't think they would go through the trouble of taking it down as long as their license stays as is. I can't imagine that ignoring their own license would go over well in the open source community.

@sirupsen

FWIW, I've been using GPT-3 directly via the API for features like this (refactor, explain, ...) and it works pretty well. Whenever GPT-4 comes out, I bet it's going to be incredible at this.

@zbirenbaum
Owner

zbirenbaum commented Nov 20, 2022

FWIW, I've been using GPT-3 directly via the API for features like this (refactor, explain, ...) and it works pretty well. Whenever GPT-4 comes out, I bet it's going to be incredible at this.

While GPT-3 Codex didn't work for completions because of rate limiting, refactor and explain are called much less frequently, so it might be viable as a separate plugin. I'll dig out my old code for a GPT-3 LSP server, remove some of the hard coding, and upload it at some point tomorrow.

It's pretty bare-bones, but it connects to Neovim perfectly fine, so it may be a good starting point if there is interest in pursuing that route.

@zbirenbaum
Owner

zbirenbaum commented Nov 28, 2022

FWIW, I've been using GPT-3 directly via the API for features like this (refactor, explain, ...) and it works pretty well. Whenever GPT-4 comes out, I bet it's going to be incredible at this.

Just wanted to provide an update: I haven't forgotten about this. While trying a million different things to fix the rate limiting issues, I seem to have left the code in a very messy state that I would be embarrassed to upload.

I haven't gotten around to cleaning it up yet because I got put in charge of finishing a few new things and deploying them to production at my job (which was already quite busy), and then the holidays came up.

I will get around to this, and I'm truly sorry for the delay.

@Pytness

Pytness commented Feb 2, 2023

The only way of maybe doing this would be to use something like wireshark or fiddler and intercepting the outbound requests to try and see what it is copilot-labs sends off

There is a proxy option in VS Code:

"github.copilot-labs.advanced": {
    "debug.overrideProxyUrl": "http://localhost:8080"
},

Here is how it works.

Take this code:

void clear()
    {
        screen->clearDisplay();
    }

The selected code is simply inserted between START_CODE\n and END_CODE\n, followed by a phrase that describes the requested modification to the code.
The READABLE button uses the phrase "Make this code easier to read, including by adding comments, renaming variables, and/or reorganizing the code."
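
The prompt layout described above can be sketched in Python as follows. This is only an illustration reconstructed from the captured request shown in this comment; the names build_prompt, START_MARKER, END_MARKER and READABLE_INSTRUCTION are hypothetical and do not come from the extension itself.

```python
# Hypothetical names; the layout mirrors the captured request in this comment.
START_MARKER = "START_CODE\n"
END_MARKER = "END_CODE\n"

# Phrase observed for the READABLE button.
READABLE_INSTRUCTION = (
    "Make this code easier to read, including by adding comments, "
    "renaming variables, and/or reorganizing the code."
)


def build_prompt(selected_code: str, instruction: str) -> str:
    """Wrap the selection in markers, append the instruction, reopen a block."""
    # Make sure END_CODE starts on its own line.
    if not selected_code.endswith("\n"):
        selected_code += "\n"
    # The trailing START_CODE invites the model to write the rewritten code,
    # which it is expected to close with END_CODE (used as the stop sequence).
    return (
        f"{START_MARKER}{selected_code}{END_MARKER}"
        f"\n{instruction}\n\n{START_MARKER}"
    )
```

Calling build_prompt with the clear() snippet above and READABLE_INSTRUCTION should give the same string as the 'prompt' field in the request example.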

Request example:

POST /v1/engines/copilot-labs-codex/completions HTTP/1.1
{
    'prompt': 'START_CODE\nvoid clear()\n    {\n        screen->clearDisplay();\n    }\nEND_CODE\n\nMake this code easier to read, including by adding comments, renaming variables, and/or reorganizing the code.\n\nSTART_CODE\n',
    'suffix': '',
    'max_tokens': 1987,
    'temperature': 0.75,
    'top_p': 1,
    'n': 1,
    'stop': ['END_CODE'],
    'feature_flags': ['trim_to_block'],
    'stream': True,
    'extra': {'language': 'cpp', 'force_indent': -1}
}
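
The dump above looks like a Python repr of the decoded body (single quotes, True), so the body on the wire is presumably JSON. A minimal sketch of assembling an equivalent body, assuming that is the case; build_request_body is a hypothetical helper, the field values are copied from the captured request, and the host and authentication headers are not shown in this thread, so they are left out.

```python
import json


def build_request_body(prompt: str, language: str, max_tokens: int = 1987) -> str:
    """Serialize a completion request with the fields seen in the capture."""
    payload = {
        "prompt": prompt,
        "suffix": "",
        "max_tokens": max_tokens,
        "temperature": 0.75,
        "top_p": 1,
        "n": 1,
        # Generation stops once the model closes the rewritten block.
        "stop": ["END_CODE"],
        "feature_flags": ["trim_to_block"],
        "stream": True,
        "extra": {"language": language, "force_indent": -1},
    }
    return json.dumps(payload)
```

Whether the server accepts other values for these fields (for example a different temperature) is untested here.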

The response is in the form of:

data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': 'void', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': ' clear', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '()\n', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '{\n', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '   ', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
...
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '();\n', 'index': 0, 'finish_reason': None, 'logprobs': None}]}
data: {'id': 'cmpl-6fQ...', 'model': 'cushman-ml', 'created': 1675329559, 'choices': [{'text': '}\n', 'index': 0, 'finish_reason': 'stop', 'logprobs': None}]}

(^ I think you get the response all at once, so I don't know why this is not an array.)

After joining all the choices.text values:

void clear()
{
    // clears the display
    screen->clearDisplay();
}
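
The joining step can be sketched like this. Since the request sets 'stream': True, each data: line is probably one streamed chunk. The sketch assumes each chunk is JSON on the wire (the listing above looks like a Python repr from the proxy's logging) and that the stream may end with a data: [DONE] line, as OpenAI-style streams do; that terminator does not appear in the capture above. join_stream is a hypothetical name.

```python
import json
from typing import Iterable


def join_stream(lines: Iterable[str]) -> str:
    """Concatenate choices[].text from 'data: {...}' lines into one string."""
    parts = []
    for line in lines:
        line = line.strip()
        if not line.startswith("data:"):
            continue  # skip blank separators and anything that is not a chunk
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":  # assumed end-of-stream sentinel
            break
        event = json.loads(payload)
        for choice in event.get("choices", []):
            parts.append(choice.get("text", ""))
    return "".join(parts)
```

With n set to 1 there is only one choice per chunk; for n greater than 1 the texts would need to be grouped by each choice's index instead of concatenated directly.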

Hope this is helpful. Here is the proxy in Python
(you need to pip install rich).

@haukot

haukot commented Jul 1, 2023

I had some success implementing the base of a GitHub Copilot Labs plugin, but unfortunately I don't have enough time to bring it up to good quality.
You can see the POC here. It runs the Copilot Labs extension.js inside a wrapper with stubbed VS Code functions and provides a JSON-RPC interface. Hope it's useful!

@zbirenbaum
Owner

zbirenbaum commented Jul 20, 2023

I've got an LSP implementation fetching completions from the Copilot servers and sending the results back to Neovim. The results are currently missing information needed by copilot.lua and copilot-cmp, and some things, like the language of the file, are still hardcoded, but the prompt and suffix info in the completion request is properly synced to the document. It may be best in the long term to implement equivalents to what the Copilot Labs extension provides so that we aren't at the mercy of closed-source code for updates and functionality.

zbirenbaum/copilot-rs

I wrote it in Rust since I wanted some practice and Rust is really fast, but if people are heavily opposed to that, it's still early enough in development that I could redo it in TypeScript.
