GitHub Copilot

This is an analysis of the GitHub Copilot extension for Visual Studio Code.

On macOS, VS Code extensions are located in the following directory:

~/.vscode/extensions

This analysis covers version 1.92.177 of the extension.

For an analysis of Copilot Chat see README_COPILOT_CHAT.md.

Prompts

The GitHub Copilot extension generates three types of prompts.

Prompt 1: Single file

We start with the simplest case: a single file, file1.py.

filename: file1.py
file content: # Print hello, world

If the user presses enter after # Print hello, world, the extension generates the following prompt:

# Path: file1.py
# Print hello, world

The path to the file is part of the prompt.
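The single-file case can be reproduced with a few lines of Python. This is a minimal sketch, not the extension's actual code: the function name `build_single_file_prompt` is hypothetical, and the trailing newline in the example (from pressing enter) is an assumption about what the editor passes along.

```python
def build_single_file_prompt(relative_path: str, text_before_cursor: str) -> str:
    """Prepend a '# Path:' comment line to the text before the cursor.

    Hypothetical helper; the '#' comment marker assumes a Python file.
    """
    return f"# Path: {relative_path}\n{text_before_cursor}"


# The user has typed the comment and pressed enter (assumed trailing newline).
prompt = build_single_file_prompt("file1.py", "# Print hello, world\n")
print(prompt)
```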

Prompt 2: Multiple files

Now let's consider a slightly more complex case with two files, in which file2.py is being edited.

filename: file1.py
file content: # Print hello, world
filename: file2.py
file content: # Print he

In this case, the extension generates the following prompt:

# Path: file2.py
# Compare this snippet from file1.py:
# # Print hello, world
# Print he

Snippets from files with similar content are also included in the prompt, commented out and placed before the content of the edited file.
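The layout of the two-file prompt can be sketched as follows. The names `comment_out` and `build_multi_file_prompt` are hypothetical, and the sketch takes the similar snippets as input because the source does not describe how similarity is computed.

```python
from typing import List, Tuple


def comment_out(text: str, marker: str = "#") -> str:
    """Prefix every line of a snippet with a comment marker."""
    return "\n".join(f"{marker} {line}" for line in text.splitlines())


def build_multi_file_prompt(
    relative_path: str,
    text_before_cursor: str,
    snippets: List[Tuple[str, str]],
) -> str:
    """Assemble a prompt: path line, commented-out snippets, then the edited text.

    `snippets` is a list of (filename, snippet) pairs that were already
    selected as similar; the selection itself is outside this sketch.
    """
    lines = [f"# Path: {relative_path}"]
    for filename, snippet in snippets:
        lines.append(f"# Compare this snippet from {filename}:")
        lines.append(comment_out(snippet))
    lines.append(text_before_cursor)
    return "\n".join(lines)


prompt = build_multi_file_prompt(
    "file2.py",
    "# Print he",
    [("file1.py", "# Print hello, world")],
)
print(prompt)
```

Note that the snippet line already starts with `#`, so commenting it out yields the doubled `# #` seen in the prompt above.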

Prompt 3: Fill in the middle

Copilot supports Fill in the Middle. That means the extension sends the code before and after the cursor position to the model.

filename: file3.py
file content: # Test prefix\n# Test suffix

If the user presses enter after # Test prefix, the extension generates the prefix

# Path: file3.py
# Test prefix

and the suffix

# Test suffix
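Splitting a document into prefix and suffix at the cursor can be sketched like this. The helper `split_at_cursor` is hypothetical, and the assumption that the path line is added only to the prefix is taken from the example above.

```python
from typing import Tuple


def split_at_cursor(source: str, offset: int) -> Tuple[str, str]:
    """Return the text before and after the cursor offset."""
    return source[:offset], source[offset:]


source = "# Test prefix\n# Test suffix"
# Cursor sits at the start of the second line, right after the newline.
offset = len("# Test prefix\n")

before, suffix = split_at_cursor(source, offset)
prefix = f"# Path: file3.py\n{before}"

print(repr(prefix))
print(repr(suffix))
```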

Communication

Language model

To generate a completion, the extension sends a POST request to the endpoint https://copilot-proxy.githubusercontent.com/v1/engines/copilot-codex/completions.

After sending the request, the endpoint returns the following response.

Telemetry

The GitHub Copilot extension sends telemetry data to the endpoint https://dc.services.visualstudio.com.

Deeper Analysis

Vocabulary

The extension contains two vocabulary files:

| Filename | Vocabulary Size | Comment |
| --- | --- | --- |
| `vocab_cushman001.bpe` | 50,276 | This vocabulary is based on the GPT-2 vocabulary |
| `vocab_cushman002.bpe` | 100,000 | This vocabulary is new and no longer based on the GPT-2 vocabulary |

Min prompt chars

The prompt must be at least 10 characters long before it is sent to the model.

if ((_ > 0 ? n.length : d) < t.MIN_PROMPT_CHARS)
    return t._contextTooShort;
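The same check, written out in Python for readability. This is a simplified sketch: the minified original chooses between two lengths (`n.length` and `d`) depending on `_`, whereas this version looks only at one prompt string. The function name `context_too_short` is hypothetical.

```python
MIN_PROMPT_CHARS = 10  # threshold stated above


def context_too_short(prompt: str) -> bool:
    """Return True if the prompt is too short to be sent to the model."""
    return len(prompt) < MIN_PROMPT_CHARS


print(context_too_short("# Print"))     # 7 characters
print(context_too_short("# Print he"))  # 10 characters
```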

File information

The following information is collected about the file being edited:

const m = {
    uri: d.toString(), // The absolute path of the file
    source: t, // Content of the file
    offset: n, // The offset of the cursor
    relativePath: u, // The relative path of the file
    languageId: p // The programming language of the file
}
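For illustration, the same record could be modelled in Python as below. This is a sketch under assumptions: the `FileInfo` class, the `collect_file_info` helper, and the suffix-to-language table are all hypothetical stand-ins, since in VS Code the language ID is supplied by the editor rather than derived from the file name.

```python
from dataclasses import dataclass
from pathlib import PurePosixPath
from urllib.parse import quote


@dataclass
class FileInfo:
    uri: str            # absolute path of the file, as a file URI
    source: str         # content of the file
    offset: int         # offset of the cursor
    relative_path: str  # path relative to the workspace root
    language_id: str    # programming language of the file


# Hypothetical stand-in for the editor's language detection.
LANGUAGE_BY_SUFFIX = {".py": "python", ".js": "javascript", ".ts": "typescript"}


def collect_file_info(workspace_root: str, absolute_path: str,
                      source: str, offset: int) -> FileInfo:
    """Build a FileInfo record for the file being edited."""
    path = PurePosixPath(absolute_path)
    return FileInfo(
        uri="file://" + quote(str(path)),
        source=source,
        offset=offset,
        relative_path=str(path.relative_to(workspace_root)),
        language_id=LANGUAGE_BY_SUFFIX.get(path.suffix, "plaintext"),
    )


info = collect_file_info("/work/project", "/work/project/file1.py",
                         "# Print hello, world\n", 21)
print(info)
```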

Neighbor Files

The extension remembers the files that have been accessed before. The function getNeighborFiles calls the function truncateDocs, which takes the files sorted by access time as input.

When the combined size of all files exceeds 200,000, any additional files will be disregarded. The function truncateDocs returns a truncated list of files.
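The truncation described above might look roughly like this in Python. It is a sketch, not a port of the extension's code: the `Doc` class is hypothetical, size is assumed to be measured in characters, files are assumed to be ordered most recently accessed first, and the file that pushes the total over the limit is assumed to be kept.

```python
from dataclasses import dataclass
from typing import List

MAX_TOTAL_SIZE = 200_000  # limit stated above; the unit is an assumption


@dataclass
class Doc:
    path: str
    source: str
    last_access: float  # larger means more recently accessed


def truncate_docs(docs: List[Doc], limit: int = MAX_TOTAL_SIZE) -> List[Doc]:
    """Keep files in access order until the combined size exceeds the limit."""
    kept: List[Doc] = []
    total = 0
    for doc in sorted(docs, key=lambda d: d.last_access, reverse=True):
        kept.append(doc)
        total += len(doc.source)
        if total > limit:
            break  # any additional files are disregarded
    return kept


docs = [
    Doc("c.py", "z" * 10, last_access=1.0),
    Doc("a.py", "x" * 150_000, last_access=3.0),
    Doc("b.py", "y" * 100_000, last_access=2.0),
]
kept_paths = [d.path for d in truncate_docs(docs)]
print(kept_paths)
```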

Copilot Performance

We evaluated the Copilot model cushman-ml on the HumanEval dataset. Out of 164 programming problems, the model solves 56.10% (pass@1).

| Model name | Pass@1 | Date | Comment |
| --- | --- | --- | --- |
| code-cushman-001 | 32.93% | 2022-10-23 | https://openai.com/api/ |
| code-davinci-002 | 46.95% | 2022-10-23 | https://openai.com/api/ |
| cushman-ml | 56.10% | 2022-10-23 | Copilot |

Completions of the evaluation run: 2022-10-23-samples-cushman-ml.jsonl
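With one completion per problem, pass@1 is simply the fraction of problems whose completion passes the tests. The sketch below illustrates the arithmetic; the count of 92 solved problems is inferred from the reported 56.10% of 164 and is not stated in the evaluation output, and the function name `pass_at_1` is hypothetical.

```python
from typing import List


def pass_at_1(results: List[bool]) -> float:
    """Fraction of problems solved, given one pass/fail result per problem."""
    return sum(results) / len(results)


# 92 solved out of 164 is inferred from the reported percentage.
results = [True] * 92 + [False] * 72
score = f"{pass_at_1(results):.2%}"
print(score)
```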

saschaschramm/github-copilot

Folders and files

Latest commit

History

Repository files navigation

Github Copilot

Prompts

Prompt 1: Single file

Prompt 2: Multiple files

Prompt 3: Fill in the middle

Communication

Language model

Telemetry

Deeper Analysis

Vocabulary

Min prompt chars

File information

Neighbor Files

Copilot Performance

About

Topics

Resources

Stars

Watchers

Forks

Languages