Support for graphics in the terminal #4763
This sounds like it might have a significant performance impact. Have you actually tested the performance of this PR?
That sounds extremely hacky, which I am not a fan of. This will likely just cause an endless heap of issues with things like selection, so it's not something we can just throw in and forget about.
This seems to just pile on a heap of code to support a bunch of image protocols. I don't like the idea of adding thousands of lines to Alacritty for something with so little use. If we support any protocol it should be the simplest one without any performance impact. I see no benefit in supporting multiple formats.
What about support for devices below OpenGL 3.3? This is something we would like to look into for the future so adding more code that requires at least OpenGL 3.3+ does not seem ideal.
I see little reason for adding memoffset/lazy_static usually, but I haven't looked at the code yet. |
Same |
I launched vtebench three times, in the current Running $ base64 ./target/release/alacritty > /tmp/data
$ wc -l /tmp/data
635257 /tmp/data
# graphics branch
$ perf stat -e cycles,instructions,branches,branch-misses ./target/release/alacritty -e cat /tmp/data
Performance counter stats for './target/release/alacritty -e cat /tmp/data':
4,590,935,968 cycles
10,125,002,121 instructions # 2.21 insn per cycle
2,038,270,623 branches
4,651,069 branch-misses # 0.23% of all branches
0.854657772 seconds time elapsed
0.774462000 seconds user
0.466697000 seconds sys
# master branch
$ perf stat -e cycles,instructions,branches,branch-misses ./target/release/alacritty -e cat /tmp/data
Performance counter stats for './target/release/alacritty -e cat /tmp/data':
4,692,460,970 cycles
10,053,624,431 instructions # 2.14 insn per cycle
2,035,973,894 branches
4,795,961 branch-misses # 0.24% of all branches
0.855561156 seconds time elapsed
0.762745000 seconds user
0.493408000 seconds sys
I changed the NBSP character with a
Without comments and tests, the support for the iTerm2 protocol is less than 100 lines of code.
The only performance impact of this protocol is an extra branch in the
The fallback uses |
This issue is now fixed. |
That doesn't justify adding it. If Alacritty supports any graphics protocol, there should be one and it should be the simplest one available. |
I'm also getting the following error:
What is the output of |
96 or something like that iirc. |
I uploaded a fix in 387a224 to check if this fixes the problem. If this works, we may need to generate the code of the fragment shader dynamically, in order to use the 32 units if they are available in the hardware. |
@ayosec did you test the iTerm2 protocol with ranger? Previewed images are kept on screen and not cleared, so new images are overlaid on top of the old ones, and images are kept in the background when text is rendered in the preview pane. Another issue is that iTerm2 graphics work with tmux as long as the image file is small, but with large files it looks like I'm hitting this, which is not an Alacritty issue, but Alacritty is spamming warning messages like this:
You should be able to replicate with:
Note that this issue doesn't happen when running ranger directly without going through a tmux session, and not when running in an ssh session. |
Ranger relies on overwriting images with spaces. The discussion in #910 rejected my first approach to be able to have this functionality. However, if we can change the implementation in this patch to use As a work-around, images can be replaced with the █ character (reversed with
This error happens because the base64 stream is incomplete. Did you use iTerm2 imgcat? It seems that tmux requires specific sequences to send images. Anyways, since iTerm2 will not be accepted, maybe it's not worth it to spend time debugging the issue. |
I can replicate the issue with the linked imgcat script and an image. I won't attach the image here because I'm not sure about its license but I can send it via email.
That's unfortunate, it would have been nice to have it. |
I guess it's better to wait for sixel to support 24-bit colors? |
Sixel isn't likely to change. |
I have updated the patch with the following changes:
Can your implementation of sixel convert 24bit colors into 256 colors and display a video efficiently? |
About the performance drop: if this were made into a setting and disabled by default, would it still hurt performance? If not, that would be a nice way to support it when needed while keeping Alacritty as fast as possible in normal installs.
Opting out of performance is not something that aligns with Alacritty's goals. |
That's unfortunate because image support is a deal breaker for me. Ligatures and image support, which are by and large the most requested features for this terminal as far as I can tell, aren't exactly the most difficult of asks. We aren't asking for an SSH tool or anything. |
Saying anything text-related isn't difficult definitely shows the depths of understanding you have for these issues. Regardless of difficulty, if you're looking for a terminal emulator with every feature under the sun, I'd recommend looking elsewhere. |
I'd agree with you if I had said they aren't difficult, which I didn't. I said they aren't the most difficult features to add. And saying I'm asking for every feature under the sun, when I'm actually asking for two incredibly mediocre features, one of which even basic terminal emulators such as foot or xterm have, is a strawman. I won't argue with you, though. It's your terminal. Have a nice day |
@ayosec can you enable issues/discussions on your fork? |
Performance isn't the issue here, there should be no performance impact if the code paths in this PR aren't hit. There's not much point in discussing performance impact without measuring it first. |
XTerm is the exact opposite of basic.
It has been measured. |
@chrisduerr what will you consider acceptable to merge? If you were one to accept this PR, what would you change? I see that performance is the most important thing to you - but what else? On the other hand, the performance tests were made 3 years ago. Maybe it's time to revisit this problem? Furthermore, if this feature is hidden behind the flag, will you accept minimal performance decrease? It would be (then) an optional, and fully aware choice for the user to lose a little bit of speed. |
Maybe @ayosec could add a donation button, so that he can keep maintaining this fork
First of all the protocol is garbage. It's used only because it is somewhat widespread already, not because it's actually good. So that immediately stacks the odds against it. I also couldn't care less about images in the terminal, so any implementation should be simple so I don't have to spend a bunch of time maintaining something I don't care about. ~3k lines is a significant portion of Alacritty's code, so I would consider this "not simple". @ayosec has spent significant effort writing and maintaining this, but I don't see how one could justify upstreaming this. |
I think this is probably the most important part. You don't care about, or (I'm guessing) use images in the terminal. So, terminal users who want images in the terminal just shouldn't use your project. Discussion in this topic has made it pretty clear that even a more modern protocol (kitty) also wouldn't be welcome. So, why even keep this issue open, just refuse any image protocol, and tell your image-desiring users to go somewhere else (which you've basically done).
After having worked on a client-side implementation, I agree completely. Sixel makes no sense, is hugely wasteful (it is incomprehensible why it was chosen to draw in groups of six vertical pixels). Kitty looks a bit better, but you seem to dislike it also? Would there be any image-in-the-terminal protocol you would support? If you were to design a minimal implementation (<= 1K LoC), what would that look like? At the risk of running into the XKCD situation, it would be nice to have a modern, "just draw pixels into the terminal" protocol. No GIFs, no motion, basic formatting support (maybe scrolling support). |
It’s not incomprehensible. The Sixel protocol was designed to be compatible with their dot–matrix printers. Those printers had a nine–pin print head, but six pins can be encoded directly into 64 characters that won’t conflict or be confused with other escape sequences. In particular, the Sixel data will never contain an ESC character. |
This is a PR, not an issue. It is only kept open because ayosec has demonstrated to be a responsible maintainer of this patch set and hosting it on Alacritty's official repository makes it easier to find for people interested in it.
That's the issue with image protocols: If I were to design an image protocol, it would be intentionally garbage. No guarantee as to alignment within cells, whatever size seems right, fail frequently. Kitty's keyboard protocol is good for people that want to use the graphics protocol to do non-text rendering in a terminal, but that's not what a terminal's purpose is to begin with. Any image protocol I would like should be awful enough that nobody wants to use it, which is something that things like w3m have provided very well in the past. So at that point why even bother? |
The point, for me at least, is to have a fairly cross-platform UI abstraction that is minimalist (read low attack surface), and is capable of rendering images for interfacing with applications that supply image data. Whether that application is a website, where a terminal-based web browser like a modern take on Or, a simple image viewer for not having to shell out to a full GUI app to view images in your filesystem. There are plenty of use-cases for wanting image support in the terminal. |
The terminal is not the right place for implementing a modern browser. Websites are not even remotely text-based anymore. |
On macOS you can do A simple application that spawns a window and renders an image should be very fast. |
Even sixel serves its purpose of image preview in terminal file managers. Yazi terminal file manager has built-in support for sixel, kitty, and iTerm2 image protocols. Image preview in terminal file manager makes yazi a good alternative to GUI file managers, all of which seem worse than terminal file managers outside image preview. The reason that people flock to terminal emulator for images is because all major existing GUI toolkits suck and most GUI applications have bad UI. |
@nixpulvis There are a number of use cases I take advantage of regularly:
That said, I think the current situation works well enough provided @ayosec doesn't mind maintaining a soft fork. I couldn't expect y'all to pull in code you weren't interested in maintaining. I also appreciate alacritty's simplicity compared to a lot of other terminals, and I assume the same attitude that resulted in your opinions on this PR is why it doesn't have a lot of other stuff I probably wouldn't want included. Big thanks to all the maintainers of Alacritty, and another big thank you ayosec for maintaining this fork :) |
The reason that people demand images in terminal is that major GUI toolkits like Gtk and Qt are worse than terminal user interface in many ways. Gtk and Qt break backward compatibility every few years in the name of innovations which tend to lead to more complexity instead of more simplicity. I evaluated Gtk and Qt and other GUI toolkits and decided to use a TUI library because I didn't want to migrate my application to new versions of Gtk or Qt every few years. And, Gtk and Qt are difficult to use in good programming languages. Gtk and Qt are useable on Python and C++, all of which are bad languages. I found a TUI library that has a saner programming model than either Gtk or Qt, and doesn't break backward compatibility every few years in the name of innovations. If we have a good GUI library that easily allows creation of GUI applications better than terminal UI applications, then this issue wouldn't exist. |
This is not an argument to add image support to terminals. If anything it's an argument why it should not be added.
So to avoid bloating UI toolkits, whose purpose is rendering textures, you want to instead increase the complexity of terminal emulators, which never had any business rendering images?
You can open as many images as you want with sixel and then just go through them. So image count alone isn't really an argument for/against simple image viewers.
This is just using a terminal multiplexer as a replacement for a proper window manager.
GUI as root is just as bad as terminal as root. If anything a file manager temporarily elevating privileges to root has less resource access than a terminal emulator.
E-Mails, just like Websites, unfortunately aren't text-based protocols anymore. There is no "good" solution here, HTML E-Mail is just always atrocious. |
With sixel/timg I can list out thumbnails of a large directory of images all at once in a single view. I'm not aware of a GUI image viewer that defaults to a full page thumbnail view, and going through separate windows or a single window with one at a time would take way too long, so before I had sixel integrated into my setup I'd end up opening a file manager each time. It isn't the end of the world, but it's nice not having to do that 20+ times a day.
I do use terminal apps pretty heavily for a lot of my computing, and I use tmux as a window manager for them, but I'm not sure there's a proper window manager solution to having three panes with vim on the left and one on the top right with mutt sized to 30% of the height, and then quickly getting an image to fill the spot on the bottom right and only have it there when the terminal is in focus.
My file manager doesn't offer the ability to switch to root or other users. There's also a quote by a Gnome dev at the top of https://wiki.archlinux.org/title/Running_GUI_applications_as_root that I agree with, but even ignoring that end of things, I still wouldn't be able to quickly access images locked to a separate user without having to become root and then fish them out. I actually remembered a related use case that I hadn't included before: viewing images on other computers over ssh.
I've managed everything but my work email pretty well exclusively by terminal for many years. Until I incorporated sixel support I had to save the html attachments and then open them with a web browser, but haven't had to do that once since. I don't see the email exactly like the sender intends (if such a thing is possible), but I'm not missing any of the content either. I fully agree that HTML email is garbage. Anyway, I wasn't trying to argue, apologies if it came off like I was. I realize that everyone uses software their own way, and yours should take priority being that you're developing it. I thought it was worth sharing that there are reasonable use cases for some sort of terminal image solution as having sixel has improved my workflow a ton, but I'm sure others out there would have their workflow improved by gui tabs and I'm happy I don't need to unconfigure those in each Alacritty install. I also wanted an excuse to thank you all for the amazing project! |
Yeah I'm not doubting that it might have helped you. I just don't personally ever see any arguments that make me feel like they're of sufficient value to compromise on one of the core principles of terminal emulation. |
I don't think I really understand what that principle is. If Alacritty is an emulation, it is presumably an emulation of a historical terminal. Why then shouldn’t it include every feature that the historical terminal had? DEC terminals had both Sixel and ReGIS graphics protocols (for bitmap and vector graphics, respectively). Tektronix terminals had their own graphics protocol which was emulated by terminals from Wyse and others. The IBM 2250 had better support for vector graphics than for text; you either had to pay extra to get the optional character generator or you would draw characters using vector graphics.
I think any tiling WM should do the trick fairly easily. Off the top of my head, i3wm for X11-based, sway for Wayland-based, or some tiling-capable WM like awesome should be able to do this too without having to enforce tiling on all windows
Terminals were HID peripherals that evolved until being ultimately replaced by the screen+keyboard+mouse we have nowadays. The choice of emulating based on standard hardware is for backward compatibility and integrating with/benefiting from an ecosystem that has been there for decades. That being said, you can't emulate any and all terminal functions because you'd end up getting closer and closer to just having a software screen or a graphics library. We already have those and they use more efficient approaches than a text-based protocol. This leaves us with having to draw a line between what is considered "basic text-based interface to interact with users and display things" and what is better left to graphics applications. That line is the responsibility of maintainers and each project has its own.
If each project has its own threshold, then it isn’t “one of the core principles of terminal emulation.”
I'll admit my previous response is a bit all over the place... The core principle is being text-oriented as opposed to being/providing a graphical framework. You won't see a purely graphical terminal emulator without text support but you can find lots of terminal emulators without graphics support (never say never I know but I'd like to see a real life use case of such an emulator)
Here is a picture of someone using a terminal in 1970: Or how about this one, from 1971: (source) Of course, Alacritty doesn’t emulate the IBM 2250. I point out these particular photos because on this terminal, text really was optional. You had to pay tens of thousands of dollars extra to get a character generator! Alacritty mostly emulates the DEC terminals. But the DEC terminals had both vector and bitmap graphics support, and plenty of applications used them. There were CAD programs and mapping (GIS) programs. There were engineers and scientists studying graphs. There were games. Surely an emulator should emulate those capabilities accurately, even if they are inefficient by comparison with modern graphics apis. And even if it is hard to find photographs of people actually using them. Does anyone know what type of terminals were shown in Farewell Etaoin Shrdlu? Not many images, but definitely some early WYSIWYG text rendering in 1978. |
This patch adds support for graphics in the terminal. Graphics can be sent as Sixel images.
New features
Sixel parser
The Sixel parser is based on the SIXEL GRAPHICS EXTENSION chapter in a DEC manual.
The support is complete except for the pixel aspect ratio parameters. According to the manual, a Sixel image can specify that it expects a specific shape for the pixels in the device, but in none of the terminals that I checked did these parameters have any effect: they always assume a 1:1 ratio. Also, I didn't find any Sixel generator that emits a different ratio. To avoid extra complexity in the parser, it always assumes 1:1 when the image is built.
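For readers unfamiliar with the data the parser consumes, here is a minimal sketch of how one Sixel data character maps to a column of six pixels. It is not the parser from this patch, and the name decode_sixel is made up for illustration. It relies only on the encoding itself: data characters run from ? (0x3F) to ~ (0x7E), and subtracting 0x3F leaves six bits, with the least significant bit being the top pixel.

```rust
/// Decode one Sixel data character into six vertical pixels, top to bottom.
/// Returns `None` for bytes outside the data range `?`..=`~` (0x3F..=0x7E).
/// `decode_sixel` is a hypothetical helper, not a function from the patch.
fn decode_sixel(byte: u8) -> Option<[bool; 6]> {
    if !(0x3F..=0x7E).contains(&byte) {
        return None;
    }

    // The six low bits select which pixels of the column are painted.
    let bits = byte - 0x3F;

    let mut column = [false; 6];
    for (row, pixel) in column.iter_mut().enumerate() {
        // Bit 0 is the top pixel, bit 5 the bottom one.
        *pixel = bits & (1 << row) != 0;
    }

    Some(column)
}
```

With this mapping, ? is an empty column, ~ is a fully painted one, and @ paints only the top pixel. Bytes such as ESC are rejected because they fall outside the data range.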
There are two new terminal modes:
SixelScrolling (80)
If enabled, new graphics are inserted at the cursor position, and they can scroll the grid.
If disabled, new graphics are inserted at top-left, and they are limited by the height of the window.
SixelPrivateColorRegisters (1070)
If enabled, every Sixel parser has its own color palette.
If disabled, Sixel images can share a color palette.
Initially I didn't plan to support this mode, since it seems to be specific to xterm, but when I was testing applications using Sixel I found that mpv uses it to reuse a palette between video frames. Since mpv is based on libsixel, I guess that this feature could be used by more applications.
Both modes are enabled by default.
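As an illustration of how an application might toggle these modes, the sketch below builds the escape sequences. It assumes the modes are set and reset like other DEC private modes (CSI ? Pm h to enable, CSI ? Pm l to disable), which is how xterm exposes 80 and 1070. The SixelMode enum and mode_sequence function are hypothetical names, not identifiers from the patch.

```rust
/// The two modes described above. Hypothetical type, used only for this sketch.
#[derive(Clone, Copy)]
enum SixelMode {
    Scrolling,
    PrivateColorRegisters,
}

impl SixelMode {
    /// Mode number as given in the description.
    fn number(self) -> u16 {
        match self {
            SixelMode::Scrolling => 80,
            SixelMode::PrivateColorRegisters => 1070,
        }
    }
}

/// Build the DEC private mode sequence: `CSI ? <n> h` sets, `CSI ? <n> l` resets.
fn mode_sequence(mode: SixelMode, enable: bool) -> String {
    let action = if enable { 'h' } else { 'l' };
    format!("\x1b[?{}{}", mode.number(), action)
}
```

Writing mode_sequence(SixelMode::PrivateColorRegisters, false) to the terminal before a series of frames would, under the semantics described above, let those images share a palette.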
The function to convert HLS colors to RGB is a direct port of the implementation of the same function in xterm. I verified that the function emits the same results in all combinations of values 0, 30, 60, 90, and 100 in every color component. Only a few of these combinations were left in the tests to reduce the noise in the code.
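To give an idea of what such a conversion involves, here is a floating-point sketch. It is not the port of the xterm function used in the patch, and rounding may differ from xterm's results. It assumes the DEC convention where hue 0 is blue, 120 is red and 240 is green, takes lightness and saturation as percentages, and returns 8-bit RGB. The name hls_to_rgb is illustrative.

```rust
/// Convert a Sixel HLS color (hue in degrees, lightness and saturation in
/// percent) to 8-bit RGB. Illustrative sketch only, not the patch's function.
fn hls_to_rgb(h: u16, l: u8, s: u8) -> (u8, u8, u8) {
    // DEC hue 0 is blue; the usual HSL formula expects red at 0,
    // so rotate the angle by 240 degrees.
    let h = f64::from((h % 360 + 240) % 360);
    let l = f64::from(l.min(100)) / 100.0;
    let s = f64::from(s.min(100)) / 100.0;

    // Standard HSL to RGB: chroma, intermediate component, lightness offset.
    let c = (1.0 - (2.0 * l - 1.0).abs()) * s;
    let hp = h / 60.0;
    let x = c * (1.0 - (hp % 2.0 - 1.0).abs());
    let m = l - c / 2.0;

    // Pick the RGB ordering for the 60-degree sector the hue falls in.
    let (r, g, b) = match hp as u8 {
        0 => (c, x, 0.0),
        1 => (x, c, 0.0),
        2 => (0.0, c, x),
        3 => (0.0, x, c),
        4 => (x, 0.0, c),
        _ => (c, 0.0, x),
    };

    let to_byte = |v: f64| ((v + m) * 255.0).round() as u8;
    (to_byte(r), to_byte(g), to_byte(b))
}
```

Under these assumptions, hue 120 at 50% lightness and 100% saturation gives pure red, and any hue at 0% or 100% lightness gives black or white.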
To test the parser there are Sixel images generated with 3 different applications. For each one, there is a .rgba file in the same directory with the expected RGBA values. The commands to produce these files are in alacritty_terminal/tests/sixel/README.md.
Byte 90 as DCS
Sixel images using byte 90 as DCS are not supported. DCS can be either ESC P or 90, but the vte crate only recognizes ESC P. I guess that this is because 90 can be a continuation byte in a UTF-8 sequence (two most significant bits are 10), so it can be a valid input from users.
Xterm has the same limitation. I don't expect that many applications depend on it.
ppmtosixel is an exception. It uses 90 to start the Sixel data, and it has to be replaced if we want to see an image generated by it.
It is still interesting to test ppmtosixel because it was written in 1991, long before Sixel was added to most (if not all) terminal emulators.
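The two points above can be sketched in a few lines. This assumes the 90 in the text is hexadecimal (0x90), which matches the bit pattern described. The helper names are hypothetical, and the replacement only touches a leading 0x90; it is a workaround sketch for ppmtosixel-style output, not code from the patch.

```rust
const ESC: u8 = 0x1B;
/// 8-bit C1 form of DCS, assuming the "90" above is hexadecimal.
const C1_DCS: u8 = 0x90;

/// A UTF-8 continuation byte has its two most significant bits set to `10`.
fn is_utf8_continuation(byte: u8) -> bool {
    byte & 0xC0 == 0x80
}

/// Replace a leading 0x90 with the 7-bit form `ESC P`, leaving the rest of
/// the stream untouched. Input that does not start with 0x90 is copied as-is.
fn replace_leading_c1_dcs(input: &[u8]) -> Vec<u8> {
    match input.split_first() {
        Some((&C1_DCS, rest)) => {
            let mut out = Vec::with_capacity(input.len() + 1);
            out.extend_from_slice(&[ESC, b'P']);
            out.extend_from_slice(rest);
            out
        }
        _ => input.to_vec(),
    }
}
```

is_utf8_continuation(0x90) returns true, which is why a parser working on UTF-8 input cannot treat that byte as an unambiguous DCS introducer.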