Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Always open links in chat in a new tab #2180

Merged
merged 2 commits into from
May 26, 2024
Merged

Conversation

austenadler
Copy link

@austenadler austenadler commented May 11, 2024

Pull Request Checklist

  • Target branch: Pull requests should target the dev branch.
  • Description: Briefly describe the changes in this pull request.
  • Changelog: Ensure a changelog entry following the format of Keep a Changelog is added at the bottom of the PR description.
  • Documentation: Have you updated relevant documentation Open WebUI Docs, or other documentation sources?
  • Dependencies: Are there any new dependencies? Have you updated the dependency versions in the documentation?
  • Testing: Have you written and run sufficient tests for the changes? -> I did not see any tests
  • Code Review: Have you self-reviewed your code and addressed any coding standard issues?

Description

Add target="_blank" rel="nofollow" to all links in the chat history window. It is annoying to click a link in a chat and have it unload Open-Webui, so I added this feature. (This)[https://github.com/markedjs/marked/issues/655#issuecomment-383226346] issue in the marked repo describes how to do this properly.

This only applies to links within chat messages. For example, asking a model what is the URL of https://google.com? and clicking the link in your message or the one it produces.


Changelog Entry

Changed

  • Links in chat open in a new tab instead of the same tab

@anuraagdjain
Copy link
Contributor

Screenshot 2024-05-13 at 10 45 44 AM

The UI looks fine and clicking on the link opens a new page 👏 . However there seems to be some extra property at the end of the URL which renders broken page. This isn't related to the changes done in PR, but would you be able to look at it?

<p>The Google website link is &lt;<a target="_blank" rel="nofollow" href="https://www.google.com/%3E">https://www.google.com/&gt;</a>.</p>

@austenadler
Copy link
Author

I believe the issue here is this line:

$: tokens = marked.lexer(sanitizeResponseContent(message.content));

Which calls this function:

export const sanitizeResponseContent = (content: string) => {
return content
.replace(/<\|[a-z]*$/, '')
.replace(/<\|[a-z]+\|$/, '')
.replace(/<$/, '')
.replaceAll(/<\|[a-z]+\|>/g, ' ')
.replaceAll('<', '&lt;')
.trim();
};

This transforms <https://www.google.com> into &lt;https://www.google.com>, and then this is passed in to marked here:

{@html marked.parse(token.raw, {
...defaults,
gfm: true,
breaks: true,
renderer
})}
which correctly parses it

marked knows how to parse <https://example.com> in the latest version as well as 9.1.6, so if we fix that &lt; replacement, it will work.

I don't know what the impact of adding .replaceAll('>', '&gt;') is to sanitizeResponseContent. Another option is to not replace < with &lt;.

@tjbck tjbck marked this pull request as draft May 15, 2024 18:42
@austenadler
Copy link
Author

I tried two ways to fix sanitization

  1. Method 1: Replace > with &gt; as well (included in this PR)
Patch

diff --git a/src/lib/utils/index.ts b/src/lib/utils/index.ts
index 03cbc626..2f9b1dc2 100644
--- a/src/lib/utils/index.ts
+++ b/src/lib/utils/index.ts
@@ -36,11 +36,14 @@ export const sanitizeResponseContent = (content: string) => {
 		.replace(/<$/, '')
 		.replaceAll(/<\|[a-z]+\|>/g, ' ')
 		.replaceAll('<', '&lt;')
+		.replaceAll('>', '&gt;')
 		.trim();
 };
 
 export const revertSanitizedResponseContent = (content: string) => {
-	return content.replaceAll('&lt;', '<');
+	return content
+		.replaceAll('&lt;', '<')
+		.replaceAll('&gt;', '>');
 };
 
 export const capitalizeFirstLetter = (string) => {

  1. Method 2: Use dompurify after markup parsing (as suggested here)
Patch

commit 9c48a1568d00029bef65b9b653d02fd013f52eef
Author: Austen Adler <agadler@austenadler.com>
Date:   Thu May 16 16:28:10 2024 -0400

    Use dompurify instead of manual sanitization

diff --git a/package-lock.json b/package-lock.json
index be7ae0dc..80477544 100644
--- a/package-lock.json
+++ b/package-lock.json
@@ -12,6 +12,7 @@
 				"async": "^3.2.5",
 				"bits-ui": "^0.19.7",
 				"dayjs": "^1.11.10",
+				"dompurify": "^3.1.3",
 				"eventsource-parser": "^1.1.2",
 				"file-saver": "^2.0.5",
 				"highlight.js": "^11.9.0",
@@ -2730,6 +2731,11 @@
 				"url": "https://github.com/fb55/domhandler?sponsor=1"
 			}
 		},
+		"node_modules/dompurify": {
+			"version": "3.1.3",
+			"resolved": "https://registry.npmjs.org/dompurify/-/dompurify-3.1.3.tgz",
+			"integrity": "sha512-5sOWYSNPaxz6o2MUPvtyxTTqR4D3L77pr5rUQoWgD5ROQtVIZQgJkXbo1DLlK3vj11YGw5+LnF4SYti4gZmwng=="
+		},
 		"node_modules/domutils": {
 			"version": "3.1.0",
 			"resolved": "https://registry.npmjs.org/domutils/-/domutils-3.1.0.tgz",
diff --git a/package.json b/package.json
index f8096269..ea7de885 100644
--- a/package.json
+++ b/package.json
@@ -49,6 +49,7 @@
 		"async": "^3.2.5",
 		"bits-ui": "^0.19.7",
 		"dayjs": "^1.11.10",
+		"dompurify": "^3.1.3",
 		"eventsource-parser": "^1.1.2",
 		"file-saver": "^2.0.5",
 		"highlight.js": "^11.9.0",
diff --git a/src/lib/components/chat/Messages/ResponseMessage.svelte b/src/lib/components/chat/Messages/ResponseMessage.svelte
index 1ad252f6..b067aac1 100644
--- a/src/lib/components/chat/Messages/ResponseMessage.svelte
+++ b/src/lib/components/chat/Messages/ResponseMessage.svelte
@@ -1,5 +1,6 @@
 <script lang="ts">
 	import { toast } from 'svelte-sonner';
+	import DOMPurify from 'dompurify';
 	import dayjs from 'dayjs';
 	import { marked } from 'marked';
 	import tippy from 'tippy.js';
@@ -441,15 +442,15 @@
 									{#if token.type === 'code'}
 										<CodeBlock
 											lang={token.lang}
-											code={revertSanitizedResponseContent(token.text)}
+											code={token.text}
 										/>
 									{:else}
-										{@html marked.parse(token.raw, {
+										{@html DOMPurify.sanitize(marked.parse(token.raw, {
 											...defaults,
 											gfm: true,
 											breaks: true,
 											renderer
-										})}
+										}))}
 									{/if}
 								{/each}
 							{/if}
diff --git a/src/lib/utils/index.ts b/src/lib/utils/index.ts
index 03cbc626..20c68d84 100644
--- a/src/lib/utils/index.ts
+++ b/src/lib/utils/index.ts
@@ -35,14 +35,9 @@ export const sanitizeResponseContent = (content: string) => {
 		.replace(/<\|[a-z]+\|$/, '')
 		.replace(/<$/, '')
 		.replaceAll(/<\|[a-z]+\|>/g, ' ')
-		.replaceAll('<', '&lt;')
 		.trim();
 };
 
-export const revertSanitizedResponseContent = (content: string) => {
-	return content.replaceAll('&lt;', '<');
-};
-
 export const capitalizeFirstLetter = (string) => {
 	return string.charAt(0).toUpperCase() + string.slice(1);
 };

Which one would you prefer? I have already tested both and they're about the same. I think dompurify is better long-term, but it's a larger change.

@tjbck
Copy link
Contributor

tjbck commented May 21, 2024

1st option would be preferable for now!

@austenadler
Copy link
Author

Okay, I pushed patch 1

@austenadler austenadler marked this pull request as ready for review May 23, 2024 13:40
@anuraagdjain
Copy link
Contributor

Couldn't check this earlier. The change seems to be working thank you 🙏

Copy link
Contributor

@anuraagdjain anuraagdjain left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 🚀

@tjbck
Copy link
Contributor

tjbck commented May 26, 2024

Thanks everyone!

@tjbck tjbck merged commit 6e89a48 into open-webui:dev May 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants