fix(text-to-unicode): handle non-BMP + more conversion options #1087

lionel-rowe · 2024-05-14T13:46:33Z

vercel · 2024-05-14T13:46:37Z

The latest updates on your projects. Learn more about Vercel for Git ↗︎

Name	Status	Preview	Updated (UTC)
it-tools	✅ Ready (Inspect)	Visit Preview	May 15, 2024 3:23am

sonarcloud · 2024-05-15T03:22:38Z

Quality Gate passed

Issues
7 New issues
0 Accepted issues

Measures
0 Security Hotspots
No data about Coverage
0.0% Duplication on New Code

See analysis details on SonarCloud

sharevb · 2024-05-15T16:16:13Z

Hi @lionel-rowe, great job! It could be interesting to add a transcript that includes "character names" (i.e., using https://www.npmjs.com/package/@unicode/unicode-15.1.0).
This would fix #544.

lionel-rowe · 2024-05-17T00:01:08Z

> could be interesting to add a transcript including "character names" (ie, using https://www.npmjs.com/package/@unicode/unicode-15.1.0) ? This would fix #544

I have a standalone tool I currently use that provides similar functionality. The JSON file mapping chars/ranges to their names is around 2MB, which doesn't seem like a reasonable amount to pull in unconditionally here. An async solution loading only the relevant Unicode blocks with dynamic import() (perhaps with ASCII range loaded synchronously by default) could work, but it'd take a little work to make async-friendly.
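As a rough illustration of the lazy-loading idea (not code from this PR), the sketch below caches name data per chunk and only requests a chunk the first time a code point inside it is looked up. Everything here is an assumption made for the example: the `NameChunk` shape, the fixed chunk size of 0x100 code points (real Unicode blocks vary in size), the `LazyNameLookup` class, and the injected `loadChunk` function, which in a real build might wrap a dynamic `import()`.

```typescript
// Hypothetical sketch: none of these names come from the PR or from node-unicode.
type NameChunk = Record<number, string>;
type ChunkLoader = (chunkId: number) => Promise<NameChunk>;

// Assumed chunking scheme: 0x100 code points per chunk, so chunk 0 covers ASCII.
const CHUNK_SIZE = 0x100;

function chunkIdFor(codePoint: number): number {
  return Math.floor(codePoint / CHUNK_SIZE);
}

class LazyNameLookup {
  // Cache promises rather than results so concurrent lookups share one request.
  private cache = new Map<number, Promise<NameChunk>>();
  private loadChunk: ChunkLoader;

  constructor(loadChunk: ChunkLoader, preloaded?: Map<number, NameChunk>) {
    this.loadChunk = loadChunk;
    // Chunks bundled synchronously (e.g. the ASCII range) skip the loader entirely.
    if (preloaded) {
      preloaded.forEach((chunk, id) => this.cache.set(id, Promise.resolve(chunk)));
    }
  }

  async nameOf(codePoint: number): Promise<string | undefined> {
    const id = chunkIdFor(codePoint);
    let pending = this.cache.get(id);
    if (pending === undefined) {
      pending = this.loadChunk(id);
      this.cache.set(id, pending);
    }
    return (await pending)[codePoint];
  }
}

// A loader might look like this (the file path is hypothetical):
//   const loadChunk: ChunkLoader = (id) =>
//     import(`./names/chunk-${id}.json`).then((m) => m.default);
```

Injecting the loader keeps the lookup logic independent of how the data is bundled, which also makes it easy to swap in a stub during tests.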

Edit: Actually, it looks like the RLE+gzip+base64-encoded version of the name data in the node-unicode package you linked to is "only" ~194 KB, which is much more reasonable, especially if only loaded conditionally. It could be reduced by a further ~25% if loaded as a raw binary file instead of base64. But I'm not convinced it's within the scope of this tool: that would be more of a Unicode "explainer" than a Unicode "converter". Output should probably be tabular, maybe with a link to an external site like compart (as in my standalone tool). The reverse direction (if implemented) obviously wouldn't take that tabular data as input; it'd be a search function, preferably with fuzzy matching. In any case, it's definitely not a simple bidirectional converter like the text-to-unicode tool.
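The ~25% figure follows from how base64 works: it emits 4 output characters for every 3 input bytes, so the raw payload is about three quarters the size of its base64 form. A small sketch of that arithmetic, with helper names invented for this example:

```typescript
// Base64 (with padding) encodes each group of up to 3 bytes as 4 characters.
function base64Length(rawBytes: number): number {
  return Math.ceil(rawBytes / 3) * 4;
}

// Upper bound on the decoded size of a padded base64 string of the given length.
function rawLengthUpperBound(base64Chars: number): number {
  return Math.floor(base64Chars / 4) * 3;
}

// Fraction saved by shipping raw bytes instead of base64 text.
function savingsRatio(rawBytes: number): number {
  return 1 - rawBytes / base64Length(rawBytes);
}
```

Applied to the ~194 KB figure above, a 3:4 ratio gives roughly 194 × 0.75 ≈ 145.5 KB for the raw binary form, before any transport-level compression is considered.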

sharevb · 2024-05-19T16:40:24Z

Hi @lionel-rowe, yes, you're right: this is not bidirectional, and yes, the reverse will be a lookup.

fix(text-to-unicode): handle non-BMP + more conversion options

b0ae8d7

vercel bot deployed to Preview May 14, 2024 13:48 View deployment

lionel-rowe marked this pull request as draft May 14, 2024 14:29

Always escape ASCII chars with special meaning

1dc965d

vercel bot deployed to Preview May 15, 2024 02:55 View deployment

Convert Converter to class

d75473b

vercel bot deployed to Preview May 15, 2024 03:10 View deployment

Fix type checking issues

7e90365

vercel bot deployed to Preview May 15, 2024 03:23 View deployment

lionel-rowe marked this pull request as ready for review May 15, 2024 03:28
