[PATCH 0/2] Implement bygrapheme()
- Subject: [PATCH 0/2] Implement bygrapheme()
- From: "S. Gilles" <sgilles@xxxxxxxxxxxx>
- Reply-to: myrddin-dev@xxxxxxxxxxxxxx
- Date: Sun, 5 Nov 2017 01:28:48 -0500
- To: "myrddin-dev" <myrddin-dev@xxxxxxxxxxxxxx>
- Cc: "S. Gilles" <sgilles@xxxxxxxxxxxx>
A week or so ago, Ori suggested to me on IRC that std.bygrapheme() would be good to have. I'm trying to improve support for zalgo text in libtermdraw, and I find myself wanting this as well.

This patch takes a slightly narrower view than the Unicode spec of what a “grapheme” is. Unicode's definition of a grapheme is context- and user-dependent. For simplicity of implementation, this patch treats a grapheme as a codepoint of width > 0 (as determined by std.cellwidth()), followed by 0 or more codepoints of width 0.

If the argument to bygrapheme() doesn't start with a grapheme, or if it isn't valid UTF-8, the function will attempt to read off enough bytes to generate something that would display with positive width.

I'm coming at this entirely from the perspective of libtermdraw; hopefully it is not too awkward for other applications.

Also, in order to make the test patch readable, patch 1 removes all 0x00 bytes from lib/std/test/utf.myr, so that git knows to display diffs correctly.

S. Gilles (2):
  Make lib/std/test/utf.myr a non-binary file
  Implement bygrapheme()

 lib/std/test/utf.myr | Bin 1781 -> 4927 bytes
 lib/std/utf.myr      |  25 +++++++++++++++++++++++++
 2 files changed, 25 insertions(+)

--
2.15.0
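For readers who want to experiment with the simplified rule from the cover letter (one codepoint of positive width, followed by zero or more width-0 codepoints), here is a rough sketch in Python. It is not the Myrddin code from the patch. The names cell_width() and by_grapheme() are hypothetical. cell_width() is only a crude stand-in for std.cellwidth(): it assumes combining marks and format characters have width 0 and everything else has width 1. Invalid UTF-8 is not modeled, since a Python str is already decoded. Absorbing leading width-0 codepoints into the first cluster is one possible reading of the fallback behavior described above, not necessarily what the patch does.

```python
import unicodedata


def cell_width(ch):
    """Hypothetical stand-in for std.cellwidth().

    Assumption: combining marks (Mn, Me) and format characters (Cf) occupy
    0 cells and everything else occupies 1. Double-width characters are
    not modeled.
    """
    if unicodedata.category(ch) in ("Mn", "Me", "Cf"):
        return 0
    return 1


def by_grapheme(s):
    """Yield clusters: a codepoint of width > 0 plus trailing width-0 codepoints."""
    i = 0
    n = len(s)
    while i < n:
        start = i
        # Input that does not start with a positive-width codepoint:
        # keep consuming until something displayable is reached.
        while i < n and cell_width(s[i]) == 0:
            i += 1
        # The base codepoint, if any input is left.
        if i < n:
            i += 1
        # Trailing width-0 codepoints belong to the same cluster.
        while i < n and cell_width(s[i]) == 0:
            i += 1
        yield s[start:i]
```

With these assumptions, list(by_grapheme("e\u0301x")) splits into the accented "e" (two codepoints) and "x", and a string that begins with a stray combining mark has that mark folded into the first cluster.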
[PATCH 1/2] Make lib/std/test/utf.myr a non-binary file | "S. Gilles" <sgilles@xxxxxxxxxxxx>
[PATCH 2/2] Implement bygrapheme() | "S. Gilles" <sgilles@xxxxxxxxxxxx>