Make constrainByteLength work #751

Open

tsdko wants to merge 2 commits into ynoproject:master from tsdko:fix-constrain-byte-length
Conversation

@tsdko (Contributor) commented Feb 9, 2026

Might require more testing; I have run the tests on Firefox and Chromium and tested the input manually with Firefox (and IME input with Mozc on Linux) but have not tested on other platforms.


Should hopefully prevent overly long non-ASCII messages from getting eaten during send attempts.

It seems the implementation currently in master could have worked with a bit more space in `buf` (enough to fit the next-largest UTF-8 character) and by comparing `written` instead of `read`, since `read` is measured in UTF-16 code units rather than bytes. Even then it behaves oddly when the caret is not at the end of the string: on regular input the caret is forced to the end, and if you paste something that makes the entire string too long, the existing text at the end gets cut off.
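As an aside, a minimal sketch (standard `TextEncoder` API, not the PR's code) of why `read` and `written` must not be used interchangeably against a byte limit:

```javascript
// TextEncoder.encodeInto reports `read` in UTF-16 code units and
// `written` in UTF-8 bytes, so only `written` is comparable to a
// byte limit.
const buf = new Uint8Array(8);
const { read, written } = new TextEncoder().encodeInto("あい", buf);
console.log(read, written); // 2 code units read, 6 bytes written
```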

The behavior of the built-in maxlength attribute is not very consistent across browsers: if the user attempts to replace currently selected text and not even one character from the replacement string fits, Firefox preserves the selection while Chromium discards it instead. This implementation discards the selection.

Shortcoming: hitting the length limit breaks undo (does nothing). (This is a problem with the current implementation as well, it's just a bit more hidden as ASCII inputs get properly constrained via HTML maxlength.)

Test code:

```javascript
// "|" is the caret
const tests = [
  // below the byte limit, unchanged
  "🐱|",       "🐱",
  "あい|",     "あい",
  "abc🐱|",    "abc🐱",
  "abcdあ|",   "abcdあ",
  "abcdefg|",  "abcdefg",
  // above the byte limit, truncated
  "abcdefgh|", "abcdefg",
  "あabcde|",  "あabcd",
  "abcdeあ|",  "abcde",
  "abcd🐱|",   "abcd",
  "あいう|",   "あい",
  "🐱🦈|",     "🐱",
  // above the byte limit, caret in the middle of the string
  "abcd|efgh", "abcefgh",
  "あb|cdef",  "あcdef",
  "abc|deあ",  "abdeあ",
  "abc|d🐱",   "abd🐱",
  "あい|う",   "あう",
  "🐱|🦈",     "🦈",
];
const cbl = constrainByteLength(7);
for (let i = 0; i < tests.length; i += 2) {
  const sel = tests[i].indexOf('|');
  console.assert(sel >= 0, `no caret in ${tests[i]}`);
  const inVal = tests[i].substring(0, sel) + tests[i].substring(sel+1);
  const event = {target: {value: inVal, selectionStart: sel, selectionEnd: sel}};
  cbl(event);
  const actual = event.target.value, expected = tests[i+1];
  console.assert(expected === actual, `expected ${expected}, got ${actual}`);
}
```
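For reference, a hypothetical re-implementation (not the PR's actual code) of a handler with the shape the test snippet above assumes: it deletes whole code points just before the caret until the UTF-8 length fits, falling back to trimming from the end when the caret is at the start.

```javascript
// Hypothetical sketch: return an input handler that constrains
// event.target.value to at most `limit` UTF-8 bytes, removing the
// most recently typed code points (those before the caret) first.
function constrainByteLength(limit) {
  const enc = new TextEncoder();
  return (event) => {
    const el = event.target;
    let value = el.value;
    let caret = el.selectionStart;
    while (enc.encode(value).length > limit) {
      if (caret > 0) {
        // delete the code point immediately before the caret
        const before = [...value.slice(0, caret)]; // split by code point
        const removed = before.pop();
        value = before.join('') + value.slice(caret);
        caret -= removed.length; // length in UTF-16 code units
      } else {
        // caret at the start: trim from the end instead
        value = [...value].slice(0, -1).join('');
      }
    }
    el.value = value;
    el.selectionStart = el.selectionEnd = caret;
  };
}
```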

@zebraed (Contributor) commented Feb 10, 2026

I can help test this on another platform; please wait a little while.

@zebraed (Contributor) commented Feb 14, 2026

I tested your test snippet on several platforms:

| Platform | Browser | Assertion Test |
| --- | --- | --- |
| macOS | Chrome | Passed |
| macOS | Safari | Passed |
| Windows | Chrome | Passed |
| Windows | Firefox | Passed |

Manual test on macOS Chrome + Japanese IME (haven't tested anything else yet, sorry). I can reproduce a difference between pasting and IME typing:

  1. Pasting a >150-byte Japanese string gets trimmed immediately (as expected).
    Example: pasting
    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。
    trims down to
    あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市

  2. However, after the paste-trim, I can continue typing (with the Japanese IME) and the input can exceed 150 UTF-8 bytes again.
    e.g. appending 、あああああ keeps being accepted.

This seems related to IME composition: many input events are emitted with isComposing=true, and trimming is skipped during composition, so byte overflow can accumulate unless we also trim on compositionend and/or enforce the limit once right before sending.
As you mentioned in your comment, skipping trimming during composition is correct, but the implementation also needs a guaranteed recovery point after composition completes.

  • Apply the same byte-limiting function on compositionend
  • Enforce a final constrainByteLength(150) call immediately before sending

With these additions, IME input and normal typing both remain within the 150 UTF-8 byte limit across the tested environments.
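The wiring described above could look roughly like this (a hypothetical sketch; `attachByteLimit` and `limitHandler` are illustrative names, not code from the PR):

```javascript
// Hypothetical sketch: skip byte-trimming while an IME composition is
// in progress, then re-apply the limit once on compositionend so
// composed text cannot evade it.
function attachByteLimit(el, limitHandler) {
  el.addEventListener('input', (e) => {
    if (e.isComposing) return; // let the IME finish first
    limitHandler(e);
  });
  el.addEventListener('compositionend', (e) => limitHandler(e));
}
```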

Displaying the current UTF-8 byte count as an indicator in the chat box could also improve the overall UX. I will experiment with this on my side once your implementation is finalized.

@tsdko (Contributor, Author) commented Feb 14, 2026

> Manual test on macOS Chrome + Japanese IME (haven't tested anything else yet, sorry). I can reproduce a difference between pasting and IME typing:
>
> 1. Pasting a >150-byte Japanese string gets trimmed immediately (as expected).
>    Example: pasting
>    `あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市、郊外のぎらぎらひかる草の波。`
>    trims down to
>    `あのイーハトーヴォのすきとおった風、夏でも底に冷たさをもつ青いそら、うつくしい森で飾られたモリーオ市`
>
> 2. However, after the paste-trim, I can continue typing (with the Japanese IME) and the input can exceed 150 UTF-8 bytes again.
>    e.g. appending `、あああああ` keeps being accepted.
>
> This seems related to IME composition: many input events are emitted with isComposing=true, and trimming is skipped during composition, so byte overflow can accumulate unless we also trim on compositionend and/or enforce the limit once right before sending. As you mentioned in your comment, skipping trimming during composition is correct, but the implementation also needs a guaranteed recovery point after composition completes.

True; compositionend has been addressed now. I should've tested more thoroughly; this was a case I could reproduce. It seems browsers differ in how they send input events: I was working on the assumption that one would always get an input event with isComposing=false after composition, which on Chrome seems to happen only if the user closes the IME without submitting via the Enter key.

As for limiting right before sending: I was not able to find a case where it was needed after adding a handler for compositionend, but I wouldn't mind having it added if there is one. It is a bit bad for UX to have parts of the message cut off only after it's submitted, but arguably the current behavior of eating the entire message is much worse.

> Displaying the current UTF-8 byte count as an indicator in the chat box could also improve the overall UX. I will experiment with this on my side once your implementation is finalized.

I think doing this and letting the user freely exceed the limit (but prevent them from submitting if text is too long) is a much better idea than the current approach of emulating HTML maxlength behavior with byte counts. Ideally this would apply to the "in-game" input (gameChatInput) as well. My focus with this PR was just to make the current handler work.
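A sketch of that alternative (function names here are illustrative, not from the PR): count the bytes for display and gate submission, with no live trimming at all.

```javascript
// Hypothetical sketch: measure the message's UTF-8 byte length and
// block submission (rather than trimming) when it exceeds the limit.
const utf8ByteLength = (s) => new TextEncoder().encode(s).length;
const canSend = (msg, limit = 150) => utf8ByteLength(msg) <= limit;
```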

@tsdko force-pushed the fix-constrain-byte-length branch from ba5ebf3 to 26a1c69 on February 14, 2026
@zebraed (Contributor) commented Feb 15, 2026

> As for limiting right before sending: I was not able to find a case where it was needed after adding a handler for compositionend, but I wouldn't mind having it added if there is one. It is a bit bad for UX to have parts of the message cut off only after it's submitted, but arguably the current behavior of eating the entire message is much worse.

That makes sense; it does seem somewhat redundant after adding the compositionend handler.

> My focus with this PR was just to make the current handler work.

Yes. In that case, would it be okay if we merge this PR first so that the current design is completed as intended? This is an important fix.

> I think doing this and letting the user freely exceed the limit (but prevent them from submitting if text is too long) is a much better idea than the current approach of emulating HTML maxlength behavior with byte counts.

Okay; after that I will triage it as a separate issue for UX improvements and additional features. I also think this is the better approach, as you said, since real-time trimming based on UTF-8 byte limits will likely always conflict with IME behavior.
