preliminary minimal changes by zjmarlow · Pull Request #251 · raku-community-modules/HTTP-UserAgent

zjmarlow · 2026-01-09T02:09:05Z

draft pull request with minimal changes to handle eol. also updates is-chunked check to include the case where multiple Transfer Encodings are present (chunked should always be last).

requesting feedback before final changes.
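
For illustration only, here is a small Python sketch (not the module's Raku code) of the check described above: a message is treated as chunked when `chunked` is the final coding listed in Transfer-Encoding, even if other codings come before it. The name `is_chunked` and the dict-of-lists header representation are assumptions made for this sketch.

```python
def is_chunked(headers: dict[str, list[str]]) -> bool:
    """Return True when 'chunked' is the final transfer coding."""
    codings: list[str] = []
    for name, values in headers.items():
        if name.lower() != "transfer-encoding":
            continue
        for value in values:
            # One field value may carry several comma-separated codings.
            codings.extend(
                part.strip().lower() for part in value.split(",") if part.strip()
            )
    return bool(codings) and codings[-1] == "chunked"
```

With this sketch, `gzip, chunked` counts as chunked while `chunked, gzip` does not.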

lizmat · 2026-01-12T11:29:09Z

Appears to be failing 2 tests for me on macOS:

=== t/020-message.rakutest
1..21
ok 1 - new 1/4
ok 2 - new 2/4
ok 3 - new 3/4
ok 4 - new 4/4
ok 5 - push-field 1/2
ok 6 - push-field 2/2
ok 7 - add-content 1/2
ok 8 - add-content 2/2
ok 9 - remove-field 1/1
ok 10 - parse 1/4
ok 11 - parse 2/4
ok 12 - parse 3/4
ok 13 - parse 4/4
not ok 14 - Str 1/2
# Failed test 'Str 1/2'
# at t/020-message.rakutest line 41
# expected: 'a: b, c, d
# 
# line
# '
#      got: 'a: b, c, d
# Content-Length: 4
# 
# line'
not ok 15 - Str 2/2
# Failed test 'Str 2/2'
# at t/020-message.rakutest line 42
# expected: 'a: b, c, d

line
'
#      got: 'a: b, c, d
Content-Length: 4

line'
ok 16 - clear 1/2
ok 17 - clear 2/2
ok 18 - parse complex 1/3
ok 19 - parse complex 2/3
ok 20 - parse complex 3/3
# Subtest: charset
    ok 1 - dumb default charset
    ok 2 - default text charset
    ok 3 - default "non-text" charset
    ok 4 - explicity charset
    1..4
ok 21 - charset
# You failed 2 tests of 21

zjmarlow · 2026-01-16T21:23:40Z

Should HTTP::UserAgent be producing valid Messages, or is it up to the users of the library to use it in such a way that it only produces valid messages? Some of the test messages are themselves invalid, so what needs changing will depend on the library's purpose.

lizmat · 2026-01-16T23:48:38Z

Well, that is a good question. What this module really needs, is a maintainer.

Would you be willing to take on that role? You apparently have a need to fix things in it :-)

zjmarlow · 2026-01-17T18:20:05Z

Yes, but I'll need guidance and feedback as I maintain it, if that's okay?

Some first questions:

1. should the module or the module users be responsible for making sure messages generated are valid, and
2. what should be done when an invalid message is received:
- nothing / ignore
- parse as much as possible and silently ignore the rest
- warning
- exception
- optional error handler callback that takes the message as an argument
- other?

I'll start looking through the other open issues, too.

zjmarlow · 2026-01-19T21:17:52Z

What would the best channel / medium be for discussing this module?

lizmat · 2026-01-19T21:37:33Z

Either #raku on libera.chat or on Discord

zjmarlow · 2026-01-24T09:47:06Z

There are enough changes to get things conformant that it would be difficult not to break existing behavior. My plan is to provide alternate classes that can be imported with:

use HTTP::UserAgent 'strict';

if that is acceptable, though I'm open to other options.

lizmat · 2026-01-24T09:53:46Z

That would make it opt-in, and as such backward compatible. 👍

zjmarlow · 2026-01-25T19:20:05Z

I wasn't able to get exporting to work how I was hoping, so instead, the strict versions are just named after the originals with "-Strict" attached. They are subclasses of the originals. The new classes are:

HTTP::UserAgent-Strict
HTTP::Message-Strict
HTTP::Request-Strict
HTTP::Response-Strict
HTTP::Header-Strict
HTTP::Header::ETag (HTTP::Header::Field with a "weak" attribute)

They are all made available with

use HTTP::UA-Strict;

so that it is easy to just import the strict classes. Behavior when mixing strict and original classes is "undefined".

The only change to the original should be the is-chunked multi methods in the HTTP::Message class. One checks the message's own headers, the other takes an arbitrary header object.

The following new testcases were added:

011-headers-strict.rakutest
021-message-issue-226.rakutest
042-request-issue-226.rakutest
051-response-issue-226.rakutest

I still need to do some testing with the HTTP::UserAgent-Strict class before turning this draft request into a full pull request, but this should be a good place for review and verifying testcases (the new behavior passed for me locally across Windows, Linux, and Mac).

zjmarlow · 2026-01-26T14:41:42Z

need to fix body-less messages. will comment again once fixed.
fix committed.

jonathanstowe · 2026-01-26T16:06:47Z

 unit class HTTP::Cookies;

 use HTTP::Cookie;
-use HTTP::Response:auth<zef:raku-community-modules>;


I think these were qualified like this because there are un-related packages that provide similarly named modules, so I'd be careful about removing that.

Yeah, the removal of this will require an explanation

I missed that one. It is added back now. I wasn't aware of the similarly named modules.

zjmarlow · 2026-01-26T19:05:53Z

My apologies. I am now reading through the META6.json and distributions documentation and committed some fixes. Will the version's patch number eventually need to be updated?

zjmarlow · 2026-01-26T21:47:10Z

Okay, the META6.json should be fixed. I did not change the version number.

zjmarlow · 2026-02-06T21:44:46Z

Sorry, I was too quick with the auto-complete. It was meant to be in reference to #226, which was opened by bbkr.

zjmarlow · 2026-02-07T13:35:14Z

Testcases added for the various UserAgent and Request strict / non-strict interactions.

ugexe

That's quite a lot of changes, nice! Unfortunately making lots of changes also has drawbacks... would it be possible to clean up this PR further? There are multiple commits fixing things broken in/from previous commits which makes reviewing things commit by commit frustrating. The PR title is still "preliminary changes" and is described as being a draft pull request (among other things), which makes understanding the git history more difficult. There are also many leftover comments in the code itself.

I left quite a few comments, but that was just from a cursory review so I suspect there are things I missed. Hopefully it is not too discouraging... you've bitten off a rather large piece of work.

ugexe · 2026-02-08T20:49:48Z


-method new($content?, *%fields) {
-    my $header = HTTP::Header.new(|%fields);
+method new($content?, Bool :$strict, *%fields) {


This seems like a bit of an unfortunate api design. What if I want to send a header named strict?

Instead of named, strict could be the final positional and default to False. If that's acceptable, I'd like to keep it consistent by changing all the signatures to have strict as the final positional with False as default. This particular case might require a multi method, though:

multi method new($content, Bool $strict = False, *%fields)
multi method new(Bool $strict = False, *%fields) is default

I haven't thought much about it, but another option would just be to make setting strict an additional step, i.e. my $m = HTTP::Message.new(...); $m.strict(True);. I think either way is fine, but maybe you have an opinion on it.

I think, aside from simplicity, good goals would be:

1. don't break existing code
2. clear
3. consistent
4. convenient

I hadn't thought of your option, but think it does better with the priorities than what I had:
Message.new: 'some content', True, field => 'value';
which fails 2, in my opinion, and complicates 1.
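
The naming collision raised at the start of this thread can be shown with a rough Python analogue, in which keyword arguments stand in for Raku's named parameters and slurpy `*%fields`. The function `new_message` is hypothetical and is not part of HTTP::UserAgent.

```python
def new_message(content=None, *, strict=False, **fields):
    """Hypothetical constructor: `strict` is a keyword, headers arrive as **fields."""
    return {"content": content, "strict": strict, "fields": fields}


# The caller meant to send a header called "strict", but the keyword
# parameter of the same name captures the value instead.
msg = new_message("body", strict="yes", host="example.org")
```

Here `msg["fields"]` contains only `host`; the intended `strict` header never reaches the header set.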

ugexe · 2026-02-08T20:51:34Z

+        # pop zero-length Str that occurs after last chunk
+        #   what to do if this doesn't happen?
+        @lines.pop if @lines %2;
+        @lines = grep *,


What is the purpose of grepping on *? It is also hard to read this chain since it is mixing routine and method calls.

The full purpose of the chain is to take lines two at a time - first line is the chunk length, second the chunk content. If the chunk length looks valid, pass along the content, otherwise use grep to filter it out. It should probably fully validate.

The larger problem is that if the content contains embedded CRLFs, the method is broken. I might try my hand at a correct implementation.

Let me rephrase my question: what do you think grep *, @foo does? If it doesn't do anything is there any reason to include it?

It was meant to filter out falsy values, but that requires grep so *. That was pushed for one of the instances. The other is fixed locally.
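
As a sketch of a length-driven alternative to splitting on CRLF, the following Python function reads each chunk by its declared hexadecimal size, so CRLFs embedded in chunk data are preserved. It assumes the whole chunked body is already in memory as bytes; `decode_chunked` is a hypothetical name, and trailers are deliberately ignored.

```python
CRLF = b"\r\n"


def decode_chunked(body: bytes) -> bytes:
    """Decode a chunked body by chunk size rather than by splitting lines."""
    out = bytearray()
    pos = 0
    while True:
        line_end = body.index(CRLF, pos)                    # end of the chunk-size line
        size_field = body[pos:line_end].split(b";", 1)[0]   # drop any chunk extensions
        size = int(size_field.strip().decode("ascii"), 16)  # sizes are hexadecimal
        pos = line_end + len(CRLF)
        if size == 0:                                       # last-chunk marker
            return bytes(out)
        chunk = body[pos:pos + size]
        if len(chunk) < size:
            raise ValueError("truncated chunk")
        out += chunk
        pos += size
        if body[pos:pos + len(CRLF)] != CRLF:               # chunk data ends with CRLF
            raise ValueError("missing CRLF after chunk data")
        pos += len(CRLF)
```

A truncated final chunk raises `ValueError` in this sketch; whether the module should throw, warn, or ignore in that case is the open question discussed in this thread.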

ugexe · 2026-02-08T21:02:10Z

 }

-method parse($raw_message) {
+method !parse ( $raw_message ) {


!parse isn't a great name for this function, especially given there is a public function of the same name. !parse-strict or some such would make it more clear what the purpose of this function is.

Also it seems like there is a lot of logic that could be shared between !parse and parse. Refactoring these would make it easier to maintain.

refactored, including adding various private methods that parse status / request line, header, and content

ugexe · 2026-02-08T21:02:38Z

+        # technically incorrect - content allowed to contain embedded CRLFs
+        my @lines = $content.split: $CRLF;
+        # pop zero-length Str that occurs after last chunk
+        #   what to do if this doesn't happen?


I don't know, what should this do?

One option would be: if the user specified throw-exceptions, to throw with 'truncated last chunk'. The changes are meant to focus on producing strict output rather than demand strict input, so I'm not sure about this - your thoughts?

I'm not sure. It would make sense to create an issue though so it isn't forgotten.

ugexe · 2026-02-08T21:18:19Z

It seems like a strange design to only have special http header objects for one type of header. Is there any reason this needs its own object? The motivation for it isn't clear to me as the commit message on this file is just "initial strict implementation".

ETags can be tagged as weak validators and might be handled differently in those cases.
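
To make the weak/strong distinction concrete, here is a minimal Python sketch of parsing an ETag field value into a tag plus a `weak` flag. The `ETag` dataclass and `parse_etag` function are hypothetical illustrations, not the HTTP::Header::ETag class from this PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ETag:
    value: str  # opaque tag, without the surrounding quotes
    weak: bool  # True when the field value carried the W/ prefix


def parse_etag(field_value: str) -> ETag:
    text = field_value.strip()
    weak = text.startswith("W/")  # the weak indicator is case-sensitive
    if weak:
        text = text[2:]
    if len(text) < 2 or not (text.startswith('"') and text.endswith('"')):
        raise ValueError(f"malformed entity-tag: {field_value!r}")
    return ETag(value=text[1:-1], weak=weak)
```

A caller can then branch on `weak`, for example when a comparison requires strong validators only.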

ugexe · 2026-02-08T21:24:22Z

 # headers container
 has @.fields;

+grammar Grammar::Strict {


The other grammar in this file is called HTTP::Header::Grammar which makes it clear what it is used for. To me a name like Grammar::Strict in the same file does not suggest it has any relation to HTTP::Header::Grammar even though it appears that is the case based on the parse method that returns one of these classes.

I don't think I understand the comment. The only relation between the grammars is that they serve the same purpose. Otherwise the strict grammar is based on the RFCs and supports weak ETags. Could you please let me know if my explanation missed the mark?

zjmarlow · 2026-02-08T23:47:25Z

Hello @ugexe , thank you for taking the time to do this. Would you prefer a separate PR with only the final changes committed for any further reviews? I have some other work to take care of so it will likely be over the next few days that I get through the change requests.

zjmarlow · 2026-02-13T09:13:58Z

I think the test failure is just due to incorrect skip count. I've fixed it locally. I'm still working through the change requests and will mention in a comment when changes have been pushed. In the meantime, I might have questions / need feedback.

zjmarlow · 2026-02-16T14:29:19Z

Okay, all changes I think I understood have been applied. I can push what I have now, or, if you prefer, wait until the rest of the change request conversations have been resolved.

jonathanstowe · 2026-02-16T17:52:39Z

"I can push what I have now"

Yes please.

zjmarlow · 2026-02-16T20:30:58Z

The signature change where strict is the last positional instead of an associative has been implemented enough to get tests passing. The rest of the methods should probably be updated at some point to make requesting strict processing consistent.

zjmarlow · 2026-02-19T15:54:26Z

Final sig changes submitted. In the end, I kept the strict parameter as the last positional, and optional. Message, Request, and Response constructors all need to know at the time of call what the strict setting is.

zjmarlow · 2026-03-13T06:09:37Z

Please let me know if continuing to pursue these changes is worthwhile. If not, I don't mind continuing to contribute in other ways (there are two other small pull requests ready for review).

patrickbkr · 2026-03-13T09:22:10Z

@zjmarlow I haven't read through the entire discussion in detail, but have loosely followed from the sidelines. My impression has always been that a lot of work and thought (from you and the reviewers) went into this. Unless I somehow missed a difficulty or controversy that questions all of this work, I'd find it sad to just lose all of this work all of you have put into this.

ugexe

Sorry for taking so long to re-review. The changes look good, but I think there are still a few bugs that should be fixed. I would encourage you to rebase this branch into a few logical commits so the git history is useful, but I wouldn't block this PR on that.

ugexe · 2026-03-16T03:19:37Z

    my constant $max_size = 300;
-    my $s = $.header.Str($eol);
-    $s ~= $eol if $.content;
+    self.field: Content-Length => ( $!content.?encode or $!content ).bytes.Str


It feels wrong for a serialization function (the Str() function this logic is inside) to mutate the object (i.e. self.field). It also makes it confusing what the state of the object is if someone e.g. calls Str() multiple times.

Okay, UA's request method handles some of the other Request setup - currently Cookies, User-Agent, and Authorization. It could be moved there.

I think there are two conflicting principles here. One, as you mentioned, is that serialization should not mutate the object. The other is that, to conform to the RFCs, and thus respect a strict call, Messages must include Content-Length in the absence of Transfer-Encoding: chunked, and also use CRLF as the EOL.

One option would be to remove the strict option from Message.Str and have UA's request method handle including the Content-Length and EOL selection when the request method is called with its strict option. I am open to other alternatives.

I'll look into adding Content-Length if it's needed without mutating the object itself.

ugexe · 2026-03-16T03:28:06Z

+method Str (Bool $strict is copy = False, :$debug, Bool :$bin) {
+    $strict ||= $!strict;


I'm not sure if many of these ||= instances are doing what you want or the user would expect. For instance I would expect the value of $strict to be used over $!strict, i.e. I expect function parameters to override class attributes representing a default. I would expect these to instead be like

Suggested change

-method Str (Bool $strict is copy = False, :$debug, Bool :$bin) {
-    $strict ||= $!strict;
+method Str (Bool $strict is copy = $!strict, :$debug, Bool :$bin) {

This way if I call .Str(False) (which as an aside doesn't read very well, and to me suggests it and maybe others should be a named argument where it makes sense unlike that previous .new instance from a prior review) the False is respected.
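
The difference between the two forms can be illustrated with a short Python sketch. `StrictAware` and its method names are hypothetical; `None` plays the role of "argument not passed", which the Raku suggestion expresses by defaulting the parameter to `$!strict`.

```python
from typing import Optional


class StrictAware:
    def __init__(self, strict: bool = False) -> None:
        self.strict = strict

    def eol_with_or(self, strict: bool = False) -> str:
        # Mirrors `$strict ||= $!strict`: an explicit False cannot win.
        strict = strict or self.strict
        return "\r\n" if strict else "\n"

    def eol_with_default(self, strict: Optional[bool] = None) -> str:
        # The attribute is consulted only when the caller passed nothing.
        if strict is None:
            strict = self.strict
        return "\r\n" if strict else "\n"
```

On an object created with `strict=True`, `eol_with_or(False)` still yields CRLF, whereas `eol_with_default(False)` respects the explicit argument.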

ugexe · 2026-03-16T03:31:01Z

+        # technically incorrect - content allowed to contain embedded CRLFs
+        my @lines = $content.split: $CRLF;
+        # pop zero-length Str that occurs after last chunk
+        #   what to do if this doesn't happen?


I'm not sure. It would make sense to create an issue though so it isn't forgotten.

ugexe · 2026-03-16T03:34:43Z

 }

+multi method field ( HTTP::Header::ETag:D $etag ) {
+    @.fields.push: $etag;


Is this going to accumulate etags every time it is called? The other field setter above replaces existing values, it doesn't append.

Yes, good point. To be lenient in what it accepts, it should either ignore or replace. To be consistent with the other behavior, it should probably replace.

This change has been made locally. It needs some testcases to cover it. It will be committed with other changes once some more discussion has been had regarding the other change requests.
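
A minimal Python sketch of the replace-instead-of-append behaviour, assuming fields are kept as an ordered list of name/value pairs; `SketchHeader` and `set_field` are hypothetical names, and only the first field with a matching name is replaced.

```python
class SketchHeader:
    def __init__(self) -> None:
        self.fields: list[tuple[str, str]] = []

    def set_field(self, name: str, value: str) -> None:
        """Replace an existing field of the same name (case-insensitive), else append."""
        for i, (existing, _) in enumerate(self.fields):
            if existing.lower() == name.lower():
                self.fields[i] = (name, value)  # keep the original position
                return
        self.fields.append((name, value))
```

Setting an ETag twice then leaves a single entry holding the most recent value, which is the kind of case the planned tests would need to cover.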

ugexe · 2026-03-16T03:42:33Z

-method Str($eol = "\n") {
+multi method Str(Str $eol is copy = "\n", Bool $strict is copy = False) {
+    $strict ||= $!strict;
+    $eol = $CRLF if $strict;


It is kind of a weird api to have $strict override the $eol that the user explicitly passes. Something like this is a bit better since it respects the $eol the user provides, although it isn't great either:

multi method Str(Str $eol is copy = $strict ?? $CRLF !! "\n", Bool :$strict is copy = False) {

Also, by making :$strict a named parameter we can call self.Str: :$strict; in the Str multi below instead of self.Str: "\n", $strict; (which duplicates the default \n in the function body as well as in this multi's signature).

This issue seems similar to what is mentioned in the Message.parse comment. Waiting for more discussion in that comment.

zjmarlow · 2026-03-16T20:02:44Z

No problem, I will work on rebasing and then go through the change requests this week.

chunk size should be hex Co-authored-by: Nick Logan <nlogan@gmail.com>

zjmarlow · 2026-03-19T12:10:47Z

For the rebase into logical commits, are you mainly referring to squashing commits that were minor / ended up being reverted / reworked?

ugexe · 2026-03-19T22:14:49Z

"For the rebase into logical commits, are you mainly referring to squashing commits that were minor / ended up being reverted / reworked?"

Essentially yes. That might end up being just a single commit.

keep original names fix expected error message reorg modules; fix imports; restore original modules make sure strict classes pass original tests fix variable redeclaration fix body-less Message.Str was test issue auth reinstate auth; add Str as URL convenience methods to UA-Strict auth update provides update provides fix return fix return add UA-Strict itself to provides -Strict to ::Strict rename nested grammar add back auth flag implementation of strict remove redundant attr; fix new; fix strict print logic; add interop tests TestServer location pre strict sig change refactor Message parse methods; make $strict default to $!strict in Header pass $strict when sending binary request, too preliminary sig changes final sig changes Update lib/HTTP/Message.rakumod chunk size should be hex Co-authored-by: Nick Logan <nlogan@gmail.com>

ugexe · 2026-03-20T20:10:02Z

For your commits you'll want a reasonable message as well. Don't make a list of the changes you made - the changes should already be split up into logical pieces as commits. The commit message should inform us of the WHY and potentially some of the high level HOW. The commit message title should briefly explain what the changes do. For example something like:

Implement strict html parsing and validation mode

Previously HTTP::UserAgent would accept and produce output that is technically invalid according to RFC. This commit adds a strict parameter to enable additional validation of html so that xyz.

The reason is someone looking at the git history of any of the lines of code you added should be able to understand why that line of code was added from the commit subject/message.

zjmarlow · 2026-03-21T16:20:01Z

Okay, I can amend the commit message.

I understand that quite a bit of work has gone into the current approach, but it might be worth reconsidering the subclass approach. The two main goals were to 1. make changes opt-in / leave existing behavior in place, and 2. make requests more conformant to the RFCs. The drawbacks of the current approach are that several changes are being made to existing code opening the possibility of breaking existing behavior and that there are several places that a user can / has to specify which behavior they want.

The subclass approach would leave existing code relatively untouched, provide the relevant classes with a single use statement, and still make it possible to have lenient response handling with strict requests. The first drawback that comes to mind is that some private routines / methods would need to be duplicated. There may be others.

I am okay with continuing with the current approach, but if you think it's worth revisiting subclasses, please let me know - a separate pull request could be submitted and the one not used could be closed.

ugexe · 2026-03-22T14:55:47Z

It is really up to you. However, do consider the maintenance burden of whatever approach you take. As you've probably noticed there isn't a lot of bandwidth available for reviews here, and duplicated logic (on the surface) makes me think it is going to be a more difficult to maintain approach. But I could also see someone else arguing the opposite.

zjmarlow · 2026-03-23T11:00:52Z

proposed commit message:

Make requests more conformant to the RFCs

Previously, HTTP::UserAgent would produce output that is technically invalid according to the RFCs. These (squashed) commits add a parameter to make requests more conformant. When the parameter is set to True, CRLFs will be used for EOLs and no extra, invalid EOL will be added to POST requests. The changes should not alter the functionality of code already using this module.

Tests have been added to cover the changes.

Also, ETags now have a separate class with an attribute indicating whether they are weak validators.

preliminary minimal changes

1e9da7b

zjmarlow added 6 commits January 24, 2026 14:53

initial strict implementation

5662117

keep original names

169f4ee

fix expected error message

b7af141

reorg modules; fix imports; restore original modules

8ea1649

make sure strict classes pass original tests

842cf81

fix variable redeclaration

014bfd0

zjmarlow added 2 commits January 26, 2026 06:57

fix body-less Message.Str

4a289b6

was test issue

37f292f

jonathanstowe reviewed Jan 26, 2026

zjmarlow added 8 commits January 26, 2026 08:22

auth

5e802ad

reinstate auth; add Str as URL convenience methods to UA-Strict

bfc08bc

auth

01ce9fa

update provides

85bb7c8

update provides

f94507c

fix return

5e565d2

fix return

e7fbc7b

add UA-Strict itself to provides

4120d59

zjmarlow added 2 commits February 7, 2026 05:27

remove redundant attr; fix new; fix strict print logic; add interop tests

a3a177f

TestServer location

9f09027

ugexe requested changes Feb 8, 2026

zjmarlow added 3 commits February 14, 2026 13:47

pre strict sig change

d18d284

refactor Message parse methods; make $strict default to $!strict in Header

c725a7f

pass $strict when sending binary request, too

ed51d78

preliminary sig changes

9a38dca

final sig changes

d3a56b3

ugexe requested changes Mar 16, 2026

Update lib/HTTP/Message.rakumod

1d2d89c

chunk size should be hex Co-authored-by: Nick Logan <nlogan@gmail.com>

zjmarlow and others added 2 commits March 20, 2026 05:13

Merge branch 'issue-226-eol-minimal' of /home/zjhm/devel/rakudo/http-useragent into issue-226-eol-minimal

5f9fa09
