-
Notifications
You must be signed in to change notification settings - Fork 117
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix incorrect line folding of RFC2047-encoded strings #77
Conversation
Codecov Report
@@ Coverage Diff @@
## master #77 +/- ##
==========================================
- Coverage 80.10% 79.97% -0.13%
==========================================
Files 15 15
Lines 995 989 -6
==========================================
- Hits 797 791 -6
Misses 122 122
Partials 76 76
Continue to review full report at Codecov.
|
{ | ||
k: "Subject", | ||
v: "=?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?= =?utf-8?q?=0AAuthor=0A=0AOil_on...?=", | ||
formatted: "Subject: =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?= =?utf-8?q?\r\n =0AAuthor=0A=0AOil_on...?=\r\n", | ||
formatted: "Subject: =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?=\r\n =?utf-8?q?=0AAuthor=0A=0AOil_on...?=\r\n", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. What if a space appears in a RFC2047-encoded string and we fold it? e.g. =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?=
→ =?utf-8?q?=E2=80=9CShort\r\n subject=E2=80=9D=0A?=
Are we allowed to do this? IIRC we had the quoted-printable regex to avoid this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It should not happen because it's clearly an ill-formed word (2 and 6.3 of RFC 2047).
- When unfolded it should return back to it's original form, so it should be decoded the same way (if possible). To my knowledge Golang mime.WordDecoder and Thunderbird decoder both can handle spaces in Q-encoded utf-8 strings.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm. Seems like it indeed, should've investigated a little bit more before merging #22 (which this PR effectively reverts).
Anyway, I agree with you, this PR does the right thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Going through RFC and possible cases I've found something possibly dangerous in \n
handling.
If input contains \n
without corresponding \r
we will have an incorrect output (neither printable ASCII, HTAB or SPACE, not \r\n
line terminator) . I'm not sure about practical possibilities of this though since any encoder should take care of this \n
beforehand
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Users aren't supposed to provide header keys/values with \n
or \r
. Maybe we should error out in this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(This is a separate change though - should probably be in a separate commit or PR)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure error is the right thing here.
I may open a new issue to discuss possible implementations, or if you have an idea right now, or if you're able to fix it straight away, you'll likely do it better since it's your code.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, please open an issue! I always prefer to help contributors send patches instead of doing it myself, that's healthier for the project on the long run.
Use only whitespace characters as a separator to fold lines according to the section 2.2.3 of RFC5322. It may increase number of cases where hard limit fallback is used, but it should prevent damaging encoded subjects.
845e4ed
to
29ab0c8
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thanks for your patience!
Use only whitespace characters as a separator to fold lines according to the section 2.2.3 of RFC5322. It may increase number of cases where hard limit fallback is used, but it should prevent damaging encoded subjects.
Should fix #54