Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix incorrect line folding of RFC2047-encoded strings #77

Merged
merged 1 commit into from
Apr 15, 2020

Conversation

dvalter
Copy link
Contributor

@dvalter dvalter commented Mar 18, 2020

Use only whitespace characters as a separator to fold lines according to the section 2.2.3 of RFC5322. It may increase number of cases where hard limit fallback is used, but it should prevent damaging encoded subjects.

Should fix #54

@codecov-io
Copy link

codecov-io commented Mar 18, 2020

Codecov Report

Merging #77 into master will decrease coverage by 0.12%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #77      +/-   ##
==========================================
- Coverage   80.10%   79.97%   -0.13%     
==========================================
  Files          15       15              
  Lines         995      989       -6     
==========================================
- Hits          797      791       -6     
  Misses        122      122              
  Partials       76       76              
Impacted Files Coverage Δ
textproto/header.go 83.69% <100.00%> (-0.35%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9c4415e...29ab0c8. Read the comment docs.

{
k: "Subject",
v: "=?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?= =?utf-8?q?=0AAuthor=0A=0AOil_on...?=",
formatted: "Subject: =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?= =?utf-8?q?\r\n =0AAuthor=0A=0AOil_on...?=\r\n",
formatted: "Subject: =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?=\r\n =?utf-8?q?=0AAuthor=0A=0AOil_on...?=\r\n",
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. What if a space appears in a RFC2047-encoded string and we fold it? e.g. =?utf-8?q?=E2=80=9CShort subject=E2=80=9D=0A?==?utf-8?q?=E2=80=9CShort\r\n subject=E2=80=9D=0A?=

Are we allowed to do this? IIRC we had the quoted-printable regex to avoid this.

Copy link
Contributor Author

@dvalter dvalter Mar 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. It should not happen because it's clearly an ill-formed word (2 and 6.3 of RFC 2047).
  2. When unfolded it should return back to it's original form, so it should be decoded the same way (if possible). To my knowledge Golang mime.WordDecoder and Thunderbird decoder both can handle spaces in Q-encoded utf-8 strings.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm. Seems like it indeed, should've investigated a little bit more before merging #22 (which this PR effectively reverts).

Anyway, I agree with you, this PR does the right thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Going through RFC and possible cases I've found something possibly dangerous in \n handling.

If input contains \n without corresponding \r we will have an incorrect output (neither printable ASCII, HTAB or SPACE, not \r\n line terminator) . I'm not sure about practical possibilities of this though since any encoder should take care of this \n beforehand

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Users aren't supposed to provide header keys/values with \n or \r. Maybe we should error out in this case?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(This is a separate change though - should probably be in a separate commit or PR)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure error is the right thing here.
I may open a new issue to discuss possible implementations, or if you have an idea right now, or if you're able to fix it straight away, you'll likely do it better since it's your code.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, please open an issue! I always prefer to help contributors send patches instead of doing it myself, that's healthier for the project on the long run.

Use only whitespace characters as a separator to fold lines according to
the section 2.2.3 of RFC5322. It may increase number of cases where hard limit
fallback is used, but it should prevent damaging encoded subjects.
Copy link
Owner

@emersion emersion left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your patience!

@emersion emersion merged commit fee642d into emersion:master Apr 15, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Linefolding introduces illegal spaces into Q-Encoded subjects
3 participants