Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

WARN: skipping invalid fetch #525

Closed
ajanvrin opened this issue Dec 12, 2024 · 15 comments
Closed

WARN: skipping invalid fetch #525

ajanvrin opened this issue Dec 12, 2024 · 15 comments
Assignees
Labels
bug Something isn't working

Comments

@ajanvrin
Copy link
Contributor

ajanvrin commented Dec 12, 2024

As I was sifting through thousands of mails with himalaya (great tool by the way!), I came across a weird instance where himalaya would report that my inbox only contained a single mail, whereas the ProtonMail web client reported a lot more.
Turning --debug on indicated that there were 2828 WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch warnings.

Context (Environment)

I'm using himalaya on a Windows laptop, with ProtonBridge running.
Other than the reported issue, and the occasional error/panic, it mostly works, I was able to create folders, move many mails to those folders, delete folders, etc.

Steps to Reproduce

I want to help pinpoint and troubleshoot this issue, but the logs contain tons of PII that I cannot share.
I was sadly unable to send myself a mail that would reproduce the issue.

I've taken some troubleshooting steps and have been able to eliminate some possible causes:

  • Envelope format produced by ProtonMail/ProtonBridge does not seem to be the cause, as all mails load fine in ThunderBird through ProtonBridge
  • Although non-ascii chars were present in the fields of a whooping 2824 out of 2828 envelopes, I was able to rule out this possible cause, as a mail containing "àéèç" in the subject were not affected
  • The emails come from multiple mail providers, there is no common pattern there, they come from Proton, Gmail, O365, etc.
  • Attachments are not the cause, as there are many envelopes with and without attachments within the invalid fetches.
  • Various combinations of --page-size and --page did not produce much difference. To get the 2828 warnings, I used the rather aggressive --page-size 10000

I am struggling to find more useful debugging steps, short of reading the code (which I may do this weekend, but being unfamiliar with IMAP and this codebase, I will probably struggle there too)

Here is one example of an invalid fetch:

2024-12-12T07:22:26.354595Z  WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch fetch="* 2829 FETCH (UID 2829 FLAGS (\\Flagged \\Seen) ENVELOPE (\"Sun, 29 May 2022 11:19:52 +0200\" \"[redacted]\" ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((NIL NIL \"[redacted]\" \"[redacted]\")) NIL NIL NIL \"<[redacted]@mail.gmail.com>\") BODYSTRUCTURE ((\"text\" \"html\" (\"charset\" \"utf-8\") NIL NIL \"quoted-printable\" 177 3 NIL NIL NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111616.jpg\" \"name\" \"20220529_111616.jpg\") \"<1810f1c27ce83dd33723>\" NIL \"base64\" 4877872 NIL (\"attachment\" (\"filename\" \"20220529_111616.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111537.jpg\" \"name\" \"20220529_111537.jpg\") \"<1810f1c27fb526ec0f04>\" NIL \"base64\" 5090040 NIL (\"attachment\" (\"filename\" \"20220529_111537.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111710.jpg\" \"name\" \"20220529_111710.jpg\") \"<1810f1c27bab86ee1ad2>\" NIL \"base64\" 5711388 NIL (\"attachment\" (\"filename\" \"20220529_111710.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111721.jpg\" \"name\" \"20220529_111721.jpg\") \"<1810f1c27a5ba31d1cd1>\" NIL \"base64\" 5172832 NIL (\"attachment\" (\"filename\" \"20220529_111721.jpg\")) NIL NIL) \"mixed\" (\"boundary\" \"6d7f72d5d329be9d2b25a01e202c148edc6338fbe6d7be0ff33f5b3eff336e7a\") NIL NIL NIL))\r\n"

And here is a shorter one:

2024-12-12T07:45:43.592444Z  WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch fetch="* 1941 FETCH (UID 1941 FLAGS (\\Seen) ENVELOPE (\"Mon, 01 Jun 2015 07:46:41 +0000\" \"2.zip\" ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((NIL NIL \"[redacted]\" \"gmail.com\")) NIL NIL NIL \"<[redacted]@google.com>\") BODYSTRUCTURE (\"text\" \"html\" (\"charset\" \"utf-8\") NIL NIL \"quoted-printable\" 4781 62 NIL NIL NIL NIL))\r\n"

Is there any step I can make to help pinpoint this?

Expected Behavior

I'd expect himalaya to be able to handle these 2828 mails just fine, as ThunderBird is able to.

Current Behavior

Many WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch warnings

@soywod
Copy link
Member

soywod commented Dec 12, 2024

I cannot investigate right now, but it may be related to modern-email/defects#38.

@petrm
Copy link

petrm commented Jan 24, 2025

I would also like to help debugging this issue and I can reproduce it 100%. The message triggering this does not contain any non-ASCII characters.
What would you need? I tried --debug, that doesn't seem to provide anything useful in addition to the WARN.

@ajanvrin
Copy link
Contributor Author

ajanvrin commented Jan 24, 2025

Hello! If what you say is true, we may actually be facing two bugs in a trench-coat.
Which is what I suspected too, given that 4 of my own messages did not contain any non-ASCII chars.
I had been unable to create an email that triggers the bug though, so if you're able to give instructions that would reproduce the bug without any non-ASCII chars involved, I guess that would be a great help!
Side note, this issue is tracked over there too in more details: modern-email/defects#38

@petrm
Copy link

petrm commented Jan 25, 2025

Reproduction is actually very simple using https://github.com/foxcpp/maddy. Start up the server, create an imap account, send yourself a message. Done.

@petrm
Copy link

petrm commented Jan 25, 2025

Here is the maddy log with himalaya connecting. This is the highest log level available.

maddy  | 2025-01-25T10:30:52.572441139Z [debug] tls.loader.acme/handshake: choosing certificate {"identifier":"imap.services.domain.tld","num_choices":1}                   
maddy  | 2025-01-25T10:30:52.572507024Z [debug] tls.loader.acme/handshake: default certificate selection results        {"hash":"ad2fea10f731303445f1be751303e1e2dcfb39760b252
43baa899121e9b6aa02","identifier":"imap.services.domain.tld","issuer_key":"acme-v02.api.letsencrypt.org-directory","managed":true,"subjects":["imap.services.domain.tld"]}
maddy  | 2025-01-25T10:30:52.572521371Z [debug] tls.loader.acme/handshake: matched certificate in cache {"expiration":"2025-04-19T14:50:02.000","hash":"ad2fea10f731303445f1be
751303e1e2dcfb39760b25243baa899121e9b6aa02","managed":true,"remote_ip":"192.168.1.2","remote_port":"38888","subjects":["imap.services.domain.tld"]}                         
maddy  | 2025-01-25T10:30:52.575758785Z imap: * OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR CHILDREN UNSELECT MOVE IDLE APPENDLIMIT AUTH=PLAIN COMPRESS] IMAP4rev1 Service Ready
maddy  | 2025-01-25T10:30:52.575972430Z imap: 0.0.wnJF090B AUTHENTICATE PLAIN AHByaW50ZXIAZ2MzJTNpRkZGSkR3UmRMNGg0OGI=                                                        
maddy  | 2025-01-25T10:30:52.803061876Z imap: 0.0.wnJF090B OK [CAPABILITY IMAP4rev1 LITERAL+ SASL-IR CHILDREN UNSELECT MOVE IDLE APPENDLIMIT I18NLEVEL=1 SORT THREAD=ORDEREDSU
BJECT COMPRESS NAMESPACE] AUTHENTICATE completed                                                                                                                              
maddy  | 2025-01-25T10:30:52.803515145Z imap: 0.1.lgySQXmV SELECT INBOX                                                                                                       
maddy  | 2025-01-25T10:30:52.803906957Z [debug] imapsql: initialized uidMap for selected mailbox: 2 [8 9]                                                                     
maddy  | 2025-01-25T10:30:52.804090726Z imap: * FLAGS (\Seen \Answered \Flagged \Deleted \Draft old)                                                                          
maddy  | 2025-01-25T10:30:52.804132715Z imap: * OK [PERMANENTFLAGS (\Seen \Answered \Flagged \Deleted \Draft \* old)] Flags permitted.                                        
maddy  | 2025-01-25T10:30:52.804191286Z imap: * OK [UNSEEN 2] Message 2 is first unseen                                                                                       
maddy  | 2025-01-25T10:30:52.804287068Z imap: * OK [UIDVALIDITY 1437467386] UIDs valid                                                                                        
maddy  | 2025-01-25T10:30:52.804301274Z imap: * 2 EXISTS                                                                                                                      
maddy  | 2025-01-25T10:30:52.804324398Z imap: * 0 RECENT                                                                                                                      
maddy  | 2025-01-25T10:30:52.804340288Z imap: * OK [UIDNEXT 10] Predicted next UID                                                                                            
maddy  | 2025-01-25T10:30:52.804395293Z imap: 0.1.lgySQXmV OK [READ-WRITE] SELECT completed                                                                                   
maddy  | 2025-01-25T10:30:52.804660335Z imap: 0.2.kxqbTv0z FETCH 2:1 (UID FLAGS ENVELOPE BODYSTRUCTURE)                                                                       
maddy  | 2025-01-25T10:30:52.804985040Z [debug] imapsql: resolved false 8:9 to 8:9                                                                                            
maddy  | 2025-01-25T10:30:52.805530003Z [debug] imapsql: scanMessages: scanned msgId=8 (seq 1) [UID FLAGS ENVELOPE BODYSTRUCTURE]                                             
maddy  | 2025-01-25T10:30:52.805637647Z [debug] imapsql: scanMessages: scanned msgId=9 (seq 2) [UID FLAGS ENVELOPE BODYSTRUCTURE]                                             
maddy  | 2025-01-25T10:30:52.805692511Z imap: * 1 FETCH (UID 8 FLAGS (old \Seen) ENVELOPE ("Fri, 24 Jan 2025 18:10:41 +0100" "this is a test" ((NIL NIL "printer" "services.ca
sona.rocks")) ((NIL NIL "printer" "services.domain.tld")) ((NIL NIL "printer" "services.domain.tld")) ((NIL NIL "printer" "services.domain.tld")) NIL NIL NIL "<ggzxrvlr
b43i4xfot7bws2qntl6j5lxwe4a6eqzdseplqlylgo@ct6wl36ufvlu>") BODYSTRUCTURE ("text" "plain" ("charset" "us-ascii") NIL NIL NIL 7 1 NIL ("inline" ()) NIL NIL))                   
maddy  | 2025-01-25T10:30:52.805801507Z imap: * 2 FETCH (UID 9 FLAGS (old) ENVELOPE ("Fri, 24 Jan 2025 18:15:31 +0100" "bla" ((NIL NIL "printer" "services.domain.tld")) ((N
IL NIL "printer" "services.domain.tld")) ((NIL NIL "printer" "services.domain.tld")) ((NIL NIL "printer" "services.domain.tld")) NIL NIL NIL "<du46gib7rpdxuv4d5mb3sa25c
anezopr6cxdqe2wdqpjeuqkhr@rzdxpuppmuws>") BODYSTRUCTURE ("text" "plain" ("charset" "us-ascii") NIL NIL NIL 10 1 NIL ("inline" ()) NIL NIL))                                   
maddy  | 2025-01-25T10:30:52.805922988Z imap: 0.2.kxqbTv0z OK FETCH completed

@soywod
Copy link
Member

soywod commented Jan 27, 2025

@petrm thanks for your help. Could you share the associated Himalaya logs?

@petrm
Copy link

petrm commented Jan 27, 2025

Is there something more that I can provide?

himalaya --debug envelope list
2025-01-27T16:14:40.296920Z  INFO himalaya::email::envelope::command::list: executing list envelopes command
2025-01-27T16:14:40.296990Z DEBUG email::imap: building 1 IMAP clients
2025-01-27T16:14:40.299354Z DEBUG client::build: rustls::client::hs: No cached session for DnsName("imap.domain.tld")    
2025-01-27T16:14:40.299474Z DEBUG client::build: rustls::client::hs: Not resuming any session    
2025-01-27T16:14:40.311897Z DEBUG client::build: rustls::webpki::anchors: add_parsable_certificates processed 150 valid and 0 invalid certs    
2025-01-27T16:14:40.311919Z DEBUG client::build: rustls_platform_verifier::verification::others: Loaded 150 CA certificates from the system    
2025-01-27T16:14:40.312439Z DEBUG client::build: rustls::client::hs: Using ciphersuite TLS13_AES_128_GCM_SHA256    
2025-01-27T16:14:40.312451Z DEBUG client::build: rustls::client::tls13: Not resuming    
2025-01-27T16:14:40.312512Z DEBUG client::build: rustls::client::tls13: TLS1.3 encrypted extensions: []    
2025-01-27T16:14:40.312518Z DEBUG client::build: rustls::client::hs: ALPN protocol is None    
2025-01-27T16:14:40.313415Z DEBUG client::build: email::imap: using password authentication
2025-01-27T16:14:40.313422Z DEBUG client::build: email::imap: supported auth mechanisms mechanisms=[Plain, Login]
2025-01-27T16:14:40.313435Z DEBUG client::build: email::imap: trying auth mechanism… mechanism=Plain
2025-01-27T16:14:40.413835Z DEBUG client::build: email::imap: authentication succeeded! mechanism=Plain
2025-01-27T16:14:40.413977Z  INFO email::email::envelope::list::imap: listing IMAP envelopes from mailbox INBOX
2025-01-27T16:14:40.413985Z DEBUG email::imap: client 1/1 is free, locking it
2025-01-27T16:14:40.413997Z DEBUG email::email::envelope::list::imap: UTF7-encoded mailbox name="INBOX"
2025-01-27T16:14:40.414516Z DEBUG email::email::envelope::list::imap: mailbox selected name="INBOX" data=SelectDataUnvalidated { flags: Some([Seen, Answered, Flagged, Deleted, Draft, Keyword(Atom("old"))]), exists: Some(1), recent: Some(0), unseen: Some(1), permanent_flags: Some([Flag(Seen), Flag(Answered), Flag(Flagged), Flag(Deleted), Flag(Draft), Asterisk, Flag(Keyword(Atom("old")))]), uid_next: Some(12), uid_validity: Some(1437467386) }
2025-01-27T16:14:40.414930Z  WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch fetch="* 1 FETCH (UID 11 FLAGS (old) ENVELOPE (\"Mon, 27 Jan 2025 17:14:26 +0100\" \"test subject\" ((NIL NIL \"printer\" \"domain.tld\")) ((NIL NIL \"printer\" \"domain.tld\")) ((NIL NIL \"printer\" \"domain.tld\")) ((NIL NIL \"printer\" \"domain.tld\")) NIL NIL NIL \"<igjq53rboqmg7vsll4htzrxjgzxtwxuz3bwf7m2tlxdc5f7kmo@lj4y6ho4p7d6>\") BODYSTRUCTURE (\"text\" \"plain\" (\"charset\" \"us-ascii\") NIL NIL NIL 14 1 NIL (\"inline\" ()) NIL NIL))\r\n"
2025-01-27T16:14:40.414984Z DEBUG email::email::envelope::list::imap: found 0 imap envelopes

| ID                             | FLAGS                            | SUBJECT                            | FROM                            | DATE                            |
|--------------------------------|----------------------------------|------------------------------------|---------------------------------|---------------------------------|

@soywod
Copy link
Member

soywod commented Jan 27, 2025

I was able to find the culprit: the BODYSTRUCTURE of the FETCH response is missing the encoding string:

* 1 FETCH ([…] BODYSTRUCTURE ("text" "plain" ("charset" "us-ascii") NIL NIL NIL 14 1 NIL ("inline" ()) NIL NIL))\r\n`

According to the RFC, it is supposed to be a non-nil string:

body-fld-enc = (DQUOTE ("7BIT" / "8BIT" / "BINARY" / "BASE64"/
"QUOTED-PRINTABLE") DQUOTE) / string

imap-codec just reflects the RFC, so the error looks legitimate to me. Do you think that a new quirk should be added @duesee?

@duesee
Copy link

duesee commented Jan 28, 2025

Good catch! Maybe a quirk that rectifies NIL into an empty string and emits a warning? Should we also file a defect and inform the maddy project?

@soywod
Copy link
Member

soywod commented Jan 28, 2025

Should we also file a defect and inform the maddy project?

According to @ajanvrin, ProtonMail also generates NIL encoding?

@ajanvrin
Copy link
Contributor Author

Sadly no, out of the 2828 original invalid fetches, 2824 can be explained by utf8 characters, but the 4 last invalid fetches, of which there are two examples below, still escape my understanding.

2024-12-12T07:22:26.354595Z  WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch fetch="* 2829 FETCH (UID 2829 FLAGS (\\Flagged \\Seen) ENVELOPE (\"Sun, 29 May 2022 11:19:52 +0200\" \"[redacted]\" ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((NIL NIL \"[redacted]\" \"[redacted]\")) NIL NIL NIL \"<[redacted]@mail.gmail.com>\") BODYSTRUCTURE ((\"text\" \"html\" (\"charset\" \"utf-8\") NIL NIL \"quoted-printable\" 177 3 NIL NIL NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111616.jpg\" \"name\" \"20220529_111616.jpg\") \"<1810f1c27ce83dd33723>\" NIL \"base64\" 4877872 NIL (\"attachment\" (\"filename\" \"20220529_111616.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111537.jpg\" \"name\" \"20220529_111537.jpg\") \"<1810f1c27fb526ec0f04>\" NIL \"base64\" 5090040 NIL (\"attachment\" (\"filename\" \"20220529_111537.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111710.jpg\" \"name\" \"20220529_111710.jpg\") \"<1810f1c27bab86ee1ad2>\" NIL \"base64\" 5711388 NIL (\"attachment\" (\"filename\" \"20220529_111710.jpg\")) NIL NIL)(\"image\" \"jpeg\" (\"filename\" \"20220529_111721.jpg\" \"name\" \"20220529_111721.jpg\") \"<1810f1c27a5ba31d1cd1>\" NIL \"base64\" 5172832 NIL (\"attachment\" (\"filename\" \"20220529_111721.jpg\")) NIL NIL) \"mixed\" (\"boundary\" \"6d7f72d5d329be9d2b25a01e202c148edc6338fbe6d7be0ff33f5b3eff336e7a\") NIL NIL NIL))\r\n"
2024-12-12T07:45:43.592444Z  WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch fetch="* 1941 FETCH (UID 1941 FLAGS (\\Seen) ENVELOPE (\"Mon, 01 Jun 2015 07:46:41 +0000\" \"2.zip\" ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((\"[redacted]\" NIL \"[redacted]\" \"gmail.com\")) ((NIL NIL \"[redacted]\" \"gmail.com\")) NIL NIL NIL \"<[redacted]@google.com>\") BODYSTRUCTURE (\"text\" \"html\" (\"charset\" \"utf-8\") NIL NIL \"quoted-printable\" 4781 62 NIL NIL NIL NIL))\r\n"

All of these 4 fetches have an encoding of quoted-printable.

@soywod
Copy link
Member

soywod commented Feb 5, 2025

I replaced [redacted] by localhost and I cannot reproduce any parsing error using the latest imap-codec. So either your [redacted] contained invalid chars, or this particular issue has already been solved on master and will therefore be fixed within the next Himalaya release.

@soywod
Copy link
Member

soywod commented Feb 7, 2025

Good catch! Maybe a quirk that rectifies NIL into an empty string and emits a warning? Should we also file a defect and inform the maddy project?

@ajanvrin
Copy link
Contributor Author

ajanvrin commented Feb 9, 2025

I replaced [redacted] by localhost and I cannot reproduce any parsing error using the latest imap-codec. So either your [redacted] contained invalid chars, or this particular issue has already been solved on master and will therefore be fixed within the next Himalaya release.

Well, I am pretty certain that the redacted bits didn't contain invalid chars, but sadly the affected emails no longer exist, so I won't be able to check (sorry). I think we can assume the issue for these 4 mails is fixed.

soywod added a commit that referenced this issue Feb 17, 2025
@soywod
Copy link
Member

soywod commented Feb 17, 2025

The quirk has been added to master, so the issue should be fixed for the next release.

@soywod soywod closed this as completed Feb 17, 2025
@github-project-automation github-project-automation bot moved this to Done in Pimalaya Feb 17, 2025
@soywod soywod changed the title [Bug] WARN fetch_envelopes_by_sequence{client=1}: imap_client::tasks: skipping invalid fetch (2028 occurrences) WARN: skipping invalid fetch Feb 17, 2025
@soywod soywod self-assigned this Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
Status: Done
Development

No branches or pull requests

4 participants