Issue when parsing header if the from field contains a semicolon #300

Open
nilshellerhoff opened this issue Oct 14, 2022 · 3 comments
Labels: bug (Something isn't working), validating

Comments

nilshellerhoff commented Oct 14, 2022

Describe the bug
When parsing the header of an email whose From field contains a semicolon (";"), the From field is not parsed correctly. A minimal example of such an email:

To: [email protected]
Subject: Test of a semicolon in from-header
Date: Wed, 12 Oct 2022 16:31:06 +0000
From: "Foo; Bar" <foobar@domain.tld>
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

<< email body >>

The raw email is attached to avoid issues with line breaks (I changed the extension to .txt because GitHub doesn't support .eml).
semicolon_test.txt

Used config
Default.

Code to Reproduce

<?php
require 'vendor/autoload.php'; // assumes the library was installed via Composer
$raw_mail = file_get_contents('semicolon_test.txt');
$header = new \Webklex\PHPIMAP\Header($raw_mail);
var_dump($header->get('from'));

Output when running this via php test.php:

PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
object(Webklex\PHPIMAP\Attribute)#18 (2) {
  ["name":protected]=>
  string(4) "from"
  ["values":protected]=>
  array(1) {
    [0]=>
    string(4) ""Foo"
  }
}

Expected behavior
When we remove the semicolon from the from-header, we get the expected result:

PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
PHP Warning:  Trying to access array offset on value of type null in ***/vendor/webklex/php-imap/src/Header.php on line 457
object(Webklex\PHPIMAP\Attribute)#6 (2) {
  ["name":protected]=>
  string(4) "from"
  ["values":protected]=>
  array(1) {
    [0]=>
    object(Webklex\PHPIMAP\Address)#7 (5) {
      ["personal"]=>
      string(9) ""Foo Bar""
      ["mailbox"]=>
      string(6) "foobar"
      ["host"]=>
      string(10) "domain.tld"
      ["mail"]=>
      string(17) "foobar@domain.tld"
      ["full"]=>
      string(29) ""Foo Bar" <foobar@domain.tld>"
    }
  }
}

Desktop / Server (please complete the following information):

  • OS: Linux Mint 21 (= Ubuntu 22.04)
  • PHP: 7.4 and 8.1
  • Version: 4.0.2
  • Provider: The mail that triggered the issue was sent through the "Contact Form 7" plugin on a WordPress instance.

EDIT:
I am not actually sure that a semicolon in the header fields conforms to the spec, but Gmail, Thunderbird, and PHPMailer all handle these mails correctly.
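For reference, the semicolon in the example sits inside a double-quoted display name. A minimal sketch of a quote-aware check is shown below; `findUnquotedSemicolon` is a hypothetical helper written for illustration, not a function of webklex/php-imap, and it only handles double quotes and backslash escapes.

```php
<?php
// Hypothetical helper (not part of webklex/php-imap): returns the offset of the
// first semicolon outside a double-quoted string, or false if there is none.
function findUnquotedSemicolon(string $value)
{
    $inQuotes = false;
    $length = strlen($value);
    for ($i = 0; $i < $length; $i++) {
        $char = $value[$i];
        if ($char === '\\' && $inQuotes) {
            $i++; // skip the escaped character inside a quoted string
            continue;
        }
        if ($char === '"') {
            $inQuotes = !$inQuotes;
        } elseif ($char === ';' && !$inQuotes) {
            return $i;
        }
    }
    return false;
}

var_dump(findUnquotedSemicolon('"Foo; Bar" <foobar@domain.tld>')); // bool(false)
var_dump(findUnquotedSemicolon('text/plain; charset=UTF-8'));      // int(10)
```

A check like this would leave the quoted From value alone, but it would not help with an unquoted semicolon in a free-text field such as Subject.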

ojgarciab commented

I have the same problem with subjects containing a semicolon. The subject content is removed after the semicolon.

Sample:

Subject: This is an example; for example

Results in:

echo $message->subject;
This is an example
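The truncation is consistent with the value being cut at the first semicolon, which is what the `extractHeaderExtensions()` code quoted further down in this thread does. A standalone illustration of that step, using plain string functions and no library code:

```php
<?php
// Standalone illustration of a split at the first semicolon; no library code involved.
$value = 'This is an example; for example';
$pos = strpos($value, ';');
// Keep only the text before the first semicolon, as the quoted library code does.
$kept = ($pos === false) ? $value : trim(substr($value, 0, $pos));
echo $kept, PHP_EOL; // This is an example
```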

nilshellerhoff (author) commented

The problem originates here:

php-imap/src/Header.php

Lines 654 to 686 in 45843e1

private function extractHeaderExtensions() {
    foreach ($this->attributes as $key => $value) {
        if (is_array($value)) {
            $value = implode(", ", $value);
        } else {
            $value = (string)$value;
        }
        // Only parse strings and don't parse any attributes like the user-agent
        if (($key == "user_agent") === false) {
            if (($pos = strpos($value, ";")) !== false) {
                $original = substr($value, 0, $pos);
                $this->set($key, trim(rtrim($original)), true);
                // Get all potential extensions
                $extensions = explode(";", substr($value, $pos + 1));
                foreach ($extensions as $extension) {
                    if (($pos = strpos($extension, "=")) !== false) {
                        $key = substr($extension, 0, $pos);
                        $key = trim(rtrim(strtolower($key)));
                        if (isset($this->attributes[$key]) === false) {
                            $value = substr($extension, $pos + 1);
                            $value = str_replace('"', "", $value);
                            $value = trim(rtrim($value));
                            $this->set($key, $value);
                        }
                    }
                }
            }
        }
    }
}

I'm not very versed in email handling, but from reading this (although it is not an authoritative source, of course), it seems to me that certain fields, including subject ..., should probably be excluded from extension parsing. In addition to checking for semicolons, the parser should only treat a field as having extensions if it actually finds a key=value pair after the semicolon.
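A rough sketch of what that could look like, written as a standalone function rather than a patch to `Header.php`. The name `splitHeaderExtensions`, the `PARAMETERIZED_HEADERS` allowlist, and the `content_type` / `content_disposition` key names are assumptions made for illustration; the library's actual attribute keys may differ.

```php
<?php
// Sketch only: a standalone function, not the library's actual implementation.
// Assumption: only these headers carry ";key=value" parameters.
const PARAMETERIZED_HEADERS = ['content_type', 'content_disposition'];

function splitHeaderExtensions(string $key, string $value): array
{
    $unchanged = ['value' => $value, 'extensions' => []];
    if (!in_array($key, PARAMETERIZED_HEADERS, true)) {
        return $unchanged; // e.g. subject and from are never split
    }
    $parts = explode(';', $value);
    $base = trim(array_shift($parts));
    $extensions = [];
    foreach ($parts as $part) {
        if (trim($part) === '') {
            continue; // tolerate a trailing semicolon
        }
        $pos = strpos($part, '=');
        if ($pos === false) {
            return $unchanged; // not a key=value pair, so do not split at all
        }
        $name = strtolower(trim(substr($part, 0, $pos)));
        $extensions[$name] = trim(str_replace('"', '', substr($part, $pos + 1)));
    }
    return ['value' => $base, 'extensions' => $extensions];
}

print_r(splitHeaderExtensions('subject', 'This is an example; for example'));
print_r(splitHeaderExtensions('content_type', 'text/plain; charset=UTF-8'));
```

With the allowlist in place, the subject keeps its full text while the content type is still split into `text/plain` and a `charset` extension. One known limitation of this sketch: a quoted parameter value that itself contains a semicolon would still be split incorrectly.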

Could you comment on this, @Webklex? Otherwise, I can open a PR over the weekend.

Webklex added the bug and validating labels on Mar 16, 2023

abhilashpa39 commented

I have the same problem with subjects containing a semicolon. The subject content is removed after the semicolon.
Subject : Test; ticket <<support-id=744482>>
