mb_encode_mimeheader

(PHP 4 >= 4.0.6, PHP 5, PHP 7, PHP 8)

mb_encode_mimeheader — Encode string for MIME header

Description

mb_encode_mimeheader(
    string $string,
    ?string $charset = null,
    ?string $transfer_encoding = null,
    string $newline = "\r\n",
    int $indent = 0
): string

Encodes the given string using the MIME header encoding scheme.

Parameters

string: The string being encoded. Its encoding should be the same as mb_internal_encoding().
charset: The name of the character set in which string is represented. The default value is determined by the current NLS setting (mbstring.language).
transfer_encoding: The scheme of MIME encoding. It should be either "B" (Base64) or "Q" (Quoted-Printable). Falls back to "B" if not given.
newline: The EOL (end-of-line) marker with which mb_encode_mimeheader() performs line folding (an RFC term for breaking a line longer than a certain length into multiple lines; the length is currently hard-coded to 74 characters). Falls back to "\r\n" (CRLF) if not given.
indent: Indentation of the first line (number of characters in the header before string).
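
The parameters above can be tried out with a short sketch like the following. It assumes the mbstring extension is loaded and that the script file is saved as UTF-8; the sample string and the variable names are illustrative only.

```php
<?php
// Assumption: this file is saved as UTF-8, so the literal below is UTF-8.
mb_internal_encoding('UTF-8');

$subject = '太郎';

// "B" selects Base64, "Q" selects Quoted-Printable.
$base64 = mb_encode_mimeheader($subject, 'UTF-8', 'B');
$quoted = mb_encode_mimeheader($subject, 'UTF-8', 'Q');

// $indent is the number of characters already on the first line,
// here the length of the "Subject: " prefix.
$prefix = 'Subject: ';
$header = $prefix . mb_encode_mimeheader($subject, 'UTF-8', 'B', "\r\n", strlen($prefix));

echo $base64, PHP_EOL;
echo $quoted, PHP_EOL;
echo $header, PHP_EOL;
```

Passing the prefix length as indent lets the function account for the header name when it decides where to fold the first line.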

Return Values

A converted version of the string represented in ASCII.

Changelog

Version	Description
8.0.0	`charset` and `transfer_encoding` are nullable now.

Examples

Example #1 mb_encode_mimeheader() example

<?php
$name = "太郎"; // kanji
$mbox = "kru";
$doma = "gtinn.mon";
$addr = '"' . addcslashes(mb_encode_mimeheader($name, "UTF-7", "Q"), '"') . '" <' . $mbox . "@" . $doma . ">";
echo $addr;
?>

The above example will output:

"=?UTF-7?Q?+WSqQzg-?=" <kru@gtinn.mon>

Notes

Note:
This function isn't designed to break lines at higher-level contextual break points (word boundaries, etc.). This behaviour may clutter up the original string with unexpected spaces.
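
To see where the function folds a long value, a small sketch such as this one prints each resulting line with its length. It assumes UTF-8 input and the default CRLF marker; the sample text is arbitrary.

```php
<?php
mb_internal_encoding('UTF-8');

// A long run of multibyte text with no spaces at all.
$long = str_repeat('日本語のテキスト', 10);

$encoded = mb_encode_mimeheader($long, 'UTF-8', 'B', "\r\n");

// The split points are chosen by length, not by word boundaries
// in the original text.
$lines = explode("\r\n", $encoded);
foreach ($lines as $line) {
    printf("%3d  %s\n", strlen($line), $line);
}
```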

User Contributed Notes 10 notes

stormflyCUT at hyh dot pl ¶

18 years ago


A tip for anyone who has trouble with national characters in UTF-8, for example in a mail subject: call mb_internal_encoding('UTF-8') before using mb_encode_mimeheader() with UTF-8.
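
A minimal sketch of that advice, assuming the subject text is UTF-8. The sample subject and variable names are made up for illustration, and restoring the previous internal encoding is an optional extra, not something the note requires.

```php
<?php
// Assumption: the subject text is UTF-8 (for example, typed in a UTF-8 source file).
$previous = mb_internal_encoding();
mb_internal_encoding('UTF-8');

$subject = 'Zażółć gęślą jaźń';
$header  = 'Subject: ' . mb_encode_mimeheader($subject, 'UTF-8', 'B');

// Restore whatever was configured before, in case other code relies on it.
mb_internal_encoding($previous);

echo $header, PHP_EOL;
```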

nigrez at nius dot waw dot pl ¶

19 years ago


True, the function is broken (PHP 5.1, encoding from UTF-8 with the pl_PL charset). Below is a version of the proposed _mb_mime_encode that is about 15% faster. Its signature is also closer to the other mb_* functions, and it doesn't trigger any errors, warnings, or notices.

<?php

function mb_mime_header($string, $encoding=null, $linefeed="\r\n") {
  if(!$encoding) $encoding = mb_internal_encoding();
  $encoded = '';

  // Measure the string in the same encoding that mb_substr() uses below.
  while($length = mb_strlen($string, $encoding)) {
    $encoded .= "=?$encoding?B?"
             . base64_encode(mb_substr($string,0,24,$encoding))
             . "?=$linefeed";

    $string = mb_substr($string,24,$length,$encoding);
  }

  return $encoded;
}

?>

gullevek at gullevek dot org ¶

21 years ago


Read this FIRST: http://bugs.php.net/bug.php?id=23192 because mb_encode_mimeheader is BUGGY!

A workaround for the broken multibyte handling of overly long subjects in ISO-2022-JP:

$pos=0;
$split=36; // after 36 single-byte characters, if a multibyte character follows, the output is broken
$_string=''; // initialize the accumulator to avoid an undefined-variable notice
while ($pos<mb_strlen($string,$encoding))
{
  $output=mb_strimwidth($string,$pos,$split,"",$encoding);
  $pos+=mb_strlen($output,$encoding);
  $_string.=(($_string)?' ':'').mb_encode_mimeheader($output,$encoding);
}
$string=$_string;

It is not the best, but it works.

Anonymous ¶

15 years ago


I could not find a PHP function to MIME encode the name for an email address.

Input   = "Karl Müller<kmueller@gmx.de>"
Output = "Karl%20M%FCller<kmueller@gmx.de>"

I wrote it on my own:

<?php
// Required to encode names in email addresses:
//   replace " " with "%20"
//   replace "ü" with "%FC"
//   replace "%" with "%25", etc.
// Use "%" as the delimiter for MIME
// Use "=" as the delimiter for Quoted-Printable
// The input string must be UTF-8 encoded
// (originally a public static method; shown here as a plain function)
function EncodeMime($Text, $Delimiter)
{
    // Convert to ISO-8859-1; characters outside that range become "?".
    // Note: utf8_decode() is deprecated as of PHP 8.2.
    $Text = utf8_decode($Text);
    $Len  = strlen($Text);
    $Out  = "";
    for ($i=0; $i<$Len; $i++)
    {
        $Chr = substr($Text, $i, 1);
        $Asc = ord($Chr);

        // The original "$Asc > 0x255" branch could never run, because ord()
        // returns at most 255, so it has been dropped.
        if ($Chr == " " || $Chr == $Delimiter || $Asc > 127)
        {
            $Out .= $Delimiter . strtoupper(bin2hex($Chr));
        }
        else $Out .= $Chr;
    }
    return $Out;
}
?>
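
For comparison, the built-in function can encode just the display name and leave the address untouched. This is only a sketch: format_address() is a hypothetical helper, UTF-8 input is assumed, and it produces RFC 2047 encoded-words, not the percent-style output shown in the note above.

```php
<?php
mb_internal_encoding('UTF-8');

// Hypothetical helper: encode only the display name, never the address itself.
function format_address(string $name, string $address): string
{
    $encodedName = mb_encode_mimeheader($name, 'UTF-8', 'Q');
    return $encodedName . ' <' . $address . '>';
}

echo format_address('Karl Müller', 'kmueller@gmx.de'), PHP_EOL;
```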

tokul at users dot sourceforge dot net ¶

16 years ago


mb_encode_mimeheader() depends on a correct mbstring.internal_encoding setting. It tries to convert $string from the internal encoding to $charset. If you ignore the mbstring internal encoding, the function might encode strings incorrectly even when the character set of $string matches $charset.
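
The conversion described above can be observed with a sketch like this one. It assumes the input text is UTF-8 and that ISO-8859-1 is the desired header charset; both choices are illustrative.

```php
<?php
// Assumption: input text is UTF-8 and the header should be sent as ISO-8859-1.
mb_internal_encoding('UTF-8');

$name = 'Müller';

// The function converts from the internal encoding (UTF-8) to the target
// charset before encoding, so no manual mb_convert_encoding() call is made here.
$encoded = mb_encode_mimeheader($name, 'ISO-8859-1', 'Q');

// mb_decode_mimeheader() converts back to the internal encoding.
$decoded = mb_decode_mimeheader($encoded);

echo $encoded, PHP_EOL;
echo $decoded, PHP_EOL;
```

If the internal encoding were left at a value that does not match the actual bytes of $name, the same call would convert from the wrong source encoding.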

gullevek at gullevek dot org ¶

19 years ago


My first post was around 2003, and mb_encode_mimeheader is still broken. It is *NOT* usable with longer subjects, and mostly unusable with anything other than Japanese.

The function from iwakura at junx dot org does not work for me either; it also produces some garbage.

I updated my old function (the one I posted in 2003) and tested it with overlong subjects in UTF-8, ISO-2022-JP (Japanese), GB2312 (Simplified Chinese) and EUC-KR (Korean). I got readable results in Thunderbird, Mail.app, Outlook, etc.

<?php

function _mb_mime_encode($string, $encoding)
{
    $pos = 0;
    $_string = ''; // initialize the accumulator to avoid an undefined-variable notice
    // after 36 single bytes characters if then comes MB, it is broken
    // but I trimmed it down to 24, to stay 100% < 76 chars per line
    $split = 24;
    while ($pos < mb_strlen($string, $encoding))
    {
        $output = mb_strimwidth($string, $pos, $split, "", $encoding);
        $pos += mb_strlen($output, $encoding);
        $_string_encoded = "=?".$encoding."?B?".base64_encode($output)."?=";
        if ($_string)
            $_string .= "\r\n";
        $_string .= $_string_encoded;
    }
    $string = $_string;
    return $string;
}

?>

paravoid ¶

19 years ago


If mb_ version doesn't work for you in MIME-B mode:

function encode_mimeheader($string, $charset=null, $linefeed="\r\n") {
    if (!$charset)
        $charset = mb_internal_encoding();

    $start = "=?$charset?B?";
    $end = "?=";
    $encoded = '';

    /* Each line must have length <= 75, including $start and $end */
    $length = 75 - strlen($start) - strlen($end);
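    /* Nothing to encode; also avoids a division by zero below */
    if ($string === '')
        return '';
    /* Average multi-byte ratio */
    $ratio = mb_strlen($string, $charset) / strlen($string);
    /* Base64 has a 4:3 ratio; cast to int because mb_substr() expects integers */
    $magic = $avglength = (int) floor(3 * $length * $ratio / 4);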

    for ($i=0; $i <= mb_strlen($string, $charset); $i+=$magic) {
        $magic = $avglength;
        $offset = 0;
        /* Recalculate magic for each line to be 100% sure */
        do {
            $magic -= $offset;
            $chunk = mb_substr($string, $i, $magic, $charset);
            $chunk = base64_encode($chunk);
            $offset++;
        } while (strlen($chunk) > $length);
        if ($chunk)
            $encoded .= ' '.$start.$chunk.$end.$linefeed;
    }
    /* Chomp the first space and the last linefeed */
    $encoded = substr($encoded, 1, -strlen($linefeed));

    return $encoded;
}

chappy at citromail dot hu ¶

19 years ago


In countries that use characters outside US-ASCII, this is a very good example for sending mail:

mb_internal_encoding('iso-8859-2');
setlocale(LC_CTYPE, 'hu_HU');

function encode($str,$charset){
    $str=mb_encode_mimeheader(trim($str),$charset, 'Q', "\n\t");
    return $str;
}

print encode('the text with spec. chars: ő Ű Ő ű, ?','iso-8859-2');

It produces a 7-bit string.

iwakura at junx dot org ¶

19 years ago


I think mb_encode_mimeheader still has a bug. Here is some sample code:

function mb_encode_mimeheader2($string, $encoding = "ISO-2022-JP") {
    $string_array = array();
    $pos = 0;
    $row = 0;
    $mode = 0;
    
    while ($pos < mb_strlen($string)) {
        $word = mb_strimwidth($string, $pos, 1);
        if (!$word) {
            $word = mb_strimwidth($string, $pos, 2);
        }
        if (mb_ereg_match("[ -~]", $word)) {    // ascii
            if ($mode != 1) {
                $row++;
                $mode = 1;
                $string_array[$row] = NULL;
            }
        } else {                                // multibyte
            if ($mode != 2) {
                $row++;
                $mode = 2;
                $string_array[$row] = NULL;
            }
        }
        $string_array[$row] .= $word;
        $pos++;
    }
    
    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";
    
    foreach ($string_array as $key => $value) {
        $value = mb_convert_encoding($value, $encoding);
        $string_array[$key] = mb_encode_mimeheader($value, $encoding);
    }
    
    //echo "<pre>";
    //print_r($string_array);
    //echo "</pre>";
    
    return implode("", $string_array);
}

It is not the best, but it works.

mortoray at ecircle-ag dot com ¶

19 years ago


At least for Q encoding, this function is unsafe and does not encode correctly. Raw characters which appear as RFC2047 sequences are simply left as is.

Ex:

mb_encode_mimeheader( '=?iso-8859-1?q?this=20is=20some=20text?=' );

returns '=?iso-8859-1?q?this=20is=20some=20text?='

The result is the exact same string, which is obviously not a correct encoding of the source string. In other words, mb_encode_mimeheader does not do any type of escaping.

As a consequence, the following condition is not always true:
    mb_decode_mimeheader( mb_encode_mimeheader( $text ) ) == $text
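
The condition can be checked for specific inputs with a small sketch like the one below. survives_round_trip() is a hypothetical helper, UTF-8 is assumed throughout, and the result for the second sample may differ between PHP versions, so no particular output is claimed here.

```php
<?php
mb_internal_encoding('UTF-8');

// Hypothetical helper: reports whether a string survives encode + decode unchanged.
function survives_round_trip(string $text, string $transferEncoding = 'Q'): bool
{
    $encoded = mb_encode_mimeheader($text, 'UTF-8', $transferEncoding);
    return mb_decode_mimeheader($encoded) === $text;
}

$samples = [
    '太郎',
    '=?iso-8859-1?q?this=20is=20some=20text?=', // already looks like an encoded-word
];

foreach ($samples as $sample) {
    echo var_export(survives_round_trip($sample), true), '  ', $sample, PHP_EOL;
}
```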
