How do I convert Unicode special characters to HTML entities?

I have the following string:

$string = "★ This is some text ★";

I want to convert it to HTML entities:

$string = "&#9733; This is some text &#9733;";

The solution everyone suggests:

htmlentities("★ This is some text ★", ENT_QUOTES, "UTF-8");

But htmlentities cannot convert every Unicode character to an HTML entity, so it just gives me output identical to the input:

★ This is some text ★

I also tried combining this solution with both:

header('Content-Type: text/plain; charset=utf-8');

and:

mb_convert_encoding();

But this either prints an empty result, does not convert anything at all, or wrongly converts the star to:

Â

How do I convert ★ and all other Unicode characters to the correct HTML entities?

htmlentities won't work in this case, but you can try encoding the string as UCS-4, something like:

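$string = "★ This is some text ★";
// Match every character outside the ASCII range (U+0080 to U+10FFFF).
$entity = preg_replace_callback('/[\x{80}-\x{10FFFF}]/u', function ($m) {
    $char = current($m);
    // UCS-4 uses four bytes per character, so the bytes spell out the code point.
    $utf = iconv('UTF-8', 'UCS-4', $char);
    // Hex-encode those bytes and strip the leading zero padding.
    return sprintf("&#x%s;", ltrim(strtoupper(bin2hex($utf)), "0"));
}, $string);
echo $entity;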

&#x2605; This is some text &#x2605;

Ideone demo
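If the mbstring extension is available, the same idea can probably be expressed without a regex callback. The following is only a sketch of an alternative technique, not the approach from the answer above: it uses mb_encode_numericentity(), which emits decimal rather than hexadecimal entities. The helper name to_numeric_entities() is made up for illustration.

```php
<?php
// Hypothetical helper (not a PHP built-in): encode every non-ASCII
// character of a UTF-8 string as a decimal numeric entity.
function to_numeric_entities(string $text): string
{
    // Conversion map: [first code point, last code point, offset, mask].
    // 0x80..0x10FFFF covers everything outside ASCII.
    $map = [0x80, 0x10FFFF, 0, 0x1FFFFF];
    return mb_encode_numericentity($text, $map, 'UTF-8');
}

echo to_numeric_entities("★ This is some text ★"), "\n";
// &#9733; This is some text &#9733;
```

Plain ASCII text passes through unchanged, so characters such as < and & would still need htmlspecialchars() if the string is going into HTML.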

This is better:

html_entity_decode('zł');

Output: zł
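Note that html_entity_decode() works in the decoding direction: it turns entities back into characters, so it seems more useful for checking a result than for producing entities. A minimal sketch, assuming the hexadecimal entity form that the preg_replace_callback answer generates:

```php
<?php
// Decode numeric entities back into UTF-8 characters.
$encoded = "&#x2605; This is some text &#x2605;";
$decoded = html_entity_decode($encoded, ENT_QUOTES | ENT_HTML5, 'UTF-8');

echo $decoded, "\n";
// ★ This is some text ★
```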