Convert nested <ul><li> to PHP nested array

I want to convert a nested ul/li structure into a PHP array.

My HTML looks like this:

<ul id="main-menu">
    <li id="firstNavItem"><a href="index.html">Home</li>
    <li><a href="Warp.html">Warp</a>
        <ul>
            <li><a href="Warp-how-it-works.html">How it works</a>
            </li>
            <li><a href="Warp-Engine.html">Warp Engine</a>
            </li>
            <li><a href="WarpFactors.html">Warp Factors</a>
            </li>
            <li><a href="">Fuel</a>
                <ul>
                    <li><a href="Anti-Matter.html">Anti-Matter</a>
                    </li>
                    <li><a href="Deuterium.html">Deuterium</a>
                    </li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="Fact-or-Fiction.html">Fact or Fiction</li>
    <li><a href="StarTrek.html">Star Trek</a>
        <ul>
            <li><a href="Enterprise.html">Enterprise</a>
            </li>
            <li><a href="Voyager.html">Voyager</a>
            </li>
        </ul>
    </li>
    <li><a href="about.html">About</a>
    </li> </ul>

It has to be converted into an array.

I have tried several parsing approaches, but all of them failed.

One of the approaches I tried was:

$doc = new \DOMDocument();
$doc->preserveWhiteSpace = false;
$doc->loadHTML($data);
$i = 0;
while( is_object($finance = $doc->getElementsByTagName("li")->item($i)) )
{
    foreach($finance->childNodes as $nodename)
    {
        if($nodename->nodeName == 'li')
        {
            foreach($nodename->childNodes as $subNodes)
            {
                $arr[$i] = $subNodes->nodeValue.PHP_EOL;
            }
        }
        else
        {
            $s = explode('             ', $nodename->nodeValue);
            if (count($s) == 1)
            {
                $arr[$i] =$nodename->nodeValue;
            }
            else
            {
                $arr[$i] =  $s;
            }
        }
    }
    $i++;
}

The code below produces a nested array. I wasn't sure what the output array should look like, but this code gives the following:

Array
(
    [0] => Array
        (
            [key] => Home
            [items] => Array
                (
                )
        )
    [1] => Array
        (
            [key] => Warp
            [items] => Array
                (
                    [0] => Array
                        (
                            [key] => How it works
                            [items] => Array
                                (
                                )
                        )
                    [1] => Array
                        (
                            [key] => Warp Engine
                            [items] => Array
                                (
                                )
                        )
                    [2] => Array
                        (
                            [key] => Warp Factors
                            [items] => Array
                                (
                                )
                        )
                    [3] => Array
                        (
                            [key] => Fuel
                            [items] => Array
                                (
                                    [0] => Array
                                        (
                                            [key] => Anti-Matter
                                            [items] => Array
                                                (
                                                )
                                        )
                                    [1] => Array
                                        (
                                            [key] => Deuterium
                                            [items] => Array
                                                (
                                                )
                                        )
                                )
                        )
                )
        )
    [2] => Array
        (
            [key] => Fact or Fiction
            [items] => Array
                (
                )
        )
    [3] => Array
        (
            [key] => Star Trek
            [items] => Array
                (
                    [0] => Array
                        (
                            [key] => Enterprise
                            [items] => Array
                                (
                                )
                        )
                    [1] => Array
                        (
                            [key] => Voyager
                            [items] => Array
                                (
                                )
                        )
                )
        )
    [4] => Array
        (
            [key] => About
            [items] => Array
                (
                )
        )
)

Code:

<?php
class Parser {
    private $elements = [];
    public function parse() {
        $doc = new \DOMDocument();
        $doc->preserveWhiteSpace = false;
        $doc->loadHTMLFile("./html.html");
        $this->parseChildNodes($doc, $this->elements);
    }
    private function parseChildNodes($node, & $arrayToPush) {
        $indexPushed = count($arrayToPush);
        if ($node->nodeName == "li") {
            $representation = [
                "key" => $this->getDisplayValueFromNode($node),
                "items" => []
            ];
            array_push($arrayToPush, $representation);
            $arrayToPush = & $arrayToPush[$indexPushed]["items"];
        }
        if ($node->childNodes == null) {
            return;
        }
        foreach ($node->childNodes as $child) {
            $this->parseChildNodes($child, $arrayToPush);
        }
    }
    /**
     * Get the value of the node's first element
     * In our case this is the text value of the anchor tag
     *
     * @param $node
     * @return String
     */
    private function getDisplayValueFromNode($node) {
        return $node->firstChild->nodeValue;
    }
    public function getElements() {
        return $this->elements;
    }
}
$parser = new Parser();
$parser->parse();
print_r($parser->getElements());

This wasn't easy, but I didn't know you could access the DOM from PHP, so it was a fun challenge.

This works for lists nested up to two levels deep; you could refactor it to handle deeper lists more easily.

The code below should help you get the list into an array. I left the echo statements in for demonstration purposes.

<?php
    $data = <<<EOT
<ul id="main-menu">
    <li id="firstNavItem"><a href="index.html">Home</li>
    <li><a href="Warp.html">Warp</a>
        <ul>
            <li><a href="Warp-how-it-works.html">How it works</a>
            </li>
            <li><a href="Warp-Engine.html">Warp Engine</a>
            </li>
            <li><a href="WarpFactors.html">Warp Factors</a>
            </li>
            <li><a href="">Fuel</a>
                <ul>
                    <li><a href="Anti-Matter.html">Anti-Matter</a>
                    </li>
                    <li><a href="Deuterium.html">Deuterium</a>
                    </li>
                </ul>
            </li>
        </ul>
    </li>
    <li><a href="Fact-or-Fiction.html">Fact or Fiction</li>
    <li><a href="StarTrek.html">Star Trek</a>
        <ul>
            <li><a href="Enterprise.html">Enterprise</a>
            </li>
            <li><a href="Voyager.html">Voyager</a>
            </li>
        </ul>
    </li>
    <li><a href="about.html">About</a>
    </li>
</ul>
EOT;
    $doc = new \DOMDocument();
    $doc->preserveWhiteSpace = false;
    $doc->loadHTML($data);
    $list = $doc->getElementsByTagName('ul')->item(0);
    foreach ($list->childNodes as $node) {
        if ($node->nodeName == 'li'
            &&
            $node->lastChild->nodeName != 'ul'
        ) {
            echo $node->textContent . "<br>";
        } else {
            if ($node->lastChild->childNodes) {
                foreach ($node->lastChild->childNodes as $node2) {
                    if ($node2->nodeName == 'li'
                        &&
                        $node2->lastChild->nodeName != 'ul'
                    ) {
                        echo "&bull; " . $node2->textContent . "<br>";
                    } else {
                        if ($node2->lastChild->childNodes) {
                            foreach ($node2->lastChild->childNodes as $node3) {
                                if ($node3->nodeName == 'li'
                                    &&
                                    $node3->lastChild->nodeName != 'ul'
                                ) {
                                    echo "&bull; &bull; " . $node3->textContent . "<br>";
                                }
                            }
                        }
                    }
                }
            }
        }
    }
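
The refactoring mentioned above could look roughly like this: a recursive function takes the place of the hand-nested loops, so the depth of the list no longer matters. This is only a sketch under a few assumptions. The function name listToArray and the 'label' / 'children' keys are made up for illustration, and the sample markup is a shortened, well-formed stand-in for the menu in the question.

```php
<?php
// Hypothetical helper (not from the answer above): turns one <ul> element
// into a nested array of ['label' => ..., 'children' => [...]] entries.
function listToArray(DOMElement $ul): array
{
    $items = [];
    foreach ($ul->childNodes as $li) {
        // Skip whitespace text nodes and anything that is not an <li>.
        if (!($li instanceof DOMElement) || $li->nodeName !== 'li') {
            continue;
        }
        $label = '';
        $children = [];
        foreach ($li->childNodes as $child) {
            if ($child instanceof DOMElement && $child->nodeName === 'ul') {
                // Recurse into the nested list, whatever its depth.
                $children = listToArray($child);
            } else {
                // Everything else (the anchor, stray text) feeds the label.
                $label .= $child->textContent;
            }
        }
        $items[] = ['label' => trim($label), 'children' => $children];
    }
    return $items;
}

// Shortened sample markup, three levels deep.
$html = '<ul><li><a href="a.html">A</a><ul><li><a href="b.html">B</a>'
      . '<ul><li><a href="c.html">C</a></li></ul></li></ul></li>'
      . '<li><a href="d.html">D</a></li></ul>';

libxml_use_internal_errors(true); // assumption: keep parser warnings quiet
$doc = new DOMDocument();
$doc->loadHTML($html);

// item(0) is the outermost <ul>, because nodes come back in document order.
$result = listToArray($doc->getElementsByTagName('ul')->item(0));
print_r($result);
```

Collecting the label from every non-ul child, rather than only from lastChild, is a deliberate choice here: it avoids relying on where whitespace nodes happen to land.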

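getElementsByTagName() returns every node with that name (including nested ones), so there is no need to search the child nodes separately. The code in the snippet below returns this array: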

Array
(
    [0] => Home
    [1] => Warp
    [2] => How it works
    [3] => Warp Engine
    [4] => Warp Factors
    [5] => Fuel
    [6] => Anti-Matter
    [7] => Deuterium
    [8] => Fact or Fiction
    [9] => Star Trek
    [10] => Enterprise
    [11] => Voyager
    [12] => About
)

Code:

<?php
class Parser {
    private $elements = [];
    public function parse() {
        $doc = new \DOMDocument();
        $doc->preserveWhiteSpace = false;
        $doc->loadHTMLFile("./html.html");
        foreach($doc->getElementsByTagName("li") as $node) {
            array_push($this->elements, $node->firstChild->nodeValue);
        }
    }
    /**
     * Get the value of the node's first element
     * In our case this is the text value of the anchor tag
     *
     * @param $node
     * @return String
     */
    private function getDisplayValueFromNode($node) {
        return $node->firstChild->nodeValue;
    }
    public function getElements() {
        return $this->elements;
    }
}
$parser = new Parser();
$parser->parse();
print_r($parser->getElements());
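
To make the flattening behaviour concrete, here is a small sketch comparing getElementsByTagName() with a query for direct children only. Using DOMXPath for the second lookup is my own substitution, not something the answer above does, and the markup is a shortened stand-in for the menu in the question.

```php
<?php
$html = '<ul id="main-menu"><li><a href="Warp.html">Warp</a>'
      . '<ul><li><a href="Fuel.html">Fuel</a></li></ul></li>'
      . '<li><a href="about.html">About</a></li></ul>';

$doc = new DOMDocument();
$doc->loadHTML($html);
$ul = $doc->getElementsByTagName('ul')->item(0); // outermost list

// Every <li> below the outer list, at any depth (nesting is lost).
$all = $ul->getElementsByTagName('li');

// Only the <li> elements that are direct children of the outer list.
$xpath = new DOMXPath($doc);
$direct = $xpath->query('./li', $ul);

$allCount = $all->length;       // 3: Warp, Fuel, About
$directCount = $direct->length; // 2: Warp, About
echo $allCount . ' vs ' . $directCount . PHP_EOL;
```

So the flat approach is fine when only the labels matter, but the hierarchy has to be rebuilt some other way if the nested structure is needed.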