Stripping all HTML tags with Html Agility Pack

I have an HTML string like this:

<html><body><p>foo <a href='http://www.example.com'>bar</a> baz</p></body></html>

I want to strip all HTML tags so that the resulting string becomes:

foo bar baz

Based on another post here, I came up with this function (using Html Agility Pack):

    Public Shared Function stripTags(ByVal html As String) As String
        Dim plain As String = String.Empty
        Dim htmldoc As New HtmlAgilityPack.HtmlDocument

        htmldoc.LoadHtml(html)
        Dim invalidNodes As HtmlAgilityPack.HtmlNodeCollection = htmldoc.DocumentNode.SelectNodes("//html|//body|//p|//a")

        If Not htmldoc Is Nothing Then
            For Each node In invalidNodes
                node.ParentNode.RemoveChild(node, True)
            Next
        End If

        Return htmldoc.DocumentNode.WriteContentTo
    End Function

Unfortunately, this does not return what I expect. Instead it gives:

bazbarfoo

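Where did I go wrong, and is this the best approach?

Regards and happy coding!

Update: Based on the answer below, I came up with this function, which may be useful to others: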

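    Public Shared Function stripTags(ByVal html As String) As String
        Dim htmldoc As New HtmlAgilityPack.HtmlDocument
        ' Turn paragraph ends and <br/> tags into line breaks before the tags are dropped.
        ' Environment.NewLine is a String, so it is concatenated twice here rather than
        ' passed to the String(Char, Integer) constructor.
        htmldoc.LoadHtml(html.Replace("</p>", "</p>" & Environment.NewLine & Environment.NewLine).Replace("<br/>", Environment.NewLine))
        Return htmldoc.DocumentNode.InnerText
    End Function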

Solution: Why not just return htmldoc.DocumentNode.InnerText instead of removing all the non-text nodes? It should give you what you want.
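The suggestion above can be sketched as a small console program. This is a minimal sketch, not the poster's code: it assumes the HtmlAgilityPack package is referenced, and the module name StripTagsSketch and the variable sample are made up for illustration. It also passes the result through HtmlEntity.DeEntitize, an addition not mentioned in the thread, on the assumption that InnerText may leave character entities such as &amp;amp; encoded.

```vbnet
Imports System
Imports HtmlAgilityPack

Module StripTagsSketch
    ' Hypothetical helper: collects the text of every node in the document,
    ' then decodes character entities (assumption: InnerText may leave them encoded).
    Function StripTags(ByVal html As String) As String
        Dim doc As New HtmlDocument()
        doc.LoadHtml(html)
        Return HtmlEntity.DeEntitize(doc.DocumentNode.InnerText)
    End Function

    Sub Main()
        ' The sample string from the question.
        Dim sample As String = "<html><body><p>foo <a href='http://www.example.com'>bar</a> baz</p></body></html>"
        Console.WriteLine(StripTags(sample))
    End Sub
End Module
```

For the sample string in the question this should print the text with the tags removed and the word order intact, since no nodes are moved or deleted along the way.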


Original address: http://outofmemory.cn/web/1103659.html