install.packages("openNLP") ## Installs the required natural language processing (NLP) package
install.packages("openNLPmodels.en") ## Installs the model files for the English language
library(openNLP) ## Loads the package for use in the task
library(openNLPmodels.en) ## Loads the model files for the English language

text = "Dr. brown and Mrs. Theresa will be away from a very long time!!! I can't wait to see them again." ## This sentence has unusual punctuation as suggested by @gui11aume

x = sentDetect(text, language = "en") ## sentDetect() is the function to use. It detects and separates sentences in a text. The first argument is the string vector (or text) and the second argument is the language.
x ## Displays the different sentences in the string vector (or text).
[1] "Dr. brown and Mrs. Theresa will be away from a very long time!!! "
[2] "I can't wait to see them again."

length(x) ## Displays the number of sentences in the string vector (or text).
[1] 2
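The {openNLP} package is well suited to natural language processing in R. You can find a short introduction to it here, or you can check the package's documentation here.

The package also supports three other languages. You only need to install and load the corresponding model files.

- {openNLPmodels.es} for Spanish
- {openNLPmodels.ge} for German
- {openNLPmodels.th} for Thai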
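As far as I know, sentDetect() belongs to the older openNLP interface; in more recent releases of the package, sentence detection goes through annotators from the {NLP} package instead. Here is a minimal sketch of the same count under that assumption. The helper name count_sentences() is hypothetical (it is not part of openNLP), and for any language other than English the matching model package is assumed to be installed already.

```r
# Sketch for the annotator-based interface of newer openNLP releases.
# count_sentences() is a hypothetical helper, not an openNLP function.
count_sentences <- function(text, language = "en") {
  # The annotators work on NLP "String" objects rather than plain characters.
  s <- NLP::as.String(text)

  # Build a sentence annotator for the requested language; this needs
  # the model files for that language to be available.
  annotator <- openNLP::Maxent_Sent_Token_Annotator(language = language)

  # annotate() returns one annotation (a character span) per detected sentence.
  spans <- NLP::annotate(s, annotator)

  # Subsetting the String by the annotations extracts the sentence texts.
  length(s[spans])
}

# Example call (requires openNLP, NLP and a working Java setup):
# count_sentences("Dr. brown and Mrs. Theresa will be away. I can't wait!")
```

Passing a different language code, for example language = "es" with the Spanish model files loaded, should be the only change needed for the other languages.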