python - How do I create paragraphs from Markov chain output?

I want to modify the script below so that it builds paragraphs from a random number of the sentences it generates. In other words, join a random number (say 1-5) of sentences before adding a newline.

The script works fine as is, but the output is short sentences separated by newlines. I would like to gather some of those sentences into paragraphs.

Any thoughts on best practice? Thanks.

    """
        from:  http://code.activestate.com/recipes/194364-the-markov-chain-algorithm/?in=lang-python
    """

    import random
    import sys

    stopword = "\n" # Since we split on whitespace, this can never be a word
    stopsentence = (".", "!", "?",) # Cause a "new sentence" if found at the end of a word
    sentencesep  = "\n" # String used to separate sentences

    # GENERATE TABLE
    w1 = stopword
    w2 = stopword
    table = {}

    for line in sys.stdin:
        for word in line.split():
            if word[-1] in stopsentence:
                table.setdefault( (w1, w2), [] ).append(word[0:-1])
                w1, w2 = w2, word[0:-1]
                word = word[-1]
            table.setdefault( (w1, w2), [] ).append(word)
            w1, w2 = w2, word
    # Mark the end of the file
    table.setdefault( (w1, w2), [] ).append(stopword)

    # GENERATE SENTENCE OUTPUT
    maxsentences  = 20

    w1 = stopword
    w2 = stopword
    sentencecount = 0
    sentence = []

    while sentencecount < maxsentences:
        newword = random.choice(table[(w1, w2)])
        if newword == stopword: sys.exit()
        if newword in stopsentence:
            print ("%s%s%s" % (" ".join(sentence), newword, sentencesep))
            sentence = []
            sentencecount += 1
        else:
            sentence.append(newword)
        w1, w2 = w2, newword

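One possible way to approach the grouping itself, kept separate from the Markov logic: collect the generated sentences in a list first, then cut that list into chunks of random size. This is only a sketch; `group_into_paragraphs` and its parameters are hypothetical names and not part of the recipe.

```python
import random


def group_into_paragraphs(sentences, min_len=1, max_len=5, rng=random):
    """Split a list of sentence strings into paragraphs of random length."""
    paragraphs = []
    i = 0
    while i < len(sentences):
        # randint is inclusive on both ends, so max_len sentences are possible
        size = rng.randint(min_len, max_len)
        paragraphs.append(" ".join(sentences[i:i + size]))
        i += size
    return paragraphs


if __name__ == "__main__":
    demo = ["Sentence %d." % n for n in range(1, 13)]
    # A blank line between paragraphs
    print("\n\n".join(group_into_paragraphs(demo)))
```

Note that `random.randint(1, 5)` can return 5, whereas `random.randrange(1, 5)` stops at 4.
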
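Edit 01:

OK, I cobbled together a simple "paragraph wrapper" that does a good job of gathering sentences into paragraphs, but it interferes with the output of the sentence generator: I am getting excessive repetition of the first words, for example, among other problems.

The premise is sound, though; I just need to figure out why adding the paragraph loop affects the behaviour of the sentence loop. If you can see the problem, please let me know: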

    ###
    #    usage: $ python markov_sentences.py < input.txt > output.txt
    #    from:  http://code.activestate.com/recipes/194364-the-markov-chain-algorithm/?in=lang-python
    ###

    import random
    import sys

    stopword = "\n" # Since we split on whitespace, this can never be a word
    stopsentence = (".", "!", "?",) # Cause a "new sentence" if found at the end of a word
    paragraphsep  = "\n\n" # String used to separate paragraphs

    # GENERATE TABLE
    w1 = stopword
    w2 = stopword
    table = {}

    for line in sys.stdin:
        for word in line.split():
            if word[-1] in stopsentence:
                table.setdefault( (w1, w2), [] ).append(word[0:-1])
                w1, w2 = w2, word[0:-1]
                word = word[-1]
            table.setdefault( (w1, w2), [] ).append(word)
            w1, w2 = w2, word
    # Mark the end of the file
    table.setdefault( (w1, w2), [] ).append(stopword)

    # GENERATE PARAGRAPH OUTPUT
    maxparagraphs = 10
    paragraphs = 0 # reset the outer 'while' loop counter to zero

    while paragraphs < maxparagraphs: # start outer loop, until maxparagraphs is reached
        w1 = stopword
        w2 = stopword
        stopsentence = (".", "!", "?",)
        sentence = []
        sentencecount = 0 # reset the inner 'while' loop counter to zero
        maxsentences = random.randrange(1,5) # random sentences per paragraph

        while sentencecount < maxsentences: # start inner loop, until maxsentences is reached
            newword = random.choice(table[(w1, w2)]) # random word from word table
            if newword == stopword: sys.exit()
            elif newword in stopsentence:
                print ("%s%s" % (" ".join(sentence), newword), end=" ")
                sentencecount += 1 # increment the sentence counter
            else:
                sentence.append(newword)
            w1, w2 = w2, newword
        print (paragraphsep) # newline space
        paragraphs = paragraphs + 1 # increment the paragraph counter

    # EOF

Edit 02:

I added sentence = [] to the elif statement. To wit:

        elif newword in stopsentence:
            print ("%s%s" % (" ".join(sentence), newword), end=" ")
            sentence = [] # I have to be here to make the new sentence start as an empty list!!!
            sentencecount += 1 # increment the sentence counter
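A small sketch of why that reset matters. It assumes the only difference between the two runs is whether the `sentence` list is emptied after a sentence is emitted; `emit_sentences` and the hard-coded word stream are hypothetical stand-ins for the Markov table lookups.

```python
def emit_sentences(words, reset):
    """Build sentences from a flat word stream; '.' ends a sentence.

    `words` is hypothetical hard-coded input standing in for the Markov
    table lookups; `reset` toggles the `sentence = []` line from Edit 02.
    """
    out = []
    sentence = []
    for word in words:
        if word == ".":
            out.append(" ".join(sentence) + word)
            if reset:
                sentence = []  # start the next sentence from an empty list
        else:
            sentence.append(word)
    return out


stream = ["the", "cat", "sat", ".", "a", "dog", "ran", "."]
print(emit_sentences(stream, reset=False))
print(emit_sentences(stream, reset=True))
```

Without the reset, the second sentence still carries the words of the first ("the cat sat a dog ran."), which would be consistent with the repeated leading words described in Edit 01.
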

Edit 03:

This is the final iteration of this script. Thanks to grieve for helping sort this out. I hope others can have some fun with it; I know I will.
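As a rough sketch only (not the author's final script), the pieces above might be combined as follows. The function names `build_table` and `generate_paragraphs` are my own. The sketch also assumes the chain state (`w1`, `w2`) should carry over between paragraphs instead of being reset: the key `(stopword, stopword)` only ever maps to the first word of the input, so resetting it would open every paragraph with that same word.

```python
import random

STOPWORD = "\n"                 # never a real word, since input is split on whitespace
STOPSENTENCE = (".", "!", "?")  # punctuation that ends a sentence


def build_table(lines):
    """Map each pair of consecutive tokens to the tokens seen after it."""
    w1 = w2 = STOPWORD
    table = {}
    for line in lines:
        for word in line.split():
            # Assumption: a lone punctuation mark stays a single token
            if len(word) > 1 and word[-1] in STOPSENTENCE:
                # store the word and its trailing punctuation as two tokens
                table.setdefault((w1, w2), []).append(word[:-1])
                w1, w2 = w2, word[:-1]
                word = word[-1]
            table.setdefault((w1, w2), []).append(word)
            w1, w2 = w2, word
    table.setdefault((w1, w2), []).append(STOPWORD)  # mark the end of the input
    return table


def generate_paragraphs(table, max_paragraphs=10, rng=random):
    """Return a list of paragraphs, each holding 1-5 generated sentences."""
    paragraphs = []
    w1 = w2 = STOPWORD
    for _ in range(max_paragraphs):
        sentences = []
        sentence = []
        target = rng.randint(1, 5)  # inclusive: 1 to 5 sentences
        while len(sentences) < target:
            newword = rng.choice(table[(w1, w2)])
            if newword == STOPWORD:
                # end of the chain: keep any finished sentences and stop
                if sentences:
                    paragraphs.append(" ".join(sentences))
                return paragraphs
            if newword in STOPSENTENCE:
                sentences.append(" ".join(sentence) + newword)
                sentence = []  # the reset from Edit 02
            else:
                sentence.append(newword)
            w1, w2 = w2, newword
        paragraphs.append(" ".join(sentences))
    return paragraphs


if __name__ == "__main__":
    sample = ["The cat sat. The dog ran! Did the cat run?"]
    print("\n\n".join(generate_paragraphs(build_table(sample))))
```

To read input the way the original usage line does (`python markov_sentences.py < input.txt`), `sys.stdin` could be passed to `build_table` in place of `sample`.
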

Source: http://outofmemory.cn/langs/1205465.html
