Reading a file chunk by chunk using a specified delimiter in Python

死链检查 • 2022-12-16 • Notes • 19 reads

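The general solution here is to write a generator function that yields one group at a time. That way, only one group has to be held in memory at any given moment.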

def get_groups(seq, group_by):
    data = []
    for line in seq:
        # Here the `startswith()` logic can be replaced with other
        # condition(s) depending on the requirement.
        if line.startswith(group_by):
            if data:
                yield data
                data = []
        data.append(line)
    if data:
        yield data

with open('input.txt') as f:
    for i, group in enumerate(get_groups(f, ">"), start=1):
        print("Group #{}".format(i))
        print("".join(group))

Output:

Group #1
> header1 description
data data
data

Group #2
>header2 description
more data
data
data
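As a minimal sketch of how the same idea can be tried without an `input.txt` on disk, the variant below feeds the generator from an in-memory `io.StringIO` and takes the "start of a new group" test as a function. The names `iter_groups`, `is_header`, and `sample` are hypothetical, introduced only for this illustration, and the sample text is assumed to match the input that produced the output above.

```python
import io


def iter_groups(lines, is_header):
    """Yield lists of lines; a new list starts at each line where is_header(line) is true."""
    group = []
    for line in lines:
        # Close the current group only if it already holds something,
        # so a header on the very first line does not yield an empty group.
        if is_header(line) and group:
            yield group
            group = []
        group.append(line)
    if group:
        # Emit whatever is left after the last header.
        yield group


# Assumed sample input, standing in for the contents of a real file.
sample = io.StringIO(
    "> header1 description\n"
    "data data\n"
    "data\n"
    ">header2 description\n"
    "more data\n"
    "data\n"
    "data\n"
)

# Groups are consumed one at a time; only a small summary of each is kept.
summary = [
    (group[0].rstrip("\n"), len(group))
    for group in iter_groups(sample, lambda line: line.startswith(">"))
]

for number, (header, size) in enumerate(summary, start=1):
    print("Group #{}: {!r} ({} lines)".format(number, header, size))
```

Passing a predicate instead of a fixed prefix is one possible design choice: it lets the grouping condition change (for example, to a blank-line or regex test) without touching the generator itself.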

For the FASTA format in general, I would recommend using the Biopython package.


Original article: http://outofmemory.cn/zaji/5663993.html
