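If you are on a *nix system, the best general-purpose option is simply the following, which prints the lines that occur exactly once across the two files: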
sort filea fileb | uniq -u
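However, if you need to use Python:
Your code reopens the inner file on every iteration over the outer file. Open it once, outside the loop.
Nested loops are less efficient than looping over the first file once, storing the values you find, and then comparing the second file against those stored values.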
def build_set(filename):
    # A set stores a collection of unique items. Both adding items and searching for them
    # are quick, so it's perfect for this application.
    found = set()
    with open(filename) as f:
        for line in f:
            # [:2] gives us the first two elements of the list.
            # Tuples, unlike lists, cannot be changed, which is a requirement for anything
            # being stored in a set.
            found.add(tuple(sorted(line.split()[:2])))
    return found

set_more = build_set('100rwsnMore.txt')
set_del = build_set('100rwsnDeleted.txt')

with open('results.txt', 'w') as out_file:
    # Using with to open files ensures that they are properly closed, even if the code
    # raises an exception.
    for res in (set_more - set_del):
        # The - computes the elements in set_more not in set_del.
        out_file.write(" ".join(res) + "\n")
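To see why the tuple(sorted(...)) step matters, here is a minimal in-memory sketch of the same idea. The helper names pair_key and build_set_from_lines and the sample lines are hypothetical and only for illustration; unlike build_set above, this variant also skips blank lines.

```python
def pair_key(line):
    # Take the first two whitespace-separated fields and sort them,
    # so that "a b" and "b a" produce the same key.
    return tuple(sorted(line.split()[:2]))


def build_set_from_lines(lines):
    # Hypothetical in-memory counterpart of build_set; blank lines are skipped.
    return {pair_key(line) for line in lines if line.strip()}


# Made-up sample data standing in for the two input files.
more_lines = ["a b 1.0", "c d 2.0", "f e 3.0"]
deleted_lines = ["b a 9.9", "x y 0.1"]

# Set difference: pairs present in more_lines but not in deleted_lines.
remaining = build_set_from_lines(more_lines) - build_set_from_lines(deleted_lines)
print(sorted(remaining))
```

Because the keys are sorted, the pair "a b" in the first list is treated as already deleted by "b a" in the second list, regardless of any trailing fields. Sets are unordered, so sort the result if you need stable output.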