如何在组长度和组内元素的所有可能组合中将列表分为n个组？_随笔

如何在组长度和组内元素的所有可能组合中将列表分为n个组？

我们可以从该答案中使用基本的递归算法，然后对其进行修改以生成特定长度的分区，而不必生成和过滤掉不需要的分区。

def sorted_k_partitions(seq, k):    """Returns a list of all unique k-partitions of `seq`.    Each partition is a list of parts, and each part is a tuple.    The parts in each individual partition will be sorted in shortlex    order (i.e., by length first, then lexicographically).    The overall list of partitions will then be sorted by the length    of their first part, the length of their second part, ...,    the length of their last part, and then lexicographically.    """    n = len(seq)    groups = []  # a list of lists, currently empty    def generate_partitions(i):        if i >= n: yield list(map(tuple, groups))        else: if n - i > k - len(groups):     for group in groups:         group.append(seq[i])         yield from generate_partitions(i + 1)         group.pop() if len(groups) < k:     groups.append([seq[i]])     yield from generate_partitions(i + 1)     groups.pop()    result = generate_partitions(0)    # Sort the parts in each partition in shortlex order    result = [sorted(ps, key = lambda p: (len(p), p)) for ps in result]    # Sort partitions by the length of each part, then lexicographically.    result = sorted(result, key = lambda ps: (*map(len, ps), ps))    return result

这里有很多事情，所以让我解释一下。

首先，我们从上述递归算法的程序性，自底向上（术语？）实现开始：

def partitions(seq):    """-> a list of all unique partitions of `seq` in no particular order.    Each partition is a list of parts, and each part is a tuple.    """    n = len(seq)    groups = []  # a list of lists, currently empty    def generate_partitions(i):        if i >= n: yield list(map(tuple, groups))        else: for group in groups     group.append(seq[i])     yield from generate_partitions(i + 1)     group.pop() groups.append([seq[i]]) yield from generate_partitions(i + 1) groups.pop()    if n > 0:        return list(generate_partitions(0))    else:        return [[()]]

主要算法在嵌套

generate_partitions

函数中。基本上，它遍历序列，对于每个项目，它：1）将项目放入工作集中的每个当前组（又称零件）并递归；2）将项目放入自己的新组中。

当我们到达序列（

i == n

）的末尾时，我们将生成一个已经建立的工作集的（深层）副本。

现在，要获得特定长度的分区，我们可以
简单地将所需结果过滤或分组，然后用它来完成，但是如果我们只是想这样做，这种方法会执行很多不必要的工作（即递归调用）一定长度的隔板

。

请注意，在上面的函数中，只要以下情况，分区的长度（即组数）就会增加：

 # this adds a new group (or part) to the partition groups.append([seq[i]]) yield from generate_partitions(i + 1) groups.pop()

…执行。因此，我们可以通过简单地在该块上放置防护来限制分区的大小，如下所示：

def partitions(seq, k):    ...    def generate_partitions(i):        ... # only add a new group if the total number would not exceed k if len(groups) < k:     groups.append([seq[i]])     yield from generate_partitions(i + 1)     groups.pop()

现在，将新参数和该行添加到

partitions

函数中将导致它仅生成长度 最大为的
分区

。这几乎是我们想要的。问题在于，

for

循环有时仍会生成长度 小于的 分区

。

为了修剪那些递归分支，只有

for

在可以确定序列中有足够的剩余元素将工作集扩展到所有组时，才需要执行循环

。剩余元素（或尚未放入组中的元素）的数量为

n- i

（或

len(seq) - i

）。这

k - len(groups)

是我们需要添加以产生有效k分区的新组的数量。如果为

n - i <= k -len(groups)

，则我们 不能通过 将项目添加到当前组之一来浪费它-我们必须创建一个新组。

因此，我们这次只是向另一个递归分支添加另一个保护：

    def generate_partitions(i):        ... # only add to current groups if the number of remaining items # exceeds the number of required new groups. if n - i > k - len(groups):     for group in groups:         group.append(seq[i])         yield from generate_partitions(i + 1)         group.pop() # only add a new group if the total number would not exceed k if len(groups) < k:     groups.append([seq[i]])     yield from generate_partitions(i + 1)     groups.pop()

这样一来，您将拥有一个可用的k分区生成器。您可能甚至可以进一步折叠一些递归调用（例如，如果剩余3个项目，而我们需要3个以上的组，那么您已经知道必须将每个项目分成自己的组），但是我想展示一下功能是对生成所有分区的基本算法的略微修改。

剩下要做的就是对结果进行排序。不幸的是，我没有弄清楚如何按照期望的顺序直接生成分区（这是一只聪明的狗的练习），我作弊并只是对生成后进行了排序。

def sorted_k_partitions(seq, k):    ...    result = generate_partitions(0)    # Sort the parts in each partition in shortlex order    result = [sorted(ps, key = lambda p: (len(p), p)) for ps in result]    # Sort partitions by the length of each part, then lexicographically.    result = sorted(result, key = lambda ps: (*map(len, ps), ps))    return result

除了主要功能外，有点不言自明。第一个：

key = lambda p: (len(p), p)

表示先按长度排序，然后再按序列本身排序（在Python中，默认情况下按字典顺序排序）。该

代表“的一部分”。这用于对分区中的零件/组进行排序。例如，此键意味着

(4,)

在之前

(1,2, 3)

，因此

[(1, 2, 3), (4,)]

被排序为

[(4,), (1, 2, 3)]

。

key = lambda ps: (*map(len, ps), ps) # or for Python versions <3.5: lambda ps: tuple(map(len, ps)) + (ps,)

在

ps

这里代表“零件”，复数。这个人说要按照序列中每个元素的长度（必须是序列本身）对序列进行排序，然后（按字典顺序）按序列本身进行排序。这用于对各个分区进行相互排序，例如，

[(4,),(1, 2, 3)]

在before之前

[(1, 2), (3, 4)]

。

以下：

seq = [1, 2, 3, 4]for k in 1, 2, 3, 4:    for groups in sorted_k_partitions(seq, k):        print(k, groups)

产生：

1 [(1, 2, 3, 4)]2 [(1,), (2, 3, 4)]2 [(2,), (1, 3, 4)]2 [(3,), (1, 2, 4)]2 [(4,), (1, 2, 3)]2 [(1, 2), (3, 4)]2 [(1, 3), (2, 4)]2 [(1, 4), (2, 3)]3 [(1,), (2,), (3, 4)]3 [(1,), (3,), (2, 4)]3 [(1,), (4,), (2, 3)]3 [(2,), (3,), (1, 4)]3 [(2,), (4,), (1, 3)]3 [(3,), (4,), (1, 2)]4 [(1,), (2,), (3,), (4,)]

欢迎分享，转载请注明来源：内存溢出

原文地址: http://outofmemory.cn/zaji/5643637.html

如何在组长度和组内元素的所有可能组合中将列表分为n个组？

发表评论

评论列表（0条）