Python 3 generator comprehension to generate chunks including last

If you have a list in Python 3.7:

>>> li
[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

You can turn that into a list of chunks each of length n with one of two common Python idioms:

>>> n = 3
>>> list(zip(*[iter(li)]*n))
[(0, 1, 2), (3, 4, 5), (6, 7, 8)]

This drops the last, incomplete tuple, since (9, 10) is not of length n.
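
The idiom works because `[iter(li)]*n` repeats one iterator object n times, so `zip` pulls n consecutive items per tuple and stops as soon as any pull fails. A minimal sketch of that behaviour (the names `it` and `refs` are only illustrative):

```python
li = list(range(11))
n = 3

it = iter(li)      # a single iterator over the list
refs = [it] * n    # n references to that same iterator, not n copies

# zip asks each reference in turn for an item, so every tuple is
# built from n consecutive items of the one underlying iterator.
chunks = list(zip(*refs))

# When the iterator runs dry partway through a tuple, zip stops, and the
# items already pulled for that partial tuple (9 and 10 here) are discarded.
leftover = list(it)

print(chunks)
print(leftover)
```

Note that the trailing items are consumed, not left behind in the iterator, so they cannot be recovered afterwards.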

You can also do:

>>> [li[i:i+n] for i in range(0, len(li), n)]
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]

if you want the last sublist even when it has fewer than n elements.

Suppose now I have a generator, gen, of unknown length or termination (so calling list(gen) or sum(1 for _ in gen) would not be wise), where I want every chunk.

The best generator expression that I have been able to come up with is something along these lines:

from itertools import zip_longest

sentinel = object()            # for use in filtering out ending chunks
gen = (e for e in range(11))   # fill in for the actual gen

g3 = (t if sentinel not in t else tuple(filter(lambda x: x is not sentinel, t))
      for t in zip_longest(*[iter(gen)]*n, fillvalue=sentinel))

That works for the intended purpose:

>>> next(g3)
(0, 1, 2)
>>> next(g3)
(3, 4, 5)
>>> list(g3)
[(6, 7, 8), (9, 10)]

It just seems -- clumsy. I tried:

  1. using islice but the lack of length seems hard to surmount;
  2. using a sentinel in iter but the sentinel version of iter requires a callable, not an iterable.

Is there a more idiomatic Python 3 technique for a generator of chunks of length n, including the last chunk, which might be shorter than n?

I am open to a generator function as well. I am looking for something idiomatic and mostly more readable.


Update:

DSM's method in his deleted answer is very good I think:

>>> from itertools import islice
>>> g3 = iter(lambda it=iter(gen): tuple(islice(it, n)), ())
>>> next(g3)
(0, 1, 2)
>>> list(g3)
[(3, 4, 5), (6, 7, 8), (9, 10)]
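
For readability, that one-liner can be unpacked into a small helper. This is only a sketch of the same idea, not DSM's own code; the names `chunks_of` and `take` are made up here:

```python
from itertools import islice


def chunks_of(iterable, n):
    """Return an iterator of tuples holding up to n items each (hypothetical helper)."""
    it = iter(iterable)  # one shared iterator, consumed a chunk at a time

    def take():
        # Pull at most n items; returns () once the source is exhausted.
        return tuple(islice(it, n))

    # Two-argument iter(callable, sentinel) keeps calling take()
    # until it returns the sentinel, here the empty tuple.
    return iter(take, ())


g = chunks_of((e for e in range(11)), 3)
first = next(g)
rest = list(g)
print(first, rest)
```

The two-argument form of iter is what gets around the "needs a callable" problem: the iterable is wrapped in a zero-argument function, and the empty tuple doubles as the stop signal.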

I am open to this question being a dup, but the linked question is almost 10 years old and focused on a list. Is there no newer method in Python 3 for generators where you don't know the length and don't want more than a chunk at a time?

 


I think this is always going to be messy as long as you're trying to fit it into a one-liner. I would just bite the bullet and go with a generator function here. It is especially useful if you don't know the actual size (say, if gen is an infinite generator).

from itertools import islice

def chunk(gen, k):
    """Efficiently split `gen` into chunks of size `k`.

    Args:
        gen: Iterator to chunk.
        k: Number of elements per chunk.

    Yields:
        Chunks as a list.
    """
    it = iter(gen)  # also accept plain iterables such as lists
    while True:
        piece = [*islice(it, k)]  # up to k items; empty once exhausted
        if piece:
            yield piece
        else:
            break

>>> gen = iter(range(11))
>>> list(chunk(gen, 3))
[[0, 1, 2], [3, 4, 5], [6, 7, 8], [9, 10]]

Someone may have a better suggestion, but this is how I'd do it.
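
One more option, in case a newer interpreter is available: Python 3.12 added `itertools.batched`, which yields tuples of up to n items and keeps the short final one. A hedged sketch follows; the fallback branch is my own stand-in for older versions (it follows the same idea as the generator function above) and is not the standard library's code.

```python
from itertools import count, islice

try:
    from itertools import batched  # available in Python 3.12 and later
except ImportError:
    def batched(iterable, n):
        """Stand-in for older Pythons: yield tuples of up to n items."""
        it = iter(iterable)
        while True:
            batch = tuple(islice(it, n))
            if not batch:
                return
            yield batch

# Finite input: the last, shorter chunk is kept.
finite = list(batched(range(11), 3))

# Infinite input: chunks are produced lazily, so taking two is safe.
first_two = list(islice(batched(count(), 3), 2))

print(finite)
print(first_two)
```

Either way, only one chunk is held in memory at a time, and the length of the source never needs to be known.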
