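Suppose you have a list of objects that you need to iterate over two items at a time, in non-overlapping pairs.
The Python 2.x documentation for `zip` notes that its guaranteed left-to-right evaluation order "makes possible an idiom for clustering a data series into n-length groups using `zip(*[iter(s)]*n)`".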
So the solution would be:
    k = [1, 2, 3, 4, 5, 6]
    list(zip(*[iter(k)]*2))  # [(1, 2), (3, 4), (5, 6)]
That is cryptic! Let’s break it down to understand why this works.
- Let’s start with the innermost bit. `iter` simply returns an iterator object. For lists we would normally just write `for x in alist` to iterate over the list, but under the hood the loop creates an iterator and fetches each item with a `next` call.
    >>> iter(k)
    <list_iterator object at 0x7fcf654c9f28>
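- Next we consider `[iter(k)]*2`. The multiplication here does not copy the iterator: it builds a list whose two elements refer to the same iterator object, as the identical addresses below show.
    >>> [iter(k)] * 2
    [<list_iterator object at 0x7fcf654c9f28>, <list_iterator object at 0x7fcf654c9f28>]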
- The star operator `*` then unpacks the list as positional arguments to a function, which is `zip` in this case. `zip` is a handy tool for merging several iterables together.
    >>> zip(*[iter(k)] * 2)
    <zip object at 0x7fcf654de808>
- Finally, `list` consumes the zip object to build the entire list, giving us the desired output.
    >>> list(zip(*[iter(k)] * 2))
    [(1, 2), (3, 4), (5, 6)]
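The steps above can also be traced by hand. The following is a minimal sketch using only built-ins; the names `it`, `slots` and `pairs` are made up for illustration. It imitates what `zip` does for each output tuple, namely calling `next` on its arguments from left to right.

```python
k = [1, 2, 3, 4, 5, 6]

it = iter(k)
slots = [it] * 2                 # two references to the *same* iterator
assert slots[0] is slots[1]      # no copy of the iterator was made

# Imitate zip: for each tuple, call next() on every argument in order.
# Both arguments are the same iterator, so every call advances it.
pairs = []
while True:
    try:
        first = next(slots[0])
        second = next(slots[1])
    except StopIteration:
        break                    # stop once the iterator is exhausted
    pairs.append((first, second))

print(pairs)  # [(1, 2), (3, 4), (5, 6)]
```

Written out this way, it is easier to see that the pairing is a side effect of sharing one iterator, not something `zip` does deliberately.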
What’s strange about all this is that it depends on subtle behaviours of the underlying functions. For example, suppose that instead of `list(zip(*[iter(k)] * 2))` you wrote `list(zip(*[iter(k), iter(k)]))`. You would end up with a different result, in which each item is paired with itself: `(1, 1)`, `(2, 2)` and so on. The solution depends on both arguments being the very same iterator object! Each time `zip` pulls from either argument, it makes a `next` call that advances the one shared iterator.
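To make the difference concrete, here is a short comparison of the two spellings. The variable names are illustrative only, and the last case, an odd-length list, is an extra illustration of another subtle behaviour rather than something discussed above.

```python
k = [1, 2, 3, 4, 5, 6]

shared = list(zip(*[iter(k)] * 2))          # one iterator, referenced twice
separate = list(zip(*[iter(k), iter(k)]))   # two independent iterators

print(shared)    # [(1, 2), (3, 4), (5, 6)]
print(separate)  # [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (6, 6)]

# With an odd number of items the trailing item is silently dropped,
# because zip stops as soon as any argument is exhausted.
odd = list(zip(*[iter([1, 2, 3, 4, 5])] * 2))
print(odd)       # [(1, 2), (3, 4)]
```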
Show, don’t tell
I’d hate to encounter snippets like this in the wild, as they place significant cognitive load on anyone trying to read the code. Strangely, the idiom was included in the official 2.x documentation, and a variant of it still appears under the `zip` entry in the Python 3 documentation.
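One way to lower that load is to give the behaviour a name. The helper below is a hypothetical sketch, not something from the documentation: `chunked` is an invented name, and unlike the `zip` idiom it keeps a short final group instead of dropping it.

```python
from itertools import islice

def chunked(items, size):
    """Yield successive tuples of up to `size` items from `items`."""
    it = iter(items)
    while True:
        chunk = tuple(islice(it, size))  # take the next `size` items
        if not chunk:                    # iterator exhausted
            return
        yield chunk

print(list(chunked([1, 2, 3, 4, 5, 6], 2)))  # [(1, 2), (3, 4), (5, 6)]
print(list(chunked([1, 2, 3, 4, 5], 2)))     # [(1, 2), (3, 4), (5,)]
```

The call site now says what it does, and the handling of a leftover item is an explicit choice in one place.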