Itertools is a great library, but some methods definitely receive more attention than others – for instance, I’d wager that
chain is a lot more well-known than
tee. Python’s documentation describes
tee as follows:
Return n independent iterators from a single iterable.
It also adds this caveat:
Once tee() has made a split, the original iterable should not be used anywhere else; otherwise, the iterable could get advanced without the tee objects being informed.
However, the documentation doesn’t really explain why you might want to use
tee, instead of copying the iterable
n times. The short answer is that
tee() uses some heuristics and strategies to make the generation of these
n iterators more memory efficient. This article will peek inside Python’s internals and explore these strategies in more detail.
To understand the basis for the optimizations, it is essential to know how Python’s iterator protocol works. (More experienced Pythonistas might want to skim this section – it’s here for completeness.)
Iterables are Python objects that define an
__iter__() method, which, when called, returns an iterator. Iterators define the
__iter__() method as well, but implement it by simply returning the iterator itself. Iterators also implement a
next() method, which returns the next element in a sequence. To signal the end of a sequence,
next() raises a
StopIteration exception. Any object that implements both
next in the aforementioned manner is said to implement the iterator protocol.
To actually use these iterators and iterables, we turn to Python’s classic
for item in sequence loop. Here,
sequence can be either an iterator or an iterable – it doesn’t matter, because the loop will begin by implicitly invoking
__iter__ method, which will return an iterator. Now, on each round through the loop, the Python interpreter will invoke the iterator’s
next method and assign it to
item, and will continue to do so until
Perhaps you are wondering why Python makes the distinction between iterables and iterators, and why iterators don’t have
rewind methods. These two questions are in fact related. It can be expensive or complicated to implement these other methods – for instance, reading sequentially forwards from a file corresponds simply to
fread() in C, but rewinding to the start is a more expensive
fseek call, and there is no efficient way to read sequentially in reverse.
rewind are not essential for iteration – they can be built upon the primitive
next method. For instance, if we really wish to access the earlier contents of the file in Python, we could buffer the earlier results ourselves. However, buffering all the earlier values is generally inefficient and wasteful: many programs have no need to access earlier values of an iterator, and those that do tend to limit themselves to one or two elements before the current one. Rewinding an iterable is a more common use case, but the same thing can be achieved by obtaining an entirely new iterator that points to the start of the sequence. To do that, we need a factory that produces iterators – which is exactly what an iterable is.
In sum, Python’s iterator protocol is simple to implement, and it works well for the common use case of
for .. in loops.
The starkness of Python’s iterator design means that more complicated use cases will need to build their own abstractions on top of it. Fortunately, its simplicity also means that it is easy to build upon.
tee() is one such abstraction, built to efficiently create
n independent iterators. Consider file I/O again: we might not want to create
n file iterations by calling
fopen() over and over, especially if
n is large. Nor would we want to copy iterators that do heavy computation each time we called
next(). In both cases buffering is a better solution, and this is what
tee does, collating in one central list the values returned by the original iterator. Its return value consists of
tee iterator objects,1 each of which stores a single integer index that indicates the next element it should return from the list. The list itself is populated lazily – whenever any of the
n iterators asks for a value at an index that is not yet in the buffer,
next on the underlying iterator, then caches and returns the new value.
With this implementation detail in mind, the reasoning behind the documentation’s caveat becomes clearer – using the iterator outside of
tee() would cause some (or all) values to be lost from the cache.
Buffering is not always the most efficient way to create
n independent iterators. For example,
next could be a computationally cheap function, and it would be more efficient for us to just copy it
n times. We can indicate this fact by defining a
__copy__ method on our iterator.2 If
tee() detects the presence of
__copy__, it will copy the iterator instead of doing buffering. Moreover, it will only do this copying
n - 1 times – the last iterator will simply be the original one that was passed in! Since the original iterator should never be used after being passed to
tee(), this makes perfect sense; the end result will appear the same to the library consumer.
tee objects themselves implement
__copy__! Since any
tee object is backed by a central buffer, the most efficient way to duplicate one is to have the copy point to the same buffer. So not only does
tee duplicate any iterator efficiently, it also ensures that future duplications are efficient.
Notably, as a consequence of this optimization, passing in a non-
tee object that implements
__copy__ will return a tuple of copies of that object, not
tee objects. In practice, the type of the object should not matter as long as one sticks to using it solely as an iterator.
If the total number of items generated by the iterator is very large, storing all of the generated values might take up a lot of memory. If we can keep track of all the indices of the
tee objects generated by a
tee() function call, we can delete old values once all of the indices have passed it. (Since iterators can only progress forwards, we know that these values will never be used again.) With CPython’s reference counting, this is actually quite simple: the buffer can be implemented as a singly linked list, with each
tee object holding a pointer to the next element that it will return. As a natural consequence, once the last
tee iterator is done with a buffer element, the element’s reference count goes to zero and it gets garbage collected.
However, linked lists perform worse than arrays in terms of memory fragmentation. Moreover, they incur lots of overhead in memory allocations and deallocations. As a compromise, CPython uses linked arrays – a linked list with arrays as individual elements. Moreover, these arrays are sized such that the entire array element is 64 bits in size, fitting nicely into cache lines.
Python’s documentation gives pure-Python implementation of
tee(), but it simplifies things and leaves out the buffering and copy semantics. I’ve written up a slightly more involved one that reflects the optimizations described above. Unlike the version in the docs, this implementation passes the standard library’s test suite. It still makes some concessions to simplicity: for instance, buffering uses a single list, instead of the linked arrays described above.
class TeeIterator(object): def __new__(cls, tee_data): if isinstance(tee_data, TeeData): self = super(TeeIterator, cls).__new__(cls) self.tee_data = tee_data else: self = TeeIterator.from_iterable(tee_data) self.index = 0 return self def __copy__(self): return TeeIterator(self.tee_data) def __iter__(self): return self def next(self): rv = self.tee_data[self.index] self.index += 1 return rv @classmethod def from_iterable(cls, iterable): if isinstance(iterable, cls): return iterable.__copy__() tee_data = TeeData(iter(iterable)) return TeeIterator(tee_data) class TeeData(object): def __init__(self, iterator): self.iterator = iterator self.buffer =  def __getitem__(self, index): if index == len(self.buffer): self.buffer.append(next(self.iterator)) return self.buffer[index] def tee(iterable, n=2): if n < 0: raise ValueError("n must be >= 0") elif n == 0: return () if hasattr(iterable, '__copy__'): copyable = iterable else: copyable = TeeIterator.from_iterable(iterable) result = [ copyable ] for i in xrange(1, n): result.append(copyable.__copy__()) return tuple(result)