python generators garbage collection


I think my question is related to this, but not exactly similar. Consider this code:

    def countdown(n):
        try:
            while n > 0:
                yield n
                n -= 1
        finally:
            print('In the finally block')

    def main():
        for n in countdown(10):
            if n == 5:
                break
            print('Counting... ', n)
        print('Finished counting')

    main()

The output of this code is:

    Counting...  10
    Counting...  9
    Counting...  8
    Counting...  7
    Counting...  6
    In the finally block
    Finished counting

Is it guaranteed that the line "In the finally block" will be printed before "Finished counting"? Or does this happen only because of a CPython implementation detail, namely that an object is garbage collected as soon as its reference count reaches 0?

I am also curious how the finally block of the countdown generator gets executed. For example, if I change the code of main to

    def main():
        c = countdown(10)
        for n in c:
            if n == 5:
                break
            print('Counting... ', n)
        print('Finished counting')

then I do see "Finished counting" printed before "In the finally block". How does the garbage collector go directly to the finally block? I had always taken try/except/finally at face value, but thinking about it in the context of generators is making me think twice.


You are, as you expected, relying on implementation-specific behavior of CPython's garbage collection.[1]

In fact, if you run this code in, say, PyPy, the output will usually be:

    Counting...  10
    Counting...  9
    Counting...  8
    Counting...  7
    Counting...  6
    Finished counting
    In the finally block

And if you run it in an interactive PyPy session, that last line may come many lines later, or even only when you finally exit.


If you look at how generators are implemented, they have methods roughly like this:

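    def __del__(self):
        self.close()

    def close(self):
        try:
            # throw() raises the exception inside the generator at the paused yield
            self.throw(GeneratorExit)
        except (GeneratorExit, StopIteration):
            # the generator exited, either by propagating the exception or by returning
            pass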

In CPython, which uses reference counting (plus a cycle collector, but that isn't relevant here), as soon as the last reference to the generator goes away, it gets deleted, so it gets closed, so it raises a GeneratorExit into the generator frame and resumes it. And of course there's no handler for the GeneratorExit, so the finally clause gets executed and the exception propagates up the stack to close, where it is swallowed.
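To watch that sequence happen, here is a minimal sketch that records events in a list instead of printing them. The `events` list and the labels in it are just names made up for this illustration, and the ordering shown is what I would expect on CPython only; on PyPy the 'finally' entry may show up later or not before the final print.

```python
events = []  # illustrative log, not part of the original code

def countdown(n):
    try:
        while n > 0:
            yield n
            n -= 1
    finally:
        events.append('finally')

gen = countdown(3)
next(gen)  # start the generator so it is suspended inside the try block
events.append('before del')
del gen    # on CPython the refcount drops to zero here, so the generator is closed
events.append('after del')

print(events)  # on CPython: ['before del', 'finally', 'after del']
```

Note the `next(gen)` call: a generator that was never started has not entered the try block, so there is no finally clause to run when it is closed.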

In PyPy, which uses a hybrid garbage collector, the generator doesn't get deleted until the next time the GC decides to scan. And in an interactive session, with low memory pressure, that could be as late as exit time. But once it does, the same thing happens.

You can see this by handling the GeneratorExit explicitly:

    def countdown(n):
        try:
            while n > 0:
                yield n
                n -= 1
        except GeneratorExit:
            print('Exit!')
            raise
        finally:
            print('In the finally block')

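(If you leave the raise off, you'll get the same results for a slightly different reason: the generator then finishes normally after the except block, and close treats a generator that returns the same way as one that lets the GeneratorExit propagate.)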
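If you want to check both variants side by side without depending on the garbage collector, something like the following sketch should do it. It calls close() explicitly, and the `make_countdown` factory, the `reraise` flag, and the `results` dict are helpers invented for this comparison, not anything from the generator API.

```python
def make_countdown(reraise, log):
    """Build a countdown generator function that logs into `log`."""
    def countdown(n):
        try:
            while n > 0:
                yield n
                n -= 1
        except GeneratorExit:
            log.append('Exit!')
            if reraise:
                raise  # let GeneratorExit propagate out of the generator
            # otherwise fall through and let the generator return normally
        finally:
            log.append('finally')
    return countdown

results = {}
for reraise in (True, False):
    log = []
    gen = make_countdown(reraise, log)(3)
    next(gen)    # suspend the generator at the first yield
    gen.close()  # throws GeneratorExit in at that yield
    results[reraise] = log

print(results)  # {True: ['Exit!', 'finally'], False: ['Exit!', 'finally']}
```

What you must not do in the except block is yield another value: close() raises RuntimeError if the generator yields instead of exiting.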


You can explicitly close a generator—and, unlike the stuff above, this is part of the public interface of the generator type:

    def main():
        c = countdown(10)
        for n in c:
            if n == 5:
                break
            print('Counting... ', n)
        c.close()
        print('Finished counting')

Or, of course, you can use a with statement:

    import contextlib

    def main():
        with contextlib.closing(countdown(10)) as c:
            for n in c:
                if n == 5:
                    break
                print('Counting... ', n)
        print('Finished counting')
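As a quick check that this ordering no longer depends on the garbage collector, here is a small self-contained sketch of the contextlib.closing version that logs to a list. The `events` list and the shorter break point (8 instead of 5) are choices made only for this illustration.

```python
import contextlib

events = []  # illustrative log, not part of the original code

def countdown(n):
    try:
        while n > 0:
            yield n
            n -= 1
    finally:
        events.append('finally')

def main():
    # closing() calls c.close() when the with block exits, even on break
    with contextlib.closing(countdown(10)) as c:
        for n in c:
            if n == 8:
                break
            events.append(n)
    events.append('finished')

main()
print(events)  # [10, 9, 'finally', 'finished']
```

Because close() is called explicitly when the with block exits, the finally clause runs before 'finished' is logged regardless of how or when the interpreter collects the generator object.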

1. As Tim Peters' answer points out, you're also relying on implementation-specific behavior of the CPython compiler in the second test.
