Python etc
Regular tips about Python and programming in general

Owner — @pushtaev

© CC BY-SA 4.0 — mention if repost
Hi there! As you’ve probably already noticed, there is no activity here at the moment. I hereby declare the second season to be officially over. I currently have no specific plans or ideas for the third season. DM @pushtaev if you do.

For the time being, you can enjoy the archive. Most of the channel's posts are still relevant today.
You are also welcome to consider joining my team. We still develop the voice assistant and beautiful devices for her to live in:
Finally, you can express your gratitude towards the authors by donating via the new telegram donation service. Thanks and hope to see you in the third season one day.
channel = '@pythonetc'
print(f'Happy New Year, {channel}!')

# these are our top posts from 2021
by_likes = {
    'join-lists': 236,
    'dev-mode': 181,
    'is-warning': 170,
    'str-concat': 166,
    'class-scope': 149,
}
by_forwards = {
    'class-scope': 111,
    'dev-mode': 53,
    'join-lists': 50,
    'str-concat': 44,
    'eval-strategy': 36,
}
by_views = {
    '__path__': 7_736,
    'dev-mode': 7_113,
    'immutable': 6_757,
    'class-scope': 6_739,
    'sre-parse': 6_661,
}

from datetime import date
from textwrap import dedent
if date.today().year == 2022:
    print(dedent("""
        The season 2.6 is coming!
        This is what awaits:
    """))
    print(
        'native telegram reactions instead of buttons',
        'deep dive into garbage collection, generators, and coroutines',
        'the season is still run by @orsinium',
        'as always, guest posts and donations are welcome',
        sep='\n',
    )

print('See you next year \N{Sparkling Heart}!')
The module atexit allows registering hooks that will be executed when the program terminates.

There are only a few cases when the hooks are NOT executed:

+ When os._exit (don't confuse with sys.exit) is called.
+ When the interpreter fails with a fatal error.
+ When the process is hard-killed. For example, someone executed kill -9 or the system has run out of memory.

In all other cases, like an unhandled exception or sys.exit, the registered hooks will be executed.

A few use cases:

+ Finish pending jobs
+ Send pending log messages into the log system
+ Save interactive interpreter history

However, keep in mind that there is no way to handle unhandled exceptions using atexit because the hooks are executed after the exception is printed and discarded.

import atexit

atexit.register(print, 'FINISHED')
1/0

Output:

Traceback (most recent call last):
  File "example.py", line 4, in <module>
    1/0
ZeroDivisionError: division by zero
FINISHED
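To see the difference between the exit paths listed above, here is a small sketch that runs each case in a child interpreter and captures its stdout. The helper run_child and the SETUP string are made up for this illustration:

```python
import subprocess
import sys

# Every child process registers the same hook before doing anything else.
SETUP = "import atexit, os, sys; atexit.register(print, 'FINISHED'); "


def run_child(snippet):
    """Run SETUP + snippet in a fresh interpreter and return its stdout."""
    result = subprocess.run(
        [sys.executable, '-c', SETUP + snippet],
        capture_output=True,
        text=True,
    )
    return result.stdout


after_sys_exit = run_child('sys.exit(1)')  # the hook runs
after_exception = run_child('1/0')         # the hook runs after the traceback
after_os_exit = run_child('os._exit(1)')   # the hook is skipped

print(repr(after_sys_exit))
print(repr(after_exception))
print(repr(after_os_exit))
```

The traceback of the second child goes to stderr, so only the hook's output shows up in stdout.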
The module faulthandler allows registering a handler that will dump the current stack trace into a given file (stderr by default) upon receiving a specific signal or every N seconds.

For example, dump the stack trace every 2 seconds:

import faulthandler
from time import sleep

faulthandler.dump_traceback_later(
    timeout=2,
    repeat=True,
)
for i in range(5):
    print(f"iteration {i}")
    sleep(1)

Output:

iteration 0
iteration 1
Timeout (0:00:02)!
Thread 0x00007f8289147740 (most recent call first):
  File "tmp.py", line 10 in <module>
iteration 2
iteration 3
Timeout (0:00:02)!
Thread 0x00007f8289147740 (most recent call first):
  File "tmp.py", line 10 in <module>
iteration 4
Now, let's see how to dump the stack trace when a specific signal is received. We will use SIGUSR1 but you can do the same for any signal.

import faulthandler
from signal import SIGUSR1
from time import sleep

faulthandler.register(SIGUSR1)
sleep(60)

Now, in a new terminal, find out the PID of the interpreter. If the file is named tmp.py, this is how you can do it (we add [] in grep to exclude the grep itself from the output):

ps -ax | grep '[t]mp.py'

The first number in the output is the PID. Now, use it to send the signal for PID 12345:

kill -SIGUSR1 12345

Back in the terminal with the running script, you will see the stack trace:

Current thread 0x00007f22edb29740 (most recent call first):
  File "tmp.py", line 6 in <module>

This trick can help you to see where your program has frozen without adding logs to every line. However, a better alternative can be something like py-spy which allows you to dump the current stack trace without any changes in the code.
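The same module can also dump the stack on demand into a file of your choice instead of stderr. A minimal sketch, assuming a temporary file is good enough as a target (the function name dump_current_stack is made up for this example):

```python
import faulthandler
import tempfile


def dump_current_stack():
    """Dump the stack of the current thread into a temp file and return it."""
    with tempfile.TemporaryFile(mode='w+') as stream:
        # faulthandler writes straight to the file descriptor,
        # so the target must be a real file with fileno().
        faulthandler.dump_traceback(file=stream, all_threads=False)
        stream.seek(0)
        return stream.read()


report = dump_current_stack()
print(report)
```

Note that an in-memory io.StringIO would not work here because it has no file descriptor.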
PEP-563 (landed in Python 3.7) introduced postponed evaluation of type annotations. That means your type annotations aren't evaluated at runtime but are stored as strings instead.

The initial idea was to make it the default behavior in Python 3.10, but that was postponed after a negative reaction from the community. In short, in some cases it would be impossible to get type information at runtime, which is crucial for tools like pydantic or typeguard. For example, see pydantic#2678.

Either way, starting from Python 3.7, you can activate this behavior by adding from __future__ import annotations at the beginning of a file. It will improve the import time and allow you to use objects in annotations that aren't defined yet.

For example:

class A:
    @classmethod
    def create(cls) -> A:
        return cls()


This code will fail at import time:

Traceback (most recent call last):
  File "tmp.py", line 1, in <module>
    class A:
  File "tmp.py", line 3, in A
    def create(cls) -> A:
NameError: name 'A' is not defined


Now add the magic import, and it will work:

from __future__ import annotations

class A:
    @classmethod
    def create(cls) -> A:
        return cls()


Another solution is to manually make annotations strings. So, instead of -> A: you could write -> 'A':.
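Here is a small sketch of what a string annotation looks like at runtime and how tools that need real types can resolve it. It uses the manual string form so that it works without the __future__ import, and it assumes the class is defined at module level so that typing.get_type_hints can find the name A:

```python
from typing import get_type_hints


class A:
    @classmethod
    def create(cls) -> 'A':
        return cls()


# The raw annotation is stored as a plain string...
raw = A.create.__annotations__
# ...and get_type_hints evaluates it into the actual class.
resolved = get_type_hints(A.create)

print(raw)
print(resolved)
```

This resolution step is what can fail when the annotated name is not reachable at runtime.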
Often, your type annotations will have circular dependencies. For example, Article has an attribute category: Category, and Category has attribute articles: list[Article]. If both classes are in the same file, adding from __future__ import annotations would solve the issue. But what if they are in different modules? Then you can hide imports that you need only for type annotations inside of the if TYPE_CHECKING block:

from __future__ import annotations
from dataclasses import dataclass
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    from .category import Category

@dataclass
class Article:
    category: Category


Fun fact: this constant is defined as TYPE_CHECKING = False. So the code inside the block is never executed at runtime, but the type checker is a static analyzer and doesn't care: it reads the imports anyway.
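To convince yourself that the block is really skipped at runtime, you can put an import of a module that doesn't exist inside it. In this sketch, hypothetical_models is a made-up module name, and the annotation is written as a string so that the snippet doesn't depend on the __future__ import:

```python
from typing import TYPE_CHECKING

if TYPE_CHECKING:
    # Never executed at runtime, so the missing module is not a problem.
    from hypothetical_models import Category


class Article:
    def __init__(self, category: 'Category') -> None:
        self.category = category


article = Article(category='python')
print(TYPE_CHECKING)
print(Article.__init__.__annotations__['category'])
```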
PEP-604 (landed in Python 3.10) introduced a new short syntax for typing.Union (as I predicted, but I messed up union with intersection, shame on me):

def greet(name: str) -> str | None:
    if not name:
        return None
    return f"Hello, {name}"


You can already use it in annotations in older Python versions (3.7+) by adding from __future__ import annotations. Type checkers will understand you.
LookupError is a base class for IndexError and KeyError:

LookupError.__subclasses__()
# [IndexError, KeyError, encodings.CodecRegistryError]

KeyError.mro()
# [KeyError, LookupError, Exception, BaseException, object]

IndexError.mro()
# [IndexError, LookupError, Exception, BaseException, object]


The main purpose of this intermediate exception is to simplify lookups in deeply nested structures where either of these two exceptions may occur:

try:
    username = resp['posts'][-1]['authors'][0]['name']
except LookupError:
    username = None
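The same lookup as a runnable sketch. The function name and the sample responses are made up for this example:

```python
def last_post_author(resp):
    """Return the name of the first author of the last post, or None."""
    try:
        return resp['posts'][-1]['authors'][0]['name']
    except LookupError:
        # Covers both a missing key (KeyError)
        # and an empty list (IndexError).
        return None


full = {'posts': [{'authors': [{'name': 'orsinium'}]}]}
no_posts = {'posts': []}
no_key = {}

print(last_post_author(full))
print(last_post_author(no_posts))
print(last_post_author(no_key))
```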
The operator is checks if the two given objects are the same object in memory:

{} is {}  # False
d = {}
d is d    # True

Since types are also objects, you can use it to compare types:

type(1) is int        # True
type(1) is float      # False
type(1) is not float  # True

And you can also use == for comparing types:

type(1) == int  # True

So, when to use is and when to use ==? There are some best practices:

+ Use is to compare with None: var is None.

+ Use is to compare with True and False. However, don't explicitly check for True and False in conditions, prefer just if user.admin instead of if user.admin is True. Still, the latter can be useful in tests: assert actual is True.

+ Use isinstance to compare types: if isinstance(user, LoggedInUser). The big difference is that it allows subclasses. So if you have a class Admin which is a subclass of LoggedInUser, it will pass the isinstance check.

+ Use is in some rare cases when you explicitly want to allow only the given type without subclasses: type(user) is Admin. Keep in mind that mypy will refine the type only for isinstance but not for type is.

+ Use is to compare enum members: color is Color.RED.

+ Use == in ORMs and query builders like sqlalchemy: session.query(User).filter(User.admin == True). The reason is that the behavior of is cannot be redefined using magic methods but the behavior of == can (using __eq__).

+ Use == in all other cases. In particular, always use == to compare values: answer == 42.
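A few of the rules above in one runnable sketch. The classes are made up for illustration, and Column is only a toy stand-in for what a query builder does, not the real sqlalchemy API:

```python
class LoggedInUser:
    pass


class Admin(LoggedInUser):
    pass


class Column:
    """A toy query builder column (hypothetical)."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, other):
        # Build an expression instead of returning a bool.
        return f'{self.name} = {other!r}'


user = Admin()
allows_subclasses = isinstance(user, LoggedInUser)  # subclasses pass
exact_type_only = type(user) is LoggedInUser        # subclasses don't
expression = Column('admin') == True  # goes through __eq__
identity = Column('admin') is True    # cannot be customized

print(allows_subclasses, exact_type_only)
print(expression, identity)
```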
The del statement is used to delete things. It has a few distinct behaviors, depending on the specified target.

If a variable is specified, it will be removed from the scope in which it is defined:

a = []
del a
a
# NameError: name 'a' is not defined

If the target has the form target[index], target.__delitem__(index) will be called. It is defined for built-in collections to remove items from them:

a = [1, 2, 3]
del a[0]
a # [2, 3]

d = {1: 2, 3: 4}
del d[3]
d # {1: 2}

Slices are also supported:

a = [1, 2, 3, 4]
del a[2:]
a # [1, 2]

And the last behavior: if target.attr is specified, target.__delattr__(attr) will be called. It is defined for object:

class A:
    b = 'default'
a = A()
a.b = 'overwritten'
a.b  # 'overwritten'
del a.b
a.b  # 'default'
del a.b  # AttributeError
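Since both forms are just method calls, you can hook into them in your own classes. A minimal sketch with a made-up class Tracked that records every deletion:

```python
class Tracked:
    def __init__(self):
        self.items = {'a': 1, 'b': 2}
        self.deleted = []

    def __delitem__(self, key):
        # Called for `del obj[key]`.
        self.deleted.append(f'item {key}')
        del self.items[key]

    def __delattr__(self, name):
        # Called for `del obj.name`.
        self.deleted.append(f'attr {name}')
        super().__delattr__(name)


t = Tracked()
t.extra = 'x'
del t['a']
del t.extra

print(t.items)
print(t.deleted)
```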
The method __del__ is called on the object right before it is destroyed. In CPython, that happens as soon as the last reference to the object is removed:

class A:
    def __del__(self):
        print('destroying')

a = b = A()
del a
del b
# destroying

def f():
    a = A()

f()
# destroying


The method is used by Python's file object to close the descriptor when you don't need it anymore:

def f():
    a_file = open('a_file.txt')
    ...


However, you cannot safely rely on the destructor (this is what it's called in other languages, like C++) ever being called. For instance, this might not hold in PyPy or MicroPython, or for objects stuck in reference cycles if the garbage collector is disabled using gc.disable().

The rule of thumb is to use the destructor only for unimportant things. For example, aiohttp.ClientSession uses __del__ to warn about an unclosed session:

def __del__(self) -> None:
    if not self.closed:
        warnings.warn(
            f"Unclosed client session {self!r}", ResourceWarning
        )
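The same pattern in a self-contained sketch. Connection is a made-up class, not the aiohttp implementation, and the example relies on CPython calling __del__ right after del:

```python
import warnings


class Connection:
    def __init__(self):
        self.closed = False

    def close(self):
        self.closed = True

    def __del__(self):
        # Only warn; the real cleanup belongs in close().
        if not self.closed:
            warnings.warn(
                f'Unclosed connection {self!r}', ResourceWarning,
            )


with warnings.catch_warnings(record=True) as caught:
    # ResourceWarning is ignored by default, so enable everything.
    warnings.simplefilter('always')

    leaked = Connection()
    del leaked  # never closed, so a warning is emitted

    closed = Connection()
    closed.close()
    del closed  # closed properly, no warning

categories = [w.category for w in caught]
print(categories)
```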
By using __del__ and global variables, it is possible to leave a reference to the object after it was "destroyed":

runner = None
class Lazarus:
    def __del__(self):
        print('destroying')
        global runner
        runner = self

lazarus = Lazarus()
print(lazarus)
# <__main__.Lazarus object at 0x7f853df0a790>
del lazarus
# destroying
print(runner)
# <__main__.Lazarus object at 0x7f853df0a790>


In the example above, runner points to the same object as lazarus did, so the object is not destroyed. If you remove this reference as well, the object will be deallocated, but __del__ will not be called again: since PEP-442 (Python 3.4), CPython runs the destructor of an object at most once:

del runner  # it will NOT produce "destroying" message


This can lead to a strange situation when you have an object that is still alive and in use, even though its destructor has already been executed.

In Python 3.9, the function gc.is_finalized was introduced. It tells you whether the destructor of the given object has already been executed:

import gc
lazarus = Lazarus()
gc.is_finalized(lazarus) # False
del lazarus
gc.is_finalized(runner) # True


It's hard to imagine a situation when you'll need it, though. The main conclusion you can make out of it is that you can break things with a destructor, so don't overuse it.
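If you want to observe the whole life cycle, here is a sketch that counts destructor calls and watches the object through a weak reference. It assumes CPython 3.9+ (for gc.is_finalized), and the lists calls and survivors are made up for this example:

```python
import gc
import weakref

calls = []
survivors = []


class Lazarus:
    def __del__(self):
        calls.append('destroying')
        survivors.append(self)  # resurrect the object


lazarus = Lazarus()
ref = weakref.ref(lazarus)

del lazarus  # __del__ runs and the object escapes into survivors
alive_after_first = ref() is not None
finalized = gc.is_finalized(survivors[0])

survivors.clear()  # drop the last reference; __del__ doesn't run again
alive_after_second = ref() is not None

print(calls, alive_after_first, finalized, alive_after_second)
```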
The module warnings allows you to print, you've guessed it, warnings. Most often, it is used to warn users of a library that the module, function, or argument they use is deprecated.

import warnings

def g():
    return 2

def f():
    warnings.warn(
        "f is deprecated, use g instead",
        DeprecationWarning,
    )
    return g()

f()

The output:

example.py:7: DeprecationWarning: f is deprecated, use g instead
  warnings.warn(

Note that DeprecationWarning, as well as other warning categories, is built-in and doesn't need to be imported from anywhere.

When running tests, pytest will collect all warnings and report them at the end. If you want to get the full traceback to the warning or enter there with a debugger, the easiest way to do so is to turn all warnings into exceptions:

warnings.filterwarnings("error")

In production, you can suppress warnings. Or, better, turn them into proper log records, so they will be collected wherever you collect logs:

import logging
logging.captureWarnings(True)
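If you don't want to change the filters for the whole process, the "error" filter can be limited to a block of code. A minimal sketch using warnings.catch_warnings, which restores the previous filters on exit:

```python
import warnings


def f():
    warnings.warn(
        'f is deprecated, use g instead',
        DeprecationWarning,
    )
    return 2


with warnings.catch_warnings():
    # Inside this block, every warning is raised as an exception.
    warnings.filterwarnings('error')
    try:
        f()
    except DeprecationWarning as exc:
        caught = exc
    else:
        caught = None

print(repr(caught))
```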
The string.Template class allows you to do $-style substitutions:

from string import Template
t = Template('Hello, $channel!')

t.substitute(dict(channel='@pythonetc'))
# 'Hello, @pythonetc!'

t.safe_substitute(dict())
# 'Hello, $channel!'

Initially, it was introduced to simplify translations of strings. However, PO files now natively support the python-format flag, which tells translators that the string has str.format-style substitutions. And on top of that, str.format is much more powerful and flexible.

Nowadays, the main purpose of Template is to confuse newbies with one more way to format a string. Jokes aside, there are a few more cases when it can come in handy:

+ Template.safe_substitute can be used when the template might have variables that aren't defined and should be ignored.
+ The substitution format is similar to the string substitution in bash (and other shells), which is useful in some cases. For instance, if you want to write your own dotenv.
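Both points in one sketch. The template string and the values are made up for this example:

```python
from string import Template

# ${host} uses braces to separate the name from the text that follows,
# just like in a shell.
template = Template('$user@${host}_backup:$path')

# safe_substitute leaves unknown placeholders untouched...
partial = template.safe_substitute(user='gram', host='srv')

# ...while substitute raises KeyError for them.
try:
    template.substitute(user='gram', host='srv')
except KeyError as exc:
    missing = exc.args[0]

print(partial)
print(missing)
```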
A long time ago, we already covered the chaining of comparison operations:
https://tttttt.me/pythonetc/411

A quick summary is that the right operand of each comparison is reused as the left operand of the next one:

13 > 2 > 1  # same as `13 > 2 and 2 > 1`
# True

13 > 2 > 3  # same as `13 > 2 and 2 > 3`
# False


What's interesting is that is and in are also comparison operators, so they can be chained too, which can lead to unexpected results:

a = None
a is None          # True, as expected
a is None is True  # False 🤔
a is None == True  # False 🤔
a is None is None  # True 🤯


The best practice is to use operator chaining only to check if a value is in a range using < and <=:

teenager = 13 <= age <= 19
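One more detail worth checking for yourself: in a chain, the middle expression is evaluated only once, and adding parentheses turns the chain back into ordinary nested comparisons. A small sketch (the function middle is made up for this example):

```python
calls = []


def middle():
    calls.append('middle')
    return 2


# Behaves like `13 > x and x > 1` where x = middle() is computed once.
chained = 13 > middle() > 1

a = None
surprising = a is None is True   # a is None and None is True
explicit = (a is None) is True   # True is True

print(chained, calls)
print(surprising, explicit)
```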