Python annoyances
Day (date.today() - date(2021, 7, 14)).days of not understanding why Python is used in production
pathlib
>>> from pathlib import PurePath
>>> PurePath(b"")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/nix/store/y3inmdhijqkb4qj36yphj4cbllljhqzz-python3-3.9.6/lib/python3.9/pathlib.py", line 665, in __new__
    return cls._from_parts(args)
  File "/nix/store/y3inmdhijqkb4qj36yphj4cbllljhqzz-python3-3.9.6/lib/python3.9/pathlib.py", line 697, in _from_parts
    drv, root, parts = self._parse_args(args)
  File "/nix/store/y3inmdhijqkb4qj36yphj4cbllljhqzz-python3-3.9.6/lib/python3.9/pathlib.py", line 686, in _parse_args
    raise TypeError(
TypeError: argument should be a str object or an os.PathLike object returning str, not <class 'bytes'>
This is fine because every file system on the planet is UTF-8, clearly.
I heard a counterargument that "it says in the docs that for
'low-level path manipulation on strings, you can also use the os.path
module.'" I take a few issues with this: nowhere in the docs are there explicit
mentions that exceptions will be raised when passing bytes instead of str;
nowhere in the docs are there explicit type annotations suggesting that you can
only use str; and the phrasing of that little warning uses such passive
language that it doesn't seem like there's any real reason to care about this
case in the first place.
I would expect a library designed specifically for dealing with paths to be
able to deal with paths, so I find this behavior to be... surprising.
A counterargument I heard to this is that "pathlib just provides a high level
OOP interface to paths" but I don't understand how that's mutually exclusive
with handling bytes/non-UTF-8.
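For contrast, here is a minimal sketch of what handling such a name looks like today, assuming a POSIX system. The file name is made up for illustration. It shows pathlib rejecting bytes, os.path accepting them, and the "surrogateescape" error handler (the same trick os.fsdecode() and os.fsencode() rely on under POSIX) smuggling the undecodable byte through a str so that pathlib will take it.

```python
import os
from pathlib import PurePosixPath

# A POSIX file name is just bytes; this made-up one is not valid UTF-8.
raw_name = b"caf\xe9.txt"

# pathlib rejects bytes outright.
try:
    PurePosixPath(raw_name)
except TypeError as exc:
    pathlib_error = exc
else:
    pathlib_error = None
print(type(pathlib_error).__name__)

# os.path is happy with bytes, as long as every argument is bytes.
joined = os.path.join(b"/tmp", raw_name)
print(joined)

# Workaround: decode with "surrogateescape" so the bad byte survives as a
# lone surrogate inside a str, which pathlib accepts.
as_text = raw_name.decode("utf-8", "surrogateescape")
path = PurePosixPath("/tmp") / as_text

# Encoding with the same handler recovers the original bytes exactly.
round_tripped = path.name.encode("utf-8", "surrogateescape")
print(round_tripped == raw_name)
```

The round trip works, but the burden of remembering to decode and encode at every boundary falls on the caller rather than on the path library.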
datetime
I'd like to convert an ISO 8601 timestamp string to the appropriate Python
object. Looks like datetime.fromisoformat is the way to do that. But wait:
Caution: This does not support parsing arbitrary ISO 8601 strings - it is only intended as the inverse operation of datetime.isoformat(). A more full-featured ISO 8601 parser, dateutil.parser.isoparse is available in the third-party package dateutil.
If you're going to add a function for this to the standard library, you'd think
you'd want to avoid half-assing it, no? Since it's built-in, it's way more
likely to be used than any third party package. Anyway, now that I've got my ISO
8601 string (which includes timezone information) converted into a datetime
object, let's compare it against the current time:
(Pdb) p cache_expires_at
datetime.datetime(2022, 4, 20, 21, 41, 52, 955721, tzinfo=datetime.timezone.utc)
(Pdb) p cache_expires_at < datetime.utcnow()
*** TypeError: can't compare offset-naive and offset-aware datetimes
(Pdb) datetime.utcnow()
datetime.datetime(2022, 4, 20, 20, 43, 23, 982491)
... what!? Why does datetime.utcnow() not have timezone information?
Shouldn't it know what timezone the datetime it's creating is in since it
literally has utc in the name? Okay, it looks like the docs actually
address this:
Warning: Because naive datetime objects are treated by many datetime methods as local times, it is preferred to use aware datetimes to represent times in UTC. As such, the recommended way to create an object representing the current time in UTC is by calling datetime.now(timezone.utc).
Well, sort of, anyway. Why even provide this method if it omits timezone
information, then? Why are naive datetimes treated as local time? I bet there
are some horrifying edge cases there. Another big point of pain is that since
naive and aware timestamps are both the same type, tools like pyright can't
even warn about this stuff statically. You need good code coverage (hard, rare)
or manual testing (ew) to be able to detect this sort of error. Similarly,
pydantic can't easily enforce timestamps to be timezone-aware since again,
there's a single type for both cases. It's incredibly silly to allow this sort
of error to even happen when it could so easily be prevented by having two
separate types.
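As a rough illustration of the "two separate types" idea, here is a sketch that fakes the distinction with typing.NewType. The names AwareDatetime, ensure_aware, and is_expired are invented for this example and are not part of the standard library. At runtime the value is still a plain datetime, but a static checker should treat an unvalidated datetime as a different type from one that has passed through ensure_aware.

```python
from datetime import datetime, timezone
from typing import NewType

# Hypothetical marker type: still a plain datetime at runtime, but distinct
# from an unvalidated datetime as far as a static checker is concerned.
AwareDatetime = NewType("AwareDatetime", datetime)


def ensure_aware(value: datetime) -> AwareDatetime:
    """Return value unchanged if it is timezone-aware, else raise ValueError."""
    # The datetime docs call an object aware when tzinfo is set and
    # utcoffset() does not return None.
    if value.tzinfo is None or value.utcoffset() is None:
        raise ValueError(f"naive datetime not allowed: {value!r}")
    return AwareDatetime(value)


def is_expired(expires_at: AwareDatetime, now: AwareDatetime) -> bool:
    """Compare two datetimes that are both known to be aware."""
    return expires_at < now


# The way the docs recommend getting the current time in UTC.
now = ensure_aware(datetime.now(timezone.utc))
expires_at = ensure_aware(
    datetime(2022, 4, 20, 21, 41, 52, 955721, tzinfo=timezone.utc)
)
print(is_expired(expires_at, now))

# A naive value, which is the kind datetime.utcnow() returns, is rejected at
# the boundary instead of blowing up later in a comparison.
naive = datetime(2022, 4, 20, 20, 43, 23, 982491)
try:
    ensure_aware(naive)
except ValueError as exc:
    print(exc)
```

This only helps for code that opts in, which is exactly the complaint: the standard library could have made the distinction for everyone.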
PyPI
For some reason, PyPI allows packages to be uploaded with version requirements
that almost definitely will not work. If I make a package that depends on * or
>1 or such of some other dependency, PyPI will happily accept my upload. The
problem is that, as soon as that dependency releases 2.0, my package is sure
to break. For a real world example of this, see here.
PEP 440 defines Python's own special versioning scheme (instead of just using SemVer like everyone else) with liberal usage of the word "MUST" but then official Python tooling (like PyPI) opts not to enforce it at all. What's even the point, then? Also, what even is a "post release"? Asking Google "post release meaning" gives me a bunch of stuff about prisoners, and appending "software" to the query doesn't help either. After eventually finding the explanation in the PEP, the answer is "it's functionally identical to SemVer Patch releases except we decided to make it a separate thing for no reason".
Since Python decided not to use SemVer, it now also needs to invent its own
syntax for specifying allowable dependency versions. It's a mess, and
quite easy to misuse since nobody knows what ~= means, nor realizes you can use
, to add additional constraints. This could all have been neatly avoided by
adopting SemVer instead. Speaking of ~=, here's a cheap shot:
The spelling of the compatible release clause (~=) is inspired by the Ruby (~>) and PHP (~) equivalents.
— PEP 440
Ah yes, PHP, the paragon of good design. Smartly,
Poetry lets you just use the standard SemVer syntax for this (^).
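For anyone who, like most people, doesn't know what ~= means, here is a simplified sketch of the two operators on purely numeric versions. It ignores pre, post, and dev releases, epochs, and everything else PEP 440 allows. The function names are made up for this example, and the caret rule is my reading of Poetry's documented behaviour rather than its actual implementation.

```python
from typing import Tuple

Version = Tuple[int, ...]


def parse(version: str) -> Version:
    """Parse a purely numeric dotted version such as "2.15.1"."""
    return tuple(int(part) for part in version.split("."))


def _pad(version: Version, length: int) -> Version:
    """Right-pad with zeros so that 2.2 and 2.2.0 compare equal."""
    return version + (0,) * (length - len(version))


def tilde_equals(spec: str, candidate: str) -> bool:
    """Simplified PEP 440 `~=`: `~=X.Y.Z` means `>=X.Y.Z, ==X.Y.*`."""
    base, cand = parse(spec), parse(candidate)
    if len(base) < 2:
        raise ValueError("~= requires at least two release segments")
    width = max(len(base), len(cand))
    base_p, cand_p = _pad(base, width), _pad(cand, width)
    prefix_len = len(base) - 1  # all but the last written segment is pinned
    return cand_p >= base_p and cand_p[:prefix_len] == base_p[:prefix_len]


def caret(spec: str, candidate: str) -> bool:
    """Simplified caret rule: the leftmost non-zero segment is pinned."""
    base, cand = parse(spec), parse(candidate)
    width = max(len(base), len(cand))
    base_p, cand_p = _pad(base, width), _pad(cand, width)
    # If every written segment is zero, pin up to the last written one.
    pivot = next((i for i, seg in enumerate(base) if seg != 0), len(base) - 1)
    return cand_p >= base_p and cand_p[: pivot + 1] == base_p[: pivot + 1]


for candidate in ("2.2", "2.9", "3.0"):
    print(candidate, tilde_equals("2.2", candidate), caret("2.2", candidate))
for candidate in ("2.2.5", "2.3.0"):
    print(candidate, tilde_equals("2.2.0", candidate), caret("2.2.0", candidate))
```

The second loop is where the two diverge: how much ~= pins depends on how many segments you happened to write, whereas the caret rule looks at the leftmost non-zero segment.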
Poetry
poetry remove has no --lock option.
Adding dependencies
poetry add can take forever. Trying to add new dependencies is a nightmare,
and that's due to both the aforementioned performance issues and the fact that,
due to the way that Python imports work, it is impossible to have multiple
versions of a single package installed at a time. As a direct result of these
things, I just spent over five minutes trying to install dependencies.
Observe:
$ poetry add --lock --source REDACTED [REDACTED_0..REDACTED_6]
...
Updating dependencies
Resolving dependencies... (40.4s)
...
SolverProblemError
Then after some vim pyproject.toml to comment out things that caused the
SolverProblemError:
$ poetry add --lock --source REDACTED [REDACTED_0..REDACTED_6]
...
Updating dependencies
Resolving dependencies... (115.3s)
...
Writing lock file
Cool, this time it worked, but I'm still not done getting the dependencies I need. So let's add them back:
$ poetry add --lock attrs marshmallow
...
Updating dependencies
Resolving dependencies... (39.0s)
...
SolverProblemError
Okay fine so I need to manually specify an older version of marshmallow
because for some reason poetry just picks the newest one instead of trying to
find the newest compatible one. Let's try again with the version it says is
causing the conflict:
$ poetry add --lock attrs 'marshmallow^2'
...
Updating dependencies
Resolving dependencies... (35.8s)
...
SolverProblemError
Okay so now attrs is having the same problem. Following the same pattern:
$ poetry add --lock 'attrs^19' 'marshmallow^2'
...
Updating dependencies
Resolving dependencies... (106.8s)
...
Writing lock file
Thank fuck, it's finally over. Well, for this project. We have a lot of projects
that need to be converted to poetry. It'll be worth it though because
pip/pip-compile is worse, and poetry2nix is nice.
Just for fun, let's try something similar in a different language:
$ time cargo add rand syn rand_core libc cfg-if quote proc-macro2 unicode-xid serde bitflags
...
... 1.968 total
$ time cargo update # to rebuild the lockfile
...
... 0.704 total
Under 2 seconds. No literally unfixable issues with incompatible transitive dependencies. It Just Works™. Incredible.
Black <22.3.0 incompatible with Click >=8.1
[T]he most recent release of Click, 8.1.0, is breaking Black. This is because Black imports an internal module so Python 3.6 users with misconfigured LANG continues to work mostly properly. The code that patches click was supposed to be resilient to the module disappearing but the code was catching the wrong exception.
I find the quantity of backlinks to this issue to be greatly amusing. (There are probably way more than shown, too, due to the existence of private repositories.) This is what happens when hobbyists and the industry take a language seriously even though it lacks:
- A language-enforced concept of item privacy
- The ability to have multiple versions of a package in the dependency tree
- Statically checkable error types
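The "catching the wrong exception" failure mode from the quote is easy to reproduce. The sketch below is not Black's actual code. It uses the standard library's json package and a made-up private name to show one way a guard around an internal import can look resilient and still miss: importing a missing name from an existing package raises ImportError, and an except clause for its subclass ModuleNotFoundError does not catch that.

```python
def fragile_patch() -> str:
    """Guard an internal import with an exception type that is too narrow."""
    try:
        # json is a real stdlib package; _no_such_internal is a made-up name
        # standing in for a private module that a new release removed.
        from json import _no_such_internal  # noqa: F401
    except ModuleNotFoundError:
        return "skipped patch"  # the fallback the author intended
    return "patched"


def resilient_patch() -> str:
    """Same guard, but catching the base class ImportError."""
    try:
        from json import _no_such_internal  # noqa: F401
    except ImportError:
        return "skipped patch"
    return "patched"


try:
    fragile_outcome = fragile_patch()
except ImportError as exc:
    # The narrow guard let the exception escape.
    fragile_outcome = f"crashed with {type(exc).__name__}"

print(fragile_outcome)
print(resilient_patch())
```

Nothing checks the except clause against what the import can actually raise, and nothing stops the import of an underscore-prefixed name to begin with.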
Combinatorial exhaustiveness
Let's see what various typecheckers think about the following code:
from typing import Literal, Tuple, Union

SumType = Union[
    Tuple[Literal["foo"], str],
    Tuple[Literal["bar"], int],
]

def assert_int(_: int): pass
def assert_str(_: str): pass

def assert_combinatorial_exhaustion(
    first: SumType,
    second: SumType,
):
    match (first, second):
        case (("foo", x), ("foo", y)):
            assert_str(x)
            assert_str(y)
        case (("foo", x), ("bar", y)):
            assert_str(x)
            assert_int(y)
        case (("bar", x), ("foo", y)):
            assert_int(x)
            assert_str(y)
        case (("bar", x), ("bar", y)):
            assert_int(x)
            assert_int(y)
Pytype
I couldn't get this to run on NixOS, so I don't know.
Rating: ?/10
Pyre
I couldn't get this to run on NixOS either, but they do have a web based version for some reason. Here's what it says:
21:23: Incompatible parameter type [6]: In call `assert_str`, for 1st positional only parameter expected `str` but got `Union[int, str]`.
22:23: Incompatible parameter type [6]: In call `assert_str`, for 1st positional only parameter expected `str` but got `Union[int, str]`.
24:23: Incompatible parameter type [6]: In call `assert_str`, for 1st positional only parameter expected `str` but got `Union[int, str]`.
25:23: Incompatible parameter type [6]: In call `assert_int`, for 1st positional only parameter expected `int` but got `Union[int, str]`.
27:23: Incompatible parameter type [6]: In call `assert_int`, for 1st positional only parameter expected `int` but got `Union[int, str]`.
28:23: Incompatible parameter type [6]: In call `assert_str`, for 1st positional only parameter expected `str` but got `Union[int, str]`.
30:23: Incompatible parameter type [6]: In call `assert_int`, for 1st positional only parameter expected `int` but got `Union[int, str]`.
31:23: Incompatible parameter type [6]: In call `assert_int`, for 1st positional only parameter expected `int` but got `Union[int, str]`.
pyre is clearly unable to do type narrowing in the match arms. There are no
warnings about exhaustiveness, however; is that working properly? Let's pass it
the simplest possible code to test for that:
def assert_exhaustion(
    x: bool,
) -> None:
    match x:
        case True:
            pass
Output:
No Errors!
Rating: 0/10
Mypy
playground/main.py:23: error: INTERNAL ERROR -- Please try using mypy master on Github:
https://mypy.readthedocs.io/en/stable/common_issues.html#using-a-development-mypy-build
If this issue continues with mypy master, please report a bug at https://github.com/python/mypy/issues
version: 0.941
playground/main.py:23: : note: please use --show-traceback to print a traceback when reporting a bug
Yep, you're reading that right; mypy just crashes.
Rating: Comically bad/10
Pyright
error: Cases within match statement do not exhaustively handle all values
Unhandled type: "tuple[SumType, SumType]"
If exhaustive handling is not intended, add "case _: pass" (reportMatchNotExhaustive)
pyright's lack of errors about the assert_{str,int} functions indicates that
it is correctly doing type narrowing, so that's good. However, it states pretty
clearly that it thinks this match is not exhaustive. Tragically, someone
reported this already and it got closed as wontfix.
Rating: 5 pity points since it can at least narrow types and not crash/10
rustc
#![allow(unused)]
fn main() {
    enum SumType {
        Foo(String),
        Bar(i32),
    }

    fn assert_int(_: i32) {}
    fn assert_str(_: String) {}

    fn assert_combinatorial_exhaustiveness(
        first: SumType,
        second: SumType,
    ) {
        match (first, second) {
            (SumType::Foo(x), SumType::Foo(y)) => {
                assert_str(x);
                assert_str(y);
            },
            (SumType::Foo(x), SumType::Bar(y)) => {
                assert_str(x);
                assert_int(y);
            },
            (SumType::Bar(x), SumType::Foo(y)) => {
                assert_int(x);
                assert_str(y);
            },
            (SumType::Bar(x), SumType::Bar(y)) => {
                assert_int(x);
                assert_int(y);
            },
        }
    }

    println!("The match is exhaustive and the types check out.");
    println!(
        "If this weren't the case, you'd be seeing a compiler error message here."
    );
}
(Hit the play button in the top right corner.)
Rating: 10/10