CF Bolz-Tereick

@cfbolz@mastodon.social

On Twitter I had a thread going this year in which I tried to reflect on bugs that I found throughout the year, how to avoid this kind of bug, what can be learned, etc. I will port this idea over to here and see how it goes in the future (I'm still both here and on Twitter, we'll see how that goes).

December 4, 2022 at 12:46:13 PM

Recently I fixed a bug in PyPy's time.strftime. It was using some unicode helper function that takes as argument a byte buffer with some utf-8 encoded string, as well as the number of code points. strftime was using this API wrong and passing the number of bytes instead.

foss.heptapod.net/pypy/pypy/-/

After finding the bug we tried to make this API more robust by having a check in the function that counts the codepoints in the byte buffer and complains if that is different from the second argument. This shouldn't be on by default for performance reasons, but it's on during testing.

The reason why the bug got away for so long is that if you test only with ASCII chars it works, because number of bytes == number of codepoints in that case. Lesson: write tests with wider ranges of characters.
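To make the failure mode concrete, here is a minimal sketch in plain Python. It is not PyPy's actual helper; the names `count_codepoints` and `checked_decode` are made up for illustration. It counts code points by skipping UTF-8 continuation bytes and complains when the caller's count disagrees, roughly the kind of check described above.

```python
def count_codepoints(utf8_bytes):
    """Count code points in a UTF-8 byte string.

    Every code point has exactly one byte that is not a
    continuation byte (continuation bytes look like 0b10xxxxxx).
    """
    return sum(1 for b in utf8_bytes if (b & 0xC0) != 0x80)


def checked_decode(utf8_bytes, num_codepoints, check=True):
    """Hypothetical stand-in for the helper: takes a UTF-8 buffer plus
    the number of code points the caller claims it contains."""
    if check:  # assumed to be switched on only during testing
        actual = count_codepoints(utf8_bytes)
        if actual != num_codepoints:
            raise ValueError(
                "expected %d code points, buffer has %d"
                % (num_codepoints, actual)
            )
    return utf8_bytes.decode("utf-8")


ascii_buf = "2022-12-04".encode("utf-8")
non_ascii_buf = "d\u00e9cembre".encode("utf-8")  # 8 code points, 9 bytes

# Passing the byte count "works" for ASCII, which hides the bug.
checked_decode(ascii_buf, len(ascii_buf))

# With a non-ASCII character the same mistake is caught.
try:
    checked_decode(non_ascii_buf, len(non_ascii_buf))
except ValueError as exc:
    print("caught:", exc)
```

A test suite that only ever feeds in `ascii_buf` can never tell the correct call from the wrong one.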

Another bug, this time in itertools.tee: tee has an optimization that uses a __copy__ method on the iterator if it has one, instead of carefully using its generic implementation. However, PyPy got it wrong and copied the iterable instead of the iterator.

foss.heptapod.net/pypy/pypy/-/

This works in simple tests, but in more complicated situations it gives nonsense.
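Here is a rough sketch of why simple tests don't notice the difference. It is not PyPy's or CPython's tee code: `CountUp`, `Numbers`, `tee_copy_iterator` and `tee_copy_iterable` are hypothetical names, and the two functions only model the `__copy__` shortcut, not the generic buffering implementation.

```python
import copy


class CountUp:
    """Hypothetical copyable iterator yielding current, ..., stop - 1."""

    def __init__(self, current, stop):
        self.current = current
        self.stop = stop

    def __iter__(self):
        return self

    def __next__(self):
        if self.current >= self.stop:
            raise StopIteration
        value = self.current
        self.current += 1
        return value

    def __copy__(self):
        # A copy continues from the same position, independently.
        return CountUp(self.current, self.stop)


class Numbers:
    """Hypothetical iterable (not an iterator) that is also copyable."""

    def __init__(self, stop):
        self.stop = stop

    def __iter__(self):
        return CountUp(0, self.stop)

    def __copy__(self):
        return Numbers(self.stop)


def tee_copy_iterator(iterable):
    # Intended shortcut: get the iterator first, then copy *it*.
    it = iter(iterable)
    return it, copy.copy(it)


def tee_copy_iterable(iterable):
    # Mistaken variant: copies whatever was passed in.
    return iterable, copy.copy(iterable)
```

If the argument is already an iterator (the typical simple test), `iter(x)` is `x` and both variants behave the same. Passing a `Numbers` instance shows the difference: the mistaken variant hands back two `Numbers` objects, which are not iterators at all.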

馃, also present in CPython. On Linux, if you pass MSG_TRUNC as a flag to socket.recv (which calls recv in its implementation) it will return the size of the packet, not the number of bytes written into the output buf.

foss.heptapod.net/pypy/pypy/-/
This confused the logic in socket.recv: it leads to an assertion error in PyPy (trying to read too many chars from the output buffer) and to garbled characters in CPy. Fixed in PyPy by not reading more than the buffer size from the buffer in that case.
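The idea behind the fix can be sketched in a few lines of Python. This is only an illustration, not the actual PyPy change: `payload_from_recv` is a made-up helper, and `reported` stands for whatever the underlying recv call returned.

```python
def payload_from_recv(buf, reported):
    """Return the bytes that were actually stored in buf.

    `reported` is the value returned by recv. With MSG_TRUNC on
    Linux it can be larger than len(buf), because it is the size of
    the whole packet rather than the number of bytes written.
    """
    if reported < 0:
        raise ValueError("recv reported a negative length")
    # Never read past the end of the buffer we handed to recv.
    valid = min(reported, len(buf))
    return bytes(buf[:valid])


buf = bytearray(b"hell")            # 4-byte buffer, already filled
print(payload_from_recv(buf, 11))   # packet was 11 bytes, only 4 stored
print(payload_from_recv(buf, 2))    # short packet, 2 bytes stored
```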

CPython bug: github.com/python/cpython/issu
Someone could fix this! Probably not super hard.

I learned again that I know nothing about network programming :-(

Fixed a bug in PyPy's 3.9 parser (based on the new PEG parsing approach introduced in CPython 3.9). The parser would report a valid generator expression in a function call as lacking parentheses, but only if there is another syntax error further down in the file. E.g.:

f(x for x in y)
if a:
pass

PyPy would report the error on line 1 (which is fine), not on line 3.

Bug was an oversight, leaving out an 'if' in the logic when porting from CPy. Shows that error cases are often not tested enough?
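A regression test for this kind of error case can be quite small. The sketch below is not the test from the PyPy repository; `syntax_error_line` is a hypothetical helper, and it assumes the unindented `pass` on line 3 of the example is the intended error.

```python
SOURCE = "f(x for x in y)\nif a:\npass\n"


def syntax_error_line(source):
    """Return the line number of the reported SyntaxError, or None."""
    try:
        compile(source, "<example>", "exec")
    except SyntaxError as exc:  # IndentationError is a subclass
        return exc.lineno
    return None


# The generator expression call on line 1 is valid on its own ...
assert syntax_error_line("f(x for x in y)\n") is None
# ... so the error has to be blamed on line 3, not line 1.
assert syntax_error_line(SOURCE) == 3
```

The point is to assert on where the error is reported, not just that some SyntaxError is raised.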

foss.heptapod.net/pypy/pypy/-/
