skip to navigation
skip to content

Planet Python

Last update: June 25, 2026 01:48 PM UTC

June 25, 2026


Artem Golubin

Hexora v0.3: New features and improvements

Recently, I've improved my Python library, hexora. I wrote it to detect malicious Python code using static analysis.

In the new v.0.3.0 release, I've added new detections, and we now also use a simple machine learning model to analyze the whole file. The machine learning model uses code structure features, semantic features, and static code analysis to assess the entire Python file.

Although the model can detect malicious code without any detections coming from static analysis, its main use case is to filter false positives.

I've been testing it against newly published PyPI packages and it detects 2-10 new malicious packages each day.

Due to the number of published packages, before the machine learning model, I was getting around 5-10 false positives for 1[......]

June 25, 2026 01:37 PM UTC


Django Weblog

How the Django Software Foundation Became a CNA

Why the DSF pursued CNA status

Django has a long history of responsible security practices: a dedicated, private security mailing list, clear advisory policies, and predictable security releases. Even so, we relied on external organizations to assign CVE IDs (Common Vulnerabilities and Exposures). This sometimes introduced administrative delays and extra coordination overhead.

Becoming a CNA (CVE Numbering Authority) allows the DSF to:

The initial exploration

The process began with internal discussions within the DSF Board and Django Security Team. We evaluated:

After confirming that our policies were mature and that the administrative workload would be manageable, we initiated the CNA application with MITRE.

Preparing the application

MITRE requires that new CNAs document their security processes and demonstrate that they can meet CNA obligations. Our preparation included:

  1. Reviewing and updating the Django Security Policy.

  2. Mapping our existing workflows to MITRE's CNA rules, including:

    1. How reports are received.
    2. How vulnerabilities are validated.
    3. How advisories are produced.
    4. How CVEs will be assigned and published.
  3. Defining the scope of the CNA:

    1. Django itself as the core product.
    2. A small, clearly bounded set of related ecosystem projects.
  4. Ensuring we had private communication channels and documented procedures for confidential handling.

  5. Drafting the required procedural documentation for MITRE.

Most of the work here was not about creating new processes but about articulating long standing Django practices in the format MITRE expects.

Training and review

Once our initial documentation was accepted, MITRE scheduled us for CNA onboarding training. This covered:

We also completed MITRE's required CNA onboarding exercises. As part of this process, we worked through sample security reports and demonstrated how we would determine CVE assignments, including cases where multiple CVEs may or may not be warranted for a single report.

Approval and onboarding

After MITRE approved our documentation, training, and exercise submissions, the DSF was formally granted CNA status. The announcements steps were:

Lessons learned

A few procedural insights for other projects considering CNA status:

What changes for Django users

For most contributors and users, nothing changes. Django will continue to follow its established process for receiving reports, coordinating fixes, and publishing security releases.

The difference is that the DSF can now assign CVE IDs directly, which simplifies coordination and allows us to publish advisories with fewer external dependencies.

Acknowledgments

This work was led by Django Fellows Natalia Bidart and Jacob Walls, with support from the Django Security Team and the DSF Board. We are grateful to MITRE for their guidance during the onboarding process.

If you have questions about Django's CNA scope or security process, contact the Django Security Team.

June 25, 2026 11:00 AM UTC


PyCharm

Explicit Lazy Imports Are Coming to Python 3.15

A while ago at PyCon US 2026, I had the pleasure of listening to the Python Steering Council give updates about new features that are being added in Python 3.15. One that stood out was explicit lazy imports (via PEP 810), which defer module loading until first use. I am curious to see how this new feature works, and I want to benchmark its performance with PyCharm. Let’s take a look together.

Overview of explicit lazy imports

PEP 810 introduces an explicit syntax for lazy imports, allowing you to defer the loading and execution of modules until their attributes are actually accessed, unlike standard eager imports that execute immediately. This feature aims to significantly reduce startup latency and memory consumption. Explicitly marking modules as `lazy` can deliver substantial improvements in initial responsiveness and baseline resource usage in large-scale applications and command-line tools.

Because the implementation approach uses proxy objects within the module’s namespace instead of modifying Python’s fundamental dictionary structures, it preserves critical interpreter optimizations. 

This mechanism defers both the finding and the loading of the module to maximize efficiency, especially in environments with high-latency filesystems. To manage potential side effects and ensure backward compatibility, the proposal includes global control flags and a transitional variable for progressive adoption across different Python versions.

In short, Python 3.15 will let you optimize application performance by significantly reducing startup latency and memory consumption, as the loading and execution of modules are deferred until their attributes are actually accessed.

Trying them out in Python 3.15.0b1

At the time this is being written, Python 3.15.0b1 is already out, so we can give this new feature a try. You can build it from source at the CPython GitHub repo, but since getting Python 3.15.0b1 is easy when using `uv` or `pyenv`, we will do that instead.

Make sure you have the latest version of `uv` or `pyenv`, and then download Python 3.15.0b1 via either of the following commands:

After that, select the new interpreter in your project in PyCharm.

Now you will need to reinstall the dependencies for your project. You may have to build some of the libraries from source, as most of the libraries will not have a Python 3.15 wheel for download.

Profiling against normal imports

It is a common joke that the first thing data scientists will do is type `import pandas as pd` and `import numpy as np`, even if they are not actually going to use them. Let’s assume this is the case, and you received a script like this from your colleague:

import pandas as pd
import matplotlib.pyplot as plt
import numpy as np

def main():
    print("Initializing example data science project...")
    
    # Generate some dummy data
    data = {
        'x': np.linspace(0, 10, 100),
        'y': np.sin(np.linspace(0, 10, 100)) + np.random.normal(0, 0.1, 100)
    }
    
    # Plotting
    plt.figure(figsize=(10, 6))
    plt.plot(data['x'], data['y'], label='Sine Wave with Noise')
    plt.title('Sample Visualization')
    plt.xlabel('X-axis')
    plt.ylabel('Y-axis')
    plt.legend()
    
    # Save the plot instead of showing it (since this is non-interactive)
    plt.savefig('sine_wave.png')
    print("Project executed successfully. Plot saved as sine_wave.png.")

if __name__ == "__main__":
    main()

As you see, PyCharm highlights the unused pandas import for you, so removing it would be straightforward. However, for our experiment here, we’ll keep it.

To get a better visualization of import profiles, install a tool from PyPI called tuna.

You can profile your script by setting a custom run script with this script text:

python -X importtime main.py 2> import_log.txt; tuna import_log.txt

When you use it, a new browser window will pop up with the import graph. 

As you see, importing pandas accounts for half of the time it takes to load all the modules, and we never use it!

Now let’s add `lazy` to all the imports.

Don’t worry about the syntax highlighting. PyCharm just doesn’t recognize it yet since `lazy` is a new keyword that has not been officially released.

Let’s profile the script again.

Now we see the pandas import is gone, and loading everything takes way less time.

So, if you have a script that imports a lot of large libraries, and some of them are only used in certain conditions (e.g. in if-else clauses), lazy import can save time by loading modules only when they are first used.

Checking the inner workings with lazy imports

Let’s see how lazy imports are handled internally.

When a module is imported “lazily”, meaning `__lazy_import__` is called instead of `__import__`, a `types.LazyImportType` proxy object will be created. The module name will then be listed in `sys.lazy_modules` instead of `sys.modules`. (See the Lazy import mechanism section in PEP 810.)

When a lazy object is used, it needs to be reified. CPython will try resolving the import at that point and replacing the proxy object with the actual module itself. In this process, `__import__` is called to resolve the import. At the same time, the module is removed from `sys.lazy_modules`.

If there’s an error during reification, AKA importing the module, the lazy object is not reified or replaced. The next time the lazy module is used, the import will try again. The exception raised during reification will also show both where the lazy import was defined and where it was accessed. (See the Reification section in PEP 810.)

To experiment with it ourselves, let’s add some breakpoints with `pdb` and check what’s happening in the code:

import pdb
pdb.set_trace()

lazy import pandas as pd
lazy import matplotlib.pyplot as plt
lazy import numpy as np

pdb.set_trace()
…

And

    # Generate some dummy data
    data = {
        'x': np.linspace(0, 10, 100),
        'y': np.sin(np.linspace(0, 10, 100)) + np.random.normal(0, 0.1, 100)
    }

    pdb.set_trace()
    
    # Plotting
    plt.figure(figsize=(10, 6))
    pdb.set_trace()
…

Now run the script in the console:

python main.py

Note that PyCharm 2026.1 does not yet support Python 3.15, so using the Run or Debug button to run a script using lazy import may result in unexpected behavior.

When it hits the first line of ` pdb.set_trace()` at the top, there should not be any module loaded in. Let’s check:

(Pdb) import sys
(Pdb) sys.lazy_modules

As expected, none of our libraries – pandas, numpy, and matplotlib – are listed.

Now, let’s continue running the program and let it stop at the next breakpoint. In the console, type `continue` and once it stops, we can check by typing `sys.lazy_modules` again:

Here, we see that all of our modules are in `lazy_modules`. Let’s check whether pandas is in `sys.modules`:

(Pdb) 'pandas' in sys.modules

Nope, it’s not. You can try with numpy and matplotlib, and you will see that neither of those is in `sys.module`.

Now let’s type `continue` again and reach the next breakpoint, which occurs after numpy is used. Check `sys.lazy_module` again, and you’ll see that numpy is no longer on the list. When we check whether it is in `sys.module`, we get `True` this time.

However, pandas and matplotlib are still not in `sys.modules`.

When you check the next breakpoint, you’ll see that matplotlib is similarly removed from `sys.lazy_modules` and added to `sys.modules` after it is used.

Trying it yourself with PyCharm

Download the latest version of PyCharm to experiment with Python 3.15.0b1 and experience firsthand how explicit lazy imports can optimize your application’s performance by significantly reducing startup latency and memory consumption.

June 25, 2026 08:21 AM UTC


Bob Belderbos

Python Is Not Enough: Why Pythonistas Love Rust (Podcast)

I joined Bas Steins and Michal Martinka on their complexity.fm show to talk about why Pythonistas are picking up Rust, what AI really does to how we learn, and why vibe coding is a myth. The conversation ran for over an hour because there was a lot to unpack.

Watch the full episode on YouTube:

Watch on YouTube

Highlights from the episode

  1. Why Pythonistas reach for Rust

  2. You can get far with Python's type checkers

    • We discuss how for business logic and when performance is not essential, Python + type checkers gets you far.
    • What Rust still gives you though is exhaustive enum matching enforced by the language.
    • The tooling got a lot better: Modern Python tooling: uv, ruff and ty
  3. Vibe coding is a slot machine

    • Real dopamine effect, working with multiple agents has a lot of context switching, which is bad for productivity (and your brain).
    • The real speed up number: not 10x, more like ~1.2-2x. You get a prototype in an hour, but then the iteration/judgment/cleanup often takes days.
    • The rubber-stamping risk: code that looks plausible, but you miss the 50 lines duplicated three times, because you didn't feel the pain of producing the code.
  4. Timeless coaching that outlives the hype

    • All project-based, it's the best way to learn holistically, but noticing a shift from greenfield (2020) to more brownfield and AI hardening these days.
    • Ryan's Payroll SaaS case study: tools changed, TDD/guardrails mindset didn't and stemmed from the coaching years ago ("don't give the fish, teach how to fish"). More on this: AI coding tools fundamentals case study.

Where to go from here

Michal ends on a positive note:

I think the world is actually starting to heal. I think there will be a lot of adjustment with all this AI hype.

And it's something I am coming back to as well: yes, we have a powerful set of new tools, but they are only as good as the knowledge and judgment you bring to them. The hype is deflating into something more honest, and the tools aren't going anywhere, but neither is the need for engineers with deep skills.

If you want to further develop this skill set, check out how I help engineers level up.

June 25, 2026 12:00 AM UTC

June 24, 2026


Brett Cannon

Why I wrote PEP 832 -- virtual environment discovery

While I decide what to do with PEP 832 after polling folks on their opinion, I thought I would write out why I&aposm even bothering with any of this.

I&aposm going to talk from the perspective of VS Code and its Python extensions, but you could just as easily substitute "VS Code" for your editor of choice or even "AI agent" and it wouldn&apost change the problem: it isn&apost necessarily easy for tools like VS Code to know what workflow (tool) you&aposre using and thus where you&aposre putting your (virtual) environment(s) (I&aposm going to say "environment" as a stand-in for virtual environments, conda environments, etc.). Knowing where the environment lives is important in order to know how to run your code (as the environment will have a Python interpreter that you can use), analyze your dependencies (so linting, auto-complete, etc. do the right thing), etc. So having a way to communicate to VS Code where to find the environment is important.

The problems

First time seeing a project

When you first open a project in VS Code, you may have just done a git checkout, so nothing is set up yet. How is VS Code to know what workflow tool you prefer for creating an environment? Do you prefer Hatch? Poetry? uv? No preference? Some custom solution you have just for you (my-tool)? VS Code could guess maybe based on some tool table in pyproject.toml, but that requires having a known list of tools to look for. And while I know some people will say, "just assume uv", I&aposve been doing this long enough to know not everyone uses that tool and it isn&apost guaranteed to be the tool of choice forever (I remember when pipenv was what everyone recommended). Right now there is no way to know what tool a project wants to see used. Same goes for if you have a personal preference when a project has no reason to care. VS Code could have you specify such a preference in your settings or in .vscode/settings.json for a project, but that would then be VS Code-specific and would only work for supported workflow tools that were hard-coded (sorry, my-tool).

What if you ran git checkout , ran your workflow tool to create the environment, and then opened the project in VS Code? There might be an extra hint about which tool was used if VS Code is able to find the environment on its own, but that assumes VS Code even knew where to look for an environment for that hard-coded list of tools. This also has the same issue of needing VS Code to know where various tools put their environments as well as tie the environment back to the project. And that sort of thing is typically an implementation detail, and thus prone to change unexpectedly.

Finding all the environments

Tying into finding the environments, how do you know which environments are meant for the project? Some tools keep environments locally with the project files, but some keep them in a global location. And when that happens you need specific knowledge to map the environments back or hope the tool has a CLI call that returns a list of the environments in a way that never changes.

And then there&aposs the added wrinkle of workflow tools that never tie an environment to one project. Tools like virtualenvwrapper and conda let you name an environment so they can be shared across projects. But then that means VS Code can&apost know what environment to use without asking. And when someone has 700 or more environments (and that number is from actual experience; it isn&apost an exaggeration), it doesn&apost make is easy to know what environments to suggest.

Possible solutions

Let&aposs work backwards from already having an environment to figuring out what workflow tool to run.

The simplest thing is when there&aposs a virtual environment in a .venv directory. It&aposs easy to discover and can thus be quickly used. It doesn&apost help you modify the environment as you don&apost know what workflow tool was used and whether it cares that it is used to make any changes to the environment, but at least you can run code and analyze the environment.

Solving for when an environment is kept outside of a project or if there is more than one can be done with a file that records such details. Right now I&aposm proposing a .python-envs file which is just a newline-delimited list of paths to environments, with the last one being the one to use by default if the user doesn&apost want to have to choose (FYI what&aposs in the PEP as of this writing is out-of-date).

Knowing what workflow tool a project wants you to use is currently suggested to be done by a [workflow] table in pyproject.toml. That would specify how to launch a workflow tool in a server mode implementing the workflow server protocol (WSP, which has a nice connotation for whitespace in grammar definitions). That would let the tool communicate via JSON-RPC over stdin/stdout with VS Code about what environments there are, creating environments, etc. VS Code would provide a way to specify a fallback tool for when a project doesn&apost have a preference, but there currently isn&apost any thinking about how to make specifying a fallback a standard.

An alternative I have proposed in the past is having some naming convention for CLI tools and then they return the necessary details. This could still use WSP or come up with some CLI standard that tools implement, but it would help solve the issue of not knowing what fallback to use.

Why I care about any of this

VS Code is used across the experience spectrum: from first-time programmers to senior programmers with decades of experience. One common theme across experience levels is no one likes having to set something up. Another theme is no one likes dealing with environments. As such, I&aposm trying to find as much of a solution as people will agree to that makes getting started as easy as possible and hides environments as much as possible. And I want to do all of this in a way that isn&apost specific to VS Code if it doesn&apost have to be.

On top of that, people often don&apost want to leave VS Code to do any workflow stuff. That makes VS Code a user of the workflow tool rather than a peer. Thus VS Code uses the workflow tools as middleware which they aren&apost usually designed to do. With no tool-to-tool communication standard you end up with bespoke support like the Hatch extension which uses public APIs from the Python environments extension, which means it doesn&apost easily scale to help other tools. It&aposs great for VS Code users that extension exists, but other editors don&apost get that benefit. As well, the extension takes effort and that takes away from what could have been a general solution for any editor.

June 24, 2026 11:08 PM UTC


Django Weblog

Django 6.1 beta 1 released

Django 6.1 beta 1 is now available. It represents the second stage in the 6.1 release cycle and is an opportunity to try out the changes coming in Django 6.1.

Django 6.1 offers a harmonious mélange of new features and usability improvements, which you can read about in the in-development 6.1 release notes.

Only bugs in new features and regressions from earlier Django versions will be fixed between now and the 6.1 final release. Translations will be updated following the "string freeze", which occurs when the release candidate is issued. The current release schedule calls for a release candidate in about a month, with the final release scheduled roughly two weeks later on August 5.

Early and frequent testing from the community will help minimize the number of bugs in the release. Updates on the release schedule are available on the Django forum.

As with all alpha and beta packages, this release is not for production use. However, if you'd like to try some of the new features or help find and fix bugs (which should be reported to the issue tracker), you can grab a copy of the beta package from our downloads page or on PyPI.

The PGP key ID used for this release is Jacob Walls: 131403F4D16D8DC7

June 24, 2026 05:00 PM UTC


death and gravity

reader 3.26 released – discovery, exports, demo

Hi there!

I'm happy to announce version 3.26 of reader, a Python feed reader library.

What's new? #

Here are the highlights since reader 3.24.

Feed autodiscovery #

reader now discovers feeds automatically – instead of searching for feed links, just add the website URL, and the web app will suggest any feeds it finds.

Behind the scenes, this is enabled by the autodiscover plugin, which stores discovered feeds in a feed tag, so you can use it without the web app, or even without reader. For a short related rant about standards, check out this Bluesky thread (there are cats!).


feed autodiscovery feed autodiscovery
database exports database exports

Database exports #

This is an optional feature that I really wanted in the hosted reader MVP – it should be possible to get all your data out, not just lists of feeds and read / starred articles.

So yeah, now you can download a copy of your entire database from the web app, which means you can always migrate to another reader installation. (If you're using reader locally or self-hosting, the command might be handy for backups.)

Hosted reader status update #

Speaking of, did I tell you I'm working on a hosted version of reader? :D

Background: Why another feed reader web app?, Why not just self-host it?

Public demo #

Another thing I wanted for the MVP was a demo (no login needed):

https://beta.reader.andgravity.com/demo/

Go forth and click all the things! (it's read-only, nothing should break™)

OK, so what now? #

This is what is finished so far:

So just launch the damn thing already:

Meanwhile, if this sounds like something you'd like to use, get in touch.


That's it for now. For more details, see the full changelog.

Want to contribute? Check out the docs and the roadmap.

Learned something new today? Share it with others, it really helps!

What is reader? #

reader takes care of the core functionality required by a feed reader, so you can focus on what makes yours different.

reader in action reader allows you to:

...all these with:

To find out more, check out the GitHub repo and the docs, or give the tutorial a try.

Why use a feed reader library? #

Have you been unhappy with existing feed readers and wanted to make your own, but:

Are you already working with feedparser, but:

... while still supporting all the feed types feedparser does?

If you answered yes to any of the above, reader can help.

The reader philosophy #

June 24, 2026 04:48 PM UTC


Spyder IDE

How your donations transformed Spyder in 2025

Thanks to the community's financial support, Spyder has not only survived but thrived in 2025. Read on as we share the new features, releases and interface improvements from the last year that your donations directly made possible.

June 24, 2026 12:00 AM UTC

June 23, 2026


PyCoder’s Weekly

Issue #740: Pluggy, ABCs, Scrapy Extensions, and More (2026-06-23)

#740 – JUNE 23, 2026
View in Browser »

The PyCoder’s Weekly Logo


Plugins Case Study: Pluggy

Pluggy is an open source plugin system used by frameworks such as pytest and tox. This article introduces you to how it works and what you can do with it.
ELI BENDERSKY

Implementing Interfaces in Python: ABCs and Protocols

Learn how to implement interfaces in Python using abstract base classes, Protocols, and duck typing, and enforce method contracts cleanly.
REAL PYTHON

Quiz: Implementing Interfaces in Python: ABCs and Protocols

REAL PYTHON

Production Monitoring for Python Apps — Built by Developers, Not Suits

alt

Error tracking, intelligent logging, and Just Enough APM™ in one tool. Our founders Ben and Josh built Honeybadger to fix their own production headaches. They think it can fix yours too — and they’ll personally write back if you hit a snag. Try Honeybadger Free!
HONEYBADGER sponsor

How to Build Your First Scrapy Extension

Scrapy is a great extensible web scraping python framework, here’s how to make it better with plugins.
AYAN PAHWA • Shared by Ayan Pahwa

PSF Board Election Dates for 2026

PYTHON SOFTWARE FOUNDATION

PEP 835: Shorthand Syntax for Annotated Type Metadata (Added)

PYTHON.ORG

Large Number of PEPs Marked Final

As part of the 3.15 beta, a significant number of PEPs have been moved to “Status: Final”: PEP 753, 668, 687, 691, 699, 701, 703, 728, 753, 770, 773, and 829. For more details see the list of PEPs.
GITHUB.COM/PYTHON

Announcing the Search for a DSF Executive Director

DJANGO SOFTWARE FOUNDATION

PyData London 26 Videos Released

YOUTUBE.COM

Articles & Tutorials

Python 3.14 Garbage Collection Rigamarole

Python 3.14.0 introduced a new incremental garbage collector. But reports of higher memory usage caused the Python team to revert the garbage collector changes in 3.14.5. This post covers how memory management works in Python and workloads that perform best and worst for the incremental garbage collector.
PIERRE ZEMB

Choosing a Python Task Queue Library in 2026

This post compares the Python task queue libraries worth considering in 2026: Celery, Dramatiq, FastStream, Taskiq, and Repid. The comparison covers broker support, async behavior, benchmark results, and the places where they differ.
ALEKSANDR SULIMOV • Shared by Aleksandr Sulimov

Are Insecure Code Completions a Vulnerability?

Seth tries out the PyCharm “Full Line Completion” plugin that uses a deep learning model to suggest lines of code, and is concerned about the results. Many of the suggestions were for code that turns off security features.
SETH LARSON

Everything Security at PyCon US 2026

This post to the PSF blog summarizes all things security related at PyCon US 2026. It includes the first talk at the security track, updates to how the PSF deals with security, the OSS security space, and more.
STHE LARSON

Why Dependency Management Trips Up New Developers

A mix of opinion piece and practical advice, this post talks about Python dependency management, virtual environments, Docker, and why setup issues frustrate so many new developers.
ETHAN CARVER

Context Engineering for Python Codebases

Learn how context engineering shapes what your AI coding agent sees on every turn, and use four practical strategies to keep your Python projects on track.
REAL PYTHON

Quiz: Context Engineering for Python Codebases

REAL PYTHON

Building Python Skills for the Job Market

Learn which Python skills employers value most and how to build them, using a skill roadmap worksheet, weekly practice plan, and interview prep tips.
REAL PYTHON course

Quiz: Building Python Skills for the Job Market

REAL PYTHON

Run Modified Python Code Using the AST Module

How to work with Python’s Abstract Syntax Tree (AST), a foundation of many metaprogramming techniques, and how this can be valuable in the age of AI
ALEX HALL • Shared by Alex Hall

Make Your SciPy Presentation in Quarto

Quarto is built for scientific presentations. Here’s how to build your next SciPy (or any conference) talk as a Quarto slide deck.
ISABELLA VELÁSQUEZ • Shared by Isabella Velásquez

Projects & Code

asncounter: Count Hits Per Related Network Block

GITLAB.COM/ANARCAT

hydra: Framework for Configuring Complex Applications

GITHUB.COM/FACEBOOKRESEARCH

warp: GPU-accelerated Simulation, Robotics, and ML

GITHUB.COM/NVIDIA

python-socketio: Python Socket.IO Server and Client

GITHUB.COM/MIGUELGRINBERG

marimo-tutorials: Collection of Marimo Tutorials

GITHUB.COM/HALESHOT

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 24, 2026
REALPYTHON.COM

PyDelhi User Group Meetup

June 27, 2026
MEETUP.COM

Python Sheffield

June 30, 2026
GOOGLE.COM

Python Southwest Florida (PySWFL)

July 1, 2026
MEETUP.COM

STL Python

July 2, 2026
MEETUP.COM

Canberra Python Meetup

July 2, 2026
MEETUP.COM


Happy Pythoning!
This was PyCoder’s Weekly Issue #740.
View in Browser »

alt

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

June 23, 2026 07:30 PM UTC


Glyph Lefkowitz

Adversarial Communication

As I have discussed in previous posts, “AIs” can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.

Thus, in any scenario where you want to attempt to make “productive” use of “AI”, you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn’t have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.

The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of “AI” to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your “AI’s” victim.

The Ladder-Climber And Their Reverse-Centaur Rungs

One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The “AI” does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.

Reverse centaurs can be made from any automation, not only “AI” automation. I think that there is a reason that this term happens to have emerged in the “age of AI”, though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of “AI” output is not merely a technical feature that must be compensated for, it is a generalized externality.

As I mentioned above, if you are responsible for the entirety of the work, both extruding the “AI” output and checking it, it’s usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn’t need to be as comprehensive.

When “AI” coding advocates say “code review is the bottleneck”, what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process “code review” is a bit of a misnomer; it’s not really “code review” in the traditional sense, it’s human understanding.

Before the advent of “AI”, the human understanding was implicit in the process of writing the code in the first place1, and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.

Human understanding was always the bottleneck.

However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that “AI” is a bad tool to satisfy those goals, because all it’s doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent’s output as you read it.

What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.

Pretty much every organization finds it easy to reward “productivity” as expressed by lines of code emitted, but the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is “writing” the code, and then they get to foist off the act of “reviewing” the code onto someone else.

Here, the prompter effectively externalizes the cost of the LLM’s failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can’t possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all, the obligation to review the prompter’s (read: the bot’s) code is a drain on their time that they are not going to get rewarded for.

If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.

This is why LLMs are “good for coding”, and also why their biggest promoters keep having outages.

The Generative Gish Galloper

Coding is the biggest “success story” of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandolini’s law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There’s a good reason that the fascists love it.

Straightforward Spam and Fraud

This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of “checking” the LLM’s output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don’t need anything like reliable accuracy.

Customer “Support”

If you have any kind of commercial relationship with a company, I probably don’t even need to mention this: customer “support” bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent’s job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.

Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry’s existing metrics, by turning “winning the metrics battle against the customer” into a more obvious and immediate defeat for the company’s long term reputation.

“Education”

In 2026 it is sadly a fact of life that students cheat all the time using “AI”, and that this cheating is very successful, in that the teachers find it very hard to detect.

LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.

My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, “the broader educational system”) view grading.

When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student’s progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees “AI” as useful to their own goals and has no compunction about deploying it.

There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There’s no way that I can think of which makes the benefit legible as long as a shortcut is available.

The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn’t have the resources to recover from. The individual students’ attacks against their teachers and their schools’ grading systems might appear to momentarily succeed, but they will win the battle and lose the war.

Spamming “For Good”?

Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that’s an “attack” and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you’re the attacker doesn’t mean that you are evil.

For example we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It’s relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable2 in that case.

Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.

However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable “LLM detector” and issuing an automated “computer says no” even to hand-written correspondence.

It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.

Therefore, while legitimate uses might exist, it’s hard to imagine that there’s anywhere they would be genuinely valuable and sustainable. In the best case “AI” will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run, in the worst case the arms race itself will cement permanent structural change that will make things worse.

“Search” By Stealing

Most of the adversarial utility of “AI” is on the “write” side, since write-amplification is more obviously aggressive than reading. But the “read” side of LLMs — summarization and question-answering — can be a form of attack as well.

To begin with, the act of reading itself is currently enormously destructive, but that’s arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using “AI” tools does suborn this sort of out-of-control crawling.

More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would “gaslight the heck out of” them, but they still found it preferable, because they would at least get an answer without getting yelled at.

Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn’t want to hear about. Consider the question “I’m tired of entering my password so much, how do I make it so my laptop unlocks automatically”. An obsequious chatbot will helpfully tell you how to do this without pushback.

But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and people will have no opportunity to learn about unknown unknowns.

In this case, the “adversarial” communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.

Who Am I Hurting?

LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that it might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, “who might I be hurting, if I use an LLM for this?”

If you’re using an “AI”, who is its adversary? If you haven’t given it one yet, who might the “AI” turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you’re receiving information and not sending it, who are you taking that information from without consulting?

Figure out the answers to these questions and conduct yourself accordingly; the answer might be “yourself”.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!


  1. One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress. 

  2. Modulo the massive amount of other externalities involved in using LLMs, of course, but I don’t have the time or energy to get into those here. 

June 23, 2026 01:38 PM UTC


Python Software Foundation

Mitigated API authentication bypass for python.org download metadata

This post is a cross-post from the Python Insider Blog.

Summary

On February 23rd 2026, Splitline Ng from the DEVCORE Research Team reported to the Python Security Response Team (PSRT) an authentication bypass vulnerability in the “python.org” release management API. By supplying an admin username with an arbitrary API key the request was processed with admin privileges.

If exploited, this would have allowed an attacker to modify Python release and file metadata that affects what URLs users are offered when visiting python.org/downloads. While it would not enable existing release files to be modified in-place, it would enable an attacker to modify the URLs that are provided on python.org for each release file, including verification material URLs. There is no evidence this vulnerability was exploited after auditing logs and database backups. This scenario is even more unlikely to have happened unnoticed due to the many redistributors requiring Python Sigstore and PGP materials be verified prior to builds.

Details

PSRT confirmed the vulnerability on a local instance of python.org. Seth Larson and Hugo van Kemenade developed and deployed the patch to production with help from Jacob Coffee. Less than 48 hours after the initial report the PSRT and the reporter confirmed that the proof-of-concept provided by the reporter no longer worked locally or on the production deployment.

This vulnerability was likely never exploited. However due to the age of the vulnerability (existing in the codebase since 2014) we don’t have absolute certainty beyond our logs and database backups. We believe attempts to exploit this vulnerability would have been “loud” and discovered quickly given the number of downstream tools and distributions automatically verifying the Sigstore and PGP materials.

We confirmed that all artifacts on python.org had not been modified by verifying Sigstore and PGP materials. Our own workflow verifying all Sigstore signatures did not signal any changes to artifacts from years prior. While verifying PGP materials we were able to verify all signatures where keys are still readily accessible from Python 2.5 to 3.13. Note that Python 3.14 and onwards no longer provide PGP materials, so these were verified with Sigstore.

The codebase was manually audited and additional hardening was applied. In addition to manual auditing, LLM auditing tools were unable to find additional issues with authentication. The delay between the initial finding and publishing of this final report was to give ample time for auditing for other issues related to authentication, to receive access to LLM auditing tools, and to arrange and complete a third-party audit from Trail of Bits prior to publication of this report. Full results from the Trail of Bit audit will be published soon.

Remediations

Timeline

Acknowledgements

Thanks to Splitline Ng from the DEVCORE Research Team for responsibly disclosing this vulnerability and confirming the remediation.

Funding for the follow-up third-party audit was provided by OpenAI. The audit and mitigations were completed by Trail of Bits, with special thanks to Facundo Tuesca and Eric Quintero. Audit results and mitigations were reviewed and applied by Seth Larson. Seth Larson's role as Security Developer-in-Residence at the Python Software Foundation is supported by Alpha-Omega.

If your organization wants to support security at the Python Software Foundation through the Developers-in-Residence program please reach out to sponsors@python.org.

 

June 23, 2026 10:56 AM UTC


Python Bytes

#485 Creating memories

<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://offen.github.io/docker-volume-backup/?featured_on=pythonbytes">Backup Docker volumes locally or to any S3</a></strong></li> <li><strong><a href="https://blog.pyodide.org/posts/314-release/?featured_on=pythonbytes">Pyodide 314.0 Release</a></strong></li> <li><strong><a href="https://github.com/jupyter-ai-contrib/nb-cli?featured_on=pythonbytes">nb-cli</a>: A Command-Line Interface for AI Agents and Notebook Automation</strong></li> <li><strong><a href="https://hindsight.vectorize.io/?featured_on=pythonbytes">Hindsight</a> Agent Memory That Learns</strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=BZIGLfrjCZQ' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="485">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python</strong></a></li> <li><a href="https://www.midwestcommunityday.com/?featured_on=pythonbytes">AWS Community Day Midwest</a> tomorrow Wednesday the 24th in downtown Indianapolis, <a href="https://sixfeetup.com/?featured_on=pythonbytes">Six Feet Up</a> is sponsoring and there are 2 Sixies presenting</li> </ul> <p><strong>Connect with the hosts</strong></p> <ul> <li>Michael: <a href="https://fosstodon.org/@mkennedy">Mastodon</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">BlueSky</a> / <a href="https://x.com/mkennedy?featured_on=pythonbytes">X</a> / <a href="https://www.linkedin.com/in/mkennedy/?featured_on=pythonbytes">LinkedIn</a></li> <li>Calvin: <a href="https://sixfeetup.social/@calvin?featured_on=pythonbytes">Mastodon</a> / <a href="https://bsky.app/profile/calvinhp.com?featured_on=pythonbytes">BlueSky</a> / <a href="https://x.com/calvinhp?featured_on=pythonbytes">X</a> / <a href="https://www.linkedin.com/in/calvinhp/?featured_on=pythonbytes">LinkedIn</a></li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">Mastodon</a> / <a href="https://bsky.app/profile/pythonbytes.fm">BlueSky</a> / <a href="https://x.com/PythonBytes?featured_on=pythonbytes">X</a></li> </ul> <p>Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually <strong>Tuesday at 7am PT</strong>. Older video versions available there too.</p> <p>Finally, if you want an bonus digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</p> <p><strong>Michael #1: <a href="https://offen.github.io/docker-volume-backup/?featured_on=pythonbytes">Backup Docker volumes locally or to any S3</a></strong></p> <ul> <li>Via Bryan Weber (thanks Bryan!), who spotted it over on Virtualization HowTo. Find Bryan at <a href="https://bryanwweber.com/?featured_on=pythonbytes">bryanwweber.com</a>.</li> <li><a href="https://offen.github.io/docker-volume-backup/?featured_on=pythonbytes">offen/docker-volume-backup</a> is a lightweight companion container that backs up the volumes your apps actually depend on, then ships them somewhere safe.</li> <li>It's tiny: written in Go and about 25MB compressed, roughly 1/20th the size of the shell-based image (<code>jareware/docker-volume-backup</code>) that inspired it.</li> <li>Drop it into your <code>docker compose</code> file as a <code>backup</code> service, mount the volumes you care about as read-only, and you're off.</li> <li>Push backups to a pile of destinations: a local directory, plus any S3, WebDAV, Azure Blob Storage, Dropbox, Google Drive, or SSH-compatible target. Mix and match as many as you want in one run.</li> <li>Recurring cron-style backups in a Compose setup, or one-off backups straight from the Docker CLI.</li> <li>Production-friendly touches worth calling out: <ul> <li>Rotates away old backups so you don't quietly fill the disk.</li> <li>GPG encryption for your archives.</li> <li>Notifications on finished and failed runs (so you find out about failures before you need the backup).</li> <li>Stop a container during backup for a consistent snapshot using a simple <code>docker-volume-backup.stop-during-backup=true</code> label, then auto-restart it.</li> <li>Run custom commands during the backup lifecycle (great for a database dump before the file copy).</li> <li>Docker Swarm support, plus <code>arm64</code> and <code>arm/v7</code> builds. Hello, Raspberry Pi homelab.</li> </ul></li> <li>Fun aside from Bryan: he searched our back catalog for this tool and the search came back so fast he thought it hadn't run. Love to hear it.</li> </ul> <p><strong>Calvin #2: <a href="https://blog.pyodide.org/posts/314-release/?featured_on=pythonbytes">Pyodide 314.0 Release</a></strong></p> <ul> <li><strong>PEP 783 is the real news</strong> — Pyodide maintainers used to hand-build 300+ packages. Now anyone can publish Pyodide wheels to PyPI with <code>cibuildwheel</code>.</li> <li><strong>The version jump from 0.29 to 314.0 is intentional</strong> — it now tracks the Python version, so 314.x = Python 3.14. Binary compatibility is locked per Python cycle, meaning packages you build today won't break on the next Pyodide release.</li> <li><strong><code>sqlite3</code>, <code>ssl</code>, and <code>lzma</code> are back in the default stdlib</strong> — no more <code>await pyodide.loadPackage("sqlite3")</code>. Bigger download, but a much smoother experience for newcomers.</li> <li><strong><code>bigint</code> precision bug is fixed</strong> — values above 2^53 were silently losing precision when crossing the Python/JS boundary. The new <code>JsBigInt</code> type makes the roundtrip correct. Worth flagging if anyone is doing numeric work in a browser app.</li> <li><strong>Experimental TCP sockets in Node.js</strong> — you can now connect Pyodide to a real database (MySQL, PostgreSQL, Redis tested) when running server-side. Blurs the line between "Python in the browser" and "Python runtime anywhere Wasm runs."</li> </ul> <p><strong>Michael #3: <a href="https://github.com/jupyter-ai-contrib/nb-cli?featured_on=pythonbytes">nb-cli</a>: A Command-Line Interface for AI Agents and Notebook Automation</strong></p> <ul> <li>From Piyush Jain (Jupyter and LangChain maintainer) on the Jupyter blog: <a href="https://blog.jupyter.org/nb-cli-a-command-line-interface-for-ai-agents-and-notebook-automation-996ad7edacd9?featured_on=pythonbytes">nb-cli: A Command-Line Interface for AI Agents and Notebook Automation</a>.</li> <li><a href="https://github.com/jupyter-ai-contrib/nb-cli?featured_on=pythonbytes">nb-cli</a> is an experimental, Rust-based CLI to read, write, execute, and search Jupyter notebooks. The premise: agents are great at CLIs but terrible at hand-editing the nested JSON in an <code>.ipynb</code>, so let them operate on the notebook from the outside instead of running inside it.</li> <li>Works with or without a Jupyter server. No server? It reads/writes <code>.ipynb</code> files directly and talks to kernels over ZeroMQ. Connected to a live JupyterLab, your edits show up instantly via Y.js (the same CRDT Jupyter uses).</li> <li>Smart output format: instead of token-heavy JSON or ambiguous plain markdown, it uses <code>@@cell</code> / <code>@@output</code> sentinels with inline metadata. Less wasted context, unambiguous structure, and it degrades gracefully on truncation.</li> <li>The payoff is composability. "Add a summary section and run it" becomes one shell pipeline instead of six agent tool calls. And <code>nb search notebook.ipynb --with-errors</code> returns only the failing cells, so the agent skips the cells that worked.</li> <li>Claude Code tie-in: it ships as an agent skill. <code>npx skills install jupyter-ai-contrib/nb-cli</code> and your agent can drive notebooks via <code>nb</code>.</li> <li>Out of jupyter-ai-contrib, which aims to become an official Jupyter AI subproject. Still early (<a href="http://crates.io?featured_on=pythonbytes">crates.io</a> is at v0.0.5), so kick the tires before anything load-bearing.</li> <li>See also <a href="https://marimo.io/blog/marimo-pair?featured_on=pythonbytes">marimo-pair</a>.</li> </ul> <p><strong>Calvin #4: <a href="https://hindsight.vectorize.io/?featured_on=pythonbytes">Hindsight</a> Agent Memory That Learns</strong></p> <ul> <li>AI agents forget everything between sessions — Hindsight gives them persistent memory that learns over time</li> <li>Simple three-method API: <code>retain()</code>, <code>recall()</code>, <code>reflect()</code> — store, retrieve, and reason over memories</li> <li>TEMPR retrieval runs semantic, keyword, graph, and temporal search in parallel for accurate results</li> <li>Automatically consolidates related facts into durable observations instead of piling up duplicates</li> <li><code>pip install hindsight-all</code> runs the entire server in-process; integrates with LangChain, LlamaIndex, Pydantic AI, CrewAI, and more</li> </ul> <p><strong>Extras</strong></p> <p>Calvin:</p> <ul> <li><a href="https://lucumr.pocoo.org/2026/5/26/clankers/?featured_on=pythonbytes"><strong>Clanker: A Word For The Machine</strong></a></li> <li><a href="https://github.com/DietrichGebert/ponytail?featured_on=pythonbytes">**Ponytail</a> — You know him. Long ponytail. Oval glasses. Has been at the company longer than the version control**</li> <li><a href="https://mcdonc.github.io/klangk/?featured_on=pythonbytes">**Klangk</a>: Multi-User AI Sandboxing, Collaboration and Coding Platform**</li> <li>Cursor announces <a href="https://cursor.com/origin?featured_on=pythonbytes">Origin</a></li> <li><a href="https://vorpus.github.io/performativeUI/#/">performative-ui</a> to quick start your new idea Michael:</li> <li><a href="https://talkpython.fm/episodes/show/552/astral-joins-openai?featured_on=pythonbytes">Astral Joins OpenAI: The Interview</a></li> <li><a href="https://arstechnica.com/ai/2026/06/spacex-will-acquire-coding-tool-cursor-to-compete-with-anthropic-openai/?featured_on=pythonbytes">SpaceX to acquire Cursor</a></li> <li>And <a href="https://www.linkedin.com/posts/marshcharles_im-proud-to-announce-that-were-renewing-share-7472388576807133184-XzKZ/?featured_on=pythonbytes">OpenAI renews Open Source support</a></li> <li>Portuguese subtitles are now available for <a href="https://training.talkpython.fm/courses/all?featured_on=pythonbytes">Talk Python courses</a></li> <li><a href="https://www.djangoproject.com/weblog/2026/jun/17/announcing-the-search-for-a-dsf-executive-director/?featured_on=pythonbytes">DSF is hiring</a> including Six Feet Up support</li> </ul> <p><strong>Joke: <a href="https://x.com/pr0grammerhum0r/status/2062202683742711910?s=12&featured_on=pythonbytes">Oh Babe…</a></strong></p>

June 23, 2026 08:00 AM UTC


Python Insider

Mitigated API authentication bypass for python.org download metadata

Vulnerability mitigated in python.org with follow-up third-party audit from Trail of Bits

June 23, 2026 12:00 AM UTC

Python 3.15.0 beta 3 is here!

The penultimate 3.15 beta is out!

June 23, 2026 12:00 AM UTC


Armin Ronacher

The Coming Loop

I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.

— Boris Cherny

Over the last months I have watched more and more people build something on top of coding agents that feels meaningfully different from just using a coding agent. Some of this happens on top of Pi which is cool to see for sure! The pattern is the same everywhere though: work is put into a queue of sorts, a machine picks it up, attempts it, stops, and then some harness decides whether that was actually the end.

If not, the harness continues the same session, injects another message, starts a fresh session with modified context, or sends the task to another machine. The task stays alive beyond the point where the model by itself would normally have said: “I am done.”

I think about that type of loop more than I want to admit.

There is already an agent loop inside every coding agent. The model calls a tool, incorporates the result, calls another tool, reads a file, edits a file, runs tests, and eventually produces some answer. That loop is one we have been quite familiar with for a long time. The other loop is the harness level loop: the loop outside the agent loop. That loop is also not new. We have been doing versions of this since early Claude Code days, but that loop is becoming ever more present in agentic engineering and in recent weeks it has started to dominate the Twitter discourse.

I Am Not Good At This Yet

My current status is that I have not had much success with this way of working for code I deeply care about which turns out to be quite a lot of code.

Part of that is taste and part of it is control. I attempt to set a high bar for what I want code to look like, and I want to understand the code I ship. Under pressure, or in a discussion with another human, I want to be able to explain what the system does without first having to ask a clanker to explain it to me. Now there is obviously a question if this desire to understand the code is one that I will still have a few years from now. For now I have not moved past the point of comprehension being important to me.

Given this desire, there is something I lack with my experience of code written without me paying attention, particularly from loops. Present-day models tend to produce code that is too defensive, too complex, too local in its reasoning. They avoid strong invariants. They add fallbacks instead of making bad states impossible. They duplicate code, invent bad abstractions, and paper over unclear design with more machinery. Worse though: I so far see very little progress of this improving. If anything, on that front it feels to me that we might even be making steps in the wrong direction. At least for my taste, present-day hands-off harnesses like Claude Code with ultracode produce worse code than what we were producing last autumn. That’s because Claude Code, with Fable for instance will be working uninterrupted on a problem for thirty minutes or more, when previously the process would have been much more human in the loop.

Furthermore it’s well understood that models tend to observe some local failure and add a local defense. Karpathy mentioned how they are “mortally terrified of exceptions”. In systems with important invariants, especially persisted data formats or core infrastructure, the right fix is not “handle every malformed case.” The right fix is to make the malformed case unrepresentable or impossible to write in the first place. Yet even with a lot of manual steering, that type of code does not come out of LLMs naturally, and even if the code comes out naturally like that, they will still attempt to handle now impossible errors.

When you take that behavior and you put it behind loops, you tend to amplify it. If each iteration adds another small defense, the system slowly becomes less understandable while appearing more robust. The more hands-off you are, the more that happens. It also teaches really bad practices when tools like this are given to juniors without clear guidance. Because if you ask them, why they are doing all that, they will convincingly argue their case.

Where Loops Work

At the same time, it would be dishonest to pretend the loop pattern does not work because it already works astonishingly well in some domains.

Porting code one of them. There are already impressive examples of large automatic porting efforts, including the reported work around moving parts of Bun from Zig to Rust. I have used it with success myself to port MiniJinja to Go. Performance explorations are another case where this works beautifully. A machine can try experiments, benchmark them, discard failures, and keep searching. Security scanning fits naturally too and so does almost any type of research: asking a system to explore a complex problem space and report back without necessarily committing lasting code. One thing that many of these have in common is that they either do not generate new code, but transform code that already exists, or they produce code that intentionally does not have a long shelf life. They either produce proof of concepts or ideas, surface findings or are more akin to mechnical transformation.

I believe that loops that produce artifacts without necessity of longevity or that create some form of clearly verifiable mechnical translation matters more than the general ability of a harness to mechanically measure a goal. Many successful applications of loops use another LLM as a judge or as an orchestrator. The mechnical translation case can be verified with a binary test case, but it can also be judged by an LLM instead!

Claude Code, for instance, is increasingly good at creating entire experimental workflows that it will then execute. Sure, the code it produces is slop, but that’s more the fault of the model than the harness not being a good judge on if a step in the workflow resulted in a net improvement or completion.

The harness just needs some signal that lets it continue. It does not have to be objective or binary, it just has to be useful enough to drive another iteration.

I absolutely love loops already that take the boring parts out of my day to experiment and measure and to give me ideas.

Software As Organism

On the other hand using that same looping methodology to write lasting code does not yet sit well with me. The metaphor I like to reach for is one of moving from software as a deterministic machine to software as an organism.

I became a software engineer in an enviornment that encouraged me to understand the machine. There was always a layer you could peel off to deepen your understanding. Machines that did not exhibit deterministic observable behavior were maybe accepted, but generally seen as not exactly optimal. Software architecture-wise, I saw it as desireable to push further towards more determinism rather than less. Likewise the ability to understand the code has been an undeniable goal. In practice not always possible we still took pride in writing code so that it became possible even for new engineers to navigate complex code bases through clever architecture. On well designed systems there were always engineers that knew where the invariantes lived, which parts were load-bearing and which changes were safe. Ideally all of that was also well documented. Where that understanding was lacking, it was generally regarded as something to improve upon.

Obviously that ideal has always been strained. Many software systems, especially very successful ones had periods where engineers on the team were able to keep them clean. Large software systems are not infrequently too big, too dynamic and too dependent on external services to fit into anyone’s head. Even without LLMs we already diagnose distributed systems somewhat like doctors in that we observe symptoms, form hypotheses, “order more tests”, try some remedies, and observe again.

Yet with LLMs we’re pushing much further in that direction and much quicker. We use them to write the code and we also use them for diagnosis and remedy. There are plenty of engineers that already live in a world in which the first step after the occurrence of a production issue is followed by having a clanker read logs, propose root causes and proactively put up a patch. The resulting patch is then often picked up by another machine that reviews, sometimes even landing it on main without any human supervision.

Obviously that is powerful and I cannot deny that it sounds appealing. But giving in to that idea, particularly with less and less human oversight means accepting that we may no longer understand the whole system in the same way. We treat it, we monitor it, we stabilize it, but we do not necessarily comprehend it.

I have no doubts that for some software, that is okay. Not every line of code deserves human authorship and worse code might have been written in the past.

But do I want all software to be authored this way?

You Cannot Quite Opt Out

What’s very uncomfortable is that opting out of this fully machine-driven future may not be an option.

Security is the clearest example today. Even if you do not use loops to build your software, other people will use loops against your software. Attackers will run machines continuously and even if it’s not attackers, then security researchers will and some of that automated work will throw up dust but also find real issues. And both the signal and the noise will come your way at a volume that makes it almost impossible to deal with unless you yourself throw a machine at the problem.

Daniel Stenberg’s post about curl’s summer of bliss is a good example of the pressure maintainers are already under. As far as I know, AI does not play a tremendous role in the core development of curl today. Yet despite all of this, maintainers are overwhelmed by reports, most of which are now AI-generated ones.

If attackers and reporters loop, defenders will eventually need to loop too to keep up. Maybe not to write patches directly, maybe just to triage and reproduce and pressure will increase.

The same is true competitively as some teams will out-build others through raw speed. Some projects will suddenly move faster because a tiny group figures out how to orchestrate machines effectively. Some startups will do with five people what used to require fifty. Some people might literally put a machine against your product in a loop and ask it to “make it like the other one.” And if their users are happy, does it really matter?

Not all software will be equally affected. Some domains will punish sloppiness and demand trust and responsibility, but a lot of software lives in a world where raw speed, quick experimentation, and vast coverage matter enormously.

Building New Dependencies

The scariest part to me is that we become dependent on these new machines in new ways. Software has always depended on tools. I remember the time when I had to pay for compilers. These new tools are a flashback to times where creating software came with real costs. But now it’s no longer a one-time payment, it’s a constant dependency. Not just a dependency on a filled wallet, but also a cognitive dependency.

If a codebase is produced by loops, reviewed by loops, patched by loops, and kept alive by loops, what happens when you no longer have access to the same class of systems? What happens when some trade restrictions take away access to the most powerful models? What if just the cost becomes unbearable? What if you and your team just lose the last remaining ability to understand the code without using the machine?

We may create codebases that are not merely hard to maintain by humans, but that assume machine participation as part of their maintenance model. This is already happening! It’s not happening everywhere, and it might not even be happening in ways that are seen as problematic, but we see more and more of it. People more and more merge code they cannot fully explain. People lose their ability to create issue reports or discuss things in chat, without augmenting or rephrasing their messages with the context provided by a clanker. Too many people increasingly rely on a machine to summarize or contextualize it. More and more do I encounter people who converse with me through the indirection of an LLM.

Again, maybe that is not even going to be wrong, but it’s a massive change to how we did things.

Future Harnesses

I have little doubt that this is where things are going but going there will require us to do something about our tooling everywhere, and not just in the coding agents.

Just orchestrating more loops won’t be enough. Better visualizations of changes or orchestration or agents will not restore our understanding. Either we need to find clever ways to jolt the human back into the loop and make the changes of the loops legible long term, or we need to find better ways to compose these ever more complex systems.

This is also where my thinking about the role of Pi is changing. Pi has been cautious, and I think that caution is good. I do not want a future where every interaction turns into an uncontrolled swarm of machines making changes I cannot follow. I would not want Pi to become an unmaintainable mess in an effort to win the race towards software that writes itself and I would not want Pi to promote this type of engineering either. At the same time Pi is a harness and harnesses are at the center of people running these new types of experiments.

Task queues for coding tasks, orchestration of agents, subagents, durable sessions will matter more and more. Even those of us who have their reservations and are not blindly embracing loops will have to start doing those experiments. We need to, because we need to understand how to make this future bounded and survivable.

Controlling Loops

As you can read from this post, I’m very uneasy about this future. Not cause of fear, but because of caution given experiences with this technology so far.

Adopting the idea of harness loops means that the harness decides when work is finished. In the agent loop, the model eventually says “done” and I review. Even before that, I usually steer along the way. I am involved and I enjoy learning along the way. In the harness operated loop I’m not sure what my role even is. Even the “done” signal loses all meanings and just becomes communicated to yet another machine that judges. My role is reduced to that of a messenger.

Today I do not like much of the code that I see from systems built that way and neither do I enjoy interacting with too much of software built with AI assistence. Looping is powerful but it removes responsibility more and more, and it at least today very much encourages us to give in to the machine.

And yet I have no doubts that this looping future is going to be our future despite the fact that I presently resent it. I already see astonishingly small teams building at impossible speed and I see codebases turning more and more into obscure and confusing organisms that can only be diagnosed by more machines. Those codebases are simultaniously useful and messy.

So I guess I’m coming to terms with that the question is not whether we will loop because clearly we will. Maybe the question is that in a future of loops, how do we don’t abdicate judgment, how we can retain rules of good engineering, how we can ensure that responsible human can continue to supervise, how we need to re-think how we architect code to retain sanity along the way.

June 23, 2026 12:00 AM UTC

June 22, 2026


Rodrigo Girão Serrão

Write a coding agent from first principles

Learn how to write a coding agent in this Python tutorial that teaches how to interact with an LLM through an API, how to manage the conversation context, and how to do tool calling.

Introduction

This tutorial will show you how to create your own coding agent from first principles. By doing so, you'll understand how coding agents work under the hood.

Prerequisites

To be able to follow this tutorial, you'll need

The concepts explained in this tutorial are independent from your LLM provider but the code snippets will make use of the Claude API and its Python SDK. This means that you can follow along with a different model provider as long as you adapt the code snippets to match the format expected by the API of your provider.

What's a coding agent?

A coding agent is an agent that's specialised for coding. In turn, an agent is just an LLM that has been extended with extra functionality that allows it to interact with its environment. This extra functionality is provided through tools, one of the core ideas covered in this tutorial.

This short definition still hides a lot of details, but instead of giving you a theoretical definition you can learn what a coding agent is by creating one. That starts now.

Project set up

To set your project up, start by using uv to create a packageable app project[^2]:

% uv init --app --package agent
Initialized project `agent` at `/Users/rodrigogs/Documents/mathspp/agent`

Then, cd into the project and add the two dependencies you'll need:

% cd agent
% uv add python-dotenv anthropic

You'll use python-dotenv to help you with authentication to access the Claude API and you'll use the dependency anthropic to make it easier to interact with the Claude API.

To set up authentication, create a .env file and paste your Claude API key there in front of the variable ANTHROPIC_API_KEY. When you're done, your .env file should look like this:

ANTHROPIC_API_KEY="sk-ant-api03-qI_3mJ..."

To make sure you never upload your API key to GitHub by accident, add the file .env to your .gitignore:

# .gitignore
# ... other entries generated by uv
.env

Now that you've set up your project, you can make your first request to the Claude API.

Interacting with an LLM

A coding agent needs an LLM at its core. Your LLM can come from any provider you want but you're going to use Claude because its SDK (the dependency anthropic you added in the previous section) is easy to use and because Claude is a popular model provider.

Using the anthropic SDK, here's how you can send a message to the LLM:

# src/agent/__init__.py
from anthropic import Anthropic
import dotenv

dotenv.load_dotenv()  # Load .env

MODEL = "claude-haiku-4-5"...

June 22, 2026 12:32 PM UTC

June 21, 2026


The Python Coding Stack

2. Anatomy of an Agent

Read Stephen's Preface to Agents Unpacked if you're new here.


You have used a large language model. You know the deal: a careful prompt gets a careful answer. A vague prompt gets a vague one. And the model itself does not keep anything from one conversation to the next, unless something external is holding that context for it.

Agents work differently. They have parts that do things a plain LLM does not. These parts are what make an agent an agent. It is not just the model underneath. It is the structure built around it that gives the system its abilities to persist, act, and keep going.

Understanding this structure is the second major shift in this series. The first shift is seeing that a chatbot can give you a good answer without finishing the job, because it stops after responding. The second shift is seeing that an agent is not a smarter model. It is a model placed inside a structure that gives it something to act with and somewhere to keep what it has done.

The Agent Formula

Most agents share the same basic parts:

Different platforms package these differently. Some call memory “context,” some call tools “plugins” or “capabilities,” and some merge instructions and tools into a single configuration layer. But the parts are the same. An agent is not a single thing. It is a system, and each part matters.

Stephen: Don’t LLMs also have memory since they remember what happened earlier in the conversation? How’s this different?

Here is one distinction worth getting clear early: the context window and memory are not the same thing. The context window is the working space an LLM uses during a single session. It holds the conversation so far and gets loaded fresh every time the model gets a chance to speak. Memory, by contrast, is information stored outside the model, maintained by the system, and available across sessions and steps. We will come back to this.

An agent needs all its components:

Agent = Model + Instructions + Memory + Tools + Execution Loop

Leave any one of these out and the system changes behaviour in ways that matter. We will look at each piece in turn.


Subscribe now


What the Model Does and What It Doesn’t Do

The model is the reasoning core. It reads your request, figures out what to do, and decides what to say back. It gets the most attention because it is the part that generates language.

But a model on its own is like a brilliant mind with no hands and no memory of its own. It can think. It cannot act. It cannot remember what happened five minutes ago unless something explicit carries that information forward.

Stephen: Wait a second. You say the model doesn’t remember what happened five minutes earlier. But when I use an LLM, it does seem to remember what happened earlier in the conversation.

Here is what is actually happening. When an LLM appears to remember earlier in a conversation, it is not the model itself that is remembering. The context window is carrying all the earlier messages along with your new message, every time you send something. The model sees the full conversation again and generates a response that fits what came before. That is not memory in the model. That is the system feeding the model a transcript.

This trips up almost everyone when they start using agents. The model generates text. The rest of the system decides what to do with that text and whether to act on it.

A better model helps. It reasons more clearly, follows instructions more faithfully, and handles edge cases better. But dropping a smarter model into an agent that is missing a working execution loop will not make it an agent. You need the other parts too.

Instructions: The Agent’s Direction

Instructions tell the agent what it is supposed to do and how to behave. Some systems call these system prompts. Others call them agent definitions or behavioural instructions. The name does not matter. What matters is that they are the layer that tells the model why it exists, who it is helping, and what ‘good’ looks like for the task at hand.

Good instructions do not make an agent smarter. They make it more focused. They give it a frame for every decision: what to prioritise, what to avoid, when to ask for help, how to present its output.

Stephen: Are these what are often called ‘skills’, or are skills something else altogether?

Skills and instructions are related, but they are not the same thing. Instructions are the core behavioural direction: who the agent is, what it is for, how it should approach its work. A skill, in platforms like OpenClaw and Hermes, is a specific file that tells the agent how to carry out a particular task, often by combining one or more tools. So instructions tell the agent how to behave generally. A skill tells it how to do something specific. We will see this distinction more clearly when we look at how different platforms implement these parts.

The instructions shape what the agent notices, what it proposes, what it tries, and what it says no to. Two agents built on the same model with different instructions will behave differently in the same situation. They will notice different things, prioritise differently, and produce different outcomes.

Poorly written instructions can quietly break an agent. If the instructions are vague, the agent has to improvise every step. If they contradict each other, the agent has to choose, and it might not choose the way you intended.

Stephen: Can you provide a few examples of what these instructions may look like in different scenarios?

Here is what instructions might look like in practice. A poorly-written instruction can quietly break an agent. Consider an instruction that says “be helpful and concise” without defining either term. When a user asks for a full technical breakdown, the agent has to arbitrate between two vague goals. It might give a two-sentence answer that technically satisfies “concise” but ignores “helpful,” or it might give an exhaustive response that satisfies “helpful” but ignores “concise.” Either way, the agent is improvising because the instructions gave it no real frame for the conflict.

A research assistant agent might have instructions that say something like: “You are a research associate working for [user name]. Your role is to find, summarise, and organise information on topics the user assigns. Always cite your sources. Flag uncertainty rather than guessing. Present findings in a clear brief, not a wall of text.”

A code review agent might have very different instructions: “You are a principled code reviewer. Focus on correctness, clarity, and performance. Do not praise code unnecessarily. When you find an issue, explain why it matters and suggest a concrete fix. Keep responses short.”

The difference between those two sets explains a lot about why two agents can feel like entirely different systems, even if they use the same model underneath.

Memory: The Workspace and Context

Memory in an agent is not like human memory. It is a structured store of information kept and updated as the agent works. It is what lets the agent hold a thread across multiple steps without starting from scratch each time.

Most agents use some combination of three types:

This is not a personality feature. It is not the agent “remembering” in the way a person remembers their childhood. It is operational continuity. The system maintaining a thread of relevant information across time and steps.

Different platforms handle these differently. LangChain agents build up a rolling context window: the current request gets appended to everything that happened before, and the whole thing is passed to the model. If the conversation gets long, older turns get dropped or summarised to make room. AutoGen agents can maintain shared memory across a team, so that when one agent finishes a task, what it learned is available to the next agent that picks up the thread.

OpenClaw takes yet another approach. Its memory layer is a structured store that agents write to and read from across sessions. When an agent starts a new session, it can query that memory store for relevant context rather than relying solely on what was in the most recent conversation. An agent can know that the user prefers short emails, even if that was established three weeks ago.

Stephen: If memory can be stored in files, does it mean that agents can have nearly unlimited memory (within the limits of the computer or server’s overall memory capacity)?

There are practical limits even when storage is effectively unbounded. The more relevant limit is not how much the agent can store, but how well it can find and use what it has stored. A full inbox is not the same as a well-organised one. Retrieval becomes harder as memory grows, and irrelevant information can dilute the signal if the system does not manage it carefully.

Think of it this way. A context window that holds 128,000 tokens can technically hold a lot of information. But it can only hold what was placed there. An agent with a large memory store full of useful context still needs a way to surface the right information at the right time. If it cannot find what it needs, or if what it finds is buried under noise, the effective memory is constrained.

The quality of retrieval matters as much as the quality of storage. An agent that retrieves relevant context poorly is effectively working with a much smaller memory than one that retrieves well, even if both store the same amount.

Stephen: So, tell me if I understood this. The agent has an index telling it where to find information specific to certain topics or tasks. When the LLM part of the agent decides it needs to deal with a certain topic, it uses the index to read and load the information from the memory file into its context. Is that right?

That is broadly right. The memory store, the index, and the retrieval into context are the key parts. One small correction worth noting: the decision to retrieve from memory is typically made by the agent or coordinator layer, not by the LLM directly. The LLM receives the retrieved content as part of its context, but it is the agent system that decides what to look up and when. This distinction matters because it is the agent layer, not the model, that is doing the memory management.

Stephen: But isn’t the agent’s brain the LLM? Clarify the distinction in your answer above. Which part of the agent’s infrastructure deals with this?

It is a fair challenge. The LLM is genuinely where the reasoning happens. It reads context, generates text, and makes decisions about what to say or do next. But it is also just a text processor. It receives input, produces output, and has no awareness of anything beyond the tokens it has been given.

The coordinator layer is the infrastructure that sits around the LLM and manages the process. It reads the LLM’s output, decides whether to act on it, calls tools, retrieves memory, and feeds results back into the next LLM call. It is the difference between the LLM thinking and the agent doing. A bare LLM generates text. The coordinator turns that text into action.

To use a rough analogy: the LLM is like a pilot who can read instruments and make decisions. The coordinator is like air traffic control — it decides which runway to use, when to land, and when to divert. The pilot’s brain does the reasoning. But without the infrastructure around it, the pilot just sits in the cockpit thinking.

So when we say the agent retrieves memory, we mean the coordinator retrieves it and places it where the LLM can see it. The LLM does not reach into a file and pull something out. The coordinator does that work and presents the result to the LLM as part of the next context.

Stephen: And are the bits of these files then loaded into the LLM’s context? Therefore, the more stuff is loaded from the memory files, the more the context fills up, affecting the rest of the conversation and cost, right?

Yes, exactly right. Memory retrieval feeds into the context window, which is the LLM’s working space for the current session. Every token that goes into the context window is a token the LLM processes and a token that costs something. Loading a lot of context from memory means less room for the conversation itself, and it means higher token usage on every call.

This is one of the practical engineering tensions in agent design. Loading more memory gives the agent more to work with, but it also makes each LLM call more expensive and slower. A well-designed agent retrieves only what is relevant to the current task, not everything it knows.

Tools: What the Agent Can Actually Do

Tools are the capabilities that let an agent act beyond generating text. The model decides to use a tool. The tool performs an action and returns the result to the model.

This was covered in Chapter 1 under “Tools Are the Hands.” Here it is worth noting that tools are also where agents differ most between platforms. Some agents come with a large built-in toolkit. Others can call external tools through open protocols. Some let you build custom tools. Others are more locked down.

What tools might an agent actually have? A research agent might be able to search the web and read files on your machine. A coding agent might run shell commands and read or write files. A calendar agent might check your schedule and send messages. The tool is the bridge between the model’s decisions and the world the agent is working in.

What matters is not how many tools an agent has, but whether the tools it has are the right ones for the tasks you want it to perform.

Different platforms implement tools differently. LangChain provides a standardised tool interface that lets you connect to search APIs, databases, file systems, and custom functions. OpenCode agents run inside a development environment, where the tools available are the commands and interfaces of that environment. OpenClaw uses an open tool protocol that lets agents call external capabilities regardless of who built them. Hermes takes a more composed approach: a skill file specifies not just what the agent should do, but which tools to use and in what combination to carry out a specific task.

Here is the thing worth unpacking. A tool on its own is just a capability. What makes it useful is the bridge between what the agent is trying to accomplish and the tool that can help. A calendar tool is useless if the agent does not know it should check the schedule. An agent running a meeting-preparation skill that says “check availability, send invites, prepare a briefing document” has that bridge built in.

The Execution Loop: The Part That Makes It an Agent

The execution loop is the cycle that takes an agent from a single-shot response to a sustained process. Observe, think, act, check, repeat.

This was the core of Chapter 1. But it is worth restating here, in the context of anatomy, because the loop is what ties all the other parts together. Without it, you have a model that receives instructions and context and produces text. With it, you have a system that can pursue a goal across time, recover from partial failures, and stop when the work is genuinely done.

The loop is the difference between an agent and a very well-instructed chatbot.

Here is why the repeat step matters so much. A model has no native sense of when it is done. When you call a function in code, the function returns and you are finished. When a model generates text, it produces tokens until it hits a stop condition built into the model itself, most commonly a token limit or a designated stop sequence. These conditions tell the model when to stop generating, but they do not tell the agent whether the result is actually what the user wanted. There is no built-in check that says “is this the right answer?”

The execution loop provides that check. The check phase asks: is the result good? Does it meet the original goal? If not, the loop continues. Sometimes that means a dozen or more cycles before a task is genuinely complete.

The loop also determines how goals decompose. In LangChain’s ReAct-style agents, the loop runs inside a single agent: observe, decide on the next action, execute it, check the result, repeat. In AutoGen, the loop is distributed across multiple agents that hand off to each other. A planner agent might coordinate specialist agents, each running their own loop on their own piece of the problem. OpenClaw uses a coordinator agent to manage the loop, assigning work to sub-agents and handling the check phase across the full task rather than within a single agent cycle.

The architecture of the loop is one of the most significant differences between agent platforms. But the function is the same everywhere: turning a sequence of isolated model calls into a coherent, goal-directed process.

Multiple Platforms: Comparing the Formula in Practice

It helps to see the same five-part formula playing out in different platforms. Here is how a few of them map onto it.

LangChain is one of the most widely-used agent frameworks. A LangChain agent has an LLM at its core, a set of tools, a prompt defining the agent’s role, memory that accumulates conversation history, and an agent executor that runs the loop. The loop in LangChain is explicit: the agent executor repeatedly calls the model, parses the model’s tool-call output, runs the tool, and feeds the result back until the model says it is done.

AutoGen takes a different approach. Rather than a single agent, AutoGen sets up a team of agents that communicate with each other. Each agent has a model, instructions defining its role, and its own set of tools. The loop is distributed: there is no single execution cycle. Agents exchange messages, delegate tasks to each other, and the overall process continues until the team has finished the assigned goal. Memory in AutoGen can be shared across agents so that one agent’s work is available to the next.

OpenClaw uses a coordinator agent that manages the overall execution loop. Sub-agents each have their own identity, tools, and memory. The coordinator decides which sub-agent handles which part of a task, passes context between them, and handles the check phase across the full goal. Skills in OpenClaw are files that tell a specific agent how to carry out a particular task, combining instructions about what to do with definitions of which tools to use.

Hermes also uses a skill-based architecture where skill files define both the instructions and the tool configuration for specific tasks. Rather than a single general-purpose agent, Hermes composes agents from skills that know how to use particular tools in particular contexts.

OpenCode works differently again. It runs agents inside a development environment, typically a cloud workspace. The tools available to the agent are the commands and interfaces of that environment. The loop is typically managed at the task level: the agent receives a task, works through it using the tools at its disposal, and reports back. There is less of a formalised multi-step loop and more of a task-completion focus.

None of these platforms invents new parts of the agent formula. They all use a model, instructions, memory, tools, and an execution loop. What differs is how those parts are implemented, how they are divided up, and how they communicate. Understanding the formula means you can look at any of these platforms and see what you are actually looking at.

What This Chapter Covered

This chapter pulled apart the five components of the agent formula.

We saw how the model is the reasoning core but cannot act or remember on its own. How instructions shape the agent’s focus and behaviour, and why the same model with different instructions can feel like a different system entirely. How memory provides operational continuity across steps and sessions, and why retrieval quality matters more than storage capacity. How tools extend what the agent can do beyond generating text, and why a tool is only as useful as the bridge between the model’s decisions and the action the tool can take. And how the execution loop is the architecture that turns isolated model calls into a coherent, goal-directed process.

We also saw how different platforms implement the same five components differently: LangChain’s explicit agent executor, AutoGen’s team-based coordination, OpenClaw’s coordinator and skill-based sub-agents, Hermes’s composed skill architecture, and OpenCode’s environment-integrated approach.

The goal was not to become an expert on any one platform. It was to show that agents are not mysterious black boxes. They are systems built from a small number of recognisable parts, and once you know what to look for, you can see the anatomy underneath any agent platform you encounter.

Next up in Agents Unpacked: we dig into tools and skills: what it actually means for an agent to do something rather than just say it, and why a well-tooled agent operating autonomously in a loop is a fundamentally different thing from a model answering questions.


<< Previous Post: From Answer to Outcome

>> Next Post: Coming Soon

Table of Contents


stephengruppetta.com

June 21, 2026 09:48 PM UTC


Christian Ledermann

Stop Copy-Pasting Your .pre-commit-config.yaml

Stop Copy-Pasting: Introducing pc-init

We’ve all been there: you’re starting a new project, you’ve got your repo initialized, and now comes the tedious part—setting up the quality gates.

You know you need pre-commit (or the newer prek) to keep your code clean, but you end up hunting through your older repositories to find the "best" .pre-commit-config.yaml to copy and paste. Then, you spend ten minutes editing paths, versions, and configurations to match the specific needs of your new stack.

It’s a chore that breaks your flow before you’ve even written a line of code. That is exactly why I built pc-init.

The Problem: The "Config Archeology" Workflow

Most languages and frameworks have a gold standard for quality tools. If you're working in Python, you want ty and ruff; in React, you want eslint and prettier.

Manually setting these up requires:

  1. Identifying the recommended tools for your specific stack.
  2. Looking up the correct hook repository URLs, revisions, and arguments.
  3. Writing the YAML by hand (and hoping you didn't miss a syntax error).

Doing this every other month is just frequent enough to be a persistent pain point, but not frequent enough to have the workflow committed to muscle memory.

The Solution: pc-init

pc-init is a CLI tool designed to replace "config archeology" with a single, declarative command. It scaffolds a production-ready .pre-commit-config.yaml based on your project's technology stack.

Instead of copying old files, you simply tell pc-init what you're building, and it builds the configuration for you.

How it works

pc-init uses a system of language and framework presets. You provide the parameters, and it handles the rest:

# For a standard Python project
pc-init --lang py

# For a JavaScript project using React
pc-init --lang js --framework react

# For a Python project using Django
pc-init --lang py --framework django --force

Why use pc-init?

Get Started

If you’re ready to speed up your setup process, you can install pc-init via uv:

uv tool install pc-init

After generating your config, don't forget to run pre-commit autoupdate or prek autoupdate to ensure you are pulling the very latest versions of your selected tools.

If you have suggestions for new presets or run into issues, please head over to the GitHub repository and open an issue. Let’s make boilerplate setup a thing of the past.

June 21, 2026 10:53 AM UTC


Bob Belderbos

From Python to Rust: Master Iterators by Rebuilding 10 Unix Tools

The fastest way I know to learn a language is to rebuild something you already understand. You stop fighting the problem and spend your attention on the syntax and the idioms. That is the whole idea behind the new Unix tools track I just released on the Rust platform: ten small command-line classics, each one a pure function you implement and cargo test to validate you got it right.

Why Unix tools, and why from Python

Every exercise opens with the Python you would write, then teaches the Rust idiom that replaces it. People who love the platform keep pointing at the same thing: the Python-to-Rust bridge is what makes the concepts stick.

You are not memorizing Iterator methods in the abstract, you are watching len(text.split()) turn into .split_whitespace().count(), and reaching for Option and Result instead of Python's exceptions.

Take cut. In Python an invalid field raises an exception at runtime:

def cut(text, delim, field):  # field is 1-based
    if field == 0:
        raise ValueError("field values may not include zero")
    ...

In Rust both outcomes live in the return type:

#[derive(Debug, PartialEq)]
enum CutError {
    ZeroField,
}

fn cut(text: &str, delim: char, field: usize) -> Result<Vec<&str>, CutError> {
    // ...
}

#[test]
fn field_zero_is_an_error() {
    assert_eq!(cut("a:b", ':', 0), Err(CutError::ZeroField));
}

Encoding the failure in the type, not just the docs, is a core Rust idea: the compiler makes every caller account for it, and that is a big part of what makes Rust code safer. The failing case becomes a test you code towards.

Iterators are the spine of the track, seven of the ten exercises turn for loops and comprehensions into iterator chains. Every exercise relates back to one or more Python idioms you already know.

For me Rust is hard but thanks to comparison with Python I feel that I understand more. - Piotr R

Having a direct comparison with Python snippets keeps me more in the context of what's going on. - Michal S

The track also follows one rule that matters for real CLIs: the logic lives in a pure, testable function, while the I/O (reading a file or stdin) stays in a thin wrapper around it.

That split is exactly how you would structure a professional Rust tool, and it is why every exercise can be validated by a test instead of by running a binary. I wrote more about how this rewiring changes your Python instincts in Rust made me a better Python developer.

The 10 exercises and what each one teaches

  1. wc: count lines, words, characters. Iterators plus .count() replace len(...), and you return a (usize, usize, usize) tuple. The character count is also a sneak intro to why chars().count() is not len() in a Unicode world.

  2. head & tail: first and last N lines. head uses lazy .take(n) and stops early; tail forces you to collect first and slice from the end, because you cannot run an iterator backwards.

  3. cat -n: number the lines. Rust's .enumerate() starts at 0, not 1 (there is no start= argument), and you rebuild the numbered text from there.

  4. tr: translate and delete characters. The exercise that makes the char vs &str distinction click: 'l' is a char, "l" is a &str, and you .map() over .chars() to rewrite each one.

  5. grep: filter matching lines, with -i and -v. Substring tests with .contains(), case folding, and a single boolean condition that handles both -i and -v without branching. First taste of borrowing and lifetimes in the signature.

The Unix tools track on the Rust platform: all 10 exercises from wc to the top_words capstone, with difficulty levels and completion status

  1. cut: extract a field. Two outcomes that pull in different directions: a missing field is normal (skip the line), but field == 0 is a bad request. You model that with Option and Result instead of Python's exceptions.

  2. uniq -c: count adjacent duplicates. Rust has no itertools.groupby, so you walk runs by hand with pattern matching and a Vec, keeping the consecutive-only behavior honest.

  3. sort: sort lines, with -n. Iterators do not sort, so you collect into a mut Vec and call .sort(); numeric order means sort_by_key with a closure, Rust's answer to Python's key=.

  4. sed s///: find and replace per line. str::replace swaps every match, replacen stops after a count, so the g flag is just a choice between two methods, no manual counting.

  5. top_words: the capstone. Compose everything: count word frequencies with the HashMap entry API (entry().or_insert(), since there is no Counter), then sort and take the top n. This is the tr | sort | uniq -c | sort -rn | head pipeline rebuilt as one Rust function.


If you have been meaning to get past reading about Rust to actually writing it and making the concepts stick, begin with the free wc and head / tail exercises. Each one is short, has tests to code towards, and there is no AI on the platform; you have to do the work. I hope you learn a lot: start the Unix tools track.

Next up on the platform: a track on Rust lifetimes.

June 21, 2026 12:00 AM UTC

June 20, 2026


Bob Belderbos

Profile First: A 10x Faster Django Test Suite

The Rust Platform Django test suite took 30 seconds to run. I had a hunch it was database-related. Of course I was wrong. I profiled it with cProfile and cut it from 30 to 3 seconds.

Stop guessing, run the profiler

The instinct on a slow test suite is to start making assumptions: too many fixtures, the database is slow, I should parallelize, it's the GIL. Every one of those is a real fix for some issues. The problem is you don't know until you measure.

From High Performance Python:

Sometimes it’s good to be lazy. By profiling first, you can quickly identify the bottlenecks that need to be solved, and then you can solve just enough of these to achieve the performance you need. If you avoid profiling and jump to optimization, you’ll quite likely do more work in the long run. Always be driven by the results of profiling.

cProfile ships with Python. In this case I pointed it at my test suite:

uv run python -m cProfile -o tests.pstats -m pytest -k unit

That gives you a pstats file. You can read it from the command line, but it's not intuitive:

uv run python -c "import pstats; pstats.Stats('tests.pstats').sort_stats('tottime').print_stats(10)"

Adam Johnson recently released profiling-explorer, a browser viewer over the same data. You can invoke it like this:

uvx profiling-explorer tests.pstats

It opens on http://127.0.0.1:8099. I sorted by internal time and the bottleneck was obvious:

profiling-explorer showing pbkdf2_hmac dominating internal time in the Django test run

Reading the numbers without fooling yourself

Two columns matter:

Sort by tottime to find what to fix:

FrameCallstottimeShare
_hashlib.pbkdf2_hmac17726,470 ms82.8%
psycopg2 cursor.execute4,6821,522 ms4.8%

pbkdf2_hmac is Django's default password hasher. It's meant to be slow, because it resists brute-forcing real passwords. But it fires on every test that creates a user: fixtures, create_user, client.login. In production it runs once per login. In a test suite it runs hundreds of times for no security benefit at all.

One more thing the table teaches you: on a high-call-count frame, the call count is the signal, not the time. cursor.execute at 4,682 calls is mostly real database wait, and the count is high because every test sets up its own data: inserts, auth lookups, session reads, across 355 tests. A high count is worth a second look (it can hide an N+1), but you have to confirm that before believing it. (cProfile adds a fixed cost per call too, but at a few thousand calls that overhead is marginal; it only distorts the picture when a frame fires millions of times.) PBKDF2 only ran 177 times, so its time really was concentrated there, and it was the real culprit.

The five-line fix

Swap the hasher to fast MD5 in tests only. An autouse fixture in conftest.py (default "function" scope does it for every test):

import pytest


@pytest.fixture(autouse=True)
def fast_password_hashing(settings):
    settings.PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

Hash strength is irrelevant under test, so there's no loss of coverage. Test suite speed went from ~30s to ~3s.

It turns out this is also a documented Django trick, which is the point: the profiler led me to a known fix I didn't know I needed, instead of me making assumptions.

The new top frame is cursor.execute at ~4.7k calls, which smelled like an N+1. So I measured that too: I wrapped the suite to capture every query and grouped them by shape. No N+1. The list views already batch their lookups into one query, and the high count was just 355 tests each doing honest setup. The only real waste was a save() path firing redundant COUNT queries, cheap, but easy to halve.

Which is the whole point: I guessed N+1 and was wrong again. The profiler keeps you honest. What's the slowest thing you run every day that you've never actually measured? Let me know on LinkedIn or X/Twitter.

June 20, 2026 12:00 AM UTC

June 19, 2026


Core Dispatch

Core Dispatch #6

Welcome back to Core Dispatch! This edition covers June 4 through 19, 2026. Python 3.14.6 and 3.13.14 landed on June 10, and the next milestone is 3.15.0 beta 3 on June 23.

The big news this fortnight comes from the Steering Council, who put out an announcement on the path forward for the experimental JIT. The JIT entered CPython's main branch as an experiment, alongside the Informational PEP 744. The Council would like to see its path forward worked out through a Standards Track PEP, giving the project the explicit, structured conversation it hasn't really had yet about what people expect from a JIT, including performance targets, interop guarantees, and tooling compatibility.

On a related note, JIT contributors have opened a thread to gather community perspectives on the JIT as they begin drafting that PEP. Give it a read, and if you've got experiences, expectations, or concerns to share, it's a good place to weigh in.

It's been a bit quieter on the PEP front over the past two weeks, though PEP 835, a shorthand syntax for Annotated type metadata, was newly drafted.

Over on the PSF side, the Board has published the draft of its 2026 strategic plan: six organizational goals and four program goals spanning financial sustainability, supply chain security, and community empowerment. The feedback window is open through June 25, so if you've got thoughts, now's the time. The 2026 PSF Board election dates are out too.

As always, if you maintain a package or just like living on the edge, give the latest 3.15 beta a spin and file any issues you find.

Upcoming Releases

Official News

PEP Updates

Steering Council Updates

Merged PRs

Discussion

Core Dev Musings

Upcoming CFPs & Conferences

Community

Credits

June 19, 2026 12:00 AM UTC


Bob Belderbos

End-to-End Testing Every Rust Exercise with Playwright

The Rust platform has 71 exercises and counting (I just added a new track of Unix exercises). They all share the same interface: load an editor, type code, validate it against a Rust backend. When I make any changes to the platform, how do I confirm nothing breaks? Enter end-to-end testing with Playwright.

The Problem

Manual testing doesn't scale. Every time I add an exercise, tweak the editor, or update the validation flow, I need confidence that all exercises still work. Not just that the page loads, but the full loop: login, navigate, type code, submit, see results.

Unit tests cover the Django app, and the Rust validator has its own test suite. But neither exercises the full path a student takes: loading an exercise and getting a real pass or fail back.

One Test Function, 71 Test Cases

Playwright with pytest covers this in under 50 lines. Here's the core test file:

import psycopg2
import pytest
from decouple import config

from .constants import DOMAIN

exercises = []
with psycopg2.connect(dsn=config("DATABASE_URL")) as conn:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT slug, solution FROM bites_exercise WHERE public = true"
        )
        exercises = cursor.fetchall()


@pytest.mark.parametrize("exercise", exercises, ids=[ex[0] for ex in exercises])
def test_exercise(logged_in_page, exercise):
    slug, solution = exercise
    page = logged_in_page

    exercise_url = f"{DOMAIN}/{slug}"
    page.goto(exercise_url)
    page.wait_for_url(exercise_url)

    page.wait_for_selector(".CodeMirror", state="visible")
    page.wait_for_function(
        "document.querySelector('.CodeMirror')?.CodeMirror !== undefined"
    )

    page.evaluate(
        f"""document.querySelector('.CodeMirror').CodeMirror.setValue({repr(solution)})"""
    )
    page.click("#validate-button")

    page.wait_for_function(
        "document.querySelector('#feedback').innerText.includes('Congrats') || "
        "document.querySelector('#feedback').innerText.includes('Oops')",
        timeout=30000,
    )

    validate_result = page.text_content("#feedback")
    assert "Congrats, you passed this exercise" in validate_result

The database query at module load fetches every public exercise with its solution. @pytest.mark.parametrize turns that into 71 test cases. Each test navigates, injects the solution, validates, and asserts success.

Here is Playwright running the tests locally against the real Rust validator, one exercise after another:

Your browser does not support embedded video.

Patterns That Made It Work

Session-scoped fixtures for speed

Launching a browser is expensive. Logging in is expensive. Do it once:

@pytest.fixture(scope="session")
def browser(e2e_user):
    with sync_playwright() as p:
        with p.chromium.launch(headless=HEADLESS) as browser:
            yield browser


@pytest.fixture(scope="session")
def logged_in_page(browser):
    page = browser.new_page()
    page.set_default_timeout(30_000)
    page.goto(f"{DOMAIN}/pbadmin/")
    page.fill('input[name="username"]', LOGIN)
    page.fill('input[name="password"]', PASSWORD)
    page.click('input[type="submit"]')
    yield page

All 71 exercises run against the same authenticated browser session, so the login cost is paid once.

Waiting for CodeMirror

One tricky thing with Playwright is timing. Sometimes elements are not yet ready when you hit the page. In this case you have to wait for the CodeMirror editor to be fully initialized before injecting code:

page.wait_for_selector(".CodeMirror", state="visible")
page.wait_for_function(
    "document.querySelector('.CodeMirror')?.CodeMirror !== undefined"
)

page.evaluate(
    f"""document.querySelector('.CodeMirror').CodeMirror.setValue({repr(solution)})"""
)

First we wait for the selector to be visible, then we wait for the JavaScript instance to be ready.

Avoiding Django's async context trap

Another issue I faced was creating a Django user inside a Playwright fixture triggered:

SynchronousOnlyOperation: You cannot call this from an async context

I worked around it by creating the test user in a separate fixture that runs before Playwright starts:

@pytest.fixture(scope="session")
def e2e_user(django_db_blocker):
    with django_db_blocker.unblock():
        return ensure_e2e_user()


@pytest.fixture(scope="session")
def browser(e2e_user):  # e2e_user runs first
    with sync_playwright() as p:
        ...

The django_db_blocker.unblock() context manager allows database access in session-scoped fixtures. Order matters: the user must exist before the browser fixture runs.

Running Locally vs CI

The E2E suite runs against a live database with the Rust validator running. That's deliberate: for this layer I want real integration, not mocked responses. (The mocked cases have their own home, more on that below.)

# Run all 71 exercises
uv run pytest tests/test_e2e.py -v

# Debug a specific exercise
HEADLESS=False uv run pytest tests/test_e2e.py -v -k "exercise-slug"

By default, the tests run headless, which means no browser window opens. This is faster and works well in CI. If you want to see what Playwright is doing, set HEADLESS=False to open a visible browser window.

This is essential for debugging why a particular exercise fails. Use the -k option to filter for a specific exercise by slug. And you can use --pdb to leave the browser window open when a test fails, so you can inspect the state.

For CI, I run unit tests only. E2E tests require the Rust backend and take longer. I run them locally before pushing major changes. This is a good example of separating unit and integration tests.

What about the unhappy paths?

A reader pointed out that these E2E tests only cover the happy path: type the correct solution, see "Congrats". Fair observation. What happens when a student submits wrong code, or the validator itself blows up?

Those cases live in the unit tests, where I mock out the validator API call:

ScenarioTestAsserted message
Wrong code (tests fail)test_validate_failuremock returns success: Falseb"Oops, try again"
Correct codetest_validate_successsuccess: Trueb"Congrats"
Runner/API throwstest_validate_api_errormock_post.side_effect = Exceptionb"Error while executing code"
Exercise doesn't existtest_validate_exercise_not_foundb"Exercise not found"
Tests deleted/alteredtest_validate_missing_testsb"tests are missing"
Prohibited code (std::fs, unsafe, include!, std::io...)test_validate_prohibited_pattern_*b"prohibited pattern" (parametrized over 6 snippets)

This is the split that makes the whole thing manageable. The E2E suite proves the full loop works end to end against the real validator. The unit tests, with the validator mocked, assert the exact message a student sees in each error case. Mocking is the natural home for these: some failure modes, like the runner throwing an exception, are hard to trigger on demand against a live backend, and even the ones you could reproduce in a browser are faster and clearer to pin down with a mock. See How to Tell if Your Python Mock Is Actually Working for the gotchas there.

What this buys me

Add an exercise, it's tested automatically, no new test code. That's the whole point of parameterizing over the database instead of hand-writing cases. Every frontend change now runs through a regression suite before it reaches users.

I find Playwright more modern and ergonomic than Selenium; the one rough edge is element timing, which the wait_for_function calls above handle. If you're testing anything with dynamic content, parameterizing over your real data beats writing one test per case.

June 19, 2026 12:00 AM UTC

June 18, 2026


Ned Batchelder

Dodecahedron with stars

I saw this dodecahedron with an Islamic-inspired pattern designed by Taj Ragoo. As soon as I saw it, I knew I had to make one. I studied the pattern, wrote some Python, and made myself a PDF. I cut it out, folded it, glued it together, and now I have one of my own:

A paper dodecahedron on a cluttered desk

I love that this elegantly combines two pure geometric forms: the Platonic dodecahedron (12 uniform pentagons), and an Islamic pattern using five-pointed stars.

Looking closely, details emerge:

The dodecahedron with some regions of the pattern highlighted in colors

Each face has ten small stars in a ring. I’ve lightened them a bit in the front face here. At the center of each face is a ten-pointed star (highlighted in red), made of two overlaid five-pointed stars.

The real genius of the pattern is at the corners. I’ve highlighted one in blue. It’s a star made of the same parts as the central ten-pointed star, but there are only nine points. It works because three pentagons lying flat touching at a point occupy 324 degrees, leaving a 36-degree gap.

When the dodecahedron is folded together, the gap is closed. 36 degrees is exactly one-tenth of a complete 360-degree circle, so exactly one point of the ten-pointed star is missing, leaving a perfect nine-pointed star using the same shapes, spread over the corners of three pentagons. Beautiful!

If this appeals to you, follow Taj on Instagram: he’s got more Platonic/Islamic mashups to enjoy. The paper versions are just prototypes of the final versions he makes in wood.

Of course, you can get my PDF and make one for yourself:

A thumbnail of the PDF

The Python code to draw the net isn’t great: it has no real parallels to the structure of each face. It’s a lot of math and line drawing to get things in the right places. My ideal would be to have a toolset that used a tile-placing abstraction, to be able to do more interesting designs. Some day.

It was a joy to work on this though. It was a slow process of studying the original, working out the math, then mulling over coding approaches. The code was developed in small steps over weeks. Then printing initial versions, marking them up, working out the tab structure. Some copies were colored to understand how the lines flowed across the whole dodecahedron. It was good to be working in both the mental and physical worlds:

Various stages of progress in a messy pile

Update: it looks like the design was originally by Dana Awartani: Dodecahedron Within an Icosahedron.

June 18, 2026 10:15 AM UTC


Python Software Foundation

PSF Board Election Dates for 2026

Python Software Foundation (PSF) Board elections are a chance for the community to choose representatives to help the PSF create a vision for and build the future of the Python community. This year, there are 4 seats open on the PSF Board. Check out who is currently on the PSF Board on our website. (Cheuk Ting Ho, Christopher Neugebauer, Denny Perez, and Georgi Ker are at the end of their current terms.) 

The recent approval of the Packaging Council (PC) through PEP 772 means that the PC election will be held in parallel to the PSF Board election. For the first PC election, communications will be published on the PSF blog. Once the first PC has been established, they will define the standard lines of communication and more PC election process specifics for the future. More information on the PC election coming soon.

Board Election Timeline

Voting 

You must be a Contributing, Supporting, or Fellow member by August 25th and affirm your intention to vote to participate in this election. Reminder: If you were formerly a Managing member, your membership type was changed to Contributing per 2024’s Bylaw change that merged Managing and Contributing memberships

Check out the PSF membership page to learn more about membership classes and benefits. You can affirm your voting intention by following the steps in our video tutorial:

Per another recent Bylaw change that allows for simplifying the voter affirmation process by treating past voting activity as intent to continue voting, if you voted last year, you will automatically be added to the 2026 voter roll. Please note that if you removed or changed your email on psfmember.org, you may not automatically be added to this year's voter roll. 

If you have questions about membership, please email psf-elections@pyfound.org.

Election communications from psfmember.org

PSF Members should review their communication preferences on psfmember.org if you would like to opt in or out of receiving emails about either the PSF Board, PC elections, or both. Here’s how:

If you had previously opted out of communications from the PSF through psfmember.org and would like to start receiving them, we encourage you to update them using the instructions above. If you're not sure what how your psfmember.org communication preferences are currently set, you can check via the "Name and Address" tab mentioned above, and make any adjustments as desired. 

The PSF only sends a handful of election and fundraising related communications every year via psfmember.org. The PSF newsletter runs through a separate mailing list (and we heartily welcome you to sign up for our newsletter!). 

Run for the Board

Who runs for the board? People who care about the Python community, who want to see it flourish and grow, and also have a few hours a month to attend regular meetings, serve on committees, participate in conversations, and promote the Python community. We're looking for candidates with a diverse range of skills and backgrounds, including leadership experience, fundraising knowledge, non-profit familiarity, and event organizing. Technical expertise, a record of collaboration, and experience speaking or teaching in the Python community are also all qualities we hope to see in Board members.

Want to learn more about being on the PSF Board? Check out the following resources to learn more about the PSF, as well as what being a part of the PSF Board entails:

You can nominate yourself or someone else. If you're nominating someone else, we'd encourage you to reach out to them first to make sure they're excited about the opportunity and give them a heads up that they'll need to submit their own nomination statement too. Nominations open on Tuesday, July 28th, 2:00 pm UTC, so you have time to talk with potential nominees, research the role, and craft a nomination statement for yourself or others. Take a look at last year’s nomination statements for reference. 

Learn more and join the discussion

You are welcome to join the discussion about the PSF Board election on our forum. This year, we’ll also be hosting PSF Board Office Hours on the PSF Discord in July and August to answer questions about running for and serving on the board. Subscribe to the PSF blog or, if you’re a member, join the psf-member-announce mailing list to receive updates leading up to the election.

June 18, 2026 08:18 AM UTC


Bob Belderbos

When to use classmethod, staticmethod, or instance method in Python

In a coaching call this week we discussed a create classmethod, and someone asked the obvious question: why is that here? It just forwarded its arguments to __init__. We ended up discussing the difference between instance methods, classmethods, and staticmethods, and how to tell which is which. Here's a simple decision rule.

The decision rule

Look at what the method actually touches:

  1. Needs the instance (self) → instance method
  2. Needs the class (cls) but not a specific instance → @classmethod
  3. Needs neither@staticmethod

Nice, but what are some actual use cases? Let's look at the create method that prompted the question.

The create method from the call fails the rule above. It took the same arguments as __init__ and passed them straight through. It still adds a nice interface (Class.create(...)), but it doesn't do any work that the constructor doesn't already do:

# shortened for clarity
@classmethod
def create(cls, amount: Decimal, currency: Currency = Currency.EUR) -> "Expense":
    return cls(amount=amount, currency=currency)

When a classmethod earns its place

A classmethod pulls its weight when it does work the constructor shouldn't, or builds the object from a different starting point. Add a normalization step and the same method suddenly has a job:

@classmethod
def create(cls, amount: Decimal, currency: Currency = Currency.EUR) -> "Expense":
    return cls(amount=amount.quantize(Decimal("0.01")), currency=currency)

The canonical use of a @classmethod is the alternative constructor. Python won't let you overload __init__, so when you want to build an object several ways, each way becomes a classmethod.

The standard library has rich examples, for example take a look at datetime.date:

date.today()                      # from the system clock
date.fromtimestamp(1718539200)    # from a POSIX timestamp
date.fromisoformat("2026-06-16")  # from an ISO 8601 string
date.fromordinal(739418)          # from a proleptic Gregorian ordinal
date.fromisocalendar(2026, 25, 1) # from ISO year/week/day

Source:

    # Additional constructors

    @classmethod
    def fromtimestamp(cls, t):
        "Construct a date from a POSIX timestamp (like time.time())."
        if t is None:
            raise TypeError("'NoneType' object cannot be interpreted as an integer")
        y, m, d, hh, mm, ss, weekday, jday, dst = _time.localtime(t)
        return cls(y, m, d)

    @classmethod
    def today(cls):
        "Construct a date from time.time()."
        t = _time.time()
        return cls.fromtimestamp(t)

    ...
    ...

Every one of those returns a date, but starts from different raw material. They have to be classmethods because they need cls to construct the instance, and they return cls(...) which makes it also work with subclasses. For instance, if MyDate subclasses date, then MyDate.today() will return a MyDate instance, not a date.


Bonus: I was annoyed that my pysource package didn't work, so I've since patched it, and now you can get to this source code easily with:

uvx --from pybites-pysource pysource -m datetime.date

(I tend to pip this into Vim with | vi - to read the source code in a scratch buffer.)


You'll see the same pattern across the ecosystem: dict.fromkeys(...), int.from_bytes(...), and in Pydantic Model.model_validate(...) / model_validate_json(...) are all classmethods that build an instance from different raw material.

Another classmethod use case is class-level state: registries, caches, counters. A plugin registry is the clean example, because the method reads and mutates state that belongs to the class, not to any instance:

class Handler:
    _registry: dict[str, type["Handler"]] = {}

    @classmethod
    def register(cls, name: str, handler: type["Handler"]) -> None:
        cls._registry[name] = handler

    @classmethod
    def get(cls, name: str) -> type["Handler"]:
        return cls._registry[name]


# called on the class, no instance needed; it mutates state that lives on the class
Handler.register("json", JSONHandler)

When it's really a staticmethod

If the method touches neither self nor cls, it's a staticmethod, which is a plain function that happens to live inside the class for namespacing. That's a legitimate choice when the helper is tightly bound to the class and you want Expense.normalize(...) to read well. It's now part of the class API (it shows up in dir(Expense)) and can be called without an instance.

Genuine staticmethods are rarer than the other two, which itself tells you something. A clean example is a Color class with conversion helpers (from a Pybites exercise):

class Color:
    def __init__(self, name: str):
        self.name = name
        self.rgb = COLOR_NAMES.get(name.upper())

    @staticmethod
    def hex2rgb(hex_value: str) -> tuple[int, int, int]:
        return tuple(int(hex_value[i:i + 2], 16) for i in (1, 3, 5))

    @staticmethod
    def rgb2hex(rgb: tuple[int, int, int]) -> str:
        return f"#{rgb[0]:02x}{rgb[1]:02x}{rgb[2]:02x}"

hex2rgb and rgb2hex touch neither the instance nor the class. They're pure conversions that live on Color so Color.hex2rgb("#ff0000") reads well next to the rest of the API.

But that's exactly the signal worth noticing: a staticmethod might be just a function in disguise, and sometimes the honest move is to pull it out to a module-level function where it's easier to test and use on its own.

Summary

Method typeFirst argumentAccess toCommon use case
Instance methodselfInstance & class stateModifying object state
Class method (@classmethod)clsClass state onlyAlternative constructors, registries
Static method (@staticmethod)noneNeitherIsolated utility/helper functions

Why this matters more now

When you write the code yourself, you rarely add a method without a reason. When an agent writes it, you get plausible-looking structure that nobody chose. A create classmethod that does nothing, a staticmethod that should be a free function, a helper hanging off the wrong class. That's your judgment call: is this method doing work that belongs to the class, or is it just a pattern the agent learned from other code?

It pays to slow down and look critically at any code and ask those questions. With AI producing more code faster, it's easy to assume that if it looks like Python, it's good Python. But the agent has no taste, and it will happily produce code that is technically correct but structurally wrong.

This is also why I keep writing articles like this one: to give you a simple decision rule you can run in your head during review. It reminds me of Rust, which makes data flow explicit right in the signature with self, &self, and &mut self. The signature tells you what the method touches, same idea as the rule here. (That data-and-behavior split is the whole theme of Why Rust does not need OOP.)

So use AI, but keep developing your knowledge and taste. The more you know, the better you judge the code that comes your way, whether a human or an agent wrote it.

June 18, 2026 12:00 AM UTC