skip to navigation
skip to content

Planet Python

Last update: June 23, 2026 01:48 PM UTC

June 23, 2026


Glyph Lefkowitz

Adversarial Communication

As I have discussed in previous posts, “AIs” can make mistakes. In fact, they do make mistakes, and their mistake-making patterns are such that where and how they will make mistakes is both uncertain and constantly changing.

Thus, in any scenario where you want to attempt to make “productive” use of “AI”, you must have a system in place for checking every result. Not checking some results; checking every result. If each result might have a consequence for you (and if it didn’t have a consequence, why bother automating it?) and you cannot predict in advance which kinds of results will need verification, then verification is always required.

The verification often ends up being just as expensive as doing the work in the first place, which means that if you want your usage of “AI” to be personally profitable, you have to find someone else to externalize the cost of verification onto. This person becomes your adversary, and, if you are successful, your “AI’s” victim.

The Ladder-Climber And Their Reverse-Centaur Rungs

One way that this constellation of facts can straightforwardly assemble themselves into a dystopian nightmare is the phenomenon, described by Cory Doctorow, of the reverse centaur. This is when your employer non-consensually turns you into the verification system. The “AI” does the fun part of initially performing the work, and then you do the boring part where you check if the robot is right and clean up its messes, even if everyone already knows that it would, in aggregate, be cheaper for you to do the work in the first place.

Reverse centaurs can be made from any automation, not only “AI” automation. I think that there is a reason that this term happens to have emerged in the “age of AI”, though, and not with earlier automation technologies (even those which were considerably more viscerally horrific). That reason is: the wrongness of “AI” output is not merely a technical feature that must be compensated for, it is a generalized externality.

As I mentioned above, if you are responsible for the entirety of the work, both extruding the “AI” output and checking it, it’s usually cheaper to have humans do the entirety of the work to begin with. When humans do the writing directly, we can check as we go, and thus verification doesn’t need to be as comprehensive.

When “AI” coding advocates say “code review is the bottleneck”, what they are observing is that the LLM is still rolling the dice for each PR, and a human is still necessary to verify that each of those rolls is a winner. But calling this process “code review” is a bit of a misnomer; it’s not really “code review” in the traditional sense, it’s human understanding.

Before the advent of “AI”, the human understanding was implicit in the process of writing the code in the first place1, and the code review was a way of diffusing and extending that understanding. Now that the code can be authored with no initial understanding taking place, that cost has not gone away, it has moved.

Human understanding was always the bottleneck.

However, this is taking a collaborative view of a software project, where satisfying the needs and solving the problems of your customers are the goals. We can see that “AI” is a bad tool to satisfy those goals, because all it’s doing is converting the first half of the work, that of understanding the code as you write it, to understanding the agent’s output as you read it.

What if, instead, we were to take the view that every software company is a Hobbesian nightmare, red in tooth and claw? In this view, the only goal of a software project is for the individual developers to make their promo cycles and get their bonuses. Given that there is only a certain amount of money to go around, this is a zero-sum game where each programmer wants to look more productive than their colleagues.

Pretty much every organization finds it easy to reward “productivity” as expressed by lines of code emitted, but the benefits of doing thorough and thoughtful design, analysis, and code review very difficult to reward. In this world, an LLM is an invaluable tool for the sociopathic ladder-climber, particularly if your legacy organization is still structuring their workflows as if the person prompting the bot is “writing” the code, and then they get to foist off the act of “reviewing” the code onto someone else.

Here, the prompter effectively externalizes the cost of the LLM’s failures but internalizes any benefits. The prompter will vibe-code a big feature, so large that the assigned reviewer can’t possibly comprehend it all effectively. When this happens, the reviewer will, eventually, be pressured to approve it, even if they can try to spot a few problems along the way. The reviewer has their own work to get back to, after all, the obligation to review the prompter’s (read: the bot’s) code is a drain on their time that they are not going to get rewarded for.

If this feature is a big success, the prompter gets a promotion. If it causes a big issue, well, the reviewer must not have been careful enough.

This is why LLMs are “good for coding”, and also why their biggest promoters keep having outages.

The Generative Gish Galloper

Coding is the biggest “success story” of this type of adversarial communication, but it is by far not the only instance of such a thing. LLMs create a new form of leverage that can turn Brandolini’s law from a linear advantage into an exponential one. If you are engaged in a political debate where you want to overwhelm the other side in nonsense, an LLM can generate bullshit faster than it is physically possible for a human being to type, let alone respond thoughtfully. There is an asymmetry to the utility of this weapon as well: only one side of the political spectrum wants to flood the zone and destroy trust in institutions and the concept of truth. There’s a good reason that the fascists love it.

Straightforward Spam and Fraud

This is kind of obvious, but LLMs can generate lightly-customized, plausible-looking text much more quickly than any human being. This facilitates their use in fraud, spam, and scams. In a spamming or fraudulent interaction, once again, the costs are externalized onto the victim: the recipient of a spam message has to do all the work of “checking” the LLM’s output. Spammers already expect very low hit rates from boilerplate, and if the LLM can increase those percentages from 1% to 5% the technology will pay for itself; they don’t need anything like reliable accuracy.

Customer “Support”

If you have any kind of commercial relationship with a company, I probably don’t even need to mention this: customer “support” bots are a misery. Everybody knows it at this point. But customer support is usually conceptualized by businesses as an adversarial interaction, because it is a cost center. They maintain internal metrics on time-to-resolution and try to optimize them. Implicitly, this creates a dynamic where the goal of the customer service agent’s job is not to solve your problem, but to emit noise that will cause you to think your problem is resolved, or to give up, as fast as possible. Unsurprisingly, LLMs can emit this noise faster than humans can, getting those customers off the phone. But those customers will remember those interactions, and the story outside the TTR metrics is horrible.

Similarly to the situation in software development, LLMs can look very good on paper for customer support, but mostly what they are doing is illuminating the problems with the industry’s existing metrics, by turning “winning the metrics battle against the customer” into a more obvious and immediate defeat for the company’s long term reputation.

“Education”

In 2026 it is sadly a fact of life that students cheat all the time using “AI”, and that this cheating is very successful, in that the teachers find it very hard to detect.

LLMs are great for cheating on schoolwork because the student is externalizing the work of the checking onto the teachers, who are often starting at a disadvantage to begin with, at least in the US.

My view is that this is happening because of a divergence in the way that students vs. teachers (or, more accurately, “the broader educational system”) view grading.

When a student is asked to write an essay, the teachers see the effort as both intrinsically worthwhile for the student, as well as useful as a pedagogical tool to evaluate and react to the student’s progress. The student, by contrast, sees a stumbling block designed to knock them off the path to success and into a permanent underclass. It is no wonder that the student sees “AI” as useful to their own goals and has no compunction about deploying it.

There is a bitter irony that the ability to understand the inherent value of actually writing the essay on their own is the sort of thing that students can really only learn by writing a bunch of essays. There’s no way that I can think of which makes the benefit legible as long as a shortcut is available.

The net effect here is a downward spiral, where the already-wobbling educational system is sustaining an attack that it doesn’t have the resources to recover from. The individual students’ attacks against their teachers and their schools’ grading systems might appear to momentarily succeed, but they will win the battle and lose the war.

Spamming “For Good”?

Usually when we talk about someone unilaterally choosing to enter into an adversarial relationship, that’s an “attack” and for good reasons we have a negative impression of the attacker. However, I would be remiss if I did not point out that there are some cases where the relationship was already adversarial; just because you’re the attacker doesn’t mean that you are evil.

For example we might imagine use-cases like automatically filing appeals for prior authorizations against health insurance. It’s relatively well-known at this point that the main way for-profit insurers maintain their margins is by denying claims right up to the line of the policies themselves being fraud, so using a spamming tool to fight them might be entirely justifiable2 in that case.

Similarly, using an LLM could be justified in a fight against a company refusing to honor a warranty. One could imagine using an LLM to immediately generate replies and escalations.

However, even in imagined cases like these, the underlying problem is that the insurers and the vendors already have a tremendous amount of structural power, so it is more likely that they will have the advantage in deploying a communications weapon like an LLM, as well as enacting policies to simply ignore any LLM-based communication that you might submit. Worse, if these strategies were to become widespread, they might provide an excuse to reject any communications by feeding them into an unreliable “LLM detector” and issuing an automated “computer says no” even to hand-written correspondence.

It is also worth stressing that these cases are imagined, as compared to the very real coworker-abuse, spam, scam, fraud, and disinformation campaigns being waged in real life today.

Therefore, while legitimate uses might exist, it’s hard to imagine that there’s anywhere they would be genuinely valuable and sustainable. In the best case “AI” will provide a temporary advantage for underdogs that will provoke an arms race which the resource-advantaged adversaries will win in the long run, in the worst case the arms race itself will cement permanent structural change that will make things worse.

“Search” By Stealing

Most of the adversarial utility of “AI” is on the “write” side, since write-amplification is more obviously aggressive than reading. But the “read” side of LLMs — summarization and question-answering — can be a form of attack as well.

To begin with, the act of reading itself is currently enormously destructive, but that’s arguably not a fundamental aspect of this technology. They could set reasonable rate-limits and respect things like robots.txt, as search engines have for decades now. They could also refrain from committing criminal levels of copyright infringement. But, today, using “AI” tools does suborn this sort of out-of-control crawling.

More insidiously, consider the scenario described in this YouTube video. The LTT Bros decided to try Linux again, and in the course of so doing, they had problems. When trying to solve these problems, they were faced with a choice: they could consult Reddit, or they could ask an LLM. Asking an LLM would “gaslight the heck out of” them, but they still found it preferable, because they would at least get an answer without getting yelled at.

Initially this sounds great. But it also means that you want to extract knowledge from a community, while mechanically eliding any values or norms that the community may want to impart as part of offering that knowledge. As someone who spent many years in a community tech support role, this is worrying. Many requests for support are people asking how to do things that will momentarily solve a superficial problem but create a long-term reliability problem or even an immediate security risk, that the question-asker doesn’t want to hear about. Consider the question “I’m tired of entering my password so much, how do I make it so my laptop unlocks automatically”. An obsequious chatbot will helpfully tell you how to do this without pushback.

But, this is also a sort of ethically murky area. The Linux community is somewhat famously, for many years now, a toxic cesspool of general hostility, misogyny, etc. It is certainly a good thing that people can get access to this knowledge without subjecting themselves to abuse. But it also means that the people with the power and the privilege to change the community for the better can just quietly withdraw, rather than fixing the problems. It also means that the positive elements of culture cannot be transmitted, and people will have no opportunity to learn about unknown unknowns.

In this case, the “adversarial” communication is with society. The thing that using an LLM for search lets you do is withdraw from society and avoid forming any personal connections. There are some personal connections which are painful and annoying, and so that can feel like a momentary balm. But the need to make connections in general is, like, the concept of society itself.

Who Am I Hurting?

LLMs are good at adversarial communication. They are so good at it, relative to their other benefits, that they will tend to make communications adversarial if you are not remaining vigilant about the possibility that it might do so. My request to you, dear reader, if you are going to use such tools, is to always ask yourself, “who might I be hurting, if I use an LLM for this?”

If you’re using an “AI”, who is its adversary? If you haven’t given it one yet, who might the “AI” turn into an adversary? Who might you overwhelm with an asymmetric amount of output, or, if you’re receiving information and not sending it, who are you taking that information from without consulting?

Figure out the answers to these questions and conduct yourself accordingly; the answer might be “yourself”.

Acknowledgments

Thank you to my patrons who are supporting my writing on this blog. If you like what you’ve read here and you’d like to read more of it, or you’d like to support my various open-source endeavors, you can support my work as a sponsor!


  1. One of the reasons that software developers tend to prefer greenfield development is that when you are given a blank page, you can project your own specific understanding onto it. You can structure the codebase in a way that works for your brain, down to the variable naming conventions and the module layouts. LLM-assisted development makes everything into instant brownfield work, which makes developers instantly miserable; even those who are excited about the technology will frequently complain about how it feels like their agency has been stolen and their joy in the work has been diminished. But I digress. 

  2. Modulo the massive amount of other externalities involved in using LLMs, of course, but I don’t have the time or energy to get into those here. 

June 23, 2026 01:38 PM UTC


Armin Ronacher

The Coming Loop

I don’t prompt Claude anymore. I have loops running that prompt Claude and figuring out what to do. My job is to write loops.

— Boris Cherny

Over the last months I have watched more and more people build something on top of coding agents that feels meaningfully different from just using a coding agent. Some of this happens on top of Pi which is cool to see for sure! The pattern is the same everywhere though: work is put into a queue of sorts, a machine picks it up, attempts it, stops, and then some harness decides whether that was actually the end.

If not, the harness continues the same session, injects another message, starts a fresh session with modified context, or sends the task to another machine. The task stays alive beyond the point where the model by itself would normally have said: “I am done.”

I think about that type of loop more than I want to admit.

There is already an agent loop inside every coding agent. The model calls a tool, incorporates the result, calls another tool, reads a file, edits a file, runs tests, and eventually produces some answer. That loop is one we have been quite familiar with for a long time. The other loop is the harness level loop: the loop outside the agent loop. That loop is also not new. We have been doing versions of this since early Claude Code days, but that loop is becoming ever more present in agentic engineering and in recent weeks it has started to dominate the Twitter discourse.

I Am Not Good At This Yet

My current status is that I have not had much success with this way of working for code I deeply care about which turns out to be quite a lot of code.

Part of that is taste and part of it is control. I attempt to set a high bar for what I want code to look like, and I want to understand the code I ship. Under pressure, or in a discussion with another human, I want to be able to explain what the system does without first having to ask a clanker to explain it to me. Now there is obviously a question if this desire to understand the code is one that I will still have a few years from now. For now I have not moved past the point of comprehension being important to me.

Given this desire, there is something I lack with my experience of code written without me paying attention, particularly from loops. Present-day models tend to produce code that is too defensive, too complex, too local in its reasoning. They avoid strong invariants. They add fallbacks instead of making bad states impossible. They duplicate code, invent bad abstractions, and paper over unclear design with more machinery. Worse though: I so far see very little progress of this improving. If anything, on that front it feels to me that we might even be making steps in the wrong direction. At least for my taste, present-day hands-off harnesses like Claude Code with ultracode produce worse code than what we were producing last autumn. That’s because Claude Code, with Fable for instance will be working uninterrupted on a problem for thirty minutes or more, when previously the process would have been much more human in the loop.

Furthermore it’s well understood that models tend to observe some local failure and add a local defense. Karpathy mentioned how they are “mortally terrified of exceptions”. In systems with important invariants, especially persisted data formats or core infrastructure, the right fix is not “handle every malformed case.” The right fix is to make the malformed case unrepresentable or impossible to write in the first place. Yet even with a lot of manual steering, that type of code does not come out of LLMs naturally, and even if the code comes out naturally like that, they will still attempt to handle now impossible errors.

When you take that behavior and you put it behind loops, you tend to amplify it. If each iteration adds another small defense, the system slowly becomes less understandable while appearing more robust. The more hands-off you are, the more that happens. It also teaches really bad practices when tools like this are given to juniors without clear guidance. Because if you ask them, why they are doing all that, they will convincingly argue their case.

Where Loops Work

At the same time, it would be dishonest to pretend the loop pattern does not work because it already works astonishingly well in some domains.

Porting code one of them. There are already impressive examples of large automatic porting efforts, including the reported work around moving parts of Bun from Zig to Rust. I have used it with success myself to port MiniJinja to Go. Performance explorations are another case where this works beautifully. A machine can try experiments, benchmark them, discard failures, and keep searching. Security scanning fits naturally too and so does almost any type of research: asking a system to explore a complex problem space and report back without necessarily committing lasting code. One thing that many of these have in common is that they either do not generate new code, but transform code that already exists, or they produce code that intentionally does not have a long shelf life. They either produce proof of concepts or ideas, surface findings or are more akin to mechnical transformation.

I believe that loops that produce artifacts without necessity of longevity or that create some form of clearly verifiable mechnical translation matters more than the general ability of a harness to mechanically measure a goal. Many successful applications of loops use another LLM as a judge or as an orchestrator. The mechnical translation case can be verified with a binary test case, but it can also be judged by an LLM instead!

Claude Code, for instance, is increasingly good at creating entire experimental workflows that it will then execute. Sure, the code it produces is slop, but that’s more the fault of the model than the harness not being a good judge on if a step in the workflow resulted in a net improvement or completion.

The harness just needs some signal that lets it continue. It does not have to be objective or binary, it just has to be useful enough to drive another iteration.

I absolutely love loops already that take the boring parts out of my day to experiment and measure and to give me ideas.

Software As Organism

On the other hand using that same looping methodology to write lasting code does not yet sit well with me. The metaphor I like to reach for is one of moving from software as a deterministic machine to software as an organism.

I became a software engineer in an enviornment that encouraged me to understand the machine. There was always a layer you could peel off to deepen your understanding. Machines that did not exhibit deterministic observable behavior were maybe accepted, but generally seen as not exactly optimal. Software architecture-wise, I saw it as desireable to push further towards more determinism rather than less. Likewise the ability to understand the code has been an undeniable goal. In practice not always possible we still took pride in writing code so that it became possible even for new engineers to navigate complex code bases through clever architecture. On well designed systems there were always engineers that knew where the invariantes lived, which parts were load-bearing and which changes were safe. Ideally all of that was also well documented. Where that understanding was lacking, it was generally regarded as something to improve upon.

Obviously that ideal has always been strained. Many software systems, especially very successful ones had periods where engineers on the team were able to keep them clean. Large software systems are not infrequently too big, too dynamic and too dependent on external services to fit into anyone’s head. Even without LLMs we already diagnose distributed systems somewhat like doctors in that we observe symptoms, form hypotheses, “order more tests”, try some remedies, and observe again.

Yet with LLMs we’re pushing much further in that direction and much quicker. We use them to write the code and we also use them for diagnosis and remedy. There are plenty of engineers that already live in a world in which the first step after the occurrence of a production issue is followed by having a clanker read logs, propose root causes and proactively put up a patch. The resulting patch is then often picked up by another machine that reviews, sometimes even landing it on main without any human supervision.

Obviously that is powerful and I cannot deny that it sounds appealing. But giving in to that idea, particularly with less and less human oversight means accepting that we may no longer understand the whole system in the same way. We treat it, we monitor it, we stabilize it, but we do not necessarily comprehend it.

I have no doubts that for some software, that is okay. Not every line of code deserves human authorship and worse code might have been written in the past.

But do I want all software to be authored this way?

You Cannot Quite Opt Out

What’s very uncomfortable is that opting out of this fully machine-driven future may not be an option.

Security is the clearest example today. Even if you do not use loops to build your software, other people will use loops against your software. Attackers will run machines continuously and even if it’s not attackers, then security researchers will and some of that automated work will throw up dust but also find real issues. And both the signal and the noise will come your way at a volume that makes it almost impossible to deal with unless you yourself throw a machine at the problem.

Daniel Stenberg’s post about curl’s summer of bliss is a good example of the pressure maintainers are already under. As far as I know, AI does not play a tremendous role in the core development of curl today. Yet despite all of this, maintainers are overwhelmed by reports, most of which are now AI-generated ones.

If attackers and reporters loop, defenders will eventually need to loop too to keep up. Maybe not to write patches directly, maybe just to triage and reproduce and pressure will increase.

The same is true competitively as some teams will out-build others through raw speed. Some projects will suddenly move faster because a tiny group figures out how to orchestrate machines effectively. Some startups will do with five people what used to require fifty. Some people might literally put a machine against your product in a loop and ask it to “make it like the other one.” And if their users are happy, does it really matter?

Not all software will be equally affected. Some domains will punish sloppiness and demand trust and responsibility, but a lot of software lives in a world where raw speed, quick experimentation, and vast coverage matter enormously.

Building New Dependencies

The scariest part to me is that we become dependent on these new machines in new ways. Software has always depended on tools. I remember the time when I had to pay for compilers. These new tools are a flashback to times where creating software came with real costs. But now it’s no longer a one-time payment, it’s a constant dependency. Not just a dependency on a filled wallet, but also a cognitive dependency.

If a codebase is produced by loops, reviewed by loops, patched by loops, and kept alive by loops, what happens when you no longer have access to the same class of systems? What happens when some trade restrictions take away access to the most powerful models? What if just the cost becomes unbearable? What if you and your team just lose the last remaining ability to understand the code without using the machine?

We may create codebases that are not merely hard to maintain by humans, but that assume machine participation as part of their maintenance model. This is already happening! It’s not happening everywhere, and it might not even be happening in ways that are seen as problematic, but we see more and more of it. People more and more merge code they cannot fully explain. People lose their ability to create issue reports or discuss things in chat, without augmenting or rephrasing their messages with the context provided by a clanker. Too many people increasingly rely on a machine to summarize or contextualize it. More and more do I encounter people who converse with me through the indirection of an LLM.

Again, maybe that is not even going to be wrong, but it’s a massive change to how we did things.

Future Harnesses

I have little doubt that this is where things are going but going there will require us to do something about our tooling everywhere, and not just in the coding agents.

Just orchestrating more loops won’t be enough. Better visualizations of changes or orchestration or agents will not restore our understanding. Either we need to find clever ways to jolt the human back into the loop and make the changes of the loops legible long term, or we need to find better ways to compose these ever more complex systems.

This is also where my thinking about the role of Pi is changing. Pi has been cautious, and I think that caution is good. I do not want a future where every interaction turns into an uncontrolled swarm of machines making changes I cannot follow. I would not want Pi to become an unmaintainable mess in an effort to win the race towards software that writes itself and I would not want Pi to promote this type of engineering either. At the same time Pi is a harness and harnesses are at the center of people running these new types of experiments.

Task queues for coding tasks, orchestration of agents, subagents, durable sessions will matter more and more. Even those of us who have their reservations and are not blindly embracing loops will have to start doing those experiments. We need to, because we need to understand how to make this future bounded and survivable.

Controlling Loops

As you can read from this post, I’m very uneasy about this future. Not cause of fear, but because of caution given experiences with this technology so far.

Adopting the idea of harness loops means that the harness decides when work is finished. In the agent loop, the model eventually says “done” and I review. Even before that, I usually steer along the way. I am involved and I enjoy learning along the way. In the harness operated loop I’m not sure what my role even is. Even the “done” signal loses all meanings and just becomes communicated to yet another machine that judges. My role is reduced to that of a messenger.

Today I do not like much of the code that I see from systems built that way and neither do I enjoy interacting with too much of software built with AI assistence. Looping is powerful but it removes responsibility more and more, and it at least today very much encourages us to give in to the machine.

And yet I have no doubts that this looping future is going to be our future despite the fact that I presently resent it. I already see astonishingly small teams building at impossible speed and I see codebases turning more and more into obscure and confusing organisms that can only be diagnosed by more machines. Those codebases are simultaniously useful and messy.

So I guess I’m coming to terms with that the question is not whether we will loop because clearly we will. Maybe the question is that in a future of loops, how do we don’t abdicate judgment, how we can retain rules of good engineering, how we can ensure that responsible human can continue to supervise, how we need to re-think how we architect code to retain sanity along the way.

June 23, 2026 12:00 AM UTC

June 22, 2026


Rodrigo Girão Serrão

Write a coding agent from first principles

Learn how to write a coding agent in Python in this tutorial that teaches how to interact with an LLM through an API, how to manage the context, and how to do tool calling.

Introduction

This tutorial will show you how to create your own coding agent from first principles. By doing so, you'll understand how coding agents work under the hood.

Prerequisites

To be able to follow this tutorial, you'll need

The concepts explained in this tutorial are independent from your LLM provider but the code snippets will make use of the Claude API and its Python SDK. This means that you can follow along with a different model provider as long as you adapt the code snippets to match the format expected by the API of your provider.

What's a coding agent?

A coding agent is an agent that's specialised for coding. In turn, an agent is just an LLM that has been extended with extra functionality that allows it to interact with its environment. This extra functionality is provided through tools, one of the core ideas covered in this tutorial.

This short definition still hides a lot of details, but instead of giving you a theoretical definition you can learn what a coding agent is by creating one. That starts now.

Project set up

To set your project up, start by using uv to create a packageable app project[^2]:

% uv init --app --package agent
Initialized project `agent` at `/Users/rodrigogs/Documents/mathspp/agent`

Then, cd into the project and add the two dependencies you'll need:

% cd agent
% uv add python-dotenv anthropic

You'll use python-dotenv to help you with authentication to access the Claude API and you'll use the dependency anthropic to make it easier to interact with the Claude API.

To set up authentication, create a .env file and paste your Claude API key there in front of the variable ANTHROPIC_API_KEY. When you're done, your .env file should look like this:

ANTHROPIC_API_KEY="sk-ant-api03-qI_3mJ..."

To make sure you never upload your API key to GitHub by accident, add the file .env to your .gitignore:

# .gitignore
# ... other entries generated by uv
.env

Now that you've set up your project, you can make your first request to the Claude API.

Interacting with an LLM

A coding agent needs an LLM at its core. Your LLM can come from any provider you want but you're going to use Claude because its SDK (the dependency anthropic you added in the previous section) is easy to use and because Claude is a popular model provider.

Using the anthropic SDK, here's how you can send a message to the LLM:

# src/agent/__init__.py
from anthropic import Anthropic
import dotenv

dotenv.load_dotenv()  # Load .env

MODEL = "claude-haiku-4-5"...

June 22, 2026 12:32 PM UTC

June 21, 2026


The Python Coding Stack

2. Anatomy of an Agent

Read Stephen's Preface to Agents Unpacked if you're new here.


You have used a large language model. You know the deal: a careful prompt gets a careful answer. A vague prompt gets a vague one. And the model itself does not keep anything from one conversation to the next, unless something external is holding that context for it.

Agents work differently. They have parts that do things a plain LLM does not. These parts are what make an agent an agent. It is not just the model underneath. It is the structure built around it that gives the system its abilities to persist, act, and keep going.

Understanding this structure is the second major shift in this series. The first shift is seeing that a chatbot can give you a good answer without finishing the job, because it stops after responding. The second shift is seeing that an agent is not a smarter model. It is a model placed inside a structure that gives it something to act with and somewhere to keep what it has done.

The Agent Formula

Most agents share the same basic parts:

Different platforms package these differently. Some call memory “context,” some call tools “plugins” or “capabilities,” and some merge instructions and tools into a single configuration layer. But the parts are the same. An agent is not a single thing. It is a system, and each part matters.

Stephen: Don’t LLMs also have memory since they remember what happened earlier in the conversation? How’s this different?

Here is one distinction worth getting clear early: the context window and memory are not the same thing. The context window is the working space an LLM uses during a single session. It holds the conversation so far and gets loaded fresh every time the model gets a chance to speak. Memory, by contrast, is information stored outside the model, maintained by the system, and available across sessions and steps. We will come back to this.

An agent needs all its components:

Agent = Model + Instructions + Memory + Tools + Execution Loop

Leave any one of these out and the system changes behaviour in ways that matter. We will look at each piece in turn.


Subscribe now


What the Model Does and What It Doesn’t Do

The model is the reasoning core. It reads your request, figures out what to do, and decides what to say back. It gets the most attention because it is the part that generates language.

But a model on its own is like a brilliant mind with no hands and no memory of its own. It can think. It cannot act. It cannot remember what happened five minutes ago unless something explicit carries that information forward.

Stephen: Wait a second. You say the model doesn’t remember what happened five minutes earlier. But when I use an LLM, it does seem to remember what happened earlier in the conversation.

Here is what is actually happening. When an LLM appears to remember earlier in a conversation, it is not the model itself that is remembering. The context window is carrying all the earlier messages along with your new message, every time you send something. The model sees the full conversation again and generates a response that fits what came before. That is not memory in the model. That is the system feeding the model a transcript.

This trips up almost everyone when they start using agents. The model generates text. The rest of the system decides what to do with that text and whether to act on it.

A better model helps. It reasons more clearly, follows instructions more faithfully, and handles edge cases better. But dropping a smarter model into an agent that is missing a working execution loop will not make it an agent. You need the other parts too.

Instructions: The Agent’s Direction

Instructions tell the agent what it is supposed to do and how to behave. Some systems call these system prompts. Others call them agent definitions or behavioural instructions. The name does not matter. What matters is that they are the layer that tells the model why it exists, who it is helping, and what ‘good’ looks like for the task at hand.

Good instructions do not make an agent smarter. They make it more focused. They give it a frame for every decision: what to prioritise, what to avoid, when to ask for help, how to present its output.

Stephen: Are these what are often called ‘skills’, or are skills something else altogether?

Skills and instructions are related, but they are not the same thing. Instructions are the core behavioural direction: who the agent is, what it is for, how it should approach its work. A skill, in platforms like OpenClaw and Hermes, is a specific file that tells the agent how to carry out a particular task, often by combining one or more tools. So instructions tell the agent how to behave generally. A skill tells it how to do something specific. We will see this distinction more clearly when we look at how different platforms implement these parts.

The instructions shape what the agent notices, what it proposes, what it tries, and what it says no to. Two agents built on the same model with different instructions will behave differently in the same situation. They will notice different things, prioritise differently, and produce different outcomes.

Poorly written instructions can quietly break an agent. If the instructions are vague, the agent has to improvise every step. If they contradict each other, the agent has to choose, and it might not choose the way you intended.

Stephen: Can you provide a few examples of what these instructions may look like in different scenarios?

Here is what instructions might look like in practice. A poorly-written instruction can quietly break an agent. Consider an instruction that says “be helpful and concise” without defining either term. When a user asks for a full technical breakdown, the agent has to arbitrate between two vague goals. It might give a two-sentence answer that technically satisfies “concise” but ignores “helpful,” or it might give an exhaustive response that satisfies “helpful” but ignores “concise.” Either way, the agent is improvising because the instructions gave it no real frame for the conflict.

A research assistant agent might have instructions that say something like: “You are a research associate working for [user name]. Your role is to find, summarise, and organise information on topics the user assigns. Always cite your sources. Flag uncertainty rather than guessing. Present findings in a clear brief, not a wall of text.”

A code review agent might have very different instructions: “You are a principled code reviewer. Focus on correctness, clarity, and performance. Do not praise code unnecessarily. When you find an issue, explain why it matters and suggest a concrete fix. Keep responses short.”

The difference between those two sets explains a lot about why two agents can feel like entirely different systems, even if they use the same model underneath.

Memory: The Workspace and Context

Memory in an agent is not like human memory. It is a structured store of information kept and updated as the agent works. It is what lets the agent hold a thread across multiple steps without starting from scratch each time.

Most agents use some combination of three types:

This is not a personality feature. It is not the agent “remembering” in the way a person remembers their childhood. It is operational continuity. The system maintaining a thread of relevant information across time and steps.

Different platforms handle these differently. LangChain agents build up a rolling context window: the current request gets appended to everything that happened before, and the whole thing is passed to the model. If the conversation gets long, older turns get dropped or summarised to make room. AutoGen agents can maintain shared memory across a team, so that when one agent finishes a task, what it learned is available to the next agent that picks up the thread.

OpenClaw takes yet another approach. Its memory layer is a structured store that agents write to and read from across sessions. When an agent starts a new session, it can query that memory store for relevant context rather than relying solely on what was in the most recent conversation. An agent can know that the user prefers short emails, even if that was established three weeks ago.

Stephen: If memory can be stored in files, does it mean that agents can have nearly unlimited memory (within the limits of the computer or server’s overall memory capacity)?

There are practical limits even when storage is effectively unbounded. The more relevant limit is not how much the agent can store, but how well it can find and use what it has stored. A full inbox is not the same as a well-organised one. Retrieval becomes harder as memory grows, and irrelevant information can dilute the signal if the system does not manage it carefully.

Think of it this way. A context window that holds 128,000 tokens can technically hold a lot of information. But it can only hold what was placed there. An agent with a large memory store full of useful context still needs a way to surface the right information at the right time. If it cannot find what it needs, or if what it finds is buried under noise, the effective memory is constrained.

The quality of retrieval matters as much as the quality of storage. An agent that retrieves relevant context poorly is effectively working with a much smaller memory than one that retrieves well, even if both store the same amount.

Stephen: So, tell me if I understood this. The agent has an index telling it where to find information specific to certain topics or tasks. When the LLM part of the agent decides it needs to deal with a certain topic, it uses the index to read and load the information from the memory file into its context. Is that right?

That is broadly right. The memory store, the index, and the retrieval into context are the key parts. One small correction worth noting: the decision to retrieve from memory is typically made by the agent or coordinator layer, not by the LLM directly. The LLM receives the retrieved content as part of its context, but it is the agent system that decides what to look up and when. This distinction matters because it is the agent layer, not the model, that is doing the memory management.

Stephen: But isn’t the agent’s brain the LLM? Clarify the distinction in your answer above. Which part of the agent’s infrastructure deals with this?

It is a fair challenge. The LLM is genuinely where the reasoning happens. It reads context, generates text, and makes decisions about what to say or do next. But it is also just a text processor. It receives input, produces output, and has no awareness of anything beyond the tokens it has been given.

The coordinator layer is the infrastructure that sits around the LLM and manages the process. It reads the LLM’s output, decides whether to act on it, calls tools, retrieves memory, and feeds results back into the next LLM call. It is the difference between the LLM thinking and the agent doing. A bare LLM generates text. The coordinator turns that text into action.

To use a rough analogy: the LLM is like a pilot who can read instruments and make decisions. The coordinator is like air traffic control — it decides which runway to use, when to land, and when to divert. The pilot’s brain does the reasoning. But without the infrastructure around it, the pilot just sits in the cockpit thinking.

So when we say the agent retrieves memory, we mean the coordinator retrieves it and places it where the LLM can see it. The LLM does not reach into a file and pull something out. The coordinator does that work and presents the result to the LLM as part of the next context.

Stephen: And are the bits of these files then loaded into the LLM’s context? Therefore, the more stuff is loaded from the memory files, the more the context fills up, affecting the rest of the conversation and cost, right?

Yes, exactly right. Memory retrieval feeds into the context window, which is the LLM’s working space for the current session. Every token that goes into the context window is a token the LLM processes and a token that costs something. Loading a lot of context from memory means less room for the conversation itself, and it means higher token usage on every call.

This is one of the practical engineering tensions in agent design. Loading more memory gives the agent more to work with, but it also makes each LLM call more expensive and slower. A well-designed agent retrieves only what is relevant to the current task, not everything it knows.

Tools: What the Agent Can Actually Do

Tools are the capabilities that let an agent act beyond generating text. The model decides to use a tool. The tool performs an action and returns the result to the model.

This was covered in Chapter 1 under “Tools Are the Hands.” Here it is worth noting that tools are also where agents differ most between platforms. Some agents come with a large built-in toolkit. Others can call external tools through open protocols. Some let you build custom tools. Others are more locked down.

What tools might an agent actually have? A research agent might be able to search the web and read files on your machine. A coding agent might run shell commands and read or write files. A calendar agent might check your schedule and send messages. The tool is the bridge between the model’s decisions and the world the agent is working in.

What matters is not how many tools an agent has, but whether the tools it has are the right ones for the tasks you want it to perform.

Different platforms implement tools differently. LangChain provides a standardised tool interface that lets you connect to search APIs, databases, file systems, and custom functions. OpenCode agents run inside a development environment, where the tools available are the commands and interfaces of that environment. OpenClaw uses an open tool protocol that lets agents call external capabilities regardless of who built them. Hermes takes a more composed approach: a skill file specifies not just what the agent should do, but which tools to use and in what combination to carry out a specific task.

Here is the thing worth unpacking. A tool on its own is just a capability. What makes it useful is the bridge between what the agent is trying to accomplish and the tool that can help. A calendar tool is useless if the agent does not know it should check the schedule. An agent running a meeting-preparation skill that says “check availability, send invites, prepare a briefing document” has that bridge built in.

The Execution Loop: The Part That Makes It an Agent

The execution loop is the cycle that takes an agent from a single-shot response to a sustained process. Observe, think, act, check, repeat.

This was the core of Chapter 1. But it is worth restating here, in the context of anatomy, because the loop is what ties all the other parts together. Without it, you have a model that receives instructions and context and produces text. With it, you have a system that can pursue a goal across time, recover from partial failures, and stop when the work is genuinely done.

The loop is the difference between an agent and a very well-instructed chatbot.

Here is why the repeat step matters so much. A model has no native sense of when it is done. When you call a function in code, the function returns and you are finished. When a model generates text, it produces tokens until it hits a stop condition built into the model itself, most commonly a token limit or a designated stop sequence. These conditions tell the model when to stop generating, but they do not tell the agent whether the result is actually what the user wanted. There is no built-in check that says “is this the right answer?”

The execution loop provides that check. The check phase asks: is the result good? Does it meet the original goal? If not, the loop continues. Sometimes that means a dozen or more cycles before a task is genuinely complete.

The loop also determines how goals decompose. In LangChain’s ReAct-style agents, the loop runs inside a single agent: observe, decide on the next action, execute it, check the result, repeat. In AutoGen, the loop is distributed across multiple agents that hand off to each other. A planner agent might coordinate specialist agents, each running their own loop on their own piece of the problem. OpenClaw uses a coordinator agent to manage the loop, assigning work to sub-agents and handling the check phase across the full task rather than within a single agent cycle.

The architecture of the loop is one of the most significant differences between agent platforms. But the function is the same everywhere: turning a sequence of isolated model calls into a coherent, goal-directed process.

Multiple Platforms: Comparing the Formula in Practice

It helps to see the same five-part formula playing out in different platforms. Here is how a few of them map onto it.

LangChain is one of the most widely-used agent frameworks. A LangChain agent has an LLM at its core, a set of tools, a prompt defining the agent’s role, memory that accumulates conversation history, and an agent executor that runs the loop. The loop in LangChain is explicit: the agent executor repeatedly calls the model, parses the model’s tool-call output, runs the tool, and feeds the result back until the model says it is done.

AutoGen takes a different approach. Rather than a single agent, AutoGen sets up a team of agents that communicate with each other. Each agent has a model, instructions defining its role, and its own set of tools. The loop is distributed: there is no single execution cycle. Agents exchange messages, delegate tasks to each other, and the overall process continues until the team has finished the assigned goal. Memory in AutoGen can be shared across agents so that one agent’s work is available to the next.

OpenClaw uses a coordinator agent that manages the overall execution loop. Sub-agents each have their own identity, tools, and memory. The coordinator decides which sub-agent handles which part of a task, passes context between them, and handles the check phase across the full goal. Skills in OpenClaw are files that tell a specific agent how to carry out a particular task, combining instructions about what to do with definitions of which tools to use.

Hermes also uses a skill-based architecture where skill files define both the instructions and the tool configuration for specific tasks. Rather than a single general-purpose agent, Hermes composes agents from skills that know how to use particular tools in particular contexts.

OpenCode works differently again. It runs agents inside a development environment, typically a cloud workspace. The tools available to the agent are the commands and interfaces of that environment. The loop is typically managed at the task level: the agent receives a task, works through it using the tools at its disposal, and reports back. There is less of a formalised multi-step loop and more of a task-completion focus.

None of these platforms invents new parts of the agent formula. They all use a model, instructions, memory, tools, and an execution loop. What differs is how those parts are implemented, how they are divided up, and how they communicate. Understanding the formula means you can look at any of these platforms and see what you are actually looking at.

What This Chapter Covered

This chapter pulled apart the five components of the agent formula.

We saw how the model is the reasoning core but cannot act or remember on its own. How instructions shape the agent’s focus and behaviour, and why the same model with different instructions can feel like a different system entirely. How memory provides operational continuity across steps and sessions, and why retrieval quality matters more than storage capacity. How tools extend what the agent can do beyond generating text, and why a tool is only as useful as the bridge between the model’s decisions and the action the tool can take. And how the execution loop is the architecture that turns isolated model calls into a coherent, goal-directed process.

We also saw how different platforms implement the same five components differently: LangChain’s explicit agent executor, AutoGen’s team-based coordination, OpenClaw’s coordinator and skill-based sub-agents, Hermes’s composed skill architecture, and OpenCode’s environment-integrated approach.

The goal was not to become an expert on any one platform. It was to show that agents are not mysterious black boxes. They are systems built from a small number of recognisable parts, and once you know what to look for, you can see the anatomy underneath any agent platform you encounter.

Next up in Agents Unpacked: we dig into tools and skills: what it actually means for an agent to do something rather than just say it, and why a well-tooled agent operating autonomously in a loop is a fundamentally different thing from a model answering questions.


<< Previous Post: From Answer to Outcome

>> Next Post: Coming Soon

Table of Contents


stephengruppetta.com

June 21, 2026 09:48 PM UTC


Christian Ledermann

Stop Copy-Pasting Your .pre-commit-config.yaml

Stop Copy-Pasting: Introducing pc-init

We’ve all been there: you’re starting a new project, you’ve got your repo initialized, and now comes the tedious part—setting up the quality gates.

You know you need pre-commit (or the newer prek) to keep your code clean, but you end up hunting through your older repositories to find the "best" .pre-commit-config.yaml to copy and paste. Then, you spend ten minutes editing paths, versions, and configurations to match the specific needs of your new stack.

It’s a chore that breaks your flow before you’ve even written a line of code. That is exactly why I built pc-init.

The Problem: The "Config Archeology" Workflow

Most languages and frameworks have a gold standard for quality tools. If you're working in Python, you want ty and ruff; in React, you want eslint and prettier.

Manually setting these up requires:

  1. Identifying the recommended tools for your specific stack.
  2. Looking up the correct hook repository URLs, revisions, and arguments.
  3. Writing the YAML by hand (and hoping you didn't miss a syntax error).

Doing this every other month is just frequent enough to be a persistent pain point, but not frequent enough to have the workflow committed to muscle memory.

The Solution: pc-init

pc-init is a CLI tool designed to replace "config archeology" with a single, declarative command. It scaffolds a production-ready .pre-commit-config.yaml based on your project's technology stack.

Instead of copying old files, you simply tell pc-init what you're building, and it builds the configuration for you.

How it works

pc-init uses a system of language and framework presets. You provide the parameters, and it handles the rest:

# For a standard Python project
pc-init --lang py

# For a JavaScript project using React
pc-init --lang js --framework react

# For a Python project using Django
pc-init --lang py --framework django --force

Why use pc-init?

Get Started

If you’re ready to speed up your setup process, you can install pc-init via uv:

uv tool install pc-init

After generating your config, don't forget to run pre-commit autoupdate or prek autoupdate to ensure you are pulling the very latest versions of your selected tools.

If you have suggestions for new presets or run into issues, please head over to the GitHub repository and open an issue. Let’s make boilerplate setup a thing of the past.

June 21, 2026 10:53 AM UTC


Bob Belderbos

From Python to Rust: Master Iterators by Rebuilding 10 Unix Tools

The fastest way I know to learn a language is to rebuild something you already understand. You stop fighting the problem and spend your attention on the syntax and the idioms. That is the whole idea behind the new Unix tools track I just released on the Rust platform: ten small command-line classics, each one a pure function you implement and cargo test to validate you got it right.

Why Unix tools, and why from Python

Every exercise opens with the Python you would write, then teaches the Rust idiom that replaces it. People who love the platform keep pointing at the same thing: the Python-to-Rust bridge is what makes the concepts stick.

You are not memorizing Iterator methods in the abstract, you are watching len(text.split()) turn into .split_whitespace().count(), and reaching for Option and Result instead of Python's exceptions.

Take cut. In Python an invalid field raises an exception at runtime:

def cut(text, delim, field):  # field is 1-based
    if field == 0:
        raise ValueError("field values may not include zero")
    ...

In Rust both outcomes live in the return type:

#[derive(Debug, PartialEq)]
enum CutError {
    ZeroField,
}

fn cut(text: &str, delim: char, field: usize) -> Result<Vec<&str>, CutError> {
    // ...
}

#[test]
fn field_zero_is_an_error() {
    assert_eq!(cut("a:b", ':', 0), Err(CutError::ZeroField));
}

Encoding the failure in the type, not just the docs, is a core Rust idea: the compiler makes every caller account for it, and that is a big part of what makes Rust code safer. The failing case becomes a test you code towards.

Iterators are the spine of the track, seven of the ten exercises turn for loops and comprehensions into iterator chains. Every exercise relates back to one or more Python idioms you already know.

For me Rust is hard but thanks to comparison with Python I feel that I understand more. - Piotr R

Having a direct comparison with Python snippets keeps me more in the context of what's going on. - Michal S

The track also follows one rule that matters for real CLIs: the logic lives in a pure, testable function, while the I/O (reading a file or stdin) stays in a thin wrapper around it.

That split is exactly how you would structure a professional Rust tool, and it is why every exercise can be validated by a test instead of by running a binary. I wrote more about how this rewiring changes your Python instincts in Rust made me a better Python developer.

The 10 exercises and what each one teaches

  1. wc: count lines, words, characters. Iterators plus .count() replace len(...), and you return a (usize, usize, usize) tuple. The character count is also a sneak intro to why chars().count() is not len() in a Unicode world.

  2. head & tail: first and last N lines. head uses lazy .take(n) and stops early; tail forces you to collect first and slice from the end, because you cannot run an iterator backwards.

  3. cat -n: number the lines. Rust's .enumerate() starts at 0, not 1 (there is no start= argument), and you rebuild the numbered text from there.

  4. tr: translate and delete characters. The exercise that makes the char vs &str distinction click: 'l' is a char, "l" is a &str, and you .map() over .chars() to rewrite each one.

  5. grep: filter matching lines, with -i and -v. Substring tests with .contains(), case folding, and a single boolean condition that handles both -i and -v without branching. First taste of borrowing and lifetimes in the signature.

The Unix tools track on the Rust platform: all 10 exercises from wc to the top_words capstone, with difficulty levels and completion status

  1. cut: extract a field. Two outcomes that pull in different directions: a missing field is normal (skip the line), but field == 0 is a bad request. You model that with Option and Result instead of Python's exceptions.

  2. uniq -c: count adjacent duplicates. Rust has no itertools.groupby, so you walk runs by hand with pattern matching and a Vec, keeping the consecutive-only behavior honest.

  3. sort: sort lines, with -n. Iterators do not sort, so you collect into a mut Vec and call .sort(); numeric order means sort_by_key with a closure, Rust's answer to Python's key=.

  4. sed s///: find and replace per line. str::replace swaps every match, replacen stops after a count, so the g flag is just a choice between two methods, no manual counting.

  5. top_words: the capstone. Compose everything: count word frequencies with the HashMap entry API (entry().or_insert(), since there is no Counter), then sort and take the top n. This is the tr | sort | uniq -c | sort -rn | head pipeline rebuilt as one Rust function.


If you have been meaning to get past reading about Rust to actually writing it and making the concepts stick, begin with the free wc and head / tail exercises. Each one is short, has tests to code towards, and there is no AI on the platform; you have to do the work. I hope you learn a lot: start the Unix tools track.

Next up on the platform: a track on Rust lifetimes.

June 21, 2026 12:00 AM UTC

June 20, 2026


Bob Belderbos

Profile First: A 10x Faster Django Test Suite

The Rust Platform Django test suite took 30 seconds to run. I had a hunch it was database-related. Of course I was wrong. I profiled it with cProfile and cut it from 30 to 3 seconds.

Stop guessing, run the profiler

The instinct on a slow test suite is to start making assumptions: too many fixtures, the database is slow, I should parallelize, it's the GIL. Every one of those is a real fix for some issues. The problem is you don't know until you measure.

From High Performance Python:

Sometimes it’s good to be lazy. By profiling first, you can quickly identify the bottlenecks that need to be solved, and then you can solve just enough of these to achieve the performance you need. If you avoid profiling and jump to optimization, you’ll quite likely do more work in the long run. Always be driven by the results of profiling.

cProfile ships with Python. In this case I pointed it at my test suite:

uv run python -m cProfile -o tests.pstats -m pytest -k unit

That gives you a pstats file. You can read it from the command line, but it's not intuitive:

uv run python -c "import pstats; pstats.Stats('tests.pstats').sort_stats('tottime').print_stats(10)"

Adam Johnson recently released profiling-explorer, a browser viewer over the same data. You can invoke it like this:

uvx profiling-explorer tests.pstats

It opens on http://127.0.0.1:8099. I sorted by internal time and the bottleneck was obvious:

profiling-explorer showing pbkdf2_hmac dominating internal time in the Django test run

Reading the numbers without fooling yourself

Two columns matter:

Sort by tottime to find what to fix:

FrameCallstottimeShare
_hashlib.pbkdf2_hmac17726,470 ms82.8%
psycopg2 cursor.execute4,6821,522 ms4.8%

pbkdf2_hmac is Django's default password hasher. It's meant to be slow, because it resists brute-forcing real passwords. But it fires on every test that creates a user: fixtures, create_user, client.login. In production it runs once per login. In a test suite it runs hundreds of times for no security benefit at all.

One more thing the table teaches you: on a high-call-count frame, the call count is the signal, not the time. cursor.execute at 4,682 calls is mostly real database wait, and the count is high because every test sets up its own data: inserts, auth lookups, session reads, across 355 tests. A high count is worth a second look (it can hide an N+1), but you have to confirm that before believing it. (cProfile adds a fixed cost per call too, but at a few thousand calls that overhead is marginal; it only distorts the picture when a frame fires millions of times.) PBKDF2 only ran 177 times, so its time really was concentrated there, and it was the real culprit.

The five-line fix

Swap the hasher to fast MD5 in tests only. An autouse fixture in conftest.py (default "function" scope does it for every test):

import pytest


@pytest.fixture(autouse=True)
def fast_password_hashing(settings):
    settings.PASSWORD_HASHERS = ["django.contrib.auth.hashers.MD5PasswordHasher"]

Hash strength is irrelevant under test, so there's no loss of coverage. Test suite speed went from ~30s to ~3s.

It turns out this is also a documented Django trick, which is the point: the profiler led me to a known fix I didn't know I needed, instead of me making assumptions.

The new top frame is cursor.execute at ~4.7k calls, which smelled like an N+1. So I measured that too: I wrapped the suite to capture every query and grouped them by shape. No N+1. The list views already batch their lookups into one query, and the high count was just 355 tests each doing honest setup. The only real waste was a save() path firing redundant COUNT queries, cheap, but easy to halve.

Which is the whole point: I guessed N+1 and was wrong again. The profiler keeps you honest. What's the slowest thing you run every day that you've never actually measured? Let me know on LinkedIn or X/Twitter.

June 20, 2026 12:00 AM UTC

June 19, 2026


Core Dispatch

Core Dispatch #6

Welcome back to Core Dispatch! This edition covers June 4 through 19, 2026. Python 3.14.6 and 3.13.14 landed on June 10, and the next milestone is 3.15.0 beta 3 on June 23.

The big news this fortnight comes from the Steering Council, who put out an announcement on the path forward for the experimental JIT. The JIT entered CPython's main branch as an experiment, alongside the Informational PEP 744. The Council would like to see its path forward worked out through a Standards Track PEP, giving the project the explicit, structured conversation it hasn't really had yet about what people expect from a JIT, including performance targets, interop guarantees, and tooling compatibility.

On a related note, JIT contributors have opened a thread to gather community perspectives on the JIT as they begin drafting that PEP. Give it a read, and if you've got experiences, expectations, or concerns to share, it's a good place to weigh in.

It's been a bit quieter on the PEP front over the past two weeks, though PEP 835, a shorthand syntax for Annotated type metadata, was newly drafted.

Over on the PSF side, the Board has published the draft of its 2026 strategic plan: six organizational goals and four program goals spanning financial sustainability, supply chain security, and community empowerment. The feedback window is open through June 25, so if you've got thoughts, now's the time. The 2026 PSF Board election dates are out too.

As always, if you maintain a package or just like living on the edge, give the latest 3.15 beta a spin and file any issues you find.

Upcoming Releases

Official News

PEP Updates

Steering Council Updates

Merged PRs

Discussion

Core Dev Musings

Upcoming CFPs & Conferences

Community

Credits

June 19, 2026 12:00 AM UTC


Bob Belderbos

End-to-End Testing Every Rust Exercise with Playwright

The Rust platform has 71 exercises and counting (I just added a new track of Unix exercises). They all share the same interface: load an editor, type code, validate it against a Rust backend. When I make any changes to the platform, how do I confirm nothing breaks? Enter end-to-end testing with Playwright.

The Problem

Manual testing doesn't scale. Every time I add an exercise, tweak the editor, or update the validation flow, I need confidence that all exercises still work. Not just that the page loads, but the full loop: login, navigate, type code, submit, see results.

Unit tests cover the Django app, and the Rust validator has its own test suite. But neither exercises the full path a student takes: loading an exercise and getting a real pass or fail back.

One Test Function, 71 Test Cases

Playwright with pytest covers this in under 50 lines. Here's the core test file:

import psycopg2
import pytest
from decouple import config

from .constants import DOMAIN

exercises = []
with psycopg2.connect(dsn=config("DATABASE_URL")) as conn:
    with conn.cursor() as cursor:
        cursor.execute(
            "SELECT slug, solution FROM bites_exercise WHERE public = true"
        )
        exercises = cursor.fetchall()


@pytest.mark.parametrize("exercise", exercises, ids=[ex[0] for ex in exercises])
def test_exercise(logged_in_page, exercise):
    slug, solution = exercise
    page = logged_in_page

    exercise_url = f"{DOMAIN}/{slug}"
    page.goto(exercise_url)
    page.wait_for_url(exercise_url)

    page.wait_for_selector(".CodeMirror", state="visible")
    page.wait_for_function(
        "document.querySelector('.CodeMirror')?.CodeMirror !== undefined"
    )

    page.evaluate(
        f"""document.querySelector('.CodeMirror').CodeMirror.setValue({repr(solution)})"""
    )
    page.click("#validate-button")

    page.wait_for_function(
        "document.querySelector('#feedback').innerText.includes('Congrats') || "
        "document.querySelector('#feedback').innerText.includes('Oops')",
        timeout=30000,
    )

    validate_result = page.text_content("#feedback")
    assert "Congrats, you passed this exercise" in validate_result

The database query at module load fetches every public exercise with its solution. @pytest.mark.parametrize turns that into 71 test cases. Each test navigates, injects the solution, validates, and asserts success.

Here is Playwright running the tests locally against the real Rust validator, one exercise after another:

Your browser does not support embedded video.

Patterns That Made It Work

Session-scoped fixtures for speed

Launching a browser is expensive. Logging in is expensive. Do it once:

@pytest.fixture(scope="session")
def browser(e2e_user):
    with sync_playwright() as p:
        with p.chromium.launch(headless=HEADLESS) as browser:
            yield browser


@pytest.fixture(scope="session")
def logged_in_page(browser):
    page = browser.new_page()
    page.set_default_timeout(30_000)
    page.goto(f"{DOMAIN}/pbadmin/")
    page.fill('input[name="username"]', LOGIN)
    page.fill('input[name="password"]', PASSWORD)
    page.click('input[type="submit"]')
    yield page

All 71 exercises run against the same authenticated browser session, so the login cost is paid once.

Waiting for CodeMirror

One tricky thing with Playwright is timing. Sometimes elements are not yet ready when you hit the page. In this case you have to wait for the CodeMirror editor to be fully initialized before injecting code:

page.wait_for_selector(".CodeMirror", state="visible")
page.wait_for_function(
    "document.querySelector('.CodeMirror')?.CodeMirror !== undefined"
)

page.evaluate(
    f"""document.querySelector('.CodeMirror').CodeMirror.setValue({repr(solution)})"""
)

First we wait for the selector to be visible, then we wait for the JavaScript instance to be ready.

Avoiding Django's async context trap

Another issue I faced was creating a Django user inside a Playwright fixture triggered:

SynchronousOnlyOperation: You cannot call this from an async context

I worked around it by creating the test user in a separate fixture that runs before Playwright starts:

@pytest.fixture(scope="session")
def e2e_user(django_db_blocker):
    with django_db_blocker.unblock():
        return ensure_e2e_user()


@pytest.fixture(scope="session")
def browser(e2e_user):  # e2e_user runs first
    with sync_playwright() as p:
        ...

The django_db_blocker.unblock() context manager allows database access in session-scoped fixtures. Order matters: the user must exist before the browser fixture runs.

Running Locally vs CI

The E2E suite runs against a live database with the Rust validator running. That's deliberate: for this layer I want real integration, not mocked responses. (The mocked cases have their own home, more on that below.)

# Run all 71 exercises
uv run pytest tests/test_e2e.py -v

# Debug a specific exercise
HEADLESS=False uv run pytest tests/test_e2e.py -v -k "exercise-slug"

By default, the tests run headless, which means no browser window opens. This is faster and works well in CI. If you want to see what Playwright is doing, set HEADLESS=False to open a visible browser window.

This is essential for debugging why a particular exercise fails. Use the -k option to filter for a specific exercise by slug. And you can use --pdb to leave the browser window open when a test fails, so you can inspect the state.

For CI, I run unit tests only. E2E tests require the Rust backend and take longer. I run them locally before pushing major changes. This is a good example of separating unit and integration tests.

What about the unhappy paths?

A reader pointed out that these E2E tests only cover the happy path: type the correct solution, see "Congrats". Fair observation. What happens when a student submits wrong code, or the validator itself blows up?

Those cases live in the unit tests, where I mock out the validator API call:

ScenarioTestAsserted message
Wrong code (tests fail)test_validate_failuremock returns success: Falseb"Oops, try again"
Correct codetest_validate_successsuccess: Trueb"Congrats"
Runner/API throwstest_validate_api_errormock_post.side_effect = Exceptionb"Error while executing code"
Exercise doesn't existtest_validate_exercise_not_foundb"Exercise not found"
Tests deleted/alteredtest_validate_missing_testsb"tests are missing"
Prohibited code (std::fs, unsafe, include!, std::io...)test_validate_prohibited_pattern_*b"prohibited pattern" (parametrized over 6 snippets)

This is the split that makes the whole thing manageable. The E2E suite proves the full loop works end to end against the real validator. The unit tests, with the validator mocked, assert the exact message a student sees in each error case. Mocking is the natural home for these: some failure modes, like the runner throwing an exception, are hard to trigger on demand against a live backend, and even the ones you could reproduce in a browser are faster and clearer to pin down with a mock. See How to Tell if Your Python Mock Is Actually Working for the gotchas there.

What this buys me

Add an exercise, it's tested automatically, no new test code. That's the whole point of parameterizing over the database instead of hand-writing cases. Every frontend change now runs through a regression suite before it reaches users.

I find Playwright more modern and ergonomic than Selenium; the one rough edge is element timing, which the wait_for_function calls above handle. If you're testing anything with dynamic content, parameterizing over your real data beats writing one test per case.

June 19, 2026 12:00 AM UTC

June 18, 2026


Ned Batchelder

Dodecahedron with stars

I saw this dodecahedron with an Islamic-inspired pattern designed by Taj Ragoo. As soon as I saw it, I knew I had to make one. I studied the pattern, wrote some Python, and made myself a PDF. I cut it out, folded it, glued it together, and now I have one of my own:

A paper dodecahedron on a cluttered desk

I love that this elegantly combines two pure geometric forms: the Platonic dodecahedron (12 uniform pentagons), and an Islamic pattern using five-pointed stars.

Looking closely, details emerge:

The dodecahedron with some regions of the pattern highlighted in colors

Each face has ten small stars in a ring. I’ve lightened them a bit in the front face here. At the center of each face is a ten-pointed star (highlighted in red), made of two overlaid five-pointed stars.

The real genius of the pattern is at the corners. I’ve highlighted one in blue. It’s a star made of the same parts as the central ten-pointed star, but there are only nine points. It works because three pentagons lying flat touching at a point occupy 324 degrees, leaving a 36-degree gap.

When the dodecahedron is folded together, the gap is closed. 36 degrees is exactly one-tenth of a complete 360-degree circle, so exactly one point of the ten-pointed star is missing, leaving a perfect nine-pointed star using the same shapes, spread over the corners of three pentagons. Beautiful!

If this appeals to you, follow Taj on Instagram: he’s got more Platonic/Islamic mashups to enjoy. The paper versions are just prototypes of the final versions he makes in wood.

Of course, you can get my PDF and make one for yourself:

A thumbnail of the PDF

The Python code to draw the net isn’t great: it has no real parallels to the structure of each face. It’s a lot of math and line drawing to get things in the right places. My ideal would be to have a toolset that used a tile-placing abstraction, to be able to do more interesting designs. Some day.

It was a joy to work on this though. It was a slow process of studying the original, working out the math, then mulling over coding approaches. The code was developed in small steps over weeks. Then printing initial versions, marking them up, working out the tab structure. Some copies were colored to understand how the lines flowed across the whole dodecahedron. It was good to be working in both the mental and physical worlds:

Various stages of progress in a messy pile

Update: it looks like the design was originally by Dana Awartani: Dodecahedron Within an Icosahedron.

June 18, 2026 10:15 AM UTC


Python Software Foundation

PSF Board Election Dates for 2026

Python Software Foundation (PSF) Board elections are a chance for the community to choose representatives to help the PSF create a vision for and build the future of the Python community. This year, there are 4 seats open on the PSF Board. Check out who is currently on the PSF Board on our website. (Cheuk Ting Ho, Christopher Neugebauer, Denny Perez, and Georgi Ker are at the end of their current terms.) 

The recent approval of the Packaging Council (PC) through PEP 772 means that the PC election will be held in parallel to the PSF Board election. For the first PC election, communications will be published on the PSF blog. Once the first PC has been established, they will define the standard lines of communication and more PC election process specifics for the future. More information on the PC election coming soon.

Board Election Timeline

Voting 

You must be a Contributing, Supporting, or Fellow member by August 25th and affirm your intention to vote to participate in this election. Reminder: If you were formerly a Managing member, your membership type was changed to Contributing per 2024’s Bylaw change that merged Managing and Contributing memberships

Check out the PSF membership page to learn more about membership classes and benefits. You can affirm your voting intention by following the steps in our video tutorial:

Per another recent Bylaw change that allows for simplifying the voter affirmation process by treating past voting activity as intent to continue voting, if you voted last year, you will automatically be added to the 2026 voter roll. Please note that if you removed or changed your email on psfmember.org, you may not automatically be added to this year's voter roll. 

If you have questions about membership, please email psf-elections@pyfound.org.

Election communications from psfmember.org

PSF Members should review their communication preferences on psfmember.org if you would like to opt in or out of receiving emails about either the PSF Board, PC elections, or both. Here’s how:

If you had previously opted out of communications from the PSF through psfmember.org and would like to start receiving them, we encourage you to update them using the instructions above. If you're not sure what how your psfmember.org communication preferences are currently set, you can check via the "Name and Address" tab mentioned above, and make any adjustments as desired. 

The PSF only sends a handful of election and fundraising related communications every year via psfmember.org. The PSF newsletter runs through a separate mailing list (and we heartily welcome you to sign up for our newsletter!). 

Run for the Board

Who runs for the board? People who care about the Python community, who want to see it flourish and grow, and also have a few hours a month to attend regular meetings, serve on committees, participate in conversations, and promote the Python community. We're looking for candidates with a diverse range of skills and backgrounds, including leadership experience, fundraising knowledge, non-profit familiarity, and event organizing. Technical expertise, a record of collaboration, and experience speaking or teaching in the Python community are also all qualities we hope to see in Board members.

Want to learn more about being on the PSF Board? Check out the following resources to learn more about the PSF, as well as what being a part of the PSF Board entails:

You can nominate yourself or someone else. If you're nominating someone else, we'd encourage you to reach out to them first to make sure they're excited about the opportunity and give them a heads up that they'll need to submit their own nomination statement too. Nominations open on Tuesday, July 28th, 2:00 pm UTC, so you have time to talk with potential nominees, research the role, and craft a nomination statement for yourself or others. Take a look at last year’s nomination statements for reference. 

Learn more and join the discussion

You are welcome to join the discussion about the PSF Board election on our forum. This year, we’ll also be hosting PSF Board Office Hours on the PSF Discord in July and August to answer questions about running for and serving on the board. Subscribe to the PSF blog or, if you’re a member, join the psf-member-announce mailing list to receive updates leading up to the election.

June 18, 2026 08:18 AM UTC


Bob Belderbos

When to use classmethod, staticmethod, or instance method in Python

In a coaching call this week we discussed a create classmethod, and someone asked the obvious question: why is that here? It just forwarded its arguments to __init__. We ended up discussing the difference between instance methods, classmethods, and staticmethods, and how to tell which is which. Here's a simple decision rule.

The decision rule

Look at what the method actually touches:

  1. Needs the instance (self) → instance method
  2. Needs the class (cls) but not a specific instance → @classmethod
  3. Needs neither@staticmethod

Nice, but what are some actual use cases? Let's look at the create method that prompted the question.

The create method from the call fails the rule above. It took the same arguments as __init__ and passed them straight through. It still adds a nice interface (Class.create(...)), but it doesn't do any work that the constructor doesn't already do:

# shortened for clarity
@classmethod
def create(cls, amount: Decimal, currency: Currency = Currency.EUR) -> "Expense":
    return cls(amount=amount, currency=currency)

When a classmethod earns its place

A classmethod pulls its weight when it does work the constructor shouldn't, or builds the object from a different starting point. Add a normalization step and the same method suddenly has a job:

@classmethod
def create(cls, amount: Decimal, currency: Currency = Currency.EUR) -> "Expense":
    return cls(amount=amount.quantize(Decimal("0.01")), currency=currency)

The canonical use of a @classmethod is the alternative constructor. Python won't let you overload __init__, so when you want to build an object several ways, each way becomes a classmethod.

The standard library has rich examples, for example take a look at datetime.date:

date.today()                      # from the system clock
date.fromtimestamp(1718539200)    # from a POSIX timestamp
date.fromisoformat("2026-06-16")  # from an ISO 8601 string
date.fromordinal(739418)          # from a proleptic Gregorian ordinal
date.fromisocalendar(2026, 25, 1) # from ISO year/week/day

Source:

    # Additional constructors

    @classmethod
    def fromtimestamp(cls, t):
        "Construct a date from a POSIX timestamp (like time.time())."
        if t is None:
            raise TypeError("'NoneType' object cannot be interpreted as an integer")
        y, m, d, hh, mm, ss, weekday, jday, dst = _time.localtime(t)
        return cls(y, m, d)

    @classmethod
    def today(cls):
        "Construct a date from time.time()."
        t = _time.time()
        return cls.fromtimestamp(t)

    ...
    ...

Every one of those returns a date, but starts from different raw material. They have to be classmethods because they need cls to construct the instance, and they return cls(...) which makes it also work with subclasses. For instance, if MyDate subclasses date, then MyDate.today() will return a MyDate instance, not a date.


Bonus: I was annoyed that my pysource package didn't work, so I've since patched it, and now you can get to this source code easily with:

uvx --from pybites-pysource pysource -m datetime.date

(I tend to pip this into Vim with | vi - to read the source code in a scratch buffer.)


You'll see the same pattern across the ecosystem: dict.fromkeys(...), int.from_bytes(...), and in Pydantic Model.model_validate(...) / model_validate_json(...) are all classmethods that build an instance from different raw material.

Another classmethod use case is class-level state: registries, caches, counters. A plugin registry is the clean example, because the method reads and mutates state that belongs to the class, not to any instance:

class Handler:
    _registry: dict[str, type["Handler"]] = {}

    @classmethod
    def register(cls, name: str, handler: type["Handler"]) -> None:
        cls._registry[name] = handler

    @classmethod
    def get(cls, name: str) -> type["Handler"]:
        return cls._registry[name]


# called on the class, no instance needed; it mutates state that lives on the class
Handler.register("json", JSONHandler)

When it's really a staticmethod

If the method touches neither self nor cls, it's a staticmethod, which is a plain function that happens to live inside the class for namespacing. That's a legitimate choice when the helper is tightly bound to the class and you want Expense.normalize(...) to read well. It's now part of the class API (it shows up in dir(Expense)) and can be called without an instance.

Genuine staticmethods are rarer than the other two, which itself tells you something. A clean example is a Color class with conversion helpers (from a Pybites exercise):

class Color:
    def __init__(self, name: str):
        self.name = name
        self.rgb = COLOR_NAMES.get(name.upper())

    @staticmethod
    def hex2rgb(hex_value: str) -> tuple[int, int, int]:
        return tuple(int(hex_value[i:i + 2], 16) for i in (1, 3, 5))

    @staticmethod
    def rgb2hex(rgb: tuple[int, int, int]) -> str:
        return f"#{rgb[0]:02x}{rgb[1]:02x}{rgb[2]:02x}"

hex2rgb and rgb2hex touch neither the instance nor the class. They're pure conversions that live on Color so Color.hex2rgb("#ff0000") reads well next to the rest of the API.

But that's exactly the signal worth noticing: a staticmethod might be just a function in disguise, and sometimes the honest move is to pull it out to a module-level function where it's easier to test and use on its own.

Summary

Method typeFirst argumentAccess toCommon use case
Instance methodselfInstance & class stateModifying object state
Class method (@classmethod)clsClass state onlyAlternative constructors, registries
Static method (@staticmethod)noneNeitherIsolated utility/helper functions

Why this matters more now

When you write the code yourself, you rarely add a method without a reason. When an agent writes it, you get plausible-looking structure that nobody chose. A create classmethod that does nothing, a staticmethod that should be a free function, a helper hanging off the wrong class. That's your judgment call: is this method doing work that belongs to the class, or is it just a pattern the agent learned from other code?

It pays to slow down and look critically at any code and ask those questions. With AI producing more code faster, it's easy to assume that if it looks like Python, it's good Python. But the agent has no taste, and it will happily produce code that is technically correct but structurally wrong.

This is also why I keep writing articles like this one: to give you a simple decision rule you can run in your head during review. It reminds me of Rust, which makes data flow explicit right in the signature with self, &self, and &mut self. The signature tells you what the method touches, same idea as the rule here. (That data-and-behavior split is the whole theme of Why Rust does not need OOP.)

So use AI, but keep developing your knowledge and taste. The more you know, the better you judge the code that comes your way, whether a human or an agent wrote it.

June 18, 2026 12:00 AM UTC


Seth Michael Larson

TIL “@here” only notifies online users on Discord and Slack

I'm in a few Discord servers of friends and we get together in-person regularly. Whenever I was the one organizing an event I would attempt to ping everyone details in the Discord using @here.

After the initial ping I would usually follow-up with folks over text, which is fine and expected part of organizing. More often than not, invitees would be way more responsive over text than over Discord. I created an SMS BCC tool because of how effective texting is for organizing events.

Turns out that @here is functionally different from @everyone or @channel on Slack and Discord. @here only sends notifications to users that are currently online 🟢, not offline ⚫ or “away” 🟡. This makes @here useful for when you’re trying to play an online multiplayer game or chat synchronously... but not for planning a hang-out in advance. So none of my usually-offline friends on Discord would get my initial notification, only the follow-ups. I’ll be using @channel for this purpose from now on.

I learned this from a friend and three people including me were not aware of this distinction, so I figure I have to share this on the blog. Maybe this will help you increase the turn-out for the next event you host for Discord friends. What other Discord or Slack hacks am I probably unaware of? Send them to me via email or on social media.

Happy organizing!



Thanks for reading ♥ I would love to hear your thoughts! Contact me via Mastodon, Bluesky, or email. Browse the blog archive. Check out my blogroll.



June 18, 2026 12:00 AM UTC

June 17, 2026


PyCharm

Every developer has tools they rely on daily. The workflows they’ve built around them, the ways they’ve learned to move faster, debug smarter, and write better code – that kind of hands-on experience can be hard to put into words.

We’re collaborating with LinkedIn to make it easier for you to showcase your expertise with JetBrains IDEs on the world’s largest professional network. You can now connect your IDE to LinkedIn and let your real tool usage speak for itself.

Connect your IDE

IntelliJ IDEA, PyCharm, WebStorm, PhpStorm, Rider, GoLand, CLion, RustRover, and RubyMine are already supported via a free plugin, while support for DataGrip is coming soon.

In this blog post, we’ll explain what LinkedIn connected apps are, what they mean for your profile, and how to get started.

What this is about

Building on early collaboration with Descript, Duolingo, Lovable, Relay.app, and Replit, LinkedIn is expanding the range of apps you can feature on your LinkedIn profile, turning real-world product usage into a credible, visible signal of practical tool experience. We’re glad to join forces with them to bring this to JetBrains IDE users.

“We’re building new ways for members to show real, credible proof of what they’re capable of, right on their LinkedIn profile. And for the brands behind these tools, there’s no better endorsement than a customer who’s actively using and loving your product.”
– Dan Shapero, CEO of LinkedIn

Connected apps let you link the tools you use in your daily work directly to your LinkedIn profile, where they appear prominently, helping you stand out to your professional network. Once connected, each app generates a simple statement based on how you actually use it. Unlike manually added skills, this is based on real usage and updates automatically as your experience evolves.

How to get started

Open your JetBrains IDE, go to Settings | Plugins, search for the LinkedIn Connected Apps plugin under the Marketplace tab, and install it. 

Once installed, the plugin starts collecting data locally about how you use your IDE. Depending on your usage history, you may receive an initial statement right away, which will then be updated once the plugin has collected enough data to better reflect your real IDE expertise.

LinkedIn-integration-in-JetBrains-IDEs-example

Your IDE usage data stays on your machine. When you are ready, you can connect your LinkedIn account and share your IDE expertise badge there. If you keep the plugin installed, your badge will update automatically as your IDE usage evolves.

The plugin is free for all JetBrains IDE users.

What’s included in this release

This is the first version of the integration, delivered as a standalone plugin rather than being built directly into the IDE. It covers IntelliJ IDEA, PyCharm, WebStorm, GoLand, PhpStorm, Rider, CLion, RustRover, and RubyMine; DataGrip is not yet supported. 

Usage is detected within the IDE itself, so if you use AI features via an external tool or terminal, those won’t be reflected yet.

How your IDE expertise is determined

The model is intentionally simple for now. It is designed to represent your practical use of JetBrains tools, not to rank developers or certify skill levels. Our goal was to provide a solid starting point, but we know there’s more to capture about how developers work with their IDEs.

Statements map to different levels of experience and are generated based on how you interact with your IDE – from writing and editing code using basic features to working with debugging tools, version control, and AI-assisted workflows. For more information, see our documentation.

What’s coming next

We’re already working on the next version, planned for later this year. We’ll focus on improving how IDE usage, including AI feature usage, is represented, expanding support to DataGrip, and making the overall experience feel more integrated.

FAQ

Which JetBrains IDEs are supported?

IntelliJ IDEA, PyCharm, WebStorm, GoLand, PhpStorm, Rider, CLion, RustRover, and RubyMine. Support for DataGrip is coming soon.

Why isn’t DataGrip supported yet?

DataGrip is designed for working with databases and includes workflows that differ from our other IDEs. We plan to support it soon.

Can I connect multiple IDEs?

Yes, if you use multiple supported JetBrains IDEs, you can connect each of them. You’ll get a badge for all connected IDEs.

Note: If you use multiple JetBrains accounts across different IDE instances but link them all to the same LinkedIn profile, IDE usage statements from each account will be displayed on that LinkedIn profile.

Do I need to keep the plugin installed after connecting? 

You can share your IDE usage statement once and then remove the plugin, but note that it must remain installed if you want to track your ongoing progress and have any changes reflected on LinkedIn.

Is this feature free?

Yes, it’s available to all JetBrains IDE users at no cost.

Is this a certification?

Connected apps reflect real IDE usage and are designed to showcase applied experience, not to act as a formal certification or skill ranking. Certifications, degrees, and licenses remain important markers of professional achievement. Connected apps on LinkedIn add a different kind of signal: partner-validated tool usage that reflects practical work and can update over time.

What data is shared with LinkedIn and JetBrains?

Only the information required to represent your connected account and IDE usage statement. 

Will this help me get hired?

Having connected apps on your LinkedIn profile is extra proof of your practical experience with leading tools. While connected apps make your expertise visible, they are just one part of your profile. Think of it as a way to let your tooling speak for itself.

Give it a try and let us know what you think in the comments below. We’re continuing to develop this integration, and your feedback will help shape what comes next.

June 17, 2026 04:40 PM UTC


Talk Python to Me

#552: Astral joins OpenAI

OpenAI just acquired Astral, the company behind uv, Ruff, and ty. And if your first thought was "wait, is uv toast?", you are not alone. But here's the twist Charlie Marsh shared with me: he thinks they may ship more open source at OpenAI than they ever did at Astral. On this episode, we get into the acquisition, the mixed feelings, the future of your favorite Python tools, and what it's like to build right at the center of the AI universe.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/sentry'>Sentry Error Monitoring, Code talkpython26</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <h2 class="links-heading mb-4">Links from the show</h2> <div><strong>Guest</strong><br/> <strong>Charlie Marsh</strong>: <a href="https://github.com/charliermarsh?featured_on=talkpython" target="_blank" >github.com</a><br/> <br/> <strong>The announcement</strong>: <a href="https://astral.sh/blog/openai?featured_on=talkpython" target="_blank" >astral.sh</a><br/> <strong>OpenAI</strong>: <a href="https://openai.com/?featured_on=talkpython" target="_blank" >openai.com</a><br/> <strong>uv</strong>: <a href="https://github.com/astral-sh/uv?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>ty</strong>: <a href="https://github.com/astral-sh/ty?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>Ruff</strong>: <a href="https://github.com/astral-sh/ruff?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>pyx</strong>: <a href="https://astral.sh/pyx?featured_on=talkpython" target="_blank" >astral.sh</a><br/> <strong>Codex team</strong>: <a href="https://openai.com/codex/?featured_on=talkpython" target="_blank" >openai.com</a><br/> <strong>Anthropic did something similar by acquiring Bun</strong>: <a href="https://www.anthropic.com/news/anthropic-acquires-bun-as-claude-code-reaches-usd1b-milestone?featured_on=talkpython" target="_blank" >www.anthropic.com</a><br/> <strong>Daily Stars Explorer</strong>: <a href="https://emanuelef.github.io/daily-stars-explorer/#/astral-sh/uv" target="_blank" >emanuelef.github.io</a><br/> <br/> <strong>Agentic AI Programming for Python</strong>: <a href="https://training.talkpython.fm/courses/agentic-ai-programming-for-python" target="_blank" >training.talkpython.fm</a><br/> <strong>Python Web Security: OWASP Top 10 with Agentic AI</strong>: <a href="https://training.talkpython.fm/courses/agentic-ai-python-security" target="_blank" >training.talkpython.fm</a><br/> <br/> <strong>Episode #552 deep-dive</strong>: <a href="https://talkpython.fm/episodes/show/552/astral-joins-openai#takeaways-anchor" target="_blank" >talkpython.fm/552</a><br/> <strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/552/astral-joins-openai" target="_blank" >talkpython.fm</a><br/> <br/> <strong>Theme Song: Developer Rap</strong><br/> <strong>🥁 Served in a Flask 🎸</strong>: <a href="https://talkpython.fm/flasksong" target="_blank" >talkpython.fm/flasksong</a><br/> <br/> <strong>---== Don't be a stranger ==---</strong><br/> <strong>YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" ><i class="fa-brands fa-youtube"></i> youtube.com/@talkpython</a><br/> <br/> <strong>Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm</a><br/> <strong>Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i> @talkpython@fosstodon.org</a><br/> <strong>X.com</strong>: <a href="https://x.com/talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @talkpython</a><br/> <br/> <strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes</a><br/> <strong>Michael on Mastodon</strong>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i> @mkennedy@fosstodon.org</a><br/> <strong>Michael on X.com</strong>: <a href="https://x.com/mkennedy?featured_on=talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @mkennedy</a><br/></div>

June 17, 2026 03:20 PM UTC


PyCharm

Your JetBrains IDE Expertise, Now on LinkedIn

June 17, 2026 01:11 PM UTC


Django Weblog

Announcing the Search for a DSF Executive Director

The Django Software Foundation is hiring its first Executive Director, and we have the Django community to thank for making it possible.

Six Django web development agencies have jointly pledged $47,500 to help fund the Executive Director's first year: Caktus Group, Lincoln Loop, Six Feet Up, Cuttlesoft, OddBird, and Two Rock. This is the financial foundation we needed to move from "we should hire an ED someday" to "we are hiring an ED now."

Why This Role Matters

The DSF has grown significantly over the past several years. We fund multiple Django Fellows, distribute grants to events around the world, manage corporate and individual memberships, oversee working groups, and handle the legal and operational responsibilities of a 501(c)(3) nonprofit. For years, volunteer board members have carried this operational load alongside their regular jobs. That dedication has carried us far, but there are real limits to what a volunteer board can do.

An Executive Director changes that. This person would handle day-to-day operations and administration, sponsorship development and partner relationships, community outreach and communications, coordination with the Django Fellows and working groups, and grant management and financial reporting.

It is a paid position, part-time or full-time depending on the candidate, for someone who understands open source communities and is genuinely excited about helping Django thrive as both a framework and a foundation.

"For years our board has run the DSF on volunteer time, and we've hit the limits of what that can do. An Executive Director lets us actually grow the work, more support for Django Fellows, better fundraising, and the operational help we've needed for a long time," said Jeff Triplett, DSF Board President. "We've talked about this hire for years, and funding was always what held us back. Six agencies who compete with each other decided to put money in together so we could finally do it. That tells you how much this community cares about Django's future."

The Agencies Who Made This Possible

These six agencies compete for the same clients, but they share a foundation: Django. That shared reliance drove them to collaborate on this pledge, and we want to recognize each of them.

Caktus Group ($12,500), the Durham, NC consultancy founded in 2007 and known for data-intensive Django work with clients like UNICEF and the University of Chicago, put it directly. CEO Tobias McNulty said: "Django is the bedrock of our business, and as a smaller team, contributing is a significant investment. We hope this coordinated action from six agencies sends a clear signal to the rest of the industry: it's time to contribute to the core technology that makes our businesses possible."

Lincoln Loop ($10,000), the remote-first Python consultancy that has built platforms for Planned Parenthood, Wharton, Mozilla, and PBS, framed it as a question of sustainability. Founder Peter Baumgartner said: "We've seen the Django community thrive under volunteer leadership, but we've reached a ceiling. The Executive Director role is about sustainability, providing the leadership and structure needed to scale the DSF's impact and protect Django's long-term future."

Six Feet Up ($10,000), the woman-owned consultancy founded in 1999 with clients including Capital One, Purdue University, and UNEP, focused on what this means for enterprise confidence in Django. CEO Gabrielle Hendryx-Parker said: "Tech leaders stake their roadmaps on the long-term viability of their technology stack. A full-time Executive Director de-risks the framework's future, protecting the robust and lasting systems we build for our clients and ensuring Django remains a bankable, innovative choice."

Cuttlesoft ($10,000), the product agency based in Tallahassee and Denver that has been building with Django since 2014, sees the hire as an investment in the whole ecosystem. Co-founder Frank Valcarcel said: "Investing in a dedicated Executive Director is a proactive step toward ensuring Django's continued evolution. We believe this role will unlock new opportunities for growth and collaboration within the community, benefiting all who rely on this incredible framework."

OddBird ($2,500), the remote boutique agency co-founded by Django core developer Jonny Gerig Meyer, has been contributing to both the framework and the community for more than 17 years. Jonny said: "Adding a dedicated Executive Director helps the DSF ensure Django's long-term sustainability, giving developers and enterprise clients peace of mind choosing the Django ecosystem. This investment is a no-brainer, and we're thrilled to partner with other peer agencies to help make it a reality."

Two Rock Software ($2,500), the Django-focused custom development shop with deep roots in the Django events community, rounded out the pledge. Co-founder Peter Grandstaff, who serves as President of Django Events Foundation North America and helps run DjangoCon US, said: "As President of Django Events Foundation North America, I know how hard it is for a volunteer board to run an effective organization. I feel strongly that Django is at a point where an Executive Director is the right step into the future."

We Need Your Help Too

This $47,500 pledge is a launchpad, not the finish line.

Hiring an Executive Director means taking on a recurring cost that our current fundraising levels cannot sustain on their own. That is why the DSF is raising its annual fundraising goal from $300,000 to $500,000 (2026 fundraising goals). The additional funding reflects what it takes to responsibly hire and maintain this role, continue supporting our Django Fellows, and keep the rest of our programs running without cutting corners.

Six agencies stepped up first. We are asking others to follow.

If your company builds on Django, sells products that run on Django, or employs developers who work with Django every day, this is your opportunity to invest in the infrastructure that makes that possible. No contribution is too small, and every organization that joins this effort makes it easier for the next one to say yes.

You can reach out directly through our Contact the DSF page to discuss a contribution toward the Executive Director fund, or make a general donation at djangoproject.com/foundation/donate/. Individual community members can also contribute via Open Collective.

With the community's help, with your company's help, we can get there.

What Comes Next

The board is formalizing the hiring process and will publish the job posting in the coming weeks. When it is ready, we will announce it across the DSF blog, Django Forum, Django Discord, and our other community channels.

If you know someone who would be a great fit for this role, start thinking about them now.

Thank you to Caktus Group, Lincoln Loop, Six Feet Up, Cuttlesoft, OddBird, and Two Rock for leading the way.

June 17, 2026 01:00 PM UTC


Python Software Foundation

Everything Security at PyCon US 2026

Phew, PyCon US 2026 is a wrap! Now it's time to share about everything security that happened in case you weren't able to attend (or you just want to reminisce). Subscribe to the PyCon US channel on YouTube so you're notified as soon as recordings for each talk are published. This blog post will also be updated with links once all talks are available.

PyCon US Security Track

Hala Ali on generating SBOMs directly from the Python runtime

Juanita Gomez and Seth Larson were the chairs of the first talk track dedicated to security at PyCon US: Trailblazing Python Security! We're excited to share the recordings for each talk featured in the track:

Thanks so much to the speakers and volunteers who helped make this inaugural track a success. For several of the talks above the room was standing-room only! The support and interest in security topics from the Python community was incredible to see and we're hoping to see you all again next year to continue learning and sharing ideas.

PSF Security Update

"Security isn't free!"
 

Following Amanda Casari's amazing keynote, Mike Fiedler and Seth Larson took the stage to give a brief update of the past year of security work at the Python Software Foundation (PSF).

Overall 2026 was the year of more, both good and not-so-good. More packages than ever and being published to the Python Package Index (PyPI), but also more malware and specifically watering-hole attacks targeting PyPI users. The double-edged sword of being a popular and widely-used programming language also makes Python and its users a more interesting target for attackers.

The slides for this presentation are available for download via speakerdeck.

OSS Maintainer Security Open Space


For the fourth year in a row Seth Larson hosted a security-themed Open Space at PyCon US. This year the open space was titled "Security for Open Source project maintainers" with the goal of "gather with fellow open source project maintainers to discuss current challenges with open source security".

A handful of Open Source maintainers were present to discuss security issues. The format was open-ended discussion with a few prompts to start the discussion off including vulnerability handling and CI/CD security.

CI/CD Security

Following the many watering-hole attacks on established Open Source projects involving CI/CD pipelines, hardening project CI/CD pipeline definitions was the first discussion topic. The overwhelming recommendation was to use Zizmor with its --fix mode and a GH_TOKEN. Other tools came up such as gha-update, pinact, Dependabot, Renovate, and using lock files like pip-compile to lock dependencies in your CI/CD workflows. Dependency Cooldowns were also a popular concept for dependencies involved in builds and publishing.

The most recent resource published for all-in-one repository security was a blog post by William Woodruff on open source security at Astral that details CI/CD security and how to configure repositories.

Vulnerability Reporting

The bulk of the discussion was about vulnerabilities and challenges around handling the volume of reports from reporters using LLMs. The prevailing theme is that the volume of reports has increased substantially, with anec-data being that vulnerability handling "previously was ~20% of time spent on a project" and is now "almost all" the time spent. Many reports are duplicates, verbose, extremely low quality due to the use of LLMs but the number of valid or almost-security issues has increased, too.

This "almost all" number is particularly frightening, many Open Source contributors didn't get into this line of volunteering because they wanted to work on security-related tasks.

There was some side discussion about how to judge whether handling a vulnerability in private was still a useful thing to do if the vulnerability is trivially discoverable using a publicly available LLM. The conversation referenced the Linux kernel's discussion of the same topic.

Security Policies & Threat Models

Talking about ways to mitigate the negative effects of LLMs and agents on security work lead to a discussion of security policies and threat models. Few projects, especially smaller ones, have tried this approach of documenting their threat model to see if this has a meaningful impact on the quality or quantity of reports received.

PythonDjango, Node, and curl were given as good examples of threat models to copy and learn from for your own projects.

There was an issue of discoverability, some documentation is in CONTRIBUTING.md, or on a website, but not checked into source control for the actual project, or used an organization-wide .github/SECURITY.md. Some projects didn't use an AGENTS.md (and didn't want to, for fear of inviting even more LLM-driven contributions), and it was difficult to tell whether any particular documentation was having an effect. There's also the difficulty of models changing or becoming more capable over time. More testing is necessary here!

Contributor Quality Signals

A separate meta-conversation through the previous topics was about having a way to signal that a particular contributor or security researcher had a high "contributor quality". The value of such a signal would tell maintainers where to focus their limited time, such as reports from someone more likely to engage with the process and follow instructions. "Talking with an LLM, indirectly" was mentioned multiple times as a negative but unfortunately common experience of maintainers interacting with first-time contributors.

gh-profiler from Eric Matthes was referenced during the discussion, and a few maintainers tested this on their own profiles and profiles of low-quality contributions they'd received recently. There was an interest in finding metrics or signals that are tougher to automate or fake. The group identified that as soon as such a signal was widely used that agents would simply "route around" the barrier.

Alpha-Omega × Python Software Foundation 

Thanks to Alpha-Omega for sponsoring security at the PSF. Their support funds two roles: the Security Developer-in-Residence, held by Seth Larson, and the PyPI Safety & Security Engineer, held by Mike Fiedler. Seth and Mike delivered a joint update on their work at PyCon US 2026.

The over-arching theme of the update was the impact of higher volumes of reports, vulnerabilities, malware, and supply-chain attacks are having on the Python ecosystem along with work done to mitigate some of the hockey-stick graphs we're seeing.

Seth detailed the Python Security Response Team (PSRT) governance and process changes detailed in PEP 811. These changes aim to improve the capacity of the PSRT ahead of an increasing workload triaging and remediating security vulnerabilities reported to Python and pip.

Mike detailed work for mitigating malware and supply-chain attacks to PyPI, especially novel attacks such as the Shai-Hulud worm that targets and exploits insecure CI/CD pipelines and developer API tokens to propagate malware. 

If you are interested the full set of slides is available for download via speakerdeck.

June 17, 2026 09:55 AM UTC


Bob Belderbos

Building an AI Agent in 6 Weeks (and Finally Understanding How They Work)

Jeff Haemer has written software since he was teaching it at the University of Colorado in the early 1980s. But he felt he needed to brush up his Python, and above all get a grounding in AI.

In his words AI was "a big undifferentiated cloud of things I didn't know." Time to change that.

Six weeks into our Python Agentic AI cohort, he ended up with an agentic application with a few thousand lines of code, almost 250 unit tests, 100% coverage, and three working interfaces: a web UI, a command line, and a Telegram bot that talks to his phone.

He even went as far as running mutation testing. More than that, through the program he developed a mental model of how agents actually work under the covers.

The gap

Jeff came in with a long career behind him and a clear-eyed view of what he was missing:

"It's a giant world and changing every day, and I didn't even know in many ways where to start. When you told me you were thinking about doing an AI cohort, I said, please do that and sign me up, and I will happily jump in feet first."

He set expectations low on purpose. "I expected myself to crash and burn because I have a very spotty background." He'd taught plenty of courses; he knew how the first run of anything goes. He signed on as the cohort's beta tester and decided to stay enthusiastic through whatever broke.

Watch the full interview with Jeff:

Watch on YouTube

Core logic first, interfaces on top

An agent is a program that takes a request, reasons about it with an AI model, and acts. The cohort builds the reasoning core first, then wraps interfaces around it. That structure got tested the hard way in the Telegram week, when connecting the agent to a chat app hit a wall.

"I got to the end of that week and I wasn't even close, and I said, okay, I failed you. And you said, go on to the next week, because we designed the software so those units are independent of one another. We'll rewrite it."

He moved on. The Telegram unit was independent, so an unfinished interface didn't block the rest of the build. We later rebuilt that week's material from almost 100 pages down to 41, starting from a simpler working version. When Jeff came back to it, it worked.

"It was beautiful. When we hit a speed bump, that worked. It proved itself."

There's a meta lesson here, and a reminder of what well-designed software looks like: build the thing that does the work first, keep the edges replaceable. This saved the beta cohort from going under.

Mocking, the thing he kept putting off

The concept that gave Jeff the most trouble was mocking, the practice of replacing a real outside service (an API, a database) with a stand-in so a test can run fast and offline.

"The biggest pain for me was finally understanding what mocking involves. I even asked a friend of mine, a test automation Python guy, and he said, I never really understood that either."

He avoided it until week four or five, when Juanjo made it non-negotiable:

"You got me to the point where I could actually separate my unit tests, which were independent of outside services, from my integration tests. I learned stuff about pytest, I learned stuff about mocking, and I learned stuff about my own code. I just ended up thinking about things in a different way."

Jeff's rigor with the test suite paid dividends. By the end they were catching real issues, including a function that slipped coverage when he deployed to a Debian box, which he set out to track down on his own.

AI as teaching assistant, not autopilot

Jeff wrote the code by hand. He used AI deliberately, as a tutor for the spots where his Python ran out, not as a generator to fill the file for him.

"I understand conceptually what to do here, but I don't know enough Python to struggle through the docs and work through all the bugs in that short amount of time. So, show me how to do this. AI turns into a great teaching assistant."

Unit tests gave each week a definition of done. The result is an artifact he can reason about, not a diff he has to trust on faith.

Understanding by building

The win Jeff cared about most wasn't the repo. It was the mental model:

"It's different having done it than even reading about it. If somebody had written an explanation, a week later I'd have a vague idea, a month later I'd say I read an article about it but don't remember anything, and two months later I'd say I've never heard of that. This, I think I understand."

He tested it against a toy agent he found in the wild, a hallucinating Wikipedia clone, and could trace exactly what it did: validate the input, structure it, hand it to the model, get structured output back, cache it in a database.

From "big undifferentiated cloud of things I didn't know" to a firm grounding in how AI agents work. Jeff ended up with a great artifact on GitHub, but the real win has been this deeper understanding that he will take to his next AI project.

We're excited to see what Jeff will build next with this new skill set.

June 17, 2026 12:00 AM UTC

June 16, 2026


PyCoder’s Weekly

Issue #739: JIT Delayed, Sandboxes, OpenRouter, and More (2026-06-16)

#739 – JUNE 16, 2026
View in Browser »

The PyCoder’s Weekly Logo


Steering Council Announcement Regarding the JIT

The Python Steering Council has announced that the work on the JIT needs to be paused until a new PEP gets written. There are many unresolved questions about the approach and integration with other tools and the work on the JIT has reached a stage where these questions need to be answered. Additional discussion
PYTHON.ORG

Python in a Sandbox With MicroPython and WASM

Simon’s been in search of the perfect code sandbox. This article is about his latest attempt and covers why he wants a sandbox and what tech he’s used to achieve it.
SIMON WILLISON

Wallaby for Python runs Tests as you Type and Streams Results Next to Code, Plus AI Context

alt

Wallaby brings pytest / unittest results, runtime values, coverage, errors, and time-travel debugging into VS Code, so you can fix Python faster and give Copilot, Cursor, or Claude the execution context they need to stop guessing. Try it free, now in beta →
WALLABY sponsor

Accessing Multiple AI Models With the OpenRouter API

Access models from popular AI providers in Python through OpenRouter’s unified API with smart routing, fallbacks, and cost controls.
REAL PYTHON course

Quiz: Accessing Multiple AI Models With the OpenRouter API

REAL PYTHON

scikit-learn 1.9 Released

SCIKIT-LEARN.ORG

Python 3.14.6 and 3.13.14 Released

PYTHON.ORG

Articles & Tutorials

Skip Jupyter’s Hidden State: Reactive Notebooks With Marimo

Marimo is a reactive Python notebook designed to make data science workflows more reproducible. This article shows how it avoids hidden execution state, saves notebooks as plain .py files for cleaner Git diffs, isolates dependencies with uv, supports pytest cells, and exports notebooks into reusable formats including scripts, HTML, and WASM dashboards.
CODECUT.AI • Shared by Khuyen Tran

EuroPython 2026: Celebrating 25 Years

What’s happening at EuroPython 2026? The conference celebrates its 25th anniversary this year in Kraków, Poland. This week on the show, organizers Mia Bajić and Daria Linhart Grudzien join me to discuss this year’s conference.
REAL PYTHON podcast

SQLPyHelper: Unified DataBase API

SQLPyHelper is a Python library that provides a unified API across SQLite, PostgreSQL, MySQL, SQL Server, and Oracle. It has async support for FastAPI, cross-database migration, connection pooling, and transactions.
DEV.TO • Shared by Adebayo Olaonipekun

Stroll Down Startup Lane

PyCon’s Startup Row is a stretch of booths where early-stage companies built on Python show off what they’re creating. In this episode, Talk Python interviews a host of folks from this year’s booths.
TALK PYTHON podcast

Pyodide 314.0 Release

This post announces the Pyodide 314.0 release and describes its features, including a focus on standardization and packaging. You can now build Pyodide wheels and post them to PyPI.
PYODIDE.ORG

The Smallest Brain You Can Build

A perceptron explained from scratch in Python, with interactive demos. Learn weights, bias, the decision boundary, epochs, learning rate, and why you normalize data.
DEVARSH RANPARA

Are You Expected to Run 5 Type-Checkers Now?

Library maintainers may feel overwhelmed by the plurality of type checkers that exist. We offer some guidance on how to focus their efforts where they matter most.
MARCO GORELLI

How to Tell if Your Python Mock Is Actually Working

A test that passes because the real API returned an error is not a passing test. Here’s how to verify your mock is intercepting, and fix it when it isn’t.
BOB BELDERBOS

Cursor vs Windsurf: Which AI Code Editor Is Best for Python?

Compare Cursor vs Windsurf for Python across code completion, multi-file editing, and debugging to choose the right editor for your workflow.
REAL PYTHON

Tricky Python Quiz

A tricky Python quiz game about surprising edge cases, weird outputs, and traps with questions from the popular WTFPython GitHub repo.
ADARSHD.DEV • Shared by Adarsh Divakaran

Free Threading Internals: Deferred Reference Counting

This is a follow up to Victor’s article on reference counting covering more complex counting mechanisms including immortal objects.
VICTOR STINNER

Projects & Code

Relier: Zero-Job-Loss Reliability Layer for Celery

GITHUB.COM/GETRELIER • Shared by Kolade Fajimi

uuid-utils: Rust-Based Replacement for Python’s UUID

GITHUB.COM/AMINALAEE • Shared by Amin

pytrendy: Trend Detection in Time Series Data

GITHUB.COM/RUSSELLSB

django-deploy-probes: Django Health & Startup Endpoints

GITHUB.COM/EMFPDLZJ • Shared by minjeong bak

NumCircBuf: High-Performance Numerical Circular Buffers

GITHUB.COM/BASIMALI-AI • Shared by Syed Basim Ali

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 17, 2026
REALPYTHON.COM

How CPython Works on Android

June 18, 2026
LUMA.COM • Shared by Adarsh D

PyData Bristol Meetup

June 18, 2026
MEETUP.COM

PyLadies Dublin

June 18, 2026
PYLADIES.COM

Python for (Almost) Everything

June 18 to June 19, 2026
MEETUP.COM

PyCon Singapore 2026

June 19 to June 22, 2026
PYCON.SG


Happy Pythoning!
This was PyCoder’s Weekly Issue #739.
View in Browser »

alt

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

June 16, 2026 07:30 PM UTC


Ari Lamstein

Upcoming O’Reilly Workshop: Building Data Apps with Streamlit and Copilot (July 2026)

On July 9 (9am–1pm Pacific), I’ll be teaching a 4‑hour live workshop for O’Reilly: Building Data Apps with Streamlit and Copilot.

This is the second time I’ve run this workshop, and I’ve made several improvements based on what I learned the first time.

If you work in Python and want to turn your analyses into interactive, shareable tools, this workshop is designed for you. We’ll start from a Jupyter notebook and build a complete Streamlit app that lets users explore a dataset through interactive controls, charts, and maps. Along the way, we’ll use Copilot to speed up development and discover Streamlit features more efficiently.

What we’ll cover

The workshop is hands‑on: you’ll build the app step‑by‑step, and by the end you’ll have a working project you can adapt to your own data.

What You’ll Build

Here’s a screenshot from the app we’ll build together:

The app lets users choose a state and demographic statistic, explore how it changes over time, and view the data as a chart, map, or table.

And while the example uses demographic data, the skills you’ll learn—structuring an app, building interactive controls, and creating dynamic visualizations—apply to any Streamlit project you want to build.

Who is this for?

How to Register

The workshop is hosted on O’Reilly, which is a membership platform. If you’re not already a member, I have a 30-day free trial you can use. To register for the workshop with the free trial:

  1. Start your free trial at this link
  2. Then register for the workshop itself

Also worth knowing: the workshop is recorded. So if July 9 doesn’t work for you, it’s still worth registering — you’ll have access to the recording.

I’d love to see you there.

June 16, 2026 06:38 PM UTC


Mariatta

Waitlisted for the Core Devs Sprint: When the Bad News was Also the Good News

Last week, I learned that I was one of 17 people waitlisted for the Python Core Devs Sprint at OpenAI this year.

A waitlist that long realistically means I probably won’t get in. I was sad, of course. The sprint alternates between Europe and the US. Traveling to Europe is … hard and complicated. I couldn’t go to last year’s in Europe, because of the location and work conflict. I was really hoping to go to this year’s US sprint. Next year it will be back in Europe, out of reach again. That’s potentially not sprinting for three years in a row.

June 16, 2026 04:00 PM UTC


Python Bytes

#484 All our tools

<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://pi.dev/?featured_on=pythonbytes">pi</a> + <a href="https://github.com/obra/superpowers/tree/main?featured_on=pythonbytes">superpowers</a></strong></li> <li><strong>Terminal: <a href="http://Warp.dev?featured_on=pythonbytes">Warp.dev</a> + <a href="https://ohmyz.sh/?featured_on=pythonbytes">OhMyZSH</a></strong></li> <li><strong>{<a href="https://blink.sh/?featured_on=pythonbytes">Blink</a>,<a href="https://sw.kovidgoyal.net/kitty/?featured_on=pythonbytes">kitty</a>} + <a href="https://mosh.org/?featured_on=pythonbytes">mosh</a> + <a href="https://github.com/tmux/tmux?featured_on=pythonbytes">tmux</a></strong></li> <li><strong><a href="https://www.anthropic.com/product/claude-code?featured_on=pythonbytes">Claude code</a></strong></li> <li><strong><a href="https://goodsnooze.gumroad.com/l/macwhisper?featured_on=pythonbytes">MacWhisper</a> or <a href="https://handy.computer?featured_on=pythonbytes">Handy</a></strong></li> <li><strong><a href="https://tailscale.com/?featured_on=pythonbytes">Tailscale</a></strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=wgKF3yvpxPU' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="484">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p>Sponsored by us! Support our work through:</p> <ul> <li>Our <a href="https://training.talkpython.fm/?featured_on=pythonbytes"><strong>courses at Talk Python Training</strong></a></li> <li>Six Feet Up is hosting a LinkedIn Live <strong>Connect with the hosts</strong></li> <li>Michael: <a href="https://fosstodon.org/@mkennedy">@mkennedy@fosstodon.org</a> / <a href="https://bsky.app/profile/mkennedy.codes?featured_on=pythonbytes">@mkennedy.codes</a> (bsky)</li> <li>Calvin: <a href="https://sixfeetup.social/@calvin?featured_on=pythonbytes">@calvinhp@sixfeetup.social</a> / <a href="https://bsky.app/profile/calvinhp.com?featured_on=pythonbytes">@calvinhp.com</a> (bsky)</li> <li>Show: <a href="https://fosstodon.org/@pythonbytes">@pythonbytes@fosstodon.org</a> / <a href="https://bsky.app/profile/pythonbytes.fm">@pythonbytes.fm</a> (bsky)</li> </ul> <p>Join us on YouTube at <a href="https://pythonbytes.fm/stream/live"><strong>pythonbytes.fm/live</strong></a> to be part of the audience. Usually <strong>Tuesday</strong> at 7am PT. Older video versions available there too. Finally, if you want an artisanal, hand-crafted digest of every week of the show notes in email form? Add your name and email to <a href="https://pythonbytes.fm/friends-of-the-show">our friends of the show list</a>, we'll never share it.</p> <p><strong>Calvin #1: <a href="https://pi.dev/?featured_on=pythonbytes">pi</a> + <a href="https://github.com/obra/superpowers/tree/main?featured_on=pythonbytes">superpowers</a></strong></p> <ul> <li>terminal-first, open-source coding agent</li> <li>Session management is a first-class citizen</li> <li>Extension model is what makes pi special — it's aggressively composable</li> <li>Superpowers brings a structured software development methodology as loadable skills</li> <li>Steps back and asks you what you're really trying to do</li> <li>“hand you the keys to the car” mode vs guardrails might not be for everyone</li> </ul> <p><strong>Michael #2: Terminal: <a href="http://Warp.dev?featured_on=pythonbytes">Warp.dev</a> + <a href="https://ohmyz.sh/?featured_on=pythonbytes">OhMyZSH</a></strong></p> <ul> <li>If you’re using the base terminal with default settings, you have so much head-room for improvement.</li> <li>I’ve been using <a href="http://Warp.dev?featured_on=pythonbytes">Warp.dev</a> since Elvis talked me into it. ;)</li> <li>Remarkable terminal but the AI side of things is a bit junky, can be turned off</li> <li>OhMyZSH gives better autocomplete <ul> <li>e.g. git branch [HTML_REMOVED] lists all branches in the local repo!</li> </ul></li> <li><a href="http://Commandbookapp.com?featured_on=pythonbytes">Commandbookapp.com</a> is excellent to keep the terminal focused on terminal things and more server commands and other automation in Command Book.</li> </ul> <p><strong>Calvin #3: {<a href="https://blink.sh/?featured_on=pythonbytes">Blink</a>,<a href="https://sw.kovidgoyal.net/kitty/?featured_on=pythonbytes">kitty</a>} + <a href="https://mosh.org/?featured_on=pythonbytes">mosh</a> + <a href="https://github.com/tmux/tmux?featured_on=pythonbytes">tmux</a></strong></p> <ul> <li><a href="https://sw.kovidgoyal.net/kitty/?featured_on=pythonbytes"><strong>Kitty Terminal</strong></a> — GPU-accelerated terminal emulator for macOS, Linux, and Windows with support for graphics, ligatures, and a powerful tiling layout system built right in.</li> <li><a href="https://blink.sh/?featured_on=pythonbytes"><strong>Blink Shell</strong></a> — The go-to terminal for iPad/iPhone power users; full SSH and Mosh client with a gorgeous interface built specifically for mobile professional workflows.</li> <li><a href="https://mosh.org/?featured_on=pythonbytes"><strong>Mosh</strong></a> — Mobile Shell replaces SSH for remote connections, surviving network switches, sleep cycles, and flaky Wi-Fi with zero dropped sessions — essential for staying connected to long-running agentic jobs.</li> <li><a href="https://github.com/tmux/tmux?featured_on=pythonbytes"><strong>tmux</strong></a> — Terminal multiplexer that keeps sessions alive on your Linux server indefinitely; detach from a Mosh session on your Mac, reconnect from your iPad, and your agent is right where you left it.</li> <li><strong>The combo</strong> — Kitty or Blink + Mosh + tmux creates a "persistent remote brain" pattern: your beefy Linux homelab runs the compute-heavy agent sessions 24/7, and any device becomes a thin client to drop in and out at will.</li> </ul> <p><strong>Michael #4: <a href="https://www.anthropic.com/product/claude-code?featured_on=pythonbytes">Claude code</a></strong></p> <ul> <li>I prefer the IDE experience, the new PyCharm + Claude integration is really good. VS Code too. Why IDE? Because we should still be present with our code and managing context is much easier.</li> <li>Use the best/latest models on high thinking. “Speed” is not your friend, it’s just shortcuts.</li> <li>Create skills and agents and use them.</li> <li>Curate your own rules (e.g. Talk Python’s <a href="http://Claude.md?featured_on=pythonbytes">Claude.md</a>)</li> <li>Works well on non-coding things. Just create a folder, put a ton of files in there and it’s like NotebookLM + Chat + more.</li> </ul> <p><strong>Calvin #5: <a href="https://goodsnooze.gumroad.com/l/macwhisper?featured_on=pythonbytes">MacWhisper</a> or <a href="https://handy.computer?featured_on=pythonbytes">Handy</a></strong></p> <ul> <li>Transcribes your speech using your choice of Whisper or Parakeet models.</li> <li>All transcription is done on your device, <strong>no data leaves your machine.</strong></li> <li>Automatic Speaker Recognition with local models.</li> <li>Handy is more basic, but open source and runs on all platforms.</li> </ul> <p><strong>Michael #6: <a href="https://tailscale.com/?featured_on=pythonbytes">Tailscale</a></strong></p> <ul> <li>No need to open ports at all, Tailscale makes machines inside the same network accessible to each other</li> <li>Works great for laptops, desktops, etc. But also available for servers. <ul> <li>Though I still use cloud firewalls for servers.</li> </ul></li> <li><strong>How I use it</strong>: <ul> <li><strong>My dev database server</strong>, preloaded with QA data, is always running on my home mac mini m4 pro. All my apps look for that server before looking locally and tailscale makes them always accessible to each other</li> <li>My <strong>local LLMs expose OpenAI API compatible APIs</strong>. Tailscale makes these accessible even while traveling or at a coffee shop.</li> <li>Use my <strong>mini as an exit node</strong>. All traffic is routed outbound from my local fiber network. Great to restricted IPs like accessing my servers without caring about the local IP.</li> <li>Screen share back to my home machines even while traveling.</li> </ul></li> <li>Listen to the <a href="https://talkpython.fm/episodes/show/546/self-hosting-apps-for-python-people?featured_on=pythonbytes">Talk Python episode with Alex</a> for a deeper conversation.</li> </ul> <p><strong>Extras</strong></p> <p>Calvin:</p> <ul> <li><a href="https://www.telescopo.app?featured_on=pythonbytes">Telescopo</a> great Mac Markdown viewer/editor. Michael:</li> <li>One more: <a href="https://typora.io/?featured_on=pythonbytes">Typora markdown</a> editor.</li> <li>Created <a href="https://mkennedy.codes/docs/?featured_on=pythonbytes">formal documentation for many of my open source packages</a> using <a href="https://posit-dev.github.io/great-docs/?featured_on=pythonbytes">Great Docs</a>.</li> <li>Via Mark Little: <a href="https://www.anthropic.com/news/fable-mythos-access?featured_on=pythonbytes">Statement on the US government directive to suspend access to Fable 5 and Mythos 5</a></li> </ul> <p><strong>Joke: <a href="https://x.com/pr0grammerhum0r/status/2063078450311598430?s=12&featured_on=pythonbytes">No second date</a></strong></p>

June 16, 2026 08:00 AM UTC


HoloViz

HoloViz for LLMs

June 16, 2026 12:00 AM UTC

June 15, 2026


Kay Hayen

Nuitka Release 4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release adds many new features and corrections with a focus on async code compatibility, missing generics features, and Python 3.14 compatibility and Python compilation scalability yet again.

Bug Fixes

Package Support

New Features

Optimization

Anti-Bloat

Organizational

Tests

Cleanups

Summary

This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.

The --project option seems usable now.

Python 3.14 support remains experimental, but only barely made the cut, and probably will get there in hotfixes. Some of the corrections came in so late before the release, that it was just not possible to feel good about declaring it fully supported just yet.

June 15, 2026 10:00 PM UTC