
Planet Python

Last update: June 02, 2026 09:43 PM UTC

June 02, 2026


PyCoder’s Weekly

Issue #737: Polars 1.41, Email, Great Docs, and More (2026-06-02)

#737 – JUNE 2, 2026


Announcing Polars 1.41

Polars 1.41 is out and this post covers the new features it includes. Learn about faster parquet metadata decoding, nested subplan elimination, and more.
POLA.RS

Sending Emails With Python

Learn how to send emails with Python using SMTP, attach files, format HTML messages, and personalize bulk emails for your contact list.
REAL PYTHON

Quiz: Sending Emails With Python

Use Python’s standard library to send email through secure SMTP connections, attach files, include HTML content, and route replies.
REAL PYTHON

Your Coding Agent Gets Dumber the Longer It Runs. Here’s the Fix.


Coding agents degrade as context grows. The fix: a multi-role loop where the planner, builder, and reviewer each get isolated context — no stale assumptions, no compounding noise. A practical breakdown from someone who built it. Read the full breakdown
DEPOT sponsor

Great Docs

Talk Python interviews Rich Iannone and Michael Chow from Posit and they talk about a new Python documentation tool called Great Docs.
TALK PYTHON podcast

PyPy v7.3.23 Released

PYPY.ORG

Articles & Tutorials

Improving Python Through PEPs and Protocols

Have you ever been confused by the naming of modules you’re importing from a package? Is there a standard way to organize and name your Python virtual environments? This week on the show, Brett Cannon returns to discuss the Python Enhancement Proposals (PEPs) he’s been working on recently.
REAL PYTHON podcast

Tame Your Pesky Little Scripts

Over time it is common to accumulate little helper scripts, whether they’re shell scripts, aliases, or custom functions. They are typically tiny things that can become unwieldy to manage. This post shares a few ideas that might help you take back control.
JUHA-MATTI SANTALA

5-Day Live OOP Workshop (Final Chance to Enroll)

The Object-Oriented Python live cohort begins June 8. Five 2-hour sessions Mon to Fri build one growing application end to end, with OOP features introduced as the code starts needing them: classes, the data model, inheritance vs composition, properties, dataclasses.
REAL PYTHON sponsor

Free-Threading vs the GIL in mod_wsgi 6.0.0

Free-threading in mod_wsgi 6.0.0 lets a single process spread Python work across multiple cores. This post is a metrics-based comparison of running with the GIL enabled and disabled.
GRAHAM DUMPLETON

Notes About Python Email Packages

Chris recently upgraded his personal mail program from Python 2 to Python 3 and this post talks about what needed to change and notes how the newer code works.
CHRIS SIEBENMANN

Learning Path: Perfect Your Python Development Setup

Set up a Python development environment with VS Code, PyCharm, virtual environments, Git, pyenv, Docker, and AI coding tools like Claude Code and Cursor.
REAL PYTHON

Top 7 Python Libraries for Large-Scale Data Processing

This article covers Python libraries that make large-scale data processing faster, more scalable, and easier to manage across modern data workflows.
BALA PRIYA C

Connecting LLMs to Your Data With Python MCP Servers

Build an MCP server in Python that exposes tools, resources, and prompts so AI agents like Cursor can interact with your data.
REAL PYTHON course

How to Make a Scatter Plot in Python With plt.scatter()

Learn how to make scatter plots in Python with plt.scatter() and customize markers by size, color, shape, and transparency.
REAL PYTHON

Quiz: How to Make a Scatter Plot in Python With plt.scatter()

REAL PYTHON

Two Python Scoping Bugs: A Lesson in Object Lifetimes

Two Python bugs with opposite symptoms but the same root cause: picking the wrong scope for a stateful object.
BOB BELDERBOS

Sentinel Built-In

A quick post about Python 3.15’s new sentinel built-in.
RODRIGO GIRÃO SERRÃO

Projects & Code

dj-lite-tenant: Multi-Tenant SQLite Databases for Django

GITHUB.COM/ADAMGHILL

Lifeguard: Detect Lazy Imports Incompatibilities

GITHUB.COM/FACEBOOK

nbpipe: Run Sequences of Jupyter Notebooks as a Workflow

GITHUB.COM/NGAFAR

httpx2: A Next Generation HTTP Client for Python

GITHUB.COM/PYDANTIC

mkdocs-marimo: Mkdocs Plugin for Marimo

GITHUB.COM/MARIMO-TEAM

Events

Weekly Real Python Office Hours Q&A (Virtual)

June 3, 2026
REALPYTHON.COM

Canberra Python Meetup

June 4, 2026
MEETUP.COM

Sydney Python User Group (SyPy)

June 4, 2026
SYPY.ORG

GeoPython 2026

June 8 to June 11, 2026
GEOPYTHON.NET

PiterPy Meetup

June 9, 2026
PITERPY.COM

SciPy 2026, Minneapolis, MN

July 13-19, 2026
SCIPY.ORG • Shared by SciPy Organizers


Happy Pythoning!
This was PyCoder’s Weekly Issue #737.

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

June 02, 2026 07:30 PM UTC


Real Python

Structuring Your Python Script

You may have begun your Python journey interactively, exploring ideas within Jupyter Notebooks or through the Python REPL. While that’s great for quick experimentation and immediate feedback, you’ll likely find yourself saving code into .py files. However, as your codebase grows, knowing where things should go in your script becomes increasingly important.

Transitioning from interactive environments to structured scripts helps promote readability, enabling better collaboration and more robust development practices. This video course shows you the foundations of organizing a Python script: where the runnable bits go, how to arrange your imports, and how to refactor with constants and a fixed entry point.

By the end of this video course, you’ll know how to:

Without further ado, it’s time to start working through a concrete script and progressively shape it into well-organized, shareable code.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 02, 2026 02:00 PM UTC


PyCharm

Top Agentic Frameworks for Building Applications 2026

In 2026, the world of AI is changing at a serious pace. The days of AI systems dealing solely in single-prompt interactions are coming to an end. Instead, these models are evolving into agentic systems – long-running, goal-driven software enabled by agentic frameworks that are becoming a critical layer in modern application architecture.

This rapid shift means that Python developers building autonomous systems are increasingly relying on agentic frameworks to manage reasoning, memory, tools, and collaboration among multiple agents.

You’ve probably already heard of some of the most popular frameworks. LangChain and AutoGen have risen to prominence, but there are dozens more, many of them open-source and only one to two years old. With so many frameworks promising different agentic capabilities, the real challenge is knowing which ones are best suited for the kind of application you want to build.

Let’s take a closer look at some of the most important agentic frameworks on the market in 2026, comparing what each does best and rating them based on our key comparison criteria to help you discover which is best for your projects.

What are AI agents?

An AI agent is a piece of software capable of autonomously reasoning, setting goals, and performing tasks on behalf of a user or another system. As the name suggests, AI agents have a level of agency to learn, adapt, and make decisions independently. This means they can improve their behavior and, over time, choose their own actions to achieve specific goals or outcomes.

AI agents work by following a perceive, reason, act, reflect (PRAR) cycle, which allows them to:

AI agents rely on the natural language processing capabilities of large language models, but unlike traditional LLMs and AI chatbots, they don’t require continuous user input to perform tasks. Agents are proactive, working autonomously to achieve a goal based on a specified set of rules and parameters.

What is an agentic framework?

An agentic framework provides the infrastructure needed to build, run, and control AI agents at scale. Most modern frameworks offer three core capabilities:

While it’s possible to build an agent without a framework, frameworks are vital in ensuring agents are reliable, scalable, and safe.

Agentic frameworks help turn experimental agent builds into maintainable software by facilitating:

Core orchestration paradigms

Before comparing individual frameworks, it’s important to understand how they operate. Let’s look at the three most commonly used orchestration models in 2026.

Graph-based orchestration

Graph-based orchestration provides maximum control by organizing agents and tools as nodes in a directed graph. Instead of letting an agent freely decide what to do next, the flow that agents are allowed to follow is clearly defined.
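To make the idea concrete, here is a minimal sketch in plain Python. It does not use any framework's API: the node functions, the `NODES` and `EDGES` tables, and `run_graph` are all hypothetical names chosen for illustration. Each node proposes the next step, and the runner rejects any transition that the graph does not allow.

```python
# Hypothetical nodes: each one updates shared state and names the next node.
def plan(state):
    state["plan"] = f"outline for {state['task']}"
    return "build"

def build(state):
    state["draft"] = f"draft based on {state['plan']}"
    return "review"

def review(state):
    state["approved"] = "draft" in state
    return "done"

NODES = {"plan": plan, "build": build, "review": review}
# Directed edges: the only transitions the workflow is allowed to take.
EDGES = {"plan": {"build"}, "build": {"review"}, "review": {"build", "done"}}

def run_graph(state, start="plan", max_steps=10):
    """Walk the graph from `start`, enforcing the allowed edges."""
    current, path = start, []
    for _ in range(max_steps):
        if current == "done":
            return path
        path.append(current)
        proposed = NODES[current](state)
        if proposed not in EDGES[current]:
            raise ValueError(f"transition {current!r} -> {proposed!r} is not allowed")
        current = proposed
    raise RuntimeError("step limit reached")

state = {"task": "report"}
path = run_graph(state)
```

The explicit edge table is what makes the flow auditable: the path taken is always a walk through transitions that were declared up front.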

Strengths

Limitations

Role-based orchestration

Role-based orchestration is most effective when simplicity is a priority. Agents are assigned specific roles, such as “Planner”, “Researcher”, or “Builder”, and collaborate by sending messages to one another.
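As a rough illustration, the sketch below models three roles that pass messages to each other through a shared inbox. It is plain Python rather than any framework's API, and the `Message` class, the agent functions, and `run_team` are hypothetical. In a real system each agent function would call a model instead of formatting a string.

```python
from collections import deque
from dataclasses import dataclass

@dataclass
class Message:
    sender: str
    recipient: str
    content: str

# Hypothetical agents: each role reads one message and addresses the next role.
def planner(msg):
    return Message("Planner", "Researcher", f"find sources for: {msg.content}")

def researcher(msg):
    return Message("Researcher", "Builder", f"notes on ({msg.content})")

def builder(msg):
    return Message("Builder", "User", f"report from {msg.content}")

AGENTS = {"Planner": planner, "Researcher": researcher, "Builder": builder}

def run_team(task, max_messages=10):
    """Deliver messages until one is addressed to a non-agent or the cap is hit."""
    inbox = deque([Message("User", "Planner", task)])
    transcript = []
    while inbox and len(transcript) < max_messages:
        msg = inbox.popleft()
        transcript.append(msg)
        if msg.recipient in AGENTS:
            inbox.append(AGENTS[msg.recipient](msg))
    return transcript

transcript = run_team("solar panels")
senders = [m.sender for m in transcript]
```

Note that nothing here enforces an execution path. The order emerges from who each agent chooses to address, which is why the message cap is the only safeguard.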

Strengths

Limitations

Chain-based orchestration

Chain-based orchestration, also known as adaptive orchestration, arguably offers the greatest flexibility. Agents in this model operate in dynamic chains or loops, deciding the next step autonomously.
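A minimal sketch of such a loop might look like the following. It is plain Python, not a framework API: `choose_next_step` is a hypothetical stand-in for the model call that picks the next action, and the `TOOLS` table and `run_agent` are likewise illustrative.

```python
def choose_next_step(state):
    """Hypothetical stand-in for an LLM call that decides what to do next."""
    if "facts" not in state:
        return "search"
    if "summary" not in state:
        return "summarize"
    return "finish"

# Hypothetical tools: each one mutates the shared state in place.
TOOLS = {
    "search": lambda state: state.update(facts=["fact A", "fact B"]),
    "summarize": lambda state: state.update(summary=" and ".join(state["facts"])),
}

def run_agent(goal, max_steps=5):
    """Loop until the agent decides it is finished or the step budget runs out."""
    state, steps = {"goal": goal}, []
    for _ in range(max_steps):
        action = choose_next_step(state)
        if action == "finish":
            break
        TOOLS[action](state)
        steps.append(action)
    return steps, state

steps, state = run_agent("summarize the topic")
```

Unlike the graph-based model, no table of allowed transitions exists here. The step budget is the only hard limit on what the agent does.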

Strengths

Limitations

Best agentic frameworks for your projects

Now that we’re familiar with the key orchestration paradigms of agentic frameworks, it’s time to compare some of the most popular frameworks on the market in 2026. Below, we evaluate each framework’s performance against our key comparison criteria:

Framework | Orchestration model | Multi-agent support | Memory capabilities | HITL support | Best used for
LangChain | Chain-based | Partial | Moderate | Limited to moderate | Rapid LLM app development
LangGraph | Graph-based | Yes | Strong | Strong | Production-grade agent workflows
LlamaIndex | Retrieval-centric | Limited | Strong | Moderate | Knowledge-heavy agents
Haystack | Pipeline-based/modular | Moderate | Strong | Moderate | Production RAG and context-heavy AI systems
AutoGen | Role-based | Strong | Moderate | Limited | Conversational multi-agent systems
CrewAI | Role-based | Strong | Light | Limited | Task-oriented agent teams
Semantic Kernel | Planner-based | Moderate | Moderate | Strong | Enterprise AI
smolagents | Minimalist | Limited | Light | Minimal | Lightweight experiments
OpenAI Agents SDK | Graph-based | Yes | Managed | Strong | Hosted agent applications
Phidata | Agent-centric | Limited to moderate | Strong | Moderate | Data and tool-heavy agents

Let’s take a closer look at the strengths and weaknesses of each framework, along with the applications they’re most suited to.

LangChain

Launched in 2022, LangChain is one of the most widely adopted frameworks due to its broad ecosystem of integrations. It serves as an accessible interface for nearly any LLM and is an ideal starting point for enthusiasts or startups looking to explore agentic AI. While not strictly “agent-first”, it provides the building blocks for agentic behavior.

LangChain provides less control than other frameworks, but it’s still a fantastic entry point into agentic systems, especially for projects where speed and creativity take precedence over enforcing strict workflows.

Strengths

Limitations

Best applications

If you want to go beyond the basics, read our LangChain Python Tutorial: A Complete Guide for 2026. It takes a deeper look at what LangChain offers and walks through real-world use cases for building AI agents in Python.

LangGraph

LangGraph has emerged as the leading standard for production-grade agent systems. Built on top of LangChain, it replaces implicit chains with explicit graphs, providing strict control over workflows and excellent HITL support via interrupts.

While the graph structure itself can actually make debugging easier by clearly mapping how agents and tools interact, LangGraph does come with a learning curve. Much of this complexity comes from designing the graph and managing explicit state between nodes. Once you understand these concepts, the framework becomes a powerful option for building predictable and controllable agent systems.

Strengths

Limitations

Best applications

LlamaIndex

LlamaIndex is a Python framework designed to help AI systems understand, store, and retrieve information from large amounts of documents and data.

Rather than starting with agents and adding data later, LlamaIndex takes the opposite approach – it starts with data and then builds agent behavior around it. This is why it is often described as data-first or retrieval-centric.

Because it operates in this way, LlamaIndex excels at indexing, memory, and retrieval, making it ideal for building agents whose intelligence depends on accessing the right information rather than executing complex actions.

Strengths

Limitations

Best applications

Haystack

Haystack is an open-source AI orchestration framework created by deepset for building production-ready AI agents, retrieval-augmented generation (RAG) systems, and multimodal applications.

Instead of focusing purely on agent behavior, Haystack structures applications as explicit pipelines composed of retrievers, routers, memory layers, tools, evaluators, and generators. This modular architecture gives you control over how information flows through a system, allowing each component to be tested and improved independently.

Haystack is particularly strong in applications where the quality of retrieved information determines the quality of the model’s output. Its design also makes it well-suited for enterprise environments that require transparency and reliability in production systems.

Strengths 

Limitations 

Best applications

AutoGen

AutoGen, an open-source Microsoft framework, popularized the idea of agents collaborating through structured conversation, organizing systems as teams of agents, each with its own specific role. Unlike in other frameworks, there’s no central controller enforcing a strict execution path – the collaboration itself drives progress.

This approach makes AutoGen ideal for exploratory, creative, and research-driven multi-agent systems, at the cost of predictability, HITL, and strict execution control.

Strengths 

Limitations 

Best applications

CrewAI

CrewAI is centered around building simple, structured multi-agent systems. It is similar to AutoGen, modeling AI agents as members of a “crew” where each agent has a clearly defined role. The goal is to make multi-agent systems approachable, even if you are new to agentic AI.

CrewAI prioritizes simplicity and speed over deep memory and production controls, making it easy to learn and a strong option for prototypes and small teams. However, its limited toolset for observability, HITL, and error handling at scale makes it less suited for larger systems.

Strengths

Limitations

Best applications

Semantic Kernel

Semantic Kernel is another open-source Microsoft framework, designed for building AI-powered applications that integrate with existing enterprise systems.

It was created with production concerns in mind from the start, emphasizing governance, safety, observability, and human oversight. Rather than maximizing agent autonomy, it focuses on making AI predictable, controllable, and auditable.

By combining structured workflows with LLM reasoning, it trades flexibility and emergent behavior for trust, safety, and operational reliability.

Strengths

Limitations

Best applications

smolagents

smolagents is a bare-bones framework designed to make agentic AI as straightforward and transparent as possible. It prioritizes simple, readable code that makes it easy to understand how an agent works without needing to learn a large framework.

smolagents aims to make agent behavior accessible and easy to experiment with by keeping abstractions minimal and logic transparent. It offers first-class support for code-based and tool-calling agents, broad model and tool compatibility, and lightweight CLI utilities, while intentionally trading large-scale orchestration and production features for simplicity and clarity.

Strengths

Limitations

Best applications

OpenAI Agents SDK

Thanks to ChatGPT’s explosion in popularity, we’ve all heard of OpenAI. The Agents SDK is the company’s effort to provide a managed platform for building and running agents without having to maintain your own orchestration infrastructure.

Rather than assembling agents from scratch, you define agent behavior and workflows, while OpenAI provides orchestration, memory management, monitoring, and safety controls. This makes the Agents SDK particularly attractive for teams that want production-ready agents quickly.

Strengths

Limitations

Best applications

Phidata

Phidata is designed for building practical, tool-driven AI agents that operate on real-world data.

Rather than focusing on abstract orchestration patterns, Phidata centers the agent around direct interaction with systems such as APIs, databases, and internal services.

Its design reflects the fact that many agents spend most of their time fetching, transforming, and acting on data.

Strengths

Limitations

Best applications

Choosing the right framework

Now that you’re familiar with many of the most popular frameworks in 2026, it’s time to choose the right one for your project. Let’s take a look at some of the key use cases, along with the frameworks that fit them best.

Orchestration model | Where to use | Recommended frameworks
Graph-based | Projects involving complex branching logic and requiring high levels of reliability, auditability, and control. | LangGraph, OpenAI Agents SDK
Role-based | Projects involving rapid development and intuitive design that benefit from emergent collaboration between agents. | AutoGen, CrewAI
Chain-based | Projects requiring maximum flexibility, where agents need to adapt dynamically and determine next steps autonomously. | LangChain
Retrieval-based | Projects where deep, reliable access to knowledge matters more than high levels of autonomy. | LlamaIndex, Haystack
Enterprise-oriented | Projects where strong governance and human-in-the-loop processes are non-negotiable requirements. | Semantic Kernel
Lightweight | Rapid prototyping, educational use, and simple local agents where transparency and control matter more than orchestration complexity. | smolagents
Tool-centric | Building production agents that primarily interact with APIs, databases, and external systems rather than complex multi-step orchestration. | Phidata

In 2026, agentic frameworks have evolved from experimental tools into foundational infrastructure for many applications. The key decision is no longer whether to use agents, but how much control, autonomy, and governance your systems require.

June 02, 2026 12:12 PM UTC


Real Python

Quiz: Python's Format Mini-Language for Tidy Strings

In this quiz, you’ll test your understanding of Python’s Format Mini-Language for Tidy Strings.

By working through this quiz, you’ll revisit how format specifiers work inside f-strings and str.format(), including alignment and width fields, decimal precision, type representations, thousand separators, sign handling, dynamic specifiers, and percentage formatting.
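As a quick refresher before you start, the snippet below touches most of those features in one place. The variable names and values are made up for illustration.

```python
name, price, ratio, width = "py", 1234567.891, 0.256, 8

examples = {
    "align_right": f"{name:>6}",          # pad on the left to width 6
    "center_dynamic": f"{name:^{width}}", # width taken from a variable
    "precision": f"{price:.2f}",          # two decimal places
    "thousands": f"{price:,.2f}",         # comma as thousands separator
    "sign": f"{42:+d}",                   # always show the sign
    "hex": f"{255:#x}",                   # hexadecimal with 0x prefix
    "percent": f"{ratio:.1%}",            # multiply by 100 and append %
    "str_format": "{:>6}".format(name),   # same spec, str.format() style
}

for label, text in examples.items():
    print(f"{label:<15}|{text}|")
```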



June 02, 2026 12:00 PM UTC

Quiz: Structuring Your Python Script

In this quiz, you’ll test your understanding of the video course Structuring Your Python Script.

By working through this quiz, you’ll revisit how to make a Python script executable with a shebang, organize your imports per PEP 8, automatically sort imports with ruff, and define a clear entry point using if __name__ == "__main__".

These habits help you transition from quick experiments in the REPL to writing Python scripts that are easy to read, share, and grow.
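For orientation, here is a minimal sketch of a script laid out along those lines. The greeting logic, the `DEFAULT_NAME` constant, and the function names are invented for illustration and are not taken from the course.

```python
#!/usr/bin/env python3
"""Print a greeting for the name given on the command line."""

# Imports go at the top: standard library first, then third-party, then local (PEP 8).
import sys

# Module-level constants follow the imports.
DEFAULT_NAME = "World"


def build_greeting(name=DEFAULT_NAME):
    """Return the greeting text without printing it, so it is easy to test."""
    return f"Hello, {name}!"


def main(argv=None):
    """Entry point: read arguments, do the work, return an exit status."""
    args = sys.argv[1:] if argv is None else argv
    name = args[0] if args else DEFAULT_NAME
    print(build_greeting(name))
    return 0


# Runs only when the file is executed as a script, not when it is imported.
if __name__ == "__main__":
    main()
```

Keeping the work inside `main()` means another module can import `build_greeting()` without triggering any output.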



June 02, 2026 12:00 PM UTC


Python Software Foundation

No Starch Press Humble Bundle: Grab a Deal and Support the PSF!

Curious about leveling up your Python skills, or just getting your feet wet? Pick up a whole set of solid Python books at a great price and support the Python Software Foundation (PSF) at the same time!

No Starch Press, an indie tech-book publisher and long time supporter of the PSF, just announced a new Python-themed Humble Bundle. Grab ‘Python: The Good Stuff by No Starch’ and pay what you want for all-Python DRM-free ebook titles for Python beginners to pros. And a share of the proceeds from the bundle goes to the PSF! This bundle runs now through June 18th, 2026, so make sure to grab it and share the link with your friends.

‘Python: The Good Stuff by No Starch’ includes 15 titles for $36 USD ($583 value 🫨), including Automate the Boring Stuff with Python, 3rd Edition (Al Sweigart), Python Crash Course, 3rd Edition (Eric Matthes), and Practical Deep Learning (Ronald T. Kneusel).

Humble Bundle Pro Tips: 


Make sure to grab this awesome bundle of Python books for yourself (or a friend!), and help support the PSF. Thank you, No Starch and Humble Bundle, for making Python education more accessible and supporting the PSF. Happy reading, everyone!

About the Python Software Foundation

The Python Software Foundation is a US non-profit whose mission is to promote, protect, and advance the Python programming language, and to support and facilitate the growth of a diverse and international community of Python programmers. The PSF supports the Python community using corporate sponsorships, grants, and donations. Are you interested in sponsoring or donating to the PSF so we can continue supporting Python and its community? Check out our sponsorship program, donate directly, or contact our team at sponsors@python.org!

June 02, 2026 07:21 AM UTC


Tryton News

Tryton News June 2026

In the last month we focused on fixing bugs, improving behaviour, and addressing performance issues, building on the changes from our last release. We also added some new features which we would like to introduce to you in this newsletter.

For an in-depth overview of the Tryton issues please take a look at our issue tracker or see the issues and merge requests filtered by label.

Changes for the User

Accounting, Invoicing and Payments

We now add an optional journal column on the invoice list view.

Now we add a relate to the invoice model from the period and fiscal year, which makes it possible to export or print invoices per period.

We add a delay to the PEPPOL e-document rendering and processing for each service, so that payments recorded after posting an invoice can still be rendered in the UBL invoice.

We now raise a generic user error message when failing to parse an imported AEB43 account statement.

Stock, Production and Shipments

Now we can manage products directly in the category form. So we think it is better not to have dedicated views at all, but to ensure that we can manage such large Many2Many fields (also with #14782 (closed)).

Now we let Tryton calculate average lead time for product suppliers based on the effective date of incoming stock moves and the purchase date of the last year.

Parties

Now Tryton tries to guess the type of a contact mechanism when its value changes, for the standardised types like email, phone, mobile and URL.

User Interface

We now use the search dialogue popup window for deleting records in One2Many or removing records from Many2Many widgets. The remove (delete) button shows a search popup when no records are selected or when more than 20 records are selected. In the search popup, the same records are preselected. Users can refine the search using the filter and the sort order of the popup. Once the popup is validated, the selected records are removed (deleted) from the X2Many field.

We now display the number of records being deleted in the confirmation message. We think it helps the user to realise that they are deleting many records.

Now we allow users to mark notifications as read.

System Data and Configuration

Now we support the country organization (like EU, ASEAN, …) as a criterion for tax rules.

New Releases

We released bug fixes for the currently maintained long term support series 8.0 and 7.0, and for the penultimate series 7.8.

There are no new releases for the 6.0 and 7.6 series as they have entered their end of life period.

Changes for the System Administrator

We now remove the dependencies on pytz and backports.entry-points-selectable.

Now we update the version of Stripe to 2026-04-22.dahlia.

Changes for Implementers and Developers

We now add support for the age-functionality to SQLite. The age-function returns a time interval instead of an integer (of days) when calculating duration between dates.

Authors: @pokoli @udono


June 02, 2026 06:00 AM UTC


Python Insider

Python 3.15.0 beta 2 is here!

The antepenultimate 3.15 beta is out!

June 02, 2026 12:00 AM UTC

June 01, 2026


The No Title® Tech Blog

Just updated - both Optimize Images and Optimize Images X

This release represents a significant milestone for both Optimize Images and Optimize Images X, marking a coordinated step forward in modernization, dependency cleanup, and internal architecture improvements across the ecosystem.

June 01, 2026 09:40 PM UTC


death and gravity

DynamoDB crash course: part 3 – design patterns

Previously

This is the last part of a series covering core DynamoDB concepts. The goal is to help you understand idiomatic usage and trade-offs in under an hour.

In the first part, I summarized DynamoDB's main proposition to its users like so:

data modeling complexity is always preferable to complexity coming from infrastructure maintenance, availability, and scalability

Today, we're looking at the design patterns that help manage this complexity, making the most of its data model and features and working around its limits.


Composite keys

Composite (aka synthetic) keys underpin most other patterns.

The idea is simple: keys don't have to be natural attributes of your data, they can be composed of other attributes that enable specific access patterns. This works both with table and index keys.

How do you compose keys? By string concatenation, of course! Careful with numbers though, they need padding to be useful in sort keys.
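Here's a small sketch of why the padding matters. It uses plain Python string sorting as a stand-in for sort key ordering (the two agree for ASCII keys like these); the `composite_key` and `padded` helpers and the width of 5 are my own choices for illustration.

```python
def composite_key(*parts, sep="#"):
    """Join attribute values into one key string."""
    return sep.join(str(part) for part in parts)

def padded(number, width=5):
    """Zero-pad a non-negative integer so string order matches numeric order."""
    return str(number).zfill(width)

track_numbers = (1, 2, 10)

# Without padding, "10" sorts before "2" because strings compare character by character.
unpadded = sorted(composite_key("Leaving Home", n) for n in track_numbers)

# With padding, the lexicographic order is also the numeric order.
with_padding = sorted(composite_key("Leaving Home", padded(n)) for n in track_numbers)
```

The width has to be chosen up front to fit the largest value you expect, since changing it later means rewriting existing keys.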

Example

To sort lexicographically by more than one attribute, you group them in a sort key, e.g. {Album}#{Song}.

Or, in single table design, you distinguish between item types by prefixing keys with the type, e.g. album#{Album}.

Or, in partition key sharding, you spread the load on a GSI partition by splitting one partition key into multiple ones, e.g. {Genre}#{shard}.

But denormalization has its trade-offs. For sort key {Album}#{Song}, should Album and Song also be separate attributes? If yes, you need to ensure they never change, but you can use them in indexes (e.g. a GSI with Album as primary key). If no, items can't become inconsistent, but you need to parse the key to get them.

This was inconvenient enough that DynamoDB finally added multi-attribute keys support to GSIs in 2025 (although not inconvenient enough to also add it to tables).

See also

Single table design

The AWS guidance is to use as few tables as possible:

As a general rule, you should maintain as few tables as possible in a DynamoDB application. [...] A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.

This culminates in single table design, where you put all entities in the same table, and tell them apart based on the key format, usually using a prefix. With this pattern, one DynamoDB table corresponds to a whole relational database.

The easiest way is to put items related to a top-level entity on the same partition. The main benefit is that joins with the top-level entity become trivial. A second one is that you can sometimes get different entity types in a single query, which can be both faster and cheaper (fewer queries; small items pack into fewer capacity units).

Example

You can group items related to an Artist on the same partition, with sort keys like artist, album#{Album}, and song#{Album}#{Song}.

# table Music (partition key: Artist, sort key: sk)
Solar Fields: !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
  'song#Leaving Home#Air Song': { Duration: 741 }
  'song#Leaving Home#Monogram': { Duration: 944 }

Besides getting items of a single type, you can also get artist details and albums in a single query (sk BETWEEN "album#" AND "artist").

But choose wisely – queries can have only one sort key condition, so you can't also get album details and songs in a single query with this schema; sort keys {Album} and {Album}#{Song} would do it, at the expense of the first query.
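To see how the sort key conditions carve up the partition, here's a rough emulation in plain Python: a sorted list of sort keys stands in for the partition, and `between` and `begins_with` are hypothetical helpers that mimic the corresponding key conditions. No DynamoDB calls are involved.

```python
from bisect import bisect_left, bisect_right

# The "Solar Fields" partition from the example above, keyed by sort key.
partition = {
    "album#Leaving Home": {"Genre": "Electronic"},
    "artist": {"Variations": ["Solarfields"]},
    "song#Leaving Home#Air Song": {"Duration": 741},
    "song#Leaving Home#Monogram": {"Duration": 944},
}
sort_keys = sorted(partition)  # items on a partition are stored in sort key order

def between(low, high):
    """Emulate `sk BETWEEN low AND high`, inclusive on both ends."""
    return sort_keys[bisect_left(sort_keys, low):bisect_right(sort_keys, high)]

def begins_with(prefix):
    """Emulate `begins_with(sk, prefix)`."""
    return [sk for sk in sort_keys if sk.startswith(prefix)]

artist_and_albums = between("album#", "artist")
songs_only = begins_with("song#")
```

The range works only because `album#...` and `artist` happen to be adjacent in sort order; the songs are separated from the album item by `artist`, which is why no single condition returns album details and songs together.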

Sometimes, it can be useful to put some sub-entities on dedicated partitions, accepting that joins will have to be done in code.

Example

In the example above, a popular artist with lots of songs can lead to:

Perhaps it's better to put the songs in each album on separate partitions:

# table Music (partition key: pk, sort key: sk)
'artist#Solar Fields': !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
'song#Solar Fields#Leaving Home': !btree
  'Air Song': { Duration: 741 }
  'Monogram': { Duration: 944 }

This spreads the load onto multiple partitions, which should fix throttling.

The downside is that list songs for artist is now a two-step operation: first one query for the albums, then one query per album for the songs. The upside is that the per-album queries can be done in parallel, which wasn't possible before.
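The two-step lookup can be sketched like this. The `TABLE` dict and the `query` function are hypothetical in-memory stand-ins for the table and a Query call, so the snippet runs without DynamoDB; only the shape of the access pattern is the point.

```python
from concurrent.futures import ThreadPoolExecutor

# In-memory stand-in for the table above: partition key -> {sort key: attributes}.
TABLE = {
    "artist#Solar Fields": {
        "album#Leaving Home": {"Genre": "Electronic"},
        "artist": {"Variations": ["Solarfields"]},
    },
    "song#Solar Fields#Leaving Home": {
        "Air Song": {"Duration": 741},
        "Monogram": {"Duration": 944},
    },
}

def query(pk, sk_prefix=""):
    """Hypothetical Query: items on one partition, in sort key order."""
    items = TABLE.get(pk, {})
    return [(sk, items[sk]) for sk in sorted(items) if sk.startswith(sk_prefix)]

def list_songs(artist):
    # Step 1: one query for the artist's albums.
    albums = [sk.removeprefix("album#") for sk, _ in query(f"artist#{artist}", "album#")]
    # Step 2: one query per album, issued in parallel.
    with ThreadPoolExecutor() as pool:
        per_album = pool.map(lambda album: query(f"song#{artist}#{album}"), albums)
        return [(album, title) for album, songs in zip(albums, per_album) for title, _ in songs]

songs = list_songs("Solar Fields")
```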

A consequence of this design is that you need a GSI to list items of a specific type (otherwise, you have to do a full table scan). Of note, exceeding the GSI partition throughput limit will cause write throttling on the base table; in the absence of a natural high-cardinality GSI partition key, sharding or some other composite key can help.

A final benefit of using a single table is better utilization with provisioned mode: usage gets averaged across entities and tends to be smoother, and spikes can share the same spare capacity.

See also

GSI overloading

GSI overloading is just single table design for indexes – you put different values in the GSI key attributes, depending on item type. This way you can index more attributes than the 20 GSIs per table quota, and it can be cheaper too, since, like with tables, fewer indexes make better use of spare provisioned capacity.

Example

For a table that contains both artist and album items, a single GSI can be used for entirely different purposes:

# table Music (partition key: Artist, sort key: sk)
2 Bit Pie: !btree
  'album#2 Pie Island': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#United Kingdom' }
Ishome: !btree
  'album#Confession': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#Russia' }
# GSI GSI1 (partition key: gsi1pk, sort key: Artist)
'artist#United Kingdom': !btree
  2 Bit Pie: { sk: 'artist' }
'artist#Russia': !btree
  Ishome: { sk: 'artist' }
'album#Electronic': !btree
  2 Bit Pie: { sk: 'album#2 Pie Island' }
  Ishome: { sk: 'album#Confession' }
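To make the mapping from table to index explicit, here's a sketch that derives GSI1 from the base items in plain Python. The `build_gsi1` function is a hypothetical emulation of what DynamoDB maintains for you; it groups items by `gsi1pk` and orders each group by `Artist`.

```python
from collections import defaultdict

# Base table items from the example above, flattened into dicts.
ITEMS = [
    {"Artist": "2 Bit Pie", "sk": "album#2 Pie Island", "gsi1pk": "album#Electronic"},
    {"Artist": "2 Bit Pie", "sk": "artist", "gsi1pk": "artist#United Kingdom"},
    {"Artist": "Ishome", "sk": "album#Confession", "gsi1pk": "album#Electronic"},
    {"Artist": "Ishome", "sk": "artist", "gsi1pk": "artist#Russia"},
]

def build_gsi1(items):
    """Group items by gsi1pk and sort each group by the index sort key (Artist)."""
    index = defaultdict(list)
    for item in items:
        if "gsi1pk" in item:  # items missing the index key attribute are not indexed
            index[item["gsi1pk"]].append(item)
    for group in index.values():
        group.sort(key=lambda item: item["Artist"])
    return index

gsi1 = build_gsi1(ITEMS)
electronic_albums = [(item["Artist"], item["sk"]) for item in gsi1["album#Electronic"]]
uk_artists = [item["Artist"] for item in gsi1["artist#United Kingdom"]]
```

The type prefix in `gsi1pk` is what keeps the two uses apart: albums by genre and artists by country can never land on the same index partition.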

See also

Partition key sharding

Sometimes, a partition key composed of multiple natural attributes is not enough to spread the load evenly across partitions; you can deal with this by putting items with the same natural attributes on multiple partitions.

So, what partition key should you use? One option is to use a random suffix from a known range; this allows you to list items for a natural attribute value by doing multiple queries, one for each suffix.

Example

For a table of songs, using Album as the partition key won't work, since not all songs are released on an album; Artist always has a value, but some artists have hundreds or even thousands of songs, which can lead to throttling.

Instead, we can use {Artist}#{randrange(10)} as partition key, which allows ten times as many items before we reach throughput limits. To list an artist's songs:

for shard in range(10):
    for item in dynamodb.query(f"{artist}#{shard}"):
        yield item

A downside of random suffixes is that you can't get a specific item, because you don't know what its suffix is. A better option is to calculate the suffix from an attribute that you do know, for example using its hash modulo N.

Example

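With partition key {Artist}#{hash(Song) % 10}, we can get a song like this:

from hashlib import sha256

def hash(s):
    # The built-in hash() is randomized per process for strings; a digest is stable.
    return int.from_bytes(sha256(s.encode()).digest(), "big")

shard = hash(song_title) % 10
dynamodb.get_item(f"{artist}#{shard}", song_title)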

A lot of times you need to list items by a low-cardinality attribute, so sharding may be even more important for GSIs.

Example

Assuming dedicated album items, you can list all the albums by putting them in a single GSI partition key called albums, but this will definitely cause throttling.

To avoid it, you can use GSI partition key album#{hash(Album) % 100} if you don't care about the order, or something like album#{Album[:2].lower()} if you do (but likely more sophistication is needed – th will be a very common album title prefix, and some album titles don't contain letters at all).

Even if throttling is not an issue (e.g. single infrequent reader), sharding allows you to query multiple partitions in parallel, which can speed up getting the entire result set.
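A rough sketch of the parallel version, using a thread pool. The query_partition() function and the FAKE_TABLE dict are hypothetical stand-ins for a real Query call, so the example runs without DynamoDB.

```python
from concurrent.futures import ThreadPoolExecutor

# Stand-in for a table: partition key -> items in that partition.
FAKE_TABLE = {
    "Ishome#0": ["song-a"],
    "Ishome#3": ["song-b", "song-c"],
}


def query_partition(partition_key: str) -> list:
    """Hypothetical stand-in for one Query call against a single partition."""
    return FAKE_TABLE.get(partition_key, [])


def list_songs(artist: str, shards: int = 10) -> list:
    keys = [f"{artist}#{shard}" for shard in range(shards)]
    with ThreadPoolExecutor(max_workers=shards) as pool:
        # map() yields results in the order of keys, shard by shard.
        pages = pool.map(query_partition, keys)
        return [item for page in pages for item in page]


songs = list_songs("Ishome")
```

The number of queries stays the same; only the wall-clock time changes, since the shards are fetched concurrently.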


So, how many shards should you have? That depends on the number and size of the items and on how often you access them, and is also a trade-off: too many shards means additional queries and latency, while too few means you still overload the partitions sometimes.

Importantly, increasing the number of shards is non-trivial. For tables, you usually need to rebalance the items in place. For indexes, it's cleaner to move to a new index, or if you just need to list items by type, you can put all new items on new shards.

Regardless, you have to support it in code, do a backfill, and orchestrate the migration, which all become more complex if downtime and inconsistencies are not acceptable (e.g. if you expose a pagination token based on LastEvaluatedKey, you may want to support both versions during the switch).


Sparse indexes #

An item with missing index partition/sort key attributes won't appear in the index, and you won't pay for it. This can be used deliberately to query a subset of the items in the table, like those of a specific type or in a specific state.

Example

Assuming dedicated album items, an alternative way to list all the albums is to have a GSI with {Album} as partition key, and just scan the entire index (the primary key has to be a dedicated attribute that only albums have, so that only album items appear in the index).

Or, you can use a dedicated GSI with CoverOf as primary key to list cover songs.
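To make the filtering effect concrete, here is a small in-memory simulation of which items a sparse index would contain. It is only a sketch: index_view() is a hypothetical helper, and the items are modeled as plain dicts.

```python
def index_view(items: list, key_attribute: str) -> list:
    """Simulate a sparse index: only items that have the key attribute show up."""
    return [item for item in items if key_attribute in item]


table = [
    {"Artist": "2 Bit Pie", "sk": "artist"},
    {"Artist": "2 Bit Pie", "sk": "album#2 Pie Island", "Album": "2 Pie Island"},
    {"Artist": "Ishome", "sk": "album#Confession", "Album": "Confession"},
]

albums = index_view(table, "Album")
covers = index_view(table, "CoverOf")
```

The artist item has no Album attribute, so it never appears in the album index; with no cover songs in the table, the CoverOf index is empty.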


Base table indexes #

In some cases, GSIs won't cut it – maybe you need a strongly consistent index, or need to model a many-to-one relationship (indexes map one item in the base table to one item in the index).

Instead, you can maintain an index in the base table by having additional index items associated with the main item; to guarantee atomic updates, use transactions. You then go from the main item to the index items via a main item attribute, and from the index items to the main item via their partition key.

Example

Songs have different identifiers in external systems, such as ISRC, ISWC, or MBID. To query songs by multiple external ids, you'd structure your database like this:

(Alternatively, you could have one sparse index per external id type, but then you lose strong consistency, and risk running out of GSIs).

Note that modeling one-to-many relationships isn't this involved, since it fits neatly into the related-items-same-partition variant of single table design.


Optimistic locking #

Optimistic locking is a concurrency control method useful when conflicts are rare: instead of acquiring a lock to make changes, you check whether someone else changed the data right before committing, as part of an atomic operation.

In DynamoDB, that operation is a conditional write; items get an integer version attribute, and every time you want to update an item, you:

  1. read the item, including the version
  2. increment the version and modify the item
  3. update the item, using a condition expression to ensure the version matches
    1. if successful, you're done
    2. else, start over from the beginning
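The steps above can be sketched with an in-memory stand-in for the table. FakeTable, put_if_version() and update_with_retry() are hypothetical names used only for illustration; in DynamoDB, the check inside put_if_version() would be a condition expression on the version attribute.

```python
class ConditionFailed(Exception):
    """Raised when the stored version does not match the expected one."""


class FakeTable:
    """In-memory stand-in for a table that supports conditional writes."""

    def __init__(self):
        self._items = {}

    def get(self, key):
        return dict(self._items[key])

    def put_if_version(self, key, item, expected_version):
        current = self._items.get(key)
        current_version = current["version"] if current else None
        if current_version != expected_version:
            raise ConditionFailed(key)
        self._items[key] = dict(item)


def update_with_retry(table, key, change, max_attempts=5):
    for _ in range(max_attempts):
        item = table.get(key)            # 1. read the item, including the version
        expected = item["version"]
        change(item)                     # 2. modify the item ...
        item["version"] = expected + 1   #    ... and increment the version
        try:
            table.put_if_version(key, item, expected)  # 3. conditional write
        except ConditionFailed:
            continue                     # someone else won; start over
        return item
    raise RuntimeError(f"gave up after {max_attempts} attempts")


table = FakeTable()
table.put_if_version("song#1", {"plays": 0, "version": 0}, None)
updated = update_with_retry(
    table, "song#1", lambda item: item.update(plays=item["plays"] + 1)
)
```

Note that the change is passed in as a function, so it can be applied again to a fresh copy of the item on every retry.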

You can also do this in transactions to update groups of related items, like in the base table index pattern above, with only the main item needing a version.

The upside of optimistic locking is that it is faster on average, since updates usually succeed on the first try; for fewer conflicts, use strongly consistent reads.

The downside is that it requires explicit support – it must be possible to start over from the beginning, which complicates logic, especially if you need to interact with other systems besides updating the item (e.g. to send a notification).



Anyway, that's it for now.

See also

For more details and examples, check out the official documentation:


June 01, 2026 03:00 PM UTC


Real Python

Python sleep(): How to Add Time Delays to Your Code

Sometimes you need to make Python sleep, wait, or pause before running the next line of code. Whether you’re spacing out API requests, pacing a thread, or adding a delay to terminal output, Python’s time.sleep() function is the standard tool:

Language: Python
from time import sleep
sleep(3)  # Pause execution for 3 seconds

Beyond time.sleep(), Python provides different ways to add time delays depending on the context, including threads, async code, and GUI applications.

By the end of this tutorial, you’ll understand that:

  • time.sleep() suspends execution for a given number of seconds, including fractional values like milliseconds.
  • Retry decorators use time.sleep() to add a delay between failed attempts.
  • Event.wait() is the preferred way to add delays in threads because it can be interrupted cleanly.
  • asyncio.sleep() pauses a single coroutine without blocking the rest of your async code.
  • GUI frameworks like Tkinter provide scheduling methods such as .after() to avoid freezing the event loop.

The following sections cover each of these approaches with working code examples.


Pause Execution With Python sleep()

Python has built-in support for making your program wait. The time module has a sleep() function that you can use to add a delay by suspending execution of the calling thread for the number of seconds you specify:

Language: Python
>>> import time
>>> time.sleep(3)  # Sleep for 3 seconds

Here’s a quick example of time.sleep() in action:

Language: Python Filename: coffee.py
import time

print("Brewing coffee...")
print("This would take like 3 secs...")
time.sleep(3)
print("Done! Your coffee is ready!")

If you run this script, you’ll see a three-second pause between the messages while time.sleep() suspends execution.

You can also pass fractional seconds to time.sleep() for finer-grained durations. Here are some common values:

Language: Python
import time

time.sleep(0.5)  # Wait 500 milliseconds
time.sleep(0.001)  # Wait 1 millisecond
time.sleep(1.5)  # Wait 1.5 seconds
time.sleep(60)  # Wait 1 minute

The time.sleep() function isn’t perfectly precise. The specified value acts as a minimum delay. The actual pause will almost always be slightly longer in practice due to operating system scheduler overhead and current system load.

You can test how long the sleep lasts by using Python’s timeit module:

Language: Shell
$ python -m timeit -n 3 "import time; time.sleep(3)"
3 loops, best of 5: 3 sec per loop

Here, you run the timeit module with the -n parameter, which tells timeit how many times to run the statement per repeat. With the default of five repeats, the statement runs 15 times in total (3 × 5). timeit then reports the best time across all repeats, which is three seconds per loop, as expected.
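Because timeit rounds its report and shows only the best run, the overshoot is easy to miss. If you want to see how long a single call actually took, one option is to time it yourself with time.perf_counter(). The measure_sleep() helper below is a hypothetical name used for this sketch:

```python
import time


def measure_sleep(seconds: float) -> float:
    """Return how long time.sleep(seconds) actually took, in seconds."""
    start = time.perf_counter()
    time.sleep(seconds)
    return time.perf_counter() - start


elapsed = measure_sleep(0.05)
print(f"requested 0.0500 s, slept {elapsed:.4f} s")
```

The exact figure depends on your operating system and on how busy the machine is, so expect it to vary between runs.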

For a more realistic example, say you need to monitor whether a website is up. You want to check its status code periodically, but querying the server too often could overload it or get you rate-limited. You can use time.sleep() to space out the checks:

Language: Python Filename: uptime_bot.py
import time
import urllib.request
import urllib.error

CHECK_INTERVAL = 60  # Seconds between checks

def uptime_bot(url):
    while True:
        try:
            urllib.request.urlopen(url)
        except urllib.error.HTTPError as e:
            # Email admin or log
            print(f"HTTPError: {e.code} for {url}")
        except urllib.error.URLError as e:
            # Email admin or log
            print(f"URLError: {e.reason} for {url}")
        else:
            # Website is up
            print(f"{url} is up")
        time.sleep(CHECK_INTERVAL)

if __name__ == "__main__":
    url = "https://www.google.com/py"
    uptime_bot(url)
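The loop above can only be stopped by interrupting the process. As a rough sketch of the interruptible alternative, you could replace time.sleep() with threading.Event.wait(), which returns early as soon as the event is set. The run_periodically() and fake_check() functions are hypothetical stand-ins so that the example finishes on its own:

```python
import threading


def run_periodically(check, interval, stop_event):
    """Call check() every `interval` seconds until stop_event is set."""
    runs = 0
    while not stop_event.is_set():
        check()
        runs += 1
        # wait() returns True as soon as the event is set and False on timeout,
        # so a stop request ends the pause early instead of sleeping it out.
        if stop_event.wait(timeout=interval):
            break
    return runs


stop = threading.Event()
calls = []


def fake_check():
    # Stand-in for the real status check; requests a stop after three runs.
    calls.append("checked")
    if len(calls) == 3:
        stop.set()


runs = run_periodically(fake_check, interval=0.01, stop_event=stop)
```

In a real program, another thread or a signal handler would call stop.set() instead of the check function itself.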

Read the full article at https://realpython.com/python-sleep/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

June 01, 2026 02:00 PM UTC

Quiz: Regular Expressions: Regexes in Python (Part 1)

In this quiz, you’ll test your understanding of Regular Expressions: Regexes in Python (Part 1).

By working through this quiz, you’ll revisit how to use the re module to search for patterns, build character classes and anchors, group and capture substrings, and apply flags like re.IGNORECASE to control matching behavior.



June 01, 2026 12:00 PM UTC


Python Bytes

#482 Mr. Beast's episode

<strong>Topics covered in this episode:</strong><br> <ul> <li><strong><a href="https://marcelotryle.com/blog/2026/05/28/cve-2026-48710-a-maintainers-perspective/?featured_on=pythonbytes">CVE-2026-48710: A Maintainer's Perspective</a></strong></li> <li><strong><a href="https://github.com/emanuelef/daily-stars-explorer?featured_on=pythonbytes">daily-stars-explorer</a></strong></li> <li><strong><a href="https://testandcode.org/posts/writing/markdown-to-pdf/?featured_on=pythonbytes">Markdown to pdf with pandoc and typst</a></strong></li> <li><strong><a href="https://github.com/golikovichev/postman2pytest?featured_on=pythonbytes">postman2pytest</a></strong></li> <li><strong>Extras</strong></li> <li><strong>Joke</strong></li> </ul><a href='https://www.youtube.com/watch?v=kNEoGGXppe4' style='font-weight: bold;'data-umami-event="Livestream-Past" data-umami-event-episode="482">Watch on YouTube</a><br> <p><strong>About the show</strong></p> <p><strong>Brian #1: <a href="https://marcelotryle.com/blog/2026/05/28/cve-2026-48710-a-maintainers-perspective/?featured_on=pythonbytes">CVE-2026-48710: A Maintainer's Perspective</a></strong></p> <ul> <li>Marcelo Trylesinski</li> <li>suggested by Lee Luocks</li> <li>Short version: <ul> <li>users of Starlette: upgrade to Starlette 1.0.1</li> <li>security professionals: we can’t treat open source projects like corporations</li> </ul></li> <li>This top link is a Starlette security advisory with the title <ul> <li>Missing Host header validation poisons request.url.path, bypassing path-based security checks</li> </ul></li> <li>The CVE apparently caused some negative press targeting starlette.</li> <li>However, “the vulnerability came from the application pattern and the deployment, never from something Starlette intended.”</li> <li>A quote from an OSTIF article: “This bug is a classic “responsibility gap” where if this maintainer didn’t patch, thousands of exposed projects would have to individually secure their projects. 
In doing this work, they’ve voluntarily taken on the responsibility to protect the ecosystem from long-term systemic harm. As with all open source projects, they owed us nothing and could have left this to be everyone else’s problem and took the extraordinary steps of helping the ecosystem.”</li> <li>Both X40 D-Sec and Ars Technica expected immediate fixes and responses from Starlette.</li> <li>That’s not good. We can do better.</li> </ul> <p><strong>Michael #2: <a href="https://github.com/emanuelef/daily-stars-explorer?featured_on=pythonbytes">daily-stars-explorer</a></strong></p> <ul> <li>Explore the full history of any GitHub repository.</li> <li>📈 Full Star History - Complete daily star counts for any repo</li> <li>⏰ Hourly Stars - Hour-by-hour activity with timezone support</li> <li>🔀 Compare Repos - Side-by-side comparison of any two repositories</li> <li>📊 Activity Timelines - Commits, PRs, Issues, Forks, Contributors over time</li> <li>📌 Pin Favorites - Bookmark repos for quick access without retyping</li> <li>📰 Feed Mentions - See when repos were mentioned on HN, Reddit, YouTube, GitHub</li> <li>💾 Export Data - Download as CSV or JSON</li> <li>🌙 Dark Mode - Easy on the eyes</li> <li>Try/use it online at <a href="https://emanuelef.github.io/daily-stars-explorer/#/helm/helm"><strong>emanuelef.github.io/daily-stars-explorer</strong></a> or install it for yourself.</li> </ul> <p><strong>Brian #3: <a href="https://testandcode.org/posts/writing/markdown-to-pdf/?featured_on=pythonbytes">Markdown to pdf with pandoc and typst</a></strong></p> <ul> <li>typst suggestion from Matt Harrison</li> <li>Markdown is awesome</li> <li>Pandoc is great for converting markdown to tons of stuff <ul> <li>but for pdf, it goes through LaTeX, which is … yuk (my opinion)</li> </ul></li> <li>Pandoc also can convert to typst</li> <li>And typst creates beautiful pdfs and is way easier (my opinion) to deal with than LaTeX.</li> <li>New tools <ul> <li><code>brew upgrade pandoc</code></li> 
<li><code>brew install typst</code></li> </ul></li> <li>Now convert <ul> <li><code>pandoc something.md --to typst -o something.typ</code></li> <li><code>typst compile something.typ something.pdf</code></li> </ul></li> </ul> <p><strong>Michael #4: <a href="https://github.com/golikovichev/postman2pytest?featured_on=pythonbytes">postman2pytest</a></strong></p> <ul> <li>via Mikhail</li> <li>Based on <a href="https://www.postman.com/downloads/?featured_on=pythonbytes">postman app</a></li> <li>Convert Postman Collection v2.1 JSON into executable pytest test suites</li> <li>Postman collections document your API. <code>postman2pytest</code> turns that documentation into executable regression tests that run in CI. No manual rewriting, no drift.</li> </ul> <p><strong>Extras</strong>:</p> <ul> <li><a href="https://testandcode.org/posts/meta/a-place-to-write/?featured_on=pythonbytes">New blog, who dis?</a> - <a href="https://testandcode.org?featured_on=pythonbytes">testandcode.org</a> is now on .org and a blog and soon to be a “publisher”.</li> </ul> <p><strong>Joke: <a href="https://x.com/PR0GRAMMERHUM0R/status/2059001594662846751?featured_on=pythonbytes">Centering a div</a></strong></p>

June 01, 2026 08:00 AM UTC


Speed Matters

Scandir Rs


layout: post
title: scandir-rs
tagline: Blazing-fast directory traversal for Python, up to 70× faster than os.walk.
date: 2026-06-01 08:40:00 +0100
categories: posts

scandir-rs: High-Performance Directory Traversal for Python

File system traversal is often a hidden bottleneck.

Whether you’re indexing files, collecting statistics, searching large directory trees, or building developer tools, performance matters. That’s why I created scandir-rs: a Rust-powered Python library designed to be a drop-in replacement for os.walk() and os.scandir(), while delivering dramatically better performance and additional functionality.

A new version (2.9.9) is available, with the following changes compared to the version I introduced here last time (2.7.1):

Why scandir-rs?

Because speed matters…

🚀 Significant Performance Improvements

Compared to Python’s built-in implementations:

When processing millions of files, these speedups can turn minutes into seconds.

Benchmark results for running scandir in the linux-5.9 folder

[Figure: scandir-rs Walk benchmark on Linux (kernel 5.9)]
[Figure: scandir-rs Walk benchmark on Windows (kernel 5.9)]

🔍 Richer Metadata

Beyond the standard os.walk() and os.scandir() APIs, scandir-rs can return:

⚡ Background Processing

Long-running scans can run asynchronously in the background, allowing your application to process results while scanning is still in progress.

Installation

pip install scandir-rs

Usage Examples

Directory Statistics

Get fast statistics for an entire directory tree:

import scandir_rs as scandir

print(scandir.Count("/usr").collect())

Extended Statistics

Include additional metadata and hardlink detection:

import scandir_rs as scandir

print(
    scandir.Count(
        "/usr",
        return_type=scandir.ReturnType.Ext
    ).collect()
)

Background Scanning

Process results while scanning continues in the background:

import scandir_rs as scandir

counter = scandir.Count("/usr")

with counter:
    while counter.busy:
        results = counter.results()
        # Process intermediate results

# Final results as JSON
results = counter.to_json()

Faster os.walk()

A familiar interface with significantly better performance:

import scandir_rs as scandir

for root, dirs, files in scandir.Walk("/usr"):
    # Process files
    pass

Extended Walk Information

Retrieve additional file categories and error information:

import scandir_rs as scandir

for root, dirs, files, symlinks, other, errors in scandir.Walk(
    "/usr",
    return_type=scandir.ReturnType.Ext
):
    # Process files
    pass

On Unix systems, other includes special file types such as pipes and devices.

Faster os.scandir()

Collect all entries at once:

import scandir_rs as scandir

entries, errors = scandir.Scandir("/usr").collect()

Or iterate lazily:

import scandir_rs as scandir

for entry in scandir.Scandir("/usr"):
    # Process entry
    pass

Extended Metadata

Request detailed information for each directory entry:

import scandir_rs as scandir

for entry in scandir.Scandir(
    "/usr",
    return_type=scandir.ReturnType.Ext
):
    # Process entry
    pass

Entries are returned as DirEntryExt objects. Errors are reported as tuples containing:

(relative_path, error_message)

allowing scans to continue even when individual files cannot be accessed.

Benchmark Results

Walk Performance

Operation          Linux              Windows
Walk vs os.walk    Up to 13× faster   Up to 70× faster

Scandir Performance

Operation                Linux               Windows
Scandir vs os.scandir    Up to 6.5× faster   Up to 6.5× faster

For detailed benchmark data and methodology, see the benchmark documentation:

https://github.com/brmmm3/scandir-rs/blob/master/pyscandir/doc/benchmarks.md

Get Started

If your application spends time traversing large directory trees, scandir-rs can provide substantial performance improvements with minimal code changes.

The API is intentionally familiar, making migration from os.walk() and os.scandir() straightforward while unlocking additional capabilities and significantly faster execution.

Source code, documentation, and issue tracker:

https://github.com/brmmm3/scandir-rs

Licensed under the MIT License.

June 01, 2026 12:00 AM UTC


Stéphane Wirtel

PyCon Ireland 2026: The Call for Proposals is Open


TL;DR

PyCon Ireland 2026 takes place on 17 October at Trinity College Dublin. The Call for Proposals is open until 30 August. Two tracks get special focus this year: Python security and AI with Python. First-time speakers are welcome. Financial aid up to €350 is available. Submit at 2026.pycon.ie/cfp.


I’m part of the team organising PyCon Ireland 2026, and the Call for Proposals opened on 25 May. If you’ve been carrying a Python idea around (something you built, broke, learned, or want to share), now is the time to write it up.

June 01, 2026 12:00 AM UTC


Bob Belderbos

AI Human-in-the-loop: News Digest Triage Telegram Bot

In my trend digest article I shared a quick tool to keep on top of tech trends, but it's a one-way street: the model gives information, but I still have to decide what to do with it. Let's build the second half: a Telegram bot that shows me each story, guesses a tag, and lets me confirm or overrule it with one tap.

Human-in-the-loop (HITL): the model proposes, you decide

AI makes suggestions but it can hallucinate, so it's important to have a human in the loop to catch mistakes. The model does the work of categorizing, the human makes the final decision. This is a good example of the control layer above the model and it's where you can make AI more reliable.

This is what we teach in week 4 of our Agentic AI cohort, where things come together: expense parsing, AI category suggestion, and the human in the loop to confirm it. This requires the bot to keep state, route responses, and have a way to be wrong gracefully. Below is a smaller version so you can get a taste of how this works.

We'll build it in seven steps. Grab the full script up front, or follow along piece by piece.

Step 1: create the bot and get a token

Telegram bots are created by another bot. Open Telegram and search for @BotFather (it has a blue checkmark):

  1. Send /newbot.
  2. Give it a display name (anything).
  3. Give it a username ending in bot that is globally unique, e.g. alice_trend_bot.

BotFather replies with a token like 123456789:ABCdef.... Treat it like a password. Put it in a .env file next to your script (or export it in your shell), together with your OpenAI key:

TELEGRAM_BOT_TOKEN=123456789:ABCdef...
OPENAI_API_KEY=sk-...

If the token ever leaks, send /revoke to BotFather for a fresh one.

Step 2: the dependencies

The whole thing is one file. I use a PEP 723 header so uv run resolves everything into its own environment, no virtualenv to manage. Put this at the top of trend_triage_bot.py:

# /// script
# requires-python = ">=3.12"
# dependencies = [
#   "python-telegram-bot>=21",
#   "openai>=1.40",
#   "httpx",
#   "python-decouple",
#   "pydantic",
# ]
# ///

If you would rather build this inside an existing project, the equivalent is:

uv init && uv add python-telegram-bot openai httpx python-decouple pydantic

Then the imports and a few constants:

import json
import logging
from pathlib import Path
from typing import Literal, Protocol

import httpx
from decouple import config
from openai import AsyncOpenAI
from pydantic import BaseModel
from telegram import InlineKeyboardButton, InlineKeyboardMarkup, Message, Update
from telegram.ext import (
    Application,
    CallbackQueryHandler,
    CommandHandler,
    ContextTypes,
)

logger = logging.getLogger(__name__)

TAGS = ["read", "lib", "tool", "skip"]
DEFAULT_TOPIC = "rust"
READING_LIST = Path("reading_list.jsonl")
LOBSTERS_FEED = "https://lobste.rs/t/{tag}.json"

Step 3: fetch the stories

Lobsters has a per-tag JSON feed, no auth required: https://lobste.rs/t/rust.json returns the latest Rust-tagged stories, .../t/python.json the Python ones, and so on. It's a tighter, more engineering-focused signal than a broad keyword search, and parameterizing the tag is what lets /digest rust and /digest python hit the same code. A Story is just a title and a URL.

Let's set up the model and fetch the latest five stories for a given tag:

from pydantic import BaseModel, HttpUrl

class Story(BaseModel):
    title: str
    url: HttpUrl


async def fetch_stories(tag: str, *, limit: int = 5) -> list[Story]:
    async with httpx.AsyncClient(
        timeout=10, headers={"User-Agent": "trend-triage-bot"}
    ) as http:
        response = await http.get(LOBSTERS_FEED.format(tag=tag))
        response.raise_for_status()
        return [
            Story(
                title=story["title"],
                url=story["url"] or f"https://lobste.rs/s/{story['short_id']}",
            )
            for story in response.json()[:limit]
            if story.get("title")
        ]

Two small details: Lobsters expects a User-Agent header, and a text/discussion post has an empty url, so we fall back to its comments page (/s/{short_id}), the same pattern you'd use for an HN self-post.

The * in the function signature makes limit keyword-only, so you have to call fetch_stories("rust", limit=10); a stray positional argument raises a TypeError instead of silently being used as the limit.
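To see the keyword-only behavior in isolation, here's a toy function with the same signature shape. fetch() is a hypothetical stand-in for fetch_stories() that skips the network entirely:

```python
def fetch(tag: str, *, limit: int = 5) -> str:
    """Toy stand-in with the same signature shape as fetch_stories()."""
    return f"{tag}:{limit}"


ok = fetch("rust", limit=10)

try:
    fetch("rust", 10)  # positional limit is rejected
except TypeError as exc:
    error = str(exc)
else:
    error = None
```

The second call never reaches the function body; Python rejects it while binding the arguments.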

Step 4: let the LLM propose a tag

The model picks one of TAGS. As the digest topic is variable (/digest rust, /digest python), the tags have to be topic-agnostic, so they describe what a story is (read / lib / tool), not anything Rust- or Python-specific.

Content-type beats intent here: "is this a tool or a library" is answerable from a headline, whereas "will I read this or build with it" depends on me, not the title. And a tag the model can't infer is a tag you end up correcting every time.

And using structured outputs I get typed values back, not strings I have to parse and second-guess; consistent data types are the foundation of reliable AI.

SYSTEM = (
    "Tag this software/tech headline with one of: "
    "read (an article, post, or tutorial), "
    "lib (a library, framework, or package you import), "
    "tool (a CLI, app, or utility you run). "
    "Use 'skip' only if it is off-topic or clickbait."
)


class TagChoice(BaseModel):
    tag: Literal["read", "lib", "tool", "skip"]


class Classifier(Protocol):
    async def tag(self, story: Story) -> str: ...


class OpenAIClassifier:
    def __init__(self, api_key: str, model: str = "gpt-4o-mini") -> None:
        self._client = AsyncOpenAI(api_key=api_key)
        self._model = model

    async def tag(self, story: Story) -> str:
        completion = await self._client.beta.chat.completions.parse(
            model=self._model,
            messages=[
                {"role": "system", "content": SYSTEM},
                {"role": "user", "content": story.title},
            ],
            response_format=TagChoice,
        )
        choice = completion.choices[0].message.parsed
        return choice.tag if choice else "skip"


def _build_classifier() -> Classifier:
    return OpenAIClassifier(config("OPENAI_API_KEY"))

_build_classifier() constructs the client at runtime, not at import; we call it once in main() and stash the result (more on that in step 7).

This decoupling allows a test to inject a fake tagger without touching OpenAI. It's the same lazy-wiring trick I used demonstrating the repository pattern. The Protocol means any class with an async tag() method drops in. Protocols are more flexible here, because they don't require inheritance like ABCs do, so the test double doesn't have to know about the real classifier at all.
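Here's roughly what such a test double could look like. This is a sketch under a few assumptions: Story is simplified to a dataclass so the example needs no third-party packages, and FakeClassifier and suggest() are hypothetical names, not part of the bot.

```python
import asyncio
from dataclasses import dataclass
from typing import Protocol


@dataclass
class Story:  # simplified stand-in for the pydantic model
    title: str
    url: str


class Classifier(Protocol):
    async def tag(self, story: Story) -> str: ...


class FakeClassifier:
    """Test double: no inheritance, just a matching async tag() method."""

    def __init__(self, answer: str) -> None:
        self.answer = answer
        self.seen: list[str] = []

    async def tag(self, story: Story) -> str:
        self.seen.append(story.title)
        return self.answer


async def suggest(classifier: Classifier, story: Story) -> str:
    # Works with any object that satisfies the Protocol.
    return await classifier.tag(story)


fake = FakeClassifier("tool")
result = asyncio.run(suggest(fake, Story("A fast grep", "https://example.com")))
```

The fake also records which titles it was asked about, which is handy for asserting that the handler called the classifier at all.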

Filing a tagged story is a one-liner to a JSONL file. JSONL (or JSON Lines) is a way to store structured data; each line contains a single, valid JSON object.

def save_to_reading_list(story: Story, tag: str) -> None:
    with READING_LIST.open("a") as f:
        f.write(json.dumps({"tag": tag, **story.model_dump(mode="json")}) + "\n")

Note that a plain model_dump() leaves url as a pydantic Url object (HttpUrl), which json.dumps can't serialize; mode="json" coerces it to a string first.
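Reading the file back is the mirror image: parse one JSON object per line. A minimal sketch, assuming a hypothetical load_reading_list() helper and a temporary file with made-up entries so the real reading list is left alone:

```python
import json
import tempfile
from pathlib import Path


def load_reading_list(path: Path) -> list[dict]:
    """Parse a JSONL file: one JSON object per non-empty line."""
    with path.open() as f:
        return [json.loads(line) for line in f if line.strip()]


with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "reading_list.jsonl"
    with path.open("a") as f:
        # Made-up entries in the same shape save_to_reading_list() writes.
        f.write(json.dumps({"tag": "read", "title": "Example", "url": "https://example.com/"}) + "\n")
        f.write(json.dumps({"tag": "tool", "title": "Another", "url": "https://example.org/"}) + "\n")
    entries = load_reading_list(path)
```

Because every line is a complete object, appending never requires rewriting the file, and a half-written last line only costs you one entry.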

Step 5: the keyboard that highlights the guess

The AI's pick is prefixed with >>, but every other tag is one tap away. The callback_data stays plain (tag:read) so the handler never has to strip the decoration:

def triage_keyboard(suggested: str) -> InlineKeyboardMarkup:
    buttons = [
        InlineKeyboardButton(
            f">> {tag}" if tag == suggested else tag,
            callback_data=f"tag:{tag}",
        )
        for tag in TAGS
    ]
    rows = [buttons[i : i + 3] for i in range(0, len(buttons), 3)]
    return InlineKeyboardMarkup(rows)

The marker goes in front, and that detail matters: Telegram clips long button labels from the end, so my first attempt, wrapping the tag in >> tool <<, showed up as >> tool… with the closing marker eaten. The kind of bug you only catch by testing it on a real phone.
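The labelling and row-chunking logic can be checked without Telegram at all. This sketch mirrors triage_keyboard() with plain tuples; button_rows() is a hypothetical helper, not part of the bot:

```python
TAGS = ["read", "lib", "tool", "skip"]


def button_rows(suggested: str, per_row: int = 3) -> list[list[tuple[str, str]]]:
    """Return (label, callback_data) pairs laid out in rows of at most per_row."""
    buttons = [
        (f">> {tag}" if tag == suggested else tag, f"tag:{tag}")
        for tag in TAGS
    ]
    return [buttons[i : i + per_row] for i in range(0, len(buttons), per_row)]


rows = button_rows("lib")
```

With four tags and three buttons per row, the last tag lands on a row of its own, and only the label of the suggested tag carries the marker.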

[Screenshot: Telegram inline keyboard showing read, lib, tool and skip buttons, with the model's guess marked]

Step 6: two steps, one stashed queue

A simple bot is stateless: message in, reply out. This one is not. Step one (the /digest command) fetches and shows the first story; step two fires later, when I tap a button, and needs the queue from step one. context.user_data is a per-user dict the library keeps between handler calls, so I park the queue there:

async def start_digest(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    if update.message is None:
        return
    topic = context.args[0].lower() if context.args else DEFAULT_TOPIC
    await update.message.reply_text(f"Fetching today's {topic} stories...")
    try:
        context.user_data["queue"] = await fetch_stories(topic)
    except httpx.HTTPStatusError:
        await update.message.reply_text(
            f"No feed for '{topic}'. Try a Lobsters tag like rust, python, or go."
        )
        return
    except httpx.RequestError:
        await update.message.reply_text(
            "Couldn't reach Lobsters right now — try again in a bit."
        )
        return
    await show_next(update.message, context)


async def show_next(message: Message, context: ContextTypes.DEFAULT_TYPE) -> None:
    queue: list[Story] = context.user_data.get("queue", [])
    if not queue:
        await message.reply_text("Inbox zero. That's all the trends today.")
        return
    story = queue[0]
    suggested = await context.bot_data["classifier"].tag(story)
    await message.reply_text(
        f"{story.title}\n{story.url}",
        reply_markup=triage_keyboard(suggested),
    )

context.args is whatever followed the command: /digest python gives ["python"], a bare /digest gives [] and falls back to DEFAULT_TOPIC.

A typo'd topic is a 404 from Lobsters, so I catch HTTPStatusError and reply with a helpful message, otherwise the user would just stare at a digest that never arrives. Validate at the boundary where untrusted input enters.

The second except covers the other failure mode: the request never gets an HTTP response at all. HTTPStatusError only fires once Lobsters answers with a 4xx/5xx — a connect timeout, read timeout, or DNS failure is an httpx.RequestError, which is a sibling of HTTPStatusError, not a subclass. Miss it and a flaky network crashes the handler with a traceback instead of a friendly reply. Catching RequestError covers every transport-level failure (ConnectTimeout, ReadTimeout, ConnectError) in one branch.


Read user_data with .get(...), never [...]. It lives in memory, so if the bot restarts mid-flow the dict is empty and you want a graceful reply, not a KeyError.

context.bot_data is its per-bot sibling: one dict shared across all users. That makes it the right home for the classifier, which holds no per-user state. We build it once in step 7 and read it back here, so every story reuses the same OpenAI client instead of constructing a fresh one each time.

Step 7: the callback, then wire it up

When I tap a button Telegram sends a callback query, not a message. Three rules keep it sane, numbered in the code:

async def on_tag(update: Update, context: ContextTypes.DEFAULT_TYPE) -> None:
    query = update.callback_query
    if query is None or query.data is None:
        return
    await query.answer()  # 1. stop the spinner, first thing
    _, tag = query.data.split(":", 1)  # 2. "tag:read" -> "read"
    queue = context.user_data.get("queue", [])
    if not queue:  # stale button after a restart
        await query.edit_message_text("Session expired, send /digest again.")
        return
    story = queue.pop(0)
    if tag != "skip":
        save_to_reading_list(story, tag)  # the human's final say
    await query.edit_message_text(  # 3. edit, don't reply
        f"Filed under {tag}: {story.title}"
        if tag != "skip"
        else f"Skipped: {story.title}"
    )
    await show_next(query.message, context)

Call await query.answer() first or the loading spinner on the button never stops, even when everything else works. Edit the original message instead of replying, or the dead keyboard sits there inviting a second tap on a story you already filed. The same .get(...)-not-[...] rule applies here: an old keyboard from before a restart can still send a tap, and you want a "send /digest again" nudge, not a KeyError.

Routing is by prefix. The pattern="^tag:" is why a future second keyboard (say setcurrency:EUR) would not trip this handler:

async def on_error(update: object, context: ContextTypes.DEFAULT_TYPE) -> None:
    logger.exception("Handler failed", exc_info=context.error)


def main() -> None:
    logging.basicConfig(
        format="%(asctime)s %(name)s %(levelname)s %(message)s",
        level=logging.INFO,
    )
    logger.info("Starting trend triage bot, polling for updates")
    app = Application.builder().token(config("TELEGRAM_BOT_TOKEN")).build()
    app.bot_data["classifier"] = _build_classifier()
    app.add_handler(CommandHandler("digest", start_digest))
    app.add_handler(CallbackQueryHandler(on_tag, pattern="^tag:"))
    app.add_error_handler(on_error)
    app.run_polling()


if __name__ == "__main__":
    main()
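To see why the prefix keeps two keyboards apart, here is a small stdlib-only sketch. It assumes the handler applies a string pattern with re.match against the callback data (my understanding of python-telegram-bot, worth confirming in its docs); routes_to_on_tag and parse_tag are hypothetical helper names, not functions from the bot.

```python
import re

# Same pattern string that is passed to CallbackQueryHandler above.
TAG_PATTERN = re.compile(r"^tag:")


def routes_to_on_tag(callback_data: str) -> bool:
    """Return True when the callback data starts with the tag: prefix."""
    return TAG_PATTERN.match(callback_data) is not None


def parse_tag(callback_data: str) -> str:
    """Split "tag:read" into its payload; maxsplit=1 keeps any later ':' intact."""
    _, tag = callback_data.split(":", 1)
    return tag


print(routes_to_on_tag("tag:read"))         # handled by on_tag
print(routes_to_on_tag("setcurrency:EUR"))  # left for another handler
print(parse_tag("tag:read"))
```

Splitting with maxsplit=1 means a payload that itself contains a colon still arrives in one piece.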

Three things to notice here in main():

Run it

$ export OPENAI_API_KEY=sk-proj-...
$ export TELEGRAM_BOT_TOKEN=...
$ uv run trend_triage_bot.py
2026-06-01 13:19:42,866 __main__ INFO Starting trend triage bot, polling for updates
2026-06-01 13:19:43,190 httpx INFO HTTP Request: POST https://api.telegram.org/bot.../getMe "HTTP/1.1 200 OK"
2026-06-01 13:19:43,242 httpx INFO HTTP Request: POST https://api.telegram.org/bot.../deleteWebhook "HTTP/1.1 200 OK"
2026-06-01 13:19:43,244 telegram.ext.Application INFO Application started

Open your bot in Telegram and send /digest for the default topic, or use a tag like /digest python, /digest rust or any Lobsters tag.

The bot walks you through today's stories one at a time. Tap the highlighted tag to accept the model's guess, or any other tag to overrule it, until you hit inbox zero:

Telegram chat showing stories filed under read and tool and one skipped, ending with

Same bot, any topic: finish the Rust queue, then /digest python and triage that, no code change:

Telegram chat: after the Rust digest reaches inbox zero, /digest python fetches Python stories and shows the first one with the read tag highlighted

Filed stories land in reading_list.jsonl:

{"tag": "read", "title": "One year of Roto, the compiled scripting language for Rust", "url": "https://blog.nlnetlabs.nl/one-year-of-roto-the-compiled-scripting-language-for-rust/"}
{"tag": "lib", "title": "Announcing Rust 1.96.0", "url": "https://blog.rust-lang.org/2026/05/28/Rust-1.96.0/"}
{"tag": "read", "title": "What kache actually caches", "url": "https://kunobi.ninja/blog/what-kache-actually-caches"}
{"tag": "read", "title": "Creusot helps you prove your Rust code is correct", "url": "https://github.com/creusot-rs/creusot/tree/master"}
{"tag": "tool", "title": "uv must be installed to build a standalone Python distribution", "url": "https://github.com/astral-sh/python-build-standalone/commit/c9c40c56eb53136587f0a32382cad9e5cd8d184a"}
{"tag": "tool", "title": "SPy: an interpreter and a compiler for a statically typed variant of Python", "url": "https://github.com/spylang/spy"}
{"tag": "read", "title": "Opaque Types in Python", "url": "https://blog.glyph.im/2026/05/opaque-types-in-python.html"}
{"tag": "read", "title": "uv is fantastic, but its package management UX is a mess", "url": "https://www.loopwerk.io/articles/2026/uv-ux-mess/"}

That file is the actual output: one JSON object per line, ready to feed into whatever reads it next. The whole script is in this gist.
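As a sketch of "whatever reads it next", the snippet below parses a JSON Lines file and counts entries per tag. load_reading_list and count_by_tag are hypothetical names; only the file layout (one object per line with tag, title and url keys) comes from the output above.

```python
import json
from collections import Counter
from pathlib import Path


def load_reading_list(path: Path) -> list[dict[str, str]]:
    """Parse one JSON object per non-empty line."""
    entries = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                entries.append(json.loads(line))
    return entries


def count_by_tag(entries: list[dict[str, str]]) -> Counter:
    """Tally how many stories were filed under each tag."""
    return Counter(entry["tag"] for entry in entries)


if __name__ == "__main__":
    path = Path("reading_list.jsonl")
    if path.exists():
        print(count_by_tag(load_reading_list(path)))
```

Because each line is an independent document, the file can be appended to while the bot runs and still be read line by line afterwards.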

The interesting question is not whether the model can tag a headline. It's pretty accurate, but it can get it wrong, and that's where you want to have a human in the loop. This has been a simple example to show the flow, but real workflows might involve more interesting things like approving trades, triaging support tickets, or moderating content. The model can do the heavy lifting of making a guess, but the human gets the final say, and that's where the value is.

Keep reading

June 01, 2026 12:00 AM UTC

May 31, 2026


Paolo Melchiorre

My PyCon Italia 2026

A timeline of my PyCon Italia 2026 journey, in Bologna (IT), told through the Mastodon posts I shared along the way.

May 31, 2026 10:00 PM UTC


Kay Hayen

Nuitka Release 4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release adds many new features and corrections, with a focus on async code compatibility, missing generics features, Python 3.14 compatibility, and, yet again, Python compilation scalability.

Bug Fixes

Package Support

New Features

Optimization

Anti-Bloat

Organizational

Tests

Cleanups

Summary

This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.

The --project option seems usable now.

Python 3.14 support remains experimental; it only barely missed the cut and will probably get there in hotfixes. Some of the corrections came in so late before the release that it was just not possible to feel good about declaring it fully supported just yet.

May 31, 2026 10:00 PM UTC

May 30, 2026


Talk Python to Me

#550: AI Contributions and Maintainer Load in Open Source

You wake up, brew the coffee, open GitHub, and there it is. Another pull request on your open source project. Thirteen thousand lines added. No issue filed first. No discussion. Just "here, please review this for me."

Over the past year, GitHub activity has spiked roughly twelve times in a few short months, and a huge chunk of that signal is landing on the same small group of maintainers who were already stretched thin. The curl bug bounty got buried under AI-generated noise. Jazzband, the home of Django classics like pip-tools and the Django debug toolbar, hit what its maintainer called an "apocalypse" and started sunsetting. Even CPython just shipped fresh guidelines on AI-assisted contributions this week.

So what does all of this actually look like from the receiving end of the pull request?

On this episode, Paolo Melchiorre joins us to tell that story from inside the maintainer's chair. Paolo is a director of the Django Software Foundation, an organizer of PyCon Italy, a Django Girls coach, and he has spent the past year carefully collecting examples of how AI is reshaping open source contributions. The good, the bad, and the extra fingers.

We dig into his PyCon US talk on AI-assisted contributions and maintainer load, why AI is best understood as an amplifier rather than a new kind of contributor, the wildly different policies across 86 open source foundations, and whether projects banning AI today are reacting to last year's models.

Episode sponsors

AgentField AI: https://talkpython.fm/agentfield-page
Talk Python Courses: https://talkpython.fm/training

Links from the show

Guest
Paolo Melchiorre: https://github.com/pauloxnet?featured_on=talkpython

DSF: https://www.djangoproject.com/foundation/?featured_on=talkpython
djangonaut-space: https://djangonaut.space/?featured_on=talkpython
PyCon Italia: https://2026.pycon.it/en?featured_on=talkpython
uDjango: https://github.com/pauloxnet/uDjango?featured_on=talkpython
My PyCon US 2026 post: https://www.paulox.net/2026/05/21/my-pycon-us-2026/?featured_on=talkpython
AI-Assisted Contributions and Maintainer Load: https://www.paulox.net/2026/05/15/pycon-us-2026/?featured_on=talkpython
Senior Engineer Tries Vibe Coding: https://www.youtube.com/watch?v=_2C2CNmK7dQ
Code Rabbit AI PR Reviews: https://www.coderabbit.ai?featured_on=talkpython
GitHub Usage Graphs: https://github.blog/news-insights/company-news/an-update-on-github-availability/?featured_on=talkpython
Update on CPython's AI Policies: https://fosstodon.org/@mariatta/116610508567734365
High-Quality Chaos from Curl: https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/?featured_on=talkpython
The Generative AI Policy Landscape in Open Source: https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/?featured_on=pythonbytes

Watch this episode on YouTube: https://www.youtube.com/watch?v=1RJ1kkpTdow
Episode #550 deep-dive: https://talkpython.fm/episodes/show/550/ai-contributions-and-maintainer-load-in-open-source#takeaways-anchor
Episode transcripts: https://talkpython.fm/episodes/transcript/550/ai-contributions-and-maintainer-load-in-open-source

Theme Song: Developer Rap
🥁 Served in a Flask 🎸: https://talkpython.fm/flasksong

Don't be a stranger
YouTube: https://talkpython.fm/youtube

Bluesky: https://bsky.app/profile/talkpython.fm
Mastodon: https://fosstodon.org/web/@talkpython
X.com: https://x.com/talkpython

Michael on Bluesky: https://bsky.app/profile/mkennedy.codes?featured_on=talkpython
Michael on Mastodon: https://fosstodon.org/web/@mkennedy
Michael on X.com: https://x.com/mkennedy?featured_on=talkpython

May 30, 2026 03:43 PM UTC


Bob Belderbos

The control layer is the product, not the model

Gary Bernhardt posted something this week that names a phenomenon we're teaching in our agentic AI cohort:

Everyone seems fixated on the models, but I think there's so much low-hanging fruit in the control layer above the model. "Agent" and "harness" sell that layer short. There's so much more that we can do beyond "read input, send to model, run commands it returns."

He's right. The model is a brain in a jar. Useful, fast, occasionally wrong, stateless. Everything that turns it into a product lives in the code that wraps it: the routing, the validation, the state, the audit trail. Gary calls that the control layer. I'm stealing the term.

One of the replies under the tweet nailed the design goal in a single question: do you actually know what the agent is going to do?

That's what a control layer buys you. Not magic, not autonomy: predictability. A workflow where, by the time the model is called, the next move is already constrained to something safe.

Why "agent" and "harness" sell it short

When a developer says "I'm building an agent", they usually mean a while True loop that pings an LLM, parses a tool call, runs it, feeds the result back, and repeats. That pattern works for demos. It rarely survives contact with a real workflow.

The word "harness" makes the wrapping code sound passive, a strap that holds the model in place. It's actually the control layer where the engineering happens. The model is a function call inside it. Once you flip that mental model, you stop asking "which LLM should I use" and start asking "what guarantees does my control layer make?" and "how can I make the inherently unpredictable model fit into a predictable workflow?"

These are the questions production teams have to answer.

Pattern 1: deterministic state machines, not unconstrained agents

An agent without constraints decides what to do next from inside the model. A state machine decides outside the model and gives the model one bounded job at each step. The pipeline runs categorize → validate → confirm → persist, and the LLM only ever gets called inside one of those buckets.

This shifts control flow back to your code, where you can test it, log it, and reason about it. The expense agent we build in our cohort, which I broke down in How an AI expense agent is actually structured, follows exactly this pattern: a Protocol-defined LLM boundary, Pydantic-validated outputs, a service layer that holds the state, and a human-in-the-loop (HITL) confirmation before anything writes. Four layers, no free-roaming agent, constraints at every step.
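A minimal sketch of that shape, not the cohort's actual code. The step names follow the categorize, validate, confirm, persist pipeline above; Expense, ALLOWED_CATEGORIES, and the categorize and confirm callables are hypothetical stand-ins for the model call and the human.

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Optional


class Step(Enum):
    CATEGORIZE = auto()
    VALIDATE = auto()
    CONFIRM = auto()
    PERSIST = auto()
    DONE = auto()
    REJECTED = auto()


ALLOWED_CATEGORIES = {"travel", "meals", "software"}  # hypothetical business rule


@dataclass
class Expense:
    description: str
    category: Optional[str] = None
    log: list = field(default_factory=list)  # audit trail of visited steps


def run_pipeline(
    expense: Expense,
    categorize: Callable[[str], str],    # the only place a model would be called
    confirm: Callable[[Expense], bool],  # the human in the loop
    store: list,
) -> Step:
    """Advance through fixed transitions; the code, not the model, picks the next step."""
    step = Step.CATEGORIZE
    while step not in (Step.DONE, Step.REJECTED):
        expense.log.append(step.name)
        if step is Step.CATEGORIZE:
            expense.category = categorize(expense.description)
            step = Step.VALIDATE
        elif step is Step.VALIDATE:
            step = Step.CONFIRM if expense.category in ALLOWED_CATEGORIES else Step.REJECTED
        elif step is Step.CONFIRM:
            step = Step.PERSIST if confirm(expense) else Step.REJECTED
        elif step is Step.PERSIST:
            store.append(expense)
            step = Step.DONE
    return step
```

Whatever the categorizer returns, the only possible outcomes are DONE or REJECTED, and nothing reaches the store without passing validation and confirmation first.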

Pattern 2: the model behind a typed boundary

The model should be one swappable function call inside your control layer, not a dependency threaded through every layer. In our cohort the LLM lives behind a Python Protocol: a small interface the service layer depends on, so nothing downstream knows or cares whether the call goes to OpenAI or Anthropic.

Once the boundary is a Protocol, the decisions people reach for "routing" to solve become wiring instead of rewrites. Picking a cheap fast model for a 12-way classification and saving the expensive one for hard reasoning is a one-line change. Falling back to a second provider when the first is rate-limited is a small factory, not a refactor. Swapping OpenAI for Anthropic, two SDKs that disagree on almost every detail, touches one file because the boundary absorbs the difference.

And it makes the whole pipeline testable. Tests pass a mock that satisfies the Protocol, so you exercise every path without an API call incurring latency or cost.
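A small sketch of what such a boundary can look like with typing.Protocol. Every name here (Classifier, KeywordClassifier, FallbackClassifier, RateLimited, triage) is hypothetical, and RuntimeError stands in for whatever rate-limit exception a real provider SDK raises.

```python
from typing import Protocol


class Classifier(Protocol):
    """The only thing the service layer knows about the model."""

    def tag(self, title: str) -> str: ...


class KeywordClassifier:
    """Test double: no network call, satisfies Classifier structurally."""

    def tag(self, title: str) -> str:
        return "tool" if "uv" in title.lower().split() else "read"


class RateLimited:
    """Simulates a provider that is currently refusing requests."""

    def tag(self, title: str) -> str:
        raise RuntimeError("rate limited")


class FallbackClassifier:
    """Tries the primary implementation, then the secondary one."""

    def __init__(self, primary: Classifier, secondary: Classifier) -> None:
        self.primary = primary
        self.secondary = secondary

    def tag(self, title: str) -> str:
        try:
            return self.primary.tag(title)
        except RuntimeError:  # stand-in for a provider-specific error
            return self.secondary.tag(title)


def triage(classifier: Classifier, titles: list[str]) -> dict[str, str]:
    """Service-layer code: depends on the Protocol, never on an SDK."""
    return {title: classifier.tag(title) for title in titles}


print(triage(FallbackClassifier(RateLimited(), KeywordClassifier()), ["uv is fantastic"]))
```

None of the classes inherit from Classifier; matching the method signature is enough, which is what lets a mock slide in during tests.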

Pattern 3: evaluators and guardrails

The model's output is not the user's output. Between the two sits validation: schema checks, business rules, PII filters, sometimes a second model grading the first one's work.

This is the generator-evaluator split, and alongside HITL it's an important pattern I've found for AI code that has to be right. The generator proposes. The evaluator approves or rejects. When the evaluator rejects, control loops back with feedback, not a stack trace.
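One way to sketch that loop, with plain callables standing in for the two roles. generate_with_review and the demo functions are hypothetical; a real generator would call a model, and a real evaluator might be schema checks, business rules, or a second model.

```python
import re
from typing import Callable, Optional

Generator = Callable[[str, Optional[str]], str]  # (task, feedback) -> draft
Evaluator = Callable[[str], Optional[str]]       # draft -> None if accepted, else feedback


def generate_with_review(
    task: str,
    generate: Generator,
    evaluate: Evaluator,
    max_attempts: int = 3,
) -> str:
    """Loop until the evaluator accepts a draft or the attempt budget runs out."""
    feedback: Optional[str] = None
    for _ in range(max_attempts):
        draft = generate(task, feedback)
        feedback = evaluate(draft)
        if feedback is None:  # approved
            return draft
    raise ValueError(f"no acceptable draft after {max_attempts} attempts: {feedback}")


def demo_generate(task: str, feedback: Optional[str]) -> str:
    # Pretend model: only gets the format right once it is told what was wrong.
    return "12" if feedback is None else "12.00"


def demo_evaluate(draft: str) -> Optional[str]:
    # Business rule: amounts must have exactly two decimal places.
    return None if re.fullmatch(r"\d+\.\d{2}", draft) else "use two decimal places"


print(generate_with_review("extract the total", demo_generate, demo_evaluate))
```

The attempt budget matters: without it a generator that never satisfies the evaluator would loop forever, so the failure is made explicit instead.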

It's also the layer that catches the worst failure mode of multi-step agents. What production AI agents actually require goes deeper on the four questions the control layer answers before any action runs: state, idempotency, audit, rollback.

Pattern 4: structured generation

A raw string from the model is the start of your problems. You can't store it, validate it, or test it well. The fix is to constrain output at the boundary: the model is allowed to speak, but only in shapes your code understands.

Where the typed boundary in Pattern 2 decides where the model sits in your code, structured generation decides what shape it's allowed to emit.

Pydantic plus your model's structured outputs gives you typed data instead of strings, which means the next layer of your control flow becomes ordinary Python.
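To keep the example dependency-free, the sketch below swaps Pydantic for a stdlib dataclass with hand-written checks; the idea is the same, only the validation is manual. TagSuggestion, ALLOWED_TAGS and parse_suggestion are hypothetical names, and the raw strings imitate what a model might return.

```python
import json
from dataclasses import dataclass

ALLOWED_TAGS = {"read", "tool", "lib", "skip"}  # hypothetical tag set


@dataclass(frozen=True)
class TagSuggestion:
    tag: str
    confidence: float


def parse_suggestion(raw: str) -> TagSuggestion:
    """Turn the model's raw text into typed data, or raise ValueError."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"not JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    tag = data.get("tag")
    confidence = data.get("confidence")
    if not isinstance(tag, str) or tag not in ALLOWED_TAGS:
        raise ValueError(f"unknown tag: {tag!r}")
    # bool is a subclass of int, so rule it out explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError("confidence must be a number")
    if not 0.0 <= confidence <= 1.0:
        raise ValueError("confidence must be between 0 and 1")
    return TagSuggestion(tag=tag, confidence=float(confidence))


suggestion = parse_suggestion('{"tag": "read", "confidence": 0.9}')
print(suggestion.tag, suggestion.confidence)
```

Everything after parse_suggestion works with attributes on a typed object; a chatty or malformed reply fails at the boundary instead of leaking downstream.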

I covered this in Build the data layer before you touch the LLM, explaining why we teach students to build the schema before they make a single API call.


The frontier models make the headlines. The control layer ships the product. Gary's tweet names a gap that has been there the whole time, between the people optimizing benchmarks and the people building products. The control layer is the product, not the model. If you want to build AI products, that's where you need to spend your time.

If you want a working walkthrough of the patterns above, the 10 small agentic AI exercises Juanjo and I shipped run in the browser and cover the arc from a 3-line model call to a complete loop with HITL. They're the conceptual map.

The cohort is the same map, end to end. Six weeks, no frameworks, the control layer built explicitly, with code review at every step. By the end you can answer that one question: you know what your agent is going to do.

May 30, 2026 12:00 AM UTC

May 29, 2026


Real Python

The Real Python Podcast – Episode #297: Improving Python Through PEPs and Protocols

Have you ever been confused by the naming of modules you're importing from a package? Is there a standard way to organize and name your Python virtual environments? This week on the show, Brett Cannon returns to discuss the Python Enhancement Proposals (PEPs) he's been working on recently.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 29, 2026 12:00 PM UTC

Quiz: Python's assert: Debug and Test Your Code Like a Pro

In this quiz, you’ll test your understanding of Python’s assert: Debug and Test Your Code Like a Pro.

By working through this quiz, you’ll revisit how assertions help you debug, test, and document your code, when to disable them in production, and which common pitfalls to avoid.



May 29, 2026 12:00 PM UTC


Ned Batchelder

Snake way for ducklings

This is the mascot for Boston Python. It’s called Snake Way for Ducklings:

A cute snake with a napkin around its neck, with eight duckling-shaped bumps along its body

My son Ben drew it, which makes me very happy. He also drew Sleepy Snake. Wearing this image on a shirt around PyCon, I had to explain it a number of times. People in Boston understand it almost immediately, but others need more background.

In 1941, Robert McCloskey wrote a children’s book called Make Way for Ducklings. It’s a classic, selling millions of copies and never going out of print. We read it to our own children many times as they grew up.

The book is the story of Mrs. Mallard making her way through Boston guiding her eight ducklings (Jack, Kack, Lack, Mack, Nack, Oack, Pack, and Quack) to a pond in Boston’s Public Garden. It has charming pencil illustrations:

A page spread from Make Way for Ducklings showing a policeman stopping traffic to let the ducks cross the street

The book led to a sculpture in the Public Garden near the actual pond:

The bronze sculpture of the ducks in the Public Garden

The sculpture is sized and placed for kids to play on, and is widely known and beloved in Boston. The ducks are dressed in costumes for all kinds of occasions: holidays, sports events, even Star Wars day. On Mother’s Day, there’s a duckling parade: families bring their children dressed as ducklings. In Boston, the ducklings are a big deal.

And it’s not just fiction.

So it seemed natural to Ben to riff on the ducklings for Boston Python. One observer thought a snake eating the ducklings seemed kind of dark, but you can see the ducklings are still quacking, so they are fine!


BTW, Boston also has Duck Boat tours, but that’s completely different.

May 29, 2026 11:29 AM UTC


PyCon Ireland

Call for Proposals Now Open

We’re excited to announce that the Call for Proposals (CFP) for PyCon Ireland 2026 is now open!

We Want to Hear From You

Whether you’re a first-time speaker or an experienced presenter, we’d love to hear your Python story. We welcome proposals on a wide range of topics, including:

Talk Formats

How to Submit

Visit our proposal submission page to submit your talk. The deadline is 30 August 2026.

Don’t hesitate to reach out at contact@python.ie if you have any questions about your proposal.

May 29, 2026 12:00 AM UTC


Seth Michael Larson

How much “Super Mario” per year?

It's impossible to objectively quantify art, but we try anyway. For example: Is “Super Mario” a good video-game franchise?

Looking at review scores, Super Mario includes some of the most universally-acclaimed games ever published: Galaxy, Galaxy 2, and Odyssey are respectively the #4, #5, and #13 highest ranking video-games of all time on Metacritic, all with 97 overall. Chances seem good?

What if we tried quantifying art in a different and slightly more reductive way? This blog post introduces and calculates a new unit: “Super Mario per year”. If you enjoy this franchise like I do then this unit is of particular importance to you.

Calculating “Super Mario per year”

There have been ~19 titles (and two add-ons) published in what I consider the "main-line" Super Mario games, both 2D and 3D. Below is a table with every title, the year it was published, and the approximate duration to play. This last column is the most subjective, because there are speed-runners, casual players, and completionists. If you think any value is way off, send me an email.

Game | 2D/3D | Platform | Year | Time to Beat
Super Mario Bros. | 2D | NES | 1985 | 5 hours
Super Mario Bros. Lost Levels | 2D | NES | 1986 | 10 hours
Super Mario Bros. 2 | 2D | NES | 1988 | 5 hours
Super Mario Bros. 3 | 2D | NES | 1988 | 5 hours
Super Mario Land | 2D | GB | 1989 | 5 hours
Super Mario World | 2D | SNES | 1990 | 10 hours
Super Mario Land 2 | 2D | GB | 1992 | 10 hours
Super Mario 64 | 3D | N64 | 1996 | 15 hours
Super Mario Sunshine | 3D | GC | 2002 | 20 hours
New Super Mario Bros. | 2D | DS | 2006 | 10 hours
Super Mario Galaxy | 3D | Wii | 2007 | 15 hours
New Super Mario Bros. Wii | 2D | Wii | 2009 | 5 hours
Super Mario Galaxy 2 | 3D | Wii | 2010 | 15 hours
Super Mario 3D Land | 3D | 3DS | 2011 | 15 hours
New Super Mario Bros. 2 | 2D | 3DS | 2012 | 10 hours
New Super Mario Bros. U | 2D | Wii U | 2012 | 15 hours
Super Mario 3D World | 3D | Wii U | 2013 | 20 hours
Super Mario Odyssey | 3D | Switch | 2017 | 25 hours
Bowser's Fury (Super Mario 3D World) | 3D | Switch | 2021 | 5 hours
Super Mario Bros. Wonder | 2D | Switch | 2023 | 15 hours
Meetup at Bellabel Park (Super Mario Bros. Wonder) | 2D | Switch | 2026 | 5 hours

Using the table above we can calculate approximately how much new Super Mario gameplay is published on average per year.

Year | All-Time Avg (hours/year) | 10-Year Avg (10YA) | 2D (10YA) | 3D (10YA)
1985 | 5.0 | 5.0 | 5.0 | 0.0
1986 | 7.5 | 7.5 | 7.5 | 0.0
1987 | 5.0 | 5.0 | 5.0 | 0.0
1988 | 6.2 | 6.2 | 6.2 | 0.0
1989 | 6.0 | 6.0 | 6.0 | 0.0
1990 | 6.7 | 6.7 | 6.7 | 0.0
1991 | 5.7 | 5.7 | 5.7 | 0.0
1992 | 6.2 | 6.2 | 6.2 | 0.0
1993 | 5.6 | 5.6 | 5.6 | 0.0
1994 | 5.0 | 5.0 | 5.0 | 0.0
1995 | 4.5 | 5.0 | 5.0 | 0.0
1996 | 5.4 | 6.0 | 4.5 | 1.5
1997 | 5.0 | 5.0 | 3.5 | 1.5
1998 | 4.6 | 5.0 | 3.5 | 1.5
1999 | 4.3 | 4.0 | 2.5 | 1.5
2000 | 4.1 | 3.5 | 2.0 | 1.5
2001 | 3.8 | 2.5 | 1.0 | 1.5
2002 | 4.7 | 4.5 | 1.0 | 3.5
2003 | 4.5 | 3.5 | 0.0 | 3.5
2004 | 4.2 | 3.5 | 0.0 | 3.5
2005 | 4.0 | 3.5 | 0.0 | 3.5
2006 | 4.3 | 4.5 | 1.0 | 3.5
2007 | 4.8 | 4.5 | 1.0 | 3.5
2008 | 4.6 | 4.5 | 1.0 | 3.5
2009 | 4.6 | 5.0 | 1.5 | 3.5
2010 | 5.0 | 6.5 | 1.5 | 5.0
2011 | 5.4 | 8.0 | 1.5 | 6.5
2012 | 6.1 | 10.5 | 4.0 | 6.5
2013 | 6.6 | 10.5 | 4.0 | 6.5
2014 | 6.3 | 10.5 | 4.0 | 6.5
2015 | 6.1 | 10.5 | 4.0 | 6.5
2016 | 5.9 | 10.5 | 4.0 | 6.5
2017 | 6.5 | 12.0 | 3.0 | 9.0
2018 | 6.3 | 10.5 | 3.0 | 7.5
2019 | 6.1 | 10.5 | 3.0 | 7.5
2020 | 6.0 | 10.0 | 2.5 | 7.5
2021 | 5.9 | 9.0 | 2.5 | 6.5
2022 | 5.8 | 7.5 | 2.5 | 5.0
2023 | 6.0 | 6.5 | 1.5 | 5.0
2024 | 5.9 | 4.5 | 1.5 | 3.0
2025 | 5.7 | 4.5 | 1.5 | 3.0
2026 | 5.7 | 5.0 | 2.0 | 3.0

This table will help you calculate approximately how much Super Mario is coming in the next decade. The current 10-year window pace shows 5 hours of Super Mario per year.
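For anyone who wants to recompute the averages, here is a sketch that works from the release table. The window rule is inferred from the published numbers rather than stated in the post: the figures are reproduced when the "10-year" sum covers the years from year - 10 through year inclusive and is divided by 10 (or by the number of years since 1985 while fewer than ten have passed). Function names are hypothetical.

```python
FIRST_YEAR = 1985

# (year, hours to beat, dimension), copied from the release table above.
RELEASES = [
    (1985, 5, "2D"), (1986, 10, "2D"), (1988, 5, "2D"), (1988, 5, "2D"),
    (1989, 5, "2D"), (1990, 10, "2D"), (1992, 10, "2D"), (1996, 15, "3D"),
    (2002, 20, "3D"), (2006, 10, "2D"), (2007, 15, "3D"), (2009, 5, "2D"),
    (2010, 15, "3D"), (2011, 15, "3D"), (2012, 10, "2D"), (2012, 15, "2D"),
    (2013, 20, "3D"), (2017, 25, "3D"), (2021, 5, "3D"), (2023, 15, "2D"),
    (2026, 5, "2D"),
]


def hours_between(start: int, end: int, dimension=None) -> int:
    """Sum hours for releases in [start, end], optionally only 2D or only 3D."""
    return sum(
        hours
        for year, hours, dim in RELEASES
        if start <= year <= end and dimension in (None, dim)
    )


def all_time_avg(year: int) -> float:
    """Hours published since 1985 divided by the years elapsed, inclusive."""
    return hours_between(FIRST_YEAR, year) / (year - FIRST_YEAR + 1)


def ten_year_avg(year: int, dimension=None) -> float:
    """Inferred rule: sum over year-10..year, divided by at most 10."""
    years_elapsed = year - FIRST_YEAR + 1
    return hours_between(year - 10, year, dimension) / min(10, years_elapsed)


print(round(all_time_avg(2026), 1), ten_year_avg(2026))
```

For 2026 the window holds Super Mario Odyssey, Bowser's Fury, Super Mario Bros. Wonder and Meetup at Bellabel Park: 25 + 5 + 15 + 5 = 50 hours, which gives the 5 hours per year quoted above.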

Looking at the trends, it looks like we may have already passed peak 2D and 3D Mario individually. This table also shows how overdue we are for a new big 3D Super Mario title, the last entry being Super Mario Odyssey almost a decade ago in 2017.

If I were to somewhat morbidly apply these numbers I can estimate how much more new “Super Mario” gameplay I’m likely to experience. Let’s be optimistic and apply the “All-Time Average” instead of the “10-Year Average”: the resulting number is 256 hours. Around 10 games of similar size to “Super Mario Odyssey”... seems good to me!

Super Mario Blogroll

If you want to read more Super Mario writing here are a few personal selections from my blogroll:

Happy gaming!



Thanks for keeping RSS alive! ♥ What to do next? Share your thoughts with me on Mastodon, Bluesky, or email. I try to reply to everyone! Browse the blog archive. Check out my blogroll. Or maybe go outside (best option)?



May 29, 2026 12:00 AM UTC