skip to navigation
skip to content

Planet Python

Last update: May 30, 2026 04:43 PM UTC

May 30, 2026


death and gravity

DynamoDB crash course: part 3 – design patterns

This is the last part of a series covering core DynamoDB concepts. The goal is to help you understand idiomatic usage and trade-offs in under an hour.

In the first part, I summarized DynamoDB's main proposition to its users like so:

data modeling complexity is always preferable to complexity coming from infrastructure maintenance, availability, and scalability

Today, we're looking at the design patterns that help manage this complexity, make the most of DynamoDB's data model and features, and work around its limits.

Contents

Composite keys #

Composite (aka synthetic) keys underpin most other patterns.

The idea is simple: keys don't have to be natural attributes of your data, they can be composed of other attributes that enable specific access patterns. This works both with table and index keys.

How do you compose keys? By string concatenation, of course – I did say low level! (Careful with numbers though, they need padding to be useful in sort keys.)

Example

To sort lexicographically by more than one attribute, you group them in a sort key, e.g. {Album}#{Song}.

Or, in single table design, you distinguish between item types by prefixing keys with the type, e.g. album#{Album}.

Or, in partition key sharding, you spread the load on a GSI partition by splitting one partition key into multiple ones, e.g. {Genre}#{shard}.

But denormalization has its trade-offs. For sort key {Album}#{Song}, should Album and Song also be separate attributes? If yes, you need to ensure they never change, but you can use them in indexes (e.g. a GSI with Album as primary key). If no, the item cannot become inconsistent, but you always need to parse the key.

This was inconvenient enough that DynamoDB finally added multi-attribute keys support to GSIs in 2025 (although not inconvenient enough to also add it to tables).

See also

Single table design #

The AWS guidance guidance is to use as few tables as possible:

As a general rule, you should maintain as few tables as possible in a DynamoDB application. [...] A single table with inverted indexes can usually enable simple queries to create and retrieve the complex hierarchical data structures required by your application.

This culminates in single table design, where you put all entities in the same table, and tell them apart based on the key format, usually using a prefix. With this pattern, one DynamoDB table corresponds to a whole relational database.

The easiest way is to put items related to a top-level entity on the same partition. The main benefit is that joins with the top-level entity become trivial. A second one is that you can sometimes get different entity types in a single query, which can be both faster and cheaper (fewer queries; small items pack into fewer capacity units).

Example

You can group items related to an Artist on the same partition, with sort keys like artist, album#{Album}, and song#{Album}#{Song}.

# table Music (partition key: Artist, sort key: sk)
Solar Fields: !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
  'song#Leaving Home#Air Song': { Duration: 741 }
  'song#Leaving Home#Monogram': { Duration: 944 }

Besides getting items of a single type, you can also get artist details and albums in a single query (sk BETWEEN "album#" AND "artist").

But choose wisely – queries can have only one sort key condition, so you can't also get album details and songs in a single query with this schema; sort keys {Album} and {Album}#{Song} would do it, at the expense of the first query.

Sometimes, it can be useful to put some sub-entities on dedicated partitions, accepting that joins will have to be done in code.

Example

In the example above, a popular artist with lots of songs can lead to:

Perhaps it's better to put songs in each album on separate partitions:

# table Music (partition key: pk, sort key: sk)
'artist#Solar Fields': !btree
  'album#Leaving Home': { Genre: Electronic }
  'artist': { Variations: [ Solarfields ] }
'song#Solar Fields#Leaving Home': !btree
  'Air Song': { Duration: 741 }
  'Monogram': { Duration: 944 }

This spreads the load onto multiple partitions, which should fix throttling.

The downside is that list songs for artist is now a two-step operation: first one query for the albums, then one query per album for the songs. The upside is that the per-album queries can be done in parallel, which wasn't possible before.

A consequence of this design is that you need a GSI to list items of a specific type (otherwise, you have to do a full table scan). Of note, exceeding the GSI partition throughput limit will cause write throttling on the base table; in the absence of a natural high-cardinality GSI partition key, sharding or some other composite key can help.

A final benefit of using a single table is better utilization with provisioned mode: usage gets averaged across entities and tends to be smoother, and spikes can share the same spare capacity.

See also

GSI overloading #

GSI overloading is just single table design for indexes – you put different values in the GSI key attributes, depending on item type. This way you can index more attributes than the 20 GSIs per table quota, and it can be cheaper too, since fewer indexes make better use of spare provisioned capacity.

Example

For a table that contains both artist and album items, a single GSI can be used for entirely different purposes:

# table Music (partition key: Artist, sort key: sk)
2 Bit Pie: !btree
  'album#2 Pie Island': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#United Kingdom' }
Ishome: !btree
  'album#Confession': { gsi1pk: 'album#Electronic' }
  'artist': { gsi1pk: 'artist#Russia' }
# GSI GSI1 (partition key: gsi1pk, sort key: Artist)
'artist#United Kingdom': !btree
  2 Bit Pie: { sk: 'artist' }
'artist#Russia': !btree
  Ishome: { sk: 'artist' }
'album#Electronic': !btree
  2 Bit Pie: { sk: 'album#2 Pie Island' }
  Ishome: { sk: 'album#Confession' }

See also

Partition key sharding #

Sometimes, a partition key composed of multiple natural attributes is not enough to spread the load evenly across partitions; you can deal with this by putting items with the same natural attributes on multiple partitions.

So, what partition key should you use? One option is to use a random suffix from a known range; this still allows you to list items for a natural attribute value by doing multiple queries, one for each suffix.

Example

For a table of songs, using Album as the partition key won't work, since not all songs are released on an album; Artist always has a value, but some artists have hundreds or even thousands of songs, which can lead to throttling.

Instead, we can use {Artist}#{randrange(10)} as partition key, which allows ten times as many items before we reach throughput limits. To list an artist's songs:

for shard in range(10):
    for item in dynamodb.query(f"{artist}#{shard}"):
        yield item

A downside of random suffixes is that you can't get a specific item, because you don't know what its suffix is. A better option is to calculate the suffix from an attribute that you do know, for example using its hash modulo N.

Example

With primary key {Artist}#{hash(Song) % 10)}, we can get a song like this:

def hash(s):
    return int.from_bytes(sha256(s.encode()).digest())

shard = hash(song_title) % 10
dynamodb.get_item(f"{artist}#{shard}", song_title)

A lot of times you need to list items by a low-cardinality attribute, so sharding may be even more important for GSIs.

Example

Assuming dedicated album items, you can list all the albums by putting them in a single GSI partition key called albums, but this will definitely cause throttling.

To avoid it, you can use GSI partition key album#{hash(Album} % 100} if you don't care about the order, or something like album#{Album[:2].lower()} if you do (but likely more sophistication is needed – th will be a very common album title prefix, and some album titles don't contain letters at all).

Even if throttling is not an issue (e.g. single infrequent reader), sharding allows you to query multiple partitions in parallel, which can speed up getting the entire result set.


So, how many shards should you have? That depends on the number, size, and how often you access the items, and is also a trade-off – too many shards means additional queries and latency, too few shards means you still overload the partitions sometimes.

Importantly, increasing the number of shards is non-trivial. For tables, you usually need to rebalance the items in place. For indexes, it's cleaner to move to a new index, or if you just need to list items by type, you can put all new items on new shards.

Regardless, you have to support it in code, do a backfill, and orchestrate the migration, which all become more complex if downtime and inconsistencies are not acceptable (e.g. if you expose a pagination token based on LastEvaluatedKey, you may want to support both versions during the switch).

See also

Sparse indexes #

An item with missing index partition/sort key attributes won't appear in the index, and you won't pay for it. This can be used deliberately to query a subset of the items in the table, like those of a specific type or in a specific state.

Example

Assuming dedicated album items, an alternative way to list all the albums is to have a GSI with {Album} as partition key, and just scan the entire index (the primary key has to be a dedicated attribute that only albums have, so that only album items appear in the index).

Or, you can use a dedicated GSI with CoverOf as primary key to list cover songs.

See also

Base table indexes #

In some cases, GSIs won't cut it – maybe you need a strongly consistent index, or need to model a many-to-one relationship (indexes map one item in the base table to one item in the index).

Instead, you can maintain an index in the base table by having additional index items associated with the main item; to guarantee atomic updates, use transactions. You then go from the main item to the index items via a main item attribute, and from the index items to the main item via their partition key.

Example

Songs have different identifiers in external systems, such as ISRC, ISWC, or MBID. To query songs by multiple external ids, you'd structure your database like this:

(Alternatively, you could have one sparse index per external id type, but then you lose strong consistency, and risk running out of GSIs).

Note that modeling one-to-many relationships isn't this involved, since it fits neatly into the related-items-same-partition variant of single table design.

See also

Optimistic locking #

Optimistic locking is a concurrency control method useful when conflicts are rare, so instead of acquiring a lock to do changes, you check if someone else changed the data right before commiting, as part of an atomic operation.

In DynamoDB, that operation is a conditional write; items get an integer version attribute, and every time you want to update an item, you:

  1. read the item, including the version
  2. increment the version and modify the item
  3. update the item, using a condition expression to ensure the version matches
    1. if successful, you're done
    2. else, start over from the beginning

You can do also this in transactions to update groups of related items, like in the base table index pattern above, with only the main item needing a version.

The upside of optimistic locking is that it is faster on average, since updates usually succeed on the first try; for fewer conflicts, use strongly consistent reads.

The downside is that it requires explicit support – it must be possible to start over from the beginning, which complicates logic, especially if you need to interact with other systems besides updating the item (e.g. to send a notification).

See also


Anyway, that's it for now.

See also

For mode details and examples, check out the official documentation:

Learned something new today? Share it with others, it really helps!

Want to know when new articles come out? Subscribe here to get new stuff straight to your inbox!

May 30, 2026 04:43 PM UTC


Talk Python to Me

#550: AI Contributions and Maintainer Load in Open Source

You wake up, brew the coffee, open GitHub, and there it is. Another pull request on your open source project. Thirteen thousand lines added. No issue filed first. No discussion. Just "here, please review this for me." <br/> <br/> Over the past year, GitHub activity has spiked roughly twelve times in a few short months, and a huge chunk of that signal is landing on the same small group of maintainers who were already stretched thin. The curl bug bounty got buried under AI-generated noise. Jazzband, the home of Django classics like pip-tools and the Django debug toolbar, hit what its maintainer called an "apocalypse" and started sunsetting. Even CPython just shipped fresh guidelines on AI-assisted contributions this week. <br/> <br/> So what does all of this actually look like from the receiving end of the pull request? <br/> <br/> On this episode, Paolo Melchiorre joins us to tell that story from inside the maintainer's chair. Paolo is a director of the Django Software Foundation, an organizer of PyCon Italy, a Django Girls coach, and he has spent the past year carefully collecting examples of how AI is reshaping open source contributions. The good, the bad, and the extra fingers. <br/> <br/> We dig into his PyCon US talk on AI-assisted contributions and maintainer load, why AI is best understood as an amplifier rather than a new kind of contributor, the wildly different policies across 86 open source foundations, whether projects banning AI today are reacting to last year's models.<br/> <br/> <strong>Episode sponsors</strong><br/> <br/> <a href='https://talkpython.fm/agentfield-page'>AgentField AI</a><br> <a href='https://talkpython.fm/training'>Talk Python Courses</a><br/> <br/> <h2 class="links-heading mb-4">Links from the show</h2> <div><strong>Guest</strong><br/> <strong>Paolo Melchiorre</strong>: <a href="https://github.com/pauloxnet?featured_on=talkpython" target="_blank" >github.com</a><br/> <br/> <strong>DSF</strong>: <a href="https://www.djangoproject.com/foundation/?featured_on=talkpython" target="_blank" >www.djangoproject.com</a><br/> <strong>djangonaut-space</strong>: <a href="https://djangonaut.space/?featured_on=talkpython" target="_blank" >djangonaut.space</a><br/> <strong>PyCon Italia</strong>: <a href="https://2026.pycon.it/en?featured_on=talkpython" target="_blank" >2026.pycon.it</a><br/> <strong>uDjango</strong>: <a href="https://github.com/pauloxnet/uDjango?featured_on=talkpython" target="_blank" >github.com</a><br/> <strong>My PyCon US 2026 post</strong>: <a href="https://www.paulox.net/2026/05/21/my-pycon-us-2026/?featured_on=talkpython" target="_blank" >www.paulox.net</a><br/> <strong>AI-Assisted Contributions and Maintainer Load</strong>: <a href="https://www.paulox.net/2026/05/15/pycon-us-2026/?featured_on=talkpython" target="_blank" >www.paulox.net</a><br/> <strong>Senior Engineer Tries Vibe Coding</strong>: <a href="https://www.youtube.com/watch?v=_2C2CNmK7dQ" target="_blank" >www.youtube.com</a><br/> <strong>Code Rabbit AI PR Reviews</strong>: <a href="https://www.coderabbit.ai?featured_on=talkpython" target="_blank" >www.coderabbit.ai</a><br/> <strong>GitHub Usage Graphs</strong>: <a href="https://github.blog/news-insights/company-news/an-update-on-github-availability/?featured_on=talkpython" target="_blank" >github.blog</a><br/> <strong>Update on CPython's AI Policies</strong>: <a href="https://fosstodon.org/@mariatta/116610508567734365" target="_blank" >fosstodon.org</a><br/> <strong>High-Quality Chaos from Curl</strong>: <a href="https://daniel.haxx.se/blog/2026/04/22/high-quality-chaos/?featured_on=talkpython" target="_blank" >daniel.haxx.se</a><br/> <strong>The Generative AI Policy Landscape in Open Source</strong>: <a href="https://redmonk.com/kholterhoff/2026/02/26/generative-ai-policy-landscape-in-open-source/?featured_on=pythonbytes" target="_blank" >redmonk.com</a><br/> <br/> <strong>Watch this episode on YouTube</strong>: <a href="https://www.youtube.com/watch?v=1RJ1kkpTdow" target="_blank" >youtube.com</a><br/> <strong>Episode #550 deep-dive</strong>: <a href="https://talkpython.fm/episodes/show/550/ai-contributions-and-maintainer-load-in-open-source#takeaways-anchor" target="_blank" >talkpython.fm/550</a><br/> <strong>Episode transcripts</strong>: <a href="https://talkpython.fm/episodes/transcript/550/ai-contributions-and-maintainer-load-in-open-source" target="_blank" >talkpython.fm</a><br/> <br/> <strong>Theme Song: Developer Rap</strong><br/> <strong>🥁 Served in a Flask 🎸</strong>: <a href="https://talkpython.fm/flasksong" target="_blank" >talkpython.fm/flasksong</a><br/> <br/> <strong>---== Don't be a stranger ==---</strong><br/> <strong>YouTube</strong>: <a href="https://talkpython.fm/youtube" target="_blank" ><i class="fa-brands fa-youtube"></i> youtube.com/@talkpython</a><br/> <br/> <strong>Bluesky</strong>: <a href="https://bsky.app/profile/talkpython.fm" target="_blank" >@talkpython.fm</a><br/> <strong>Mastodon</strong>: <a href="https://fosstodon.org/web/@talkpython" target="_blank" ><i class="fa-brands fa-mastodon"></i> @talkpython@fosstodon.org</a><br/> <strong>X.com</strong>: <a href="https://x.com/talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @talkpython</a><br/> <br/> <strong>Michael on Bluesky</strong>: <a href="https://bsky.app/profile/mkennedy.codes?featured_on=talkpython" target="_blank" >@mkennedy.codes</a><br/> <strong>Michael on Mastodon</strong>: <a href="https://fosstodon.org/web/@mkennedy" target="_blank" ><i class="fa-brands fa-mastodon"></i> @mkennedy@fosstodon.org</a><br/> <strong>Michael on X.com</strong>: <a href="https://x.com/mkennedy?featured_on=talkpython" target="_blank" ><i class="fa-brands fa-twitter"></i> @mkennedy</a><br/></div>

May 30, 2026 03:43 PM UTC


Bob Belderbos

The control layer is the product, not the model

Gary Bernhardt posted something this week that names a phenomenon we're teaching in our agentic AI cohort:

Everyone seems fixated on the models, but I think there's so much low-hanging fruit in the control layer above the model. "Agent" and "harness" sell that layer short. There's so much more that we can do beyond "read input, send to model, run commands it returns."

He's right. The model is a brain in a jar. Useful, fast, occasionally wrong, stateless. Everything that turns it into a product lives in the code that wraps it: the routing, the validation, the state, the audit trail. Gary calls that the control layer. I'm stealing the term.

One of the replies under the tweet nailed the design goal in a single question: do you actually know what the agent is going to do?

That's what a control layer buys you. Not magic, not autonomy, predictability. A workflow where, by the time the model is called, the next move is already constrained to something safe.

Why "agent" and "harness" sell it short

When a developer says "I'm building an agent", they usually mean a while True loop that pings an LLM, parses a tool call, runs it, feeds the result back, and repeats. That pattern works for demos. It rarely survives contact with a real workflow.

The word "harness" makes the wrapping code sound passive, a strap that holds the model in place. It's actually the control layer where the engineering happens. The model is a function call inside it. Once you flip that mental model, you stop asking "which LLM should I use" and start asking "what guarantees does my control layer make?" and "how can I make the inherently unpredictable model fit into a predictable workflow?"

These are the questions production teams have to answer.

Pattern 1: deterministic state machines, not unconstrained agents

An agent without constraints decides what to do next from inside the model. A state machine decides outside the model and gives the model one bounded job at each step. The pipeline runs categorize → validate → confirm → persist, and the LLM only ever gets called inside one of those buckets.

This shifts control flow back to your code, where you can test it, log it, and reason about it. The expense agent we build in our cohort, which I broke down in How an AI expense agent is actually structured, follows exactly this pattern: Protocol-defined LLM boundary, Pydantic-validated outputs, service layer holds the state, human-in-the-loop (HITL) confirms before anything writes. Four layers, no free-roaming agent, constraints at every step.

Pattern 2: the model behind a typed boundary

The model should be one swappable function call inside your control layer, not a dependency threaded through every layer. In our cohort the LLM lives behind a Python Protocol: a small interface the service layer depends on, so nothing downstream knows or cares whether the call goes to OpenAI or Anthropic.

Once the boundary is a Protocol, the decisions people reach for "routing" to solve become wiring instead of rewrites. Picking a cheap fast model for a 12-way classification and saving the expensive one for hard reasoning is a one-line change. Falling back to a second provider when the first is rate-limited is a small factory, not a refactor. Swapping OpenAI for Anthropic, two SDKs that disagree on almost every detail, touches one file because the boundary absorbs the difference.

And it makes the whole pipeline testable. Tests pass a mock that satisfies the Protocol, so you exercise every path without an API call incurring latency or cost.

Pattern 3: evaluators and guardrails

The model's output is not the user's output. Between the two sits validation: schema checks, business rules, PII filters, sometimes a second model grading the first one's work.

This is the generator-evaluator split and it's an important pattern (apart from HITL) I've found for AI code that has to be right. The generator proposes. The evaluator approves or rejects. When the evaluator rejects, control loops back with feedback, not a stack trace.

It's also the layer that catches the worst failure mode of multi-step agents. What production AI agents actually require goes deeper on the four questions the control layer answers before any action runs: state, idempotency, audit, rollback.

Pattern 4: structured generation

A raw string from the model is the start of your problems. You can't store it, validate it, or test it well. The fix is to constrain output at the boundary: the model is allowed to speak, but only in shapes your code understands.

Where the typed boundary in Pattern 2 decides where the model sits in your code, structured generation decides what shape it's allowed to emit.

Pydantic plus your model's structured outputs gives you typed data instead of strings, which means the next layer of your control flow becomes ordinary Python.

I covered this in Build the data layer before you touch the LLM, explaining why we teach students to build the schema before they make a single API call.


The frontier models make the headlines. The control layer ships the product. Gary's tweet names a gap that has been there the whole time, between the people optimizing benchmarks and the people building products. The control layer is the product, not the model. If you want to build AI products, that's where you need to spend your time.

If you want a working walkthrough of the patterns above, the 10 small agentic AI exercises Juanjo and I shipped, run in the browser and cover the arc from a 3-line model call to a complete loop with HITL. They're the conceptual map.

The cohort is the same map, end to end. Six weeks, no frameworks, the control layer built explicitly, with code review at every step. By the end you can answer that one question: you know what your agent is going to do.

May 30, 2026 12:00 AM UTC

May 29, 2026


Real Python

The Real Python Podcast – Episode #297: Improving Python Through PEPs and Protocols

Have you ever been confused by the naming of modules you're importing from a package? Is there a standard way to organize and name your Python virtual environments? This week on the show, Brett Cannon returns to discuss the Python Enhancement Proposals (PEPs) he's been working on recently.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 29, 2026 12:00 PM UTC

Quiz: Python's assert: Debug and Test Your Code Like a Pro

In this quiz, you’ll test your understanding of Python’s assert: Debug and Test Your Code Like a Pro.

By working through this quiz, you’ll revisit how assertions help you debug, test, and document your code, when to disable them in production, and which common pitfalls to avoid.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 29, 2026 12:00 PM UTC


Ned Batchelder

Snake way for ducklings

This is the mascot for Boston Python. It’s called Snake Way for Ducklings:

A cute snake with a napkin around its neck, with eight duckling-shaped bumps along its body

My son Ben drew it, which makes me very happy. He also drew Sleepy Snake. Wearing this image on a shirt around PyCon, I had to explain it a number of times. People in Boston understand it almost immediately, but others need more background.

In 1941, Robert McCloskey wrote a children’s book called Make Way for Ducklings. It’s a classic, selling millions of copies and never going out of print. We read it to our own children growing up many times.

The book is the story of Mrs. Mallard making her way through Boston guiding her eight ducklings (Jack, Kack, Lack, Mack, Nack, Oack, Pack, and Quack) to a pond in Boston’s Public Garden. It has charming pencil illustrations:

A page spread from Make Way for Ducklings showing a policeman stopping traffic to let the ducks cross the street

The book led to a sculpture in the Public Garden near the actual pond:

The bronze sculpture of the ducks in the Public Garden

The sculpture is sized and placed for kids to play on, and is widely known and beloved in Boston. The ducks are dressed in costumes for all kinds of occasions: holidays, sports events, even Star Wars day. On Mother’s Day, there’s a duckling parade: families bring their children dressed as ducklings. In Boston, the ducklings are a big deal.

And it’s not just fiction.

So it seemed natural to Ben to riff on the ducklings for Boston Python. One observer thought a snake eating the ducklings seemed kind of dark, but you can see the ducklings are still quacking, so they are fine!

A cute snake with a napkin around its neck, with eight duckling-shaped bumps along its body

BTW, Boston also has Duck Boat tours, but that’s completely different.

May 29, 2026 11:29 AM UTC


Seth Michael Larson

How much “Super Mario” per year?

It's impossible to objectively quantify art, but we try anyway. For example: Is “Super Mario” a good video-game franchise?

Looking at review scores, Super Mario includes some of the most universally-acclaimed games ever published: Galaxy, Galaxy 2, and Odyssey are respectively the #4, #5, and #13 highest ranking video-games of all time on Metacritic, all with 97 overall. Chances seem good?

What if we tried quantifying art in a different and slightly more reductive way? This blog post introduces and calculates a new unit: “Super Mario per year”. If you enjoy this franchise like I do then this unit is of particular importance to you.

Calculating “Super Mario per year”

There have been ~19 titles (and two add-ons) published to what I consider the "main-line" Super Mario games, both 2D and 3D. Below is a table with every title, the year it was published, and the approximate duration to play. This last column is the most subjective, because there’s speed-runners, casual players, completionists. If you think any value is way off, send me an email.

Game 2D/3D Platform Year Time to Beat
Super Mario Bros. 2D NES 1985 5 hours
Super Mario Bros. Lost Levels 2D NES 1986 10 hours
Super Mario Bros. 2 2D NES 1988 5 hours
Super Mario Bros. 3 2D NES 1988 5 hours
Super Mario Land 2D GB 1989 5 hours
Super Mario World 2D SNES 1990 10 hours
Super Mario Land 2 2D GB 1992 10 hours
Super Mario 64 3D N64 1996 15 hours
Super Mario Sunshine 3D GC 2002 20 hours
New Super Mario Bros. 2D DS 2006 10 hours
Super Mario Galaxy 3D Wii 2007 15 hours
New Super Mario Bros. Wii 2D Wii 2009 5 hours
Super Mario Galaxy 2 3D Wii 2010 15 hours
Super Mario 3D Land 3D 3DS 2011 15 hours
New Super Mario Bros. 2 2D 3DS 2012 10 hours
New Super Mario Bros. U 2D Wii U 2012 15 hours
Super Mario 3D World 3D Wii U 2013 20 hours
Super Mario Odyssey 3D Switch 2017 25 hours
Bowser's Fury (Super Mario 3D World) 3D Switch 2021 5 hours
Super Mario Bros. Wonder 2D Switch 2023 15 hours
Meetup at Bellabel Park (Super Mario Bros. Wonder) 2D Switch 2026 5 hours

Using the table above we can calculate approximately how much new Super Mario gameplay is published on average per year.

Year All-Time Avg 10-Year Avg (10YA) 2D (10YA) 3D (10YA)
1985 5.0 5.0 5.0 0.0
1986 7.5 7.5 7.5 0.0
1987 5.0 5.0 5.0 0.0
1988 6.2 6.2 6.2 0.0
1989 6.0 6.0 6.0 0.0
1990 6.7 6.7 6.7 0.0
1991 5.7 5.7 5.7 0.0
1992 6.2 6.2 6.2 0.0
1993 5.6 5.6 5.6 0.0
1994 5.0 5.0 5.0 0.0
1995 4.5 5.0 5.0 0.0
1996 5.4 6.0 4.5 1.5
1997 5.0 5.0 3.5 1.5
1998 4.6 5.0 3.5 1.5
1999 4.3 4.0 2.5 1.5
2000 4.1 3.5 2.0 1.5
2001 3.8 2.5 1.0 1.5
2002 4.7 4.5 1.0 3.5
2003 4.5 3.5 0.0 3.5
2004 4.2 3.5 0.0 3.5
2005 4.0 3.5 0.0 3.5
2006 4.3 4.5 1.0 3.5
2007 4.8 4.5 1.0 3.5
2008 4.6 4.5 1.0 3.5
2009 4.6 5.0 1.5 3.5
2010 5.0 6.5 1.5 5.0
2011 5.4 8.0 1.5 6.5
2012 6.1 10.5 4.0 6.5
2013 6.6 10.5 4.0 6.5
2014 6.3 10.5 4.0 6.5
2015 6.1 10.5 4.0 6.5
2016 5.9 10.5 4.0 6.5
2017 6.5 12.0 3.0 9.0
2018 6.3 10.5 3.0 7.5
2019 6.1 10.5 3.0 7.5
2020 6.0 10.0 2.5 7.5
2021 5.9 9.0 2.5 6.5
2022 5.8 7.5 2.5 5.0
2023 6.0 6.5 1.5 5.0
2024 5.9 4.5 1.5 3.0
2025 5.7 4.5 1.5 3.0
2026 5.7 5.0 2.0 3.0

This table will help you calculate approximately how much Super Mario is coming in the next decade. The current 10-year window pace shows 5 hours of Super Mario per year.

Looking at the trends, it looks like we may have already passed peak 2D and 3D Mario individually. This table also shows how overdue we are for a new big 3D Super Mario title, the last entry being Super Mario Odyssey almost a decade ago in 2017.

If I were to somewhat morbidly apply these numbers I can estimate how much more new “Super Mario” gameplay I’m likely to experience. Let’s be optimistic and apply the “All-Time Average” instead of the “10-Year Average”: the resulting number is 256 hours. Around 10 games of similar size to “Super Mario Odyssey”... seems good to me!

Super Mario Blogroll

If you want to read more Super Mario writing here are a few personal selections from my blogroll:

Happy gaming!



Thanks for keeping RSS alive! ♥

May 29, 2026 12:00 AM UTC

May 28, 2026


Kay Hayen

Nuitka Release 4.1

This is to inform you about the new stable release of Nuitka. It is the extremely compatible Python compiler, “download now”.

This release adds many new features and corrections with a focus on async code compatibility, missing generics features, and Python 3.14 compatibility and Python compilation scalability yet again.

Bug Fixes

Package Support

New Features

Optimization

Anti-Bloat

Organizational

Tests

Cleanups

Summary

This release builds on the scalability improvements established in 4.0, with enhanced Python 3.14 support, expanded package compatibility, and significant optimization work.

The --project option seems usable now.

Python 3.14 support remains experimental, but only barely made the cut, and probably will get there in hotfixes. Some of the corrections came in so late before the release, that it was just not possible to feel good about declaring it fully supported just yet.

May 28, 2026 10:00 PM UTC


Real Python

Quiz: BNF Notation: Dive Deeper Into Python's Grammar

In this quiz, you’ll test your understanding of BNF Notation: Dive Deeper Into Python’s Grammar.

By working through this quiz, you’ll revisit how to read Python’s grammar rules, recognize terminals and nonterminals, and interpret the BNF fragments that appear throughout the official documentation.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 28, 2026 12:00 PM UTC


Bob Belderbos

Two Python Scoping Bugs: A Lesson in Object Lifetimes

Your app works fine with one user. You open a second browser tab and the data is wrong. Your tests pass individually but fail when run together. The culprit: a global object created at module scope.

How it starts

I see this a lot in Python web projects:

# database.py
from sqlmodel import create_engine, Session

engine = create_engine("sqlite:///database.db")

def get_session():
    with Session(engine) as session:
        yield session

This 'innocent' engine is created the moment database.py is first imported. Every module that imports from database shares the same engine, the same connection pool, the same database file. For a simple script, this is fine. For a multi-module app, it creates hidden coupling and shared state.

The test isolation problem

I hit this recently in a FastAPI app:

# test_app.py
from myapp.database import engine, create_db_and_tables, clear_db_and_tables

@pytest.fixture(autouse=True)
def setup_database():
    clear_db_and_tables()
    create_db_and_tables()

def test_create_race(race_events):
    championship = create_races(2026, race_events)
    assert championship.id == 1  # Passes alone, fails in suite

That assert championship.id == 1 works when the test runs first. Run it after another test that inserts data, and the auto-increment ID comes back as 2. The fixture does its job, but state still leaks between tests in subtle ways: connection pool state, cached metadata, and SQLite's own bookkeeping on the shared file can carry over even with drop/recreate cycles.

The root cause is upstream: every test reaches for the same module-level engine pointed at the same on-disk database. If you want true isolation, the engine itself has to be per-test, not the cleanup ritual around it.

The fix is creating an engine per test session:

@pytest.fixture
def engine():
    engine = create_engine("sqlite://", echo=False)
    SQLModel.metadata.create_all(engine)
    yield engine
    engine.dispose()

@pytest.fixture
def session(engine):
    with Session(engine) as session:
        yield session

Now each test gets a fresh database (no scope defined on the fixture decorator means function scope = per test). No cleanup needed. No shared state.

Alternatively, keep the module-level engine and wrap each test in a transaction you roll back at teardown (sqlmodel's Session supports this).

If you omit engine.dispose() in the code above, you may see a ResourceWarning: unclosed database, but only when running pytest --cov. Coverage's sys.settrace() hook keeps frame locals alive longer, delaying GC of the engine.

The shared simulator problem

The database engine bug is about too much sharing. Here is the inverse: not enough sharing, which breaks in a different way.

Consider a race simulation dashboard. FakeDataSource wraps a RaceSimulator that holds the full mutable race state, driver positions, lap counter, cumulative changes, and advances it on each call:

class FakeDataSource(RaceDataSource):
    def __init__(self, data_file: Path, delay_ms: int = 100):
        drivers = self._load_drivers(data_file)
        self.simulator = RaceSimulator(drivers=drivers)  # mutable state lives here

    async def get_positions(self, fixture_id: str) -> list[Position]:
        self.simulator.tick()  # randomly swaps adjacent positions, advances lap
        return self.simulator.get_current_positions()

The FastAPI dependency looks like this:

def get_data_source() -> RaceDataSource:
    source_type = config("DATA_SOURCE", default="fake")
    if source_type == "fake":
        return FakeDataSource(data_file=..., delay_ms=...)  # new instance every call
    ...

def get_race_data_source() -> RaceDataSource:
    return get_data_source()

FastAPI calls get_race_data_source() once per request. Each browser tab that opens the SSE stream gets a brand new FakeDataSource with a brand new RaceSimulator starting at lap 1 with drivers in their initial order.

The random swaps then diverge independently: Tab A shows Verstappen in P1 at lap 12, Tab B shows Hamilton in P1 at lap 3. Neither reflects a shared reality, because there is no shared state at all.

Two fixes, from quick to idiomatic

1. cache: one line, works immediately

from functools import lru_cache

@lru_cache(maxsize=1)
def get_data_source() -> RaceDataSource:
    source_type = config("DATA_SOURCE", default="fake")
    if source_type == "fake":
        return FakeDataSource(...)
    return SportmonksDataSource()

One instance for the lifetime of the process. Simple, but hard to override in tests.

2. FastAPI app.state: idiomatic and testable

@asynccontextmanager
async def lifespan(app: FastAPI):
    app.state.data_source = get_data_source()
    yield

app = FastAPI(lifespan=lifespan)

def get_race_data_source(request: Request) -> RaceDataSource:
    return request.app.state.data_source

The data source is created once at startup, shared across all requests, and easy to replace in tests via app.dependency_overrides[get_race_data_source] = lambda: test_source.

Key takeaways

The rule of thumb: if an object holds mutable state, pick its scope deliberately. Too broad (module scope) and tests leak into each other. Too narrow (per-request) and there's no shared reality. Match the scope to the object's intended lifetime.

Or put more sharply: any module-level object that holds mutable state or owns a resource (DB engines, HTTP clients, caches, queues, connection pools) should be encapsulated. Move it into a fixture, a Depends(), or app.state. Constants and pure values at module scope are fine; resources are not.

The cost of "just import it" is paid later, in test isolation, debugging, and concurrency. Under real concurrency the GIL hides this class of bug until it doesn't, see a race condition Rust wouldn't have let me write, where the same module-global pattern leaked one user's data into another user's response.

May 28, 2026 12:00 AM UTC

May 27, 2026


PyCharm

Build a Live Object Detection App for the Reachy Mini With TensorFlow and PyCharm

This is a guest post from Iulia Feroli, founder of the Back To Engineering YouTube community.

Build a Live Object Detection App

In this tutorial, we build a live object detection app using TensorFlow and PyCharm, then deploy it onto the Reachy Mini open-source robot for real-time object tracking.

Reachy Mini is a compact open-source robot built in collaboration by Pollen Robotics, Hugging Face, and Seeed Studio. It has been going viral lately, getting mentioned in NVIDIA videos and even in the keynotes at some of their conferences. What makes it particularly interesting is that not only is all the code open-source, the body is too, which means you can print your own parts and develop your own apps to run on it.

There is an app store of community-built projects you can explore and try, and easily contribute to. Anything conversational or camera-based is especially fun to build because of the hardware it ships with: a speaker, a microphone, and a camera, plus expressive antennas for emotions.

Reachy Mini tutorial

This really highlights the unique new type of robot that the Reachy Mini embodies: It almost feels like it is a physical representation of an LLM or an AI agent, rather than a robot that has AI added to it. It does not have a body that moves around or hands to grab things, so its main selling point is really its brain. That design choice shapes what is most interesting to build with it.

Let’s learn how to build a TensorFlow object detection app and deploy it on the Reachy Mini, which will then allow us to do live object tracking. You can head over to the PyCharm channel for the full code breakdown and try it at home. All the code is in the Reachy-mini-object-detection GitHub repository.

For an introduction to the robot, you can first watch Iulia’s video here:

What you’ll learn

What we are building

The project is split into two stages.

Stage 1 is a standalone notebook that runs entirely on your laptop using your webcam. No robot needed. This is where we make sure the detection pipeline works correctly before touching any hardware.

Stage 2 is a Reachy Mini app that integrates the same model with the robot: Her head moves to follow detected objects, her antennas wiggle when she spots something new, and a live web dashboard at http://0.0.0.0:8042 shows the annotated camera feed and detections.

You can follow along with the step-by-step video tutorial:

How TensorFlow object detection works: Step by step

1. Capture an image frame from the webcam. 

2. Convert the frame into a TensorFlow tensor.

3. Run inference through the pretrained model.

4. Receive bounding boxes, labels, and confidence scores.

5. Filter low-confidence detections.

6. Draw annotated results onto the frame.

7. Display the processed image in real time.

Prerequisites

Stage 1: Building a Tensorflow object detection pipeline in PyCharm

Before connecting the robot, we want to make sure the TensorFlow part works independently. We are going to make a notebook that only executes through our object detection model and makes it run smoothly. PyCharm’s native notebook integration is a great fit here: You can inspect each step of the pipeline and visualize results inline.

The object detection model

We are using SSD MobileNet V2 from TensorFlow Hub, trained on Open Images V4. This popular model from Google provides SSD-based object detection and has been trained on a lot of open images. With a little bit of fine-tuning you can deploy it with your own use case, though for this tutorial, the general model works well without any fine-tuning at all.

It runs at around 10 FPS on CPU, which is fast enough for responsive real-time behavior on the robot.

Install dependencies

!pip install tensorflow tensorflow-hub opencv-python numpy Pillow

Load the model

import tensorflow as tf
import tensorflow_hub as hub
import numpy as np
import cv2
import time
from IPython.display import display, clear_output
from PIL import Image

MODEL_HANDLE = "https://tfhub.dev/google/openimages_v4/ssd/mobilenet_v2/1"

print(f"TensorFlow version: {tf.__version__}")
print("Loading model (first time downloads ~30MB)...")

detector = hub.load(MODEL_HANDLE)
print("Model loaded!")

The model is about 30 megabytes and gets cached locally after the first download. Because it is very generalized, it can work across a lot of different scenarios without needing additional training data, which makes it a lot easier to get started.

Detection and drawing helpers

We need two helper functions: one to run inference and return a list of detections, and one to draw the bounding boxes on the frame. These are the same functions we use later in the Reachy app.

def detect_objects(frame_bgr, min_score=0.5, max_detections=10):
    rgb = frame_bgr[:, :, ::-1]
    img_tensor = tf.image.convert_image_dtype(rgb, tf.float32)[tf.newaxis, ...]

    results = detector.signatures['default'](img_tensor)

    boxes = np.array(results["detection_boxes"])
    scores = np.array(results["detection_scores"])
    class_labels = np.array(results["detection_class_entities"])

    if boxes.ndim > 2:
        boxes = boxes[0]
    if scores.ndim > 1:
        scores = scores[0]
    if class_labels.ndim > 1:
        class_labels = class_labels[0]

    scores = np.atleast_1d(scores)
    indices = [i for i, score in enumerate(scores) if score >= min_score][:max_detections]

    detections = []
    for idx in indices:
        ymin, xmin, ymax, xmax = boxes[idx]
        label = class_labels[idx].decode('utf-8') if isinstance(class_labels[idx], bytes) else str(class_labels[idx])
        detections.append({
            "box": [ymin, xmin, ymax, xmax],
            "score": float(scores[idx]),
            "label": label
        })

    return detections


def draw_detections(frame_bgr, detections):
    h, w = frame_bgr.shape[:2]
    annotated = frame_bgr.copy()

    for det in detections:
        ymin, xmin, ymax, xmax = det["box"]
        x1, y1 = int(xmin * w), int(ymin * h)
        x2, y2 = int(xmax * w), int(ymax * h)

        color = (0, 255, 0)
        cv2.rectangle(annotated, (x1, y1), (x2, y2), color, 2)

        label = f"{det['label']} {det['score']:.0%}"
        font_scale, thickness = 0.6, 2
        (tw, th), _ = cv2.getTextSize(label, cv2.FONT_HERSHEY_SIMPLEX, font_scale, thickness)
        cv2.rectangle(annotated, (x1, y1 - th - 8), (x1 + tw + 4, y1), color, -1)
        cv2.putText(annotated, label, (x1 + 2, y1 - 4),
                    cv2.FONT_HERSHEY_SIMPLEX, font_scale, (0, 0, 0), thickness)

    return annotated

The detect_objects function runs inference using the model’s detect_objects entry point and handles flattening the batch dimension from the output tensors. Labels come back as bytes from the model, so we decode them to strings before returning.

Test on a single frame

cap = cv2.VideoCapture(0)
ret, frame = cap.read()
cap.release()

if not ret:
    print("ERROR: Could not access webcam. Make sure no other app is using it.")
else:
    print(f"Frame captured: {frame.shape}")

    t0 = time.time()
    detections = detect_objects(frame)
    elapsed = time.time() - t0

    print(f"Inference time: {elapsed:.2f}s ({1/elapsed:.1f} FPS)")
    print(f"Found {len(detections)} objects:")
    for d in detections:
        print(f"  - {d['label']}: {d['score']:.0%}")

    annotated = draw_detections(frame, detections)
    display(Image.fromarray(annotated[:, :, ::-1]))

This is the stage where you check that the model is detecting correctly and the bounding boxes are drawn in the right places. The inline image display in PyCharm’s notebook view makes it easy to see the result right there in the cell.

Running real-time TensorFlow object detection with OpenCV

Once the single-frame test looks good, you can run it continuously:

cap = cv2.VideoCapture(0)

if not cap.isOpened():
    print("ERROR: Could not open webcam.")
else:
    print("Running live detection... (interrupt kernel to stop)")
    try:
        while True:
            ret, frame = cap.read()
            if not ret:
                break

            t0 = time.time()
            detections = detect_objects(frame)
            fps = 1.0 / max(time.time() - t0, 0.001)

            annotated = draw_detections(frame, detections)
            cv2.putText(annotated, f"{fps:.1f} FPS", (10, 30),
                        cv2.FONT_HERSHEY_SIMPLEX, 1.0, (0, 0, 255), 2)

            clear_output(wait=True)
            display(Image.fromarray(annotated[:, :, ::-1]))

            labels = ", ".join(f"{d['label']} ({d['score']:.0%})" for d in detections)
            print(f"{fps:.1f} FPS | {len(detections)} objects: {labels or 'none'}")

    except KeyboardInterrupt:
        print("Stopped.")
    finally:
        cap.release()
        print("Camera released.")

At this point we have built a notebook that works with just having object detection and we can use this with a simple camera of whatever type you have around. Now, we can wrap it up and make it into an app that we can deploy on the Reachy.


Stage 2: Deploying the TensorFlow object detection app on the Reachy Mini

The Reachy Mini app lives in the reachy_mini_object_detector/ folder and extends the detection logic with head tracking, antenna reactions, and a web dashboard. We’ve followed the guidelines for building Reachy Apps laid out in this blog post. Particularly, we can leverage a helper LLM system like Claude by giving it the predefined Agent Helper documentation.

Project structure

reachy_mini_object_detector/
├── pyproject.toml
└── reachy_mini_object_detector/
    ├── detector.py       # TF Hub model wrapper
    ├── main.py           # App: head tracking + web dashboard
    └── static/           # Web UI assets (served at :8042)

The detector.py file wraps the model and the detect_objects logic. main.py imports from it and adds everything specific to the robot.

Installing the app

From the Reachy Mini dashboard, under Apps, or by manually adding:

pip install git+https://huggingface.co/spaces/backtoengineering/reachy_mini_object_detector

How head tracking works

The app runs two loops in parallel: an inference thread that grabs frames from the robot’s camera and runs detection, and a main control loop at around 50Hz that handles head movement and antenna control.

The head tracking feature maps the detected object’s position in the frame to a yaw and pitch offset for the head. The camera has a horizontal field of view of 60 degrees and a vertical field of view of 45 degrees. When an object is at the center of the frame its center_x is 0.5, so subtracting 0.5 and multiplying by the field of view gives the angle offset to track it:

target_yaw = -(largest.center_x - 0.5) * CAMERA_FOV_H_DEG
target_pitch = (largest.center_y - 0.5) * CAMERA_FOV_V_DEG

Rather than snapping the head instantly to that target, the app uses a smoothing factor (TRACKING_ALPHA = 0.15) so the movement looks natural:

self._current_yaw += TRACKING_ALPHA * (target_yaw - self._current_yaw)
self._current_pitch += TRACKING_ALPHA * (target_pitch - self._current_pitch)

When nothing is detected, the head slowly drifts back toward center rather than freezing in place.

Antenna wiggle

The antennas wiggle when a new object class is first detected, not on every frame. The app keeps track of which classes have already been seen in _seen_classes, and when something new appears it sets a wiggle timer for 1.5 seconds. During that window, the control loop drives a sinusoidal antenna movement as follows:

phase = (t - t0) * 8.0  # fast wiggle
antenna_val = np.deg2rad(20.0 * np.sin(phase))
antennas = np.array([antenna_val, -antenna_val]

This makes the interaction feel intentional: Reachy reacts when she sees something new, rather than wiggling constantly while tracking.

The web dashboard

The app serves a live dashboard (available at http://0.0.0.0:8042) with the annotated camera feed (as an MJPEG stream), the current detection list, an FPS counter, and a toggle to enable or disable head tracking. This is useful during development because you can see exactly what the model is detecting from the robot’s perspective in real time.


Where to go next

This is a great starting point and there are a lot of directions you can take it:

You can find all the code in the Reachy-mini-object-detection repository. Everything is open-source, so feel free to build on it, adapt it, or deploy your own version.

FAQs

What is TensorFlow object detection?

TensorFlow object detection is a computer vision technique that uses machine learning models to identify and locate objects within images or video streams.

What is the best TensorFlow object detection model for real-time applications?

SSD MobileNet V2 is commonly used for real-time TensorFlow object detection because it balances inference speed and accuracy efficiently.

Can TensorFlow object detection run on CPU?

Yes. Models like SSD MobileNet V2 can run entirely on CPU, making them suitable for laptops, edge devices, and robotics projects.

What is the difference between the TensorFlow Object Detection API and TensorFlow Hub?

TensorFlow Hub provides pretrained reusable models, while the TensorFlow Object Detection API offers a larger framework for training, evaluation, and deployment workflows.

Can I train TensorFlow object detection on custom data?

Yes. You can fine-tune pretrained models using your own labelled datasets to detect custom objects.

About the author

Iulia Feroli

Iulia Feroli is the founder of the Back To Engineering community on YouTube, where she builds robots, explores physical AI, and makes complex engineering topics accessible and fun. She has a background in data science, AI, cloud architecture, and open source.

May 27, 2026 02:06 PM UTC


Real Python

Sending Emails With Python

You probably found this tutorial because you want to send emails with Python to automate confirmation messages, password resets, or scheduled notifications. Python’s standard library covers the whole pipeline, from making a server connection to building the message and sending it to one or many recipients. This tutorial walks through every step in working code.

By the end of this tutorial, you’ll understand that:

  • A safe testing setup uses a throwaway Gmail account with an app password, a local aiosmtpd debug server, or a privacy-focused provider like Posteo or Proton Mail.
  • A secure SMTP session uses .SMTP_SSL() with ssl.create_default_context(), which validates the server certificate and encrypts your credentials and message content.
  • The EmailMessage class from the email package assembles plain text, HTML alternatives, file attachments, and personalized fields through .set_content(), .add_alternative(), and .add_attachment().
  • Setting msg["reply-to"] or any other RFC 5322 header on an EmailMessage routes replies to a different mailbox than the sender address.
  • For high-volume sending, transactional email services like SendGrid, Mailgun, and Brevo provide deliverability, statistics, and API libraries that go beyond what smtplib alone offers.

Before you jump into the code, you’ll set up a throwaway email account or a local debug server so you can experiment freely without spamming real inboxes.

Get Your Code: Click here to download the free sample code you’ll use to learn how to send plain-text and HTML emails, attach files, and automate email delivery with Python.

Take the Quiz: Test your knowledge with our interactive “Sending Emails With Python” quiz. You’ll receive a score upon completion to help you track your learning progress:


Interactive Quiz

Sending Emails With Python

Use Python's standard library to send email through secure SMTP connections, attach files, include HTML content, and route replies.

Setting Up an Email Service

Email is sent from a client to an email server, and from one email server to another, using the Simple Mail Transfer Protocol, or SMTP, defined under RFC 821. Python comes with the built-in smtplib module, which implements this protocol, allowing you to programmatically send email through any accessible email server.

While you can certainly use your own email account for this tutorial, it’s recommended that you set up a throwaway email account instead. There are several free and paid email services you can use. In this tutorial, you’ll explore the following options:

  • Setting up a Gmail account for development: You’ll learn how to create a dedicated testing account and use app passwords to satisfy modern security requirements.
  • Setting up a local SMTP server: You’ll use the aiosmtpd library to run a server on your own machine, allowing you to inspect email content without sending any live messages.
  • Setting up other email accounts for development: You’ll see how to connect to alternative services like Posteo or Proton Mail to ensure your code works across different providers.

Understanding the distinction between secure (encrypted) and insecure (unencrypted) connections is vital. Most modern providers require encryption via SSL or TLS to protect your data, while the local debugging server uses no encryption. By the end of this section, you’ll know how to choose the right connection type for your specific service choice.

Setting Up a Gmail Account for Development

To set up a Gmail account for testing your code, follow these steps:

  1. Create a new Google account. You need to provide a name, a birthday, and a unique username for the account.
  2. Set up two-factor authentication for the new account.
  3. Add a new app password to allow password sign-ins to the account.

An app password is a temporary password generated by Google. Instead of using your main account password to authenticate with your username, you use the app password. You can delete and recreate app passwords whenever you like.

App passwords allow access to Gmail when modern security measures like OAuth2 aren’t available. When creating one, make sure you copy it to a secure location, as you won’t be able to review it after leaving the page.

If you don’t want to use an app password, check out Google’s documentation on how to obtain access credentials for your Python script using the OAuth2 authorization framework.

A nice feature of Gmail is that you can use the + sign to add modifiers to your email address right before the @ sign. For example, emails sent to my+person1@gmail.com and my+person2@gmail.com will both arrive at my@gmail.com. When testing email functionality, you can use this to simulate multiple addresses that all point to the same inbox.

Setting Up a Local SMTP Server

You can test email functionality by running a local Simple Mail Transfer Protocol (SMTP) debugging server with the aiosmtpd module. Rather than sending emails to a specific address, the local debug server discards the message after printing its content to the console. Running a local debugging server makes it unnecessary to deal with encryption of messages or use credentials to log in to an email server.

Note: aiosmtpd is a third-party library that replaces the former built-in smtpd module, which was initially deprecated in Python 3.4.7. Deprecation notices were repeated in 3.5.4 and 3.6.1, and the module was eventually removed in Python 3.12, as outlined in PEP 594.

Install the aiosmtpd module with the following command:

Language: Shell
$ python -m pip install aiosmtpd

Then, start a local SMTP debugging server with this command:

Language: Shell
$ python -m aiosmtpd -n

Read the full article at https://realpython.com/python-send-email/ »


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 27, 2026 02:00 PM UTC

Quiz: Sending Emails With Python

In this quiz, you’ll test your understanding of Sending Emails With Python.

By working through this quiz, you’ll revisit how to build messages with the EmailMessage class, secure your SMTP connection, attach files, send HTML alternatives, route replies to a different mailbox, and address multiple recipients at once.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 27, 2026 12:00 PM UTC


PyPy

PyPy v7.3.23 release

PyPy v7.3.23: release of python 2.7, 3.11

The PyPy team is proud to release version 7.3.23 of PyPy after the previous release on April 26, 2026. This is a bug-fix release that fixes an overeager warning about unused coroutines, and some problems around multiple inheritance in c-extensions.

This version includes a change to the bytecode interpreter to use exception tables instead of dedicated opcodes. Now the PyPy disassembly will be closer to CPython format. So far it does not impact performance.

The release includes two different interpreters:

The interpreters are based on much the same codebase, thus the double release. This is a micro release, all APIs are compatible with the other 7.3 releases.

We recommend updating. You can find links to download the releases here:

https://pypy.org/download.html

We would like to thank our donors for the continued support of the PyPy project. If PyPy is not quite good enough for your needs, we are available for direct consulting work. If PyPy is helping you out, we would love to hear about it and encourage submissions to our blog via a pull request to https://github.com/pypy/pypy.org

We would also like to thank our contributors and encourage new people to join the project. PyPy has many layers and we need help with all of them: bug fixes, PyPy and RPython documentation improvements, or general help with making RPython's JIT even better.

If you are a python library maintainer and use C-extensions, please consider making a HPy / CFFI / cppyy version of your library that would be performant on PyPy. In any case, cibuildwheel supports building wheels for PyPy.

What is PyPy?

PyPy is a Python interpreter, a drop-in replacement for CPython. It's fast (PyPy and CPython performance comparison) due to its integrated tracing JIT compiler.

We also welcome developers of other dynamic languages to see what RPython can do for them.

We provide binary builds for:

PyPy supports Windows 32-bit, Linux PPC64 big- and little-endian, Linux ARM 32 bit, RISC-V RV64IMAFD Linux, and s390x Linux but does not release binaries. Please reach out to us if you wish to sponsor binary releases for those platforms. Downstream packagers provide binary builds for debian, Fedora, conda, OpenBSD, FreeBSD, Gentoo, and more.

What else is new?

For more information about the 7.3.23 release, see the full changelog.

Please update, and continue to help us make pypy better.

Cheers, The PyPy Team

May 27, 2026 07:40 AM UTC


Python GUIs

Fixing Missing Icons in PyInstaller-Packaged PyQt6 Applications on Windows — Why your app icon disappears after packaging and how to fix it

I've packaged my PyQt application with PyInstaller, but the icon isn't showing up — both the executable icon and the running application icon are just the default Python/Windows icon. What's going on?

This is a common issue when packaging PyQt6 apps with PyInstaller on Windows. The good news is that it usually comes down to one of two straightforward causes: Windows icon caching, and missing resource files in your packaged output.

Setting the executable icon with PyInstaller

When you run PyInstaller, you can set the icon for the .exe file itself using the --icon flag:

sh
pyinstaller --windowed --icon=myicon.ico myapp.py

This embeds the icon into the executable, so it shows up in File Explorer and on the desktop. The icon file needs to be in .ico format — .png or .svg won't work here.

After building, check the dist/ folder. Your .exe should display the custom icon. But sometimes... it doesn't.

Windows icon caching

Windows caches icons aggressively. If you've previously built your app without a custom icon, Windows may continue to show the old default icon even after you've rebuilt the app with the correct one.

This still catches me out, even though I know this. You'll reflexively start checking the config assuming something is wrong, and think you're going mad.

There are a few ways to deal with this:

python
ie4uinit.exe -show

After clearing the cache, the correct icon should appear.

You can also try turning your computer off and on again, or rather restarting Windows. That will also trigger the icon cache to rebuild.

Missing icon file at runtime

Setting the executable icon with --icon only affects what shows up in File Explorer. If your application also sets a window icon in code (using setWindowIcon), that icon file needs to be available at runtime too.

For example, if your code does this:

python
from PyQt6.QtWidgets import QApplication, QMainWindow
from PyQt6.QtGui import QIcon
import sys


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My Application")


app = QApplication(sys.argv)
app.setWindowIcon(QIcon("myicon.ico"))

window = MainWindow()
window.show()

app.exec()

Then myicon.ico needs to exist in the working directory when the packaged app runs. By default, PyInstaller doesn't include data files like .ico images unless you tell it to.

You can add the icon file to your build using the --add-data flag:

sh
pyinstaller --windowed --icon=myicon.ico --add-data "myicon.ico;." myapp.py

On Linux or macOS, use : instead of ; as the separator:

sh
pyinstaller --windowed --icon=myicon.ico --add-data "myicon.ico:." myapp.py

This copies myicon.ico into the output directory alongside your executable (or into the temporary directory if you're using --onefile).

An alternative approach (not available on PyQt6) is to use the Qt Resource System to embed your icon directly into your application, which avoids the need to bundle separate icon files entirely.

Handling --onefile builds

When you use --onefile, PyInstaller extracts everything to a temporary folder at runtime. Your code needs to know how to find files relative to that temporary folder. You can handle this by detecting the base path:

python
import sys
import os

if getattr(sys, 'frozen', False):
    # Running as a PyInstaller bundle
    basedir = sys._MEIPASS
else:
    # Running as a normal script
    basedir = os.path.dirname(__file__)

Then use basedir when constructing file paths:

python
app.setWindowIcon(QIcon(os.path.join(basedir, "myicon.ico")))

Taskbar grouping with an Application User Model ID

On Windows, the taskbar groups windows by their application identity. Without an explicit identity, Windows guesses — and sometimes guesses wrong. This can cause your app to show the Python icon in the taskbar, or to group instances inconsistently depending on where they were launched from.

You can fix this by setting an Application User Model ID before creating your QApplication. This tells Windows exactly which application this is:

python
import ctypes

myappid = "com.mycompany.myapp.1.0"
ctypes.windll.shell32.SetCurrentProcessExplicitAppUserModelID(myappid)

The string can be anything, but it's conventional to use a reverse-domain format. The value just needs to be unique to your application.

With an explicit app ID set, all instances of your app will group together in the taskbar regardless of where they were launched from — whether that's your IDE, the dist/ folder, or a --onefile build.

Complete working example

Here's a complete example that handles all of the above — the runtime base path, the window icon, and the application user model ID. If you're new to building PyQt6 applications, you may want to start with creating your first window before tackling packaging.

python
import sys
import os
import ctypes

from PyQt6.QtWidgets import QApplication, QMainWindow, QLabel
from PyQt6.QtGui import QIcon
from PyQt6.QtCore import Qt


# Set the app user model ID before creating QApplication (Windows only)
if sys.platform == "win32":
    myappid = "com.mycompany.myapp.1.0"
    ctypes.windll.shell32.SetCurrentProcessExplicitAppUserModelID(myappid)

# Determine the base directory for resource files
if getattr(sys, "frozen", False):
    basedir = sys._MEIPASS
else:
    basedir = os.path.dirname(__file__)


class MainWindow(QMainWindow):
    def __init__(self):
        super().__init__()
        self.setWindowTitle("My Application")
        label = QLabel("Hello, world!")
        label.setAlignment(Qt.AlignmentFlag.AlignCenter)
        self.setCentralWidget(label)


app = QApplication(sys.argv)
app.setWindowIcon(QIcon(os.path.join(basedir, "myicon.ico")))

window = MainWindow()
window.show()

app.exec()

To package this with PyInstaller:

sh
pyinstaller --windowed --icon=myicon.ico --add-data "myicon.ico;." myapp.py

For an in-depth guide to building Python GUIs with PySide6 see my book, Create GUI Applications with Python & Qt6.

May 27, 2026 06:00 AM UTC


Python Morsels

Selecting random values in Python

Python's random module provides utilities for generating pseudorandom numbers. For cryptographically-secure randomness, use the secrets module instead.

Table of contents

  1. Generating random integers
  2. Generating random floating point numbers
  3. Selecting random items from a sequence
  4. The random utilities are only pseudorandom
  5. Cryptographically-secure randomness with the secrets module
  6. Random and SystemRandom classes
  7. Use random for pseudo-random numbers and secrets for true randomness

Generating random integers

If you need a random integer, you can use the randint function from Python's random module:

>>> from random import randint
>>> randint(1, 6)
4

This function accepts a start value and a stop value and it returns a random integer between the start and stop values inclusively.

The random module also includes a randrange function, which is named after Python's range function:

>>> from random import randrange
>>> randrange(10)
7

This function accepts the same values as range.

Either a stop value:

>>> randrange(5)
2

Or start and stop values:

>>> randrange(5, 10)
8

Or start, stop, and step values:

>>> randrange(0, 100, 10)
70

The randrange function basically chooses a random number within a given range.

When I need a random number, I usually use randint.

Generating random floating point numbers

What if you need a …

Read the full article: https://www.pythonmorsels.com/random-numbers/

May 27, 2026 04:00 AM UTC

May 26, 2026


PyCoder’s Weekly

Issue #736: Polars Sort-Merge Joins, Zen, Resolving Lazy Imports, and More (2026-05-26)

#736 – MAY 26, 2026
View in Browser »

The PyCoder’s Weekly Logo


Streaming Sort-Merge Joins in Polars

“Joins are often one of the most expensive parts of a query. Once tables get large, the join can heavily impact both runtime and memory usage… If the join keys are already sorted, Polars can now take a cheaper path: a streaming sort-merge join.”
THIJS NIEUWDORP

Tapping Into the Zen of Python

Explore the Zen of Python and its 19 guiding principles for writing readable, practical code. Learn its history, jokes, and meaning.
REAL PYTHON course

Quiz: Tapping Into the Zen of Python

REAL PYTHON

FREE Python Error Tracking From Honeybadger – all Signal, no Noise

alt

Production bugs don’t arrive one at a time. Honeybadger groups similar errors into a single issue and lets you pause or ignore alerts in a single click. More signal. Less noise. ⚡ Sign Up for Your FREE Developer Account →
HONEYBADGER sponsor

Resolve a Lazy Import Manually

Learn how to work around the Python 3.15 machinery to resolve an explicit lazy import manually.
RODRIGO GIRÃO SERRÃO

Django 6.1 Alpha 1 Released

Posted by Jacob Walls on May 20, 2026
DJANGO SOFTWARE FOUNDATION

Nuitka Python Compiler Release 4.1

NUITKA.NET

Call for Onsite Volunteers: Make EuroPython 2026 Happen

EUROPYTHON.EU

PEP 831: Frame Pointers Everywhere: Enabling System-Level Observability for Python (Final)

This PEP proposes two things:
PYTHON.ORG

PEP 808: Including Static Values in Dynamic Metadata (Accepted)

PYTHON.ORG

Articles & Tutorials

PyCon US 2026 Packaging Summit Recap

Per-talk notes from the PyCon US 2026 Packaging Summit, including: Emma Smith on Wheel 2.0 and Zstandard compression, Mike Fiedler on PyPI abuse vectors, Mahe Iram Khan on ecosystems, lightning talks on PEP 772, mobile wheels, AI accelerator variants, and the roundtable discussions.
BERNÁT GÁBOR

Slim Down Python Docker Containers

Learn how SlimToolkit can reduce a Python Docker image by analyzing what your app actually uses at runtime. This tutorial walks through slimming a Chainlit LLM chatbot image, shows where container bloat comes from, and explains how to avoid breaking lazily loaded Python frameworks.
CODECUT.AI • Shared by Khuyen Tran

Object-Oriented Python: 5-Day Live Workshop, June 8 to 12

A new live cohort for Python developers comfortable with the basics who want to design classes that hold up under change. Across five 2-hour sessions, OOP features appear at the moment a growing project actually needs them. You leave with a working app and the judgment to know when a class earns its keep →
REAL PYTHON sponsor

What Types of Exceptions Should You Catch?

The trickiest programming bugs are often caused by catching exceptions that you didn’t mean to catch or handling exceptions in ways that obfuscate the actual error that’s occurring. Which exceptions should you catch and which should you leave unhandled?
TREY HUNNER

Reverse Geocoding With Overture Maps

Mark is working on a reverse geocoder that can fetch the 2-letter ISO country code for any point on a map in a country’s boundaries. This post talks about the prototype and his progress on the project.
MARK LITWINTSCHIK

Stop Writing Edge Case Tests. Use Hypothesis Instead

An introduction to property-based testing in Python with Hypothesis: the mental shift from ‘what input should I test?’ to ‘what invariant should always hold?’
PEYTON GREEN • Shared by Anonymous

Opaque Types in Python

Learn how to use the NewType to mask a private class while still providing a public construction mechanism for the users of your library.
GLYPH LEFKOWITZ

How to Use the Claude API in Python

Learn how to use the Claude API in Python to send prompts, control responses with system instructions, and get structured JSON output.
REAL PYTHON

Quiz: How to Use the Claude API in Python

REAL PYTHON

uv Is Fantastic, but Its Package UX Is a Mess

This opinion piece talks about how uv’s CLI feels surprisingly clunky compared to its peers like pnpm or Poetry.
KEVIN RENSKERS

Python Built-in Functions: A Complete Guide

Use Python’s built-in functions for math, data types, iterables, and I/O to write shorter, more Pythonic code.
REAL PYTHON

Projects & Code

flake8-lazy: Detect Lazy-Importable Modules in Python 3.15+

GITHUB.COM/HENRYIII

django-arch-check: Static Checker for Common Django Issues

GITHUB.COM/RJ-GAMER

tdb: A Python Debugger Based on Textual

GITHUB.COM/ALDANIAL

postman2pytest: Convert Postman Collection Into pytest Suite

GITHUB.COM/GOLIKOVICHEV

agent-memory-guard: OWASP ASI06 AI Agent Memory Guard

GITHUB.COM/OWASP • Shared by Vaishnavi Gudur

Events

PyCon Italia 2026

May 27 to May 31, 2026
PYCON.IT

Weekly Real Python Office Hours Q&A (Virtual)

May 27, 2026
REALPYTHON.COM

PyLadies Amsterdam: Scalable Data Harvesting for AI

May 28, 2026
MEETUP.COM

Python Leiden User Group

May 28, 2026
PYTHONLEIDEN.NL

PyDelhi User Group Meetup

May 30, 2026
MEETUP.COM

PyLadies El Alto: Flash Talks

May 30 to May 31, 2026
MEETUP.COM


Happy Pythoning!
This was PyCoder’s Weekly Issue #736.
View in Browser »

alt

[ Subscribe to 🐍 PyCoder’s Weekly 💌 – Get the best Python news, articles, and tutorials delivered to your inbox once a week >> Click here to learn more ]

May 26, 2026 07:30 PM UTC


Real Python

Connecting LLMs to Your Data With Python MCP Servers

The Model Context Protocol (MCP) is a new open protocol that allows AI models to interact with external systems in a standardized, extensible way. In this video course, you’ll install MCP, explore its client-server architecture, and work with its core concepts: prompts, resources, and tools. You’ll then build and test a Python MCP server that queries e-commerce data and integrate it with an AI agent in Cursor to see real tool calls in action.

By the end of this video course, you’ll understand:

You’ll get hands-on experience with Python MCP by creating and testing MCP servers and connecting your MCP to AI tools. To keep the focus on learning MCP rather than building a complex project, you’ll build a simple MCP server that interacts with a simulated e-commerce database. You’ll also use Cursor’s MCP client, which saves you from having to implement your own.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 02:00 PM UTC

Quiz: I/O Operations and String Formatting

In this quiz, you’ll revisit the core concepts covered in the I/O Operations and String Formatting learning path:

A person presenting a machine that can perform modern Python string interpolation and formatting

Learning Path

I/O Operations and String Formatting

10 Resources ⋅ Skills: Python, Fundamentals, I/O, String Formatting, f-strings, print()

The 20 questions span reading keyboard input, controlling print(), stripping characters from strings, the format mini-language, and f-strings, giving you a way to check that you understood the most important ideas.

Take your time and revisit any topics that feel rusty before moving on.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Visualizing Data in Python With Seaborn

In this quiz, you’ll test your understanding of Visualizing Data in Python With Seaborn.

By working through this quiz, you’ll revisit how seaborn produces polished statistical plots, including bar plots, scatter plots, line plots, histograms, and KDE curves.

You’ll also reinforce the differences between seaborn’s classic functional interface and its newer objects interface, and you’ll see when to reach for figure-level versus axes-level functions.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Object-Oriented Programming (OOP) in Python

In this quiz, you’ll revisit the core concepts covered in the Object-Oriented Programming (OOP) in Python learning path:

A person with a notebook under their arm, walking towards a technical machine that has concepts of object-oriented programming noted on its parts

Learning Path

Object-Oriented Programming (OOP)

18 Resources ⋅ Skills: Python, OOP, Classes, Data Classes, Getters, Setters, Property, super(), Magic Methods, Operator Overloading, SOLID, Inheritance, Composition, Mixin Classes, Factory Pattern

You’ll test what you know about defining classes, working with instance and class attributes, controlling object instantiation with constructors, and leveraging inheritance, composition, and mixins. You’ll also check your understanding of properties, magic methods, and the SOLID design principles that lead to cleaner object-oriented code.

Take your time and revisit any topics that feel rusty before moving on.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Python Data Structures

In this quiz, you’ll revisit the core concepts covered in the Python Data Structures learning path:

A person pointing on a screen that transforms a digital data structure representation into a real-life one based on square objects

Learning Path

Python Data Structures

24 Resources ⋅ Skills: Python, Strings, Lists, Tuples, Dictionaries, Sets, List Comprehensions, range(), Bytes, Sorting

The 20 questions span strings, lists, tuples, dictionaries, sets, sorting, and bytes, giving you a way to check that you understood the most important ideas.

Take your time and revisit any topics that feel rusty before moving on to the next learning path.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Connecting LLMs to Your Data With Python MCP Servers

In this video course quiz, you’ll test your understanding of Connecting LLMs to Your Data With Python MCP Servers.

By working through this quiz, you’ll revisit core MCP concepts like the client-server architecture, tools that LLMs can call, resources that expose static data, and prompts that act as reusable templates.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Testing and Continuous Integration

In this quiz, you’ll revisit the core concepts covered in the Testing and Continuous Integration learning path:

A person in a white coat with big glasses holding a computer next to a machine that shows big signs reading fail and pass

Learning Path

Testing and Continuous Integration

10 Resources ⋅ Skills: Unit Testing, Doctest, Mock Object Library, Pytest, Continuous Integration, Docker, Code Quality, GitHub Actions, Software Testing, CI/CD

The 20 questions span testing fundamentals, the unittest framework, mock objects, pytest, code quality tools, and continuous integration with GitHub Actions. They give you a way to check that you understood the most important ideas.

Take your time and revisit any topics that feel rusty before moving on.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC

Quiz: Files and File Streams

In this quiz, you’ll revisit the core concepts covered in the Files and File Streams learning path:

Two people working together, one is inputting data on a computer, the other one is reading a long printout

Learning Path

Files and File Streams

13 Resources ⋅ Skills: Python, Pathlib, File I/O, Serialization, Encoding, Unicode, PDF, WAV, Context Managers, ZIP Files

You’ll check your understanding of opening and reading files, navigating the file system with pathlib, managing resources with context managers and the with statement, and reading or writing WAV audio files.

Take your time and revisit any topics that feel rusty before moving on.


[ Improve Your Python With 🐍 Python Tricks 💌 – Get a short & sweet Python Trick delivered to your inbox every couple of days. >> Click here to learn more and see examples ]

May 26, 2026 12:00 PM UTC