AW Dev Rethought

"Programs must be written for people to read, and only incidentally for machines to execute." - Harold Abelson

🧠 Python DeepCuts — 💡 Inside Python’s Memory Allocator (pymalloc)


Description:

One of the most confusing aspects of Python in production is memory behaviour.

You free objects… but memory usage doesn’t drop.

You delete large lists… but RSS stays high.

You suspect a memory leak… but GC looks fine.

Most of this confusion comes from not understanding how CPython allocates memory.

This DeepCut explains how Python’s pymalloc allocator works, why it exists, and what it means for real-world systems.


🧩 Python Manages Memory for Objects, Not Variables

In Python, variables do not “own” memory.

They are references to objects.

a = 10
b = "hello"
c = [1, 2, 3]

Each of these creates an object on the heap.

The variable name is just a pointer to that object.

Memory allocation happens when objects are created — not when variables are assigned.
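The reference model can be observed directly with `is` and `id()`. A minimal sketch, assuming CPython:

```python
# Assignment copies the reference, not the object: both names
# end up pointing at the same list on the heap.
a = [1, 2, 3]
b = a                      # b references the same object as a
b.append(4)                # mutating through b is visible through a

print(a)                   # [1, 2, 3, 4]
print(a is b)              # True: same object identity
print(id(a) == id(b))      # True: identical identity values
```

No second list was ever allocated; only one object exists, with two names bound to it.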


🧠 Why CPython Doesn’t Use malloc for Everything

Calling the OS allocator (malloc / free) for every small object would be slow and fragment memory.

CPython solves this by using pymalloc, a custom allocator optimised for:

  • small objects
  • high allocation frequency
  • predictable reuse

Allocations of 512 bytes or less are handled by Python’s internal allocator instead of the OS.
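You can get a feel for which side of the threshold an object lands on with `sys.getsizeof`. A rough sketch, assuming CPython’s documented 512-byte small-object limit:

```python
import sys

# getsizeof reports the full C-level size of the object, including
# its header — enough to see which allocations are "small" to pymalloc.
small = b"x" * 100
large = b"x" * 10_000

print(sys.getsizeof(small))   # comfortably under 512: eligible for pymalloc
print(sys.getsizeof(large))   # well over 512: routed to the OS allocator
```

Exact sizes vary by platform and Python version, but the small/large split is stable.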


🔄 Arenas, Pools, and Blocks (The pymalloc Model)

Internally, pymalloc uses a layered structure:

  • Arenas → large chunks requested from the OS
  • Pools → fixed-size regions inside arenas
  • Blocks → slots for individual Python objects

This design allows Python to:

  • allocate objects very quickly
  • reuse freed memory efficiently
  • reduce fragmentation for small objects

These structures live at the C level and are not directly exposed to Python code.
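CPython does ship one debugging hook that surfaces this machinery: `sys._debugmallocstats()` dumps the arena, pool, and block bookkeeping to stderr. It is CPython-specific and its output format is an implementation detail, so treat this as a peek rather than an API:

```python
import sys

# Prints pymalloc statistics to stderr: per-size-class block counts,
# pool usage, and the number of arenas currently allocated.
sys._debugmallocstats()
```

Running this before and after a large burst of allocations makes the arena growth visible.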


🧠 Why Memory Is Reused Instead of Returned

When Python frees an object:

  • the memory is returned to pymalloc
  • but not necessarily to the OS

lst = [i for i in range(1_000_000)]
del lst

Even after deletion:

  • Python keeps the memory for reuse
  • future allocations can reuse those blocks
  • RSS may not drop

This is intentional.

Python optimises for speed and stability, not for shrinking memory usage.
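On Unix you can watch this from inside the process with the standard resource module. A sketch; note that ru_maxrss is a peak (high-water) figure, reported in kilobytes on Linux and bytes on macOS:

```python
import resource

def peak_rss():
    # Peak resident set size of this process so far.
    # Units are platform-dependent: KB on Linux, bytes on macOS.
    return resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

before = peak_rss()
lst = [i for i in range(1_000_000)]
during = peak_rss()
del lst
after = peak_rss()

# The high-water mark cannot fall after del: the pages stay with
# the process for reuse even though the list object is gone.
print(before, during, after)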
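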


🔍 Observing Memory with sys.getsizeof

You can inspect object sizes at the Python level.

import sys
print(sys.getsizeof(10))

This shows:

  • the size of the object structure
  • not the total memory footprint
  • not shared memory or allocator overhead

It’s useful — but incomplete.
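One concrete gap: on a container, `sys.getsizeof` counts only the container itself (its header plus pointer array), not the objects it references. A short sketch that sums both:

```python
import sys

data = ["some string %d" % i for i in range(1000)]

shallow = sys.getsizeof(data)                         # the list object alone
deep = shallow + sum(sys.getsizeof(s) for s in data)  # plus each string it holds

print(f"shallow: {shallow} bytes, with elements: {deep} bytes")
```

The deep figure is several times larger, which is why shallow sizes alone routinely underestimate real footprints.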


🔎 Tracking Allocations with tracemalloc

For real insight into Python-level memory usage, use tracemalloc.

import tracemalloc

tracemalloc.start()
data = [str(i) for i in range(10000)]
print(tracemalloc.get_traced_memory())

This allows you to:

  • see current vs peak allocations
  • locate allocation hotspots
  • debug reference-related leaks

tracemalloc tracks Python object allocations, not OS memory directly.
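Beyond the current/peak totals, tracemalloc snapshots can attribute allocations to the source lines that made them, which is how you locate hotspots in practice:

```python
import tracemalloc

tracemalloc.start()

data = [str(i) for i in range(10_000)]   # the hotspot we expect to see

snapshot = tracemalloc.take_snapshot()
tracemalloc.stop()

# Group allocations by source line and print the biggest contributors.
for stat in snapshot.statistics("lineno")[:3]:
    print(stat)
```

The list-comprehension line should dominate the output, with its traced size alongside.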


🧬 Why Python Chooses This Design

pymalloc favours:

  • fast object creation
  • low fragmentation
  • predictable performance
  • reuse over release

The trade-off:

  • memory may not return to the OS immediately
  • RSS can appear “stuck”
  • restarting a process frees memory fully

For long-running services, this behaviour is often desirable.


🧠 Practical Implications for Real Systems

Understanding pymalloc explains many production realities:

  • “Memory leaks” that aren’t leaks
  • Why worker recycling is common
  • Why deleting objects doesn’t always lower RSS
  • Why pooling objects can improve performance
  • Why Python processes stabilize over time

This knowledge helps you:

  • debug memory issues correctly
  • set realistic monitoring expectations
  • design robust Python services

⚠️ What pymalloc Does Not Do

pymalloc does not:

  • manage large objects (>512 bytes)
  • replace garbage collection
  • detect reference cycles
  • track OS-level memory usage

Those concerns are handled by:

  • the OS allocator
  • Python’s GC
  • reference counting

Each layer has a distinct responsibility.


✅ Key Points

  • Python allocates memory for objects, not variables
  • Small objects are handled by pymalloc
  • Memory is aggressively reused, not immediately released
  • Arenas, pools, and blocks structure allocations
  • RSS staying high is often expected behaviour
  • tracemalloc helps debug Python-level allocations

Understanding pymalloc prevents misdiagnosing performance and memory issues.


Code Snippet:

import sys
import tracemalloc

# Object allocation
a = 10
b = "hello"
c = [1, 2, 3]

print(type(a), type(b), type(c))

# Inspect object sizes
print([sys.getsizeof(i) for i in range(10)])

# Large allocation and deletion
lst = [i for i in range(1_000_000)]
del lst
print("List deleted")

# Track memory allocations
tracemalloc.start()
data = [str(i) for i in range(10000)]
current, peak = tracemalloc.get_traced_memory()
print(f"Current: {current}, Peak: {peak}")
tracemalloc.stop()
