What does the "mutable default arguments" Python question test?

Note on framing: This explainer is for a Python item with one objectively correct answer rather than for a personality trait or graded-quality scenario. The W6 explainer template was originally designed for trait-level and judgment-level items; for objectively-correct items, the framing shifts from “what does this measure?” to “what concept does this test, why is the answer right, and what do the wrong answers reveal about common mental models.” This explainer is the first item-level Python explainer in the AIEH content set; the framing is documented here so future Python explainers can reuse it.

What this question tests

The item asks when mutable default arguments — like def f(items=[]): — are evaluated in Python. The concept being tested is one of the most-cited Python language-design choices: default argument expressions are evaluated exactly once, at function-definition time, not at call time. The same default-value object is reused across every call that doesn’t override the argument.

For immutable defaults (numbers, strings, tuples), the once-at-definition-time semantics is invisible — you can’t tell whether the integer 3 was created at definition time or call time because integers are immutable and behave identically either way. For mutable defaults (lists, dicts, sets, custom objects), the semantics is suddenly visible and produces a class of bugs that surface as “my function’s state mysteriously persists across calls.”
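A minimal sketch of why the semantics stays invisible for immutable defaults. The function name `bump` is hypothetical and used only for illustration; the point is that `count += 1` rebinds the local name to a new integer and leaves the stored default untouched.

```python
def bump(count=0):
    # For an int, `+=` creates a new object and rebinds the local name;
    # the default stored on the function object is never modified.
    count += 1
    return count

first = bump()
second = bump()

print(first, second)       # 1 1
print(bump.__defaults__)   # (0,)
```

Because nothing can mutate the stored `0`, there is no observable difference between definition-time and call-time evaluation here.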

Why this is the right answer

The correct option is “Once at function definition time; the same list object is reused on every subsequent call.” The Python language reference is explicit on this point: default parameter values are evaluated when the def statement executes, and the resulting object is bound to the function’s defaults. The function object stores a reference to this object in its __defaults__ attribute; subsequent calls that don’t pass an argument receive the same object reference each time.
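The shared reference can be observed directly on the function object. The sketch below uses a hypothetical function `collect` with the same shape as the item's example and checks object identity rather than printed contents.

```python
def collect(item, bucket=[]):   # hypothetical function, same shape as the item's example
    bucket.append(item)
    return bucket

stored = collect.__defaults__[0]   # the list object created when `def` executed
size_before = len(stored)

first = collect(1)
second = collect(2)

print(size_before)            # 0
print(first is stored)        # True: the call returned the stored default itself
print(second is stored)       # True: so did the next call
print(collect.__defaults__)   # ([1, 2],)
```

Both calls hand back the very object held in `__defaults__`, which is why mutations accumulate.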

A short trace illustrates the gotcha:

def append_to(item, items=[]):
    items.append(item)
    return items

print(append_to(1))  # [1]
print(append_to(2))  # [1, 2]  -- not [2]
print(append_to(3))  # [1, 2, 3]

The list is created once when def append_to runs; every call that doesn’t pass items mutates that one list. The fix is the canonical Python idiom of using None as the sentinel default and creating a fresh container inside the function body:

def append_to(item, items=None):
    if items is None:
        items = []
    items.append(item)
    return items

This pattern is the standard fix in production Python codebases. The Python tutorial (https://docs.python.org/3/tutorial/controlflow.html#default-argument-values) calls the behavior out with an explicit “Important warning” that the default value is evaluated only once.

What the wrong answers reveal

The three incorrect options each map to a common but mistaken mental model:

  • “Each time the function is called, fresh.” This is what most Python learners initially expect, and what the language would do if defaults followed the natural-language reading of “default value.” Python’s actual semantics, once at definition time, is a deliberate language-design choice that trades intuitive behavior for performance and consistency with how function objects are constructed. Respondents picking this option typically haven’t yet encountered the mutable-default gotcha in production.
  • “Only the first time the function is called.” A subtler misconception that combines the once-only semantics (correct) with call-time evaluation (incorrect). The default expression runs when the def statement executes, which for module-level functions is typically import time, so the default object already exists before any call is made.
  • “Whenever the function is called without specifying that argument, a new default is created.” The most common wrong pick among respondents who have heard about the gotcha but haven’t internalized the actual semantics: they remember that “something weird happens” but invert the direction. This option describes what Python would do under the natural-language reading; the actual behavior is the opposite.
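The difference between definition-time and first-call evaluation can be made visible with a default expression that has a side effect. This is a sketch only; `make_default`, `collect`, and the `events` log are hypothetical names introduced for illustration.

```python
events = []

def make_default():
    # Hypothetical helper: records the moment the default expression is evaluated.
    events.append("default evaluated")
    return []

events.append("before def")

def collect(item, bucket=make_default()):   # make_default() runs here, as `def` executes
    bucket.append(item)
    return bucket

events.append("after def, before any call")
collect(1)
collect(2)
events.append("after two calls")

print(events)
# ['before def', 'default evaluated', 'after def, before any call', 'after two calls']
```

The log shows a single evaluation, and it happens before the first call, which rules out all three wrong options at once.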

Distinguishing among these three wrong answers gives the AIEH Python Fundamentals full assessment a clearer signal about where a candidate’s Python mental model breaks than a binary right/wrong scoring would. The sample test uses binary scoring (value 5 for correct, value 1 for any incorrect option) for simplicity; the full assessment can apply graduated values to the wrong answers to extract richer signal.

How the sample test scores you

In the AIEH 5-question Python Fundamentals sample, this item contributes one of five datapoints aggregated into a single python_proficiency score via the W3.2 normalize-by-count threshold. Binary scoring per item: 5 for the correct option, 1 for any of the three wrong options. With 5 binary items, the average ranges 1–5 and the level threshold maps avg ≤ 2 to low, ≤ 4 to mid, > 4 to high.
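The arithmetic above can be tabulated for every possible number of correct answers. This is a sketch based only on the values and thresholds stated in the text; the function `level` and the constants are hypothetical names, and the actual W3.2 implementation may differ.

```python
CORRECT, INCORRECT = 5, 1   # per-item values stated in the text
N_ITEMS = 5

def level(avg):
    # Thresholds as stated: avg <= 2 -> low, avg <= 4 -> mid, avg > 4 -> high.
    if avg <= 2:
        return "low"
    if avg <= 4:
        return "mid"
    return "high"

table = {}
for n_correct in range(N_ITEMS + 1):
    total = n_correct * CORRECT + (N_ITEMS - n_correct) * INCORRECT
    avg = total / N_ITEMS
    table[n_correct] = (avg, level(avg))
    print(n_correct, avg, level(avg))
```

Under these assumptions, 0 or 1 correct answers land in low (averages 1.0 and 1.8), 2 or 3 in mid (2.6 and 3.4), and 4 or 5 in high (4.2 and 5.0).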

Data Notice: Sample-test results are directional indicators only. A 5-question sample can’t reliably distinguish between “knows Python idioms” and “got lucky on these specific items”; for a verified Skills Passport credential, take the full 50-question assessment.

The full assessment probes data structures, idioms, function semantics, performance, async, generators, and the specific gotchas (mutable defaults, late-binding closures, broadcasting edge cases) at depth. See the scoring methodology for how Python scores map onto the AIEH 300–850 Skills Passport scale.

  • Late-binding closures. Another Python gotcha in the same language-design family — closures over loop variables capture the variable by reference, not by value. The lambda x: x + i pattern in a for i in range(...) loop captures i itself, producing a closure that uses the final loop value rather than each loop’s value.
  • Function-object construction. def statements produce function objects with __defaults__, __kwdefaults__, and __closure__ attributes that store the bound default values and captured-variable cells. The mutable-default gotcha is a visible consequence of this construction.
  • The None sentinel pattern. The canonical Python idiom for “I want a fresh mutable default each call”: use None as the default, check for it inside the function body, and create the fresh container there. It is the standard choice wherever a function needs a list, dict, or set as its default.
  • Why Python’s design isn’t crazy. The once-at-definition semantics simplifies the function-object model and makes default values visible via inspect.signature(). Languages that evaluate defaults at call time (Ruby, JavaScript) have their own trade-offs. The Python design is a deliberate choice with documented rationale.
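The late-binding closure behavior from the list above can be sketched in a few lines. The variable names are hypothetical. The second loop uses a default argument to capture each value, which works precisely because defaults are evaluated when the lambda is created.

```python
late = []
for i in range(3):
    late.append(lambda x: x + i)         # closes over the variable i, not its current value

late_results = [f(10) for f in late]
print(late_results)                      # [12, 12, 12]

bound = []
for i in range(3):
    bound.append(lambda x, i=i: x + i)   # default evaluated now, freezing the current i

bound_results = [f(10) for f in bound]
print(bound_results)                     # [10, 11, 12]
print(bound[1].__defaults__)             # (1,)
```

The same definition-time rule that causes the mutable-default gotcha is what makes the `i=i` workaround behave as intended.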

For the broader Python Fundamentals lineup including the full 50-question assessment when it ships, see the tests catalog.


Try the question yourself

This explainer covers what the item measures. To see how you score on the full Python Fundamentals family, take the free 5-question sample.