What does the "dict insertion order" Python question test?

What this question tests

The item asks how dictionaries iterate in Python 3.7 and later. The concept being tested is the language guarantee that dictionaries iterate in insertion order. CPython 3.6 introduced the behavior as an implementation detail, a side effect of a memory-layout optimization, and Python 3.7 promoted it to a guaranteed language feature. The question probes whether the candidate has internalized current Python semantics rather than the pre-3.7 “no guaranteed iteration order” model that years of older Python documentation and tutorials reinforce.

The shift from “no guaranteed order” to “guaranteed insertion order” is one of the more meaningful Python language changes of the past decade. It changed what code may assume about dict iteration, and it made collections.OrderedDict, which previously served the “dict-with-guaranteed-order” use case, substantially less necessary in modern Python codebases.

Why this is the right answer

The correct option is “Insertion order, by language guarantee — keys are iterated in the order they were first inserted.” The “What’s New In Python 3.7” notes and the dict section of the standard library documentation both state this as a language guarantee rather than an implementation detail.

A short trace illustrates the semantics:

d = {}
d['c'] = 1
d['a'] = 2
d['b'] = 3
list(d)  # ['c', 'a', 'b'] — insertion order, not sorted

Reassigning a key doesn’t change its position in iteration; the position is set when the key is first inserted, not on subsequent assignments:

d = {'a': 1, 'b': 2}
d['a'] = 99  # value changes; iteration position doesn't
list(d)  # ['a', 'b']

Removing and re-inserting a key gives it a new position (at the end):

del d['a']
d['a'] = 1
list(d)  # ['b', 'a'] — 'a' was re-inserted, so it's now last

The behavior is now part of Python’s guaranteed semantics; code that relies on dict iteration order (for serialization, ordered-output generation, or readable debugging) works predictably across CPython, PyPy, and other compliant implementations from Python 3.7 forward.
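
As a minimal sketch of the ordered-output use case, the snippet below builds a header row and a data row separately from the same dict. The record and its field names are hypothetical and only serve the illustration. It relies on keys() and values() both following insertion order, so the two rows stay aligned.

```python
# Hypothetical record; the field names are illustrative only.
record = {}
record["id"] = 7
record["name"] = "Ada"
record["email"] = "ada@example.org"

# keys() and values() follow the same insertion order,
# so rows built separately from them line up column by column.
header = ",".join(record.keys())
row = ",".join(str(value) for value in record.values())

print(header)  # id,name,email
print(row)     # 7,Ada,ada@example.org
```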

What the wrong answers reveal

The three incorrect options each map to a common but mistaken mental model:

“No guaranteed iteration order; the order is implementation-defined and may differ across runs.” This was the correct answer in Python ≤3.6 and remains the conventional wisdom in older Python tutorials, Stack Overflow answers from before 2018, and printed books that haven’t been revised. Respondents picking this option likely learned Python before 2018 and haven’t updated their mental model to the 3.7 guarantee.
“Sorted order by key, lexicographically.” This is what some other languages do (notably some default JSON serializers, and the C++ std::map ordering), and what Python beginners sometimes assume by analogy with sorted iteration. Python’s dict has never sorted by key in general iteration; respondents picking this option likely haven’t internalized the actual semantics from any Python version.
“Hash-bucket order, optimized for lookup speed but unpredictable to the user.” This was a common explanation for why pre-3.7 dict iteration order was unpredictable. CPython’s pre-3.6 dict implementation did expose hash-bucket order, and hash randomization and resizing made that layout unpredictable from the user’s perspective. Respondents picking this option likely have a sophisticated pre-3.7 mental model but haven’t updated to the 3.7 guarantee.
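
The difference between the correct answer and the “sorted order” distractor can be checked directly. The short sketch below, assuming Python 3.7 or later, shows that plain iteration follows insertion order and that sorted keys appear only when sorting is requested explicitly.

```python
d = {}
d["c"] = 1
d["a"] = 2
d["b"] = 3

insertion_order = list(d)  # what plain iteration yields
sorted_order = sorted(d)   # sorting happens only on request

print(insertion_order)  # ['c', 'a', 'b']
print(sorted_order)     # ['a', 'b', 'c']
```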

The three wrong-answer patterns cluster usefully into different gaps: pre-3.7 mental model that hasn’t updated (option A), fundamental misunderstanding of dict semantics (option C), and pre-3.7 sophisticated-but-stale model (option D). The full Python Fundamentals assessment can apply graduated values to differentiate these gaps; the 5-question sample uses binary scoring for simplicity.

How the sample test scores you

In the AIEH 5-question Python Fundamentals sample, this item contributes one of five datapoints aggregated into a single python_proficiency score via the W3.2 normalize-by-count threshold. Binary scoring per item: 5 for the correct option, 1 for any of the three wrong options. With 5 binary items, the average ranges 1–5 and the level threshold maps avg ≤ 2 to low, ≤ 4 to mid, > 4 to high.
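
The arithmetic above can be sketched as follows. This is an illustration only: the function names are hypothetical, and it assumes the normalize-by-count step is a plain mean of the item scores, which the description implies but does not spell out.

```python
CORRECT_SCORE = 5  # value awarded for the correct option
WRONG_SCORE = 1    # value awarded for any wrong option


def item_score(is_correct: bool) -> int:
    """Binary per-item score as described in the text."""
    return CORRECT_SCORE if is_correct else WRONG_SCORE


def proficiency_level(results: list[bool]) -> tuple[float, str]:
    """Average the item scores and map the average to a level.

    Assumption: normalize-by-count means dividing by the item count.
    """
    scores = [item_score(result) for result in results]
    avg = sum(scores) / len(scores)
    if avg <= 2:
        level = "low"
    elif avg <= 4:
        level = "mid"
    else:
        level = "high"
    return avg, level


# Four correct answers out of five: (5 + 5 + 5 + 5 + 1) / 5 = 4.2
print(proficiency_level([True, True, True, True, False]))  # (4.2, 'high')
```

Under these assumptions, zero or one correct answers land in low, two or three in mid, and four or five in high.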

Data Notice: Sample-test results are directional indicators only. A 5-question sample can’t reliably distinguish between “knows Python idioms” and “got lucky on these specific items”; for a verified Skills Passport credential, take the full 50-question assessment.

The full assessment probes data structures, idioms, function semantics, performance, async, generators, comprehensions, and the specific gotchas (mutable defaults, late-binding closures, broadcasting edge cases) at depth. See the scoring methodology for how Python scores map onto the AIEH 300–850 Skills Passport scale.

Related concepts

OrderedDict’s reduced role in modern Python. Before 3.7, collections.OrderedDict was the canonical way to get a dict-with-guaranteed-order. Since 3.7, plain dict provides the same iteration-order guarantee, making OrderedDict largely redundant. OrderedDict is not deprecated and retains a few specific behaviors that plain dict doesn’t have (equality comparison considering order, move_to_end() method), but most code that historically used OrderedDict can now use plain dict.
Dict equality and order. Plain dict equality compares keys and values regardless of insertion order: {'a':1, 'b':2} == {'b':2, 'a':1} evaluates to True. OrderedDict equality considers order. This is one of the remaining behavioral differences worth knowing about.
JSON and YAML round-trip behavior. Dict insertion order is preserved through json.dumps and json.loads in modern Python, which produces predictable, diff-friendly serialized output. Pre-3.7 code often used OrderedDict to ensure this; modern code can rely on plain dict.
The 3.6 implementation accident → 3.7 language guarantee story. CPython 3.6 changed the dict implementation to use a more memory-efficient layout that happened to preserve insertion order as a side effect. The 3.7 language specification then promoted this to a language guarantee rather than an implementation detail. The history matters because it explains why the 3.6 version had the behavior but couldn’t be relied upon, while 3.7+ can.
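
The differences listed above can be seen in a few lines. The sketch below uses only the standard library; the variable names and sample data are illustrative.

```python
import json
from collections import OrderedDict

# Plain dict equality ignores insertion order.
plain_equal = {"a": 1, "b": 2} == {"b": 2, "a": 1}

# Equality between two OrderedDict objects is order-sensitive.
ordered_equal = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)

# move_to_end() exists on OrderedDict but not on plain dict.
od = OrderedDict(a=1, b=2, c=3)
od.move_to_end("a")
moved = list(od)

# A JSON round trip keeps the key order of a plain dict
# (as long as sort_keys=True is not passed to json.dumps).
payload = {"c": 1, "a": 2, "b": 3}
text = json.dumps(payload)
round_tripped = list(json.loads(text))

print(plain_equal)    # True
print(ordered_equal)  # False
print(moved)          # ['b', 'c', 'a']
print(text)           # {"c": 1, "a": 2, "b": 3}
print(round_tripped)  # ['c', 'a', 'b']
```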

For the broader Python Fundamentals lineup including the full 50-question assessment when it ships, see the tests catalog.

Sources

Deily, N. (2016). PEP 537: Python 3.7 Release Schedule. Python Enhancement Proposals. https://peps.python.org/pep-0537/
Python Software Foundation. (2024). The Python Standard Library: Built-in Types, Mapping Types (dict). https://docs.python.org/3/library/stdtypes.html#dict
Python Software Foundation. (2024). What’s New In Python 3.7. https://docs.python.org/3/whatsnew/3.7.html
Slatkin, B. (2019). Effective Python: 90 Specific Ways to Write Better Python (2nd ed.). Addison-Wesley. Item 18 (“Know How to Construct Key-Dependent Default Values with __missing__”) and Item 23 (“Provide Optional Behavior with Keyword Arguments”) cover related dict semantics.
