reduce gc_collect_harder default to 1 on CPython by miketheman · Pull Request #14441 · pytest-dev/pytest

miketheman · 2026-05-05T15:27:08Z

The 5-iteration default was borrowed from the Trio project, where it was determined empirically to handle PyPy's object resurrection behavior: on PyPy, objects like coroutines can survive GC rounds because executing their __del__ method can resurrect them.

On CPython, reference counting frees most objects immediately. One GC pass is sufficient to handle reference cycles, as confirmed by all test_unraisableexception tests passing (including the refcycle variants).

Use 1 pass on CPython and retain 5 on PyPy.
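
A minimal sketch of the idea described above, not pytest's actual implementation: choose the number of collection passes from the running interpreter, then call `gc.collect()` that many times. The helper names `default_gc_iterations`, `collect_harder`, and `cycle_is_collected` are hypothetical and exist only for illustration.

```python
from __future__ import annotations

import gc
import platform
import weakref


def default_gc_iterations(implementation: str | None = None) -> int:
    """Hypothetical helper: 5 passes on PyPy, 1 on every other interpreter."""
    if implementation is None:
        implementation = platform.python_implementation()
    return 5 if implementation == "PyPy" else 1


def collect_harder(iterations: int) -> None:
    """Run the cyclic garbage collector the requested number of times."""
    for _ in range(iterations):
        gc.collect()


class Node:
    def __init__(self) -> None:
        # The self-reference forms a cycle that reference counting alone cannot free.
        self.me = self


def cycle_is_collected(iterations: int) -> bool:
    """Create a reference cycle, drop it, collect, and report whether it is gone."""
    node = Node()
    ref = weakref.ref(node)
    del node
    collect_harder(iterations)
    return ref() is None


if __name__ == "__main__":
    passes = default_gc_iterations()
    print(f"{platform.python_implementation()}: {passes} pass(es), "
          f"cycle collected: {cycle_is_collected(passes)}")
```

The cycle here has no finalizer, so it only demonstrates the simple case from the description; it does not reproduce the PyPy resurrection behavior that motivated the 5-pass default.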

miketheman · 2026-05-05T15:28:45Z

Draft, since I'm not certain this is a good idea.
I was able to shave ~30% off test runtime on a CPython-only test run locally; I'm curious to see the timings produced by GH Actions.

RonnyPfannschmidt · 2026-05-06T07:29:55Z

I'm not familiar with the plugin myself.
Is there a reasonable benchmark we could use as an indicator?

miketheman · 2026-05-06T13:58:40Z

Unknown. Looking at the history, there's already one case that resets it to 0 no matter what: #13482

bluetech · 2026-05-08T20:44:33Z

I was uneasy with the multiple gc rounds, but I couldn't entirely demonstrate the slowdown. For me, your report is sufficient. The unraisableexception plugin is non-deterministic, and doing multiple gc rounds to try to make it deterministic just seems too heavy. So basically +1 from me. I would also not add a special case for PyPy.

bluetech · 2026-05-08T20:47:54Z

Table 1: Total Job Time (wall clock)

Job                                          Upstream       Ours      Delta   % Change
--------------------------------------------------------------------------------------
  doctesting                                    0m45s      0m53s        +8s     +17.8%
  macos-py310                                   2m03s      1m38s       -25s     -20.3%
  macos-py312                                   1m30s      1m23s        -7s      -7.8%
  macos-py313                                   1m26s      1m46s       +20s     +23.3%
  macos-py314                                   1m49s      1m20s       -29s     -26.6%
  plugins                                       1m19s      0m49s       -30s     -38.0%
  ubuntu-py310-freeze                           0m47s      0m55s        +8s     +17.0%
  ubuntu-py310-lsof-numpy-pexpect              13m58s     12m44s     -1m14s      -8.8%
  ubuntu-py310-pluggy                           2m19s      1m50s       -29s     -20.9%
  ubuntu-py310-unittest-asynctest               0m47s      0m47s        +0s       0.0%
  ubuntu-py310-unittest-twisted24               0m49s      0m52s        +3s      +6.1%
  ubuntu-py310-unittest-twisted25               0m48s      0m52s        +4s      +8.3%
  ubuntu-py310-xdist                            2m11s      1m55s       -16s     -12.2%
  ubuntu-py311                                  7m56s      6m33s     -1m23s     -17.4%
  ubuntu-py312                                  8m32s      7m30s     -1m02s     -12.1%
  ubuntu-py313-pexpect                          9m00s      7m25s     -1m35s     -17.6%
  ubuntu-py314                                  7m48s      6m56s       -52s     -11.1%
  ubuntu-pypy3-xdist                            4m13s      4m23s       +10s      +4.0%  [PyPy]
  windows-py310-pluggy                          3m37s      3m22s       -15s      -6.9%
  windows-py310-unittest-asynctest              1m26s      1m35s        +9s     +10.5%
  windows-py310-unittest-twisted24              1m30s      1m45s       +15s     +16.7%
  windows-py310-unittest-twisted25              1m32s      1m38s        +6s      +6.5%
  windows-py310-xdist                           3m35s      3m40s        +5s      +2.3%
  windows-py311                                 7m18s      6m36s       -42s      -9.6%
  windows-py312                                 7m47s      6m23s     -1m24s     -18.0%
  windows-py313                                 5m39s      6m24s       +45s     +13.3%
  windows-py314                                13m56s     12m00s     -1m56s     -13.9%
--------------------------------------------------------------------------------------
  TOTAL (sum of all jobs)                     114m20s    103m54s    -10m26s      -9.1%

Table 2: Test Step Only (excludes setup/checkout/install/upload)

Job                                          Upstream       Ours      Delta   % Change
--------------------------------------------------------------------------------------
  doctesting                                    0m25s      0m26s        +1s      +4.0%
  macos-py310                                   1m26s      1m05s       -21s     -24.4%
  macos-py312                                   1m10s      1m02s        -8s     -11.4%
  macos-py313                                   1m08s      1m24s       +16s     +23.5%  ← noise
  macos-py314                                   1m29s      1m01s       -28s     -31.5%
  plugins                                       0m37s      0m35s        -2s      -5.4%
  ubuntu-py310-freeze                           0m26s      0m33s        +7s     +26.9%  ← noise
  ubuntu-py310-lsof-numpy-pexpect              13m42s     12m28s     -1m14s      -9.0%
  ubuntu-py310-pluggy                           1m58s      1m37s       -21s     -17.8%
  ubuntu-py310-unittest-asynctest               0m30s      0m29s        -1s      -3.3%
  ubuntu-py310-unittest-twisted24               0m34s      0m34s        +0s       0.0%
  ubuntu-py310-unittest-twisted25               0m29s      0m34s        +5s     +17.2%
  ubuntu-py310-xdist                            1m53s      1m37s       -16s     -14.2%
  ubuntu-py311                                  7m33s      6m19s     -1m14s     -16.3%
  ubuntu-py312                                  8m15s      7m14s     -1m01s     -12.3%
  ubuntu-py313-pexpect                          8m40s      7m10s     -1m30s     -17.3%
  ubuntu-py314                                  7m23s      6m38s       -45s     -10.2%
  ubuntu-pypy3-xdist                            3m51s      3m52s        +1s      +0.4%  [PyPy — no change expected]
  windows-py310-pluggy                          2m41s      2m42s        +1s      +0.6%
  windows-py310-unittest-asynctest              0m44s      0m48s        +4s      +9.1%
  windows-py310-unittest-twisted24              0m49s      0m54s        +5s     +10.2%
  windows-py310-unittest-twisted25              0m48s      0m50s        +2s      +4.2%
  windows-py310-xdist                           2m55s      3m02s        +7s      +4.0%
  windows-py311                                 6m37s      5m59s       -38s      -9.6%
  windows-py312                                 6m56s      5m47s     -1m09s     -16.6%
  windows-py313                                 4m40s      5m32s       +52s     +18.6%  ← noise
  windows-py314                                12m14s     11m09s     -1m05s      -8.9%
--------------------------------------------------------------------------------------
  TOTAL (sum of all test steps)                99m53s     91m21s     -8m32s      -8.5%

Table 3: Test Step Summary by Platform

Group                              Upstream (test)  Ours (test)    Saved        %
----------------------------------------------------------------------------------
  Linux (full suite)                       23m11s       20m11s   -3m00s    12.9%
  Linux (variants)                         32m03s       28m54s   -3m09s     9.8%
  macOS                                     5m13s        4m32s     -41s    13.1%
  Windows                                  38m24s       36m43s   -1m41s     4.4%
  Other                                     1m02s        1m01s      -1s     1.6%
----------------------------------------------------------------------------------
  TOTAL                                    99m53s       91m21s   -8m32s     8.5%
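
As a cross-check of Table 3, the sketch below recomputes the group percentages and the overall total from the durations printed in the table. The helper names `to_seconds` and `percent_saved` are made up for this example; the durations are copied from the table as-is.

```python
import re

# (upstream, ours) test-step durations copied from Table 3.
GROUPS = {
    "Linux (full suite)": ("23m11s", "20m11s"),
    "Linux (variants)": ("32m03s", "28m54s"),
    "macOS": ("5m13s", "4m32s"),
    "Windows": ("38m24s", "36m43s"),
    "Other": ("1m02s", "1m01s"),
}


def to_seconds(text: str) -> int:
    """Convert a duration such as '23m11s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+)s", text)
    if match is None:
        raise ValueError(f"unrecognised duration: {text!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + int(seconds)


def percent_saved(before: int, after: int) -> float:
    """Percentage of 'before' that was saved, rounded to one decimal place."""
    return round((before - after) / before * 100, 1)


# Sum the per-group durations to rebuild the TOTAL row.
total_upstream = sum(to_seconds(u) for u, _ in GROUPS.values())
total_ours = sum(to_seconds(o) for _, o in GROUPS.values())

if __name__ == "__main__":
    for name, (u, o) in GROUPS.items():
        print(f"{name:<20} {percent_saved(to_seconds(u), to_seconds(o)):>5.1f}%")
    print(f"{'TOTAL':<20} {percent_saved(total_upstream, total_ours):>5.1f}%")
```

The group rows sum to the same 99m53s and 91m21s shown in the TOTAL row, so the 8.5% figure follows directly from the per-group numbers.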

Key takeaways:

-8m32s (-8.5%) saved across all test steps summed; -10m26s (-9.1%) in total wall-clock CI time
Linux full-suite jobs (py311–py314) show the cleanest signal: consistent 10–17% savings per job
PyPy (ubuntu-pypy3-xdist) is flat (+0.4%) — expected, since we kept 5 iterations for PyPy
Windows py310 variant jobs (asynctest, twisted, xdist) show small regressions: the unittest variants run very few tests (<1m each), so the GC savings are tiny and Windows runner variance dominates
windows-py313 +52s is almost certainly runner noise — Windows CI timing is high-variance and that job has no structural reason to regress
macos-py313 +16s similarly looks like variance; macos-py314 saved 28s on the same runner type

The signal is clear and consistent on Linux where runners are stable.
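
To make the Linux versus Windows comparison concrete, here is a small sketch that recomputes the per-job percentage change from the Table 2 test-step durations and reports the spread between the best and worst job in each set. The job selection and the `spread` helper are choices made for this example. With a single run per job, the spread is only a rough indicator, not a variance estimate.

```python
# (upstream_seconds, our_seconds) for the test step, converted from Table 2.
LINUX_JOBS = {
    "ubuntu-py311": (7 * 60 + 33, 6 * 60 + 19),
    "ubuntu-py312": (8 * 60 + 15, 7 * 60 + 14),
    "ubuntu-py313-pexpect": (8 * 60 + 40, 7 * 60 + 10),
    "ubuntu-py314": (7 * 60 + 23, 6 * 60 + 38),
}
WINDOWS_JOBS = {
    "windows-py311": (6 * 60 + 37, 5 * 60 + 59),
    "windows-py312": (6 * 60 + 56, 5 * 60 + 47),
    "windows-py313": (4 * 60 + 40, 5 * 60 + 32),
    "windows-py314": (12 * 60 + 14, 11 * 60 + 9),
}


def percent_change(before: int, after: int) -> float:
    """Signed percentage change; negative means the job got faster."""
    return (after - before) / before * 100


def changes(jobs: dict[str, tuple[int, int]]) -> dict[str, float]:
    """Percentage change for every job in the mapping."""
    return {name: percent_change(u, o) for name, (u, o) in jobs.items()}


def spread(jobs: dict[str, tuple[int, int]]) -> float:
    """Difference in percentage points between the worst and best job."""
    values = changes(jobs).values()
    return max(values) - min(values)


if __name__ == "__main__":
    for label, jobs in (("Linux", LINUX_JOBS), ("Windows", WINDOWS_JOBS)):
        for name, change in changes(jobs).items():
            print(f"{name:<22} {change:+6.1f}%")
        print(f"{label} spread: {spread(jobs):.1f} percentage points\n")
```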

Comparison script (uses gh CLI to fetch data):

#!/usr/bin/env python3
"""Compare test step timings between two GitHub Actions workflow runs.

Usage:
    python scripts/compare_ci_runs.py <upstream_run_id> <our_run_id> [--repo REPO]

Example:
    python scripts/compare_ci_runs.py 26336401841 26344890356
    python scripts/compare_ci_runs.py 26336401841 26344890356 --repo pytest-dev/pytest

Requires: gh CLI authenticated with repo read access.
"""
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from datetime import datetime


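def secs(a: str | None, b: str | None) -> int | None:
    """Return whole seconds from timestamp a to b, or None if either is missing or unparsable."""
    if not a or not b:
        return None
    # Timestamps are expected in UTC ISO 8601 form, e.g. 2026-05-23T22:11:00Z.
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    try:
        return int((datetime.strptime(b, fmt) - datetime.strptime(a, fmt)).total_seconds())
    except ValueError:
        return None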


def fmt_secs(s: int | None) -> str:
    if s is None:
        return "—"
    m, sec = divmod(s, 60)
    return f"{m}m{sec:02d}s"


def delta_fmt(before: int | None, after: int | None) -> str:
    if before is None or after is None:
        return "—"
    d = after - before
    prefix = "+" if d >= 0 else "-"
    m, s = divmod(abs(d), 60)
    return f"{prefix}{m}m{s:02d}s" if m else f"{prefix}{s}s"


def pct(before: int | None, after: int | None) -> str:
    if not before or after is None:
        return "—"
    p = ((after - before) / before) * 100
    return f"{'+' if p > 0 else ''}{p:.1f}%"


def get_jobs(run_id: int, repo: str) -> list[dict]:
    result = subprocess.run(
        ["gh", "run", "view", str(run_id), "--repo", repo, "--json", "jobs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["jobs"]


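def parse_jobs(jobs: list[dict]) -> dict[str, dict]:
    """Map each "build (...)" job name to its total and test-step durations in seconds."""
    parsed = {}
    for job in jobs:
        # Only the matrix jobs whose names start with "build" are compared.
        if not job["name"].startswith("build"):
            continue
        job_secs = secs(job["startedAt"], job["completedAt"])
        # Sum every step named "Test ..."; missing, unparsable, or negative
        # durations contribute 0.
        test_secs_total = sum(
            max(secs(s["startedAt"], s["completedAt"]) or 0, 0)
            for s in job["steps"]
            if s["name"].startswith("Test ")
        )
        parsed[job["name"]] = {
            "job_secs": job_secs,
            # A total of 0 means no test step was timed; report it as missing.
            "test_secs": test_secs_total or None,
        }
    return parsed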


PLATFORM_GROUPS = {
    "Linux (full suite)": ["ubuntu-py310", "ubuntu-py311", "ubuntu-py312", "ubuntu-py313", "ubuntu-py314"],
    "Linux (variants)": [
        "ubuntu-py310-freeze", "ubuntu-py310-lsof-numpy-pexpect", "ubuntu-py310-pluggy",
        "ubuntu-py310-unittest-asynctest", "ubuntu-py310-unittest-twisted24",
        "ubuntu-py310-unittest-twisted25", "ubuntu-py310-xdist", "ubuntu-py313-pexpect",
        "ubuntu-pypy3-xdist",
    ],
    "macOS": ["macos-py310", "macos-py312", "macos-py313", "macos-py314"],
    "Windows": [
        "windows-py310-pluggy", "windows-py310-unittest-asynctest", "windows-py310-unittest-twisted24",
        "windows-py310-unittest-twisted25", "windows-py310-xdist", "windows-py311",
        "windows-py312", "windows-py313", "windows-py314",
    ],
}


def get_group(short_name: str) -> str:
    for group, members in PLATFORM_GROUPS.items():
        if short_name in members:
            return group
    return "Other"


def print_table(headers: list[str], rows: list[list[str]], footer: list[str] | None = None) -> None:
    widths = [max(len(str(row[i])) for row in ([headers] + rows + ([footer] if footer else []))) for i in range(len(headers))]
    sep = "-" * (sum(widths) + 3 * len(widths) + 1)
    fmt = "  " + "  ".join(f"{{:<{widths[0]}}}" if i == 0 else f"{{:>{widths[i]}}}" for i in range(len(headers)))
    print(fmt.format(*headers))
    print(sep)
    for row in rows:
        print(fmt.format(*row))
    if footer:
        print(sep)
        print(fmt.format(*footer))


def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("upstream_run_id", type=int)
    parser.add_argument("our_run_id", type=int)
    parser.add_argument("--repo", default="pytest-dev/pytest")
    args = parser.parse_args()

    repo = args.repo
    upstream_url = f"https://github.com/{repo}/actions/runs/{args.upstream_run_id}"
    our_url = f"https://github.com/{repo}/actions/runs/{args.our_run_id}"

    print(f"Fetching upstream run {args.upstream_run_id}...")
    upstream = parse_jobs(get_jobs(args.upstream_run_id, repo))
    print(f"Fetching our run {args.our_run_id}...")
    ours = parse_jobs(get_jobs(args.our_run_id, repo))

    all_names = sorted(set(upstream) | set(ours))

    print(f"\n{'=' * 90}")
    print(f"UPSTREAM: {upstream_url}")
    print(f"OURS:     {our_url}")
    print(f"{'=' * 90}")

    # Table 1: total job time
    print("\n### Table 1: Total Job Time (wall clock)\n")
    rows, total_u, total_o = [], 0, 0
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        u, o = upstream.get(name, {}), ours.get(name, {})
        uj, oj = u.get("job_secs"), o.get("job_secs")
        if uj: total_u += uj
        if oj: total_o += oj
        note = "  [PyPy]" if "pypy" in name else ""
        rows.append([short + note, fmt_secs(uj), fmt_secs(oj), delta_fmt(uj, oj), pct(uj, oj)])
    print_table(
        ["Job", "Upstream", "Ours", "Delta", "% Change"],
        rows,
        ["TOTAL (sum of all jobs)", fmt_secs(total_u), fmt_secs(total_o), delta_fmt(total_u, total_o), pct(total_u, total_o)],
    )

    # Table 2: test step only
    print("\n\n### Table 2: Test Step Duration Only (excludes setup/checkout/install/upload)\n")
    rows, total_u, total_o = [], 0, 0
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        u, o = upstream.get(name, {}), ours.get(name, {})
        ut, ot = u.get("test_secs"), o.get("test_secs")
        if ut: total_u += ut
        if ot: total_o += ot
        note = "  [PyPy]" if "pypy" in name else ""
        rows.append([short + note, fmt_secs(ut), fmt_secs(ot), delta_fmt(ut, ot), pct(ut, ot)])
    print_table(
        ["Job", "Upstream", "Ours", "Delta", "% Change"],
        rows,
        ["TOTAL (sum of all test steps)", fmt_secs(total_u), fmt_secs(total_o), delta_fmt(total_u, total_o), pct(total_u, total_o)],
    )

    # Table 3: by platform
    print("\n\n### Table 3: Test Step Summary by Platform\n")
    group_u: dict[str, int] = {}
    group_o: dict[str, int] = {}
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        g = get_group(short)
        group_u[g] = group_u.get(g, 0) + (upstream.get(name, {}).get("test_secs") or 0)
        group_o[g] = group_o.get(g, 0) + (ours.get(name, {}).get("test_secs") or 0)
    rows, gtot_u, gtot_o = [], 0, 0
    for g in [*PLATFORM_GROUPS, "Other"]:
        gu, go = group_u.get(g, 0), group_o.get(g, 0)
        gtot_u += gu
        gtot_o += go
        rows.append([g, fmt_secs(gu), fmt_secs(go), delta_fmt(gu, go), pct(gu, go)])
    print_table(
        ["Group", "Upstream (test)", "Ours (test)", "Saved", "%"],
        rows,
        ["TOTAL", fmt_secs(gtot_u), fmt_secs(gtot_o), delta_fmt(gtot_u, gtot_o), pct(gtot_u, gtot_o)],
    )


if __name__ == "__main__":
    main()

miketheman force-pushed the miketheman/speed-up-test-sutie branch from 07255a2 to e2bc38b on May 23, 2026 21:54

miketheman added 2 commits May 23, 2026 18:08

docs: add changelog

07cdaab

Signed-off-by: Mike Fiedler <miketheman@gmail.com>

psf-chronographer (bot) added the "bot:chronographer:provided" label ((automation) changelog entry is part of PR) on May 23, 2026

miketheman marked this pull request as ready for review May 23, 2026 22:11

miketheman wants to merge 3 commits into pytest-dev:main from miketheman:miketheman/speed-up-test-sutie

graingert commented May 9, 2026
