reduce gc_collect_harder default to 1 on CPython #14441

Open
miketheman wants to merge 3 commits into pytest-dev:main from miketheman:miketheman/speed-up-test-sutie

@miketheman
Contributor

The 5-iteration default was borrowed from the Trio project, where it was determined empirically to handle PyPy's object resurrection behavior: on PyPy, objects like coroutines can survive GC rounds because executing their `__del__` can resurrect them.

On CPython, reference counting frees most objects immediately. One GC pass is sufficient to handle reference cycles, as confirmed by all test_unraisableexception tests passing (including the refcycle variants).

Use 1 pass on CPython and retain 5 on PyPy.
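
A minimal sketch of the idea, not the actual pytest implementation: the constants and `default_gc_iterations` are hypothetical names introduced for illustration, and the body of `gc_collect_harder` is assumed to be a plain loop over `gc.collect()`. The `Node` class builds a reference cycle with finalizers so the effect of the chosen pass count can be observed.

```python
import gc
import sys

# Hypothetical constants mirroring the values discussed in this PR.
PYPY_ITERATIONS = 5
CPYTHON_ITERATIONS = 1


def default_gc_iterations(implementation_name: str = sys.implementation.name) -> int:
    """Return how many gc.collect() passes to run for the given interpreter."""
    return PYPY_ITERATIONS if implementation_name == "pypy" else CPYTHON_ITERATIONS


def gc_collect_harder(iterations: int) -> None:
    """Run the garbage collector `iterations` times in a row (assumed shape)."""
    for _ in range(iterations):
        gc.collect()


finalized = []  # names of objects whose __del__ has run


class Node:
    """Object with a finalizer, used to build a reference cycle."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.other = None

    def __del__(self) -> None:
        finalized.append(self.name)


def make_cycle() -> None:
    a, b = Node("a"), Node("b")
    # a <-> b: reference counting alone can never free these two objects.
    a.other, b.other = b, a


make_cycle()
gc_collect_harder(default_gc_iterations())
print(sorted(finalized))
```

On CPython this runs a single collection pass, which is enough for both finalizers in the cycle to fire; on PyPy the same call would run five passes.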

@miketheman
Contributor Author

Draft, since I'm not certain this is a good idea.
I was able to shave ~30% off test runtime on a CPython-only test run locally; I'm curious to see the timings produced by GH Actions.

@RonnyPfannschmidt
Member

I'm not familiar with the plugin myself.
Is there a reasonable benchmark we could use as an indicator?

@miketheman
Contributor Author

Unknown - I'm looking at history, and there's already one case that resets it to 0 no matter what - #13482

@bluetech
Member

bluetech commented May 8, 2026

I was uneasy with the multiple gc rounds but I couldn't entirely demonstrate the slowdown. But for me your report is sufficient. The unraisableexception plugin is non-deterministic, and doing multiple gc rounds to try to make it deterministic just seems too heavy. So basically +1 from me. I would also not add a special case for PyPy.

@bluetech
Member

bluetech commented May 8, 2026

See also 391324e.

Also cc @graingert.

@graingert
Member

Thanks for the CC; I have no opinion on this, though.

The 5-iteration default was borrowed from the Trio project, where it was
determined empirically to handle PyPy's object resurrection behavior: on
PyPy, objects like coroutines can survive GC rounds because executing their
__del__ can resurrect them.

On CPython, reference counting frees most objects immediately. One GC pass
is sufficient to handle reference cycles, as confirmed by all
test_unraisableexception tests passing (including the refcycle variants).

Use 1 pass on CPython and retain 5 on PyPy.

Signed-off-by: Mike Fiedler <miketheman@gmail.com>
@miketheman force-pushed the miketheman/speed-up-test-sutie branch from 07255a2 to e2bc38b on May 23, 2026 21:54
Using `timeit`, I found a faster call:

```python
import sys
import timeit

def method_hasattr():
    return 5 if hasattr(sys, "pypy_version_info") else 1

def method_implementation():
    return 5 if sys.implementation.name == "pypy" else 1

time1 = timeit.timeit(method_hasattr, number=10000)
time2 = timeit.timeit(method_implementation, number=10000)

print(f"Method Hasattr: {time1:.5f} seconds")
print(f"Method Implementation: {time2:.5f} seconds")
```

Results:
```
$ PYENV_VERSION=3.14.5 python microbench.py
Method Hasattr: 0.00085 seconds
Method Implementation: 0.00054 seconds

$ PYENV_VERSION=pypy3.11-7.3.22 python microbench.py
Method Hasattr: 0.00768 seconds
Method Implementation: 0.00070 seconds
```

Refs: https://docs.python.org/3/library/sys.html#sys.implementation

Signed-off-by: Mike Fiedler <miketheman@gmail.com>
Signed-off-by: Mike Fiedler <miketheman@gmail.com>
@psf-chronographer psf-chronographer Bot added the bot:chronographer:provided (automation) changelog entry is part of PR label May 23, 2026
@miketheman miketheman marked this pull request as ready for review May 23, 2026 22:11
@miketheman
Contributor Author

miketheman commented May 23, 2026

I found a faster implementation and added a changelog entry. If there's a benchmarking/test suite somewhere I can hook into, I'd be happy to explore that, but I didn't find one, so pointers would be appreciated.

@miketheman
Contributor Author

miketheman commented May 23, 2026

CI results compared:


Table 1: Total Job Time (wall clock)

Job                                          Upstream       Ours      Delta   % Change
--------------------------------------------------------------------------------------
  doctesting                                    0m45s      0m53s        +8s     +17.8%
  macos-py310                                   2m03s      1m38s       -25s     -20.3%
  macos-py312                                   1m30s      1m23s        -7s      -7.8%
  macos-py313                                   1m26s      1m46s       +20s     +23.3%
  macos-py314                                   1m49s      1m20s       -29s     -26.6%
  plugins                                       1m19s      0m49s       -30s     -38.0%
  ubuntu-py310-freeze                           0m47s      0m55s        +8s     +17.0%
  ubuntu-py310-lsof-numpy-pexpect              13m58s     12m44s     -1m14s      -8.8%
  ubuntu-py310-pluggy                           2m19s      1m50s       -29s     -20.9%
  ubuntu-py310-unittest-asynctest               0m47s      0m47s        +0s       0.0%
  ubuntu-py310-unittest-twisted24               0m49s      0m52s        +3s      +6.1%
  ubuntu-py310-unittest-twisted25               0m48s      0m52s        +4s      +8.3%
  ubuntu-py310-xdist                            2m11s      1m55s       -16s     -12.2%
  ubuntu-py311                                  7m56s      6m33s     -1m23s     -17.4%
  ubuntu-py312                                  8m32s      7m30s     -1m02s     -12.1%
  ubuntu-py313-pexpect                          9m00s      7m25s     -1m35s     -17.6%
  ubuntu-py314                                  7m48s      6m56s       -52s     -11.1%
  ubuntu-pypy3-xdist                            4m13s      4m23s       +10s      +4.0%  [PyPy]
  windows-py310-pluggy                          3m37s      3m22s       -15s      -6.9%
  windows-py310-unittest-asynctest              1m26s      1m35s        +9s     +10.5%
  windows-py310-unittest-twisted24              1m30s      1m45s       +15s     +16.7%
  windows-py310-unittest-twisted25              1m32s      1m38s        +6s      +6.5%
  windows-py310-xdist                           3m35s      3m40s        +5s      +2.3%
  windows-py311                                 7m18s      6m36s       -42s      -9.6%
  windows-py312                                 7m47s      6m23s     -1m24s     -18.0%
  windows-py313                                 5m39s      6m24s       +45s     +13.3%
  windows-py314                                13m56s     12m00s     -1m56s     -13.9%
--------------------------------------------------------------------------------------
  TOTAL (sum of all jobs)                     114m20s    103m54s    -10m26s      -9.1%

Table 2: Test Step Only (excludes setup/checkout/install/upload)

Job                                          Upstream       Ours      Delta   % Change
--------------------------------------------------------------------------------------
  doctesting                                    0m25s      0m26s        +1s      +4.0%
  macos-py310                                   1m26s      1m05s       -21s     -24.4%
  macos-py312                                   1m10s      1m02s        -8s     -11.4%
  macos-py313                                   1m08s      1m24s       +16s     +23.5%  ← noise
  macos-py314                                   1m29s      1m01s       -28s     -31.5%
  plugins                                       0m37s      0m35s        -2s      -5.4%
  ubuntu-py310-freeze                           0m26s      0m33s        +7s     +26.9%  ← noise
  ubuntu-py310-lsof-numpy-pexpect              13m42s     12m28s     -1m14s      -9.0%
  ubuntu-py310-pluggy                           1m58s      1m37s       -21s     -17.8%
  ubuntu-py310-unittest-asynctest               0m30s      0m29s        -1s      -3.3%
  ubuntu-py310-unittest-twisted24               0m34s      0m34s        +0s       0.0%
  ubuntu-py310-unittest-twisted25               0m29s      0m34s        +5s     +17.2%
  ubuntu-py310-xdist                            1m53s      1m37s       -16s     -14.2%
  ubuntu-py311                                  7m33s      6m19s     -1m14s     -16.3%
  ubuntu-py312                                  8m15s      7m14s     -1m01s     -12.3%
  ubuntu-py313-pexpect                          8m40s      7m10s     -1m30s     -17.3%
  ubuntu-py314                                  7m23s      6m38s       -45s     -10.2%
  ubuntu-pypy3-xdist                            3m51s      3m52s        +1s      +0.4%  [PyPy — no change expected]
  windows-py310-pluggy                          2m41s      2m42s        +1s      +0.6%
  windows-py310-unittest-asynctest              0m44s      0m48s        +4s      +9.1%
  windows-py310-unittest-twisted24              0m49s      0m54s        +5s     +10.2%
  windows-py310-unittest-twisted25              0m48s      0m50s        +2s      +4.2%
  windows-py310-xdist                           2m55s      3m02s        +7s      +4.0%
  windows-py311                                 6m37s      5m59s       -38s      -9.6%
  windows-py312                                 6m56s      5m47s     -1m09s     -16.6%
  windows-py313                                 4m40s      5m32s       +52s     +18.6%  ← noise
  windows-py314                                12m14s     11m09s     -1m05s      -8.9%
--------------------------------------------------------------------------------------
  TOTAL (sum of all test steps)                99m53s     91m21s     -8m32s      -8.5%

Table 3: Test Step Summary by Platform

Group                              Upstream (test)  Ours (test)    Saved        %
----------------------------------------------------------------------------------
  Linux (full suite)                       23m11s       20m11s   -3m00s    12.9%
  Linux (variants)                         32m03s       28m54s   -3m09s     9.8%
  macOS                                     5m13s        4m32s     -41s    13.1%
  Windows                                  38m24s       36m43s   -1m41s     4.4%
  Other                                     1m02s        1m01s      -1s     1.6%
----------------------------------------------------------------------------------
  TOTAL                                    99m53s       91m21s   -8m32s     8.5%

Key takeaways:

  • -8m32s (-8.5%) saved across all test steps summed; -10m26s (-9.1%) in total wall-clock CI time
  • Linux full-suite jobs (py311–py314) show the cleanest signal: consistent 10–17% savings per job
  • PyPy (ubuntu-pypy3-xdist) is flat (+0.4%) — expected, since we kept 5 iterations for PyPy
  • Windows py310 variant jobs (asynctest, twisted, xdist) show small regressions — these run very few tests (<1m each) so the GC savings are tiny and Windows runner variance dominates
  • windows-py313 +52s is almost certainly runner noise — Windows CI timing is high-variance and that job has no structural reason to regress
  • macos-py313 +16s similarly looks like variance; macos-py314 saved 28s on the same runner type

The signal is clear and consistent on Linux where runners are stable.
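
As a sanity check on the headline numbers, the totals from Table 1 and Table 2 can be recomputed by hand. This is a small sketch; `to_seconds` and `percent_change` are helper names introduced here, not part of the comparison script.

```python
def to_seconds(minutes: int, seconds: int) -> int:
    """Convert a XmYYs duration to whole seconds."""
    return minutes * 60 + seconds


def percent_change(before: int, after: int) -> float:
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100


# Totals copied from the TOTAL rows of Table 2 and Table 1.
test_before, test_after = to_seconds(99, 53), to_seconds(91, 21)
job_before, job_after = to_seconds(114, 20), to_seconds(103, 54)

print(f"test steps: {test_after - test_before:+d}s ({percent_change(test_before, test_after):+.1f}%)")
print(f"all jobs:   {job_after - job_before:+d}s ({percent_change(job_before, job_after):+.1f}%)")
```

The differences come out to 512 seconds (8m32s) and 626 seconds (10m26s), matching the -8.5% and -9.1% reported in the takeaways.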

Comparison script (uses gh CLI to fetch data):

#!/usr/bin/env python3
"""Compare test step timings between two GitHub Actions workflow runs.

Usage:
    python scripts/compare_ci_runs.py <upstream_run_id> <our_run_id> [--repo REPO]

Example:
    python scripts/compare_ci_runs.py 26336401841 26344890356
    python scripts/compare_ci_runs.py 26336401841 26344890356 --repo pytest-dev/pytest

Requires: gh CLI authenticated with repo read access.
"""
from __future__ import annotations

import argparse
import json
import subprocess
import sys
from datetime import datetime


def secs(a: str | None, b: str | None) -> int | None:
    if not a or not b:
        return None
    fmt = "%Y-%m-%dT%H:%M:%SZ"
    try:
        return int((datetime.strptime(b, fmt) - datetime.strptime(a, fmt)).total_seconds())
    except ValueError:
        return None


def fmt_secs(s: int | None) -> str:
    if s is None:
        return "—"
    m, sec = divmod(s, 60)
    return f"{m}m{sec:02d}s"


def delta_fmt(before: int | None, after: int | None) -> str:
    if before is None or after is None:
        return "—"
    d = after - before
    prefix = "+" if d >= 0 else "-"
    m, s = divmod(abs(d), 60)
    return f"{prefix}{m}m{s:02d}s" if m else f"{prefix}{s}s"


def pct(before: int | None, after: int | None) -> str:
    if not before or after is None:
        return "—"
    p = ((after - before) / before) * 100
    return f"{'+' if p > 0 else ''}{p:.1f}%"


def get_jobs(run_id: int, repo: str) -> list[dict]:
    result = subprocess.run(
        ["gh", "run", "view", str(run_id), "--repo", repo, "--json", "jobs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return json.loads(result.stdout)["jobs"]


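def parse_jobs(jobs: list[dict]) -> dict[str, dict]:
    """Map each build job name to its total duration and summed test-step duration."""
    parsed = {}
    for job in jobs:
        # Skip jobs whose name does not start with "build".
        if not job["name"].startswith("build"):
            continue
        job_secs = secs(job["startedAt"], job["completedAt"])
        # Duration of every "Test ..." step; missing or unparsable timestamps count as 0.
        step_secs = [
            secs(s["startedAt"], s["completedAt"]) or 0
            for s in job["steps"]
            if s["name"].startswith("Test ")
        ]
        # Ignore non-positive durations.
        test_secs_total = sum(d for d in step_secs if d > 0)
        parsed[job["name"]] = {
            "job_secs": job_secs,
            "test_secs": test_secs_total or None,
        }
    return parsed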


PLATFORM_GROUPS = {
    "Linux (full suite)": ["ubuntu-py310", "ubuntu-py311", "ubuntu-py312", "ubuntu-py313", "ubuntu-py314"],
    "Linux (variants)": [
        "ubuntu-py310-freeze", "ubuntu-py310-lsof-numpy-pexpect", "ubuntu-py310-pluggy",
        "ubuntu-py310-unittest-asynctest", "ubuntu-py310-unittest-twisted24",
        "ubuntu-py310-unittest-twisted25", "ubuntu-py310-xdist", "ubuntu-py313-pexpect",
        "ubuntu-pypy3-xdist",
    ],
    "macOS": ["macos-py310", "macos-py312", "macos-py313", "macos-py314"],
    "Windows": [
        "windows-py310-pluggy", "windows-py310-unittest-asynctest", "windows-py310-unittest-twisted24",
        "windows-py310-unittest-twisted25", "windows-py310-xdist", "windows-py311",
        "windows-py312", "windows-py313", "windows-py314",
    ],
}


def get_group(short_name: str) -> str:
    for group, members in PLATFORM_GROUPS.items():
        if short_name in members:
            return group
    return "Other"


def print_table(headers: list[str], rows: list[list[str]], footer: list[str] | None = None) -> None:
    widths = [max(len(str(row[i])) for row in ([headers] + rows + ([footer] if footer else []))) for i in range(len(headers))]
    sep = "-" * (sum(widths) + 3 * len(widths) + 1)
    fmt = "  " + "  ".join(f"{{:<{widths[0]}}}" if i == 0 else f"{{:>{widths[i]}}}" for i in range(len(headers)))
    print(fmt.format(*headers))
    print(sep)
    for row in rows:
        print(fmt.format(*row))
    if footer:
        print(sep)
        print(fmt.format(*footer))


def main() -> None:
    parser = argparse.ArgumentParser(description=__doc__, formatter_class=argparse.RawDescriptionHelpFormatter)
    parser.add_argument("upstream_run_id", type=int)
    parser.add_argument("our_run_id", type=int)
    parser.add_argument("--repo", default="pytest-dev/pytest")
    args = parser.parse_args()

    repo = args.repo
    upstream_url = f"https://github.com/{repo}/actions/runs/{args.upstream_run_id}"
    our_url = f"https://github.com/{repo}/actions/runs/{args.our_run_id}"

    print(f"Fetching upstream run {args.upstream_run_id}...")
    upstream = parse_jobs(get_jobs(args.upstream_run_id, repo))
    print(f"Fetching our run {args.our_run_id}...")
    ours = parse_jobs(get_jobs(args.our_run_id, repo))

    all_names = sorted(set(upstream) | set(ours))

    print(f"\n{'=' * 90}")
    print(f"UPSTREAM: {upstream_url}")
    print(f"OURS:     {our_url}")
    print(f"{'=' * 90}")

    # Table 1: total job time
    print("\n### Table 1: Total Job Time (wall clock)\n")
    rows, total_u, total_o = [], 0, 0
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        u, o = upstream.get(name, {}), ours.get(name, {})
        uj, oj = u.get("job_secs"), o.get("job_secs")
        if uj: total_u += uj
        if oj: total_o += oj
        note = "  [PyPy]" if "pypy" in name else ""
        rows.append([short + note, fmt_secs(uj), fmt_secs(oj), delta_fmt(uj, oj), pct(uj, oj)])
    print_table(
        ["Job", "Upstream", "Ours", "Delta", "% Change"],
        rows,
        ["TOTAL (sum of all jobs)", fmt_secs(total_u), fmt_secs(total_o), delta_fmt(total_u, total_o), pct(total_u, total_o)],
    )

    # Table 2: test step only
    print("\n\n### Table 2: Test Step Duration Only (excludes setup/checkout/install/upload)\n")
    rows, total_u, total_o = [], 0, 0
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        u, o = upstream.get(name, {}), ours.get(name, {})
        ut, ot = u.get("test_secs"), o.get("test_secs")
        if ut: total_u += ut
        if ot: total_o += ot
        note = "  [PyPy]" if "pypy" in name else ""
        rows.append([short + note, fmt_secs(ut), fmt_secs(ot), delta_fmt(ut, ot), pct(ut, ot)])
    print_table(
        ["Job", "Upstream", "Ours", "Delta", "% Change"],
        rows,
        ["TOTAL (sum of all test steps)", fmt_secs(total_u), fmt_secs(total_o), delta_fmt(total_u, total_o), pct(total_u, total_o)],
    )

    # Table 3: by platform
    print("\n\n### Table 3: Test Step Summary by Platform\n")
    group_u: dict[str, int] = {}
    group_o: dict[str, int] = {}
    for name in all_names:
        short = name.removeprefix("build (").removesuffix(")")
        g = get_group(short)
        group_u[g] = group_u.get(g, 0) + (upstream.get(name, {}).get("test_secs") or 0)
        group_o[g] = group_o.get(g, 0) + (ours.get(name, {}).get("test_secs") or 0)
    rows, gtot_u, gtot_o = [], 0, 0
    for g in [*PLATFORM_GROUPS, "Other"]:
        gu, go = group_u.get(g, 0), group_o.get(g, 0)
        gtot_u += gu
        gtot_o += go
        rows.append([g, fmt_secs(gu), fmt_secs(go), delta_fmt(gu, go), pct(gu, go)])
    print_table(
        ["Group", "Upstream (test)", "Ours (test)", "Saved", "%"],
        rows,
        ["TOTAL", fmt_secs(gtot_u), fmt_secs(gtot_o), delta_fmt(gtot_u, gtot_o), pct(gtot_u, gtot_o)],
    )


if __name__ == "__main__":
    main()
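
To make the difference between total job time (Table 1) and test-step time (Table 2) concrete, here is a small self-contained sketch. The job record is hypothetical: the field names (`name`, `startedAt`, `completedAt`, `steps`) follow what the script above reads, while the step names and timestamps are invented for illustration.

```python
from datetime import datetime

FMT = "%Y-%m-%dT%H:%M:%SZ"


def elapsed(start: str, end: str) -> int:
    """Whole seconds between two UTC timestamps in the format the script parses."""
    return int((datetime.strptime(end, FMT) - datetime.strptime(start, FMT)).total_seconds())


# Hypothetical job record: field names follow the script, values are invented.
sample_job = {
    "name": "build (example-job)",
    "startedAt": "2026-01-01T00:00:00Z",
    "completedAt": "2026-01-01T00:05:00Z",
    "steps": [
        {"name": "Set up job", "startedAt": "2026-01-01T00:00:00Z", "completedAt": "2026-01-01T00:00:40Z"},
        {"name": "Test example", "startedAt": "2026-01-01T00:00:40Z", "completedAt": "2026-01-01T00:04:40Z"},
        {"name": "Upload results", "startedAt": "2026-01-01T00:04:40Z", "completedAt": "2026-01-01T00:05:00Z"},
    ],
}

# Total job time covers every step, including setup and upload.
job_secs = elapsed(sample_job["startedAt"], sample_job["completedAt"])

# Test-step time only counts steps whose name starts with "Test ".
test_secs = sum(
    elapsed(step["startedAt"], step["completedAt"])
    for step in sample_job["steps"]
    if step["name"].startswith("Test ")
)

print(f"job: {job_secs}s, test steps: {test_secs}s, overhead: {job_secs - test_secs}s")
```

The overhead portion is unaffected by the GC change, which is one reason the per-job percentages in Table 1 and Table 2 differ.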

Labels

bot:chronographer:provided (automation) changelog entry is part of PR

4 participants