build_backgrounds_chrombpnet: --gpu N overrides outer CUDA_VISIBLE_DEVICES, breaking parallel-launch pattern #74

@lucapinello

Background

Caught during the ChromBPNet CDF rebuild on 2026-04-30 (audits/2026-04-29_chrombpnet_cdf_rebuild/report.md, F2).

scripts/build_backgrounds_chrombpnet.py:200 does:

```python
os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
```

…which overwrites any pre-set CUDA_VISIBLE_DEVICES. The parallel-launch suggestion in HANDOFF.md uses the standard pattern:

```bash
# Terminal 1: GPU 0
CUDA_VISIBLE_DEVICES=0 ... --gpu 0 ...

# Terminal 2: GPU 1
CUDA_VISIBLE_DEVICES=1 ... --gpu 0 ...  # outer var pins to GPU 1, --gpu 0 is "first visible"
```

But because the script overwrites the outer variable with the value of --gpu, the second job also lands on physical GPU 0, where it competes with the first job for memory and OOMs.

Suggested fix

Honour a pre-set CUDA_VISIBLE_DEVICES if present; only set it from --gpu N when no env var was provided:

```python
if "CUDA_VISIBLE_DEVICES" not in os.environ:
    os.environ["CUDA_VISIBLE_DEVICES"] = str(args.gpu)
```
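
For illustration, the same rule can be pulled into a small helper that is easy to unit-test. This is only a sketch: `pin_gpu` and its `environ` parameter are hypothetical names, not anything in the script today, and it assumes the call happens before any CUDA-using library initializes, since the variable is read when the CUDA runtime starts up.

```python
import os
from typing import MutableMapping, Optional


def pin_gpu(gpu: int, environ: Optional[MutableMapping[str, str]] = None) -> str:
    """Set CUDA_VISIBLE_DEVICES from --gpu only if the caller has not set it.

    Hypothetical helper: `environ` exists so tests can pass a plain dict
    instead of mutating the real process environment.
    Returns the value that ends up in effect.
    """
    env = os.environ if environ is None else environ
    # setdefault keeps an existing value and writes the fallback otherwise,
    # which matches the `if ... not in os.environ` check above.
    return env.setdefault("CUDA_VISIBLE_DEVICES", str(gpu))


if __name__ == "__main__":
    # Outer var set (the Terminal 2 case): the pre-set value wins over --gpu 0.
    outer = {"CUDA_VISIBLE_DEVICES": "1"}
    print(pin_gpu(0, outer))

    # No outer var: --gpu N decides, as before.
    bare = {}
    print(pin_gpu(0, bare))
```

One consequence of this choice: when the outer variable is set, `--gpu N` is ignored entirely, and an empty `CUDA_VISIBLE_DEVICES=""` still counts as set. If `--gpu` should instead index into the already-visible devices, that would need a different fix.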
