
Reduce memory usage when searching for commits and issues #159

Open

marcuscaisey wants to merge 1 commit into gharlan:main from marcuscaisey:search-memory

Conversation

marcuscaisey (Contributor) commented May 8, 2026

Problem

When searching for commits or issues in a large repository, the gh script filter can exhaust all of its available memory and crash.

For example, the query gh neovim/neovim *72cf89bce8 (this specific commit was chosen because it's the repository's initial one) produces the following output in the debugger:

[14:30:21.357] ERROR: GitHub[Script Filter] Code 255: loading content for https://api.github.com/repos/gharlan/alfred-github-workflow/releases/latest
loading content for https://api.github.com/user?per_page=100
loading content for https://api.github.com/repos/neovim/neovim/commits?per_page=100PHP Fatal error:  Allowed memory size of 134217728 bytes exhausted (tried to allocate 405504 bytes) in /Users/marcus/scratch/alfred-github-workflow/workflow.php on line 240
PHP Stack trace:
PHP   1. {main}() /Users/marcus/Library/Caches/com.runningwithcrayons.Alfred/Workflow Scripts/A405D906-43C3-468F-BF09-54D4D9D31796:0
PHP   2. Search::run($scope = 'github', $query = ' neovim/neovim *72cf89bce8', $hotkey = '0') /Users/marcus/Library/Caches/com.runningwithcrayons.Alfred/Workflow Scripts/A405D906-43C3-468F-BF09-54D4D9D31796:5
PHP   3. Search::addRepoSubCommands() /Users/marcus/scratch/alfred-github-workflow/search.php:80
PHP   4. Workflow::requestApi($url = '/repos/neovim/neovim/commits', $curl = *uninitialized*, $callback = *uninitialized*, $firstPageOnly = *uninitialized*, $maxAge = *uninitialized*) /Users/marcus/scratch/alfred-github-workflow/search.php:355
PHP   5. Workflow::requestCache($url = 'https://api.github.com/repos/neovim/neovim/commits?per_page=100', $curl = NULL, $callback = NULL, $firstPageOnly = FALSE, $maxAge = 10, $refreshInBackground = *uninitialized*) /Users/marcus/scratch/alfred-github-workflow/workflow.php:303
PHP   6. Curl->execute() /Users/marcus/scratch/alfred-github-workflow/workflow.php:293
PHP   7. Workflow::{closure:/Users/marcus/scratch/alfred-github-workflow/workflow.php:256-258}($response = class CurlResponse { public $request = class CurlRequest { public $url = 'https://api.github.com/repositories/16408992/commits?per_page=100&page=93'; public $etag = NULL; public $token = '****************************************'; public $callback = class Closure { ... } }; public $status = 200; public $contentType = 'application/json; charset=utf-8'; public $etag = 'W/"f85dd32fb287dfdb16b50304ee27c01f83ea42bd3c12f86c08b154d73f44b1a3"'; public $link = '<https://api.github.com/repositories/16408992/commits?per_page=100&page=94>; rel="next", <https://api.github.com/repositories/16408992/commits?per_page=100&page=365>; rel="last", <https://api.github.com/repositories/16408992/commits?per_page=100&page=1>; rel="first", <https://api.github.com/repositories/16408992/commits?per_page=100&page=92>; rel="prev"'; public $content = ... }) /Users/marcus/scratch/alfred-github-workflow/curl.php:66
PHP   8. Workflow::{closure:/Users/marcus/scratch/alfred-github-workflow/workflow.php:224-285}($response = class CurlResponse { public $request = class CurlRequest { public $url = 'https://api.github.com/repositories/16408992/commits?per_page=100&page=93'; public $etag = NULL; public $token = '****************************************'; public $callback = class Closure { ... } }; public $status = 200; public $contentType = 'application/json; charset=utf-8'; public $etag = 'W/"f85dd32fb287dfdb16b50304ee27c01f83ea42bd3c12f86c08b154d73f44b1a3"'; public $link = '<https://api.github.com/repositories/16408992/commits?per_page=100&page=94>; rel="next", <https://api.github.com/repositories/16408992/commits?per_page=100&page=365>; rel="last", <https://api.github.com/repositories/16408992/commits?per_page=100&page=1>; rel="first", <https://api.github.com/repositories/16408992/commits?per_page=100&page=92>; rel="prev"'; public $content = ... }, $content = NULL, $parent = 'https://api.github.com/repositories/16408992/commits?per_page=100&page=92') /Users/marcus/scratch/alfred-github-workflow/workflow.php:257
PHP   9. json_encode($value = ...) /Users/marcus/scratch/alfred-github-workflow/workflow.php:240

I wrote a small script which performs the above query:

<?php
require 'search.php';
Search::run('github', ' neovim/neovim *72cf89bce8', getenv('hotkey'));
echo Workflow::getItemsAsXml();

and profiled it using https://github.com/arnaud-lb/php-memory-profiler:

MEMPROF_PROFILE=dump_on_limit php test.php

The resulting profile surfaced json_decode as the biggest offender:
[memprof profile screenshot]

The issue is that in Workflow::requestCache, we store all of the responses from the API in an array:

$responses[] = $response->content;

So for a large repository with lots of commits, the size of this array can outgrow the default memory limit of 128MB.
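
For a sense of scale, here's a rough back-of-envelope from the trace above (an estimate, not a measurement): the fatal error fired while page 93 was being fetched, and the Link header puts the last page at 365.

<?php
// Memory ran out at PHP's default 128 MB limit with ~92 decoded pages
// retained, so each page costs very roughly:
$limit = 128 * 1024 * 1024;
printf("~%.1f MB per page\n", $limit / 93 / (1024 * 1024)); // ~1.4 MB
// At that rate all 365 pages would need around 500 MB, which lines up with
// the ~529 MB peak measured below once the limit is raised to 1 GB.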

Solution

  • Add an optional $transformItem parameter to Workflow::requestCache and Workflow::requestApi. When provided, $transformItem is called to transform each item returned from the API into another form.
  • Construct the Item object for each listed commit and issue using $transformItem. This lets us throw away the large response objects that we were previously storing in $responses and keep only the data that we need (see the sketch after this list).
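
To illustrate, here's a minimal sketch of the idea. Only the $transformItem parameter name comes from this PR; the function and data shapes below are simplified stand-ins for the workflow's actual request code, not its real internals.

<?php
// Map each decoded item through $transformItem as soon as its page is
// decoded, so only the small transformed values accumulate in $results;
// reassigning $items on the next iteration frees the previous decoded page.
function fetchAllPages(array $pagesOfJson, ?callable $transformItem = null): array
{
    $results = [];
    foreach ($pagesOfJson as $json) {
        $items = json_decode($json);
        foreach ($items as $item) {
            $results[] = $transformItem !== null ? $transformItem($item) : $item;
        }
    }
    return $results;
}

// Hypothetical caller: keep only a short sha and the first line of each
// commit message instead of the whole decoded commit object.
$pages = [json_encode([
    ['sha' => 'abcdef1234567890abcdef1234567890abcdef12', // fake sha
     'commit' => ['message' => "Fix the thing\n\nLonger body..."]],
])];
$commits = fetchAllPages($pages, function ($commit) {
    return [
        'sha'   => substr($commit->sha, 0, 10),
        'title' => strtok($commit->commit->message, "\n"),
    ];
});
print_r($commits); // [['sha' => 'abcdef1234', 'title' => 'Fix the thing']]

The real change applies this same pattern inside Workflow::requestCache, constructing an Item per commit or issue instead of keeping the raw responses around.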

After these changes, the gh script filter no longer crashes on the input gh neovim/neovim *72cf89bce8.

Effectiveness

To understand the effectiveness of this solution, I've written a slightly modified version of the above test script which sets the memory limit to 1 GB so that it won't crash:

<?php
ini_set('memory_limit', '1G');
require 'search.php';
Search::run('github', ' neovim/neovim *72cf89bce8', getenv('hotkey'));

and measured its memory usage (the -l flag makes macOS's /usr/bin/time report resource usage, including the maximum resident set size) using

/usr/bin/time -l php test.php

Before

        1.41 real         1.14 user         0.25 sys
           529268736  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
               33559  page reclaims
                  91  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
                 493  involuntary context switches
         15237059400  instructions retired
          4447079530  cycles elapsed
           511689544  peak memory footprint

After

        1.23 real         1.04 user         0.18 sys
            70352896  maximum resident set size
                   0  average shared memory size
                   0  average unshared data size
                   0  average unshared stack size
                6509  page reclaims
                 188  page faults
                   0  swaps
                   0  block input operations
                   0  block output operations
                   0  messages sent
                   0  messages received
                   0  signals received
                   0  voluntary context switches
                  87  involuntary context switches
         15096525999  instructions retired
          3906278294  cycles elapsed
            52544016  peak memory footprint

Conclusion

So the maximum resident set size has decreased from 529 MB to 70 MB, a reduction of 87%.
