Inefficient regex in extract_full_summary_from_signature #281

@jiasli

As pointed out by https://gist.github.com/prodigysml/d07cd482214c80bfb6d3240454d2f679, this regex (introduced by 430c39e, #8) is inefficient:

regex = r'\s*(:param)\s+(.+?)\s*:(.*)'

It tries to match a string such as:

        :param command_loader: The command loader that commands will be registered into

According to the debugger at https://regex101.com/, even a simple :param r takes 1214 steps before the match fails.

:param r causes catastrophic backtracking.

This is because \s+, .+? and \s* can all match consecutive spaces, so a run of whitespace can be divided among them in many ways, and the engine tries each division before it gives up.

A better solution is to replace .+? with \w+ for the parameter name. Since \w+ cannot match spaces, backtracking is greatly reduced:

\s*(:param)\s+(\w+)\s*:(.*)
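
The difference can be checked locally with a small script. This is only a sketch: the helper names (parse_param, failing_input, time_failure) and the space counts are made up for illustration and are not taken from the codebase. It confirms that both patterns parse the sample line the same way, then times a failed match on inputs with a growing run of spaces and no closing colon.

```python
import re
import timeit

# Pattern quoted in this issue, and the proposed replacement.
OLD = re.compile(r'\s*(:param)\s+(.+?)\s*:(.*)')
NEW = re.compile(r'\s*(:param)\s+(\w+)\s*:(.*)')


def parse_param(pattern, line):
    """Return (name, description) for a ':param' line, or None if it does not match."""
    m = pattern.match(line)
    return (m.group(2), m.group(3).strip()) if m else None


def failing_input(n_spaces):
    """Hypothetical bad input: ':param', a run of spaces, a name, and no closing colon."""
    return ':param' + ' ' * n_spaces + 'r'


def time_failure(pattern, n_spaces, repeat=3):
    """Best-of-`repeat` wall time, in seconds, for one failed match attempt."""
    line = failing_input(n_spaces)
    return min(timeit.repeat(lambda: pattern.match(line), number=1, repeat=repeat))


if __name__ == '__main__':
    good = '        :param command_loader: The command loader that commands will be registered into'
    print(parse_param(OLD, good))
    print(parse_param(NEW, good))
    # Space counts are arbitrary; absolute timings depend on the machine.
    for n in (50, 100, 200):
        print(n, time_failure(OLD, n), time_failure(NEW, n))
```

One behavioural difference to keep in mind: \w+ rejects names that contain spaces, so a line written as :param str name: (type before the name, which Sphinx-style docstrings permit) matches the old pattern but not the new one. Whether that form needs to be supported here is not covered by this issue.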
