Add support for PCRE partial matching #3969
Closed
This is requested in https://bugs.php.net/bug.php?id=77459. Relevant PCRE documentation is https://www.pcre.org/current/doc/html/pcre2partial.html. This is useful for streaming processing of data.
The implementation uses two new modifiers: `/p` for soft partial matching and `/P` for hard partial matching. From the test output:

The main difference is that a soft partial match recognizes `dog` as a full match in the above example, while a hard partial match takes into account that a greedy match would prefer matching `dogsbody` against a longer string, so it only returns a partial match.

The return value of `preg_match()` is the same for partial matches as for full matches (1). A partial match can be detected by checking `preg_last_error() == PREG_PARTIAL_MATCH_ERROR`.

The `$matches` array contains two elements: the first is the partial match (which is always adjacent to the end of the string), while the second contains the part of the subject that might have been inspected to arrive at this partial match. In particular, this includes the maximum lookbehind:

In application this means that the next match with more data can be started from the position of `bazqu`, but the string starting from `foobarbazqu` potentially needs to be preserved for a successful match. (As the above example shows, this may be an overapproximation.)

The `$matches` output is also affected by the `PREG_OFFSET_CAPTURE` flag:

Partial matching can be used with `preg_match_all()` as well, but only if `PREG_SET_ORDER` is used. In this case, if `preg_last_error() == PREG_PARTIAL_MATCH_ERROR` after the match, then the last array in `$matches` corresponds to the trailing partial match and has the structure described above. All matches before it are full matches.

All other functions (such as `preg_replace`) do not support partial matching and will generate a warning if used in conjunction with `/p` or `/P`.
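As a rough illustration of the calling convention described above, the following sketch classifies a `preg_match()` result and separates the trailing partial match from a `PREG_SET_ORDER` result. It is not taken from the PR's tests: the helper names `match_kind` and `split_trailing_partial` are hypothetical, and the pattern `/dog(sbody)?/` is an assumption borrowed from the PCRE partial-matching documentation. Because `/p`, `/P` and `PREG_PARTIAL_MATCH_ERROR` exist only with this patch applied, the sketch guards on the constant and reports `unsupported` on a stock PHP build.

```php
<?php
// Sketch only. The /p and /P modifiers and PREG_PARTIAL_MATCH_ERROR are the
// ones proposed in this PR; stock PHP builds do not provide them.

/**
 * Classify a preg_match() result as 'full', 'partial', or 'none'.
 * Returns 'unsupported' when the running PHP lacks the proposed constant.
 */
function match_kind(string $pattern, string $subject): string
{
    if (!defined('PREG_PARTIAL_MATCH_ERROR')) {
        return 'unsupported';
    }
    if (preg_match($pattern, $subject) !== 1) {
        return 'none';
    }
    // Full and partial matches both return 1; only the last error differs.
    return preg_last_error() === constant('PREG_PARTIAL_MATCH_ERROR')
        ? 'partial'
        : 'full';
}

/**
 * Split PREG_SET_ORDER results into the full matches and the trailing
 * partial match (null if the last set was a full match).
 */
function split_trailing_partial(array $sets, bool $lastIsPartial): array
{
    $partial = ($lastIsPartial && $sets !== []) ? array_pop($sets) : null;
    return [$sets, $partial];
}

// Hypothetical pattern and subject, chosen to mirror the dog/dogsbody case.
echo match_kind('/dog(sbody)?/p', 'dog'), "\n"; // soft partial matching
echo match_kind('/dog(sbody)?/P', 'dog'), "\n"; // hard partial matching
```

With the patch applied, and assuming the pattern behaves as in the PCRE documentation, the soft call should report a full match and the hard call a partial one. After a `preg_match_all()` call with `PREG_SET_ORDER`, the second helper would be called as `split_trailing_partial($matches, preg_last_error() === PREG_PARTIAL_MATCH_ERROR)`, so that a streaming consumer can process the full matches immediately and keep the partial one for the next chunk.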