~sircmpwn/hare-dev


[PATCH hare] regex: Handle escaped characters in brackets

Message-ID: <20230314165615.18179-1-adnan@maolood.com>

Signed-off-by: Adnan Maolood <adnan@maolood.com>
---
 regex/+test.ha |  3 +++
 regex/regex.ha | 20 +++++++++++++-------
 2 files changed, 16 insertions(+), 7 deletions(-)

diff --git a/regex/+test.ha b/regex/+test.ha
index 235d5591..8ef3c04d 100644
--- a/regex/+test.ha
+++ b/regex/+test.ha
@@ -333,6 +333,9 @@ fn run_findall_case(
 		(`^a\\b$`, "a\\b", matchres::MATCH, 0, -1),
 		(`^x(abc)\{,2\}$`, "xabc{,2}", matchres::MATCH, 0, -1),
 		(`^x(abc)\{,2\}$`, "xabcabc{,2}", matchres::NOMATCH, 0, -1),
+		(`^[\\]+$`, "\\", matchres::MATCH, 0, -1),
+		(`^[\]]+$`, "]", matchres::MATCH, 0, -1),
+		(`^[A-Za-z\[\]]+$`, "foo[bar]baz", matchres::MATCH, 0, -1),
 		// {m,n}
 		(`^x(abc){2}$`, "xabcabc", matchres::MATCH, 0, -1),
 		(`^x(abc){3}$`, "xabcabc", matchres::NOMATCH, 0, -1),
diff --git a/regex/regex.ha b/regex/regex.ha
index 28b7176a..684700a6 100644
--- a/regex/regex.ha
+++ b/regex/regex.ha
@@ -146,7 +146,17 @@ fn handle_bracket(
 	const range_end = peek2;
 	const is_first_char = *bracket_idx == 0 || *bracket_idx == 1
 		&& !*is_charset_positive;
-	if (r == ']' && !is_first_char) {
+
+	if (r == '\\') {
+		if (peek1 is void) {
+			return `Trailing backslash '\'`: error;
+		} else {
+			append(charsets[len(charsets) - 1],
+				peek1: charset_lit_item);
+			strings::next(iter);
+			*r_idx += 1;
+		};
+	} else if (r == ']' && !is_first_char) {
 		const newinst = inst_charset {
 			idx = len(charsets) - 1,
 			is_positive = *is_charset_positive,
@@ -212,7 +222,7 @@ export fn compile(expr: str) (regex | error) = {
 		const next = strings::next(&iter);
 
 		if (r_idx == 0 && next is rune && next: rune != '^') {
-				append(insts, void: inst_skip);
+			append(insts, void: inst_skip);
 		};
 
 		if (in_bracket) {
@@ -256,11 +266,7 @@ export fn compile(expr: str) (regex | error) = {
 		case '[' =>
 			in_bracket = true;
 		case ']' =>
-			if (in_bracket) {
-				in_bracket = false;
-			} else {
-				append(insts, r: inst_lit);
-			};
+			append(insts, r: inst_lit);
 		case '(' =>
 			if (n_groupstarts > 0) {
 				return `Nested capture groups are unsupported`: error;

base-commit: 5af9e06de4d00f9a619670ee2f4e6a62bb434066
-- 
2.40.0

[hare/patches] build success

From: builds.sr.ht <builds@sr.ht>
Message-ID: <CR69KPPAQTTR.3GQEY2PLI2BS1@cirno2>
In-Reply-To: <20230314165615.18179-1-adnan@maolood.com>

hare/patches: SUCCESS in 1m43s

[regex: Handle escaped characters in brackets][0] from [Adnan Maolood][1]

[0]: https://lists.sr.ht/~sircmpwn/hare-dev/patches/39715
[1]: adnan@maolood.com

✓ #957103 SUCCESS hare/patches/alpine.yml  https://builds.sr.ht/~sircmpwn/job/957103
✓ #957104 SUCCESS hare/patches/freebsd.yml https://builds.sr.ht/~sircmpwn/job/957104

Message-ID: <CS0M1LB8GHWX.1Y8K7XM88BPQ9@taiga>
In-Reply-To: <20230314165615.18179-1-adnan@maolood.com>

Thanks!

To git@git.sr.ht:~sircmpwn/hare
   b0806f69..f22695bc  master -> master

Message-ID: <CS0NXAE6HJ93.1COPRIETN1T4K@maolood.com>
In-Reply-To: <CS0M1LB8GHWX.1Y8K7XM88BPQ9@taiga>

I realize now that this is not supported by POSIX ERE. But it is
supported by other tools like awk. You may want to revert this.

Message-ID: <CS0NXVV4ZTWB.2D6XMGZSCI2LC@taiga>
In-Reply-To: <CS0NXAE6HJ93.1COPRIETN1T4K@maolood.com>

Ugh. This should be specified by POSIX. Keeping it on the basis that its
absence is a POSIX bug IMHO.
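
For readers who want to see the behavior being kept, the three cases this patch adds to regex/+test.ha can be exercised through the public regex module. This is only a rough sketch: it assumes a Hare standard library that includes this patch and that regex::compile, regex::test, and regex::finish are available as used here. The cases table and the main wrapper are hypothetical scaffolding, not part of the patch.

```hare
use regex;

export fn main() void = {
	// Hypothetical table mirroring the cases added to regex/+test.ha:
	// (expression, input that the patch expects to match).
	const cases: [_](str, str) = [
		(`^[\\]+$`, "\\"),
		(`^[\]]+$`, "]"),
		(`^[A-Za-z\[\]]+$`, "foo[bar]baz"),
	];

	for (let i = 0z; i < len(cases); i += 1) {
		// Assumes the expression compiles; "!" aborts otherwise.
		const re = regex::compile(cases[i].0)!;
		defer regex::finish(&re);

		// With backslash escapes honored inside brackets, each
		// input should match its expression.
		assert(regex::test(&re, cases[i].1));
	};
};
```

Under a strict POSIX ERE reading the backslash would instead be an ordinary member of the bracket expression, so the second expression would not be expected to match a lone "]".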