Search Postgresql Archives

Re: Latest patches break one of our unit-test, related to RLS

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

 



On Fri, 2025-09-12 at 10:07 -0400, Tom Lane wrote:
> Dominique Devienne <ddevienne@xxxxxxxxx> writes:
> > > This DOES look like a bug, no? I've done regexes for a long time,
> > > and these two forms should be equivalent IMHO. --DD
> 
> Yeah, I agree it's busted.  You can use EXPLAIN VERBOSE to see the
> translated-to-POSIX pattern, and it's wrong:
> 
> regression=# explain verbose with t(v) as (values ('foo:bar'), ('foo/bar'), ('foo0bar'))                                         
> select v from t where v similar to 'foo[\d\w]_%';
>                           QUERY PLAN                          
> --------------------------------------------------------------
>  Values Scan on "*VALUES*"  (cost=0.00..0.05 rows=1 width=32)
>    Output: "*VALUES*".column1
>    Filter: ("*VALUES*".column1 ~ '^(?:foo[\d\w]_%)$'::text)
> (3 rows)
> 
> The _ and % are not getting converted to their POSIX equivalents
> ("." and ".*").

Indeed, and I have to take the blame for introducing a bug in a minor
release :^(

The attached patch should fix the problem.

Yours,
Laurenz Albe
From 8d0f8ebac6c42fe7da36ec8c30ee091d20270068 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@xxxxxxxxxxx>
Date: Sat, 13 Sep 2025 01:32:55 +0200
Subject: [PATCH v1] Amend recent fix for SIMILAR TO regex conversion

Commit e3ffc3e91d fixed the translation of character classes in
SIMILAR TO regular expressions.  Unfortunately the fix broke a corner
case: if there is an escape character right after the opening bracket,
(for example in "[\q]") a closing bracket right after the escape
sequence would not be seen as closing the character class.

Author: Laurenz Albe <laurenz.albe@xxxxxxxxxxx>
Reported-By: Dominique Devienne <ddevienne@xxxxxxxxx>
Reported-By: Stephan Springl <springl-psql@xxxxxxxxxxxxx>
Backpatch-through: 13
---
 src/backend/utils/adt/regexp.c        | 6 ++++++
 src/test/regress/expected/strings.out | 9 +++++++++
 src/test/regress/sql/strings.sql      | 3 +++
 3 files changed, 18 insertions(+)

diff --git a/src/backend/utils/adt/regexp.c b/src/backend/utils/adt/regexp.c
index 6e2864cbbda..29692fb1a9d 100644
--- a/src/backend/utils/adt/regexp.c
+++ b/src/backend/utils/adt/regexp.c
@@ -948,6 +948,12 @@ similar_escape_internal(text *pat_text, text *esc_text)
 				 */
 				*r++ = '\\';
 				*r++ = pchar;
+				/*
+				 * If we encounter an escaped character in a character class,
+				 * we must be past an initial ^ or ].
+				 */
+				if (charclass_depth > 0)
+					charclass_start = 3;
 			}
 			afterescape = false;
 		}
diff --git a/src/test/regress/expected/strings.out b/src/test/regress/expected/strings.out
index ba302da51e7..2d6cb02ad60 100644
--- a/src/test/regress/expected/strings.out
+++ b/src/test/regress/expected/strings.out
@@ -693,6 +693,15 @@ EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[^^]^';
    Filter: (f1 ~ '^(?:[^^]\^)$'::text)
 (2 rows)
 
+-- Closing square bracket after an escape sequence at the beginning of
+-- a character closes the character class
+EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[|a]%' ESCAPE '|';
+              QUERY PLAN               
+---------------------------------------
+ Seq Scan on text_tbl
+   Filter: (f1 ~ '^(?:[\a].*)$'::text)
+(2 rows)
+
 -- Test backslash escapes in regexp_replace's replacement string
 SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
  regexp_replace 
diff --git a/src/test/regress/sql/strings.sql b/src/test/regress/sql/strings.sql
index b94004cc08c..5ed421d6205 100644
--- a/src/test/regress/sql/strings.sql
+++ b/src/test/regress/sql/strings.sql
@@ -218,6 +218,9 @@ EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[]%][^]%][^%]%';
 -- Closing square bracket effective after two carets at the beginning
 -- of character class.
 EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[^^]^';
+-- Closing square bracket after an escape sequence at the beginning of
+-- a character closes the character class
+EXPLAIN (COSTS OFF) SELECT * FROM TEXT_TBL WHERE f1 SIMILAR TO '[|a]%' ESCAPE '|';
 
 -- Test backslash escapes in regexp_replace's replacement string
 SELECT regexp_replace('1112223333', E'(\\d{3})(\\d{3})(\\d{4})', E'(\\1) \\2-\\3');
-- 
2.51.0


[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]
[Index of Archives]     [Postgresql Jobs]     [Postgresql Admin]     [Postgresql Performance]     [Linux Clusters]     [PHP Home]     [PHP on Windows]     [Kernel Newbies]     [PHP Classes]     [PHP Databases]     [Postgresql & PHP]     [Yosemite]

  Powered by Linux