On Tue, 05 Aug 2025 16:46:10 -0600,
Jonathan Corbet <corbet@xxxxxxx> wrote:

> Mauro Carvalho Chehab <mchehab+huawei@xxxxxxxxxx> writes:
>
> > Perhaps one alternative would do something like:
> >
> >     tuples = struct_members.findall(members)
> >     if not tuples:
> >         break
> >
> >     maintype, -, -, content, -, s_ids = tuples
> >
> > (assuming that we don't need t[1], t[2] and t[4] here)
> >
> > Btw, on this specific case, better to use non-capture group matches
> > to avoid those "empty" spaces, e.g. (if I got it right):
>
> The problem is this line here:
>
>     oldmember = "".join(t) # Reconstruct the original formatting
>
> The regex *has* to capture the entire match string so that it can be
> reconstructed back to its original form, which we need to edit the full
> list of members later on.
>
> This code could use a deep rethink, but it works for now :)

Well, we can still do:

    for t in tuples:
        maintype, _, _, content, _, s_ids = t
        oldmember = "".join(t)

This way, we name the relevant fields and still reconstruct the
original form.

IMO, this is a lot better than using t[0], t[3], t[5] in the code, as
the names make it clear what each group actually captured.

-

Btw, re.findall() doesn't return match objects, which is inconsistent
with the rest of the re API. Looking at the docs today(*), there is an
alternative: re.finditer(). We could add it to the KernRe class and use
it in a way that gives us a Match instance. Something like:

    # Original regex expression
    res = Re.finditer(...)

    # Not much difference here. Probably not worth using it
    for match in res:
        oldmember = "".join(match.groups())
        maintype, _, _, content, _, s_ids = match.groups()

Or alternatively:

    res = Re.finditer(...)

    # Not much difference here. Probably not worth using it
    for match in res:
        oldmember = "".join(match.groups())

        # replace in the code below (this needs named groups such as
        # (?P<maintype>...) in the regex):
        #     maintype -> match.group('maintype')
        #     content  -> match.group('content')
        #     s_ids    -> match.group('s_ids')

No idea about performance differences between findall and finditer.

(*) https://docs.python.org/3/library/re.html

Btw, one possibility that would avoid having tuples and t is to do
something like:

    struct_members = KernRe("(" +               # group 1: the entire pattern
                            type_pattern +      # Capture main type
                            r'(?:[^\{\};]+)' +
                            r'(?:\{)' +
                            r'([^\{\}]*)' +     # Capture content
                            r'(?:\})' +
                            r'([^\{\};]*)' +    # Capture IDs
                            r'(?:;)' +
                            ")")

    res = struct_members.finditer(line)
    for match in res:
        oldmember, maintype, content, s_ids = match.groups()

(this assumes type_pattern contributes exactly one capture group)

(disclaimer notice: none of the above was tested)

Thanks,
Mauro
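For illustration only, here is a rough, self-contained sketch of that last variant. It uses plain re.compile() instead of KernRe, a simplified stand-in for type_pattern, and a made-up members string; all three are assumptions and not the actual kernel-doc code:

```python
import re

# Assumption: simplified stand-in for the real type_pattern (one capture group).
type_pattern = r'(struct|union)'

struct_members = re.compile(
    "(" +                 # group 1: the entire member, as in the proposal
    type_pattern +        # group 2: main type
    r'(?:[^\{\};]+)' +    # text before the opening brace (not captured)
    r'(?:\{)' +
    r'([^\{\}]*)' +       # group 3: content between the braces
    r'(?:\})' +
    r'([^\{\};]*)' +      # group 4: IDs
    r'(?:;)' +
    ")"
)

# Hypothetical input, just to exercise the pattern.
members = "int a; struct { int x; int y; } pos, size; union { char c; } u;"

found = []
for match in struct_members.finditer(members):
    oldmember, maintype, content, s_ids = match.groups()
    # group(0) is always the full match, so it equals the outer group here.
    assert oldmember == match.group(0)
    found.append((maintype, content.strip(), s_ids.strip()))

print(found)
```

Since match.group(0) always holds the full matched text, the outer "(" ... ")" group could probably be dropped when iterating with finditer(), leaving only the three named pieces to unpack.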