Two related string-list API functions, string_list_split() and string_list_split_in_place(), more or less duplicates their implementations. They both take a single string, and split the string at the delimiter and stuff the result into a string list. However, there is one subtle and unnecessary difference. The non "in-place" variant only allows a single byte value as delimiter, while the "in-place" variant can take multiple delimiters (e.g., "split at either a comma or a space"). This series first updates the string_list_split() to allow multiple delimiters like string_list_split_in_place() does, by unifying their implementations into one. This refactoring allows us to give new features to these two functions more easily. Then these functions learn to optionally - trim the split string pieces before placing them in the resulting string list. - omit empty string pieces from the resulting string list. An existing caller of string_list_split() in diff.c trims the elements in the resulting string list before it uses them, which is simplified by taking advantage of this new feature. A handful of code paths call string_list_split*(), immediately followed by string_list_remove_empty_items(). They are simplified by not placing empty items in the list in the first place. Junio C Hamano (7): string-list: report programming error with BUG string-list: align string_list_split() with its _in_place() counterpart string-list: unify string_list_split* functions string-list: optionally trim string pieces split by string_list_split*() diff: simplify parsing of diff.colormovedws string-list: optionally omit empty string pieces in string_list_split*() string-list: split-then-remove-empty can be done while splitting builtin/blame.c | 2 +- builtin/merge.c | 2 +- builtin/var.c | 2 +- connect.c | 2 +- diff.c | 20 ++---- fetch-pack.c | 2 +- notes.c | 6 +- parse-options.c | 2 +- pathspec.c | 3 +- protocol.c | 2 +- ref-filter.c | 4 +- setup.c | 3 +- string-list.c | 120 ++++++++++++++++++++++++----------- string-list.h | 29 ++++++--- t/helper/test-hashmap.c | 4 +- t/helper/test-json-writer.c | 4 +- t/helper/test-path-utils.c | 3 +- t/helper/test-ref-store.c | 2 +- t/unit-tests/u-string-list.c | 95 ++++++++++++++++++++++++--- transport.c | 2 +- upload-pack.c | 2 +- 21 files changed, 221 insertions(+), 90 deletions(-) 1: e56dc89249 ! 1: 1c2b222eec string-list: report programming error with BUG @@ Commit message string-list: report programming error with BUG Passing a string list that has .strdup_strings bit unset to - string_list_split(), orone that has .strdup_strings bit set to + string_list_split(), or one that has .strdup_strings bit set to string_list_split_in_place(), is a programmer error. Do not use die() to abort the execution. Use BUG() instead. 2: 1bd3506fad ! 2: a7e07b94ef string-list: align string_list_split() with its _in_place() counterpart @@ Commit message For some unknown reason, unlike string_list_split_in_place(), string_list_split() took only a single character as a field delimiter. Before giving both functions more features in future - commits, allow stirng_list_split() to take more than one delimiter + commits, allow string_list_split() to take more than one delimiter characters to make them closer to each other. Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> 3: 52c3b694d2 ! 3: b7a7fbb975 string-list: unify string_list_split* functions @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i -int string_list_split(struct string_list *list, const char *string, - const char *delim, int maxsplit) -+static void append_one(struct string_list *list, -+ const char *p, const char *end, -+ int in_place) ++/* ++ * append a substring [p..end] to list; return number of things it ++ * appended to the list. ++ */ ++static int append_one(struct string_list *list, ++ const char *p, const char *end, ++ int in_place) +{ + if (!end) + end = p + strlen(p); @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i + } else { + string_list_append_nodup(list, xmemdupz(p, end - p)); + } ++ return 1; +} + +/* @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i - BUG("internal error in string_list_split(): " - "list->strdup_strings must be set"); for (;;) { -+ char *end; -+ - count++; +- count++; - if (maxsplit >= 0 && count > maxsplit) { - string_list_append(list, p); - return count; @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i - p = end + 1; - } else { - string_list_append(list, p); -+ if (maxsplit >= 0 && count > maxsplit) ++ char *end; ++ ++ if (0 <= maxsplit && maxsplit <= count) + end = NULL; + else + end = strpbrk(p, delim); + -+ append_one(list, p, end, in_place); ++ count += append_one(list, p, end, in_place); + + if (!end) return count; 4: 13e3d9fbaf ! 4: c566d88c28 string-list: optionally trim string pieces split by string_list_split() @@ Metadata Author: Junio C Hamano <gitster@xxxxxxxxx> ## Commit message ## - string-list: optionally trim string pieces split by string_list_split() + string-list: optionally trim string pieces split by string_list_split*() Teach the unified split_string() to take an optional "flags" word, and define the first flag STRING_LIST_SPLIT_TRIM to cause the split @@ Commit message ## string-list.c ## @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, int i, int free_ - - static void append_one(struct string_list *list, - const char *p, const char *end, -- int in_place) -+ int in_place, unsigned flags) + */ + static int append_one(struct string_list *list, + const char *p, const char *end, +- int in_place) ++ int in_place, unsigned flags) { if (!end) end = p + strlen(p); @@ string-list.c: void unsorted_string_list_delete_item(struct string_list *list, i if (in_place) { *((char *)end) = '\0'; string_list_append(list, p); -@@ string-list.c: static void append_one(struct string_list *list, +@@ string-list.c: static int append_one(struct string_list *list, * returns "char *" pointer into that const string. Yucky but works ;-). */ static int split_string(struct string_list *list, const char *string, const char *delim, @@ string-list.c: static int split_string(struct string_list *list, const char *str + p++; + } + - count++; - if (maxsplit >= 0 && count > maxsplit) + if (0 <= maxsplit && maxsplit <= count) end = NULL; else end = strpbrk(p, delim); -- append_one(list, p, end, in_place); -+ append_one(list, p, end, in_place, flags); +- count += append_one(list, p, end, in_place); ++ count += append_one(list, p, end, in_place, flags); if (!end) return count; 5: 912c6ee193 ! 5: eb272e0f22 diff: simplify parsing of diff.colormovedws @@ Commit message The code to parse this configuration variable, whose value is a comma separated known tokens like "ignore-space-change" and - "ignore-all-space", uses string_list_split() to split the value int + "ignore-all-space", uses string_list_split() to split the value into pieces, and then places each piece of string in a strbuf to trim, before comparing the result with the list of known tokens. - Thanks to the previous steps, now string_list_split() knows to trim - the resulting pieces in the string list. Use it to simplify the - code. + Thanks to the previous steps, now string_list_split() can trim the + resulting pieces before it places them in the string list. Use it + to simplify the code. Signed-off-by: Junio C Hamano <gitster@xxxxxxxxx> -: ---------- > 6: d418078a84 string-list: optionally omit empty string pieces in string_list_split*() -: ---------- > 7: 12c1189a08 string-list: split-then-remove-empty can be done while splitting