28 Regular expressions library [re]

28.5 Namespace std::regex_constants [re.const]

The namespace std::regex_constants holds symbolic constants used by the regular expression library. This namespace provides three types, syntax_option_type, match_flag_type, and error_type, along with several constants of these types.

28.5.1 Bitmask type syntax_option_type [re.synopt]

namespace std::regex_constants {
  using syntax_option_type = T1;
  constexpr syntax_option_type icase = unspecified;
  constexpr syntax_option_type nosubs = unspecified;
  constexpr syntax_option_type optimize = unspecified;
  constexpr syntax_option_type collate = unspecified;
  constexpr syntax_option_type ECMAScript = unspecified;
  constexpr syntax_option_type basic = unspecified;
  constexpr syntax_option_type extended = unspecified;
  constexpr syntax_option_type awk = unspecified;
  constexpr syntax_option_type grep = unspecified;
  constexpr syntax_option_type egrep = unspecified;
  constexpr syntax_option_type multiline = unspecified;
}

The type syntax_option_type is an implementation-defined bitmask type ([bitmask.types]). Setting its elements has the effects listed in Table [tab:re:syntaxoption]. A valid value of type syntax_option_type shall have at most one of the grammar elements ECMAScript, basic, extended, awk, grep, egrep, set. If no grammar element is set, the default grammar is ECMAScript.
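
As an illustration of how these elements combine, the following sketch passes different syntax_option_type values to the std::regex constructor. It is not part of the specification. The names full_match and syntax_option_demo and the namespace alias rc are hypothetical and introduced only for this example; the checks assume a conforming implementation of <regex>.

```cpp
#include <cassert>
#include <regex>
#include <string>

namespace rc = std::regex_constants;  // alias used only in this sketch

// Hypothetical helper: true if `text` as a whole matches `pattern`
// when the regex is compiled with the given syntax options.
bool full_match(const std::string& text, const std::string& pattern,
                rc::syntax_option_type opts) {
    std::regex re(pattern, opts);
    return std::regex_match(text, re);
}

// Usage sketch (hypothetical function name).
void syntax_option_demo() {
    // No grammar element is set, so ECMAScript is used; icase ignores case.
    assert(full_match("HELLO", "hello", rc::icase));
    assert(!full_match("HELLO", "hello", rc::ECMAScript));
    // Bitmask elements combine with operator|: one grammar plus modifiers.
    assert(full_match("AAA", "a+", rc::extended | rc::icase));
    // In POSIX basic regular expressions '+' is an ordinary character.
    assert(!full_match("aaa", "a+", rc::basic));
    assert(full_match("a+", "a+", rc::basic));
}
```

Combining two grammar elements, for example basic | extended, does not produce a valid value, so the sketch always sets at most one.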

Table 130: syntax_option_type effects
Element    Effect(s) if set
icase Specifies that matching of regular expressions against a character container sequence shall be performed without regard to case.
nosubs Specifies that no sub-expressions shall be considered to be marked, so that when a regular expression is matched against a character container sequence, no sub-expression matches shall be stored in the supplied match_results structure.
optimize Specifies that the regular expression engine should pay more attention to the speed with which regular expressions are matched, and less to the speed with which regular expression objects are constructed. Otherwise it has no detectable effect on the program output.
collate Specifies that character ranges of the form "[a-b]" shall be locale sensitive.
ECMAScript Specifies that the grammar recognized by the regular expression engine shall be that used by ECMAScript in ECMA-262, as modified in [re.grammar].
basic Specifies that the grammar recognized by the regular expression engine shall be that used by basic regular expressions in POSIX, Base Definitions and Headers, Section 9, Regular Expressions.
extended Specifies that the grammar recognized by the regular expression engine shall be that used by extended regular expressions in POSIX, Base Definitions and Headers, Section 9, Regular Expressions.
awk Specifies that the grammar recognized by the regular expression engine shall be that used by the utility awk in POSIX.
grep Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep in POSIX.
egrep Specifies that the grammar recognized by the regular expression engine shall be that used by the utility grep when given the -E option in POSIX.
multiline Specifies that ^ shall match the beginning of a line and $ shall match the end of a line, if the ECMAScript engine is selected.
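
The effect of nosubs on the stored sub-expression matches can be observed with a sketch such as the one below. The helper stored_submatches and the function nosubs_demo are hypothetical names introduced for illustration; the checks assume a conforming implementation in which match_results::size() counts the whole match plus each marked sub-expression.

```cpp
#include <cassert>
#include <cstddef>
#include <regex>
#include <string>

namespace rc = std::regex_constants;  // alias used only in this sketch

// Hypothetical helper: searches `text` and returns how many entries
// (the whole match plus marked sub-expressions) were stored in the
// match_results object. Returns 0 if the pattern does not match.
std::size_t stored_submatches(const std::string& text,
                              const std::string& pattern,
                              rc::syntax_option_type opts) {
    std::regex re(pattern, opts);
    std::smatch m;
    if (!std::regex_search(text, m, re)) return 0;
    return m.size();
}

// Usage sketch (hypothetical function name).
void nosubs_demo() {
    const std::string text = "2024-05";
    const std::string pattern = "(\\d+)-(\\d+)";
    // Two marked sub-expressions plus the whole match.
    assert(stored_submatches(text, pattern, rc::ECMAScript) == 3);
    // With nosubs only the whole match is stored.
    assert(stored_submatches(text, pattern, rc::ECMAScript | rc::nosubs) == 1);
}
```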

28.5.2 Bitmask type match_flag_type [re.matchflag]

namespace std::regex_constants {
  using match_flag_type = T2;
  constexpr match_flag_type match_default = {};
  constexpr match_flag_type match_not_bol = unspecified;
  constexpr match_flag_type match_not_eol = unspecified;
  constexpr match_flag_type match_not_bow = unspecified;
  constexpr match_flag_type match_not_eow = unspecified;
  constexpr match_flag_type match_any = unspecified;
  constexpr match_flag_type match_not_null = unspecified;
  constexpr match_flag_type match_continuous = unspecified;
  constexpr match_flag_type match_prev_avail = unspecified;
  constexpr match_flag_type format_default = {};
  constexpr match_flag_type format_sed = unspecified;
  constexpr match_flag_type format_no_copy = unspecified;
  constexpr match_flag_type format_first_only = unspecified;
}

The type match_flag_type is an implementation-defined bitmask type ([bitmask.types]). The constants of that type, except for match_default and format_default, are bitmask elements. The match_default and format_default constants are empty bitmasks. Matching a regular expression against a sequence of characters [first, last) proceeds according to the rules of the grammar specified for the regular expression object, modified according to the effects listed in Table [tab:re:matchflag] for any bitmask elements set.
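
A brief sketch of how some of the matching elements modify a search is given below. It is illustrative only. The names found and match_flag_demo are hypothetical, and the checks assume a conforming implementation and the default ECMAScript grammar.

```cpp
#include <cassert>
#include <regex>
#include <string>

namespace rc = std::regex_constants;  // alias used only in this sketch

// Hypothetical helper: true if `pattern` matches some sub-sequence of
// `text` under the given match flags.
bool found(const std::string& text, const std::string& pattern,
           rc::match_flag_type flags = rc::match_default) {
    return std::regex_search(text, std::regex(pattern), flags);
}

// Usage sketch (hypothetical function name).
void match_flag_demo() {
    // match_not_bol: '^' no longer matches at the start of the sequence.
    assert(found("abc", "^abc"));
    assert(!found("abc", "^abc", rc::match_not_bol));
    // match_continuous: a match has to begin at `first`.
    assert(found("xabc", "abc"));
    assert(!found("xabc", "abc", rc::match_continuous));
    // match_not_null: empty matches are rejected.
    assert(found("abc", "x*"));
    assert(!found("abc", "x*", rc::match_not_null));
}
```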

Table 131: regex_constants::match_flag_type effects when obtaining a match against a character container sequence [first, last)
Element    Effect(s) if set
match_not_bol The first character in the sequence [first, last) shall be treated as though it is not at the beginning of a line, so the character ^ in the regular expression shall not match [first, first).
match_not_eol The last character in the sequence [first, last) shall be treated as though it is not at the end of a line, so the character "$" in the regular expression shall not match [last, last).
match_not_bow The expression "\\b" shall not match the sub-sequence [first, first).
match_not_eow The expression "\\b" shall not match the sub-sequence [last, last).
match_any If more than one match is possible then any match is an acceptable result.
match_not_null The expression shall not match an empty sequence.
match_continuous The expression shall only match a sub-sequence that begins at first.
match_prev_avail --first is a valid iterator position. When this flag is set the flags match_not_bol and match_not_bow shall be ignored by the regular expression algorithms [re.alg] and iterators [re.iter].
format_default When a regular expression match is to be replaced by a new string, the new string shall be constructed using the rules used by the ECMAScript replace function in ECMA-262, part 15.5.4.11 String.prototype.replace. In addition, during search and replace operations all non-overlapping occurrences of the regular expression shall be located and replaced, and sections of the input that did not match the expression shall be copied unchanged to the output string.
format_sed When a regular expression match is to be replaced by a new string, the new string shall be constructed using the rules used by the sed utility in POSIX.
format_no_copy During a search and replace operation, sections of the character container sequence being searched that do not match the regular expression shall not be copied to the output string.
format_first_only When specified during a search and replace operation, only the first occurrence of the regular expression shall be replaced.
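
The format elements in the table above can be compared side by side with std::regex_replace, as in the sketch below. The names substitute and format_flag_demo are hypothetical and exist only for this example; the expected strings in the checks assume a conforming implementation.

```cpp
#include <cassert>
#include <regex>
#include <string>

namespace rc = std::regex_constants;  // alias used only in this sketch

// Hypothetical helper: replaces matches of `pattern` in `text` with
// `fmt`, interpreted according to the given format flags.
std::string substitute(const std::string& text, const std::string& pattern,
                       const std::string& fmt,
                       rc::match_flag_type flags = rc::format_default) {
    return std::regex_replace(text, std::regex(pattern), fmt, flags);
}

// Usage sketch (hypothetical function name).
void format_flag_demo() {
    const std::string text = "a1b22c";
    const std::string digits = "\\d+";
    // Default: every non-overlapping match is replaced, the rest is copied.
    assert(substitute(text, digits, "#") == "a#b#c");
    // format_first_only: only the first match is replaced.
    assert(substitute(text, digits, "#", rc::format_first_only) == "a#b22c");
    // format_no_copy: non-matching sections are not copied to the output.
    assert(substitute(text, digits, "#", rc::format_no_copy) == "##");
    // ECMAScript rules use "$&" for the whole match; sed rules use "&".
    assert(substitute(text, digits, "<$&>") == "a<1>b<22>c");
    assert(substitute(text, digits, "<&>", rc::format_sed) == "a<1>b<22>c");
}
```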

28.5.3 Implementation-defined error_type [re.err]

namespace std::regex_constants {
  using error_type = T3;
  constexpr error_type error_collate = unspecified;
  constexpr error_type error_ctype = unspecified;
  constexpr error_type error_escape = unspecified;
  constexpr error_type error_backref = unspecified;
  constexpr error_type error_brack = unspecified;
  constexpr error_type error_paren = unspecified;
  constexpr error_type error_brace = unspecified;
  constexpr error_type error_badbrace = unspecified;
  constexpr error_type error_range = unspecified;
  constexpr error_type error_space = unspecified;
  constexpr error_type error_badrepeat = unspecified;
  constexpr error_type error_complexity = unspecified;
  constexpr error_type error_stack = unspecified;
}

The type error_type is an implementation-defined enumerated type ([enumerated.types]). Values of type error_type represent the error conditions described in Table [tab:re:errortype]:

Table 132: error_type values in the C locale
Value    Error condition
error_collate The expression contained an invalid collating element name.
error_ctype The expression contained an invalid character class name.
error_escape The expression contained an invalid escaped character, or a trailing escape.
error_backref The expression contained an invalid back reference.
error_brack The expression contained mismatched [ and ].
error_paren The expression contained mismatched ( and ).
error_brace The expression contained mismatched { and }.
error_badbrace The expression contained an invalid range in a {} expression.
error_range The expression contained an invalid character range, such as [b-a] in most encodings.
error_space There was insufficient memory to convert the expression into a finite state machine.
error_badrepeat One of *?+{ was not preceded by a valid regular expression.
error_complexity The complexity of an attempted match against a regular expression exceeded a pre-set level.
error_stack There was insufficient memory to determine whether the regular expression could match the specified character sequence.
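
As a sketch of how these values are typically observed, the example below catches std::regex_error and inspects its code() member. The names fails_with and error_type_demo are hypothetical. Which error_type value a particular malformed pattern produces can differ between implementations, so the checks below are assumptions about common behavior rather than guarantees.

```cpp
#include <cassert>
#include <regex>
#include <string>

namespace rc = std::regex_constants;  // alias used only in this sketch

// Hypothetical helper: compiles `pattern` with the default grammar and
// reports whether construction failed with exactly the code `expected`.
bool fails_with(const std::string& pattern, rc::error_type expected) {
    try {
        std::regex re(pattern);
    } catch (const std::regex_error& e) {
        return e.code() == expected;
    }
    return false;  // the pattern compiled without error
}

// Usage sketch (hypothetical function name).
void error_type_demo() {
    // Assumed: an unclosed group is reported as error_paren.
    assert(fails_with("(abc", rc::error_paren));
    // Assumed: a reversed character range is reported as error_range.
    assert(fails_with("[b-a]", rc::error_range));
    // A well-formed pattern does not throw at all.
    assert(!fails_with("abc", rc::error_paren));
}
```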