90 changes: 78 additions & 12 deletions standard/expressions.md
@@ -1394,14 +1394,14 @@

This contains the interpolated string expression segments `"val = {"`, `"}; 2 * val = "`, and `"."`, the first of which factors in the presence of the open/close brace escape sequences described in the grammar below. The quoted text also contains the interpolations `"{val,4:X}"` and `"{2 * val}"`.

Interpolated string expressions have two forms; regular (*interpolated_regular_string_expression*)
and verbatim (*interpolated_verbatim_string_expression*); which are lexically similar to, but differ semantically from, the two forms of string
literals ([§6.4.5.6](lexical-structure.md#6456-string-literals)).
An *interpolated_string_expression* has one of three forms: regular (*interpolated_regular_string_expression*), verbatim (*interpolated_verbatim_string_expression*), or raw (*interpolated_raw_string_expression*). These forms are lexically similar to, but differ semantically from, the corresponding forms of string literals ([§6.4.5.6](lexical-structure.md#6456-string-literals)).

```ANTLR
interpolated_string_expression
: interpolated_regular_string_expression
| interpolated_verbatim_string_expression
| interpolated_raw_string_expression
;

// interpolated regular string expressions
@@ -1506,16 +1506,79 @@
fragment Close_Brace_Escape_Sequence
: '}}'
;

// interpolated raw string expressions

interpolated_raw_string_expression
: single_line_interpolated_raw_string_expression
| multi_line_interpolated_raw_string_expression
;

single_line_interpolated_raw_string_expression
: Interpolated_Raw_String_Start Interpolated_Raw_String_Mid
Interpolated_Raw_String_End
;

Interpolated_Raw_String_Prefix
: '$'+
;

Interpolated_Raw_String_Start
: Interpolated_Raw_String_Prefix Raw_String_Literal_Delimiter
;

// the following two lexical rules are context sensitive, see details below

Interpolated_Raw_String_Mid
: (Raw_String_Literal_Content | raw_interpolation)+
;

Interpolated_Raw_String_End
: Raw_String_Literal_Delimiter
;

raw_interpolation
: raw_interpolation_start expression
(',' interpolation_minimum_width)? Raw_Interpolation_Format?
raw_interpolation_end
;

raw_interpolation_start
: '{'+
;

raw_interpolation_end
: '}'+
;

// the following lexical rule is context sensitive, see details below

Raw_Interpolation_Format
: ':' Interpolated_Raw_String_Character+
;

fragment Interpolated_Raw_String_Character
// Any character except " (U+0022), \\ (U+005C),
// { (U+007B), } (U+007D), and New_Line_Character.
: ~["\\{}\u000D\u000A\u0085\u2028\u2029]
;

multi_line_interpolated_raw_string_expression
: Interpolated_Raw_String_Start Whitespace* New_Line
(Interpolated_Raw_String_Mid | New_Line)* New_Line
Whitespace* Interpolated_Raw_String_End
;
```

Six of the lexical rules defined above are *context sensitive* as follows:
A number of the lexical rules defined above are *context sensitive* as follows:

| **Rule** | **Contextual Requirements** |
| :------- | :-------------------------- |
| *Interpolated_Regular_String_Mid* | Only recognised after an *Interpolated_Regular_String_Start*, between any following interpolations, and before the corresponding *Interpolated_Regular_String_End*. |
| *Regular_Interpolation_Format* | Only recognised within a *regular_interpolation* and when the starting colon (:) is not nested within any kind of bracket (parentheses/braces/square). |
| *Interpolated_Regular_String_End* | Only recognised after an *Interpolated_Regular_String_Start* and only if any intervening tokens are either *Interpolated_Regular_String_Mid*s or tokens that can be part of *regular_interpolation*s, including tokens for any *interpolated_regular_string_expression*s contained within such interpolations. |
| *Interpolated_Verbatim_String_Mid* *Verbatim_Interpolation_Format* *Interpolated_Verbatim_String_End* | Recognition of these three rules follows that of the corresponding rules above with each mentioned *regular* grammar rule replaced by the corresponding *verbatim* one. |
| *Interpolated_Verbatim_String_Mid* *Verbatim_Interpolation_Format* *Interpolated_Verbatim_String_End* | Recognition of these three rules follows that of the corresponding first three rules above with each mentioned *regular* grammar rule replaced by the corresponding *verbatim* one. |
| *Interpolated_Raw_String_Mid* *Raw_Interpolation_Format* *Interpolated_Raw_String_End* | Recognition of these three rules follows that of the corresponding first three rules above with each mentioned *regular* grammar rule replaced by the corresponding *raw* one. |

> *Note*: The above rules are context sensitive as their definitions overlap with those of
other tokens in the language. *end note*
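
As an illustration of the bracket-nesting requirement on interpolation formats, the following sketch uses a conditional expression inside a raw interpolation. The names `FormatColonDemo` and `Render` are hypothetical and exist only for this example. The colon of the conditional operator is nested within parentheses, so it is not taken as the start of a *Raw_Interpolation_Format*; the later, un-nested colon is.

```csharp
using System;

static class FormatColonDemo
{
    // The ':' of the conditional operator sits inside parentheses, so it stays
    // part of the expression. The second ':' is not nested and starts the format.
    public static string Render(bool flag) => $"""{(flag ? 1 : 2):D3}""";

    static void Main()
    {
        Console.WriteLine(Render(true));   // 001
        Console.WriteLine(Render(false));  // 002
    }
}
```

Without the parentheses, the first colon would be read as the start of the format, which is why the conditional expression is parenthesized here.
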
@@ -1551,7 +1614,7 @@

The remainder of this subclause deals with the default interpolated string handler behavior only. The declaration and use of custom interpolated string handlers is described in [§23.5.9.1](attributes.md#23591-custom-interpolated-string-expression-handlers).

The meaning of an interpolation, both *regular_interpolation* and *verbatim_interpolation*, is to format the value of the *expression* as a `string` either according to the format specified by the *Regular_Interpolation_Format* or *Verbatim_Interpolation_Format*, or according to a default format for the type of *expression*. The formatted string is then modified by the *interpolation_minimum_width*, if any, to produce the final `string` to be interpolated into the *interpolated_string_expression*.
The meaning of an interpolation (*regular_interpolation*, *verbatim_interpolation*, and *raw_interpolation*) is to format the value of the *expression* as a `string` either according to the format specified by the *Regular_Interpolation_Format*, *Verbatim_Interpolation_Format*, or *Raw_Interpolation_Format*, or according to a default format for the type of *expression*. The formatted string is then modified by the *interpolation_minimum_width*, if any, to produce the final `string` to be interpolated into the *interpolated_string_expression*.

In an *interpolation_minimum_width* the *constant_expression* shall have an implicit conversion to `int`. Let the *field width* be the absolute value of this *constant_expression* and the *alignment* be the sign (positive or negative) of the value of this *constant_expression*:

@@ -1564,17 +1627,17 @@

The format string literal is constructed as follows, where `N` is the number of interpolations in the *interpolated_string_expression*. The format string literal consists of, in order:

- The characters of the *Interpolated_Regular_String_Start* or *Interpolated_Verbatim_String_Start*
- The characters of the *Interpolated_Regular_String_Mid* or *Interpolated_Verbatim_String_Mid*, if any
- The characters of the *Interpolated_Regular_String_Start*, *Interpolated_Verbatim_String_Start*, or *Interpolated_Raw_String_Start*
- The characters of the *Interpolated_Regular_String_Mid*, *Interpolated_Verbatim_String_Mid*, or *Interpolated_Raw_String_Mid*, if any
- Then if `N ≥ 1` for each number `I` from `0` to `N-1`:
- A placeholder specification:
- A left brace (`{`) character
- The decimal representation of `I`
- Then, if the corresponding *regular_interpolation* or *verbatim_interpolation* has a *interpolation_minimum_width*, a comma (`,`) followed by the decimal representation of the value of the *constant_expression*
- The characters of the *Regular_Interpolation_Format* or *Verbatim_Interpolation_Format*, if any, of the corresponding *regular_interpolation* or *verbatim_interpolation*
- Then, if the corresponding *regular_interpolation*, *verbatim_interpolation*, or *raw_interpolation* has an *interpolation_minimum_width*, a comma (`,`) followed by the decimal representation of the value of the *constant_expression*
- The characters of the *Regular_Interpolation_Format*, *Verbatim_Interpolation_Format*, or *Raw_Interpolation_Format*, if any, of the corresponding *regular_interpolation*, *verbatim_interpolation*, or *raw_interpolation*
- A right brace (`}`) character
- The characters of the *Interpolated_Regular_String_Mid* or *Interpolated_Verbatim_String_Mid* immediately following the corresponding interpolation, if any
- Finally the characters of the *Interpolated_Regular_String_End* or *Interpolated_Verbatim_String_End*.
- The characters of the *Interpolated_Regular_String_Mid*, *Interpolated_Verbatim_String_Mid*, or *Interpolated_Raw_String_Mid* immediately following the corresponding interpolation, if any
- Finally the characters of the *Interpolated_Regular_String_End*, *Interpolated_Verbatim_String_End*, or *Interpolated_Raw_String_End*.

The subsequent arguments are the *expression*s from the interpolations, if any, in order.
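
The construction above can be sketched for a raw interpolated string as follows. The names `RawFormatDemo`, `Interpolated`, and `Formatted` are hypothetical, and the hand-written `string.Format` call is only assumed to produce the same result as the default handling; a compiler is free to generate different code.

```csharp
using System;

static class RawFormatDemo
{
    // Interpolated raw string with a minimum width (4) and a format (X).
    public static string Interpolated(int val) =>
        $"""val = {val,4:X}; 2 * val = {2 * val}.""";

    // Hand-written equivalent following the construction above:
    // placeholders {0,4:X} and {1}, then the expressions in order.
    public static string Formatted(int val) =>
        string.Format("val = {0,4:X}; 2 * val = {1}.", val, 2 * val);

    static void Main()
    {
        Console.WriteLine(Interpolated(31));                  // val =   1F; 2 * val = 62.
        Console.WriteLine(Interpolated(31) == Formatted(31)); // True
    }
}
```
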

@@ -1612,6 +1675,9 @@
| `$"{text + '?'} {number % 3}"` | `string.Format("{0} {1}", text + '?', number % 3)` | `"red? 2"` |
| `$"{text + $"[{number}]"}"` | `string.Format("{0}", text + string.Format("[{0}]", number))` | `"red[14]"` |
| `$"{(number==0?"Zero":"Non-zero")}"` | `string.Format("{0}", (number==0?"Zero":"Non-zero"))` | `"Non-zero"` |
| `$$""""{number}""""` | `string.Format("{{number}}")` | `"{number}"` |
| `$$"""{{number}}"""` | `string.Format("{0}", number)` | `"14"` |
| `$$"""""{{{number}}}"""""` | `string.Format("{{{0}}}", number)` | `"{14}"` |

*end example*
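
The raw rows of the table can be reproduced with a small program. This is a sketch only; the class and method names (`DollarCountDemo`, `One`, `TwoLiteral`, `TwoInterpolated`, `TwoWrapped`) are hypothetical. It shows how the number of `$` characters in the prefix determines how many braces delimit an interpolation.

```csharp
using System;

static class DollarCountDemo
{
    // One '$': a single brace pair delimits an interpolation.
    public static string One(int number) => $"""[{number}]""";

    // Two '$': single braces are literal text.
    public static string TwoLiteral() => $$"""{number}""";

    // Two '$': a doubled brace pair delimits an interpolation.
    public static string TwoInterpolated(int number) => $$"""{{number}}""";

    // Two '$' with three braces: the outer brace is literal text,
    // the inner two delimit the interpolation.
    public static string TwoWrapped(int number) => $$"""{{{number}}}""";

    static void Main()
    {
        Console.WriteLine(One(14));             // [14]
        Console.WriteLine(TwoLiteral());        // {number}
        Console.WriteLine(TwoInterpolated(14)); // 14
        Console.WriteLine(TwoWrapped(14));      // {14}
    }
}
```
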

@@ -2292,8 +2358,8 @@
A *null_conditional_invocation_expression* is syntactically either a *null_conditional_member_access* ([§12.8.8](expressions.md#1288-null-conditional-member-access)) or *null_conditional_element_access* ([§12.8.13](expressions.md#12813-null-conditional-element-access)) where the final *dependent_access* is an invocation expression ([§12.8.10](expressions.md#12810-invocation-expressions)).

A *null_conditional_invocation_expression* occurs within the context of a *statement_expression* ([§13.7](statements.md#137-expression-statements)), *anonymous_function_body* ([§12.22.1](expressions.md#12221-general)), or *method_body* ([§15.6.1](classes.md#1561-general)).

Unlike the syntactically equivalent *null_conditional_member_access* or *null_conditional_element_access*, a *null_conditional_invocation_expression* may be classified as nothing.

```ANTLR
null_conditional_invocation_expression
@@ -2373,7 +2439,7 @@
- The *primary_expression* has compile-time type `dynamic`.
- At least one expression of the *argument_list* has compile-time type `dynamic`.

In this case the compile-time type of the *element_access* depends on the compile-time type of its *primary_expression*: if it has an array type then the compile-time type is the element type of that array type; otherwise the compile-time type is `dynamic` and the *element_access* is classified as a value of type `dynamic`. The rules below to determine the meaning of the *element_access* are then applied at run-time, using the run-time type instead of the compile-time type of those of the *primary_expression* and *argument_list* expressions which have the compile-time type `dynamic`. If the *primary_expression* does not have compile-time type `dynamic`, then the element access undergoes a limited compile-time check as described in [§12.6.5](expressions.md#1265-compile-time-checking-of-dynamic-member-invocation).

> *Example*:
>
@@ -3518,7 +3584,7 @@
- one of the following value types: `sbyte`, `byte`, `short`, `ushort`, `int`, `uint`, `nint`, `nuint`, `long`, `ulong`, `char`, `float`, `double`, `decimal`, `bool`; or
- any enumeration type.

### 12.8.22 Stack allocation

A stack allocation expression allocates a block of memory from the execution stack. The ***execution stack*** is an area of memory where local variables are stored. The execution stack is not part of the managed heap. The memory used for local variable storage is automatically recovered when the current function returns.

@@ -3986,8 +4052,8 @@
All non-positional properties being changed shall have both set and init accessors.

This expression is evaluated as follows:

- For a record class type, the receiver's clone method ([§15.16.3](classes.md#15163-copy-and-clone-members)) is invoked, and its result is converted to the receiver’s type.

- Each `member_initializer` is processed the same way as an assignment to
a field or property access of the result of the conversion. Assignments are processed in lexical order. If *member_initializer_list* is omitted, no members are changed.

@@ -5434,7 +5500,7 @@
An ***anonymous function*** is an expression that represents an “in-line” method definition. An anonymous function does not have a value or type in and of itself, but is convertible to a compatible delegate or expression-tree type. The evaluation of an anonymous-function conversion depends on the target type of the conversion: If it is a delegate type, the conversion evaluates to a delegate value referencing the method that the anonymous function defines. If it is an expression-tree type, the conversion evaluates to an expression tree that represents the structure of the method as an object structure.

> *Note*: For historical reasons, there are two syntactic flavors of anonymous functions, namely *lambda_expression* and *anonymous_method_expression*. For almost all purposes, *lambda_expression* is more concise and expressive than *anonymous_method_expression*s, which remain in the language for backwards compatibility. *end note*

```ANTLR
lambda_expression
: attributes? anonymous_function_modifier?
119 changes: 116 additions & 3 deletions standard/lexical-structure.md
@@ -912,18 +912,21 @@

#### 6.4.5.6 String literals

C# supports two forms of string literals: ***regular string literal***s and ***verbatim string literal***s. A regular string literal consists of zero or more characters enclosed in double quotes, as in `"hello"`, and can include both simple escape sequences (such as `\t` for the tab character), and hexadecimal and Unicode escape sequences.
C# supports three forms of string literals: ***regular string literal***s, ***verbatim string literal***s, and ***raw string literal***s. A regular string literal consists of zero or more characters enclosed in double quotes, as in `"hello"`, and can include both simple escape sequences (such as `\t` for the tab character), and hexadecimal and Unicode escape sequences.

A verbatim string literal consists of an `@` character followed by a double-quote character, zero or more characters, and a closing double-quote character.

> *Example*: A simple example is `@"hello"`. *end example*

In a verbatim string literal, the characters between the delimiters are interpreted verbatim, with the only exception being a *Quote_Escape_Sequence*, which represents one double-quote character. In particular, simple escape sequences, and hexadecimal and Unicode escape sequences are not processed in verbatim string literals. A verbatim string literal may span multiple lines.

A raw string literal consists of arbitrary text, possibly including newlines, enclosed between delimiters that are each a sequence of three or more `"` characters. This form improves the readability of embedded XML, JSON, and other text that has its own visual structure. A raw string literal may span multiple lines.
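
As a brief sketch of how the three forms compare, the following program writes the same text as a regular, a verbatim, and a raw string literal. The names `ThreeFormsDemo`, `Regular`, `Verbatim`, and `Raw` are hypothetical and used only for illustration.

```csharp
using System;

static class ThreeFormsDemo
{
    // Regular: embedded quotes need a backslash escape.
    public static string Regular() => "<a href=\"x\">link</a>";

    // Verbatim: embedded quotes are doubled.
    public static string Verbatim() => @"<a href=""x"">link</a>";

    // Raw: embedded quotes are written as-is.
    public static string Raw() => """<a href="x">link</a>""";

    static void Main()
    {
        Console.WriteLine(Raw());                                       // <a href="x">link</a>
        Console.WriteLine(Regular() == Verbatim() && Verbatim() == Raw()); // True
    }
}
```
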

```ANTLR
String_Literal
: Regular_String_Literal
| Verbatim_String_Literal
| Raw_String_Literal
;

fragment Regular_String_Literal
@@ -958,8 +961,59 @@
fragment Quote_Escape_Sequence
: '""'
;

fragment Raw_String_Literal
: Single_Line_Raw_String_Literal
| Multi_Line_Raw_String_Literal
;

fragment Single_Line_Raw_String_Literal
: Raw_String_Literal_Delimiter Raw_String_Literal_Content
Raw_String_Literal_Delimiter
;

fragment Raw_String_Literal_Delimiter
: '"""' '"'*
;

fragment Raw_String_Literal_Content
// anything except New_Line
: ~( '\u000D\u000A' | '\u000D' | '\u000A' | '\u0085' | '\u2028' | '\u2029')
;

fragment Multi_Line_Raw_String_Literal
: Raw_String_Literal_Delimiter Whitespace* New_Line
(Raw_String_Literal_Content | New_Line)* New_Line
Whitespace* Raw_String_Literal_Delimiter
;
```

For brevity, a *Raw_String_Literal_Delimiter* is referred to as a “delimiter,” the start *Raw_String_Literal_Delimiter* is referred to as the “start delimiter,” and the end *Raw_String_Literal_Delimiter* is referred to as the “end delimiter.”

For any *Raw_String_Literal*:

- A delimiter shall be the longest set of contiguous `"` characters found at the start or end. The number of `"` characters in a delimiter is called the ***raw string literal delimiter length***.
> *Example*: The string `""" """` is well-formed; it has 3-character start and end delimiters, and its content is a single space. However, the string `""""""` is ill-formed, as it is seen as a 6-character start delimiter, with no content, and no end delimiter, not as 3-character start and end delimiters and empty content. *end example*
- The beginning and end delimiters shall have the same raw string literal delimiter length.
> *Example*: The string `""""X""""` is well-formed; it has 4-character start and end delimiters. However, the strings `"""X""""` and `""""X"""` are ill-formed, as the start and end delimiters in each pair do not have the same length. *end example*
- A *Raw_String_Literal_Content* shall not contain a set of contiguous `"` characters whose length is equal to or greater than the raw string literal delimiter length.
> *Example*: The strings `"""" """ """"` and `""""""" """""" """"" """" """ """""""` are well-formed. However, the strings `""" """ """` and `""" """" """` are ill-formed. *end example*
- As text sequences that have the form of *Comment*s are not processed within string literals ([§6.3.3](lexical-structure.md#633-comments)), they appear verbatim in their corresponding *Raw_String_Literal_Content*.
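
The delimiter rules above can be exercised with a few well-formed literals. This is a sketch under the rules as stated; the class and method names (`DelimiterDemo`, `SingleSpace`, `FourQuotes`, `ThreeInside`, `CommentLike`) are hypothetical.

```csharp
using System;

static class DelimiterDemo
{
    // 3-character delimiters; the content is a single space.
    public static string SingleSpace() => """ """;

    // 4-character delimiters; the content is X.
    public static string FourQuotes() => """"X"""";

    // 4-character delimiters permit a run of three quotes inside the content.
    public static string ThreeInside() => """" """ """";

    // Text that looks like comments is ordinary content.
    public static string CommentLike() => """/* not a comment */ // nor this""";

    static void Main()
    {
        Console.WriteLine(SingleSpace().Length); // 1
        Console.WriteLine(FourQuotes());         // X
        Console.WriteLine(ThreeInside().Length); // 5
        Console.WriteLine(CommentLike());        // /* not a comment */ // nor this
    }
}
```
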

For a *Single_Line_Raw_String_Literal* only:

- A *Single_Line_Raw_String_Literal* cannot be empty; it must contain at least one character.
- A *Raw_String_Literal_Content* cannot begin with `"`, as such a character is considered to belong to the preceding start delimiter. Similarly, a *Raw_String_Literal_Content* cannot end with `"`, as such a character is considered to belong to the following end delimiter.
- The value of the literal is *Raw_String_Literal_Content*, which can contain leading, embedded, and trailing horizontal whitespace (as in `"""x x x"""` and `""" xxx """`, the latter having a leading space and trailing tabs).
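
A minimal sketch of the single-line rules follows, with the hypothetical names `SingleLineDemo`, `Padded`, and `Quoted`. It shows that horizontal whitespace next to the delimiters is part of the value and that `"` characters may appear inside the content as long as they are not adjacent to a delimiter.

```csharp
using System;

static class SingleLineDemo
{
    // Two spaces on each side of x are kept in the value.
    public static string Padded() => """  x  """;

    // Embedded quotes are fine; the content neither begins nor ends with one.
    public static string Quoted() => """say "hi" now""";

    static void Main()
    {
        Console.WriteLine(Padded().Length); // 5
        Console.WriteLine(Quoted());        // say "hi" now
    }
}
```

Content that has to begin or end with `"` cannot be written in the single-line form; the multi-line form allows it.
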

For a *Multi_Line_Raw_String_Literal* only:

- If *Whitespace* precedes the end delimiter on the same line, the exact number and kind of whitespace characters (e.g., spaces vs. tabs) shall exist at the beginning of each *Raw_String_Literal_Content*, and that leading whitespace shall be discarded from those *Raw_String_Literal_Content*s.
- A *Raw_String_Literal_Content* shall not appear on the same line as a start or end delimiter.
- A *Multi_Line_Raw_String_Literal* can be empty (by having no *Raw_String_Literal_Content*s and one or more *New_Line*s).
- A *Raw_String_Literal_Content* can begin or end with `"`.
- The value of the literal is the lexical concatenation of all of its *Raw_String_Literal_Content*s and *New_Line*s after any whitespace at the beginning of each *Raw_String_Literal_Content* has been discarded based on the whitespace preceding the end delimiter. Whitespace following the start delimiter and preceding the end delimiter is not included.
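
The multi-line rules can be sketched as follows. The names `MultiLineDemo`, `Indented`, and `Quoted` are hypothetical. In both literals the end delimiter is preceded by eight spaces, so eight spaces are discarded from the start of each content line.

```csharp
using System;

static class MultiLineDemo
{
    // Eight spaces precede the end delimiter, so eight spaces are removed
    // from each content line; the extra two before "second" are kept.
    public static string Indented() => """
        first
          second
        """;

    // In the multi-line form a content line may begin and end with a quote.
    public static string Quoted() => """
        "quoted"
        """;

    static void Main()
    {
        Console.WriteLine(Indented());
        Console.WriteLine(Quoted());
    }
}
```

The value of the first literal is `first`, a line break, and then `  second`; the line breaks after the start delimiter and before the end delimiter are not part of the value. The exact line-break characters in the value follow those used in the source file.
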

> *Example*: The example
>
> <!-- Example: {template:"code-in-main-without-using", name:"StringLiterals", ignoredWarnings:["CS0219"]} -->
@@ -983,6 +1037,56 @@
> *end example*
<!-- markdownlint-disable MD028 -->

<!-- markdownlint-enable MD028 -->
> *Example*: Consider the following multi-line string literals:
>
> <!-- Example: {template:"standalone-console", name:"RawStringLiteral1", inferOutput:true, ignoredWarnings:["CS0219"]} -->
> ```csharp
> var xml1 = """
>         <element attr="content">
>             <body>
>             </body>
>         </element>
>         """;
> Console.WriteLine(xml1);
>
> var xml2 = """
>         <element attr="content">
>             <body>
>             </body>
>         </element>
>     """;
> Console.WriteLine(xml2);
>
> var xml3 = """
>         <element attr="content">
>             <body>
>             </body>
>         </element>
> """;
> Console.WriteLine(xml3);
> ```
>
> which produces the output
>
> ```console
> <element attr="content">
>     <body>
>     </body>
> </element>
>     <element attr="content">
>         <body>
>         </body>
>     </element>
>         <element attr="content">
>             <body>
>             </body>
>         </element>
> ```
>
> In the case of `xml1`, the end delimiter has 8 leading spaces, so that is the amount of leading whitespace removed from each content line. With `xml2`, 4 leading spaces are removed, and with `xml3`, no leading spaces are removed. *end example*
<!-- markdownlint-disable MD028 -->

<!-- markdownlint-enable MD028 -->
> *Note*: Any line breaks within verbatim string literals are part of the resulting string. If the exact characters used to form line breaks are semantically relevant to an application, any tools that translate line breaks in source code to different formats (between “`\n`” and “`\r\n`”, for example) will change application behavior. Developers should be careful in such situations. *end note*
<!-- markdownlint-disable MD028 -->
@@ -996,20 +1100,29 @@

> *Example*: For instance, the output produced by
>
> <!-- Example: {template:"standalone-console-without-using", name:"ObjectReferenceEquality", expectedOutput:["True"]} -->
> <!-- Example: {template:"standalone-console-without-using", name:"ObjectReferenceEquality", expectedOutput:["True","True","True","True"]} -->
> ```csharp
> class Test
> {
> static void Main()
> {
> object a = "hello";
> object b = "hello";
> object c = @"hello";
> object d = """hello""";
> object e = """
> hello
> """;
>
> System.Console.WriteLine(a == b);
> System.Console.WriteLine(a == c);
> System.Console.WriteLine(a == d);
> System.Console.WriteLine(a == e);
> }
> }
> ```
>
> is `True` because the two literals refer to the same string instance.
> is all `True` because the five literals refer to the same string instance.
>
> *end example*

@@ -1521,7 +1634,7 @@
: Decimal_Digit+ PP_Whitespace PP_Compilation_Unit_Name
| Decimal_Digit+
| DEFAULT
| 'hidden'

| PP_Start_Line_Character PP_Whitespace? '-' PP_Whitespace? PP_End_Line_Character
PP_Whitespace (PP_Character_Offset PP_Whitespace)? PP_Compilation_Unit_Name
;