Additional Rules for Bit String Literals

Proposal Details

  • Who Updates: DanielKho
  • Date Proposed:
  • Date Last Updated:
  • Priority:
  • Complexity:
  • Focus:

Related Issues

There's a Mentor Bugzilla case filed for this one, but I don't remember the case number. -- DanielKho - 2015-01-02

ExtendedStringLiterals may be a potential candidate that we can incorporate into the specification of bit string literals as well.

Current Situation

The LRM currently does not restrict the use of any graphic character in a bit string literal. This means all graphic characters, including the double-quote (") character, may be used directly in a bit string literal without any escape mechanism. Because a bit string literal encloses the bit value in double-quotes, the language needs to define an escape mechanism for a double-quote that appears inside the bit value.

This proposal aims to specify additional rules for proper handling of special_characters, other_special_characters, and space characters for use in bit string literals.
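As a point of reference for the current behaviour, the following is a minimal self-checking sketch, assuming a VHDL-2008 capable tool. It shows how a hexadecimal bit string literal already treats characters that are not hexadecimal digits: each one is replicated four times. The entity name bsl_expand_demo is hypothetical and chosen only for illustration.

```vhdl
-- Requires VHDL-2008 (length prefix and non-digit characters in bit strings).
library ieee;
use ieee.std_logic_1164.all;

entity bsl_expand_demo is  -- hypothetical name, for illustration only
end entity bsl_expand_demo;

architecture sim of bsl_expand_demo is
    -- 'Z' and 'X' are not hexadecimal digits, so each is replicated
    -- four times; 'c' and 'd' are digits and expand to 1100 and 1101.
    constant V : std_logic_vector(15 downto 0) := 16x"ZXcd";
begin
    check : process
    begin
        assert V = "ZZZZXXXX11001101"
            report "unexpected expansion of the bit string literal"
            severity failure;
        wait;
    end process check;
end architecture sim;
```

The same replication rule is what makes a special character such as % or # expand to four copies of itself when the target type contains those characters.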

LRM Definitions

From Section 15.8:

A bit string literal is defined as

bit_string_literal ::= [integer] base_specifier " [bit_value] "

And a bit value as

bit_value ::= graphic_character { [underline] graphic_character }

From Section 15.2:

A graphic character is further defined as follows:

graphic_character ::= 
 basic_graphic_character | lower_case_letter | other_special_character
 
 basic_graphic_character ::= 
 upper_case_letter | digit | special_character | space_character

Special characters are defined as:

" # & ' () * + , - . / : ; < = > ? @ [ ] _ ` |

Other special characters are defined as:

! $ % \ ^ { } ~ ¡ ¢ £ ¤ ¥ ¦ § ¨ © ª « ¬ ® ¯ ° ± ² ³ ´ µ ¶ · ¸ ¹ º » ¼ ½ ¾ ¿ × ÷ -

Questions

Question (DanielKho):

Is there going to be a problem with including space characters (SPACE and NBSP) in a bit string literal? For a string literal, having space characters makes sense (spaces are usually required in normal strings), but does it make sense to have spaces in bit strings?

My view is we should either ignore spaces (treat them like underlines), or issue an error.
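For comparison, this is how underlines already behave today. The sketch below is self-checking and uses only long-standing syntax; the entity name bsl_underline_demo is hypothetical.

```vhdl
entity bsl_underline_demo is  -- hypothetical name, for illustration only
end entity bsl_underline_demo;

architecture sim of bsl_underline_demo is
    -- Underlines are removed before the bit value is expanded,
    -- so both constants hold the same 32-bit value.
    constant A : bit_vector(31 downto 0) := x"12ab_34cd";
    constant B : bit_vector(31 downto 0) := x"12ab34cd";
begin
    check : process
    begin
        assert A = B
            report "underline changed the value of the bit string literal"
            severity failure;
        wait;
    end process check;
end architecture sim;
```

Under the "treat them like underlines" option, a literal written with a space in place of the underline would be expected to take this same value. Under the alternative option it would be rejected with an error.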

Question (DanielKho):

Other than the double-quote character (") and the underline character (which is already properly specified), I think there should be no problems directly including all the special_characters and other_special_characters as part of a bit string literal?

For the treatment of the double-quote, we could require that two consecutive double-quotes be written wherever a double-quote character is to be inserted as part of the bit string literal. The string literal (Section 15.7) already requires this.

Section 15.7:
"A string literal has a value that is a sequence of character values corresponding to the graphic characters of the string literal apart from the quotation mark itself. If a quotation mark value is to be represented in the sequence of character values, then a pair of adjacent quotation marks shall be written at the corresponding place within the string literal. (This means that a string literal that includes two adjacent quotation marks is never interpreted as two adjacent string literals.)"

Requirement

Propose to add to Section 15.8 a rule similar to the one in Section 15.7: a double-quote character that is to be inserted into a bit string literal shall be escaped with an additional double-quote.
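The existing Section 15.7 behaviour that this requirement mirrors can be seen in an ordinary string literal. The following is a minimal self-checking sketch; the entity name quote_demo and the constant MSG are hypothetical.

```vhdl
entity quote_demo is  -- hypothetical name, for illustration only
end entity quote_demo;

architecture sim of quote_demo is
    -- Each pair of adjacent quotation marks stands for one quotation
    -- mark in the value, so MSG holds the 8 characters: say "hi"
    constant MSG : string := "say ""hi""";
begin
    check : process
    begin
        assert MSG'length = 8
            report "unexpected string length"
            severity failure;
        report MSG;
        wait;
    end process check;
end architecture sim;
```

The proposal would apply the same doubling convention inside the quotation marks of a bit string literal.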

Propose to either ignore space characters in a bit string literal, or issue an error.

Implementation details

Code Examples

16x"%#cd";    -- equivalent to b"%%%%_####_1100_1101"
16x""#cd";    -- per existing LRM, this is legal. Proposing to make this illegal.

-- Proposing an additional double-quote for escaping purposes.
-- Strictly, each " in the comment for the following example should be written as 2 consecutive double
-- quotes "", but I've written a single one for clarity.
16x"""#cd";        -- Equivalent to the string literal """"_####_1100_1101.
32x"12ab 34cd";    -- ignore spaces, treat like underlines?

If we want to allow insertion of space characters into bit string literals, how do we write them?

32x"12 b34cd"; -- equivalent to b"0001_0010_<space><space><space><space>_1011_0011_0100_1100_1101" ?

Does this even make sense at all? -- DanielKho - 2013-11-10

Use Cases

Arguments FOR

Arguments AGAINST

General Comments

-- JimLewis - 2014-11-11 Bit string literals are targeted at types like std_logic_vector, unsigned, signed, ... Once you start using graphic characters and such, I suspect you are trying to work with regular strings. Why not introduce an extended string notation that allows some form of quoting within the string? Does Ada have something like this? If Ada doesn't, why not add a prefix that signifies C-style strings with quoting done with '\'? That way we could also get things like LF into a string. I could envision a prefix of either 'C' (for C style) or 'E' (for extended strings).

-- DanielKho - 2015-01-02 Yes, if we want to have graphic characters, then we had better use the same specification for regular strings and incorporate them into the LRM wording for bit string literals as well. The ExtendedStringLiterals can be a good candidate for bit strings too.

Supporters

-- DanielKho - 2013-11-10

Add your signature here to indicate your support for the proposal

Topic revision: r3 - 2015-01-02 - 10:24:14 - DanielKho
 