Language Change Specification for File IO / TextIO updates

LCS Number: LCS-2016-006a
Version: 1 {15-Jan-2017}
2 {18-Jan-2017}
3 {19-Jan-2017}
4 {02-Feb-2017}
Date: 02-Feb-2017
Status: Voting
Author: Patrick Lehmann
Email: Main.PatrickLehmann
Source Doc: File IO / TextIO updates
Summary: Add new sub programs for file I/O.
See LCS-2016-006b for TextIO additions.

Voting Results:

Yes:

  1. Jakko Verhallen - 2017-01-18 - ver 1
  2. Yann Guidon - 2017-01-19 - ver 2
  3. Thomas Preusser - 2017-01-19 - ver 4
  4. Kevin Jennings - 2017-01-19 - ver 4
  5. Rob Gaddi - 2017-01-23 - ver 3
  6. Lieven Lemiengre - 2017-01-27 - ver 3
  7. Hendrik Eeckhaut - 2017-01-27 - ver 3
  8. Patrick Lehmann - 2017-02-02 - ver 4
  9. Martin Zabel - 2017-02-02 - ver 4

No:

Abstain:

  1. Farrell Ostler - 2017-01-17 - ver 1
  2. Brent Hayhoe - 2017-02-16 - ver 4 - Abstain due to lack of personal time for review.

Style Notes

Changes are shown in red font.
Deletions are crossed out.
Editing or reviewing notes in green font.

Reviewing Notes

Version 1:
This LCS adds the ability to seek in files. It adds a new read/write mode, seek and rewind procedures, and functions to query a file object's state and mode.

Version 2:
Defined seek and size operations on whole values.

Version 4:

  • Defined new operations as read or update operations on file objects.
  • Reworked former blue text: "unit of measurement".
  • Removed wording "format effector".

Details of Language Change

(006a.01) 5.5.1 General

A file type definition defines a file type. File types are used to define objects representing files in the host system environment. The value of a file
object is the sequence of values contained in the host system file.
file_type_definition ::= file of type_mark
The type mark in a file type definition defines the subtype of the values contained in the file. The type mark may denote either a fully constrained, a
partially constrained, or an unconstrained subtype. The base type of this subtype shall not be a file type, an access type, a protected type, or a formal
generic type. If the base type is a composite type, it shall not contain a subelement of an access type. If the base type is an array type, it shall be a
one-dimensional array type whose element subtype is fully constrained. If the base type is a record type, it shall be fully constrained.

A slash ('/') occurring as an element of a path string is interpreted by the implementation as signifying the path separator sign. The implementation
shall transform the slash into the implementation-defined representation of the path separator sign.
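
As a minimal sketch of the path separator rule, the following design opens a file through a path written with a slash. The entity name and the path "results/data.bin" are hypothetical, and the sketch assumes that the directory "results" already exists on the host.

```vhdl
entity path_separator_sketch is
end entity;

architecture sim of path_separator_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F          : IntegerFile;
    variable Status : FILE_OPEN_STATUS;
  begin
    -- The '/' is written the same way on every host; the implementation
    -- is expected to map it to the host's own path separator.
    FILE_OPEN(Status, F, "results/data.bin", WRITE_MODE);  -- hypothetical path
    if Status = OPEN_OK then
      WRITE(F, 42);
      FILE_CLOSE(F);
    else
      report "could not open file: " & FILE_OPEN_STATUS'image(Status)
        severity warning;
    end if;
    wait;
  end process;
end architecture;
```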

Examples:

file of STRING   -- Defines a file type that can contain
                 -- an indefinite number of strings of
                 -- arbitrary length.
file of NATURAL  -- Defines a file type that can contain
                 -- only nonnegative integer values.

(006a.02) 5.2.2.2 Predefined enumeration types

The predefined enumeration types are CHARACTER, BIT, BOOLEAN, SEVERITY_LEVEL, FILE_OPEN_KIND, FILE_OPEN_STATUS,
FILE_OPEN_STATE, and FILE_ORIGIN_KIND.

The predefined type CHARACTER is a character type whose values are the 256 characters of the ISO/IEC 8859-1 character set. Each of
the 191 graphic characters of this character set is denoted by the corresponding character literal.

The declarations of the predefined types CHARACTER, BIT, BOOLEAN, SEVERITY_LEVEL, FILE_OPEN_KIND, FILE_OPEN_STATUS,
FILE_OPEN_STATE, and FILE_ORIGIN_KIND appear in package STANDARD in Clause 16.

(006a.03) 5.5.2 File operations

The language implicitly defines the operations for objects of a file type. Given the following file type declaration:

type FT is file of TM;

where type mark TM denotes a scalar type, a record type, or a fully constrained array subtype, the following operations are implicitly declared
immediately following the file type declaration:

procedure FILE_OPEN (file F: FT; External_Name: in STRING; Open_Kind: in FILE_OPEN_KIND := READ_MODE);
procedure FILE_OPEN (Status: out FILE_OPEN_STATUS; file F: FT; External_Name: in STRING; Open_Kind: in FILE_OPEN_KIND := READ_MODE);
procedure FILE_REWIND (file F: FT);
procedure FILE_SEEK (file F: FT; Offset : INTEGER; Origin : FILE_ORIGIN_KIND := FILE_ORIGIN_BEGIN); [INTEGER64 if LCS-2016-026a is approved]
procedure FILE_TRUNCATE (file F: FT; Size : INTEGER; Origin : FILE_ORIGIN_KIND := FILE_ORIGIN_BEGIN); [INTEGER64 if LCS-2016-026a is approved]
function FILE_STATE (file F: FT) return FILE_OPEN_STATE;
function FILE_MODE (file F: FT) return FILE_OPEN_KIND;
function FILE_POSITION (file F: FT; Origin : FILE_ORIGIN_KIND := FILE_ORIGIN_BEGIN) return INTEGER; [INTEGER64 if LCS-2016-026a is approved]
function FILE_SIZE (file F: FT) return INTEGER; [INTEGER64 if LCS-2016-026a is approved]
function FILE_CANSEEK (file F: FT) return BOOLEAN;
procedure FILE_CLOSE (file F: FT);
procedure READ (file F: FT; VALUE: out TM);
procedure WRITE (file F: FT; VALUE: in TM);
procedure FLUSH (file F: FT);
function  ENDFILE (file F: FT) return BOOLEAN;

The FILE_OPEN procedures open an external file specified by the External_Name parameter and associate it with the file object F. If the call to
FILE_OPEN is successful (see the following), the file object is said to be open and the file object has an access mode dependent on the value
supplied to the Open_Kind parameter (see 16.3).

  • If the value supplied to the Open_Kind parameter is READ_MODE, the access mode of the file object is read-only. In addition,
    the file object is initialized so that a subsequent READ will return the first value in the external file. The file position is at the file
    beginning (position zero).
    Values are read from the file object in the order that they appear in the external file.
  • If the value supplied to the Open_Kind parameter is READ_WRITE_MODE, the access mode of the file object is read/write. The
    file position is at the file beginning (position zero). Values are read from the file object in the order that they appear in the
    external file. Values written to the file object are placed in the external file in the order in which they are written.
  • If the value supplied to the Open_Kind parameter is WRITE_MODE, the access mode of the file object is write-only. In addition,
    the external file is made initially empty. The file position is at the file beginning (position zero). Values written to the file object
    are placed in the external file in the order in which they are written.
  • If the value supplied to the Open_Kind parameter is APPEND_MODE, the access mode of the file object is write-only. In addition,
    the file object is initialized so that values written to it will be added to the end of the external file in the order in which they are
    written. The file position is at the file ending.
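
The new read/write mode from the list above can be sketched as follows. This is an illustration only: it assumes a tool that implements the operations proposed in this LCS, and the entity name and the file name "open_kind_sketch.dat" are hypothetical.

```vhdl
entity open_kind_sketch is
end entity;

architecture sim of open_kind_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F     : IntegerFile;
    variable V : INTEGER;
  begin
    -- Create a small file first so that the sketch is self-contained.
    FILE_OPEN(F, "open_kind_sketch.dat", WRITE_MODE);
    WRITE(F, 10);
    WRITE(F, 20);
    FILE_CLOSE(F);

    -- Reopen the same file for reading and writing. According to the
    -- list above, the file position starts at the beginning (zero).
    FILE_OPEN(F, "open_kind_sketch.dat", READ_WRITE_MODE);
    report "mode     = " & FILE_OPEN_KIND'image(FILE_MODE(F));
    report "position = " & INTEGER'image(FILE_POSITION(F));
    READ(F, V);  -- reads the first value stored in the file
    report "first value = " & INTEGER'image(V);
    FILE_CLOSE(F);
    wait;
  end process;
end architecture;
```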

In the second form of FILE_OPEN, the value returned through the Status parameter indicates the results of the procedure call:

  • A value of OPEN_OK indicates that the call to FILE_OPEN was successful. If the call to FILE_OPEN specifies an external file that
    does not exist at the beginning of the call, and if the access mode of the file object passed to the call is write-only, then the
    external file is created.
  • A value of STATUS_ERROR indicates that the file object already has an external file associated with it.
  • A value of NAME_ERROR indicates that the external file does not exist (in the case of an attempt to read from the external file) or
    the external file cannot be created (in the case of an attempt to write or append to an external file that does not exist). This value
    is also returned if the external file cannot be associated with the file object for any reason.
  • A value of MODE_ERROR indicates that the external file cannot be opened with the requested Open_Kind.

The first form of FILE_OPEN causes an error to occur if the second form of FILE_OPEN, when called under identical conditions, would return a
Status value other than OPEN_OK.

A call to FILE_OPEN of the first form is successful if and only if the call does not cause an error to occur. Similarly, a call to FILE_OPEN of the
second form is successful if and only if it returns a Status value of OPEN_OK.
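
A sketch of how the second form of FILE_OPEN might be used to handle each Status value without raising an error. The entity name and the file name "missing_input.dat" are hypothetical; the messages merely paraphrase the descriptions above.

```vhdl
entity open_status_sketch is
end entity;

architecture sim of open_status_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F          : IntegerFile;
    variable Status : FILE_OPEN_STATUS;
  begin
    FILE_OPEN(Status, F, "missing_input.dat", READ_MODE);
    case Status is
      when OPEN_OK =>
        report "file opened";
        FILE_CLOSE(F);
      when STATUS_ERROR =>
        report "file object already has an external file associated"
          severity warning;
      when NAME_ERROR =>
        report "external file does not exist or cannot be associated"
          severity warning;
      when MODE_ERROR =>
        report "external file cannot be opened with the requested kind"
          severity warning;
    end case;
    wait;
  end process;
end architecture;
```

The first form of FILE_OPEN would instead cause an error in every branch other than OPEN_OK.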

The unit of measurement for all read, write, seek, and size operations is one scalar or fully constrained value as denoted by the type mark in
the file type definition. If the type mark in a file type definition denotes an unconstrained array type, then each operation shall use the fully
constrained element type of that unconstrained one-dimensional array type as the unit of measurement. The size and data structure alignment of
the physical representation of a type mark is implementation-defined.
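
To illustrate the unit of measurement, the following sketch writes five INTEGER values and then queries the file size. It assumes a tool that implements the operations proposed in this LCS; the entity name and the file name "element_unit_sketch.dat" are hypothetical.

```vhdl
entity element_unit_sketch is
end entity;

architecture sim of element_unit_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F : IntegerFile;
  begin
    FILE_OPEN(F, "element_unit_sketch.dat", WRITE_MODE);
    for I in 1 to 5 loop
      WRITE(F, I);
    end loop;
    FILE_CLOSE(F);

    FILE_OPEN(F, "element_unit_sketch.dat", READ_MODE);
    -- Under this LCS the size is counted in INTEGER values, so the
    -- reported number should be the count of values written above,
    -- independent of how many bytes the tool uses per INTEGER.
    report "size in elements = " & INTEGER'image(FILE_SIZE(F));
    FILE_CLOSE(F);
    wait;
  end process;
end architecture;
```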

The procedure FILE_REWIND moves the file position to the beginning of the file (position zero). It is an error if the file object is not open.
It is also an error if the underlying file does not support seek operations, especially for the predefined file objects INPUT and OUTPUT.

The procedure FILE_SEEK moves the file position relative to one of three origins denoted by the parameter Origin. Negative values for
parameter Offset are allowed. The file position shall not exceed the range of a file object, which ranges from position zero (beginning) to the
position returned by FILE_SIZE minus one. It is an error if the file object is not open. It is also an error if the underlying file does not support
seek operations, especially for the predefined file objects INPUT and OUTPUT.
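
The seek and rewind procedures might be used as in the following sketch. It assumes a tool that implements this LCS and additionally assumes that each READ advances the file position by one element, which the text above implies but does not state. The entity name and the file name "seek_sketch.dat" are hypothetical.

```vhdl
entity seek_sketch is
end entity;

architecture sim of seek_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F     : IntegerFile;
    variable V : INTEGER;
  begin
    -- Write the values 10, 20, 30, 40, 50 at positions 0 to 4.
    FILE_OPEN(F, "seek_sketch.dat", WRITE_MODE);
    for I in 1 to 5 loop
      WRITE(F, I * 10);
    end loop;
    FILE_CLOSE(F);

    FILE_OPEN(F, "seek_sketch.dat", READ_MODE);

    -- Absolute seek: Origin defaults to FILE_ORIGIN_BEGIN.
    FILE_SEEK(F, 2);
    READ(F, V);  -- value at position 2
    report "after absolute seek: " & INTEGER'image(V);

    -- Relative seek: a negative Offset steps back from the current position.
    FILE_SEEK(F, -1, FILE_ORIGIN_CURRENT);
    READ(F, V);  -- should be the same value again, under the assumption above
    report "after relative seek: " & INTEGER'image(V);

    -- Rewind returns to position zero.
    FILE_REWIND(F);
    READ(F, V);  -- value at position 0
    report "after rewind: " & INTEGER'image(V);

    FILE_CLOSE(F);
    wait;
  end process;
end architecture;
```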

The procedure FILE_TRUNCATE sets the size of a file. The file size can be set as an absolute size from the beginning or as a relative size by
setting the Origin parameter. If parameter Origin is either FILE_ORIGIN_CURRENT or FILE_ORIGIN_END, then Size can also be negative to
shrink a file. It is an error if the file object is not open or the file object was not opened in write or append mode. It is also an error if the
underlying file does not support resize operations, especially for the predefined file objects INPUT and OUTPUT.
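
A minimal sketch of FILE_TRUNCATE with an absolute size, assuming a tool that implements this LCS. The entity name and the file name "truncate_sketch.dat" are hypothetical.

```vhdl
entity truncate_sketch is
end entity;

architecture sim of truncate_sketch is
  type IntegerFile is file of INTEGER;
begin
  process
    file F : IntegerFile;
  begin
    -- The file is opened in write mode, as FILE_TRUNCATE requires
    -- write or append mode.
    FILE_OPEN(F, "truncate_sketch.dat", WRITE_MODE);
    for I in 1 to 5 loop
      WRITE(F, I);
    end loop;

    -- Absolute size: Origin defaults to FILE_ORIGIN_BEGIN, so the file
    -- is cut to three elements measured from the beginning.
    FILE_TRUNCATE(F, 3);
    FILE_CLOSE(F);

    -- Reopen read-only to inspect the resulting size in elements.
    FILE_OPEN(F, "truncate_sketch.dat", READ_MODE);
    report "size after truncate = " & INTEGER'image(FILE_SIZE(F));
    FILE_CLOSE(F);
    wait;
  end process;
end architecture;
```

A relative call such as FILE_TRUNCATE(F, -2, FILE_ORIGIN_END) would, on one reading of the paragraph above, shrink the file by two elements; the text does not spell out that arithmetic, so this is an interpretation.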

The function FILE_STATE returns the current state (FILE_OPEN_STATE) of a file object: STATE_OPEN if the file is open, otherwise
STATE_CLOSED.

The function FILE_MODE returns the mode (FILE_OPEN_KIND) in which a file object was opened. It is an error to call FILE_MODE if the file is
not open.

The function FILE_POSITION returns the current file position. The return value is relative to one of three possible origins denoted by the Origin
parameter. The absolute value of the return value shall not exceed the file object's range from position 0 to the position returned by FILE_SIZE
minus one. It is an error if the underlying file does not support seek operations, especially for the predefined file objects INPUT and OUTPUT.

The function FILE_SIZE returns the current size of a file object. It is an error if the underlying file does not support seek operations,
especially for the predefined file objects INPUT and OUTPUT.

The function FILE_CANSEEK returns TRUE if the file supports seek and size operations, otherwise FALSE. It is an error if the file is not open.
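
FILE_STATE and FILE_CANSEEK allow a caller to avoid the error conditions described above. The following sketch assumes a tool that implements this LCS; the helper RewindIfPossible, the entity name, and the file name "canseek_sketch.dat" are hypothetical.

```vhdl
entity canseek_sketch is
end entity;

architecture sim of canseek_sketch is
  type IntegerFile is file of INTEGER;

  -- Hypothetical helper: rewinds F only when doing so is not an error.
  procedure RewindIfPossible (file F : IntegerFile) is
  begin
    if FILE_STATE(F) = STATE_OPEN then
      -- FILE_CANSEEK is only called once the file is known to be open.
      if FILE_CANSEEK(F) then
        FILE_REWIND(F);
      else
        report "file does not support seek operations" severity note;
      end if;
    else
      report "file is not open" severity note;
    end if;
  end procedure;
begin
  process
    file F : IntegerFile;
  begin
    RewindIfPossible(F);  -- F is not open yet, so only a note is reported
    FILE_OPEN(F, "canseek_sketch.dat", WRITE_MODE);
    WRITE(F, 1);
    RewindIfPossible(F);  -- F is open; rewinds if the host file can seek
    FILE_CLOSE(F);
    wait;
  end process;
end architecture;
```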

If a file object F is associated with an external file, procedure FILE_CLOSE terminates access to the external file associated with F and closes the
external file. If F is not associated with an external file, then FILE_CLOSE has no effect. In either case, the file object is no longer open after a
call to FILE_CLOSE that associates the file object with the formal parameter F.

[...]

(006a.04) 6.5.2 Interface object declarations

[...]

The value of an object is said to be read when one of the following conditions is satisfied:

  • When the object is evaluated, and also (indirectly) when the object is associated with an interface object of the modes in, inout, or linkage.
  • When the object is a signal and a name denoting the object appears in a sensitivity list in a wait statement or a process statement.
  • When the object is a signal and the value of any of its predefined attributes 'STABLE, 'QUIET, 'DELAYED, 'TRANSACTION, 'EVENT, 'ACTIVE,
    'LAST_EVENT, 'LAST_ACTIVE, or 'LAST_VALUE is read.
  • When one of its subelements is read.
  • When the object is a file and a READ, FILE_STATE, FILE_MODE, FILE_POSITION, FILE_SIZE or FILE_CANSEEK operation is performed on the file.
  • When the object is a file of type STD.TEXTIO.TEXT and the procedure STD.TEXTIO.READLINE is called with the given object associated with the
    formal parameter F of the given procedure.

The value of an object is said to be updated when one of the following conditions is satisfied:

  • When it is the target of an assignment, and also (indirectly) when the object is associated with an interface object of the modes out,
    buffer, inout, or linkage.
  • When a VHPI information model object representing the given object is updated using a call to the function vhpi_put_value.
  • When the object is a signal and the vhpi_schedule_transaction function is used to schedule a transaction on a driver of the signal.
  • When one of its subelements is updated.
  • When the object is a file and a WRITE, FLUSH, FILE_REWIND, FILE_SEEK, or FILE_TRUNCATE operation is performed on the file.
  • When the object is a file of type STD.TEXTIO.TEXT and the procedure STD.TEXTIO.WRITELINE is called with the given object associated with
    the formal parameter F of the given procedure.

(006a.05) 16.3 Package STANDARD

package STANDARD is
[...]
  -- The predefined types for opening files:
  type FILE_OPEN_KIND is (
    READ_MODE,       -- Resulting access mode is read-only.
    READ_WRITE_MODE, -- Resulting access mode is read/write.
    WRITE_MODE,      -- Resulting access mode is write-only.
    APPEND_MODE      -- Resulting access mode is write-only;
  );                 -- information is appended to the end of the existing file.
  -- The predefined operations for this type are as follows:
  -- function "="(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function "/="(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function "<"(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function "<="(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function ">"(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function ">="(anonymous, anonymous: FILE_OPEN_KIND) return BOOLEAN;
  -- function MINIMUM (L, R: FILE_OPEN_KIND) return FILE_OPEN_KIND;
  -- function MAXIMUM (L, R: FILE_OPEN_KIND) return FILE_OPEN_KIND;
  
  type FILE_OPEN_STATUS is (
    OPEN_OK,      -- File open was successful.
    STATUS_ERROR, -- File object was already open.
    NAME_ERROR,   -- External file not found or inaccessible.
    MODE_ERROR    -- Could not open file with requested access mode.
  );
  -- The predefined operations for this type are as follows:
  -- function "="(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function "/="(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function "<"(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function "<="(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function ">"(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function ">="(anonymous, anonymous: FILE_OPEN_STATUS) return BOOLEAN;
  -- function MINIMUM (L, R: FILE_OPEN_STATUS) return FILE_OPEN_STATUS;
  -- function MAXIMUM (L, R: FILE_OPEN_STATUS) return FILE_OPEN_STATUS;

  type FILE_OPEN_STATE is (
    STATE_OPEN,   -- File object is open.
    STATE_CLOSED  -- File object is closed.
  );
  -- The predefined operations for this type are as follows:
  -- function "=" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function "/=" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function "<" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function "<=" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function ">" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function ">=" (anonymous, anonymous: FILE_OPEN_STATE) return BOOLEAN;
  -- function MINIMUM (L, R: FILE_OPEN_STATE) return FILE_OPEN_STATE;
  -- function MAXIMUM (L, R: FILE_OPEN_STATE) return FILE_OPEN_STATE;
  
  type FILE_ORIGIN_KIND is (
    FILE_ORIGIN_BEGIN,   -- Offset is relative to the beginning of the file.
    FILE_ORIGIN_CURRENT, -- Offset is relative to the current file position.
    FILE_ORIGIN_END      -- Offset is relative to the end of the file.
  );
  -- The predefined operations for this type are as follows:
  -- function "=" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function "/=" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function "<" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function "<=" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function ">" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function ">=" (anonymous, anonymous: FILE_ORIGIN_KIND) return BOOLEAN;
  -- function MINIMUM (L, R: FILE_ORIGIN_KIND) return FILE_ORIGIN_KIND;
  -- function MAXIMUM (L, R: FILE_ORIGIN_KIND) return FILE_ORIGIN_KIND;
  [...]
end package STANDARD;

Comments

Version 2

FILE_POS or FILE_POSITION would be far more intuitive than FILE_TELL, then you don't have sentences that say "FILE_TELL returns the current file position". Just because Python uses 'Tell' (for some reason), doesn't mean that VHDL should propagate that silliness.

Should there be a function that can be used to determine "if the underlying file does not support seek operation"? That would give a more graceful way to report the problem rather than causing an error.

-- Kevin Jennings - 2017-01-19

Hmmm, actually, I looked into the C API. Does Python really use tell? I never used it, but yes tell is a strange name .... I just wanted to comply with C. We can discuss this today.

I could add a file_CanSeek function returning a boolean. Is this OK?

For the blue text: Is there a better wording?

-- Patrick Lehmann - 2017-01-19

Version 3

C also has fgetpos/fsetpos, which are intuitive descriptions but have a funky interface. In C, I'm assuming that ftell was created to give a better interface and named such because fgetpos was already taken (but why 'ftell' itself was thought to be a good name is strange). Why other languages would use 'tell' to mean 'position' is rather a mystery...but probably one that ends with "Well, that's the way C does it", which is weak since if C was so perfect there would not be a need for another language.

Personally, I think using the 'POS attribute and applying it to a file type would be another good approach to get the position, but I'm not opposed to a dedicated function.

File_CanSeek looks good.

I think I know what you're trying to get across with the blue text but I don't understand why that text should be added. In particular, phrases such as "shall move the file position in whole quantities of a value represented in a file" doesn't add anything useful that I can see because you have already defined file position to be an integer (aka 'whole quantity'?). By defining file position as an integer, that implies that there is no notion of a 'fractional' file position.

-- Kevin Jennings - 2017-01-19

@Kevin I was also thinking about a 'pos attribute, but currently, there are no attributes on file types. OTOH, I'm the person adding a lot of new attributes to VHDL, why should I just define 'pos for file types :).

Usually, people use the TEXT file type, which is based on strings and thus uses 1-byte characters, BUT you could also use e.g. an INTEGER or record type. Pascal calls this structured I/O. So no, a file position or seek operation is not based on moving the pointer in 1-byte steps. For a 32-bit integer it's moving by 4 bytes; for a record it might be 42 bytes per read operation (when a read returns one value per call). So how do I express that file positions, offsets and sizes are not values in bytes as in C, but rather moves of N bytes, where N is, in C terms, sizeof(subtype)?

The current LRM doesn't touch this undefined area and defines read and write operations based on values, but when we start to talk about seek operations, I think we should define that we don't mean the C understanding of byte offsets or sizes.

-- Patrick Lehmann - 2017-01-19

Well, if that's what you intend, then I don't think the blue text (or the rest of the LCS) is expressing that (or at least I didn't read it that way). I also don't think it's a good idea since I think all sizes and offsets should be in units of the OS provided bytes, like C does. It sounds like if you have a file that is a collection of let's say 5 integers and assuming that each integer is four bytes then the VHDL File_Size function would return 5 even though the size of the physical file on disk is 20 bytes, and one could seek from position 0 to 4 relative in the file, is that correct? If it is, then how would the File_Size or File_Seek functions know that you want the file size in units of 'integers'? What happens when a file is a collection of different sizes, such as a JPEG file?

A sizeof() function might be needed to be added.

-- Kevin Jennings - 2017-01-19

VHDL does not define any physical underlying size like Ada. But yes, that's how Ada and Pascal work. A file is an array of equal sized objects. VHDL doesn't guarantee compatibility between tools for read and write. But what you write can be read back with the same tool on the same machine. It works for strings and characters, because there is no smaller unit.

C can do the same if you read data into a void* buffer and cast it to an array of records. Now you can iterate over the whole structure by increasing the array pointer by one.

The sizeof is implicitly given in the file type declaration: type INT is file of INTEGER;

-- Patrick Lehmann - 2017-01-19

How about this for the blue text?

The units of measure of the parameter Offset in FILE_SEEK, the parameter Size in FILE_TRUNCATE, the return value from function FILE_POSITION and the return value from function FILE_SIZE are type_mark from the file_type_definition declaration.

Example:

type IntegerFile is file of INTEGER;
file F: IntegerFile open READ_MODE is "test.dat";
variable FSize: INTEGER := FILE_SIZE(F); -- FILE_SIZE returns the number of integers that can be read from file "test.dat".

-- Kevin Jennings - 2017-01-19

This is good, and I'll take it as is, but is there any reason to create an entirely new FILE_OPEN_STATE type rather than just use booleans? Is it open? Hey, look, true it is.

-- Rob Gaddi - 2017-01-23

I used a new state so I could name the function FILE_STATE, otherwise it would be FILE_ISOPEN for boolean return values. I'm open to both ideas.

-- Patrick Lehmann - 2017-01-23

How does the blue text apply to the predefined type TEXT?

-- Martin Zabel - 2017-01-30

What about section 6.5.2? Should SEEK, ... be considered as updating a file object like WRITE and FLUSH?

-- Martin Zabel - 2017-01-30

Version 4

As a note that probably doesn't matter practically, adding an element to FILE_OPEN_KIND means that code currently using a case statement on one will now break. Likewise, by adding READ_WRITE_MODE in the middle, you also change 'pos/'value and ordinals. That said, I'd be rendered speechless if any of those cases has occurred even 3 times worldwide over the course of history.

-- Rob Gaddi - 2017-02-09

Topic revision: r35 - 2017-07-23 - 22:27:45 - RobGaddi
 
Copyright © 2008-2021 by the contributing authors. All material on this collaboration platform is the property of the contributing authors.