FILESCAN

The FILESCAN system service is used to parse a file specification into its component fields. As we've seen before with other system services, FILESCAN is embedded within the VMS executive, but it doesn't really belong there. Thus, in UOS, it is included in Starlet instead. This saves the overhead of a system call for a simple service that is neither security-related nor involved in resource sharing. You might ask, "but doesn't the executive also need to parse file specifications? And shouldn't we put that parsing code in a single place?" As a general rule, yes, we want to avoid duplicating code. However, in this case we make an exception and have it in two places for the sake of performance. Remember: it is relatively expensive to call across rings. And, as you will see, the source code for parsing exists in a single place; it is simply compiled into two different places: the ring 0 executive and ring 3 Starlet.

Here is the definition for SYS_FILESCAN:

FILESCAN searches a string for a file specification and parses it into its fields.

Format
SYS_FILESCAN( filespec, itemlist, fldflgs, auxout, retlen )

Arguments
filespec

String to be searched. This is the address of an SRB structure that points to the string.

itemlist

Item list specifying which components of the file specification are to be returned. This argument is the address of the first descriptor in the list. Each descriptor has the following layout:
Bytes   Description
0-3     Item Code. Indicates the file specification field to return. See table below for valid item codes.
4-7     Length. This is where the length of the field is written. If the corresponding field is missing, 0 is written here.
8-15    Address. The address of the start of the field is written here. If the corresponding field is missing, 0 is written here.

fldflgs

The address of where a bitmask is written that indicates which fields of the file specification were specified. If this value is 0, this is ignored. The fields are indicated by the following flag values:
Symbol name             Description
FSCN_V_DEVICE           Device name
FSCN_V_DIRECTORY        Directory name
FSCN_V_NAME             File name
FSCN_V_NODE             Node name
FSCN_V_NODE_ACS         Access control string of primary node
FSCN_V_NODE_PRIMARY     Primary (first) node name
FSCN_V_NODE_SECONDARY   Secondary (additional) node information
FSCN_V_ROOT             Root directory name string
FSCN_V_TYPE             File type
FSCN_V_VERSION          Version number

auxout

Auxiliary output buffer. This argument is the address of an SRB structure which indicates where the complete file specification (as provided) is written. Any secondary node information is stripped from the output and quotations are reduced and simplified.
If this value is 0, it is ignored. If provided, the values written to the item list are addresses within this auxiliary buffer.

retlen

Auxiliary output buffer length. This is the address of an 8-byte integer where the length of the auxiliary output buffer is written. If this is 0, no length is written.

Description
The FILESCAN service searches a string for a file specification and parses the fields of that specification. The length and starting addresses of the fields requested are returned. If a field was requested in the item list but not found in the file specification, a length and address of 0 are written to the descriptor. The descriptor list is terminated with a descriptor that has an item code of 0.
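
As a rough illustration of the descriptor layout and the zero-terminated list described above, here is a minimal sketch. The record follows the byte layout given for the item list; the two item code values are made up for the example and are not the real FSCN_ values.

```pascal
program Descriptor_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type TScan_Descriptor = packed record
         Code : integer ;    // Bytes 0-3: item code (0 terminates the list)
         Length : integer ;  // Bytes 4-7: field length, written by the service
         Address : int64 ;   // Bytes 8-15: field address, written by the service
     end ;

const SAMPLE_NAME_CODE = 7 ; // Hypothetical item codes, for illustration only
      SAMPLE_TYPE_CODE = 8 ;

var List : array[ 0..2 ] of TScan_Descriptor ;
    Count : integer ;

begin
    // Zero everything so that the last entry keeps a code of 0: the terminator
    fillchar( List, sizeof( List ), 0 ) ;
    List[ 0 ].Code := SAMPLE_NAME_CODE ;
    List[ 1 ].Code := SAMPLE_TYPE_CODE ;

    // Walk the list the way a consumer would: stop at the first code of 0
    Count := 0 ;
    while( List[ Count ].Code <> 0 ) do
    begin
        inc( Count ) ;
    end ;
    writeln( 'Descriptor size: ', sizeof( TScan_Descriptor ) ) ; // 16
    writeln( 'Requested items: ', Count ) ; // 2
end.
```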

The information returned describes the entire contiguous file specification. For example, to extract the file name and type together from the full string, start at the address of the file name and use the sum of the name and type lengths as the length. However, the FSCN_NODE_PRIMARY and FSCN_NODE_ACS items contain no double colon (::), so you would have to add 2 to the sum of the lengths of those two fields to obtain the length of the entire node specification.
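
The arithmetic can be tried out on an ordinary string. In this sketch the offsets and lengths are written by hand to stand in for what the descriptors would report; nothing here calls the actual service.

```pascal
program Combine_Fields_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

const Spec = 'NODE::C:\docs\report.txt;3' ;

var Name_Offset, Name_Length, Type_Length : integer ;
    Primary_Length, ACS_Length : integer ;

begin
    // Hand-written values standing in for the descriptor contents
    Name_Offset := 14 ;   // 0-based offset of "report" within Spec
    Name_Length := 6 ;    // "report"
    Type_Length := 4 ;    // ".txt"
    Primary_Length := 4 ; // "NODE"
    ACS_Length := 0 ;     // No access control string in this example

    // Name and type are contiguous, so one copy covers both
    writeln( copy( Spec, Name_Offset + 1, Name_Length + Type_Length ) ) ; // report.txt

    // The node fields exclude the double colon, so add 2 for it
    writeln( copy( Spec, 1, Primary_Length + ACS_Length + 2 ) ) ; // NODE::
end.
```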

FILESCAN does not check all aspects of validity in the specification. For instance, it does not verify that the node name specified corresponds to a valid node. Nor does it validate the access control string contents. Nor does it verify the existence of the path or specified file. It treats wildcard characters as any other valid character. It doesn't validate lengths either. Finally, multiple whitespace characters are not collapsed to a single space, nor trimmed from the beginning or end of the string. However, spaces, tabs, and delimiting characters must be enclosed in quotes if they are part of the file name or type, otherwise the character is treated as a terminator for the specification. Quotes used to indicate a node access control string require that the node name be enclosed in quotes and that the quotes delimiting the access control string must be doubled (""). For example, the node specification:
abcd"efg"
would need to be specified as:
"abcd""efg"""

FILESCAN does not assume default values for missing fields or perform logical name translations.
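
The quoting rule for access control strings described above is mechanical enough to sketch. Quote_Node is a hypothetical helper, not part of UOS: it wraps the node specification in quotes and doubles every embedded quote.

```pascal
program Quote_Node_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses SysUtils ;

// Hypothetical helper: enclose the node in quotes and double any embedded quotes
function Quote_Node( const Node : string ) : string ;

begin
    Result := '"' + StringReplace( Node, '"', '""', [ rfReplaceAll ] ) + '"' ;
end ;

begin
    writeln( Quote_Node( 'abcd"efg"' ) ) ; // "abcd""efg"""
end.
```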

Here are the item codes that can be used in the passed descriptors:
Symbol name           Description
FSCN_DEVICE           Returns the length and starting address of the device name, including the colon (:).
FSCN_DIRECTORY        Returns the length and starting address of the path, including all backslashes (\).
FSCN_FILESPEC         Returns the length and starting address of the full file specification.
FSCN_NAME             Returns the length and starting address of the file name, including no syntactical elements.
FSCN_NODE             Returns the length and starting address of the node, access control string, and double colon (::).
FSCN_NODE_ACS         Returns the length and starting address of the node access control string.
FSCN_NODE_PRIMARY     Returns the length and starting address of the primary node name. It doesn't include the double colon (::) or access control string.
FSCN_NODE_SECONDARY   Returns the length and starting address of the secondary node string.
FSCN_ROOT             Returns the length and starting address of the root directory of the path, including backslashes (\).
FSCN_TYPE             Returns the length and starting address of the file type, including the leading dot (.).
FSCN_VERSION          Returns the length and starting address of the version, including the leading semicolon (;).

function FILESCAN( var Name : string ) : TStringList ;

var Descriptors : array[ FSCN_NODE..FSCN_VERSION + 1 ] of TScan_Descriptor ;
    I : integer ;
    S : string ;
    SRB : TSRB ;

begin
    // Setup...
    Result := TStringList.Create ;
    Set_String( Name, SRB ) ;
    fillchar( Descriptors, sizeof( Descriptors ), 0 ) ;
    Result.Add( '' ) ; // Position 0 unused
    Result.Add( '' ) ; // Position 1 also unused
    for I := FSCN_NODE to FSCN_VERSION do
    begin
        Descriptors[ I ].Code := I ;
        Result.Add( '' ) ;
    end ;

    // Make the call
    SYS_FILESCAN( int64( @SRB ), int64( @Descriptors ), 0, 0, 0 ) ;

    // Parse into result stringlist...
    for I := FSCN_NODE to FSCN_VERSION do
    begin
        if( Descriptors[ I ].Address <> 0 ) then
        begin
            setlength( S, Descriptors[ I ].Length ) ;
            move( PChar( Descriptors[ I ].Address )[ 0 ], PChar( S )[ 0 ], length( S ) ) ;
            Result[ I ] := S ;
        end ; // if( Descriptors[ I ].Address <> 0 )
    end ; // for I := FSCN_NODE to FSCN_VERSION
end ;
This new function is added to the PasStarlet unit to provide a Pascal interface to the FILESCAN system service. It creates and returns a TStringList instance that contains the parsed file specification. Each offset in this result list corresponds to a FSCN_ constant. Because FSCN_NODE is 2 we add two null strings to the list first off (the list's first element is index 0).

Note: The reason that we start with FSCN_NODE is because the value of 0 is used to indicate the end of a descriptor list and 1 indicates the entire filespec. Since we are returning the individual fields, we set both of those indexes in the result list to null strings.

We fill the descriptor array with zeroes so that the last descriptor is a terminator. Then we loop through the constants for each field, adding a null to the result list as a placeholder, and then set the corresponding descriptor item code to the FSCN_ constant value. Then we call the SYS_FILESCAN system service to parse the specification.

The FSCN_ constants are arranged in the order, from left to right, that the file specification fields occur. Thus, we can iterate from FSCN_NODE to FSCN_VERSION and fill the result list indexes with the appropriate fields. For each descriptor with a non-zero address, we set S to the appropriate length and then copy that many bytes into S and set the string list element to that.

function SYS_FILESCAN( Name, Itemlist : int64 ; Fldflgs : int64 = 0 ;
    auxout : int64 = 0 ; retlen : int64 = 0 ) : int64 ;

begin
    Result := LIB_SYS_FILESCAN( Name, Itemlist, Fldflgs, auxout, retlen ) ;
end ;
As we discussed at the start of the article, UOS implements this service in Starlet. Thus, we redirect a call to SYS_FILESCAN to the Starlet routine, LIB_SYS_FILESCAN. Of course, the Starlet version can be called directly even though it doesn't exist in VMS.

type TScan_Descriptor_Array = array[ 0..65535 ] of TScan_Descriptor ;
     PScan_Descriptor_Array = ^TScan_Descriptor_Array ;

function LIB_SYS_FILESCAN( Name, Itemlist : int64 ; Fldflgs : int64 = 0 ;
    auxout : int64 = 0 ; retlen : int64 = 0 ) : int64 ;

var I, L : integer ;
    Access, Device, Nam, Node, Node2, Path, FType, Version, Root : string ;
    Access_Offset, Device_Offset, Name_Offset, Path_Offset, Type_Offset, Version_Offset : integer ;
    Descriptors : PScan_Descriptor_Array ;
    _Offset : integer ;
    Res : int64 ;
    S : string ;
    SRB : PSRB ;
This function exists in Starlet and implements the FILESCAN system service. First we define the TScan_Descriptor_Array type to make the passed descriptor list easier to access in code. It allows up to 65,536 descriptors in the list, which is far, far more than needed in this case.

begin
    // Setup...
    if( ItemList = 0 ) then
    begin
        exit ;
    end ;
    SRB := PSRB( Name ) ;
    S := Get_String( SRB^ ) ;
    Descriptors := PScan_Descriptor_Array( pointer( Itemlist ) ) ;
    _Offset := 0 ;
    while( pos( copy( S, _Offset + 1, 1 ), ' '#9 ) > 0 ) do
    begin
        inc( _Offset ) ;
    end ;
If the passed item list pointer is nil, we return immediately. Otherwise we get the file specification from the SRB pointer and point our descriptor array to the passed item list. Then we iterate through the passed string until we find a non-whitespace character. _Offset indicates the offset from the start of the string where the actual file specification begins.

    // Parse the string...
    Parse_Filename( copy( S, _Offset + 1, length( S ) ), Node, Access, Node2, Device, Path, Nam, FType, 
        Version ) ;
    if( Auxout <> 0 ) then
    begin
        SRB := PSRB( Auxout ) ;
        S := Node + Device + Path + Nam + FType + Version ;
        if( length( S ) > SRB.Length ) then
        begin
            setlength( S, SRB.Length ) ;
        end ;
        move( PChar( S )[ 0 ], PChar( SRB.Buffer )[ 0 ], length( S ) ) ;
        if( Retlen <> 0 ) then
        begin
            Res := length( S ) ;
            move( Res, Pchar( Retlen )[ 0 ], sizeof( Res ) ) ;
        end ;
    end ;
    Access_Offset := length( Node ) - 2 ;
    Device_Offset := length( Node ) + length( Access ) + length( Node2 ) ;
    Path_Offset := Device_Offset + length( Device ) ;
    Name_Offset := Path_Offset + length( Path ) ;
    Type_Offset := Name_Offset + length( Nam ) ;
    Version_Offset := Type_Offset + length( FType ) ;
Next we call Parse_Filename to do the actual parsing (covered later in this article). If the Auxout value was provided, we build the full specification from its component fields, truncate it if it is longer than the result buffer, copy it into that buffer, and then write its length to the Retlen address, if that was provided. Whether or not Auxout was provided, by the end of the above code the SRB pointer refers to the buffer whose base address we will use when writing addresses back to the descriptors. Finally, we compute the offset of each field from the start of the specification.
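
To make the offset arithmetic concrete, here is a small worked example that applies the same sums to hand-picked field values. The field contents are assumptions chosen for illustration, written the way Parse_Filename might return them for C:\docs\report.txt;3.

```pascal
program Offset_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

var Node, Access, Node2, Device, Path, Nam, FType, Version, Full : string ;
    Device_Offset, Path_Offset, Name_Offset, Type_Offset, Version_Offset : integer ;

begin
    // Sample fields (assumed values, no node in this specification)
    Node := '' ;
    Access := '' ;
    Node2 := '' ;
    Device := 'C:' ;
    Path := '\docs\' ;
    Nam := 'report' ;
    FType := '.txt' ;
    Version := ';3' ;
    Full := Node + Access + Node2 + Device + Path + Nam + FType + Version ;

    // Each offset is the previous offset plus the previous field's length
    Device_Offset := length( Node ) + length( Access ) + length( Node2 ) ;
    Path_Offset := Device_Offset + length( Device ) ;
    Name_Offset := Path_Offset + length( Path ) ;
    Type_Offset := Name_Offset + length( Nam ) ;
    Version_Offset := Type_Offset + length( FType ) ;

    writeln( Device_Offset, ' ', Path_Offset, ' ', Name_Offset, ' ',
        Type_Offset, ' ', Version_Offset ) ; // 0 2 8 14 18

    // Offsets are 0-based while Pascal strings are 1-based, hence the + 1
    writeln( copy( Full, Name_Offset + 1, length( Nam ) ) ) ; // report
    writeln( copy( Full, Version_Offset + 1, length( Version ) ) ) ; // ;3
end.
```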

    L := 2 ;
    while( L <= length( Path ) ) do
    begin
        if( Path[ L ] = '\' ) then
        begin
            Root := copy( Path, 1, L ) ;
            break ;
        end ;
        inc( L ) ;
    end ;
Next, we extract the root directory from the path. We start at the second character to skip the leading root backslash (if present), and proceed until we find a backslash or reach the end of the path. If a backslash is found, Root is set to the portion of the path up to and including it; otherwise Root remains null.
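
The same loop can be pulled out into a standalone function to see what it produces for a few paths. Extract_Root is a name invented for this sketch; the logic mirrors the loop above.

```pascal
program Root_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical standalone version of the root extraction loop
function Extract_Root( const Path : string ) : string ;

var L : integer ;

begin
    Result := '' ;
    L := 2 ; // Skip a leading backslash, if any
    while( L <= length( Path ) ) do
    begin
        if( Path[ L ] = '\' ) then
        begin
            Result := copy( Path, 1, L ) ; // Up to and including the backslash
            break ;
        end ;
        inc( L ) ;
    end ;
end ;

begin
    writeln( '[', Extract_Root( '\docs\2020\' ), ']' ) ; // [\docs\]
    writeln( '[', Extract_Root( 'docs\2020\' ), ']' ) ;  // [docs\]
    writeln( '[', Extract_Root( '\' ), ']' ) ;           // []
end.
```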

    // Return addresses...
    I := 0 ;
    while( I < 65535 ) do
    begin
        case Descriptors^[ I ].Code of
            FSCN_NODE : Set_Descriptor( I, 0, Node + Access ) ;
            FSCN_NODE_ACS : Set_Descriptor( I, Access_Offset, Access ) ;
            FSCN_NODE_PRIMARY : Set_Descriptor( I, 0, copy( Node, 1, length( Node ) - 2 ) ) ; // Excluding trailing ::
            FSCN_NODE_SECONDARY : Set_Descriptor( I, 0, '' ) ; //todo
            FSCN_DEVICE : Set_Descriptor( I, Device_Offset, Device ) ;
            FSCN_ROOT : Set_Descriptor( I, Path_Offset, Root ) ;
            FSCN_DIRECTORY : Set_Descriptor( I, Path_Offset, Path ) ;
            FSCN_NAME : Set_Descriptor( I, Name_Offset, Nam ) ;
            FSCN_TYPE : Set_Descriptor( I, Type_Offset, FType ) ;
            FSCN_VERSION : Set_Descriptor( I, Version_Offset, Version ) ;
            else break ;
        end ;
        inc( I ) ;
    end ;
Next, we iterate through the passed descriptors until we hit a terminator (any code that the case statement doesn't handle, including 0) or reach the 65,536th one. For each descriptor, we call the local Set_Descriptor procedure to write the address and length of the corresponding field.

    if( Fldflgs <> 0 ) then
    begin
        Res := 0 ;
        if( length( Node ) <> 0 ) then
        begin
            Res := FSCN_V_NODE ;
        end ;
        if( length( Access ) <> 0 ) then
        begin
            Res := Res or FSCN_V_NODE_ACS ;
        end ;
        if( length( Node ) <> 0 ) then
        begin
            Res := Res or FSCN_V_NODE_PRIMARY ;
        end ;
        // Res := Res or FSCN_V_NODE_SECONDARY ; //todo
        if( length( Device ) <> 0 ) then
        begin
            Res := Res or FSCN_V_DEVICE ;
        end ;
        if( length( Path ) <> 0 ) then
        begin
            Res := Res or FSCN_V_ROOT or FSCN_V_DIRECTORY ;
        end ;
        if( length( Nam ) <> 0 ) then
        begin
            Res := Res or FSCN_V_NAME ;
        end ;
        if( length( FType ) <> 0 ) then
        begin
            Res := Res or FSCN_V_TYPE ;
        end ;
        if( length( Version ) <> 0 ) then
        begin
            Res := Res or FSCN_V_VERSION ;
        end ;
        move( Res, PChar( Fldflgs )[ 0 ], sizeof( Res ) ) ;
    end ; // if( Fldflgs <> 0 )
end ; // LIB_SYS_FILESCAN
Finally, if Fldflgs was specified, we construct the bitmask and then write it to the specified address. A given flag is set if the corresponding field is non-null.
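
A caller would then test individual bits of the returned mask. The sketch below shows the idea with made-up bit values; the real FSCN_V_ constants are defined by UOS and may differ, and Build_Flags is a hypothetical stand-in covering only four of the fields.

```pascal
program Field_Flags_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical bit values, for illustration only
const FLAG_DEVICE = 1 ;
      FLAG_DIRECTORY = 2 ;
      FLAG_NAME = 4 ;
      FLAG_TYPE = 8 ;

// Set one bit per non-null field, as the service does for Fldflgs
function Build_Flags( const Device, Path, Nam, FType : string ) : int64 ;

begin
    Result := 0 ;
    if( length( Device ) <> 0 ) then
        Result := Result or FLAG_DEVICE ;
    if( length( Path ) <> 0 ) then
        Result := Result or FLAG_DIRECTORY ;
    if( length( Nam ) <> 0 ) then
        Result := Result or FLAG_NAME ;
    if( length( FType ) <> 0 ) then
        Result := Result or FLAG_TYPE ;
end ;

var Mask : int64 ;

begin
    Mask := Build_Flags( 'C:', '', 'report', '.txt' ) ;
    writeln( Mask ) ; // 13
    writeln( ( Mask and FLAG_DIRECTORY ) <> 0 ) ; // FALSE: no path was given
    writeln( ( Mask and FLAG_NAME ) <> 0 ) ;      // TRUE
end.
```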

    procedure Set_Descriptor( Index, Offset : integer ; const S : string ) ;

    begin
        if( ( length( S ) = 0 ) or ( Offset + _Offset >= SRB.Length ) ) then
        begin
            Descriptors^[ Index ].Address := 0 ;
            Descriptors^[ Index ].Length := 0 ;
        end else
        begin
            Descriptors^[ Index ].Address := SRB^.Buffer + Offset + _Offset ;
            Descriptors^[ Index ].Length := length( S ) ;
        end ;
    end ;
This local procedure writes the descriptor element Index. Offset is the offset from the start of the string and _Offset is any offset for leading whitespace. If the passed field value is null or is after the end of the SRB buffer, we write 0 to the address and length. Otherwise we write the length and the address, which is built from the SRB's buffer address plus the offsets.

Note: You might be wondering how the offset could ever be greater than the length of the SRB buffer, since that was passed in to us. But remember that if Auxout is specified, we write the file specification to that address and return addresses for that buffer instead of the buffer passed in to us. The problem stems from the possibility that the buffer specified for Auxout may be too small to hold the entire specification. In that case, the offset for some of the fields may be beyond what fit in the buffer. So if the offset is beyond that buffer length, we cannot write an address that is not part of the specification - and may very well be invalid memory.

procedure Parse_Filename( const S : string ;
    var Node, Access, Secondary_Node, Device, Path, Name, Extension, Version : string ) ;

var I, L, Last, P : integer ;
    In_Quotes : boolean ;

begin
    // Setup
    In_Quotes := False ;
    I := 0 ;
This procedure does the actual file specification parsing. It is placed in the UOS_Util unit, which is used by both the FIP and Starlet units. Thus, although in a compiled sense it exists in two places, there is only a single source file for it.

    // Find node name...
    Access := '' ;
    I := 1 ;
    Node := Parse_Field_Until( '::', ':' ) ;
    if( pos( '::', Node ) = 0 ) then // No node
    begin
        Node := '' ;
        I := 1 ;
    end else
    begin
        P := pos( '"', Node ) ;
        if( P > 0 ) then
        begin
            Access := copy( Node, P, length( Node ) ) ;
            setlength( Node, P - 1 ) ;
            Node := Node + '::' ;
            setlength( Access, length( Access ) - 2 ) ;
        end ;
    end ;
First, we get the Node by calling Parse_Field_Until. If there is no double-colon in the node name, then it is not a node name and we clear it and reset our string index (I) to the beginning of the string. Otherwise, we then look for any quote within the node name. If found, this indicates an access control string, so we extract that from the node name and save it in Access.

    // Device...
    L := I ;
    Device := Parse_Field_Until( ':', '\' ) ;

    // Path...
    L := I ;
    Path := Parse_Field_While( '\', ' ', True ) ;
Next we parse the device name from the specification, and then the path. You will note the two different parse functions. We will cover these later in the article, but they are largely the same. The difference is that Parse_Field_Until processes the string until the specified terminator is found, while Parse_Field_While processes the string until the last instance of the specified delimiter is found. Thus, a device name ends as soon as a colon is encountered, while the path ends with the last backslash that is found in the specification.

    // Name...
    L := I ;
    Name := Parse_Field_While( '.', ';') ;
    if( copy( Name, length( Name ), 1 ) = '.' ) then
    begin
        setlength( Name, length( Name ) - 1 ) ; // Trim dot
        dec( I ) ;
    end ;
Next we parse the file name. This ends with the last encountered dot. The Parse_Field_While function includes the dot, so if the last character of the name is a dot, we trim it and decrement the string index. We have to check to see if there is a dot at the end, because there may be no dot in the file specification at all.

    // Type...
    L := I ;
    Extension := Parse_Field_While( ';' , ' ' ) ;
    if( Last > 0 ) then // Semicolon found
    begin
        if( Valid_Version( Last, I ) ) then
        begin
            Extension := copy( S, L, Last - L ) ;
            I := Last ;
        end else
        begin
            Extension := Extension + Parse_Field_Until( ' ', ' ' ) ;
        end ;
    end ;
Now we parse the file type (extension). The parsing ends at the last semicolon. But if the characters following the semicolon are not an integer value, it isn't a version but part of the extension. This is checked with the Valid_Version function. If the version isn't valid, we add it to the extension by calling the Parse_Field_Until with a terminating delimiter of space. Since spaces already end parsing, this essentially parses until the end of the file specification, thus putting all of the remaining specification into the extension.

    // Version...
    Version := '' ;
    L := I ;
    if( I <= length( S ) ) then
    begin
        if( S[ I ] = ';' ) then // Found version
        begin
            Version := ';' ;
            inc( I ) ;
            if( copy( S, I, 1 ) = '-' ) then // copy avoids indexing past the end of S
            begin
                inc( I ) ;
                Version := Version + '-' ;
            end ;
            while( ( I <= length( S ) ) and ( pos( S[ I ], '0123456789' ) > 0 ) ) do
            begin
                Version := Version + S[ I ] ;
                inc( I ) ;
            end ;
        end ;
    end ; // if( I <= length( S ) )
end ; // Parse_Filename
Finally, we parse the version. If the next character is a semicolon, we accumulate the digits that follow it, stopping when we reach the end of the specification or a character that cannot be part of an integer. The integer can be negative, so the first character after the semicolon may be a dash (-).

    function Valid_Version( Starting, Ending : integer ) : boolean ;

    begin
        Result := False ;
        if( Ending > length( S ) ) then
        begin
            Ending := length( S ) ;
        end ;
        if( S[ Starting ] <> ';' ) then
        begin
            exit ;
        end ;
        inc( Starting ) ;
        if( copy( S, Starting, 1 ) = '-' ) then
        begin
            inc( Starting ) ;
        end ;
        while( Starting <= Ending ) do
        begin
            if( pos( copy( S, Starting, 1 ), '0123456789' ) = 0 ) then
            begin
                exit ;
            end ;
            inc( Starting ) ;
        end ;
        Result := True ;
    end ;
This local function checks whether the range of characters passed to it constitutes a valid version field: a semicolon, an optional dash, and then only digits.
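
To experiment with the check outside of Parse_Filename, it can be written as a standalone function. This variant is a sketch that takes the string as a parameter instead of reading it from the enclosing routine; otherwise the logic is the same.

```pascal
program Valid_Version_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Standalone variant: S is passed in rather than taken from the enclosing routine
function Valid_Version( const S : string ; Starting, Ending : integer ) : boolean ;

begin
    Result := False ;
    if( Ending > length( S ) ) then
    begin
        Ending := length( S ) ;
    end ;
    if( copy( S, Starting, 1 ) <> ';' ) then // Must begin with a semicolon
    begin
        exit ;
    end ;
    inc( Starting ) ;
    if( copy( S, Starting, 1 ) = '-' ) then // Optional leading dash
    begin
        inc( Starting ) ;
    end ;
    while( Starting <= Ending ) do // Everything else must be a digit
    begin
        if( pos( copy( S, Starting, 1 ), '0123456789' ) = 0 ) then
        begin
            exit ;
        end ;
        inc( Starting ) ;
    end ;
    Result := True ;
end ;

begin
    writeln( Valid_Version( 'a.txt;12', 6, 8 ) ) ; // TRUE
    writeln( Valid_Version( 'a.txt;-5', 6, 8 ) ) ; // TRUE
    writeln( Valid_Version( 'a.txt;1x', 6, 8 ) ) ; // FALSE
    writeln( Valid_Version( 'a.txt', 1, 5 ) ) ;    // FALSE: no semicolon
end.
```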

    function Parse_Field_Until( const Terminator, Next_Terminator : string ) : string ;

    begin
        // Find field...
        Result := '' ;
        while( I <= length( S ) ) do
        begin
            if( S[ I ] = '"' ) then
            begin
                if( copy( S, I + 1, 1 ) = '"' ) then // ""
                begin
                    if( In_Quotes ) then
                    begin
                        Result := Result + '"' ;
                    end ;
                    inc( I ) ;
                end else
                begin
                    In_Quotes := not In_Quotes ;
                end ;
            end else
            if( In_Quotes ) then
            begin
                Result := Result + S[ I ] ;
            end else
            if( ( S[ I ] = ' ' ) or ( S[ I ] = ',' ) or ( S[ I ] = HT ) ) then
            begin
                break ;
            end else
            if( copy( S, I, length( Terminator ) ) = Terminator ) then
            begin
                I := I + length( Terminator ) ;
                Result := Result + Terminator ;
                break ;
            end else
            if( S[ I ] = Next_Terminator ) then // Found terminator for next field, not this one
            begin
                Result := '' ;
                break ;
            end else
            begin
                Result := Result + S[ I ] ;
            end ;
            inc( I ) ;
        end ; // while( I <= length( S ) )
    end ; // .Parse_Field_Until
This local function parses through the file specification one character at a time. If a quote is encountered, we toggle the quote flag unless the next character is also a quote, in which case we treat it as a single literal quote. Otherwise, if we are inside quotes, the character is added to the result as is. Otherwise, if a space, comma, or tab is found, we end the processing. Otherwise, if we encounter the terminator, we include that in the current field and exit. Otherwise, if we encounter the next field's terminator, we exit with a null field - because if we encounter the next field's terminator without encountering this field's terminator then it means that this field was not found at all. If we make it through this if...then gauntlet, we add the character to the result and then loop to the next character.
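
The terminator logic is easier to follow with a trace. The following is a cut-down sketch of the routine, not the full version: quote handling and the whitespace and comma checks are left out, and S and I are globals here instead of belonging to an enclosing procedure.

```pascal
program Parse_Until_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

var S, Field : string ;
    I : integer ;

// Simplified: no quote handling and no whitespace/comma terminators
function Parse_Field_Until( const Terminator : string ; Next_Terminator : char ) : string ;

begin
    Result := '' ;
    while( I <= length( S ) ) do
    begin
        if( copy( S, I, length( Terminator ) ) = Terminator ) then
        begin
            I := I + length( Terminator ) ; // Step past the terminator
            Result := Result + Terminator ; // ...and keep it in the field
            break ;
        end ;
        if( S[ I ] = Next_Terminator ) then
        begin
            Result := '' ; // This field is absent
            break ;
        end ;
        Result := Result + S[ I ] ;
        inc( I ) ;
    end ;
end ;

begin
    S := 'NODE::C:\docs\report.txt' ;
    I := 1 ;
    Field := Parse_Field_Until( '::', ':' ) ;
    writeln( '[', Field, ']' ) ; // [NODE::]
    Field := Parse_Field_Until( ':', '\' ) ;
    writeln( '[', Field, ']' ) ; // [C:]

    S := 'C:\docs\report.txt' ;
    I := 1 ;
    Field := Parse_Field_Until( '::', ':' ) ;
    writeln( '[', Field, ']' ) ; // []: single colon seen first, so no node
end.
```

In the last case the caller is responsible for resetting I to the start of the string, which is what Parse_Filename does when no node is found.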

    function Parse_Field_While( Separator, Next_Separator : string ; Required : boolean = False ) : string ;

    begin
        Last := 0 ;
        Result := '' ;
        while( I <= length( S ) ) do
        begin
            if( S[ I ] = '"' ) then
            begin
                if( copy( S, I + 1, 1 ) = '"' ) then // ""
                begin
                    if( In_Quotes ) then
                    begin
                        Result := Result + '"' ;
                    end ;
                    inc( I ) ;
                end else
                begin
                    In_Quotes := not In_Quotes ;
                end ;
            end else
            if( In_Quotes ) then
            begin
                Result := Result + S[ I ] ;
            end else
            if( S[ I ] = Separator ) then
            begin
                Result := Result + Separator ;
                Last := I ;
            end else
            if( S[ I ] = Next_Separator ) then
            begin
                break ;
            end else
            if( ( S[ I ] = ' ' ) or ( S[ I ] = ',' ) or ( S[ I ] = HT ) ) then
            begin
                break ;
            end else
            begin
                Result := Result + S[ I ] ;
            end ;
            inc( I ) ;
        end ; // while( I <= length( S ) )
        if( pos( Separator, Result ) = 0 ) then // No separator
        begin
            if( Required ) then // Separator is required
            begin
                Result := '' ;
                I := L ;
            end ;
        end else
        begin
            Result := copy( S, L, Last - L + 1 ) ;
            I := Last + 1 ;
        end ;
        if( I < 1 ) then
        begin
            I := 1 ;
        end ;
    end ; // .Parse_Field_While
This local function works much like the previous Parse_Field_Until function. However, we process to the end of the specification keeping track of the position of the last encountered separator. Then we return everything up to, and including, that separator. There are two possibilities. Either the separator is required (such as with a path) or it is not. The difference is that if the separator is required but not found, then the field is considered to have been omitted and a null string is returned. If the separator is not required, then the field simply continues to the end of the specification. An example is the file name. It terminates with an extension separator (.), if found. But the lack of an extension simply means that the name continues to the end of the specification - or until the specified next separator.
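
The "last separator wins" rule and the effect of Required can be shown in isolation. Split_At_Last is a hypothetical helper written for this sketch; it ignores quoting, whitespace, and the next-separator check that the full routine also handles.

```pascal
program Last_Separator_Sketch ;

{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

// Hypothetical helper: returns the field ending at the last separator and
// places whatever follows it in Rest
function Split_At_Last( const S : string ; Separator : char ; Required : boolean ;
    var Rest : string ) : string ;

var I, Last : integer ;

begin
    Last := 0 ;
    for I := 1 to length( S ) do
    begin
        if( S[ I ] = Separator ) then
        begin
            Last := I ; // Remember the most recent separator
        end ;
    end ;
    if( Last > 0 ) then
    begin
        Result := copy( S, 1, Last ) ; // Up to and including the separator
        Rest := copy( S, Last + 1, length( S ) ) ;
    end else
    if( Required ) then // Separator required but missing: field omitted
    begin
        Result := '' ;
        Rest := S ;
    end else // Separator optional: field runs to the end
    begin
        Result := S ;
        Rest := '' ;
    end ;
end ;

var Field, Remainder : string ;

begin
    Field := Split_At_Last( '\docs\2020\report.txt', '\', True, Remainder ) ;
    writeln( '[', Field, '] [', Remainder, ']' ) ; // [\docs\2020\] [report.txt]
    Field := Split_At_Last( 'report.txt', '\', True, Remainder ) ;
    writeln( '[', Field, '] [', Remainder, ']' ) ; // [] [report.txt]
    Field := Split_At_Last( 'report', '.', False, Remainder ) ;
    writeln( '[', Field, '] [', Remainder, ']' ) ; // [report] []
end.
```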

In the next article, we will look at the SYS_PARSE service.

 

Copyright © 2020 by Alan Conroy. This article may be copied in whole or in part as long as this copyright is included.