1 Introduction
2 Ground Rules

Building a File System
3 File Systems
4 File Content Data Structure
5 Allocation Cluster Manager
6 Exceptions and Emancipation
7 Base Classes, Testing, and More
8 File Meta Data
9 Native File Class
10 Our File System
11 Allocation Table
12 File System Support Code
13 Initializing the File System
14 Contiguous Files
15 Rebuilding the File System
16 Native File System Support Methods
17 Lookups, Wildcards, and Unicode, Oh My
18 Finishing the File System Class

The Init Program
19 Hardware Abstraction and UOS Architecture
20 Init Command Mode
21 Using Our File System
22 Hardware and Device Lists
23 Fun with Stores: Partitions
24 Fun with Stores: RAID
25 Fun with Stores: RAM Disks
26 Init wrap-up

The Executive
27 Overview of The Executive
28 Starting the Kernel
29 The Kernel
30 Making a Store Bootable
31 The MMC
32 The HMC
33 Loading the components
34 Using the File Processor
35 Symbols and the SSC
36 The File Processor and Device Management
37 The File Processor and File System Management
38 Finishing Executive Startup

Users and Security
39 Introduction to Users and Security
40 More Fun With Stores: File Heaps
41 File Heaps, part 2
42 SysUAF
43 TUser
44 SysUAF API

Terminal I/O
45 Shells and UCL
46 UOS API, the Application Side
47 UOS API, the Executive Side
48 I/O Devices
49 Streams
50 Terminal Output Filters
51 The TTerminal Class
52 Handles
53 Putting it All Together
54 Getting Terminal Input
55 QIO
56 Cooking Terminal Input
57 Putting it all together, part 2
58 Quotas and I/O

UCL
59 UCL Basics
60 Symbol Substitution
61 UCL Command execution
62 UCL Command execution, part 2
63 UCL Command Abbreviation
64 ASTs
65 UCL Expressions, Part 1
66 UCL Expressions, Part 2: Support code
67 UCL Expressions, part 3: Parsing
68 SYS_GETJPIW and SYS_TRNLNM
69 UCL Expressions, part 4: Evaluation

UCL Lexical Functions
70 PROCESS_SCAN
71 PROCESS_SCAN, Part 2
72 TProcess updates
73 Unicode revisted
74 Lexical functions: F$CONTEXT
75 Lexical functions: F$PID
76 Lexical Functions: F$CUNITS
77 Lexical Functions: F$CVSI and F$CVUI
78 UOS Date and Time Formatting
79 Lexical Functions: F$CVTIME
80 LIB_CVTIME
81 Date/Time Contexts
82 SYS_GETTIM, LIB_Get_Timestamp, SYS_ASCTIM, and LIB_SYS_ASCTIM
83 Lexical Functions: F$DELTA_TIME
84 Lexical functions: F$DEVICE
85 SYS_DEVICE_SCAN
86 Lexical functions: F$DIRECTORY
87 Lexical functions: F$EDIT and F$ELEMENT
88 Lexical functions: F$ENVIRONMENT
89 SYS_GETUAI
90 Lexical functions: F$EXTRACT and F$IDENTIFIER
91 LIB_FAO and LIB_FAOL
92 LIB_FAO and LIB_FAOL, part 2
93 Lexical functions: F$FAO
94 File Processing Structures
95 Lexical functions: F$FILE_ATTRIBUTES
96 SYS_DISPLAY
97 UCL Lexical functions: F$GETDVI
98 Parse_GetDVI

Glossary/Index


Download sources
Download binaries

UCL Expressions, part 2: Support code

In the previous article, we documented the behavior and syntax of UCL expressions. In this article, we will look at the support code and definitions used by the expression parsing and evaluation.

First let us examine the Get_Token function. We've used this function previously to get the next token from the input stream. It is a wrapper for the parser class, of which UCL uses an instance. We won't cover the TParser class here (or any time soon), but the Get_Token function is important to the operation of UCL expression parsing.

function Get_Token : string ;

var S : string ;

begin
    Result := Parser.Token ;
    if( copy( Result, 1, 1 ) = '!' ) then // Comment
    begin
        while( copy( Result, 1, 1 ) = '!' ) do 
        begin
            Parser.Get_Token_Line ;
            Result := Parser.Token ;
        end ;
    end ;
    if( copy( Result, 1, 1 ) = '%' ) then // Possible hexadecimal value
    begin
        S := Parser.Token ;
        if( lowercase( copy( S, 1, 1 ) ) = 'x' ) then
        begin
            Result := Result + S ;
        end else
        begin
            Parser.Put_Token( S ) ;
        end ;
    end else
    if( Result = '.' ) then // operator
    begin
        Result := '.' + Parser.Token + Parser.Token ;
    end ;
end ;
This function gets the next token from the parser instance. It then checks for three special cases.
  1. If the token starts with an exclamation point (!) that indicates that the rest of the token line is a comment. So we grab (and discard) the line, get the next token, and then loop back again if its another exclamation.
  2. If the token starts with a percent sign (%), we get the next token and see if it starts with (or is) "X". If not, we push that token back and simply return the percent sign. However, if the next character starts with (or is) "X", then we return the percent sign and that token, assuming that it is a hexadecimal value. The validity of the hexadecimal value is checked by the calling code - we do no syntax validation here.
  3. If the token starts with a period (.), it is an operator. So we get the next token, which will be the operator name (e.g. "EQS" or "NE"), and then the closing period. Note that if the third token is not a period, the operator will not be recognized as a valid operator and the calling code will issue an error. Again, we do no validation here.
By the end of the function, we will return the whole UCL symbol, having skipped any comments.

function Valid_Int( S : string ; var I : int64 ) : boolean ;

begin
    if( copy( S, 1, 2 ) = '%X' ) then
    begin
        S := copy( S, 3, length( S ) ) ;
        if( not Valid_Hex( S ) ) then
        begin
            Result := False ;
            exit ;
        end ;
        Result := True ;
        I := From_Hex( S ) ;
        exit ;
    end ;
    Result := trystrtoint64( S, I ) ;
end ;
The Valid_Int function checks that a string contains a valid integer numeric value and, if valid, returns it in int64 form. First it checks to see if the first two characters are "%X". If so, this is a possible hexadecimal value. So we trim off the first two characters and check if the remaining characters are valid hexadecimal digits. If not, we return false. Otherwise, we convert from the hexadecimal string to an integer value and return true.
If the string doesn't start with "%X", we assume it is a decimal (base 10) number. We use the Pascal trystrtoint64 function to try to convert the value and return true/false, depending on the validity of the number.

function UCL_Strtoint( S : string ) : string ;

var I : int64 ;

begin
    Result := '' ;
    if( copy( S, 1, 2 ) = '%X' ) then
    begin
        S := copy( S, 3, length( S ) ) ;
        if( not Valid_Hex( S ) ) then
        begin
            Result := '0' ;
            exit ;
        end ;
        Result := inttostr( From_Hex( S ) ) ;
        exit ;
    end ;
    if( Valid_Int( S, I ) ) then
    begin
        Result := S ;
        exit ;
    end else
    if( S = '' ) then
    begin
        Result := '0' ;
    end else
    if( pos( S[ 1 ], 'TtYy' ) = 0 ) then
    begin
        Result := '0' ;
    end else
    begin
        Result := '1' ;
    end ;
end ;
This function does type conversions on a string and returns a string containing the numeric value of the converted string, according to the rules we defined in the previous article.
First, we check for the "%X" prefix. If present, we convert the value from hexadecimal to decimal. However, if the hexadecimal value is invalid, we return 0.
Second, we check for a valid integer numeric. If valid, we simply return the passed value.
Third, if the string is null, return 0.
Finally, if the string begins with T, t, Y, or y, return 1. Otherwise, return 0.

Now let's define some constants:

const Op_EQ = 1 ;
const Op_EQS = 2 ;
const Op_NE = 3 ;
const Op_NES = 4 ;
const Op_GE = 5 ;
const Op_GES = 6 ;
const Op_LE = 7 ;
const Op_LES = 8 ;
const Op_GT = 9 ;
const Op_GTS = 10 ;
const Op_LT = 11 ;
const Op_LTS = 12 ;
const Op_NOT = 13 ;
const Op_OR = 14 ;
const Op_AND = 15 ;
const Op_Multiply = 20 ;
const Op_Divide = 21 ;
const Op_Add = 22 ;
const Op_Subtract = 23 ;
The above constants define the various possible operators.

const Function_Context = 100 ;
const Function_CSID = 101 ;
const Function_Cunits = 102 ;
const Function_Cvsi = 103 ;
const Function_Cvtime = 104 ;
const Function_Cvui = 105 ;
const Function_Delta = 106 ;
const Function_Device = 107 ;
const Function_Directory = 108 ;
const Function_Edit = 109 ;
const Function_Element = 110 ;
const Function_Environment = 111 ;
const Function_Extract = 112 ;
const Function_FAO = 113 ;
const Function_FID_To_Name = 114 ;
const Function_File_Attributes = 115 ;
const Function_GetDVI = 116 ;
const Function_Getenv = 117 ;
const Function_GetJPI = 118 ;
const Function_GETQUI = 119 ;
const Function_GETSYI = 120 ;
const Function_IDENTIFIER = 121 ;
const Function_INTEGER = 122 ;
const Function_LENGTH = 123 ;
const Function_LICENSE = 124 ;
const Function_LOCATE = 125 ;
const Function_MATCH_WILD = 126 ;
const Function_MESSAGE = 127 ;
const Function_MODE = 128 ;
const Function_MULTIPATH = 129 ;
const Function_PARSE = 130 ;
const Function_PID = 131 ;
const Function_PRIVILEGE = 132 ;
const Function_PROCESS = 133 ;
const Function_SEARCH = 134 ;
const Function_SETPRV = 135 ;
const Function_STRING = 136 ;
const Function_TIME = 137 ;
const Function_TRNLNM = 138 ;
const Function_TYPE = 139 ;
const Function_UNIQUE = 140 ;
const Function_USER = 141 ;
const Function_VERIFY = 142 ;
These constants define the various lexical functions.

// UCL errors...
const UCL_OK = 0 ; // No error
const UCL_ARGREQ = 1 ; // Missing argument
const UCL_EXPSYN = 2 ; // Invalid expression syntax
const UCL_IVCHAR = 3 ; // Invalid numeric value
const UCL_IVKEYW = 4 ; // Unrecognized keyword
const UCL_IVOPER = 5 ; // unrecognized operator in expression
const UCL_MISSRP = 6 ; // Missing right parenthesis
const UCL_NOLBLS = 7 ; // Label ignored
const UCL_NULFIL = 8 ; // Missing or invalid file specification
const UCL_ONERR  = 9 ; // use WARNING, SEVER, ERROR or CONTROL_Y
const UCL_ONLEVL = 10 ; // Invalid ON context
const UCL_SYMDEL = 11 ; // Invalid symbol or value delimiter
const UCL_UNDSYM = 12 ; // Undefined symbol
These constants define the various UCL errors that we've already seen, or will encounter during expression parsing and evaluation.

Expression Trees
The expression evaluation occurs in two phases: 1) parsing, and 2) evaluating. The parsing phase builds an expression tree for later evaluation. The expression tree is a binary tree, which we will look at in a moment. But first, why create a tree and then evaluate it rather than evaluate as we go? Certainly, we could take that approach. However, the issue of precedence (discussed in the previous article) would make the code more complicated. Instead, we create the tree, and make slight adjustments to it based on precedence as we encounter operators and parentheses. Once the entire expression is parsed, the tree is complete and the evaluation is done recursively - as we will see in a couple articles from now. The tree recursion is, perhaps, the simplest way to evaluate things in precedence order. We'll look at the tree structure next, but first here is the structure we use for each node in the tree:

type tExpression_Node = class
                            Operator : longint ; // Operator
                            Precedence : integer ; // Precedence of node
                            Back, Left, Right : tExpression_Node ;
                            Value : Ansistring ;
                        end ;
Each node is either a value or an operator. If it is an operator, the Operator value will be the constant value associated with the operator (see above). Otherwise, Value contains the value. Precedence only has meaning if the node is an operator. In such case, the precedence is used to build the tree in correct precedence order. The Left and Right links point to the left and right values that the operator is acting on (or only the right value if this is a unary operator). Values are always terminal nodes on the tree and have no left or right links. The Back link points back to the parent node of this node, which simply makes traversing the tree easier.

Before we continue looking at code, let's consider some simple examples of expression trees to illustrate what we will be building.

These three examples of expression trees should help illustrate what we are talking about. Before any operator node can be evaluated, all sub-trees must first be evaluated. Thus, placing higher precedence operators into subtrees ensures that they will be evaluated before other operations. Such is the case with the rightmost example. Before the addition operation in the root node can be evaluated, the right sub-tree has to be evaluated. Evaluating that sub-tree gives us a value of 6. Then the addition can be done, resulting in 7. Using parentheses changes the tree structure so that the sub-trees that are evaluated first are those within the parentheses. For example:

Once we've evaluated the expression tree, we will need to free up the memory used by the data structure. The following function accomplishes this:

procedure Zero_Expression_Tree( X : tExpression_Node ) ;
// Zero storage stored at tree with root X

var Current, Previous : tExpression_Node ;

begin
    // Start at root
    Current := X ;
    while( Current <> nil ) do
    begin
        // Go all the way left 
        while( Current.Left <> nil ) do
        begin
            Current := Current.Left ;
        end ;
        if( Current.Right <> nil ) then
        begin
            Current := Current.Right ;
        end else
        begin
            Previous := Current.Back ;
            if( Previous <> nil ) then
            begin
                if( Previous.Left = Current ) then
                begin
                    Previous.Left := nil ;
                end else
                begin
                    Previous.Right := nil ;
                end ;
            end ;
            Current.Free ;
            Current := Previous ;
        end ;
    end ; // while( Current <> nil ) do
end ; // Zero_Expression_Tree
This function clears the sub-tree with the root node that was passed in. If the root of the entire expression tree is passed, then the entire expression tree is deleted from memory. The algorithm is to follow the left links all the way to a node with no left link, then see if there is a right link. If so, go all the way left from that right node. This is repeated until we are at a terminal node. We free the terminal node, then back up to the parent node and clear the link to the node we just freed (se we don't try to traverse the deleted node again). Then we repeat the process from that point. We could do this recursively, but that requires more stack space for each recursion. Looping like this takes the minimum amount of resources, although the code is slightly more complicated.

We'll leave it here for now. In the next article, we will look at the parsing code itself, as it builds the expression tree. That will be a long enough article by itself, so we put the support code in this separate article so that the next one won't be so long.

 

Copyright © 2019 by Alan Conroy. This article may be copied in whole or in part as long as this copyright is included.