1 Introduction
2 Ground Rules

Building a File System
3 File Systems
4 File Content Data Structure
5 Allocation Cluster Manager
6 Exceptions and Emancipation
7 Base Classes, Testing, and More
8 File Meta Data
9 Native File Class
10 Our File System
11 Allocation Table
12 File System Support Code
13 Initializing the File System
14 Contiguous Files
15 Rebuilding the File System
16 Native File System Support Methods
17 Lookups, Wildcards, and Unicode, Oh My
18 Finishing the File System Class

The Init Program
19 Hardware Abstraction and UOS Architecture
20 Init Command Mode
21 Using Our File System
22 Hardware and Device Lists
23 Fun with Stores: Partitions
24 Fun with Stores: RAID
25 Fun with Stores: RAM Disks
26 Init wrap-up

The Executive
27 Overview of The Executive
28 Starting the Kernel
29 The Kernel
30 Making a Store Bootable
31 The MMC
32 The HMC
33 Loading the components
34 Using the File Processor
35 Symbols and the SSC
36 The File Processor and Device Management
37 The File Processor and File System Management
38 Finishing Executive Startup

Users and Security
39 Introduction to Users and Security
40 More Fun With Stores: File Heaps
41 File Heaps, part 2
42 SysUAF
43 TUser
44 SysUAF API

UCL
45 Shells and UCL

Glossary/Index


Download sources
Download binaries

Contiguous Files

Okay, after a break for the holidays, I'm ready to jump back into the development of UOS. However, before we continue with the file system, we need to change our UOS Native File class.
Let's get theoretical here for a moment... There are three important aspects to software engineering (or, in fact, any kind of engineering). The first aspect of engineering is to identify goals - what we want to end up with. If we don't have clear goals, we neither know what direction we are heading nor will we know when we get there. Without the focus of goals, no engineering project will be a success. But it might be a spectacular failure! The second aspect of engineering is the design. The design is guided by a design philosophy. I don't mean the philosophy of design itself, but rather the philosophy you will use to reach your goals. That is, what is the general approach one will take to meet the goals. The approach must be complex enough to reach our goals, but no more complex than that. In fact, the simpler the approach, the more likely you are to complete the project. The philosophy (or approach, or design) is important because engineering is, essentially, managing complexity. Having a clear philosophy is important because if the design isn't clear to you, you will find that different parts of your code are not going to work well together. Changing philosophy mid-stream is just as bad. Imagine if you started building an airplane and you put a jet engine on one wing, but then decided to put a rocket engine on the other. Spectacular failure. What if our goals change? Well, then your philosophy may need to change as well. But unless it is a minor adjustment, it is best to scrap what you've done and start over. You may think you are losing time, but you will gain that back over the life of the project, and then some. The third aspect of engineering is the implementation. In software, this encompasses the classes, algorithms, and data structures that you use to meet your goals in accordance with your design. It is vital to recognize that goals determine philosophy and philosophy determines implementation - and not the other way around. You don't change your goals to match your approach, nor do you change your approach to match your implementation.

Picking the right goals is a matter of understanding the problem domain and the requirements placed on you by others and/or the situation. We stated the goals of UOS early on. Picking the right approach is largely a matter of experience. It may require trial and error to gain that experience. But if you choose the wrong approach, you will not meet your goals. Even experienced engineers may sometimes choose the wrong approach, especially if they are doing something new and different. Fortunately, we have not only my experience in operating systems, but we are standing on the shoulders of the operating system designers of the past who have already worked out many of the sticky problems that present themselves in this domain. We have talked a bit about our approach to UOS before, but in order to keep these articles from bogging down, I've revealed it in bits and pieces as we go along. Mainly we've been discussing the implementation aspect of UOS.

One general approach of all software design is the concept of "layering". That is, we have one layer of software that does one thing, and then a layer above that which adds additional features. And one above that, etc. The idea is that each layer does a few things very well instead of a bunch of things inefficiently. So far, we've talked about 5 layers to our file system: the store which is concerned solely with the storage and access of data; the managed store, which adds allocation management to the store; the Allocation Cluster Manager, which manages variable-length buffers (files) through a chain of pointers on a managed store; the UOS Native File class, which abstracts the data access from the cluster-oriented data access of the ACM and adds the concept of multiple stores; and the file system, which supports multiple files on the store and manages their relationship to each other. Each layer from the store up to the file system does one or two things which are easy to implement, test, and maintain. This is part of our general approach/design. Another aspect of our design is to push the features up to the highest level that makes sense. We don't implement multiple streams inside the ACM class, because it would make the ACM overly complicated and more prone to failures. It is easier to manage two classes, one of which simply uses multiple instances of the other to accomplish its goals. Here is a diagram of our current layering (or code "stack").

But sometimes it is difficult to know exactly where to place a given feature. And sometimes mistakes can be made. The general rule of thumb here is to place features where they most easily fit. Remember, we want to keep things conceptually simple or we will get lost inside the code as it grows in size. In general, the best place to implement a feature is where it will require the least and simplest (most straight-forward) code to implement. I bring this up because I had originally planned to implement the contiguous file feature in the file system class. However, as I started to contemplate the actual code for this feature, it became obvious that the best place for it was in the UOS Native File class. So, we will go back to visit that class just long enough to implement the contiguous file feature, which requires only a few additions to the existing methods, as it turns out. It doesn't go in the ACM because we cache the first few clusters of data in the file header (which the ACM knows nothing about), and we want to push the feature to as high a level as possible without making the code more complicated than necessary.

Here is the design we will use for contiguous files:
A non-contiguous file is made up of some number of clusters all linked via the Allocation Cluster chains. But if the file is gathered into one place instead of multiple extents, there is no need for a chain of clusters. We simply need a start cluster and a length. Since the file header already holds five allocation clusters, we can place the start cluster in the first header allocation cluster. The rest will be unused. Nor will there need to be the use of the ACM for the file's data, although it will still be used for meta data. So, the means of accessing the data depends upon the setting of the contiguous flag. Here is the code for class definition:

_Contiguous : boolean ;

function Get_Contiguous : boolean ; override ;

procedure Set_Contiguous( Value : boolean ) ; override ;

procedure Set_Allocation_Cluster( Stream, Index : longint ; Value : TStore_Address64 ) ; virtual ;

We add _Contiguous as instance data, and define a public getter and setter. We also add a new method, which we will discuss in a bit.

First we update Offset_To_Pointer:

function TUOS_Native_File.Offset_To_Pointer( ACM : TNative_File_ACM ;
    Position : TStore_Address64 ) : TStore_Address64 ;

var H : TStore_Address64 ;

begin
    if( ACM = Streams[ 0 ] ) then // Data stream
    begin
        H := Position div ACM.Get_Clustersize ; // Cluster index
        if( Contiguous ) then
        begin
            Result := Header.Clusters[ 0 ] + H * ACM.Get_Clustersize ;
            Position := Position - Result ;
            exit ;
        end else
        begin
            if( H <= Max_Header_Cluster ) then
            begin
                Result := Header.Clusters[ H ] ;
                exit ;
            end ;
            Position := Position - ( Max_Header_Cluster + 1 ) * ACM.Get_Clustersize ;
        end ;
    end ;
    Result := ACM.Offset_To_Pointer( Position ) ;
end ;

If contiguous, we calculate a given file cluster based on the first cluster, plus the offset. If not set, we use the existing code.

Next we add code to the Set_Size method to handle resizing a contiguous file:

        if( Contiguous ) then // Changing the size of a contiguous store
        begin
            if( Header.Size = 0 ) then // First allocation
            begin
                P := _Store.Allocate( Size ) ;
            end else
            if( Size = 0 ) then
            begin
                _Store.Deallocate( Header.Clusters[ 0 ], Header.Size ) ;
                P := 0 ;
            end else
            begin
                P := _Store.Reallocate( Header.Clusters[ 0 ], Header.Size, Size ) ;
            end ;
            if( ( P = 0 ) and ( Size > 0 ) ) then
            begin
                Set_Last_Error( Create_Exception( UOS_File_Error_Out_Of_Space, _Store.Last_Error ) ) ;
                exit ;
            end ;
            Header.Clusters[ 0 ] := P ;
            if( Header.EOF > Requested_Size ) then
            begin
                Header.EOF := Requested_Size ;
            end ;
            Header.Size := Size ;
            Update_Header ;
            exit ;
        end ;

If the file has no data, we simply allocate the requested size. If we are setting the size to 0, then we simply deallocate what we have. Otherwise, however, we need to see if we can resize the allocated data. When a store reallocates data, it allocates the new area, copies the data, and deallocates the old. It then returns the new location, which we update our header with. Note: some stores may opt to attempt to resize the allocated data in place, which saves on the work of copying the data, so we may get the same store address back that we already had. But to our code, it doesn't matter. Finally, we make sure that our EOF offset isn't past the new size.

We also make a slight change to Get_Allocation_Cluster:

        if( Contiguous ) then
        begin
            Result := Header.Clusters[ 0 ] ;
            exit ;
        end ;

Set_Allocation_Cluster allows us to set individual cluster values in the allocation cluster chain, which we will need in the Set_Contiguous method:

procedure TUOS_Native_File.Set_Allocation_Cluster( Stream, Index : longint ;
    Value : TStore_Address64 ) ;

var ACM : TNative_File_ACM ;

begin
    if( Stream < 0 ) then
    begin
        Set_Last_Error( Create_Exception( UOS_File_Error_Invalid_Stream_Index, nil ) ) ;
        exit ;
    end ;
    Set_Last_Error( nil ) ;
    ACM := Find_Data_Stream( Stream ) ;
    if( ACM = nil ) then
    begin
        if( _Store.Last_Error <> nil ) then
        begin
            Set_Last_Error( Create_Exception( UOS_File_Error_Write_Failure, _Store.Last_Error ) ) ; { Untested }
        end else
        begin
            Set_Last_Error( Create_Exception( UOS_File_Error_No_Such_Stream, nil ) ) ;
        end ;
        exit ;
    end ;
    if( Stream = 0 ) then
    begin
        if( Contiguous ) then
        begin
            Header.Clusters[ 0 ] := Value ;
            exit ;
        end ;
        if( Index <= Max_Header_Cluster ) then
        begin
            Header.Clusters[ Index ] := Value ;
            Update_Header ;
            exit ;
        end ;
        Index := Index - Max_Header_Cluster - 1 ;
    end ;
    ACM.Set_Allocation_Cluster( Index, Value ) ;
    if( ACM.Last_Error <> nil ) then
    begin
        Set_Last_Error( Create_Exception( UOS_File_Error_IO_Error, _Store.Last_Error ) ) ; { Untested }
    end ;
end ; // TUOS_Native_File.Set_Allocation_Cluster

Once you get past the error checking and such, this simply calls the corresponding method in the allocation cluster manager.

Finally, the implementation of the getter and setter:

function TUOS_Native_File.Get_Contiguous : boolean ;

begin
    Result := _Contiguous ;
end ;


procedure TUOS_Native_File.Set_Contiguous( Value : boolean ) ;

var ACM : TNative_File_ACM ;
    Buffer : PChar ;
    Index : longint ;
    C, S : TStore_Size64 ;
    New, P : TStore_Address64 ;

begin
    if( Header.Size = 0 ) then
    begin
        exit ;
    end ;
    if( _Contiguous <> Value ) then
    begin
        // Setup...
        ACM := Find_Data_Stream( 0 ) ;
        S := Header.Size ;
        C := ACM.Get_Clustersize ;
        Index := 0 ;

        // Convert...

The getter is simple. The real work comes where we need to convert between contiguous and non-contiguous. The operation is quite different depending on which direction we are going:
        if( Value ) then
        begin
            // Convert to contiguous...
            P := Store.Allocate( S ) ;
            if( P = 0 ) then
            begin
                exit ; { Untested }
            end ;

            // Copy the current data to the new contiguous area...
            New := P ;
            while( S > 0 ) do
            begin
                S := S - C ;
                Store.Copy( Get_Allocation_Cluster( 0, Index ), P, C ) ;
                if( Store.Last_Error <> nil ) then
                begin
                    Set_Last_Error( Create_Exception( UOS_File_Error_Read_Failure, nil ) ) ;
                    exit ;
                end ;
                P := P + C ;
                inc( Index ) ;
            end ;

            // Deallocate the old data and assign pointer to new data...
            _Contiguous := Value ;
            Set_Size( ACM, Header.Size ) ;
            Header.Clusters[ 0 ] := New ;
            for Index := 1 to Max_Header_Cluster do
            begin
                Header.Clusters[ Index ] := 0 ;
            end ;
        end else

To convert to contiguous, we allocate a contiguous area on the store that is large enough to hold the file data. Then we copy the data from the old clusters to the new, contiguous, cluster. We clear all the existing header clusters and use Set_Size to clear the ACM, if any. We also set the first header cluster to the new starting cluster.
        begin
            // Convert from contiguous...
            P := Header.Clusters[ 0 ] ; // First offset of contiguous area
            _Contiguous := Value ;
            ACM.Prepare_Size( S - Max_Header_Cluster * C ) ; // Make room for required number of cluster pointers
            while( S > 0 ) do
            begin
                S := S - C ;
                Set_Allocation_Cluster( 0, Index, P ) ;
                P := P + C ;
                inc( Index ) ;
            end ;
        end ;
        Update_Header ;
    end ; // if( _Contiguous <> Value )
end ; // TUOS_Native_File.Set_Contiguous

To convert to non-contiguous, we could go through the process of allocating each cluster in a chain and copying the data. But, we can do a shortcut instead. Since a contiguous file's data is really nothing more than a series of clusters, one after another, in order, we can leave the data clusters where they are, and simply create the entries for an allocation cluster chain to point to the various clusters in the contiguous area. In fact, the file data is still contiguous on the disk, although it is accessed via the allocation chain. So, it is physically contiguous, but logically non-contiguous. It is also possible that files will end up being physically contiguous just based on where free space exists on the store when the file is being written to, even though they are logically non-contiguous. Now the user has the option to define which files are accessed as contiguous and which ones are not, with all the advantages and disadvantages we discussed way back in Article 3.

We did add one method to the Allocation Cluster Manager class as well, to support setting individual cluster values in the chain:

procedure TCOM_Allocation_Cluster_Manager64.Set_Allocation_Cluster( I : longint ;
     Cluster : TStore_Address64 ) ;

var AC, Offset : cardinal ;

begin
    Set_Last_Error( nil ) ;
    if( I < 0 ) then
    begin
        exit ;
    end ;
    if( Size < I * _Clustersize ) then
    begin
        Size := I * _Clustersize ;
    end ;
    AC := I div _Max_Index ; // Allocation cluster index
    Offset := I mod _Max_Index ; // Pointer index within allocation cluster index
    Pull( AC ) ;
    if( _Last_Error <> nil ) then
    begin
        exit ;
    end ;
    Buffer^[ Offset ] := Cluster ;
    Write( _Current_AC_Pointer ) ;
end ;

There was no need to set individual cluster offsets from outside the class until the changes in TUOS_Native_File to handle contiguous files. Fortunately, it is a simple change, and doesn't require the class to know anything about how it is actually used. Thus, it remains a "black box" class.

As a side note, while testing the contiguous file changes, I ran into a problem due to the store I was using. I was using a THeap64 class, which is just a wrapper around the Delphi heap manager. Since I hadn't replaced the standard heap manager with our own heap manager, the code was subject to implementation issues of the standard heap manager. In the case of the standard Delphi Heap manager, everything worked fine when we were allocating and deallocating clusters. But when we converted from contiguous to non-contiguous, and simply pointed to the individual clusters within the contiguously-allocated area, it corrupted the heap. Why? Because the standard heap allocates more than you ask for and stores the size of the allocated area just prior to the address that it returns. This is to make deallocations easier, since the program doesn't have to know how much memory is allocated when it frees a buffer or object. After our conversion, the "pointers" we used didn't have this extra size information prior to the data and when we deleted the file or resized it smaller, the heap assumed that before the pointer address was size information. But there wasn't. So I fixed this by using a RAM disk store which just allocates a large buffer and uses an allocation table to allocate data in that buffer. The use of the standard heap originally was just for speed of implementation of the test. The current test environment is closer to the actual situation that UOS will have.

In the next articles, we will write the methods that will allow us to navigate and modify our file system.