CDFS module 3.00 Functional Specification
-----------------------------------------

Authors: T.G.Roddis, Mike Challis


Contents
--------

2 Purpose
4 Product Overview
5 Concepts and Definitions
  5.1 Definitions
  5.2 Concepts
  5.3 Working Model
    5.3.1 Initialisation
    5.3.2 Finalisation
    5.3.3 Directory Navigation Policy
    5.3.5 File names and attributes
    5.3.6 ISOFORM objects
    5.3.7 Standard objects
    5.3.8 Filename Translation Policy
    5.3.9 Filetype mappings
    5.3.10 Interpretation of the CMOS byte
    5.3.11 Current drive concept
    5.3.12 Disc names
    5.3.13 Discs and drives
    5.3.14 Ambiguous disc names
  5.4 Use of CDManager
    5.4.1 SWIs
6 User Interfaces
7 Programmer Interfaces
    7.1.2 Service Calls
  7.2 Direct Programmer Interface
    7.2.1 SWIs
    7.2.2 Star commands
  7.3 Errors returned
8 Standards
  8.1 High Sierra
  8.2 ISO 9660
    8.2.1 Clarification of Support for ISO 9660
9 Data Interchange Formats


2. Purpose
----------

This document describes the behaviour of CDFS 3.00. CDFS 3.00 is the filing
system specific module of the CD software system. It is highly recommended
that the reader is familiar with the CD System Specification before
continuing.


4. Product Overview
-------------------

Using the functionality provided by CD Manager, the CDFS module reads data
from CDs. Where that data represents a valid High Sierra or ISO9660 filing
system, CDFS processes the data and provides a FileSwitch interface to the
filing system, giving it RISC OS compliance. If the compact disc has been
formatted with ISOFORM, CDFS provides extra, Acorn-specific information
about the data objects on the disc.

More precisely, CDFS recognises directory hierarchies which are recorded on
CD-ROMs according to any one of the following standards:

  High Sierra
  Green Book (CD-I)
  ISO9660

Examples of CD formats which include such hierarchies are:

  CD-ROMs (standard "Yellow Book")
  CD-I ("Green Book")
  CD-ROM XA
  Video CDs ("White Book")
  Photo CDs
  Enhanced Music CDs ("Blue Book")

CDFS also supports multi-session discs, where the root of the filing system
(called the "PVD") is retrieved from the final recorded session rather than
the first.

On the other hand, CDFS does not support discs formatted according to
proprietary standards such as Apple's HFS format.

The data inside files is recorded in one of three ways: Mode 1, Mode 2
Form 1 or Mode 2 Form 2. The first two formats give 2048 bytes of high
reliability data per sector, and the third format gives 2324 bytes of less
reliable data per sector. For a given drive speed, the number of bytes which
can be read each second is proportional to the number of bytes recorded in
each sector, and Mode 2 Form 2 was designed to make it possible to retrieve
MPEG encoded video from a single-speed drive.

CDFS assumes that files contain only Mode 1 or Mode 2 Form 1 sectors, and
provides support for files to be loaded or opened using the standard RISC OS
file operations such as OS_File 255 (Load), OS_Find 64 (Open) and OS_GBPB 3
(read bytes). If CDFS encounters a Mode 2 Form 2 sector while reading a
file, an error is returned.

Files which are known to contain only Mode 1 or Mode 2 Form 1 sectors
include:

  All Mode 1 files (standard 'Yellow Book' CD-ROMs).
  All files containing image packs on Photo CDs.
  Most data (as opposed to video or audio) files on CD-ROM XA discs.

Files which usually contain Mode 2 Form 2 sectors include those containing
compressed audio and video data, as found on CD-I and Video CD discs (Green
Book and White Book).

CDFS does not provide support for interleaved files.

[ The CD-ROM XA standard for Mode 2 discs specifies an extension to ISO9660
  whereby each directory record includes information about the form of
  sectors contained in the file, so under certain circumstances it is
  possible for an application to determine in advance whether a given file
  includes Mode 2 Form 2 sectors or not. Unfortunately, CD-I discs do not
  contain this information. ]


5. Concepts and Definitions
---------------------------

5.1 Definitions
---------------

Error numbers are given in this specification relative to the module's error
base; this is indicated by the notation '+n'. This notation is also used
when describing the layout of parameter blocks, and there represents a byte
offset from the start of the block.

5.2 Concepts
------------

CDFS is responsible only for the filing system aspects of the CD System. It
has minimal awareness of the other system components. Equally other system
components have minimal awareness of the filing system hierarchy on the
disc.

CDFS requires CD Manager in order to function.

The following notes rely on access to the filing system sections of the
PRMs. Whilst it is not necessary to understand the technical aspects of the
interface with FileSwitch, the PRMs should be the first port of call if any
doubt should arise concerning the following text.

5.3 Working Model
-----------------

5.3.1 Initialisation
--------------------

During initialisation, CDFS registers with FileSwitch using OS_FSControl 12.

5.3.2 Finalisation
------------------

During finalisation, CDFS executes *bye (see 7.2.2) and then deregisters
from FileSwitch by calling OS_FSControl 16 before freeing workspace.

5.3.3 Directory Navigation Policy
---------------------------------

CDFS uses only the directory hierarchy to navigate the CD-ROM. Path tables
are not used.

5.3.5 File names and attributes
-------------------------------

Every file or directory on a CD-ROM is described by a 'directory record'
which includes the following information:

  The object's name.
  A flag indicating whether the object is a file or a directory.
  A flag indicating whether the object is an 'associated file' or not.
  The object's size in bytes.
  The time and date when the object was recorded.
  A 'system use area' whose content is system-dependent (optional).

If the disc has been formatted using ISOFORM, the system use area includes
further information specific to RISC OS; if not, the details required by
RISC OS must be derived from the standard information available in the
directory record. These two situations are described separately in the
following two sections.

5.3.6 ISOFORM objects
---------------------

In this case, the directory record's system use area is formatted as
follows:

  Bytes        Format   Value          Contents
  +0 to +9     STRING   "ARCHIMEDES"   Identifying signature
  +10 to +13   WORD                    Load address
  +14 to +17   WORD                    Execution address
  +18 to +21   WORD                    File attributes
  +22 to +31            0              Reserved (zero)

The Load address, Execution address and File attributes fields are the usual
RISC OS values except that the attributes word has an additional bit
defined:

  bit 8 - is 1 if the object name starts with an underline which should be
          replaced by an exclamation mark by CDFS

! is not a legal ISO9660 character, and the purpose of bit 8 is to allow
RISC OS application directories and boot files etc. to be stored on a
CD-ROM. An object whose name is !<name> is stored as an object with name
_<name> on the disc, and bit 8 above is set to 1. CDFS examines this
attribute bit before processing an object name, and, if set, replaces the
leading underline by !.

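The layout above can be sketched in a few lines of Python. This is an
illustration only, not the module's implementation: the function names
parse_isoform_area and apply_pling are hypothetical, and the WORD fields are
assumed to be stored little-endian (the usual RISC OS convention; the table
above does not state the byte order).

```python
import struct

SIGNATURE = b"ARCHIMEDES"        # bytes +0 to +9 of the system use area
UNDERLINE_IS_PLING = 1 << 8      # bit 8 of the file attributes word


def parse_isoform_area(area: bytes):
    """Return (load, exec, attributes), or None if this is not an
    ISOFORM system use area (wrong length or missing signature)."""
    if len(area) != 32 or area[:10] != SIGNATURE:
        return None
    # Three words at +10, +14 and +18; little-endian is an assumption.
    load, exec_addr, attributes = struct.unpack_from("<III", area, 10)
    return load, exec_addr, attributes


def apply_pling(name_on_disc: str, attributes: int) -> str:
    """Replace a leading underline by '!' when bit 8 is set."""
    if attributes & UNDERLINE_IS_PLING and name_on_disc.startswith("_"):
        return "!" + name_on_disc[1:]
    return name_on_disc
```

For example, an object stored as _BOOT with bit 8 set would be presented as
!BOOT, while the same name with bit 8 clear is left unchanged.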
In detail, if the directory record's system use area is 32 bytes long, and
has the (unterminated) string "ARCHIMEDES" in the first ten bytes, then:

  The object's name is taken from the directory record, and modified by
  replacing an initial underline by ! if bit 8 of the attributes field in
  the system use area is set. If the object is a file, the string ".;1" is
  removed from the end of the name. (Note that ISOFORM ensures that only
  legal RISC OS characters appear in the object's name, and that files are
  always recorded with null extensions and version number 1.)

  The load and execution addresses (which, if the top 12 bits are all one,
  are interpreted by RISC OS as a date stamp and file type), are taken from
  the corresponding fields in the system use area.

  The object's length is taken from its directory record.

  The file attributes are a copy of the corresponding field in the system
  use area with bit 8 forced to zero.

5.3.7 Standard objects
----------------------

In all other cases, the RISC OS properties of the object are determined
according to the following rules.

The object's RISC OS name is constructed as follows:

  The object name is extracted from the directory record, and parsed
  according to the following syntax:

    name -> [<filename>] [ . [<extension>] ] [ ; [<version>] ]

  (where <version> is an integer value in the range 1 to 32767).

  The final component [ ; [<version>] ] is removed if the version number is
  absent or equal to 1.

  The middle component [ . [<extension>] ] is removed if no extension is
  present.

  An exclamation mark is added to the end of the name if the object is
  flagged as an associated file.

  Any special characters in the name (including the separator .) are
  translated as described in section 5.3.8 below.

Some examples:

  ISO9660 name   RISC OS name
  FRED.DAT;3     FRED/DAT;3
  FRED.DAT;1     FRED/DAT
  FRED.DAT;      FRED/DAT
  FRED.;1        FRED
  FRED.;1        FRED!        - if it is an 'associated' file
  F$76.BAT       F_76/BAT

The object's RISC OS date stamp is calculated from the date and time when
the object was recorded.

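The name construction rules for standard objects can be illustrated with a
short Python sketch. The function name riscos_leafname is hypothetical, and
the sketch handles only the separator '.' (translated to '/'); the general
special character translation of section 5.3.8 is deliberately left out. It
also assumes that any version present consists of decimal digits.

```python
def riscos_leafname(iso_name: str, associated: bool = False) -> str:
    """Build a RISC OS leafname from an ISO9660 object name."""
    # Split "filename.extension;version" into its three components.
    body, _, version = iso_name.partition(";")
    filename, _, extension = body.partition(".")

    name = filename
    if extension:
        # Keep the extension, with '.' translated to '/'.
        name += "/" + extension
    if version and int(version) != 1:
        # Only versions other than 1 keep their suffix.
        name += ";" + version
    if associated:
        name += "!"
    return name
```

Under these assumptions riscos_leafname("FRED.DAT;1") gives "FRED/DAT" and
riscos_leafname("FRED.;1", associated=True) gives "FRED!".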
If the object is a file, its RISC OS filetype is determined by mapping any
extension as described in 5.3.9 below; if no extension is present, it is
given filetype &FFD (Data).

The object's RISC OS length is copied from its directory record.

The object's RISC OS attributes are given as &11 (owner read, public read).

5.3.8 Filename Translation Policy
---------------------------------

Some rare discs have characters in filenames which are outside the normal
range of characters in use on CD-ROMs.

All characters are treated as per the ISO 8859/1 extensions to ASCII.

The control characters, that is those characters in the range &0 to &1F, are
translated to '?' (&3F).

The characters in the interval &20 to &7E (inclusive) are, with the
exception of the following characters, not translated:

  .   (&2E) is translated to / (&2F)
  SPC (&20) is translated to _ (&5F)

  :   (&3A) \
  *   (&2A) |
  #   (&23) |
  $   (&24) |
  &   (&26) |>- These are all translated to '?' (&3F)
  @   (&40) |
  ^   (&5E) |
  %   (&25) |
  \   (&5C) |
  "   (&22) /

Character DEL (&7F) is translated to '?'.

Top-bit set characters (&80 - &FF) are not translated.

The translation will be reversible for the vast majority of cases, that is
to say, the algorithm used to find a file will be designed to appreciate
that certain characters, particularly '_', when occurring in a target
filename may actually represent a different character to that stored on the
ISO9660 disc.

CDFS will not avoid the possibility of two files having the same names and
will carry out no contingency action in such cases, but it will remain
robust in such cases, neither faulting itself nor causing an error
elsewhere.

5.3.9 Filetype Mappings
-----------------------

Filetype mapping (also known as 'User Aliasing') is provided using the
TypeMapper module. This is a generic mechanism, allowing other applications
to utilise its facilities.

Filename extensions can appear at the end of any ISO 9660 filename.
If they occur they are mapped onto a filetype - although the extension is
not removed in order to avoid introducing possible filename ambiguities.

The following mappings are implemented automatically (these include the
default mappings provided by the old CD system):

  extension   file type
  DOC         &FFF  Text
  TXT         &FFF  Text
  TIF         &FF0  TIFF
  BAT         &FDA  MSDOSbat
  EXE         &FD9  MSDOSexe
  COM         &FD8  MSDOScom
  PCD         &BE8  PhotoCD
  GIF         &695  GIF
  BMP         &69C  BMP
  WAV         &FB1  WAV
  HTM         &FAF  HTML
  AVI         &FB2  AVI
  MPG         &BF8  MPEG
  JPG         &C85  JPEG
  CSV         &DFE  CSV

5.3.10 Interpretation of the CMOS byte
--------------------------------------

The bottom nybble of the CDFS CMOS byte (number 138) determines the size of
the directory cache as follows:

  value     directory cache size
  0 - 3     Reset this CMOS nybble to 7 (see below)
  4         0 (ie CDFS will use as little space as possible)
  5         16k
  6         32k
  7         64k
  8         128k
  9         256k
  10        512k
  11        1024k
  12 - 15   Reset this CMOS nybble to 7 (see below)

In previous versions of CDFS, bits 0 - 4 of this CMOS byte contained the
configured number of drives. This would normally be less than 3, and this is
why values 0 to 3 are reset automatically by CDManager.

5.3.11 Current drive concept
----------------------------

Like FileCore, CDFS supports a "current drive number" concept: this is set
by the *drive command, and defaults to 0.

It is used whenever FileSwitch asks for the canonical form of the NULL disc
name: CDFS returns the name of the disc in the current drive. This means,
for example, that a request to access a file 'x' when no CSD is defined will
look for 'x' inside the root directory of the disc in the current drive.

5.3.12 Disc names
-----------------

The disc name associated with a CD-ROM is derived from its ISO9660 Volume
Identifier as follows:

  a) Replace any special characters as described in 5.3.8 for filenames.

  b) If the resulting name consists of exactly one or exactly two decimal
     digits, insert a leading underscore.

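Steps (a) and (b) can be sketched in Python as follows. The names
translate_char and disc_name are hypothetical, the character mapping is
taken from the table in section 5.3.8, and the Volume Identifier is assumed
to have been decoded already into a string of ISO 8859/1 characters. The
sketch is not CDFS's actual code.

```python
# Characters which section 5.3.8 translates to '?'.
TRANSLATED_TO_QUERY = set(':*#$&@^%\\"')


def translate_char(ch: str) -> str:
    """Translate one character according to section 5.3.8."""
    code = ord(ch)
    if code < 0x20 or code == 0x7F:
        return "?"              # control characters and DEL
    if ch == ".":
        return "/"
    if ch == " ":
        return "_"
    if ch in TRANSLATED_TO_QUERY:
        return "?"
    return ch                   # includes top-bit set characters


def disc_name(volume_identifier: str) -> str:
    """Derive a disc name from a Volume Identifier."""
    # Step (a): replace special characters.
    name = "".join(translate_char(ch) for ch in volume_identifier)
    # Step (b): one or two decimal digits would look like a drive number.
    if len(name) in (1, 2) and all(ch in "0123456789" for ch in name):
        name = "_" + name
    return name
```

With this sketch, a Volume Identifier of "42" yields the disc name "_42",
whereas "123" is left unchanged because it has three digits.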
This process ensures that no disc name will confuse FileSwitch, and that
disc names and drive numbers cannot be confused.

5.3.13 Discs and drives
-----------------------

CDFS maintains information about each disc that it knows about, including
the logical number of the drive where it was last seen, and the
corresponding sequence number (see the CDManager specification for an
explanation of this concept).

When a call is made to CDManager, both logical drive number and sequence
number are quoted; if the disc is not available on the drive, one of the
following errors is returned:

  Disc may have changed
  Drive empty
  Drive becoming ready
  Drive not responding as expected
  Drive number not known

The action taken by CDFS in each case is as follows:

  Disc may have changed:

    CDFS first checks the original drive to verify that the user hasn't
    just replaced the disc. If it is the case that the disc has changed,
    CDFS initiates a search as described below.

  Drive empty:

    CDFS initiates a search as described below.

  Drive becoming ready:

    CDFS keeps trying to access the drive until (a) the condition is
    cleared, (b) a predetermined timeout* expires, or (c) the condition
    changes. If the condition changes it will then deal with that
    appropriately; otherwise the error "Drive remains busy" is returned to
    the client.

    [ * An appropriate value for the timeout depends on the behaviour of
        the drive, and is determined by calling CDMgr_DriveOp BusyTimeOut. ]

  Drive not responding as expected:

    CDFS returns this error to the client.

  Drive number not known:

    CDFS calls CDMgr_DriveOp, EnumerateDrives; if this indicates that the
    drive might exist, the call is repeated. Otherwise, the error is
    returned to the client.

The search for a disc proceeds as follows:

  Look for the disc in each known drive.

  If not found, call CDMgr_DriveOp, EnumerateDrives to see if there are any
  more drives to check.

  Look for the disc in any additional drives now available.

  If still not found, issue UpCall 1/2: either the Wimp or the CLI will
  display a message asking the user to insert the requested disc.

  If the user fails to insert the disc, the error "Disc not found" is
  returned to the client.

There is a limit to the number of different discs which CDFS can "remember"
at any one time, and when this limit is exceeded one will be selected to be
"forgotten" by dismounting it (see section 7.2.2). This limit will never be
less than 32, and will be increased automatically in systems with large
numbers of CD-ROM drives.

5.3.14 Ambiguous disc names
---------------------------

It is possible (but unlikely) that CDFS will encounter two CD-ROMs which
have identical disc names but different content: note that two discs with
the same name are assumed to be identical if CDMgr_MiscOp, WhichDisc returns
the same value for each disc.

If this situation arises, CDFS forcibly dismounts the previous disc so that
future references will always access the most recently encountered disc.

5.4 Use of CDManager
--------------------

CDFS must 'understand' the following parts of the CDManager interface in
order to carry out the functions it performs.

5.4.1 SWIs
----------

CDMgr_GetSupported is used to determine whether multi-session discs are
supported.

CDMgr_ReadTOC is used to determine the number of sessions on each disc,
where they are, and whether the disc includes any data tracks (or is audio
only).

CDMgr_DriveOp is used, in particular the following reason codes are used:

  GetSequenceNumber is used to get the media change count for the current
  disc.

  BusyTimeOut is called to ascertain a suitable timeout value for a busy
  drive.

CDMgr_ReadData is used to read any data necessary from the program area of
the disc. The transfer is a reliable, foreground, user data read, with
expected sector type = &10000: this will read 2048 bytes from either a
Mode 1 or Mode 2 Form 1 sector.

CDMgr_MiscOp is used, in particular the following reason code is used:

  WhichDisc is used in conjunction with reading the disc name from the
  primary volume descriptor, to quickly determine whether the disc in the
  drive after a media change is identical to the one before.


6. User Interfaces
------------------

The user interface is via the Filer. The Filer in turn makes calls to
FileSwitch.

The Wimp or CLI responds to UpCall 1/2 (when a requested disc is not
present).


7. Programmer Interfaces
------------------------

7.1.2 Service Calls
-------------------

CDFS will respond to the service call Service_FSRedeclare by calling
OS_FSControl 12 again in an identical manner to start up.

7.2 Direct Programmer Interface
-------------------------------

The following calls are provided because of their usefulness to applications
programmers.

7.2.1 SWIs
----------

The SWI chunk base is &4be00.

CDFS_LogicalBlockSize (&4be00)
---------------------

On entry:
  R0 = flags (all reserved, = 0)
  R1 -> NUL terminated disc name

On exit:
  R0 = logical block size

This returns the disc's logical block size in bytes. Values permitted by the
ISO9660 standard are 512, 1024 and 2048.

The disc name is canonicalised by CDFS before use, and so a drive number can
be supplied instead.

Errors:
  +2    Drive contains audio disc
  +7    Multiple matches for wild-carded disc name
  +&AC  Drive not known
  +&D3  Drive empty
  +&D4  Disc not found
  +&D7  Cannot recognise data format of disc in drive

CDFS_LocateFile (&4be01)
---------------

On entry:
  R0 = flags (all reserved, = 0)
  R1 -> NUL terminated pathname

On exit:
  R0 = start address (logical block address)
  R1 = size of extended attribute record (logical blocks)
  R2 = file size (bytes)

This returns information from the ISO9660 directory record for the specified
file.

The start address is returned as a logical block address. This can be
converted to a physical sector address and offset using the logical block
size (returned by CDFS_LogicalBlockSize).

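The conversion mentioned above is simple arithmetic, sketched here in
Python. The function name block_to_sector is hypothetical, and the sketch
assumes 2048-byte sectors (Mode 1 or Mode 2 Form 1 user data) with logical
block 0 starting at the beginning of sector 0; a client would substitute
whatever sector origin CDManager expects.

```python
SECTOR_SIZE = 2048  # user data bytes per Mode 1 / Mode 2 Form 1 sector


def block_to_sector(block_address: int, block_size: int) -> tuple[int, int]:
    """Convert a logical block address to (sector, byte offset in sector)."""
    if block_size not in (512, 1024, 2048):
        raise ValueError("logical block size not permitted by ISO9660")
    # Work in bytes from the start of the disc, then split into sectors.
    byte_position = block_address * block_size
    return divmod(byte_position, SECTOR_SIZE)
```

For a disc with 512-byte logical blocks, block 101 lies in sector 25 at byte
offset 512; with 2048-byte blocks the block address and the sector address
coincide and the offset is always zero.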
The file may start with an extended attribute record: if so, its size in
logical blocks is returned in R1.

R2 gives the size of the file as recorded in the directory entry. Note that
this value may not be the same as the number of bytes in the file if the
file includes Mode 2 Form 2 sectors, or is interleaved.

The pathname is canonicalised by CDFS before use.

Errors:
  +2    Drive contains audio disc
  +4    Not a CDFS object
  +5    Object is not a file
  +7    Multiple matches for wild-carded disc name
  +&AC  Drive not known
  +&D3  Drive empty
  +&D4  Disc not found
  +&D6  Object does not exist
  +&D7  Cannot recognise data format of disc in drive

CDFS_ISODirectoryRecord (&4be02)
-----------------------

On entry:
  R0 = flags (all reserved, = 0)
  R1 -> NUL terminated pathname
  R2 -> buffer, or = 0 to determine buffer size
  R3 = buffer length

On exit:
  R0 -> System Use Area
  R1 = size of system use area
  R3 = number of bytes written to buffer, or size of buffer if R2 was zero

If R2 = 0 on entry, the size of buffer required is returned in R3.

If R2 is non-zero on entry, the ISO9660 directory record for the specified
object is copied to the buffer, and its size is returned in R3. R1 is set to
the size of any System Use Area within the directory record, and, if this is
non-zero, R0 points to the copy of it.

The pathname is canonicalised by CDFS before use.

Errors:
  +2    Drive contains audio disc
  +4    Not a CDFS object
  +6    Buffer too small
  +7    Multiple matches for wild-carded disc name
  +&AC  Drive not known
  +&D3  Drive empty
  +&D4  Disc not found
  +&D6  Object does not exist
  +&D7  Cannot recognise data format of disc in drive

CDFS_DiscType (&4be03)
-------------

On entry:
  R0 = flags (all reserved, = 0)
  R1 -> NUL terminated disc name

On exit:
  R0 = disc type

Information is returned about the disc to which the specified file or
directory belongs.
One or more flags are set in R0 as follows:

  bit 0  Multi-session
      1  CD-ROM
      2  CD-I
      3  CD-ROM XA (1)
      4  CD-ROM XA (2)
      5  CD-I BRIDGE
      6  Video CD
      7  Photo CD
      8  Enhanced Music CD
      9  High Sierra

The disc name is canonicalised by CDFS before use, and so a drive number can
be supplied instead.

Errors:
  +2    Drive contains audio disc
  +7    Multiple matches for wild-carded disc name
  +&AC  Drive not known
  +&D3  Drive empty
  +&D4  Disc not found
  +&D7  Cannot recognise data format of disc in drive

CDFS_CurrentDrive (&4be04)
-----------------

On entry:
  R0 = flags (all reserved, = 0)

On exit:
  R0 = current drive number

This returns the number of the current drive (see *Drive below).

Errors: None.

7.2.2 Star commands
-------------------

*CDFS
-----

This command selects CDFS as the current filing system. In order to do this,
it calls OS_FSControl 14.

[This command is mandatory for filing systems with support for
stored/storing files.]

*Bye
----

Close any files open on CDFS.

Unset any special directories (@, \, &, %) that refer to CDFS.

Forget all CDFS disc names and issue Service_DiscDismounted for each one.

*Dismount [[:]<disc name>|<drive number>]
-----------------------------------------

This command dismounts a disc. This is the specified disc (<disc name>) or
the disc in the specified drive (<drive number>) or the disc in the current
drive (no parameter given).

The disc is dismounted as follows:

  Close any files open on the disc.

  Unset any special directories (@, \, &, %) that refer to the disc.

  Forget the disc's name, and issue Service_DiscDismounted for it.

CDFS canonicalises the parameter before use, and so wild cards are permitted
inside a <disc name>.

*Drive <drive number>
---------------------

This command sets the current drive to <drive number>.

*Mount [[:]<disc name>|<drive number>]
--------------------------------------

This command mounts a disc. This is the specified disc (<disc name>) or the
disc in the specified drive (<drive number>) or the disc in the current
drive (no parameter given).

The disc is mounted as follows:

  Set CSD (@) to :<disc name>.$

  Set LIB (%) to the directory :<disc name>.$.Library if it exists;
  otherwise LIB is untouched.

  Unset URD.

Note that the current drive is *not* reset to the one which contains the
mounted disc.

CDFS canonicalises the parameter before use, and so wild cards are permitted
inside a <disc name>.

*Configure CDROMBuffers <n>K
----------------------------

*Configure CDROMBuffers nK configures the directory cache as follows:

  n = 0     0k
  n < 16    16k
  n >= 16   mk, where m = 2^p and 2^p <= n < 2^(p+1)

In other words, values of n >= 16 are rounded down to the nearest power
of 2.

7.3 Errors returned
-------------------

CDFS may return CDManager errors, but in most cases it will return an error
of its own explaining more precisely what the fault is. Equally CDFS may
simply return information to FileSwitch (such as 'Disc not found') which
causes FileSwitch to emit an error.

Note that where errors are similar to the 'standard' FileCore errors listed
on p2-590 of PRM3, the same relative error number has been assigned.

The error base is &12500.

  +0    CDFS does not support this operation
  +1    CDFS is a read-only filing system
  +2    Drive contains audio disc
  +3    CDFS does not support special fields
  +4    Not a CDFS object
  +5    Object is not a file
  +6    Buffer too small
  +7    Multiple matches for wild-carded disc name
  +8    Insufficient memory available
  +9    CDFS cannot read Mode 2 Form 2 sectors
  +&AC  Drive not known
  +&D3  Drive empty
  +&D4  Disc not found
  +&D6  Object does not exist
  +&D7  Cannot recognise data format of disc in drive

All these errors are marked as program errors by OR'ing &1b into the top
byte of the error number (see p5a-493 in PRM3).


8. Standards
------------

The CDFS module provides support for accessing data stored according to the
following standards (8.1 and 8.2). The extensions discussed in 8.3 are not
supported.

8.1 High Sierra
---------------

This was a standard developed by parties interested in a generic CD-ROM
filing system before ISO 9660 existed. ISO 9660 was largely based on this
and thus is very similar.

8.2 ISO 9660
------------

ISO 9660 defines the layout of a generic filing system on a CD-ROM with
allowance for machine-specific information within its structure. It imposes
several constraints on the composition and length of filenames and provides
the format of directories and path tables. It imposes no constraints on the
format of the data stored on a CD-ROM.

8.2.1 Clarification of Support for ISO 9660
-------------------------------------------

This section aims to provide useful information about aspects of ISO9660
which are not or have historically not been supported.

Logical blocks (6.2.2 of ISO9660) of any size are properly supported.

Interleaved mode files (6.4.3 of ISO9660) are not supported. These were not
previously supported and this method of recording files on ISO9660 is
believed to be quite rare.

Some support for extended attribute records (6.4.4.1 of ISO9660) is
provided, as was the case for some of the more recent of the previous
versions of CDFS. CDFS takes note of the existence of extended attribute
records, and skips them where necessary; on the other hand, the data
contained inside is not interpreted, as it has little relevance to the
RISC OS world.

Path tables (6.9 of ISO9660) are not exploited as the directory hierarchy is
sufficient for the purposes of RISC OS.

There is some support for the occurrence of arbitrary characters in disc and
filenames. However, some of these characters are illegal in RISC OS
filenames and these are mapped to other characters. This mapping process may
introduce ambiguities if it maps two different filenames to the same name.

There is no internal limit on the depth of the directory tree; the ISO 9660
limit of 8 levels is ignored.

Version numbering is supported, but the first version, ';1', of each
filename is listed without the suffix. Subsequent versions, ';2', ';3' etc.
do have their suffix listed.

Mode 2 Form 2 data sectors, which are effectively forbidden by ISO9660,
cannot be read by CDFS.
Instead the SWI CDFS_LocateFile can be used to locate the data and the
client application can then fetch the data direct from the CDManager.


9. Data Interchange Formats
---------------------------

Section 8, 'Standards' discusses the filing system formats understood by
CDFS. In addition to those standards, CDFS copes with the following and
their data interchange conventions:-

  The FileSwitch module according to the PRMs.

  The CD Manager module according to the accompanying specification.