CENG334 Introduction to Operating Systems

CENG334 Introduction to Operating Systems Filesystems Case studies Topics: FAT UNIX V7 Erol Sahin Dept of Computer Eng. Middle East Technical University Ankara, TURKEY 1 FAT (MS-DOS) Filesystems MS-DOS, Windows 98, Windows ME

Still supported in Windows NT, XP, Vista Its use has been shifted towards embedded devices such as Digital cameras MP3 playes iPod (the default filesystem) 2 Directories Filenames are limited to 8+3 characters.. Smaller ones are left justified and padded with space. Attributes: read-only, archive, hidden, system Time represented with 2 bytes: correct upto +-2 second

Date: Counts in three fields: day (5 bit), month (4 bits), year (7 bits). Contains: Y2108 problem File size: 2 bytes. Theoretically 4GB limit, but the limit is 2GB due to other reasons. 10 bits reserved for future use. 3 File Allocation Table MS-DOS keeps track of files through FAT table hold in memory. First block number (2 bytes) used as the index to the FAT table which has 64K entries. Block size can be set as multiple of 512

byes. Three versions of FAT depending on the number of bits a disk address contains: FAT-12 FAT-16 FAT-32 (actually should be called as FAT-28) FAT is also used for keeping track of free blocks. 4 FAT-12

Block size: 512 bytes (2^12-10) X 512 bytes =~ 2MB 10 disk addresses are used as special markers. FAT table size: 4096 entries with 2 bytes each. Worked well for floppy disks. The limit was extended using larger block sizes: 1KB, 2KB, 4KB providing support for partitions 16MB. Limited for hard disks. 5

The MS-DOS File System (2) Maximum partition size for different block sizes. The empty boxes represent forbidden combinations. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 6 The UNIX V7 File System (1) A UNIX V7 directory entry. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 7 The UNIX V7 File System (2)

A UNIX i-node. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 8 The UNIX V7 File System (3) The steps in looking up /usr/ast/mbox. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 9 10 11

12 13 14 The ISO 9660 File System The ISO 9660 directory entry. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 15 Rock Ridge Extensions Rock Ridge extension fields:

PX - POSIX attributes. PN - Major and minor device numbers. SL - Symbolic link. NM - Alternative name. CL - Child location. PL - Parent location. RE - Relocation. TF - Time stamps. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639

16 Joliet Extensions Joliet extension fields: Long file names. Unicode character set. Directory nesting deeper than eight levels. Directory names with extensions Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639

17 File System Mounting A file system must be mounted before it can be accessed A unmounted file system is mounted at a mount point 18 Virtual File Systems VFS: another layer of abstraction Upper interface: for processes implementing POSIX interface Lower interface: for concrete file systems

VFS translates the POSIX calls to the calls of the filesystems under it. 19 VFS- 2 At boot time, the root filesystem is registered with VFS. When other filesystems are mounted, they must also register with VFS. When a filesystem registers, it provides the list of addresses of the functions that the VFS demands, such as reading a block. After registration, when one opens a file: open(/usr/include/unistd.h, O_RDONLY) VFS creates a v-node and makes a call to the concrete filesystem to return all the information needed. The created v-node also contains pointers to the table of functions for the concrete filesystem that the file resides.

20 Virtual File Systems (2) A simplified view of the data structures and code used by the VFS and concrete file system to do a read. 21 CENG334 Introduction to Operating Systems Filesystem Corruption and Backups Erol Sahin Dept of Computer Eng.

Middle East Technical University Ankara, TURKEY 22 Filesystem corruption What happens when you are making changes to a filesystem and the system crashes? Example: Modifying block 5 of a large directory, adding lots of new file entries System crashes while the block is being written The new files are lost! System runs fsck program on reboot

Scans through the entire filesystem and locates corrupted inodes and directories Can typically find the bad directory, but may not be able to repair it! The directory could have been left in any state during the write fsck can take a very long time on large filesystems And, no guarantees that it fixes the problems anyway 23 Example: Example: removing a file requires

Remove the file from its directory Release the i-node to the pool of free i-nodes Return all the disk blocks to the pool of free disk blocks In the absence of crashes the order these steps taken do not matter. In the presence of crashes, however, it does! 24 Example: Example: removing a file requires

Remove the file from its directory Release the i-node to the pool of free i-nodes Return all the disk blocks to the pool of free disk blocks The inodes and file blocks will not be accessible from any file yet they will not be available for reassignment. 25 Example: Example: removing a file requires

Remove the file from its directory Release the i-node to the pool of free i-nodes Return all the disk blocks to the pool of free disk blocks The directory node will point to an invalid inode or (if the inode is reassigned) point to a different file. The blocks of the file will not be available for reassignment. 26 Example: Example: removing a file requires Remove the file from its directory

Release the i-node to the pool of free i-nodes Return all the disk blocks to the pool of free disk blocks The file will point to empty blocks, or (after reassignment) it will share the blocks of other files to which these were reassigned.. 27 File system consistency Consistency check is typically done after a crash.. UNIX: fsck Windoze: scandisk Redundant information in the filesystem is used:

Check the blocks Check the files Two tables, each containing a counter initialized to 0 Blocks in use: How many times a block is present in a file Read all the i-nodes using a raw device (not through the filesystem calls) For each block that is referenced in the inode structure, increment the corresponding block use counter by one.

Free blocks: Examine the free block list of free block bitmap structure Each appearance of a free block increments the counter by one 28 File System Consistency File system states. (a) Consistent. (b) Missing block. (c) Duplicate block in free list. (d) Duplicate data block. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 29 Missing block

Harmless but wastes space Action: Add the missing block to the free list. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 30 Duplicate block in free list

Can only occur in linked list representation. Bitmap representation does not have this problem. Action: Rebuild the free list. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 31 Duplicate data block The worst thing that can happen! A data block appears in two different files.. Action:

Allocate a free block Copy the contents into the new block. Change the links such that each copy appears once in each file. For sure, the contents of one of the files is garbled. The filesystem is made to be consistent. The user is informed. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 32 Consistency check for directories

Uses a table of counters per file (rather than per block) Starts from the root and traverses the tree For each i-node, it increments the corresponding counter for that file Remember due to hard links, a file can appear more than once It then checks the link counts stored in the i-nodes to these values. If the link count > counter Even if the file is deleted by from all the directory entries, it will continue to exist.

Solution: correct the link count If the counter > link count Although the file is linked from, say, two directories, removal from one would cause the i-node deleted leaving the other one invalid. Solution: correct the link count 33 Journaling Filesystems Ensure that changes to the filesystem are made atomically

That is, a group of changes are made all together, or not at all In the directory modification example, this means that after the system reboots: The directory either looks exactly as it did before the block was modified Or the directory looks exactly as it did after the block was modified Cannot leave an FS entity (data block, inode, directory, etc.) in an intermediate state! Idea: Maintain a log of all changes to the filesystem Log contains entries that indicate what was done e.g., Directory 2841 had inodes 404, 407, and 408 added to it To make a filesystem change:

1. Write an intent-to-commit record to the log 2. Write the appropriate changes to the log Do not modify the filesystem data directly!!! 3. Write a commit record to the log This is essentially the same as the notion of database transactions 34 Journaling FS Recovery What happens when the system crashes?

Filesystem data has not actually been modified, just the log! So, the FS itself reflects only what happened before the crash Periodically synchronize the log with the filesystem data Called a checkpoint Ensures that the FS data reflects all of the changes in the log No need to scan the entire filesystem after a crash... Only need to look at the log entries since the last checkpoint!

For each log entry, see if the commit record is there If not, consider the changes incomplete, and don't try to make them 35 Journaling FS Example File 1 File 2 Checkpoint Log 36

Journaling FS Example File 1 File 2 Checkpoint Log 37 Journaling FS Example File 1 File 2 Checkpoint

Log Filesystem reflects changes up to last checkpoint Fsck scans changelog from last checkpoint forward Doesn't find a commit record ... changes are simply ignored 38 File System Backups (1) Backups are generally made to handle one of two potential problems: Recover from disaster. Recover from stupidity.

Thrash bins Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 39 Backup what? In order to have storage efficiency, one can choose not to backup: Executable files (since they can usually be restored from manufacturer CD-ROMs Temporary files /tmp

Special files (which correspond to I/O devices) /dev Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 40 Back-up issues Full backup backup all the filesystem providing a snapshot of the system at that point which can be fully restored Typically done weekly or monthly

Incremental backup Backup only the files that were changed after the most recent backup Typically, a weekly backup is followed by daily incremental backups Smaller backup size, and faster Compressed backups Reduced storage Less secure: a single bad byte can screw the whole backup

Backing up and active filesystem is tricky Since during backup, files and directories are being modified Makes the system less secure Each backup tape/disk needs to be as safe as the serve itself.. It doesnt matter if your backup tapes are lying around, even if you have the most secure computer system.. 41 Dumping strategies physical dump Algorithm

Starts at block 0 of the disk and copies all the data onto a tape/disk Pros Simple to implement in a bug-free way Fast Cons Backups the free blocks as well

Backuping a free disk takes as much storage/time as a full disk Dumping of bad blocks is a concern Typically disk controllers provide bad block replacement transparently without the OS even knowing about it If a block goes bad after formatting, then the OS typically creates a file consisting of all the bad blocks. 42 Dumping strategies logical dump Pros Allows to restore a single file. If directories that lie on the path

to the file-to-be-restored were deleted, then they would also be restored. Cons Slow and complicated. 43 Dumping strategies logical dump Algorithm

Full backup: Traverses the filesystem as a tree and creates the same filesystem structure on the backup disk/tape. Partial backup: all the directories on the path to the particular file needs to be saved. For instance, backing up file 9 requires the saving of directories, 1,5,6, and 7. Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number. 44

Dumping strategies incremental logical dump Bitmaps, indexed by i-node number are used. Phase 1: Examine all files and directories below the starting directory (root in this case), and mark the modified files. mark ALL THE DIRECTORIES Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number.

45 Dumping strategies incremental logical dump Phase 2: Recursively walk the tree again, and UNMARK all the directories that do not have any modified files under them. Note: 10 and 11 are unmarked

5 and 6 remain marked Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number. 46 Dumping strategies incremental logical dump Phase 3: Scan the i-nodes in numerical order and Dump all the marked directories

Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number. 47 Dumping strategies incremental logical dump Phase 4: dump the marked files. Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number. 48

Dumping strategies incremental logical dump Restoring: Restore all the directories that were backupped Restore all the files Squares are directories, circles are files. Shaded items have been modified since last dump. Each directory and file is labeled by its i-node number. 49

Issues Links: If a file is linked to more than one directory, only one copy should be saved. Holes: In UNIX, some file, such as core files may contain holes. These files, write a few bytes, and then seek to a distant file offset and write some more bytes.

These empty blocks that are not written should not be dumped and stored. Cores typically have may megabytes of empty blocks. Special files Such as named pipes (which can appear anywhere in the filesystem) should not be dumped. 50 CENG334 Introduction to Operating Systems Filesystem Caching Topics: Disks

Erol Sahin Dept of Computer Eng. Middle East Technical University Ankara, TURKEY 51 File System Caching Most filesystems cache significant amounts of disk in memory e.g., Linux tries to use all free physical memory as a giant cache Avoids huge overhead for going to disk for every I/O Issues:

When do you commit a write to the disk? What happens if you write only to the memory cache and then the system crashes? How do you keep the memory and disk views of a file consistent? What if the file metadata (inodes, etc.) is modified before the data blocks? Read-ahead Read a few extra blocks into memory when you do one read operation

Amortize the cost of the seek Useful if the blocks of a file are laid out in contiguous blocks Take advantage of sequential access patterns on the file 52 Caching Reading a 32-bit word from memory takes 10 nsec. Hard disks can transfer data at: 100MB/sec, that is 40nsec per 32-bit words.. PLUS 5-10 msecs of seek time! Caching aims to fill in the gap..

Often thousands of blocks are kept in cache. 53 Caching (1) Hash the device and the disk address and look up the result in a hash table. All the blocks with the same cache value are chained together through a linked list. In addition to this, a bidirectional link runs through all the blocks implementing a LRU list. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 54 Caching (2)

Some blocks, such as i-node blocks, are rarely referenced two times within a short interval. Consider a modified LRU scheme, taking two factors into account: Is the block likely to be needed again soon? Is the block essential to the consistency of the file system? For both questions, blocks can be divided into categories

i-node blocks Indirect blocks Directory blocks Data blocks Full Partially full Blocks that are essential for filesystem consistency should be written to disk immediately write-through-caches UNIX: synch (every 30 seconds) Windows: In the past: none, recent ones: FlushFileBuffers

Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 55 Block read ahead The system tries to get blocks into the cache, before they are accessed to increase the hit rate. Sequential access: performance improvement Random access: performance degradation 56 Reducing Disk Arm Motion

Figure 4-29. (a) I-nodes placed at the start of the disk. (b) Disk divided into cylinder groups, each with its own blocks and i-nodes. Tanenbaum, Modern Operating Systems 3 e, (c) 2008 Prentice-Hall, Inc. All rights reserved. 0-13-6006639 57

Recently Viewed Presentations

  • Self-supporting frameworks an d agile PebblePad development Slide

    Self-supporting frameworks an d agile PebblePad development Slide

    Challenges Staff and student understanding of process, framework and methodology. Staff. Confident and capable - will flow to the students. Recognise that if staff are not confident and capable this also flows to students
  • Section 3: Modern Atomic Theory

    Section 3: Modern Atomic Theory

    Modern Models of the Atom. In the modern atomic model, electrons can be found only in certain energy levels, not between levels. Electron location is limited to energy levels. Electrons act like waves. The exact location of an electron cannot...
  • Familles de gènes - Université de Montréal

    Familles de gènes - Université de Montréal

    Une variété d'outils de regroupements des gènes en familles: COG, InParanoid, OrthoMCL, Proteinortho. Méthode d'homologie de séquence. Méthodes des « best BLAST hits »: Regroupe les copies le plus « similaires » en fonction des scores de BLAST.
  • The Foundation High School Program (FHSP) + Endorsements

    The Foundation High School Program (FHSP) + Endorsements

    Business & Industry endorsement in the areas listed on this screen requires . a progressive sequence of . 3 levels . in . one. of the bulleted courses of . study, plus one additional English elective. Business & Industry also...
  • Informatics and Mathematical Modeling Latent Causal Modelling of

    Informatics and Mathematical Modeling Latent Causal Modelling of

    1Cognitive Systems, DTU Informatics, Denmark, 2Danish Research Centre for Magnetic Resonance, email: {mm,khm,lkh}@imm.dtu.dk Granger Causality and the Directed Transfer Function (DTF) We are interested in causality based on the common sense notion that causes always precede their effects.
  • INTRODUCTION Chapter A.12 PARENTING PROGRAMS Divna Haslam, Anilena

    INTRODUCTION Chapter A.12 PARENTING PROGRAMS Divna Haslam, Anilena

    Reach Up is quite intensive and is delivered as part of home visits to help parents enhance their child's development. There are several trials showing that Reach Up has been effective in Jamaica and there are versions available in English,...
  • www.aleap4principals.com.au

    www.aleap4principals.com.au

    Presented by Professors Rhonda Craven, Alex Yeung, Janet Mooney, & Dr Anthony Dillon. Institute for Positive Psychology and Education, Australian Catholic University. Enabling Aboriginal Students to Not Just Succeed but to Flourish: What Research Says.
  • Load Rating of Box Culverts - Concrete Pipe

    Load Rating of Box Culverts - Concrete Pipe

    AASHTO Load Rating of Precast Box Culverts and Pipe. Josh Beakley. January, 2016. Load Rating is becoming very prominent in the design of box culverts these days. In fact, we are quickly coming to the point where load rating vehicles...