UNIX File Systems 3

Published on July 2020

 

UNIX File and Directory Caching: How UNIX Optimizes File System Performance and Presents Data to User Processes Using a Virtual File System

 

V-NODE Layer

The v-node in-memory interface to the disk consists of:



File-system-independent functions dealing with:
 – Hierarchical naming;
 – Locking;
 – Quotas;
 – Attribute management and protection.



Object (file) creation and deletion, read and write, changes in space allocation:
 – These functions refer to file-store internals specific to the file system:
    – Physical organization of data on the device;
    – For local data files, the v-node refers to a UNIX-specific structure called the i-node (index node), which has all the information necessary to access the actual data store.
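The split between file-system-independent and file-system-specific functions can be pictured as a table of function pointers hanging off each v-node. The following is only a minimal sketch under that assumption: the names `vnode_ops`, `vnode`, `toy_inode`, `toy_local_read`, and `vn_read` are hypothetical and do not come from any real kernel.

```c
#include <stddef.h>
#include <string.h>

struct vnode;

/* Hypothetical operation table: one instance per file system type. */
struct vnode_ops {
    long (*read)(struct vnode *vn, char *dst, size_t len);
};

/* Hypothetical v-node: a generic part plus file-system-private data. */
struct vnode {
    const struct vnode_ops *ops; /* file-system-specific functions */
    void *fs_private;            /* e.g. an i-node for a local file */
};

/* Toy stand-in for an i-node: the "data store" is a byte array in memory. */
struct toy_inode {
    const char *data;
    size_t size;
};

/* File-system-specific read: knows how to reach the data via the i-node. */
static long toy_local_read(struct vnode *vn, char *dst, size_t len)
{
    struct toy_inode *ip = vn->fs_private;
    size_t n = len < ip->size ? len : ip->size;
    memcpy(dst, ip->data, n);
    return (long)n;
}

static const struct vnode_ops toy_local_ops = { toy_local_read };

/* File-system-independent entry point: dispatches through the v-node. */
long vn_read(struct vnode *vn, char *dst, size_t len)
{
    return vn->ops->read(vn, dst, len);
}
```

A caller only ever uses `vn_read`. Supporting another file system type would mean supplying another operation table, not changing the caller.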

 

Actual File I/O

CPU cannot access the file data directly.



The data must first be brought into main memory.
 – How to do this efficiently?



Read/Write mapping using in-memory system file/directory buffer cache.



Memory-mapped files
 – I-node lists, directories, regular files.



The data is then fed from memory into the data pipeline of the CPU.

 

Virtual INODES (In Memory)

 

Relation between logical and physical data views

 

Program I/O - Virtual to Real

 

File Read/Write Memory Mapping

With read/write mapping, file data is made available to applications via a pre-allocated main memory region: the buffer cache.



The file system transfers data between the buffer cache and the disk at the granularity of disk blocks. The data is explicitly copied between the buffer cache and the address space of the user application (process).





With memory mapping, a file (or a portion thereof) is mapped into a contiguous region of the process's virtual memory.



The mapping operation is very efficient: it only marks a block.



Access to the file is governed by the virtual memory subsystem.



Advantages:
 – Reduced copying
 – No need for a pre-allocated buffer cache in main memory



Disadvantages:
 – Less or no control over the actual disk writing: the file data becomes volatile
 – A mapped area must fit the virtual address space
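As an illustration of memory mapping, the sketch below reads a file through a mapping instead of copying it into a user buffer with read(). It assumes a POSIX system. The helper name `count_newlines_mapped` is made up for this example, while `open`, `fstat`, `mmap`, and `munmap` are the standard POSIX calls.

```c
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: returns the number of newline bytes in the file,
 * or -1 on error. The contents are accessed through a mapping, with no
 * read() call and no user-side copy buffer. */
long count_newlines_mapped(const char *path)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {      /* zero-length mappings are rejected */
        close(fd);
        return 0;
    }

    size_t len = (size_t)st.st_size;
    const char *p = mmap(NULL, len, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);                  /* the mapping stays valid after close */
    if (p == MAP_FAILED)
        return -1;

    long count = 0;
    for (size_t i = 0; i < len; i++)   /* pages are faulted in on access */
        if (p[i] == '\n')
            count++;

    munmap((void *)p, len);
    return count;
}
```

The whole file must fit in the process's virtual address space here, which is the second disadvantage listed above. A larger file would have to be mapped in windows.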

 

Read/Write Mapping

[Figure: files A, B, and C held in the buffer cache in kernel main memory.]

 

Reading data (disk block = 1K)

[Figure: bytes 1324 to 3172 of file C are copied from the buffer cache (kernel) to the location ptr inside buf (user).]

    unsigned char buf[8192];
    unsigned char *ptr = buf + 126;
    fd = open("C", …);
    lseek(fd, 1324, SEEK_SET);   // 1324 = 1024 + 300
    read(fd, ptr, 1848);         // 724 + 1024 + 100 = 1848
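The arithmetic in the comments (1324 = 1024 + 300 and 724 + 1024 + 100 = 1848) follows from cutting the requested byte range at 1K block boundaries. A minimal sketch of that calculation is shown here. The type `piece` and the function `split_into_blocks` are hypothetical names used only for illustration.

```c
#include <stddef.h>

#define BLOCK_SIZE 1024  /* disk block size used on the slide */

/* Hypothetical descriptor for the part of a request inside one block. */
struct piece {
    long block;   /* block number within the file */
    long offset;  /* starting byte inside that block */
    long length;  /* number of bytes taken from that block */
};

/* Splits the byte range [pos, pos + count) into per-block pieces.
 * Returns the number of pieces written to out (at most max). */
size_t split_into_blocks(long pos, long count, struct piece *out, size_t max)
{
    size_t n = 0;
    while (count > 0 && n < max) {
        long off = pos % BLOCK_SIZE;
        long len = BLOCK_SIZE - off;   /* bytes left in this block */
        if (len > count)
            len = count;
        out[n].block = pos / BLOCK_SIZE;
        out[n].offset = off;
        out[n].length = len;
        n++;
        pos += len;
        count -= len;
    }
    return n;
}
```

For the request on the slide (position 1324, 1848 bytes), this yields three pieces: 724 bytes from offset 300 of block 1, all 1024 bytes of block 2, and 100 bytes from the start of block 3.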

 

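Writing data (disk block = 1K)

[Figure: bytes are copied from the location ptr inside buf (user) to the buffer cache (kernel) for bytes 1324 to 3172 of file C; part of the figure is labeled "Unallocated region".]

    unsigned char buf[8192];
    unsigned char *ptr = buf + 126;
    fd = open("C", …);
    lseek(fd, 1324, SEEK_SET);   // 1324 = 1024 + 300
    write(fd, ptr, 1848);        // 724 + 1024 + 100 = 1848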

 

Buffer Cache Management

All disk I/O goes through the buffer cache. Both data and metadata (e.g., i-nodes, directories) are cached using LRU replacement.



A dirty (modified) marker indicates whether write-back is needed for a data block.
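The combination of LRU replacement and a dirty marker can be sketched with a tiny fixed-size cache. This is a simplified illustration, not how any particular UNIX kernel implements its buffer cache. The names `buf`, `cache`, and `cache_get` are hypothetical, a logical clock stands in for the LRU list, and the write-back itself is only counted.

```c
#include <stddef.h>

#define NBUF 3  /* tiny cache so that eviction is easy to observe */

struct buf {
    int valid;              /* slot currently holds a block */
    int dirty;              /* modified since it was brought in */
    long blockno;           /* which disk block is cached here */
    unsigned long last_use; /* logical clock value of the last access */
};

struct cache {
    struct buf slot[NBUF];
    unsigned long clock;    /* incremented on every access */
    long writebacks;        /* dirty blocks that had to be written back */
};

/* Returns the slot caching blockno. On a miss, an empty slot is used if
 * one exists, otherwise the least recently used slot is evicted.
 * Pass for_write = 1 to mark the block dirty. */
struct buf *cache_get(struct cache *c, long blockno, int for_write)
{
    struct buf *victim = &c->slot[0];
    for (size_t i = 0; i < NBUF; i++) {
        struct buf *b = &c->slot[i];
        if (b->valid && b->blockno == blockno) {  /* cache hit */
            b->last_use = ++c->clock;
            b->dirty |= for_write;
            return b;
        }
        /* Prefer an empty slot, otherwise the oldest one seen so far. */
        if (victim->valid && (!b->valid || b->last_use < victim->last_use))
            victim = b;
    }
    if (victim->valid && victim->dirty)
        c->writebacks++;  /* a real cache would write the block to disk here */
    victim->valid = 1;
    victim->dirty = for_write;
    victim->blockno = blockno;
    victim->last_use = ++c->clock;
    return victim;
}
```

A clean block can be dropped without any disk I/O, while evicting a dirty block costs a write. This is why the marker matters.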



Advantages:
 – Hides disk access details from the user program: block size, memory alignment, memory allocation in multiples of the block size, etc.
 – Disk blocks are cached
 – Block aggregation for small transfers (locality)
 – Block re-use across processes
 – Transient data might never be written to disk



Disadvantages:
 – Extra copying: disk -> buffer cache -> user space
 – Vulnerability to failures
 – The file system is less concerned about the user data blocks
 – Control data blocks (metadata) are the real problem:
    – I-nodes, pointer blocks, and directories can be in the cache when a failure occurs
    – As a result, the file system's internal state might be corrupted
    – fsck is then required, resulting in long (re-)boot times

 

File System Reliability and Recovery

File system data consists of file control data (metadata) and user data.



Failures can cause data loss and corruption for cached metadata or user data



A power failure during a sector write may physically corrupt the data stored in the sector.



Loss or corruption of the metadata might lead to a more massive loss of user data.
 – File systems must care about the metadata more than about the user data
 – The operating system cares about the file system data (e.g., metadata)
 – Users must care about their data themselves (e.g., backups)



Caching affects the reliability of the write process.
 – Is it guaranteed that the requested data is indeed written to disk?
 – What if the cached blocks are metadata blocks rather than user data?



Solutions:
 – write-through: writes bypass the cache
 – write-back: dirty blocks are written asynchronously [bracket processes]

 

Data Reliability in UNIX

User data writes are based on a write-back policy:
 – User data is written back to disk periodically
 – Program commands like sync and fsync are used to force the dirty blocks to be written.
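A program that cannot wait for the periodic write-back can force its own dirty blocks out with fsync(). The sketch below assumes a POSIX system. The helper name `write_durably` is made up for this example, and it treats a short write as an error for simplicity.

```c
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical helper: writes msg to path and forces it to disk.
 * Returns 0 on success, -1 on any error. */
int write_durably(const char *path, const char *msg)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    size_t len = strlen(msg);
    /* write() normally returns once the data is in the buffer cache. */
    if (write(fd, msg, len) != (ssize_t)len) {
        close(fd);
        return -1;
    }
    /* fsync() asks the system to transfer this file's dirty blocks to
     * the storage device and waits for that to finish. */
    if (fsync(fd) < 0) {
        close(fd);
        return -1;
    }
    return close(fd);
}
```

The sync command (and the sync() call) schedules all dirty blocks in the system for writing, whereas fsync() targets a single open file.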



Metadata writes are based on a write-through policy. Updates are written to disk immediately, bypassing the cache.
Problem:
 – Some data is not written in place, so the file system can go back to the last consistent version.
 – Some data is replicated, like the UNIX superblock.
 – The file system goes through a consistency check/repair cycle at boot time, as specified in the /etc/fstab options (see the man pages for fsck and fstab).
 – Write-through negatively affects performance.



Solution: maintain a sequential log of metadata updates, a journal; e.g., IBM's Journal File System (JFS) in AIX.

 

Journal File System (JFS)

Metadata operations are logged (journaled):
 – create, link, mkdir, truncate, allocating, write, …
 – each operation may involve several metadata updates (a transaction)



Once an operation is logged, it returns (write-ahead logging).



The disk writes are performed asynchronously. Block aggregation possible.



A cursor (pointer) is maintained. The cursor is advanced once the updated blocks associated with the transaction are written to disk (hardened). Hardened transaction records can be deleted from the journal.



Upon recovery: Re-do all the operations starting from the last cursor position.



Advantages:
 – Asynchronous metadata writes
 – Fast recovery: depends on the journal size and not on the file system size



Disadvantages:
 – Extra write
 – Space wasted by the journal (insignificant)
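The interplay of logging, the cursor, and recovery described on this slide can be modeled in a few lines. This is a toy in-memory model under simplifying assumptions (fixed capacity, one integer per transaction, no wrap-around), not the on-disk format of JFS. The names `record`, `journal`, `journal_log`, `journal_harden_one`, and `journal_recover` are hypothetical.

```c
#include <stddef.h>

#define JOURNAL_CAP 16

/* Hypothetical journal record: one logged metadata transaction. */
struct record {
    int op;  /* illustrative operation code */
};

struct journal {
    struct record rec[JOURNAL_CAP];
    size_t head;    /* next free position: new records are appended here */
    size_t cursor;  /* records before this position are hardened */
};

/* Write-ahead step: log the operation, then return to the caller. */
int journal_log(struct journal *j, int op)
{
    if (j->head == JOURNAL_CAP)
        return -1;  /* journal full; a real journal would wrap or block */
    j->rec[j->head++].op = op;
    return 0;
}

/* Called once the updated blocks of the oldest pending transaction have
 * been written to disk: advance the cursor past that record. */
void journal_harden_one(struct journal *j)
{
    if (j->cursor < j->head)
        j->cursor++;
}

/* Recovery: collect, in log order, every operation from the cursor
 * position onward so the caller can re-do it. Returns how many
 * operations were stored in ops_out (at most max). */
size_t journal_recover(const struct journal *j, int *ops_out, size_t max)
{
    size_t n = 0;
    for (size_t i = j->cursor; i < j->head && n < max; i++)
        ops_out[n++] = j->rec[i].op;
    return n;
}
```

Recovery only walks the records between the cursor and the head, so its cost tracks the journal size rather than the file system size.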
