Paper Review Series : Google File System

Google File System(GFS) a scalable distributed data-intensive file system designed and implemented by Google. While sharing features of traditional file systems, it is designed with new issues/concerns taken into considerations, such as frequent component failures, files of extra large sizes, append-only updates, coupling of file system and applications, throughput over latency, efficient multi-client appending, etc. It discussed the design overview, how system works, metadata, and garbage collection. It also assessed the high availability, fault tolerance and diagnosis. Finally it showed the performance benchmark and analysis. The system consists of a single master, multiple chunk servers and clients. Files are divided into chunks and are stored on chunk servers. Master server stores all file system metadata including the namespace and the current locations of chunks. It also manages system activities like garbage collection, chunk lease and chunk migration. Clients are interface between the system and applications. The topology of the system is simple and clear. It can be deployed on commodity machines and can achieve linear scalability by adding more nodes. Reliability and high availability is fulfilled by replica and shadow mechanism. Every chunk is replicated onto multiple chunk servers. Both master and chunk server are designed to be able to restart and recover fast. Master states are also replicated onto “shadow” masters. If the master fails, shadow masters provide read-only access to the file system. For data integrity, chunk server uses checksumming. If data corruption detected, chunk server will inform master and requests are directed to other replicas. The master creates a new uncorrupted replica and delete the corrupted replica.

Written on October 1, 2018