
Everything is a file

"Everything is a file" is a philosophy of Unix-like operating systems, and understanding it is a great help when programming on Linux.

wikipedia Everything is a file

"Everything is a file" describes one of the defining features of Unix and its derivatives: that a wide range of input/output resources such as documents, directories, hard-drives, modems, keyboards, printers and even some inter-process and network communications are simple streams of bytes exposed through the filesystem name space.

NOTE: The last clause explains what ***"Everything is a file"*** means: all of these resources are treated as files (streams of bytes).

NOTE: The hard-drives, modems, keyboards and so on listed above are all devices, and Unix treats each of them as a file (a stream of bytes). "Everything is a file" can therefore be read as "everything is a file descriptor"; since each descriptor corresponds to one stream, that in turn can be read as "everything is a stream".

The advantage of this approach is that the same set of tools, utilities and APIs can be used on a wide range of resources. There are a number of file types. When a file is opened, a file descriptor is created. The file path serves as the addressing system and the file descriptor as the byte stream I/O interface. But file descriptors are also created for things like anonymous pipes and network sockets via different methods. So it is more accurate to say "Everything is a file descriptor".
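The point about descriptors being created "via different methods" can be illustrated with a small sketch. It assumes a Unix-like system and uses Python's `os` and `socket` modules as thin wrappers over the system calls; the helper names `roundtrip` and `demo` are made up for this illustration. Three descriptors are obtained in three different ways, and the same `os.write`/`os.read` calls work on all of them.

```python
import os
import socket
import tempfile


def roundtrip(write_fd, read_fd, payload):
    """Write payload to one descriptor and read it back from another."""
    os.write(write_fd, payload)
    return os.read(read_fd, len(payload))


def demo():
    payload = b"hello"
    results = {}

    # 1. Regular file: the descriptors come from open() on a path.
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.txt")
        wfd = os.open(path, os.O_WRONLY | os.O_CREAT, 0o600)
        rfd = os.open(path, os.O_RDONLY)
        try:
            results["file"] = roundtrip(wfd, rfd, payload)
        finally:
            os.close(wfd)
            os.close(rfd)

    # 2. Anonymous pipe: no path at all, the descriptors come from pipe().
    rfd, wfd = os.pipe()
    try:
        results["pipe"] = roundtrip(wfd, rfd, payload)
    finally:
        os.close(wfd)
        os.close(rfd)

    # 3. Socket pair: the descriptors come from socketpair().
    a, b = socket.socketpair()
    try:
        results["socket"] = roundtrip(a.fileno(), b.fileno(), payload)
    finally:
        a.close()
        b.close()

    return results


if __name__ == "__main__":
    print(demo())
```

Only the creation step differs between the three cases; once a descriptor exists, the byte-stream interface is the same.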

Additionally, a range of pseudo and virtual filesystems exist which expose information about processes and other system information in a hierarchical file-like structure. These are mounted into the single file hierarchy.

An example of this purely virtual filesystem is under /proc that exposes many system properties as files.
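As a rough illustration, the sketch below reads `/proc/<pid>/status` with ordinary file I/O and parses it into a dictionary. It assumes a Linux system with `/proc` mounted and returns `None` otherwise; the function name `read_proc_status` is invented for this example.

```python
import os


def read_proc_status(pid="self"):
    """Return the fields of /proc/<pid>/status as a dict.

    Returns None when the path does not exist, for example on a system
    that does not mount /proc.
    """
    path = f"/proc/{pid}/status"
    if not os.path.exists(path):
        return None
    fields = {}
    # Plain open() and line iteration: no special API is needed.
    with open(path, "r") as f:
        for line in f:
            key, sep, value = line.partition(":")
            if sep:
                fields[key] = value.strip()
    return fields


if __name__ == "__main__":
    status = read_proc_status()
    if status is None:
        print("/proc is not available on this system")
    else:
        print(status["Name"], status["Pid"])
```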

All of these "files" have standard Unix file attributes such as an owner and access permissions, and can be queried by the same classic Unix tools and filters. However, this is not universally considered a fast or portable approach. Some operating systems do not even mount /proc by default due to security or speed concerns. It is, though, used heavily by both the widely installed BusyBox [5] on embedded systems and by procps, which is used on most Linux systems. In both cases it is used in implementations of process-related POSIX shell commands. It is similarly used on Android systems in the operating system's Toolbox program.

Unix's successor Plan 9 took this concept into distributed computing with the 9P protocol.

superuser Why is “Everything is a file” unique to the Unix operating systems?

A

So, why is this unique to Unix?

Typical operating systems, prior to Unix, treated files one way and treated each peripheral device according to the characteristics of that device. That is, if the output of a program was written to a file on disk, that was the only place the output could go; you could not send it to the printer or the tape drive. Each program had to be aware of each device used for input and output, and have command options to deal with alternate I/O devices.

Unix treats all devices as files, but with special attributes. To simplify programs, standard input and standard output are the default input and output devices of a program (this sentence explains why standard input and standard output exist). So program output normally intended for the console screen could go anywhere, to a disk file or a printer or a serial port. This is called I/O redirection.
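A minimal sketch of I/O redirection, assuming a Unix-like system with a Python interpreter available as `sys.executable`. The child program only ever writes to standard output; the parent decides whether that output lands in a disk file or in a pipe. The names `CHILD`, `run_to_file`, `run_to_pipe` and `demo` are illustrative only.

```python
import os
import subprocess
import sys
import tempfile

# The child never names a destination; it just prints to standard output.
CHILD = "print('report line')"


def run_to_file(path):
    """Run the child with its standard output redirected to a disk file."""
    with open(path, "wb") as out:
        subprocess.run([sys.executable, "-c", CHILD], stdout=out, check=True)
    with open(path, "rb") as f:
        return f.read()


def run_to_pipe():
    """Run the same child with its standard output redirected to a pipe."""
    done = subprocess.run(
        [sys.executable, "-c", CHILD], stdout=subprocess.PIPE, check=True
    )
    return done.stdout


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        from_file = run_to_file(os.path.join(tmp, "out.txt"))
    return from_file, run_to_pipe()


if __name__ == "__main__":
    print(demo())
```

The child's code is identical in both runs, which is the simplification the answer describes: the program does not need an option per output device.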

Do other operating systems such as Windows and Macs not operate on files?

Of course all modern OSes support various filesystems and can "operate on files", but the distinction is how are devices handled? Don't know about Mac, but Windows does offer some I/O redirection.

And, compared to what other operating systems is it unique?

Not really any more. Linux has the same feature. Of course, if an OS adopts I/O redirection, then it tends to use other Unix features and ends up Unix-like in the end.

wikipedia Event loop

The File interface section of this article explains "everything is a file".

wikipedia Device file

Abstracting a device as a file is the clearest embodiment of "everything is a file".
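As a small sketch of what this abstraction buys, the code below reads from and writes to two character devices using exactly the calls one would use on a regular file. It assumes a Unix-like system that provides `/dev/urandom` and `/dev/null`; the function names are invented for the example.

```python
import os


def read_random_bytes(n):
    """Read n bytes from the kernel's random device, like any other file."""
    fd = os.open("/dev/urandom", os.O_RDONLY)
    try:
        return os.read(fd, n)
    finally:
        os.close(fd)


def discard(data):
    """Write data to the null device; returns the number of bytes accepted."""
    fd = os.open("/dev/null", os.O_WRONLY)
    try:
        return os.write(fd, data)
    finally:
        os.close(fd)


if __name__ == "__main__":
    print(read_random_bytes(16).hex())
    print(discard(b"abc"))
```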

Beej's Guide to Network Programming

Chapter 2 of this book, "What is a socket?", discusses "everything is a file".

APUE chapter 16 Network IPC: Sockets

Thoughts from reading APUE chapter 16, Network IPC: Sockets, yesterday:

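Everything in Unix is a file, so I should look at Unix sockets the same way I look at ordinary files. Like a file, a socket is accessed through a **file descriptor**. The POSIX functions that operate on an existing socket (bind, connect, send, recv and so on) take the socket's file descriptor as their first argument, much as the file I/O functions do.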

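The socket() function plays a role similar to open() or creat(): it returns a new file descriptor. In fact, the author of APUE compares the Unix file API with the socket API in section 16.2.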

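If the POSIX file API and socket API were modelled in an object-oriented way, every function that accepts a file descriptor could become a member function, and every object would hold one file descriptor.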
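The object-oriented reading can be sketched as follows. `Descriptor` is a hypothetical wrapper class written for this note, not part of any library: it holds one file descriptor, and the functions that take an fd as their first argument become its methods. The sketch assumes a Unix-like system.

```python
import os
import socket


class Descriptor:
    """Hypothetical wrapper: one object per file descriptor."""

    def __init__(self, fd):
        self.fd = fd

    def read(self, n):
        return os.read(self.fd, n)

    def write(self, data):
        return os.write(self.fd, data)

    def close(self):
        os.close(self.fd)


def demo():
    # Pipe ends wrapped as Descriptor objects.
    r, w = os.pipe()
    pipe_out, pipe_in = Descriptor(r), Descriptor(w)
    pipe_in.write(b"via pipe")
    got_pipe = pipe_out.read(64)
    pipe_in.close()
    pipe_out.close()

    # Socket ends wrapped the same way; detach() hands the raw
    # descriptor over to the wrapper, which then owns it.
    a, b = socket.socketpair()
    sock_a, sock_b = Descriptor(a.detach()), Descriptor(b.detach())
    sock_a.write(b"via socket")
    got_sock = sock_b.read(64)
    sock_a.close()
    sock_b.close()

    return got_pipe, got_sock


if __name__ == "__main__":
    print(demo())
```

The same class serves both kinds of object because the only state it needs is the descriptor itself.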

The "2. What is a socket?" chapter of Beej's Guide to Network Programming also describes sockets from the file descriptor point of view.

Why everything in Unix is a file

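Unix is a typical monolithic kernel, so it encapsulates many things inside the kernel and hands the user only a descriptor; from the user's point of view, that descriptor is a file descriptor. "Everything in Unix is a file" is therefore a simplifying abstraction that makes the system easier for users to understand.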

Whether the kernel is really implemented this way I do not yet know, but from the user's point of view the statement holds.

Everything in Unix is a file, and the file API

Note that "everything in Unix is a file" is a philosophy. It is conceptual: it mostly means that we can treat a resource as a file and perform I/O on it. It does not mean that every Unix file API can be used on everything in Unix.

APUE section 16.2, Socket Descriptors, says something about this point.

The "I/O on pipes and FIFOs" section of the pipe(7) Linux man page also mentions it:

It is not possible to apply lseek(2) to a pipe.

So we can *regard* a pipe (logically) as a file, but it is not actually a regular file, and the lseek system call cannot be used on it.
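The restriction is easy to observe. The sketch below, which assumes a Unix-like system, calls `os.lseek` on a temporary regular file and on the read end of a pipe; the helper names `try_seek` and `demo` are made up for this note.

```python
import errno
import os
import tempfile


def try_seek(fd):
    """Return the new offset, or the errno name if the kernel refuses."""
    try:
        return os.lseek(fd, 0, os.SEEK_SET)
    except OSError as exc:
        return errno.errorcode[exc.errno]


def demo():
    # A regular file supports seeking.
    with tempfile.TemporaryFile() as f:
        file_result = try_seek(f.fileno())

    # A pipe does not: lseek fails with ESPIPE.
    r, w = os.pipe()
    try:
        pipe_result = try_seek(r)
    finally:
        os.close(r)
        os.close(w)

    return file_result, pipe_result


if __name__ == "__main__":
    print(demo())  # (0, 'ESPIPE')
```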

"Everything in Unix is a file" from the kernel implementation point of view

Quoted from Wikipedia, File descriptor:

In Unix-like systems, file descriptors can refer to any Unix file type named in a file system. As well as regular files, this includes directories, block and character devices (also called "special files"), Unix domain sockets, and named pipes. File descriptors can also refer to other objects that do not normally exist in the file system, such as anonymous pipes and network sockets.

NOTE: Looking at "everything in Unix is a file" from the kernel implementation side: a Unix-like system has a monolithic kernel, and the devices and files mentioned above are all maintained by the kernel, each with a corresponding kernel structure. We refer to these kernel structures through file descriptors, and we can operate on them only through system calls.
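One way to see that a descriptor is only a handle to some kernel object is to ask the kernel what kind of object it refers to. The sketch below uses `os.fstat` and the `stat` module for that; it assumes a Unix-like system with `/dev/null`, and the names `kind` and `demo` are invented for the example.

```python
import os
import socket
import stat
import tempfile


def kind(fd):
    """Classify the kernel object behind a descriptor using fstat()."""
    mode = os.fstat(fd).st_mode
    if stat.S_ISREG(mode):
        return "regular file"
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISCHR(mode):
        return "character device"
    if stat.S_ISBLK(mode):
        return "block device"
    if stat.S_ISFIFO(mode):
        return "pipe or FIFO"
    if stat.S_ISSOCK(mode):
        return "socket"
    return "other"


def demo():
    kinds = {}

    with tempfile.TemporaryFile() as f:
        kinds["tempfile"] = kind(f.fileno())

    r, w = os.pipe()
    kinds["pipe"] = kind(r)
    os.close(r)
    os.close(w)

    a, b = socket.socketpair()
    kinds["socketpair"] = kind(a.fileno())
    a.close()
    b.close()

    null_fd = os.open("/dev/null", os.O_RDONLY)
    kinds["/dev/null"] = kind(null_fd)
    os.close(null_fd)

    return kinds


if __name__ == "__main__":
    for name, value in demo().items():
        print(f"{name}: {value}")
```

All four descriptors are plain integers in user space; the differences live entirely in the structures the kernel keeps behind them.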

Material that supports this view includes:

Understanding the Linux Kernel, 3rd Edition, section 1.6.9, Device Drivers, whose content bears on "everything is a file".