
Unix file types: Difference between revisions

From Wikipedia, the free encyclopedia
Rescuing 3 sources and tagging 0 as dead.) #IABot (v2.0.9.5
 
(37 intermediate revisions by 12 users not shown)
Line 1: Line 1:
{{Short description|File types in Unix operating systems}}
The standard '''[[Unix]] file types''' are ''regular'', ''directory'', ''symbolic link'', ''FIFO special'', ''block special'', ''character special'', and ''socket'' as defined by [[POSIX]].<ref name=stath>{{cite web |url=http://pubs.opengroup.org/onlinepubs/009604499/basedefs/sys/stat.h.html |title=<sys/stat.h> |website=The Open Group Base Specifications Issue 6| publisher=The Open Group |date=21 July 2019}}</ref> Different OS-specific implementations allow more types than what POSIX requires (e.g. Solaris [[#Door|doors]]). A file's type can be identified by the [[ls|<code>ls -l</code>]] command, which displays the type in the first character of the [[file system permissions]] field.
The seven standard '''Unix file types''' are ''regular'', ''directory'', ''symbolic link'', ''FIFO special'', ''block special'', ''character special'', and ''socket'' as defined by [[POSIX]].<ref name=stath>{{cite web |url=http://pubs.opengroup.org/onlinepubs/009604499/basedefs/sys/stat.h.html |title=<sys/stat.h> |website=The Open Group Base Specifications Issue 6 |publisher=The Open Group |date=21 July 2019 |access-date=10 February 2017 |archive-date=27 November 2016 |archive-url=https://web.archive.org/web/20161127132311/http://pubs.opengroup.org/onlinepubs/009604499/basedefs/sys/stat.h.html |url-status=live }}</ref> Different OS-specific implementations allow more types than what POSIX requires (e.g. Solaris [[Doors (computing)|doors]]). A file's type can be identified by the [[ls|<code>ls -l</code>]] command, which displays the type in the first character of the [[file-system permissions]] field.


For regular files, Unix does not impose or provide any internal file structure; therefore, their structure and interpretation is entirely dependent on the software using them. However, the [[File (command)|<code>file</code>]] command can be used to determine what type of data they contain.
For [[Computer file|regular files]], [[Unix]] does not impose or provide any internal file structure; therefore, their structure and interpretation is entirely dependent on the software using them.<ref>{{cite book |last1=Loukides |first1=Mike |title=Unix Power Tools |date=October 2002 |publisher=O'Reilly |isbn=9780596003302 |page=80 |edition=3 |chapter=When Is a File Not a File? |quote=A file is nothing more than a stream of bytes ...}}</ref> However, the [[File (command)|<code>file</code>]] command can usually be used to determine what [[file format|type of data]] they contain.<ref>{{cite web |url=https://pubs.opengroup.org/onlinepubs/9699919799/utilities/file.html |title=<code>file</code> |publisher=[[The Open Group]] |work=IEEE Std 1003.1-2017 ([[POSIX]]) |date=2018 |access-date=2023-10-26 |archive-date=2018-10-12 |archive-url=https://web.archive.org/web/20181012025917/https://pubs.opengroup.org/onlinepubs/9699919799/utilities/file.html |url-status=live }}</ref>


== Representations ==
== Representations ==
{{see also|File system permissions#Notation of traditional Unix permissions}}
{{see also|File-system permissions#Notation of traditional Unix permissions|chmod}}
=== Numeric ===
=== Numeric ===
In the stat structure, file type and permissions are stored together in a {{code|st_mode}} [[bit field]], which has a size of at least 12 bits (3 bits for seven types of files; 9 bits for permissions). The layout for permissions is defined by POSIX to be at the least-significant 9 bits, but the rest is undefined.<ref name=stath/>
In the [[stat (system call)|stat structure]], file type and permissions (the '''mode''') are stored together in a {{code|st_mode}} [[bit field]], which has a size of at least 12 bits (3 bits to specify the type among the seven possible types of files; 9 bits for permissions). The layout for permissions is defined by POSIX to be at the [[least significant bits|least-significant]] 9 bits, but the rest is undefined.<ref name=stath/>


By convention, the mode is a 16-bit value written out as a six-digit octal number without a leading zero. The format part occupies the lead 4-bits (2 digits), and "10" ({{mono|1000}} in binary) usually stands for a regular file. The mid 3 bits (1 digit) are usually used for [[File_system_permissions#Changing_permission_behavior_with_setuid,_setgid,_and_sticky_bits|setuid, setgid, and sticky]]. The last part is already defined by POSIX to contain the permission. An example is "100644" for a typical file. This format can be seen in [[git]] and [[ar (Unix)|ar]], among other places.<ref>{{cite web |last1=Kitt |first1=Stephen |title=What file mode is a symlink? |url=https://unix.stackexchange.com/a/193468 |website=Unix & Linux Stack Exchange}}</ref>
By convention, the mode is a 16-bit value written out as a six-digit octal number without a leading zero. The format part occupies the lead 4-bits (2 octal digits), and "010" ({{mono|1000}} in binary) usually stands for a regular file. The next 3 bits (1 digit) are usually used for [[File_system_permissions#Changing_permission_behavior_with_setuid,_setgid,_and_sticky_bits|setuid, setgid, and sticky]]. The last part is already defined by POSIX to contain the permission. An example is "100644" for a typical file. This format can be seen in [[git]], [[tar (computing)|tar]], and [[ar (Unix)|ar]], among other places.<ref>{{cite web |last1=Kitt |first1=Stephen |title=What file mode is a symlink? |url=https://unix.stackexchange.com/a/193468 |website=Unix & Linux Stack Exchange}}</ref>


The type of a file can be tested using macros like <code>S_ISDIR</code>. Such a check is usually performed by masking the mode with <code>S_IFMT</code> (often the octal number "170000" for the lead 4 bits convention) and checking whether the result matches <code>S_IFDIR</code>. <code>S_IFMT</code> is not a core POSIX concept, but a X/Open System Interfaces (XSI) extension; systems conforming to ''only'' POSIX may use some other methods.<ref name=stath/>
The type of a file can be tested using macros like <code>S_ISDIR</code>. Such a check is usually performed by masking the mode with <code>S_IFMT</code> (often the octal number "170000" for the lead 4 bits convention) and checking whether the result matches <code>S_IFDIR</code>. <code>S_IFMT</code> is not a core POSIX concept, but a X/Open System Interfaces (XSI) extension; systems conforming to ''only'' POSIX may use some other methods.<ref name=stath/>
Line 17: Line 18:
drwxr-xr-x 2 root root 0 Jan 1 1970 home
drwxr-xr-x 2 root root 0 Jan 1 1970 home


[[POSIX]] specifies<ref>{{cite web |url=http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html |title=IEEE Std 1003.1-2008 ls |publisher=The Open Group |date=11 March 2017}}</ref> the format of the output for the long format (<code>-l</code> option). In particular, the first field (before the first space) is dubbed the "file mode string" and its first character describes the file type. The rest of this string indicates the [[File system permissions#Symbolic notation|file permissions]].
[[POSIX]] specifies<ref name="ls">{{cite web |url=http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html |work=IEEE Std 1003.1-2008 ([[POSIX]]) |title=<code>ls</code> |publisher=The Open Group |date=11 March 2017 |access-date=10 February 2017 |archive-date=3 August 2017 |archive-url=https://web.archive.org/web/20170803094750/http://pubs.opengroup.org/onlinepubs/9699919799/utilities/ls.html |url-status=live }}</ref> the format of the output for the long format (<code>-l</code> option). In particular, the first field (before the first space) is dubbed the "file mode string", here <code>drwxr-xr-x</code>. Its first character describes the file type, here <code>d</code> (directory). The rest of this string indicates the [[File-system permissions#Symbolic notation|file permissions]].

Therefore, in the example, the mode string is <code>drwxr-xr-x</code>: the file type is <code>d</code> (directory) and the permissions are <code>rwxr-xr-x</code>.


=== Examples of implementations ===
=== Examples of implementations ===
Line 26: Line 25:


FreeBSD uses a simpler approach but allows a smaller number of file types.<ref>{{cite web |url=https://github.com/freebsd/freebsd/blob/8e401abc421060406354c05d494b887f91849b6d/bin/ls/print.c#L557 |title=printtype function from FreeBSD |publisher=FreeBSD |date=11 March 2017}}</ref>
FreeBSD uses a simpler approach but allows a smaller number of file types.<ref>{{cite web |url=https://github.com/freebsd/freebsd/blob/8e401abc421060406354c05d494b887f91849b6d/bin/ls/print.c#L557 |title=printtype function from FreeBSD |publisher=FreeBSD |date=11 March 2017}}</ref>

==Regular file==
{{Main|Computer file}}

Regular files show up in <code>ls -l</code> with a [[hyphen-minus]] <code>-</code> in the mode field:

$ ls -l /etc/passwd
'''-'''rw-r--r-- ... /etc/passwd


==Directory==
==Directory==
{{Main|Directory (computing)}}
{{Main|Directory (computing)}}{{needs more sources|section|date=October 2023}}

The most common special file is the directory. The layout of a directory file is defined by the filesystem used. As several filesystems are available under Unix, both native and non-native, there is no one directory file layout.
The most common special file is the directory. The layout of a directory file is defined by the filesystem used. As several filesystems are available under Unix, both native and non-native, there is no one directory file layout.


A directory is marked with a <code>'''d'''</code> as the first letter in the mode field in the output of <code>ls -dl</code> or <code>stat</code>, e.g.
A directory is marked with a <code>'''d'''</code> as the first letter in the mode field in the output of <code>ls -dl</code><ref name="ls" /> or <code>stat</code>, e.g.


$ ls -dl /
$ ls -dl /
Line 53: Line 43:


==Symbolic link==
==Symbolic link==
{{Main|Symbolic link}}
{{Main|Symbolic link}}{{needs more references|section|date=October 2023}}

A symbolic link is a reference to another file. This special file is stored as a textual representation of the referenced file's path (which means the destination may be a relative path, or may not exist at all).
A symbolic link is a reference to another file. This special file is stored as a textual representation of the referenced file's path (which means the destination may be a relative path, or may not exist at all).


A symbolic link is marked with an <code>'''l'''</code> (lower case <code>L</code>) as the first letter of the mode string, e.g.
A symbolic link is marked with an <code>'''l'''</code> (lower case <code>L</code>) as the first letter of the mode string, e.g. in this abbreviated <code>ls -l</code> output:<ref name="ls" />


'''l'''rwxrwxrwx ... termcap -> /usr/share/misc/termcap
'''l'''rwxrwxrwx ... termcap -> /usr/share/misc/termcap
Line 63: Line 52:


==FIFO (named pipe)==
==FIFO (named pipe)==
{{Main|Named pipe}}
{{Main|Named pipe}}{{needs more references|section|date=October 2023}}

One of the strengths of Unix has always been [[inter-process communication]]. Among the facilities provided by the OS are ''pipes'', which connect the output of one [[Process (computing)|process]] to the input of another. This is fine if both processes exist in the same parent process space, started by the same user, but there are circumstances where the communicating processes must use FIFOs, here referred to as ''named pipes''. One such circumstance occurs when the processes must be executed under different user names and permissions.
One of the strengths of Unix has always been [[inter-process communication]]. Among the facilities provided by the OS are ''pipes'', which connect the output of one [[Process (computing)|process]] to the input of another. This is fine if both processes exist in the same parent process space, started by the same user, but there are circumstances where the communicating processes must use FIFOs, here referred to as ''named pipes''. One such circumstance occurs when the processes must be executed under different user names and permissions.


Named pipes are special files that can exist anywhere in the file system. They can be created with the command <code>[[mkfifo]]</code> as in <code>mkfifo mypipe</code>.
Named pipes are special files that can exist anywhere in the file system. They can be created with the command <code>[[mkfifo]]</code> as in <code>mkfifo mypipe</code>.


A named pipe is marked with a <code>'''p'''</code> as the first letter of the mode string, e.g.
A named pipe is marked with a <code>'''p'''</code> as the first letter of the mode string, e.g. in this abbreviated <code>ls -l</code> output:<ref name="ls" />


'''p'''rw-rw---- ... mypipe
'''p'''rw-rw---- ... mypipe


==Socket==
==Socket==
{{Main|Unix domain socket}}
{{Main|Unix domain socket}}{{No references|section|date=October 2023}}

A socket is a special file used for [[inter-process communication]], which enables communication between two processes. In addition to sending data, processes can send [[file descriptor]]s across a Unix domain socket connection using the <code>sendmsg()</code> and <code>recvmsg()</code> system calls.
A socket is a special file used for [[inter-process communication]], which enables communication between two processes. In addition to sending data, processes can send [[file descriptor]]s across a Unix domain socket connection using the <code>sendmsg()</code> and <code>recvmsg()</code> system calls.


Line 85: Line 72:


==Device file (block, character)==
==Device file (block, character)==
{{Main|Device file}}
{{Main|Device file}}{{needs more references|section|date=October 2023}}

In Unix, almost all things are handled as files and have a location in the file system, even hardware devices like hard drives. The great exception is network devices, which do not turn up in the file system but are handled separately.
In Unix, almost all things are handled as files and have a location in the file system, even hardware devices like hard drives. The great exception is network devices, which do not turn up in the file system but are handled separately.


Line 98: Line 84:
Although, for example, [[disk partition]]s may have both character devices that provide un-buffered random access to blocks on the partition and block devices that provide buffered random access to blocks on the partition.
Although, for example, [[disk partition]]s may have both character devices that provide un-buffered random access to blocks on the partition and block devices that provide buffered random access to blocks on the partition.


A character device is marked with a <code>'''c'''</code> as the first letter of the mode string. Likewise, a block device is marked with a <code>'''b'''</code>, e.g.
A character device is marked with a <code>'''c'''</code> as the first letter of the mode string and a block device is marked with a <code>'''b'''</code>, e.g. in this abbreviated <code>ls -l</code> output:<ref name="ls" />


'''c'''rw------- ... [[/dev/null]]
'''c'''rw-rw-rw- ... [[/dev/null]]
'''b'''rw-rw---- ... [[/dev/sda]]
'''b'''rw-rw---- ... [[/dev/sda]]

==Door==
{{Main|Doors (computing)}}

A door is a special file for inter-process communication between a client and server, currently implemented only in [[Solaris (operating system)|Solaris]].

A door is marked with a <code>'''D'''</code> (upper case) as the first letter of the mode string, e.g.

'''D'''r--r--r-- ... name_service_door

==See also==

* [[file (command)]]


==References==
==References==

Latest revision as of 23:47, 24 July 2024

The seven standard Unix file types, as defined by POSIX, are regular, directory, symbolic link, FIFO special, block special, character special, and socket.[1] Some operating systems implement additional types beyond those POSIX requires (e.g. Solaris doors). A file's type can be identified with the ls -l command, which displays the type as the first character of the file-system permissions field.

For regular files, Unix does not impose or provide any internal file structure; therefore, their structure and interpretation are entirely dependent on the software using them.[2] However, the file command can usually be used to determine what type of data they contain.[3]

Representations


Numeric


In the stat structure, file type and permissions (the mode) are stored together in an st_mode bit field, which has a size of at least 12 bits (3 bits to specify the type among the seven possible types of files; 9 bits for permissions). POSIX defines the permissions to occupy the least-significant 9 bits but leaves the rest of the layout undefined.[1]

By convention, the mode is a 16-bit value written out as a six-digit octal number without a leading zero. The format part occupies the leading 4 bits (2 octal digits), and octal "10" (1000 in binary) usually stands for a regular file. The next 3 bits (1 digit) are usually used for setuid, setgid, and sticky. The last 9 bits (3 digits) hold the permissions, as already defined by POSIX. An example is "100644" for a typical regular file. This format can be seen in git, tar, and ar, among other places.[4]
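
The conventional layout described above can be illustrated with a short Python sketch. The function name split_mode is hypothetical, and the bit positions assume the common 16-bit convention rather than anything POSIX guarantees for the upper bits.

```python
def split_mode(mode):
    """Split a conventional 16-bit mode into its three parts."""
    file_type = (mode >> 12) & 0o17   # leading 4 bits: the format part
    special = (mode >> 9) & 0o7       # next 3 bits: setuid, setgid, sticky
    permissions = mode & 0o777        # last 9 bits: permissions (POSIX-defined)
    return file_type, special, permissions


file_type, special, permissions = split_mode(0o100644)
print(f"{file_type:02o} {special:o} {permissions:03o}")  # prints "10 0 644"
```

Shifting and masking keeps each part as a plain integer, so the parts can be printed in octal to match the six-digit notation.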

The type of a file can be tested using macros like S_ISDIR. Such a check is usually performed by masking the mode with S_IFMT (often the octal number "170000" under the leading-4-bits convention) and checking whether the result matches S_IFDIR. S_IFMT is not a core POSIX concept but an X/Open System Interfaces (XSI) extension; systems conforming only to POSIX may use other methods.[1]
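
As a rough illustration of the masking check, the sketch below uses Python's standard stat module, which mirrors the C names (in Python, S_IFMT is a function that applies the mask). The helper name is_directory and the two sample mode values are assumptions chosen for the example.

```python
import stat


def is_directory(mode):
    """Mask off the file-type bits and compare them with S_IFDIR."""
    return stat.S_IFMT(mode) == stat.S_IFDIR


dir_mode = 0o040755    # sample mode for a directory
file_mode = 0o100644   # sample mode for a regular file

print(is_directory(dir_mode), is_directory(file_mode))
print(stat.S_ISDIR(dir_mode))  # the ready-made test gives the same answer
```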

Mode string


Take for example one line in the ls -l output:

drwxr-xr-x 2 root root     0 Jan  1  1970 home

POSIX specifies[5] the format of the output for the long format (-l option). In particular, the first field (before the first space) is dubbed the "file mode string", here drwxr-xr-x. Its first character describes the file type, here d (directory). The rest of this string indicates the file permissions.
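
For a quick experiment, Python's standard library offers stat.filemode(), which produces a mode string in the same style from a numeric mode. The sketch below splits that string into its type character and permissions part; the helper name split_mode_string is hypothetical.

```python
import stat


def split_mode_string(mode):
    """Return (type character, permission string) for a numeric mode."""
    mode_string = stat.filemode(mode)
    return mode_string[0], mode_string[1:]


print(stat.filemode(0o040755))      # prints "drwxr-xr-x"
print(split_mode_string(0o040755))  # prints "('d', 'rwxr-xr-x')"
```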

Examples of implementations


The GNU coreutils version of ls uses a call to filemode(), a glibc function (exposed in the gnulib library[6]) to get the mode string.

FreeBSD uses a simpler approach but allows a smaller number of file types.[7]
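
The following is a minimal sketch of how such a routine might be written; it is not the GNU or FreeBSD code. The table TYPE_CHARS and the function mode_string are hypothetical names, only the seven POSIX types are covered, and the setuid, setgid, and sticky bits are deliberately ignored to keep the example short.

```python
import stat

# Type character for each of the seven POSIX file types.
TYPE_CHARS = {
    stat.S_IFREG: "-",
    stat.S_IFDIR: "d",
    stat.S_IFLNK: "l",
    stat.S_IFIFO: "p",
    stat.S_IFSOCK: "s",
    stat.S_IFCHR: "c",
    stat.S_IFBLK: "b",
}


def mode_string(mode):
    """Build a simplified mode string; unknown types are shown as '?'."""
    chars = [TYPE_CHARS.get(stat.S_IFMT(mode), "?")]
    for shift in (6, 3, 0):  # owner, group, others
        bits = (mode >> shift) & 0o7
        chars.append("r" if bits & 4 else "-")
        chars.append("w" if bits & 2 else "-")
        chars.append("x" if bits & 1 else "-")
    return "".join(chars)


print(mode_string(0o040755))  # prints "drwxr-xr-x"
```

A lookup table keeps the type mapping in one place, so an OS-specific type could be supported by adding a single entry.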

Directory


The most common special file is the directory. The layout of a directory file is defined by the filesystem used. As several filesystems are available under Unix, both native and non-native, there is no one directory file layout.

A directory is marked with a d as the first letter in the mode field in the output of ls -dl[5] or stat, e.g.

$ ls -dl /
drwxr-xr-x 26 root root 4096 Sep 22 09:29 /

$ stat /
  File: "/"
  Size: 4096            Blocks: 8          IO Block: 4096   directory
Device: 802h/2050d      Inode: 128         Links: 26
Access: (0755/drwxr-xr-x)  Uid: (    0/    root)   Gid: (    0/    root)
...

Symbolic link

A symbolic link is a reference to another file. This special file is stored as a textual representation of the referenced file's path (which means the destination may be a relative path, or may not exist at all).

A symbolic link is marked with an l (lower case L) as the first letter of the mode string, e.g. in this abbreviated ls -l output:[5]

lrwxrwxrwx ... termcap -> /usr/share/misc/termcap
lrwxrwxrwx ... S03xinetd -> ../init.d/xinetd
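
The point that a link stores only a path, which may be relative or may not exist, can be checked with a small Python sketch. It assumes a Unix-like system; the function name inspect_dangling_link and the target path no/such/target are made up for the example.

```python
import os
import stat
import tempfile


def inspect_dangling_link():
    """Create a link to a nonexistent relative path and examine it."""
    with tempfile.TemporaryDirectory() as tmp:
        link = os.path.join(tmp, "dangling")
        os.symlink("no/such/target", link)        # target need not exist
        is_link = stat.S_ISLNK(os.lstat(link).st_mode)  # lstat: do not follow
        stored_path = os.readlink(link)           # the text stored in the link
        target_exists = os.path.exists(link)      # exists() follows the link
        return is_link, stored_path, target_exists


print(inspect_dangling_link())
```

Note the use of os.lstat() rather than os.stat(): the latter follows the link and would fail here because the destination is missing.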

FIFO (named pipe)


One of the strengths of Unix has always been inter-process communication. Among the facilities provided by the OS are pipes, which connect the output of one process to the input of another. This is fine if both processes exist in the same parent process space, started by the same user, but there are circumstances where the communicating processes must use FIFOs, here referred to as named pipes. One such circumstance occurs when the processes must be executed under different user names and permissions.

Named pipes are special files that can exist anywhere in the file system. They can be created with the command mkfifo as in mkfifo mypipe.

A named pipe is marked with a p as the first letter of the mode string, e.g. in this abbreviated ls -l output:[5]

prw-rw---- ... mypipe
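
A named pipe can also be created and inspected programmatically. The sketch below assumes a Unix-like system and uses Python's os.mkfifo(); for brevity a thread stands in for the second process, and the function name fifo_round_trip is hypothetical.

```python
import os
import stat
import tempfile
import threading


def fifo_round_trip(message):
    """Create a FIFO, send bytes through it, and report what arrived."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "mypipe")
        os.mkfifo(path, 0o660)
        is_fifo = stat.S_ISFIFO(os.stat(path).st_mode)

        def writer():
            # Opening for writing blocks until a reader opens the FIFO.
            with open(path, "wb") as pipe_out:
                pipe_out.write(message)

        thread = threading.Thread(target=writer)
        thread.start()
        with open(path, "rb") as pipe_in:   # blocks until the writer opens
            received = pipe_in.read()       # reads until the writer closes
        thread.join()
        return is_fifo, received


print(fifo_round_trip(b"hello"))
```

Because the pipe has a name in the file system, the writer could equally be an unrelated process running under a different user, subject to the file's permissions.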

Socket


A socket is a special file used for inter-process communication, which enables communication between two processes. In addition to sending data, processes can send file descriptors across a Unix domain socket connection using the sendmsg() and recvmsg() system calls.

Unlike named pipes, which allow only unidirectional data flow, sockets are full-duplex.

A socket is marked with an s as the first letter of the mode string, e.g.

srwxrwxrwx /tmp/.X11-unix/X0
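
Both properties, the socket file type and two-way data flow, can be observed with a brief Python sketch on a Unix-like system. The function names and the file name demo.sock are assumptions for illustration; passing file descriptors with sendmsg() is not shown.

```python
import os
import socket
import stat
import tempfile


def socket_file_type():
    """Bind a Unix domain socket to a path and inspect the resulting file."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.sock")
        with socket.socket(socket.AF_UNIX, socket.SOCK_STREAM) as server:
            server.bind(path)               # creates the socket file
            mode = os.stat(path).st_mode
            return stat.S_ISSOCK(mode), stat.filemode(mode)[0]


def duplex_exchange():
    """Send data in both directions over one connected socket pair."""
    left, right = socket.socketpair()
    with left, right:
        left.sendall(b"ping")
        to_right = right.recv(4)
        right.sendall(b"pong")
        to_left = left.recv(4)
    return to_right, to_left


print(socket_file_type())
print(duplex_exchange())
```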

Device file (block, character)


In Unix, almost all things are handled as files and have a location in the file system, even hardware devices like hard drives. The notable exception is network devices, which do not appear in the file system but are handled separately.

Device files are used to apply access rights to the devices and to direct operations on the files to the appropriate device drivers.

Unix makes a distinction between character devices and block devices. The distinction is roughly as follows:

  • Character devices provide only a serial stream of input or accept a serial stream of output
  • Block devices are randomly accessible

The distinction is not absolute: for example, disk partitions may have both character devices that provide unbuffered random access to blocks on the partition and block devices that provide buffered random access to the same blocks.

A character device is marked with a c as the first letter of the mode string and a block device is marked with a b, e.g. in this abbreviated ls -l output:[5]

crw-rw-rw- ... /dev/null
brw-rw---- ... /dev/sda
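
The same classification can be made from a program. This Python sketch assumes a Unix-like system where /dev/null exists; the function name device_kind is hypothetical, and block devices such as /dev/sda are not probed because their presence varies between machines.

```python
import os
import stat


def device_kind(path):
    """Classify a path as a character device, a block device, or neither."""
    mode = os.stat(path).st_mode
    if stat.S_ISCHR(mode):
        return "character"
    if stat.S_ISBLK(mode):
        return "block"
    return "not a device"


print(device_kind("/dev/null"))
print(device_kind("/"))
```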

References

  1. ^ a b c "<sys/stat.h>". The Open Group Base Specifications Issue 6. The Open Group. 21 July 2019. Archived from the original on 27 November 2016. Retrieved 10 February 2017.
  2. ^ Loukides, Mike (October 2002). "When Is a File Not a File?". Unix Power Tools (3 ed.). O'Reilly. p. 80. ISBN 9780596003302. A file is nothing more than a stream of bytes ...
  3. ^ "file". IEEE Std 1003.1-2017 (POSIX). The Open Group. 2018. Archived from the original on 2018-10-12. Retrieved 2023-10-26.
  4. ^ Kitt, Stephen. "What file mode is a symlink?". Unix & Linux Stack Exchange.
  5. ^ a b c d e "ls". IEEE Std 1003.1-2008 (POSIX). The Open Group. 11 March 2017. Archived from the original on 3 August 2017. Retrieved 10 February 2017.
  6. ^ "filemode function in GNU coreutils". GNU. 11 March 2017.
  7. ^ "printtype function from FreeBSD". FreeBSD. 11 March 2017.