Jump to content

File naming

From Wikiversity

There are various ways for software to name files automatically.

This lesson describes the file naming strategies in detail, among respective benefits and shortcomings.

Multiple strategies may be applied to the same file names, as described in § Combination.

Time stamps

[edit | edit source]

File names are automatically generated from the current date and time, usually with descending order until second (→ISO 8601).

Examples
  • 20241012205711 (no separators)
  • 20241012_205711 (underscore separator)
  • 20241012-205711 (dash separator)
  • 2024-10-12 20:57:11 (note: colons in file names are unsupported by many file systems, →§ Characters)
  • 2024-10-12_20-57-11
  • 2024-10-12-20-57-11
  • 2024-10-12-20_57-11
  • 20241012_20h57m11s
  • 20241012-20h57m11s
  • 2024-10-12_20h57m11s
  • 2024-10-12-20h57m11s
  • 2024-10-12_20h-57m-11s
  • 2024-10-12-20h-57m-11s
  • 20241012T205711
  • 2024-10-12-T205711
  • 2024-10-12-T20-57-11
  • 2024-10-12-T20-57-11-UTC

File names may include prefixes such as IMG_20241012_205711.jpg and Screenshot_20241012-205711.png.

If more than one file is created within one second, such as in burst photography, a suffix number may be added before the file extension, such as IMG_20241012_205711_1.jpg or IMG_20241012_205711(1).jpg, or milleseconds, such as IMG_20241012_205711_228.jpg or IMG_20241012_205711-437.jpg.

Mobile phone camera software commonly uses this method.

Benefits

This method safely prevents duplicate file names which may cause confusion and clobbering (overwriting) if handled improperly, such as when using the mv command without the -n flag. Naming conflicts when merging files from multiple devices into one place are also avoided.

In addition, during file transfers between systems, the file attributes containing the date and time information may be dismissed and reset. If the file's internal meta data does not contain date and time information, the file name is the only location where the otherwise lost date and time information is stored. To avoid losing such information, a list of files with dates and times should recommendedly be created before initiating a file transfer.

Searches for files by time are facilitated, because all file searching software is able to search for files by name, while not all software is able to search by file time attributes or internal metadata time stamp usually found in photo and video files.

Time-based selection of files is facilitated using wildcards, as usually supported by command prompt and terminal software, for time-based file operations such as copying all photos and videos from a specific day or month. As an example, the asterisk wildcard DCIM/Camera/202410*.mp4 would match all videos captured in October of 2024, if the camera software uses a 20241012_205711.mp4 file name format.

There is no realistic limit of possible file names at a fixed name length, while on numbered files with a too short fixed length (e.g. four digits) not split between directories, the possible number of file names might foreseeably be exhausted, where a variable digit length (e.g. 9999 followed by 10000) might be presented in an incorrect order in alphanumerical file sorting, if number detection is not implemented into the file management software.

Pasting

[edit | edit source]

Time stamps can be set to be pasted with a keyboard shortcut, implemented through AutoHotKey for Windows and xdotool for Linux.

AutoHotKey (Windows)

For example, this AutoHotKey script pastes the time stamp when pressing Alt+2.

Alt & 2::
Send, %A_Now%
Return

%A_Now does the same as Send,%A_YYYY%%A_MM%%A_DD%%A_HH24%%A_MI%%A_SS%, which is pasting a YYYYMMDDHHMMSS time stamp, such as 20241012205711.

xdotool (Linux)

On Linux, a keyboard shortcut can be assigned to call this example script, which can be modified to suit one's needs. It requires xdotool to be installed.

sleep .2
xdotool type "$(date +%Y%m%d%H%M%S)" # Format can be changed

sleep .2 (synonymous with sleep 0.2) adds a fifth-second delay before pasting the time stamp to prevent pasting prior to the key being released, possibly causing a part of the output to be omitted.

Numbering

[edit | edit source]

This method is commonly used among digital cameras and camcorders due to lack of internet connectivity or GPS reception which would be necessary for the clock to set automatically, and some users not bothering with setting the date and time before starting to operate the device.

When storing on a new (or cleared) memory card, some digital cameras and other software starts over counting (based on the last used file number), while others keep a memory of the last used number, stored in a local variable, the latter of which prevents duplicate file names when moving to a common place.

Examples
  • DSCI0001.JPG
  • IMG_0002.JPG
  • 00003.MTS
  • Screenshot (4).png

Digital cameras may split the files between directories per day or per 999 or 1000 files:

  • DCIM/100_PANA/P1000999.JPG
  • DCIM/101_PANA/P1010001.JPG

This naming scheme is used by at least several Panasonic Lumix digital cameras.

Benefits

Numbered file names facilitate counting files and noticing files that possibly went missing during a transfer.

If the device's clock setting is incorrect as described above, the file names will not carry wrong date and time information. Some devices such as standalone flat bed (paper) scanners which allow direct saving to flash drives and/or memory cards may lack a clock setting entirely, which may make the date and time information appear as 1970-01-01, the Unix epoch.

Numbered file names are usually shorter than timestamped ones. With modern file systems however which are not limited to eight-point-three file names, this benefit is only marginial.

Combination

[edit | edit source]

A combination of both naming strategies may be used, combining the benefits. In practice, this has not been commonly implemented so far.

Examples
  • Screenshot_000001_20241012205711.png
  • IMG00002_20241012205711.jpg
  • VID_0003_20241012205711.mp4
  • 2024-10-12-20_57_11.4.ogg (The leading zeroes for file numbers are not necessary here for correct sorting, due to the time stamp preceeding it, but they may be necessary for a constant file name length, to prevent irritatingly protruding file names in a list.)
  • 2024-10-12-20_57_11.00005.ogg (With leading zeroes.)

Contextual

[edit | edit source]

Contextual file names, often used in addition to a time stamp, may contain information such as location[1] or opened software[2].

The contextual information may be applied prior to or after the file name:

  • 20241012_205711_Avenue_Kléber.jpg – Location name option in the camera software of some Samsung mobile phones since 2013.
  • Screenshot_YouTube_20241012-205711.png – Application title name, before time stamp, as used by Samsung Mobile since Android 8.0 Oreo.[2]
  • Screnshot_2024-10-12-20-57-11_com.mi.android.globalFileexplorer.png – Application package name, after time stamp, as used by Xiaomi Mobile.

Manual

[edit | edit source]

Consider adding short notes to names of multimedia files such as screen shots, photos, videos, voice notes, etc., as they are not searchable like text.

Screenshots' text may be searchable using character recognition software, but scanning each picture for text would be far slower (presumably over 100 times) than directly machine-readable text, and as of 2021, no open-source software for this purpose appears to exist.

File names for downloaded media like videos should include a title and may include the source ID and date stamp and author name to facilitate finding. A format example is "Will YouTube Ever Run Out Of Video IDs - Tom Scott (2016-03-21).gocwRvLhDf8.mp4".

Disk images' file names may include the data storage type (e. g. SSD, HDD, BD-R), the vendor name, vendor-specified size, and the approximate date of purchase. If the image is intended for long-term archival rather than a short-term backup, the current date stamp (e. g. "2024-10-12") should be written as well. An example is SD-SanDisk-64GB-2016.2024-10-12.img").

Characters like a colon (:) and a question mark (?), as well as duplicate file and directory names with only letter case differences (e. g. "New folder" and "New Folder"), should be avoided from to ensure support across file systems and operating systems.

Since file managers typically allow jumping to files in a directory listing by typing in the file name from the beginning, choose the start of the file name by which information you would like to type in. For example, if you wish to jump to a file in the directory listing by typing in a date, start the file name with that. If you wish to jump to a file by a name or location, put that at the beginning of the file name. As of 2022, file managers are not known to have a search feature that only searches the current directory listing without sub directories, but this can be done in the terminal (command prompt) using wildcard characters.

Characters

[edit | edit source]

Use of the following characters should be avoided wherever possible, due to limited support or automatic substitution among file systems and resource locator types:

  • (space character) — Inconvenient in Unix/Linux terminal auto completion (needs backslash escaping if not within quotation marks); substituted with %20 in URL and _ (underscore) in MediaWiki.
  • : (colon) — Possible compatibility issues; reserved for drive letters in Microsoft Windows (e.g. C:\).
  • = (equal sign) — Might be parsed as variable modifier in Bash and MediaWiki.

Mistakes

[edit | edit source]

The following mistakes should be avoided:

Clobbering (overwriting) existing files
This is more likely to happen on applications which use static file names (e.g. for data exports), where an existing file always causes a naming collision, but might happen on deficient implementations where numbered files are written from a local number variable without verifying that the file name is not reserved.

Temporary names

[edit | edit source]

Software may give files a temporary name while it is being processed. For example, the Firefox web browser appends ".part" to the file name, and Chrome ".crdownload". The benefits are that it distinguishes the file, and signifies to the user that the file is incomplete if the download was interrupted. However, a disadvantage is that it interferes with streaming a multimedia file in a locally installed media player software during download.

When saving a file, a .bak file may be created to prevent corruption in case of a failed write. This may be caused by a power outage, operating system crash, or disk space exhaustion.

Additional tips

[edit | edit source]
  • Manual file naming does not have always to be completely consistent where the benefit of sparing efforts needed for maintaining such outweighs, for example for names of text files containing drafted text snippets, where content can be grepped through rapidly when necessary.

References

[edit | edit source]