Archive file: Difference between revisions
Mainframe98 (talk | contribs) m Fix broken link
GreenC bot (talk | contribs) Rescued 1 archive link; reformat 1 link. Wayback Medic 2.5 per WP:URLREQ#www-03.ibm.com
(30 intermediate revisions by 25 users not shown)
Revision as of 20:05, 25 August 2024
In computing, an archive file is a computer file that is composed of one or more files along with metadata. Many archive formats also support compression of member files. Archive files are used to collect multiple data files together into a single file for easier portability and storage, or simply to compress files to use less storage space. Archive files often store directory structures, error detection and correction information, and comments, and some use built-in encryption.[1][2][3]
Applications
Portability
Archive files are particularly useful in that they store file system data and metadata within the contents of a single file, and thus can be stored on systems or sent over channels that support only file contents, not the file system in question. Examples include sending a directory structure over email, carrying files whose names are unsupported on the target file system because of their length or characters, and retaining files' date and time information.[4]
A single archive file may contain multiple member files; this can speed up file transfers and other operations that incur a processing overhead for each file,[5][6] in addition to gains due to compression.
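As an illustration of metadata travelling inside the file contents, the following minimal sketch uses Python's standard `tarfile` module to pack a member with a nested path and a modification time into an in-memory archive, then reads both back. The helper names `pack_members` and `list_members`, the sample path, and the timestamp are assumptions made for this example only.

```python
import io
import tarfile


def pack_members(members):
    """Pack (name, data, mtime) tuples into an uncompressed tar archive in memory."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data, mtime in members:
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            info.mtime = mtime  # the modification time is stored in the member header
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_members(archive_bytes):
    """Return (name, mtime, data) for every regular file stored in the archive."""
    result = []
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r") as archive:
        for info in archive.getmembers():
            if info.isfile():
                result.append((info.name, info.mtime, archive.extractfile(info).read()))
    return result


# Hypothetical sample data: a nested path and a fixed timestamp.
packed = pack_members([("docs/notes/readme.txt", b"hello", 1_000_000_000)])
restored = list_members(packed)
```

The archive itself is an ordinary byte string, so it can be attached to an email or stored on a system that knows nothing about the original directory layout.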
Software distribution
Beyond archival purposes, archive files are frequently used for packaging software for distribution, as software contents are often naturally spread across several files; the archive is then known as a package. While the archival file format is the same, there are additional conventions about contents, such as requiring a manifest file, and the resulting format is known as a package format.[7] Examples include deb for Debian, JAR for Java, APK for Android, and self-extracting Windows Installer executables.
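The relationship between an archive format and a package format can be sketched in a few lines. The example below assumes a JAR-style convention, in which an ordinary ZIP archive carries a manifest at `META-INF/MANIFEST.MF`; the function names `build_package` and `looks_like_package` and the sample file names are hypothetical, and real package tools check far more than the presence of a manifest.

```python
import io
import zipfile

MANIFEST_NAME = "META-INF/MANIFEST.MF"  # manifest path used by the JAR convention


def build_package(files, manifest_text):
    """Write a ZIP archive that also contains a manifest member."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_DEFLATED) as package:
        package.writestr(MANIFEST_NAME, manifest_text)
        for name, data in files.items():
            package.writestr(name, data)
    return buffer.getvalue()


def looks_like_package(archive_bytes):
    """Return True if the ZIP archive contains the expected manifest member."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
        return MANIFEST_NAME in archive.namelist()


# Hypothetical contents for illustration.
package_bytes = build_package(
    {"com/example/Main.class": b"\xca\xfe\xba\xbe"},
    "Manifest-Version: 1.0\n",
)
```

Any ZIP reader can open `package_bytes`; only the added convention about its contents distinguishes it from a plain archive.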
Features
Features supported by various kinds of archives include:
- converting metadata into data stored inside a file (e.g., file name, permissions, etc.)
- checksums to detect errors
- data compression
- file concatenation to store multiple files in a single file
- file patches / updates (when recording changes since a previous archive)
- encryption
- error correction code to fix errors
- splitting a large file into many equal-sized files for storage or transmission
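The last feature in the list, file spanning, can be sketched as plain byte slicing. This is a simplified illustration rather than the scheme of any particular archiver: the names `split_bytes` and `join_parts` are hypothetical, and the final part is shorter whenever the input length is not a multiple of the part size.

```python
def split_bytes(data, part_size):
    """Split data into consecutive parts of part_size bytes (the last may be shorter)."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    return [data[i:i + part_size] for i in range(0, len(data), part_size)]


def join_parts(parts):
    """Reassemble the original data from its parts, in order."""
    return b"".join(parts)


parts = split_bytes(b"0123456789", 4)
rejoined = join_parts(parts)
```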
Some archive programs have self-extraction, self-installation, source volume and medium information, and package notes/description.
The file extension and the file header of an archive file are indicators of the file format used. Computer archive files are created by file archiver software, optical disc authoring software, and disk image software.[8]
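Identifying a format from its header usually means comparing the first bytes of the file against known signatures. The sketch below checks a handful of widely documented signatures; the table and the function name `guess_format` are illustrative assumptions, the table is far from complete, and an empty ZIP archive, for instance, begins with a different signature and is not recognized here.

```python
import io
import zipfile

# (offset, signature bytes, format name); a deliberately short, illustrative table.
SIGNATURES = [
    (0, b"PK\x03\x04", "zip"),
    (0, b"Rar!\x1a\x07", "rar"),
    (0, b"7z\xbc\xaf\x27\x1c", "7z"),
    (0, b"\x1f\x8b", "gzip"),
    (257, b"ustar", "tar"),
]


def guess_format(header):
    """Return a format name based on the leading bytes, or None if nothing matches."""
    for offset, magic, name in SIGNATURES:
        if header[offset:offset + len(magic)] == magic:
            return name
    return None


# Build a small ZIP archive in memory and inspect its first 512 bytes.
sample = io.BytesIO()
with zipfile.ZipFile(sample, "w") as archive:
    archive.writestr("hello.txt", b"hello")
detected = guess_format(sample.getvalue()[:512])
```

Checking the header is generally more reliable than trusting the extension, since a file can be renamed without its contents changing.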
Archive formats
An archive format is the file format of an archive file. Some formats are well-defined by their authors and have become conventions supported by multiple vendors and communities.[9]
Types
- Archiving-only formats store metadata and concatenate files.
- Compression-only formats compress files without bundling them.
- Multi-function formats can store metadata, concatenate, compress, encrypt, create error detection and recovery information, and package the archive into self-extracting and self-expanding files.
- Software packaging formats are used to create software packages that may be self-installing files.
- Disk image formats are used to create disk images of mass storage volumes.
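The difference between the first two types can be seen by combining them. In this sketch, tar stands in for an archiving-only format and gzip for a compression-only format, both through Python's standard library; the function names and the sample data are hypothetical.

```python
import gzip
import io
import tarfile


def archive_only(files):
    """Concatenate files with their names into an uncompressed tar archive."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w") as archive:
        for name, data in files.items():
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            archive.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def compress_only(data):
    """Compress a single byte stream; no knowledge of member files is involved."""
    return gzip.compress(data)


# Hypothetical, highly repetitive sample data.
files = {"a.txt": b"a" * 10_000, "b.txt": b"b" * 10_000}
tar_bytes = archive_only(files)          # larger than the payload: headers and padding
tar_gz_bytes = compress_only(tar_bytes)  # the familiar two-step .tar.gz combination
```

Multi-function formats such as ZIP perform both steps within one format instead of layering two tools.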
Examples
Filename extensions used to distinguish different types of archives include zip, rar, 7z, and tar, the first of which is the most widely implemented.[10]
Java also introduced a whole family of archive extensions such as jar and war (j is for Java and w is for web). They are used to exchange entire bytecode deployments. Sometimes they are also used to exchange source code and other text, HTML, and XML files. By default they are all compressed.[11]
Error detection and recovery
Archive files often include parity checks and other checksums for error detection; for instance, zip files use a cyclic redundancy check (CRC). RAR archives may include additional error correction data (called recovery records).[12]
Archive files that do not natively support recovery records can use separate parchive (PAR) files that allow for additional error correction and recovery of missing files in a multi-file archive.[13]
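The CRC stored in a zip file can be inspected with Python's standard `zipfile` and `zlib` modules. The sketch below writes one uncompressed member, compares the stored CRC-32 with one computed directly from the data, and then flips a single byte to show the mismatch being detected; the member name and payload are arbitrary sample values.

```python
import io
import zipfile
import zlib

payload = b"archive member contents"

# Write one member without compression so its bytes appear verbatim in the archive.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
    archive.writestr("member.bin", payload)
raw = buffer.getvalue()

with zipfile.ZipFile(io.BytesIO(raw)) as archive:
    stored_crc = archive.getinfo("member.bin").CRC  # CRC-32 recorded in the archive
    first_bad_intact = archive.testzip()            # None when every member verifies

computed_crc = zlib.crc32(payload)  # the same CRC-32, computed from the data

# Corrupt one byte of the stored member data and verify again.
damaged = bytearray(raw)
damaged[raw.index(payload)] ^= 0xFF
with zipfile.ZipFile(io.BytesIO(bytes(damaged))) as archive:
    first_bad_damaged = archive.testzip()  # name of the first member failing its check
```

A CRC only detects the damage; repairing it requires additional redundancy such as recovery records.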
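The idea of recovering a missing file from redundant data can be illustrated with a single XOR parity block. This is a deliberately simplified stand-in: PAR2 tools rely on Reed-Solomon codes, which can restore several missing blocks, whereas the sketch below restores at most one block and requires all blocks to have the same length. The function names are hypothetical.

```python
def xor_blocks(blocks):
    """Return the byte-wise XOR of equally sized blocks."""
    if not blocks:
        raise ValueError("at least one block is required")
    length = len(blocks[0])
    if any(len(block) != length for block in blocks):
        raise ValueError("all blocks must have the same length")
    parity = bytearray(length)
    for block in blocks:
        for i, byte in enumerate(block):
            parity[i] ^= byte
    return bytes(parity)


def recover_missing(surviving_blocks, parity):
    """Rebuild the single missing block from the survivors and the parity block."""
    return xor_blocks(list(surviving_blocks) + [parity])


# Three hypothetical volumes of a multi-file archive, plus one parity block.
volumes = [b"AAAA", b"BBBB", b"CCCC"]
parity_block = xor_blocks(volumes)

# Suppose the second volume is lost; the other two and the parity block restore it.
rebuilt = recover_missing([volumes[0], volumes[2]], parity_block)
```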
See also
- File archiver
- Disk image
- Digital container format, a similar concept in media files
References
- ^ "Archive File: What it's Used For". Lifewire. Archived from the original on 2024-07-11. Retrieved 2022-06-17.
- ^ "Archive files". www.ibm.com. 2015-02-07. Archived from the original on 2023-09-07. Retrieved 2022-06-17.
- ^ "What is Archiving And Why is it Important?". Secure Data MGT. 2015-03-23. Archived from the original on 2022-05-24. Retrieved 2022-06-17.
- ^ "Data Portability and Platform Competition | Is User Data Exported From Facebook Actually Useful to Competitors?". Archive.org. p. 22. Retrieved June 17, 2022.
- ^ "Why file transfer speeds of small vs large files could be different". NetApp Knowledge Base. 2020-06-17. Archived from the original on 2022-01-01. Retrieved 2022-06-17.
- ^ "Why Small Files Take Longer to Copy Than Large Files". Dataquest. 2018-10-10. Archived from the original on 2022-07-02. Retrieved 2022-06-17.
- ^ Ashbel, Amit. "Data Archiving: The Basics and 5 Best Practices". cloud.netapp.com. Archived from the original on 2022-01-19. Retrieved 2022-06-17.
- ^ "What Is a File Extension & Why Are They Important?". Lifewire. Archived from the original on 2022-06-03. Retrieved 2022-06-17.
- ^ "What are Archive Files?". www.exefiles.com. Archived from the original on 2022-05-28. Retrieved 2022-06-17.
- ^ "Common file name extensions in Windows". support.microsoft.com. Archived from the original on 2022-05-27. Retrieved 2022-06-17.
- ^ Malefanem, Moses. "Learning Java Network Programming". Archived from the original on 2023-09-07. Retrieved 2022-06-17.
- ^ Drummond, James R. (1997). Parity, Checksums and CRC Checks (PDF) (1st ed.). Toronto. p. 13. Archived (PDF) from the original on 2020-10-31. Retrieved 2022-06-17.
- ^ "What are PAR and PAR2 Files?". Easynews. Archived from the original on 2024-07-11. Retrieved 2022-06-17.
- "Application Note on the .ZIP file format" - official white paper published by PKWARE, Inc.
- Tape Archive (.TAR) file format specification - excerpt from File Format List 2.0 by Max Maischein
- "IBM 726 Magnetic tape reader/recorder" from IBM Archives
- "1401 Data Processing System" from IBM Archives