BinHex

BinHex 4
Filename extension: .hqx
Internet media type: application/mac-binhex40, application/mac-binhex, application/binhex
Uniform Type Identifier (UTI): com.apple.binhex-archive

BinHex, originally short for "binary-to-hexadecimal", is a binary-to-text encoding system which was used on the classic Mac OS for sending binary files over email. BinHexed files take up more space than the original files, but avoid data corruption by software that is not 8-bit clean.

History

TRS-80 BinHex

BinHex was originally written in 1981 by Tim Mann for the TRS-80 computer, as a standalone version of the encoding scheme of the popular terminal emulator ST80-III, for users of other terminals. It was used for sending files via major online services such as CompuServe which, not being 8-bit clean, required files to use ASCII armoring to survive. The system became very popular after Mann uploaded it to CompuServe's TRS-80 files area. [1]

The original scheme converted the binary file contents to hexadecimal numbers, encoding those as ASCII digits and letters (0–9, A–F), and adding a newline after every 60 characters. The system quickly gained the addition of a checksum at the end of every line to check for errors, and a subsequent conversion to use the BASIC/S compiler allowed it to run much faster than the original interpreted version. [1]
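The hexadecimal scheme described above can be illustrated with a minimal Python sketch. It is not Mann's program: the function name hex_armor is hypothetical, and the per-line checksum is left out because its algorithm is not described here. The sketch only shows the two-characters-per-byte expansion and the 60-character line wrapping.

```python
def hex_armor(data: bytes, width: int = 60) -> str:
    """Render bytes as uppercase hex digits, wrapped every `width` characters.

    Illustrative only: the checksum that later versions appended to each
    line is omitted, since its algorithm is not specified in this article.
    """
    digits = data.hex().upper()  # two ASCII characters (0-9, A-F) per input byte
    lines = [digits[i:i + width] for i in range(0, len(digits), width)]
    return "\n".join(lines)


if __name__ == "__main__":
    # 31 zero bytes become 62 hex digits: one full line of 60, then 2 more.
    print(hex_armor(bytes(31)))
```

Every input byte becomes two output characters, so the armored text is at least twice the size of the original file before line breaks are counted.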

BinHex files of the era were typically given the file extension .hex. Ports soon appeared for other popular computers of the era, including the Apple II. When CompuServe later added support for 8-bit transfers, the format fell out of use. [1]

Mac BinHex

When the Macintosh 128K was released in January 1984, the file upload problem still existed on CompuServe. In April, BinHex was ported to the Mac using MS BASIC for Macintosh. [1] The Macintosh File System had introduced the storage of files as a "resource fork" and a "data fork", and the Macintosh port supported encoding only a file's data fork, meaning it could only be used for data files. Several newer versions were published during 1984, resulting in BinHex 3, which could encode both forks.

Yves Lempereur, author of the first assembler for the Mac, MacASM, ported BinHex 3 to assembly language, increasing its speed a hundred-fold, and released it as BinHex 1.0. [2]

Compact BinHex

The simplicity of the original BinHex format made it inefficient: the hexadecimal representation, an 8-to-4 bit encoding, expanded every byte of input into two characters. Lempereur implemented a new 8-to-6 bit encoding, which needs only four characters for every three bytes of input and so cut the encoded size by about a third, and expanded the checksum from 8 to 16 bits, releasing this as BinHex 2.0. [2]

The new encoding used the first 64 ASCII printing characters, including the space, to represent the data, much as uuencoding does. As the smaller files were incompatible with the older format, Lempereur changed the file extension to .hcx, with c meaning compact. The name BinHex did not change, despite the format no longer being a hexadecimal representation. [2]

BinHex 4 and 5

In 1985 Lempereur released BinHex 4.0, skipping 3.0 to avoid confusion with a similarly-numbered version of BASIC. This version performed the following sequence of operations: [3]

The resulting files were roughly the same size as those from BinHex 2, but much more robust, with the metadata information in the header being protected from corruption by no longer being in plain text. The file extension for this new format was .hqx. [2]

At about the same time, most online services had started to support robust 8-bit file transfer protocols such as ZMODEM. This obviated the need for ASCII armoring, but on the Macintosh there was still the need to encode the two forks into one, leading to the development of the MacBinary file format. Lempereur released BinHex 5.0, which only differed by using MacBinary to combine the forks, but it saw little use. [2]

Internet usage

While the Internet was gaining popularity in the 1990s, email was still the primary method of moving files. Relatively few people had full access, and services like FTPmail were the only way many users could download files. Consequently binary files still required encoding, and BinHex 4.0 remained a popular tool for doing so into the late 1990s. BinHexed files can still be found today in archives of classic Mac OS software. [2]

BinHex 4 file format

A BinHex file may begin with any text content, followed by a line which indicates the format of the file, and that binary data is about to begin: (This file must be converted with BinHex 4.0). The text preceding that line is ignored when the file is converted out of BinHex format. [3] [4]

The binary data is encoded to 7-bit ASCII characters, with three bytes of input (24 bits) divided into four 6-bit values, in a similar fashion to Base64 encoding but using a different set of characters. The encoded data has a colon (:) placed before and after it, and is split into lines of a maximum of 64 characters in length. [3]
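The 6-bit step and the colon framing can be sketched in Python as follows. This is a hedged illustration, not a complete BinHex 4.0 encoder. The 64-character table is an assumption taken from common descriptions of the format (for example RFC 1741), since it is not listed in this article. The names HQX_ALPHABET, sixbit_encode and frame are hypothetical, and header construction and any other processing stages are ignored.

```python
# Assumed BinHex 4.0 character table (not given in this article).
HQX_ALPHABET = (
    "!\"#$%&'()*+,-012345689@ABCDEFGHIJKLMNPQRSTUVXYZ[`abcdefhijklmpqr"
)


def sixbit_encode(data: bytes) -> str:
    """Map each group of three bytes (24 bits) to four 6-bit characters."""
    out = []
    for i in range(0, len(data), 3):
        chunk = data[i:i + 3]
        # Zero-pad a short final group so it still forms a 24-bit integer.
        n = int.from_bytes(chunk.ljust(3, b"\0"), "big")
        chars = [HQX_ALPHABET[(n >> shift) & 0x3F] for shift in (18, 12, 6, 0)]
        # A final group of 1 or 2 bytes only needs 2 or 3 characters.
        out.extend(chars[:len(chunk) + 1])
    return "".join(out)


def frame(encoded: str, width: int = 64) -> str:
    """Put a colon before and after the data and wrap to `width` columns."""
    text = ":" + encoded + ":"
    return "\n".join(text[i:i + width] for i in range(0, len(text), width))


if __name__ == "__main__":
    # 0x0F followed by "bi" encodes to "$f*T" under the assumed table.
    print(frame(sixbit_encode(b"\x0fbi")))
```

Under the assumed table, the bytes 0x0F, "b", "i" encode to `$f*T`, which matches the first four data characters of the example file in the next section.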

Example of a BinHex-encoded file

(This file must be converted with BinHex 4.0)
:$f*TEQKPH#jdCA0d,R0TG!"6594%8dP8)3#3"!&m!*!%EMa6593K!!%!!!&mFNa
KG3,r!*!$&[rr$3d,BQPZD'9i,R4PFh3!RQ+!!"AV#J#3!i!!N!@QKUjrU!#3'[q
3"&4&@&483N)f!3#Xaj6bV-H8mJ!!!B3!N!0"!*!$[3#3!cR@iiY)!*!'[I%4!!J
Fp$X%X3@J!mZE6!GRiKUi$HGKMf0U61S46%i1"AB!TI,fLl!d1X3RDDE8ALfTCbM
8UP9p4iUqY-0k4krHpk9XK@`rbj2Ti'U@5rGH@+[fr-i4T6-qXpfl26,k!H5$Nml
TIkI'(l3GI4)f8mII&01CNEbC2LrNLBeaZ1HG@$G8!Z6"k)hh,q9p"r6FC*!!Se"
(ic,Pd(4(b`pflKC`H1&JN5)GVX3mREdH55[l`%`Yhp%q092c`A(hPV)!83Dr&f4
$$L#I1aM-"VjqV-q$34KQq6$M$f8#,Zc,i),!(`*ZN!$K$rS!LA%3cL+dYi"@,K(
Z"`#3!fKi!!!:
