Category | Sans-serif |
---|---|
Classification | Neo-grotesque |
Designer(s) | Adrian Frutiger |
Date created | 1968 |
OCR-B is a monospace font developed in 1968 by Adrian Frutiger for Monotype, following the European Computer Manufacturers Association standard. Its purpose was to facilitate optical character recognition by specific electronic devices, originally for financial and banking uses. It was accepted as the world standard in 1973. [1] It follows the ISO 1073-2:1976 (E) standard, refined in 1979 ("letterpress" design, size I). It includes all ASCII symbols as well as other symbols needed in banking. It is widely used for the human-readable digits in UPC/EAN barcodes. [2] [citation needed] It is also used for machine-readable passports. [3] It shares its purpose with OCR-A, but it is easier for the human eye and brain to read and has a less technical look.
In June 1961, the European Computer Manufacturers Association (ECMA) started standardization activities related to Optical Character Recognition (OCR). After evaluating existing OCR designs, it decided to develop two new fonts: a stylized design with just digits, called “Class A”; and a more conventional type design with broader character coverage, called “Class B”. In February 1965, ECMA proposed a design for the “Class B” font to ISO, which adopted it as international standard ISO 1073-2 in October 1965. [4] The first revision contained three font sizes: I, II and III. The specification included a Letterpress design, intended for high-quality printing equipment, and a rounded-edge Constant Strokewidth design for impact printers with reduced typographic quality. [5] : 3
In September 1969, ECMA started work to revise its published standard. To make OCR-B more widely accepted, the shapes of some characters were slightly modified. The new revision removed font size II, which had been rarely used in practice; it deleted five character shapes; and it added a new font size IV. ECMA published the second edition of OCR-B in October 1971. [4]
In March 1976, ECMA published a third revision of its ECMA-11 specification. It added the symbols § and ¥ to OCR-B, added two types of erasure marks (█) for blacking out misprinted characters, and changed the length of the vertical bar to match ISO 1073-2. [4]
In 1993, Turkey proposed extending ISO 1073-2 to include the Turkish letters Ğğ, İı, and Şş. [6] The request was generalized to extend OCR-B with a number of Latin and Greek letters used in European languages. [7] : 27 A revision of the ISO 1073-2:1976 standard was therefore started, producing three successive draft documents. The final draft would have extended OCR-B with 40 Latin and 10 Greek letters; for six Latin letters, the draft gave new alternate shapes. [7] : 26 A request to extend OCR-B with Vietnamese accents was rejected. [7] : 27 Unlike previous versions of the standard, which specified glyph shapes via reference drawings, the new revision would have included the shapes in machine-readable form. [7] : 26 However, industry support for testing the new font could not be secured at the time, so the revision effort was halted in 1997. [7] : IV The working group described its findings in a technical report. [7] : 1
In June 1998, the European Committee for Standardization published a report on adding the Euro sign to OCR-B. [5] The report proposed both a single-stroked and a double-stroked variant of the Euro sign, leaving the decision to further testing of OCR performance. [5] : 4 Testing was difficult: the theoretical design methods used when the OCR-B glyphs were originally developed could no longer be reproduced, and the technological constraints of the 1960s were no longer entirely relevant in the OCR environments of the 1990s. [8] A new test method was devised, using contemporary OCR technology. The tests found no difference in OCR performance between the two Euro variants, and recommended adopting the double-stroked variant because it matches the conventional glyph shape. [8] The project did not have funds to thoroughly test the glyph extensions of the 1993 proposal; initial results were inconclusive. [8]
Microsoft Office ships a version of Letterpress OCR-B produced by Monotype. It covers Windows-1252. [9] Many vendors, including Adobe, still sell their versions of OCR-A and OCR-B.
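Because that font is stated to cover Windows-1252, one rough way to predict whether a piece of text can be set in it is to test whether the text is encodable in that code page. The following is a minimal sketch, not a check against the font file itself; the function name `covered_by_cp1252` is hypothetical, and it assumes that "covers Windows-1252" means every printable character of the code page has a glyph.

```python
def covered_by_cp1252(text: str) -> bool:
    """Return True if every character of `text` exists in Windows-1252.

    Assumption: a font that "covers Windows-1252" can render exactly the
    printable characters that this codec can encode.
    """
    try:
        text.encode("cp1252")  # Python's built-in Windows-1252 codec
        return True
    except UnicodeEncodeError:
        return False


print(covered_by_cp1252("Straße £5 €"))  # all characters are in Windows-1252
print(covered_by_cp1252("Ğ"))            # Turkish G with breve is not
```

Note that control characters such as line breaks also encode successfully, so the check is only meaningful for printable text.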
The TeX typesetting system has a public domain Constant Strokewidth OCR-B font in METAFONT definition form. It was created by Norbert Schwarz in 1995 and updated in 2010. It has a setting for square stroke ends. [10] The definition has also been translated to METATYPE1, so the rounded version is available in TrueType and OpenType too. [11]
A version of Constant Strokewidth OCR-B by Matthew Anderson has extended character coverage. It is available under CC-BY 4.0. [12]
The MS-DOS OCR-B encoding is code page 877. The grave, acute, circumflex (at 0x9B), tilde, diaeresis, and cedilla can be added over (in the case of the cedilla, under) letters to form accented letters. The alternative Latin small letter m (not in Unicode), the euro sign (€), and the vertical bar (|) were later added to the code page, but it is unknown where they are located.
| | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | A | B | C | D | E | F |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0x | | | | | | | | | | | | | | | | |
1x | [a] | |||||||||||||||
2x | SP | ! | " | # | $ | % | & | ' | ( | ) | * | + | , | - | . | / |
3x | 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | : | ; | < | = | > | ? |
4x | @ | A | B | C | D | E | F | G | H | I | J | K | L | M | N | O |
5x | P | Q | R | S | T | U | V | W | X | Y | Z | [ | \ | ] | ^ | _ |
6x | ` | a | b | c | d | e | f | g | h | i | j | k | l | m | n | o |
7x | p | q | r | s | t | u | v | w | x | y | z | { | [b] | } | ~ | [c] |
8x | ü | ä | å | Ä | Å | |||||||||||
9x | æ | Æ | ö | Ö | Ü | ^ | £ | ¥ | ||||||||
Ax | Ñ | ø | Ø | ˍ (U+02CD) | ||||||||||||
Bx | IJ | ij | ||||||||||||||
Cx | ¤ | |||||||||||||||
Dx | ||||||||||||||||
Ex | ß | ´ | ||||||||||||||
Fx | § | ¸ | ¨ |
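The overstrike convention for the six diacritics has a close analogue in Unicode, where a base letter followed by a combining mark can be normalized into a single precomposed character. The sketch below illustrates that idea only; it works on characters rather than code page 877 byte values, and the mapping table `COMBINING` and the function `compose` are hypothetical names introduced for illustration.

```python
import unicodedata

# Spacing diacritics named for code page 877, mapped to the
# corresponding Unicode combining marks (an illustrative mapping).
COMBINING = {
    "`": "\u0300",       # grave
    "\u00b4": "\u0301",  # acute (´)
    "^": "\u0302",       # circumflex
    "~": "\u0303",       # tilde
    "\u00a8": "\u0308",  # diaeresis (¨)
    "\u00b8": "\u0327",  # cedilla (¸), placed under the letter
}


def compose(letter: str, diacritic: str) -> str:
    """Combine a base letter with a spacing diacritic.

    NFC normalization yields the precomposed character when Unicode
    defines one; otherwise the letter plus combining mark is returned.
    """
    return unicodedata.normalize("NFC", letter + COMBINING[diacritic])


print(compose("e", "\u00b4"))  # é
print(compose("c", "\u00b8"))  # ç
print(compose("o", "^"))       # ô
```

A printer or reader working with the code page itself would overstrike the two glyphs in one character cell; the order in which the letter and the diacritic are transmitted is not stated here, so the sketch takes them as separate arguments.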
Characters not in Unicode: