OCR-B

Category	Sans-serif
Classification	Neo-grotesque
Designer(s)	Adrian Frutiger
Date created	1968

Last updated July 04, 2024

OCR-B is a monospace font developed in 1968 by Adrian Frutiger for Monotype, following the European Computer Manufacturers Association standard. It was designed to facilitate optical character recognition by specific electronic devices, originally for financial and banking uses. It was accepted as the world standard in 1973.^[1] It follows the ISO 1073-2:1976 (E) standard, refined in 1979 ("letterpress" design, size I). It includes all ASCII symbols as well as other symbols needed in banking. It is widely used for the human-readable digits in UPC/EAN barcodes.^[2]^[citation needed] It is also used for machine-readable passports.^[3] It shares its purpose with OCR-A, but it is easier for the human eye and brain to read and has a less technical look.

History

In June 1961, the European Computer Manufacturers Association (ECMA) started standardization activities related to Optical Character Recognition (OCR). After evaluating existing OCR designs, it was decided to develop two new fonts: a stylized design with just digits, called “Class A”; and a more conventional type design with broader character coverage, called “Class B”. In February 1965, ECMA proposed a design for the “Class B” font to ISO, which adopted it as international standard ISO 1073-2 in October 1965.^[4] The first revision contained three font sizes: I, II and III. The specification included a Letterpress design, intended for high-quality printing equipment, and a rounded-edge Constant Strokewidth design for impact printers with reduced typographic quality.^[5]^: 3

In September 1969, ECMA started work to revise its published standard. To make OCR-B more widely accepted, the shapes of some characters were slightly modified. The new revision removed font size II, which had been rarely used in practice; it deleted five character shapes; and it added a new font size IV. ECMA published the second edition of OCR-B in October 1971.^[4]

In March 1976, ECMA published a third revision of its ECMA-11 specification. It added the symbols § and ¥ to OCR-B; two types of erasure marks (█) for blackening out mis-printed characters were added; and the length of the Vertical bar was changed to match ISO 1073-2.^[4]

In 1993, Turkey proposed extending ISO 1073-2 to include the Turkish letters Ğğ, İı, and Şş.^[6] The request was generalized to extend OCR-B with a number of Latin and Greek letters used in European languages.^[7]^: 27 A revision of the ISO 1073-2:1976 standard was therefore started, producing three successive draft documents. The final draft would have extended OCR-B with 40 Latin and 10 Greek letters; for six Latin letters, the draft gave new alternate shapes.^[7]^: 26 A request to extend OCR-B with Vietnamese accents was rejected.^[7]^: 27 Unlike previous versions of the standard, which specified glyph shapes via reference drawings, the new revision would have included the shapes in machine-readable form.^[7]^: 26 However, industry support for testing the new font could not be secured at the time, so the revision effort was halted in 1997.^[7]^: IV The working group described its findings in a technical report.^[7]^: 1

[Figure: Two proposed variants for the OCR-B Euro sign]

In June 1998, the European Committee for Standardization published a report on adding the Euro sign to OCR-B.^[5] The report proposed both a single-stroked and a double-stroked variant of the Euro sign, leaving the decision to further testing of OCR performance.^[5]^: 4 Testing was difficult: the theoretical design methods used when the OCR-B glyphs were originally developed could no longer be reproduced, and the technological constraints of the 1960s were no longer entirely relevant in the OCR environments of the 1990s.^[8] A new test method was devised, using the OCR technology of the time. The tests found no difference in OCR performance between the two Euro variants, and recommended adopting the double-stroked variant because it matches the conventional glyph shape.^[8] The project did not have the funds to thoroughly test the glyph extensions of the 1993 proposal; initial results were inconclusive.^[8]

Availability

Microsoft Office ships a version of Letterpress OCR-B produced by Monotype. It covers Windows-1252.^[9] Many vendors, including Adobe, still sell their versions of OCR-A and OCR-B.

The TeX typesetting system has a public domain Constant Strokewidth OCR-B font in METAFONT definition form. It was created by Norbert Schwarz in 1995 and updated in 2010. It has a setting for square stroke ends.^[10] The definition has also been translated to METATYPE1, so the rounded version is available in TrueType and OpenType too.^[11]

A version of Constant Strokewidth OCR-B by Matthew Anderson has extended character coverage. It is available under CC-BY 4.0.^[12]

MS-DOS OCR-B encoding

The MS-DOS OCR-B encoding is code page 877. The grave, acute, circumflex (at 0x9B), tilde, diaeresis, and cedilla can be added over (in the case of the cedilla, under) letters to form accented letters. The alternative Latin small letter m (not in Unicode), the euro sign (€), and the vertical bar (|) were later added to the code page, but it is unknown where they are located.

MS-DOS OCR-B^[13]
	0	1	2	3	4	5	6	7	8	9	A	B	C	D	E	F
0x
1x									^[a]
2x	SP	!	"	#	$	%	&	'	(	)	*	+	,	-	.	/
3x	0	1	2	3	4	5	6	7	8	9	:	;	<	=	>	?
4x	@	A	B	C	D	E	F	G	H	I	J	K	L	M	N	O
5x	P	Q	R	S	T	U	V	W	X	Y	Z	[	\	]	^	_
6x	`	a	b	c	d	e	f	g	h	i	j	k	l	m	n	o
7x	p	q	r	s	t	u	v	w	x	y	z	{	^[b]	}	~	^[c]
8x		ü			ä		å								Ä	Å
9x		æ	Æ		ö					Ö	Ü	^	£	¥
Ax						Ñ	ø	Ø		ˍ 02CD
Bx										Ĳ	ĳ
Cx																¤
Dx
Ex		ß														´
Fx						§		¸		¨

Characters not in Unicode:

^a Group erase (0x18)
^b Alternative vertical bar (0x7C)
^c Character erase (0x7F)
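
The table above can be read as a byte-to-character mapping. The following is a minimal Python sketch of such a decoder, not an official codec: the function name `decode_cp877` and the dictionary `CP877_HIGH` are hypothetical, the byte positions are read off the table as printed here, and bytes that are undefined or have no Unicode equivalent (0x18, 0x7C, 0x7F) are assumed to map to the replacement character U+FFFD.

```python
# Sketch of a decoder for the MS-DOS OCR-B encoding (code page 877).
# Positions are taken from the table above; this is an illustration,
# not a verified or complete implementation of the code page.

REPLACEMENT = "\ufffd"  # assumption: used for anything without a Unicode mapping

# Bytes in the ASCII range whose OCR-B characters are not in Unicode:
# group erase, alternative vertical bar, character erase.
NOT_IN_UNICODE = {0x18, 0x7C, 0x7F}

# Upper half of the code page, as listed in the table.
CP877_HIGH = {
    0x81: "ü", 0x84: "ä", 0x86: "å", 0x8E: "Ä", 0x8F: "Å",
    0x91: "æ", 0x92: "Æ", 0x94: "ö", 0x99: "Ö", 0x9A: "Ü",
    0x9B: "^", 0x9C: "£", 0x9D: "¥",
    0xA5: "Ñ", 0xA6: "ø", 0xA7: "Ø", 0xA9: "\u02cd",
    0xB9: "Ĳ", 0xBA: "ĳ",
    0xCF: "¤",
    0xE1: "ß", 0xEF: "´",
    0xF5: "§", 0xF7: "¸", 0xF9: "¨",
}


def decode_cp877(data: bytes) -> str:
    """Decode bytes using the partial code page 877 mapping above."""
    out = []
    for b in data:
        if b in NOT_IN_UNICODE:
            out.append(REPLACEMENT)
        elif 0x20 <= b <= 0x7E:
            # Rows 2x to 7x match ASCII (apart from the exceptions above).
            out.append(chr(b))
        else:
            # Defined upper-half bytes, else the replacement character.
            out.append(CP877_HIGH.get(b, REPLACEMENT))
    return "".join(out)


if __name__ == "__main__":
    print(decode_cp877(bytes([0x47, 0x72, 0x94, 0xE1, 0x65])))
```

The sketch does not attempt to compose the diacritic characters with base letters, since the source does not describe how that combination is performed.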

References

[1] Frutiger, Adrian. Type. Sign. Symbol. ABC Verlag, Zurich, 1980. p. 50.
[2] "GS1 Human Readable Interpretation (HRI) Implementation Guideline" (PDF). GS1 AISBL. 2018. p. 13. Retrieved 2018-09-27.
[3] Doc 9303: Machine Readable Travel Documents, Part 3: Specifications Common to all MRTDs (PDF) (Eighth ed.). International Civil Aviation Organization. 2015. p. 25. ISBN 978-92-9249-792-7. Retrieved 2016-03-03.
[4] "Standard ECMA-11 for the Alphanumeric Character Set OCR-B for Optical Recognition" (PDF). European Computer Manufacturers Association. March 1976. Section “Brief History”.
[5] "Draft Report on the Euro Glyph in OCR-B" (PDF). June 28, 1998.
[6] Karl Ivar Larsson (August 8, 2000). "Notes on transfer of responsibility for OCR-B standards".
[7] "Proposal for Type 3 Technical Report, TR 15907, Information technology — Revision of OCR-B standard (ISO 1073/II-1976)" (PDF). September 28, 1998.
[8] Karsson, Kent Ivar (June 28, 1998), Report to TC304 on OCR-B situation, Unicode Technical Committee, Unicode Consortium, UTC Document L2/01-259.
[9] "OCRB font family - Typography". 30 March 2022.
[10] "CTAN: /Tex-archive/Fonts/Ocr-b".
[11] "OCR a and OCR B".
[12] "OCR-B". wehtt.am. Archived from the original on 28 March 2019. Retrieved 11 January 2022.
[13] "Code Page 877" (PDF). Archived from the original (PDF) on 2013-01-21.
