Method and device for processing font data
Technical field
The present invention relates to the field of font technology, and in particular to a method and device for processing font data.
Background art
When stored and displayed, an electronic document comprises figures, pictures, tables, formulas, characters of various languages, and so on. Text characters are usually the most important element of an electronic document's content and account for the largest proportion of it. The font data of the characters is a resource that stores the font descriptor of each character in the character set of the electronic document. When the electronic document is displayed, the images of the characters are rendered according to the font descriptors of the character set and the corresponding font data, and are then shown on a computer screen or output to a printer.
To display the content of the character set of an electronic document faithfully in its original layout, the attribute information set by the user, such as the color, font, and font size of the character set, must be stored, and the character set and the font data must be integrated into one whole, so that the same electronic document produces the same result on any display terminal. The prior art processes the font data to remove part of the redundant information, for example by removing the glyph descriptions in the glyf table of the OpenType font specification. However, because the other information in the font data is left unprocessed, a font from which part of the redundant information has been removed is used in exactly the same way as a complete font.
At present, electronic documents are used more and more frequently, and a single electronic document often stores dozens, hundreds, or even thousands of font descriptors. Excessive font data makes the electronic document very large and also causes too many IO operations when the electronic document is parsed, so that parsing is very slow.
Summary of the invention
To solve the above problem, embodiments of the present invention provide a method and device for processing font data, which address the prior-art problem that excessive font data makes an electronic document very large and causes too many IO operations when the electronic document is parsed.
To this end, the present invention provides a method for processing font data, comprising:
obtaining the font data of a character set;
determining whether identical font descriptors exist in the font data of the character set;
merging the identical font descriptors in the font data of the character set;
merging the character maps of the character set according to the merged font data of the character set.
Before the font data of the character set is obtained, the method further comprises:
obtaining the encoding information of the character set so as to obtain the character set.
Obtaining the font data of the character set further comprises:
obtaining the character map of the character set, the character map comprising character codes and font data indexes.
After the character maps of the character set are merged according to the merged font data of the character set, the method further comprises:
updating the merged font data and the merged character map of the character set and storing them in the electronic document of the character set.
The method further comprises storing the merged font data of the character set in a single font data file.
The present invention also provides a device for processing font data, comprising:
an acquiring unit, configured to obtain the font data of a character set;
a determining unit, configured to determine whether the font descriptors of the characters in the character set are identical;
a merging unit, configured to merge the identical font descriptors in the font data of the character set, and to merge the character maps of the character set according to the merged font data of the character set.
The acquiring unit is further configured to obtain the encoding information of the character set so as to obtain the character set.
The acquiring unit is further configured to obtain the character map of the font data of the character set, the character map comprising character codes and font data indexes.
The device further comprises an updating unit.
The updating unit is configured to update the merged font data and the merged character map of the character set and store them in the electronic document of the character set.
The merging unit is further configured to store the merged font data of the character set in a single font data file.
The present invention has the following beneficial effects:
In the method for processing font data provided by the present invention, the identical font descriptors of the character set are merged. This reduces the kinds and number of font descriptors and font names in the electronic document, and correspondingly reduces the number and complexity of the character maps. The size of the electronic document is reduced at the same time, which reduces the number of IO operations that occur when the electronic document is parsed and improves the parsing speed.
In the device for processing font data provided by the present invention, the acquiring unit obtains the font data of the character set in the electronic document, and the merging unit merges the identical font descriptors in the font data of the character set into one, which reduces the kinds and number of font descriptors and font names in the electronic document. The merging unit also merges the character maps of the character set according to the merged font data, which reduces the number of character maps and the size of the electronic document, reduces the number of IO operations that occur when the electronic document is parsed, and improves the parsing speed.
Brief description of the drawings
Fig. 1 is a flowchart of a first embodiment of the method for processing font data provided by the present invention;
Fig. 2 is a flowchart of a second embodiment of the method for processing font data provided by the present invention;
Fig. 3 shows the character set of the electronic document in the second embodiment of the method for processing font data provided by the present invention;
Fig. 4 shows the font data of the character set of the electronic document in Fig. 3;
Fig. 5 is a schematic structural diagram of a first embodiment of the device for processing font data provided by the present invention;
Fig. 6 is a schematic structural diagram of a second embodiment of the device for processing font data provided by the present invention.
To enable those skilled in the art to better understand the technical solution of the present invention, the method and device for processing font data provided by the present invention are described in detail below with reference to the accompanying drawings.
Fig. 1 is a flowchart of the first embodiment of the method for processing font data provided by the present invention.
As shown in Fig. 1, the method for processing font data of the present embodiment comprises the following steps:
Step 101: obtain the font data of the character set.
When an electronic document is opened on an intelligent terminal, the terminal first obtains the encoding information of the character set of the electronic document and the font data describing that character set. The font data describing the character set comprises the font descriptor of each character in the character set of the electronic document, and includes information such as the font size, font, and color of a character. The font size is, for example, Small Four (小四) or size 13. The font is, for example, regular script (Type2), clerical script (TrueType), or semi-cursive script (OpenType), where Type2, TrueType, and OpenType are font format standards. The font data describing the character set is usually stored in different font data files according to some standard. In the present embodiment, font data of the same font format standard may be stored in the same font data file. After the font data of the character set is obtained, the method proceeds to step 102.
Step 102: determine whether identical font descriptors exist in the font data of the character set.
The font data comprises the font descriptor of each character, and it is determined whether identical font descriptors exist in the font data of the character set. In the present embodiment, the font data may comprise separate font descriptors for the font size, the font, the color, and so on. Take the characters "中" and "国" as an example. If the font size of both "中" and "国" is size 12, the font descriptors for the font size of "中" and "国" are identical. If the color of both "中" and "国" is blue, the font descriptors for the color of "中" and "国" are identical. If the font in the font descriptor of "中" is regular script (Type2) and the font in the font descriptor of "国" is regular script (TrueType), the font name of both characters is regular script, so the font descriptors for the font of "中" and "国" are likewise identical. When identical font descriptors are determined to exist in the font data of the character set, the method proceeds to step 103.
Step 103: merge the identical font descriptors of the character set.
In the present embodiment, the regular script (Type2) font descriptor of "中" and the regular script (TrueType) font descriptor of "国" can be merged into one font descriptor, so that the font of "中" and "国" after merging is regular script. Likewise, the font descriptors for the font size of "中" and "国" are merged into one, the font size after merging being size 12, and the font descriptors for the color of "中" and "国" are merged into one, the color after merging being blue. After the merged font descriptors describing the character set of the electronic document are obtained, the method proceeds to step 104.
Step 104: merge the character maps of the character set according to the merged font data of the character set.
After each group of identical font descriptors has been merged, the merged font data describing the character set of the electronic document is obtained. Preferably, the merged font data is stored in a single font data file, and the character map of the character set is then modified accordingly. The character map indicates the position, in the font data file, of the font descriptor corresponding to each character. After the font data is merged, the number and kinds of font descriptors of the character set are markedly reduced, so the number of character maps corresponding to the character set is also markedly reduced and their structure becomes simpler and clearer.
In the present embodiment, the identical font descriptors of the character set are merged. This reduces the kinds and number of font descriptors and font names in the electronic document, and correspondingly reduces the number and complexity of the character maps. The size of the electronic document is reduced at the same time, which reduces the number of IO operations that occur when the electronic document is parsed and improves the parsing speed.
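Steps 101 to 104 can be illustrated with a minimal Python sketch. It is not the implementation of the embodiment: the `FontDescriptor` class, the `merge_descriptors` function, and the choice to treat two descriptors as identical when font name, font size, and color all agree (ignoring the font format standard) are assumptions made for illustration, and the index stored in the character map is simplified to the position of the merged descriptor in a list. The two example characters are 中 and 国.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FontDescriptor:
    """Hypothetical per-character descriptor (not defined in the source)."""
    name: str   # font name, e.g. "regular script"
    fmt: str    # font format standard, e.g. "Type2" or "TrueType"
    size: int   # font size
    color: str  # font color


def merge_descriptors(char_fonts):
    """Merge descriptors that agree on name, size and color (format ignored).

    char_fonts maps each character to its FontDescriptor. Returns the list of
    merged descriptor keys and a character map from character code to the
    index of the merged descriptor.
    """
    merged = []    # merged descriptors (steps 102 and 103)
    index_of = {}  # descriptor key -> index in `merged`
    char_map = {}  # character code -> merged index (step 104)
    for ch, desc in char_fonts.items():
        key = (desc.name, desc.size, desc.color)
        if key not in index_of:
            index_of[key] = len(merged)
            merged.append(key)
        char_map[ord(ch)] = index_of[key]
    return merged, char_map


fonts = {
    "中": FontDescriptor("regular script", "Type2", 12, "blue"),
    "国": FontDescriptor("regular script", "TrueType", 12, "blue"),
}
merged, char_map = merge_descriptors(fonts)
print(len(merged), char_map)
```

In this sketch both characters end up pointing at the same merged descriptor, which is the reduction in descriptor count that the embodiment relies on.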
Fig. 2 is a flowchart of the second embodiment of the method for processing font data provided by the present invention. As shown in Fig. 2, the method for processing font data of the present embodiment comprises the following steps:
Step 201: read the encoding information of the character set to obtain the character set.
When the character set stored in an electronic document needs to be consulted, the encoding information of all characters of the electronic document is obtained first. Character encodings include, for example, Unicode and GBK, each of which assigns a unified binary code to every character it covers. The intelligent terminal obtains the character set by reading the Unicode, GBK, or other codes of the character set in the electronic document. After the character set in the electronic document is obtained, the method proceeds to step 202.
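As a rough illustration of step 201, the following Python sketch decodes raw GBK bytes into characters and lists each character's Unicode code point in hexadecimal. The function name `read_character_set` and the sample string 勾股定理 (Pythagorean theorem) are assumptions chosen for illustration; the source does not specify how the terminal reads the codes.

```python
def read_character_set(raw: bytes, encoding: str = "gbk") -> list[str]:
    """Decode raw bytes and return each character's code point as 4-digit hex."""
    text = raw.decode(encoding)
    return [f"{ord(ch):04X}" for ch in text]


# Hypothetical input: the example string encoded as GBK bytes.
raw = "勾股定理".encode("gbk")
codes = read_character_set(raw)
print(codes)
```

Whatever the storage encoding, the decoded code points identify the same characters, so the later steps can key the character map on them.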
Step 202: obtain the font data of the character set.
The font data of the character set comprises the font descriptors of all characters in the electronic document. The present embodiment describes the technical solution using the conversion of a PDF document into a CEBX document as an example. Fig. 3 shows the character set of the electronic document in the second embodiment, and Fig. 4 shows the font data of that character set. The font data obtained for the character set of the PDF document shown in Fig. 3 comprises the six font descriptors shown in Fig. 4, which cover a mathematical formula character set and a text character set. The two fonts of the mathematical formula character set are Cambria Math and Calibri. The four fonts of the text character set are regular script (Type2), regular script (TrueType), Microsoft YaHei (Type2), and Microsoft YaHei (TrueType). In the present embodiment, the font data corresponding to Microsoft YaHei (Type2) is stored in a first font data file, and the font data corresponding to Microsoft YaHei (TrueType) is stored in a second font data file. The font data files are embedded and stored in the electronic document.
Step 203: obtain the character map of each character in the character set.
Take "勾股定理" (Pythagorean theorem) shown in Fig. 3 as an example. The font of "勾" and "股" is Microsoft YaHei (Type2), and their font data is stored in the first font data file. The character map relating character codes to font data indexes is shown in Table 1, where the font data indexes 0001 and 0002 indicate the positions, in the first font data file, of the font descriptors corresponding to the characters.
Character code (hexadecimal) | Font data index
52FE | 0001
80A1 | 0002
Table 1
The font of "定" and "理" is Microsoft YaHei (TrueType), and their font data is stored in the second font data file. Their character map is shown in Table 2, where the font data indexes 0001 and 0002 indicate the positions, in the second font data file, of the font descriptors corresponding to the characters.
Character code (hexadecimal) | Font data index
5B9A | 0001
7406 | 0002
Table 2
After the font data files and character maps of the various font data in the character set are obtained, the method proceeds to step 204.
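The per-file character maps of Tables 1 and 2 can be pictured as one lookup table per font data file. The sketch below is only an illustration: the names `CHAR_MAPS`, `font_file_1`, `font_file_2`, and `locate` are hypothetical, and the indexes are stored as integers rather than four-digit strings.

```python
# Character code -> font data index, one map per font data file
# (values taken from Tables 1 and 2; file names are hypothetical).
CHAR_MAPS = {
    "font_file_1": {0x52FE: 1, 0x80A1: 2},  # first font data file
    "font_file_2": {0x5B9A: 1, 0x7406: 2},  # second font data file
}


def locate(ch):
    """Return (file name, font data index) for a character."""
    code = ord(ch)
    for file_name, char_map in CHAR_MAPS.items():
        if code in char_map:
            return file_name, char_map[code]
    raise KeyError(f"no font data for U+{code:04X}")


for ch in "勾股定理":
    print(ch, locate(ch))
```

Before merging, resolving a character needs both the file and the index within that file, and the same index value (1 or 2) means different things in different files.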
Step 204: determine whether identical font descriptors exist in the font data of the character set.
According to the font descriptor of each character in the character set, it is determined whether the font descriptors describing the font size are identical, whether the font descriptors describing the color are identical, and so on. In the present embodiment, a PDF document is converted into a CEBX document. In the CEBX document, the fonts Cambria Math and Calibri of the mathematical formula character set have identical font descriptors, regular script (Type2) and regular script (TrueType) have identical font descriptors, and Microsoft YaHei (Type2) and Microsoft YaHei (TrueType) have identical font descriptors. When identical font descriptors exist in the font data of the character set, the method proceeds to step 205.
Step 205: merge the identical font descriptors of the character set to obtain the merged font data of the character set.
The identical font descriptors of the character set in Fig. 3 are merged according to the two font format standards TrueType and Type2. The fonts before and after merging are shown in Table 3.
Table 3
Step 206: merge the character maps of the character set according to the merged font data of the character set.
In the present embodiment, before the font data is merged, the character map of "勾" and "股" is as shown in Table 1, and the character map of "定" and "理" is as shown in Table 2. As determined in step 204, the font descriptors of Microsoft YaHei (Type2) and Microsoft YaHei (TrueType) are identical and can be merged into one font descriptor, which yields the merged font data. Preferably, the merged font data is stored in a single font data file, and the character map relating the character set to the character codes is modified accordingly. The resulting character map of the merged font data is shown in Table 4.
Character code (hexadecimal) | Font data index
52FE | 0001
80A1 | 0002
5B9A | 0003
7406 | 0004
Table 4
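The renumbering that turns Tables 1 and 2 into Table 4 can be sketched as follows. This is one plausible way to do it, assuming that the font data of the second file is appended after that of the first in the single merged file, that indexes within each map start at 1, and that the maps do not share character codes. The function name `merge_char_maps` is hypothetical.

```python
def merge_char_maps(char_maps):
    """Merge per-file character maps into one map for a single font data file.

    Each input map goes from character code to a 1-based index within its own
    file. Indexes of later maps are shifted past those of earlier maps.
    """
    merged = {}
    offset = 0
    for char_map in char_maps:
        for code, index in char_map.items():
            merged[code] = offset + index
        if char_map:
            offset += max(char_map.values())
    return merged


table_1 = {0x52FE: 1, 0x80A1: 2}
table_2 = {0x5B9A: 1, 0x7406: 2}
table_4 = merge_char_maps([table_1, table_2])
for code, index in table_4.items():
    print(f"{code:04X} | {index:04d}")
```

After merging, a single index identifies the font data for each character, so only one font data file and one character map need to be read.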
Step 207: update the merged font data and the merged character map of the character set and store them in the electronic document of the character set.
The merged font data and character map are embedded in the electronic document and saved. That is, the font data file storing the merged font data, together with its character map, is embedded in the electronic document and saved. Because the total size of the font data of the character set is reduced and the number of character maps is reduced, the stored merged font data and character map take up less space, which further reduces the size of the electronic document. When the electronic document is opened again, the number of IO operations that occur during parsing is correspondingly reduced.
In this embodiment of the present invention, the identical font descriptors of the characters are merged. This reduces the kinds and number of font descriptors and font names in the electronic document, reduces the size of the font data of the character set, and simplifies the character map. The size of the electronic document is reduced, which reduces the number of IO operations that occur when the electronic document is parsed and improves the parsing speed.
Fig. 5 is a schematic structural diagram of the first embodiment of the device for processing font data provided by the present invention. As shown in Fig. 5, the device for processing font data provided by the present embodiment comprises an acquiring unit 501, a determining unit 502, and a merging unit 503. The acquiring unit 501 is configured to obtain the font data of all character sets in the electronic document. The determining unit 502 is configured to determine whether the font descriptors of the characters in the character set are identical. If the font descriptors of the characters are determined to be identical, the merging unit 503 merges the identical font descriptors in the font data of the character set into one font descriptor, so as to reduce the number of font descriptors, and merges the character maps of the character set according to the merged font data of the character set. The device can be applied to equipment that displays or outputs electronic documents, such as computers and printers.
In the present embodiment, the acquiring unit obtains the font data of the character set in the electronic document, and the merging unit merges the identical font descriptors in the font data of the character set into one, which reduces the kinds and number of font descriptors and font names in the electronic document. The merging unit also merges the character maps of the character set according to the merged font data, which reduces the number of character maps and the size of the electronic document, reduces the number of IO operations that occur when the electronic document is parsed, and improves the parsing speed.
Further, the acquiring unit 501 may also be configured to obtain the encoding information of the character set so as to obtain the character set, and to obtain the character map of the character set, the character map comprising character codes and font data indexes.
Fig. 6 is a schematic structural diagram of the second embodiment of the device for processing font data provided by the present invention. As shown in Fig. 6, the device for processing font data provided by the present embodiment further comprises an updating unit 504. The updating unit 504 updates the merged font data and the merged character map of the character set and stores them in the electronic document of the character set. That is, the font data file storing the merged font data, together with its character map, is embedded in the electronic document and saved. The total size of the font data of the character set is thereby reduced and the number of character maps is correspondingly reduced, which reduces the size of the electronic document. When the electronic document is opened again, the number of IO operations that occur during parsing is correspondingly reduced.
Further, the merging unit 503 is also configured to store the merged font data of the character set in a single font data file, and the updating unit 504 updates this font data file and stores it in the electronic document of the character set.
In the present embodiment, the acquiring unit obtains the font data of all character sets in the electronic document, the determining unit determines whether the font descriptors of the characters are identical, and the merging unit merges the identical font descriptors. This reduces the kinds and number of font descriptors and font names in the electronic document and therefore the total size of the font data. The merging unit also merges the character maps of the character set according to the merged font data, so that the number of character maps is correspondingly reduced. The size of the electronic document is thereby reduced, the number of IO operations that occur when the electronic document is parsed is reduced, and the parsing speed is improved.
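The cooperation of units 501 to 504 can be pictured with a small Python sketch. Everything in it is an assumption made for illustration: the class `FontDataDevice`, the dict-based document with `fonts` and `char_map` keys, the rule that two font entries are identical when their names agree, and the simplification that the character map points at a font entry rather than at a position in a font data file.

```python
class FontDataDevice:
    """Hypothetical composition of the four units over a dict-based document."""

    def acquire(self, doc):
        # Acquiring unit 501: font data and character map of the document.
        return doc["fonts"], doc["char_map"]

    def has_identical(self, fonts):
        # Determining unit 502: do any two entries share a font name?
        names = [name for name, _fmt in fonts]
        return len(set(names)) < len(names)

    def merge(self, fonts, char_map):
        # Merging unit 503: one entry per font name, character map remapped.
        merged_names, remap = [], {}
        for old_index, (name, _fmt) in enumerate(fonts):
            if name not in merged_names:
                merged_names.append(name)
            remap[old_index] = merged_names.index(name)
        new_fonts = [(name, "merged") for name in merged_names]
        new_map = {code: remap[i] for code, i in char_map.items()}
        return new_fonts, new_map

    def update(self, doc, fonts, char_map):
        # Updating unit 504: store the merged data back in the document.
        doc["fonts"], doc["char_map"] = fonts, char_map

    def process(self, doc):
        fonts, char_map = self.acquire(doc)
        if self.has_identical(fonts):
            self.update(doc, *self.merge(fonts, char_map))
        return doc


doc = {
    "fonts": [("Microsoft YaHei", "Type2"), ("Microsoft YaHei", "TrueType")],
    "char_map": {0x52FE: 0, 0x5B9A: 1},
}
FontDataDevice().process(doc)
print(doc)
```

Keeping each unit as a separate method mirrors the structure of Fig. 6, where the updating unit is an optional addition to the three units of Fig. 5.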
It should be understood that the above embodiments are merely exemplary embodiments adopted to illustrate the principle of the present invention, and the present invention is not limited thereto. Those of ordinary skill in the art can make various modifications and improvements without departing from the spirit and essence of the present invention, and such modifications and improvements are also regarded as falling within the protection scope of the present invention.