CN107992869B - Method and device for correcting tilted characters and electronic equipment
- Publication number
- CN107992869B (application CN201610945094.9A)
- Authority
- CN
- China
- Prior art keywords
- image
- text
- current
- standard deviation
- miscut
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/14—Image acquisition
- G06V30/146—Aligning or centring of the image pick-up or image-field
- G06V30/1475—Inclination or skew detection or correction of characters or of image to be recognised
- G06V30/1478—Inclination or skew detection or correction of characters or of image to be recognised of characters or characters lines
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V10/00—Arrangements for image or video recognition or understanding
- G06V10/20—Image preprocessing
- G06V10/26—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion
- G06V10/267—Segmentation of patterns in the image field; Cutting or merging of image elements to establish the pattern region, e.g. clustering-based techniques; Detection of occlusion by performing operations on regions, e.g. growing, shrinking or watersheds
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/287—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of Kanji, Hiragana or Katakana characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06V—IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
- G06V30/00—Character recognition; Recognising digital ink; Document-oriented image-based pattern recognition
- G06V30/10—Character recognition
- G06V30/28—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet
- G06V30/293—Character recognition specially adapted to the type of the alphabet, e.g. Latin alphabet of characters other than Kanji, Hiragana or Katakana
Landscapes
- Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Multimedia (AREA)
- Theoretical Computer Science (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Character Input (AREA)
- Editing Of Facsimile Originals (AREA)
- Image Processing (AREA)
Abstract
The invention provides a method and a device for correcting tilted characters, and an electronic device. The method comprises the following steps: obtaining a binary image of an image to be corrected; performing image tilt correction on the binary image by rotating the text lines in the binary image to a first preset direction, to obtain a corrected image; performing text line segmentation on the corrected image and cropping out a plurality of text images, wherein each text image contains one text line; and performing a miscut (shear) transformation on each text image, so that the tilted characters in the text image are transformed to a second preset direction perpendicular to the first preset direction, to obtain an image to be read. The method addresses the low efficiency of tilt correction that requires determining the text line spacing line by line.
Description
Technical Field
The present invention relates to the field of electronic devices, and in particular, to a method and an apparatus for tilted text correction, and an electronic device.
Background
At present, when optical character recognition is performed on scanned documents, photographs, video frames and the like, the text lines and characters in the image are often tilted or distorted, either because the text is printed in an italic font or because of projective imaging. Because tilted text lines and characters are hard to segment and make subsequent optical character recognition difficult, tilted characters must be corrected before character recognition.
In the prior art, tilted characters in an image file are corrected by segmenting individual text lines, so the line spacing must be determined line by line, which is inefficient. This approach also requires clearly visible gaps between the text lines in the image file, and it performs poorly when text lines touch one another.
Disclosure of Invention
The invention aims to provide a method, a device and electronic equipment for tilted character correction, so as to solve the prior-art problem of low efficiency when tilt correction requires determining the text line spacing line by line.
The invention provides a method for correcting inclined characters, wherein the method comprises the following steps:
obtaining a binary image of an image to be corrected;
carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
performing text line segmentation on the corrected image, and cropping out a plurality of text images, wherein each text image contains one text line;
and performing a miscut (shear) transformation on each text image, and transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read.
Preferably, in the method, the step of performing image tilt correction on the binary image, and rotating a text line in the binary image to a first preset direction, to obtain a corrected image includes:
rotating the binary image at different angles within a preset angle range;
projecting the binary image after each rotation in a second preset direction;
calculating the standard deviation of the obtained projection sequence when the binary image is projected in a second preset direction after each rotation;
and determining the rotated binary image corresponding to the maximum standard deviation as the corrected image.
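By way of a non-limiting illustration, the score used in the steps above can be sketched in Python. The sketch assumes that the binary image is a list of pixel rows in which text pixels are 1 and background pixels are 0, reads "projecting in the second preset direction" as accumulating each pixel row, and uses the population standard deviation. The function names and the two tiny test images are hypothetical and do not appear in the original disclosure.

```python
from statistics import pstdev


def row_projection(binary):
    """Accumulate the text pixels (value 1) of every pixel row."""
    return [sum(row) for row in binary]


def projection_std(binary):
    """Standard deviation of the row-projection sequence."""
    return pstdev(row_projection(binary))


# Text rows aligned with the pixel rows: the projection alternates 4, 0, 4, 0.
aligned = [
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]

# The same amount of "ink" smeared across rows: the projection is flat.
smeared = [
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
]

print(projection_std(aligned), projection_std(smeared))
```

A well-aligned image concentrates text pixels in a few rows, so its projection sequence varies strongly, whereas a tilted image spreads them over many rows; this is why the rotation with the largest standard deviation is selected.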
Preferably, in the method, the step of rotating the binary image by different angles within a preset angle range includes:
setting the preset angle range for the angle rotation of the binary image to [θ1, θ2], wherein θ1 < θ2;
setting the adjustment step length for the angle rotation to s1 and the current rotation angle to t1 = θ1;
and performing the initial rotation with the current rotation angle t1 = θ1, then adding the adjustment step length s1 to the current rotation angle t1, assigning the result to t1, and performing the next rotation, wherein t1 + s1 ≤ θ2.
Preferably, after the step of calculating a standard deviation of a projection sequence obtained when the binary image is projected in the second preset direction after each rotation, the method further includes:
comparing the standard deviation std of the binary image after current rotation with the current maximum standard deviation maxstd;
if the standard deviation std of the binary image after the current rotation is larger than the current maximum standard deviation maxstd, assigning the standard deviation std of the binary image after the current rotation to the current maximum standard deviation maxstd, assigning the current rotation angle t1 to an image inclination correction angle α, and performing the next rotation;
if the standard deviation std of the binary image after the current rotation is less than or equal to the current maximum standard deviation maxstd, maintaining the current maximum standard deviation maxstd and the image inclination correction angle α unchanged;
when the initial rotation is performed, the current maximum standard deviation maxstd is zero, and the image tilt correction angle α is zero.
Preferably, in the method, in the process of rotating the binary image by different angles within the preset angle range, if adding the adjustment step length s1 to the current rotation angle t1 gives a value greater than θ2, the angle rotation of the binary image is stopped;
wherein the step of determining the rotated binary image corresponding to the maximum standard deviation as the corrected image comprises:
extracting the current image inclination correction angle α;
and determining the binary image obtained by rotating the binary image by the current image inclination correction angle α as the corrected image.
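By way of a non-limiting illustration, the angle search described above (range [θ1, θ2], step s1, running maximum maxstd and correction angle α) can be sketched in Python as follows. The nearest-neighbour rotation about the image centre, the use of degrees, the 0/1 list-of-rows image format and all function names are assumptions made for this sketch only; the disclosure does not prescribe a particular rotation routine.

```python
import math
from statistics import pstdev


def rotate(binary, angle_deg):
    """Rotate a 0/1 image about its centre (nearest neighbour, same size)."""
    h, w = len(binary), len(binary[0])
    cy, cx = (h - 1) / 2, (w - 1) / 2
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: look up the source pixel that lands on (x, y).
            sx = round(cx + (x - cx) * c + (y - cy) * s)
            sy = round(cy - (x - cx) * s + (y - cy) * c)
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = binary[sy][sx]
    return out


def find_tilt_angle(binary, theta1, theta2, s1):
    """Sweep t1 from theta1 to theta2 in steps of s1 and keep the best angle."""
    maxstd, alpha = 0.0, 0.0          # initial values stated in the text
    t1 = theta1
    while t1 <= theta2:               # stop once t1 would exceed theta2
        rotated = rotate(binary, t1)
        std = pstdev([sum(row) for row in rotated])
        if std > maxstd:              # strictly larger: update maxstd and alpha
            maxstd, alpha = std, t1
        t1 += s1
    return alpha


# Hypothetical test page: horizontal stripes two pixels thick, period four.
stripes = [[1 if y % 4 < 2 else 0 for _ in range(21)] for y in range(21)]
print(find_tilt_angle(stripes, -10.0, 10.0, 5.0))
```

The corrected image is then `rotate(binary, alpha)`. Because t1 is accumulated in floating point, a step that is not exactly representable may cause the last angle of the range to be skipped; an integer step counter avoids this if it matters.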
Preferably, in the method, the step of performing a miscut transformation on the text image to transform the characters in an inclined state in the text image into a second preset direction perpendicular to the first preset direction, to obtain an image to be read, includes:
performing miscut transformations along the first preset direction on the text image, using different tangent values within a preset tangent value range;
projecting the text image in the first preset direction after each miscut transformation;
calculating the standard deviation of the obtained projection sequence when the text image is projected in a first preset direction after each time of miscut transform;
and determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read.
Preferably, in the method, the step of performing miscut transformations along the first preset direction on the text image, using different tangent values within a preset tangent value range, includes:
setting the preset tangent value range for the miscut transformations of the text image along the first preset direction to [k1, k2], wherein -1 < k1 < k2 < 1;
setting the adjustment step length of the miscut transformation to s2 and the current tangent value to t2 = k1;
and performing the initial miscut transformation with the current tangent value t2 = k1, then adding the adjustment step length s2 to the current tangent value t2, assigning the result to t2, and performing the next miscut transformation, wherein t2 + s2 ≤ k2.
Preferably, after the step of calculating a standard deviation of the obtained projection sequence when the text image is projected in the first preset direction after each time of the miscut transform, the method further includes:
comparing the standard deviation std of the text image after the current miscut transformation with the current maximum standard deviation maxstd;
if the standard deviation std of the text image after the current miscut transformation is larger than the current maximum standard deviation maxstd, assigning the standard deviation std of the text image after the current miscut transformation to the current maximum standard deviation maxstd, assigning the current tangent value t2 to a character correction confirmation tangent value tan(β), and performing the next miscut transformation;
if the standard deviation std of the text image after the current miscut transformation is less than or equal to the current maximum standard deviation maxstd, maintaining the current maximum standard deviation maxstd and the character correction confirmation tangent value tan(β) unchanged;
when performing the initial miscut transform, the current maximum standard deviation maxstd is zero, and the character correction confirmation tangent value tan(β) is zero.
Preferably, in the method, in the process of performing miscut transformations along the first preset direction on the text image with different tangent values within the preset tangent value range, if adding the adjustment step length s2 to the current tangent value t2 gives a value greater than k2, the miscut transformation of the text image is stopped;
wherein, the step of determining the text image after the corresponding miscut transformation when the standard deviation is the maximum comprises:
extracting the current character correction confirmation tangent value tan(β);
and determining the text image obtained by applying the miscut transformation along the first preset direction with the current character correction confirmation tangent value tan(β) as the image to be read.
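By way of a non-limiting illustration, the tangent search described above (range [k1, k2], step s2, running maximum maxstd and tangent tan(β)) can be sketched in Python as follows. The sketch assumes a horizontal first preset direction, so the miscut (shear) shifts each pixel row sideways in proportion to its height above the bottom row and the projection accumulates each pixel column. The padding scheme, the sign convention (a positive tangent shifts upper rows to the right) and the function names are assumptions for this sketch only.

```python
import math
from statistics import pstdev


def shear_horizontal(binary, k, pad):
    """Shift each row by k times its height above the bottom row."""
    h, w = len(binary), len(binary[0])
    out = [[0] * (w + 2 * pad) for _ in range(h)]
    for y, row in enumerate(binary):
        shift = math.floor(k * (h - 1 - y) + 0.5)   # round half up
        for x, value in enumerate(row):
            out[y][x + pad + shift] = value
    return out


def find_shear_tangent(binary, k1, k2, s2):
    """Sweep t2 from k1 to k2 in steps of s2 and keep the best tangent."""
    h = len(binary)
    # One fixed padding for the whole sweep keeps the scores comparable.
    pad = math.ceil(max(abs(k1), abs(k2)) * (h - 1))
    maxstd, tan_beta = 0.0, 0.0       # initial values stated in the text
    t2 = k1
    while t2 <= k2:                   # stop once t2 would exceed k2
        sheared = shear_horizontal(binary, t2, pad)
        std = pstdev([sum(column) for column in zip(*sheared)])
        if std > maxstd:
            maxstd, tan_beta = std, t2
        t2 += s2
    return tan_beta


# Hypothetical one-stroke "italic" glyph leaning to the right.
slanted = [
    [0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 1, 0, 0, 0],
]
print(find_shear_tangent(slanted, -0.5, 0.5, 0.25))
```

Under these assumptions the image to be read is `shear_horizontal(binary, tan_beta, pad)`: once the strokes are upright they pile up in a few pixel columns, which maximises the standard deviation of the column projection.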
Preferably, the method described above, wherein the step of performing text line segmentation on the corrected image and cutting out a plurality of text images includes:
projecting the corrected image in a second preset direction;
obtaining an accumulated value of each projected pixel row, and comparing the accumulated value with a first preset value;
when the accumulated value is larger than the first preset value, determining the corresponding pixel line as a text line;
when the accumulated value is smaller than the first preset value, determining the corresponding pixel row as a background row;
and cropping out the text images according to the determined text lines.
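By way of a non-limiting illustration, the segmentation steps above can be sketched in Python as follows. The sketch assumes horizontal text lines, a 0/1 list-of-rows image, and that consecutive text rows are grouped into one text image. The text does not state how a row whose accumulated value equals the first preset value is treated; the sketch treats it as background, which is an assumption, as are the function names.

```python
def segment_text_lines(binary, threshold):
    """Return (first_row, end_row_exclusive) for every run of text rows."""
    lines, start = [], None
    for y, row in enumerate(binary):
        is_text = sum(row) > threshold      # accumulated value of this row
        if is_text and start is None:
            start = y                       # a new text line begins
        elif not is_text and start is not None:
            lines.append((start, y))        # the text line ended above y
            start = None
    if start is not None:                   # text line touching the bottom
        lines.append((start, len(binary)))
    return lines


def crop_text_images(binary, threshold=0):
    """Cut one sub-image per detected text line."""
    return [binary[a:b] for a, b in segment_text_lines(binary, threshold)]


page = [
    [0, 0, 0, 0],
    [1, 1, 0, 1],
    [0, 1, 1, 0],
    [0, 0, 0, 0],
    [1, 0, 0, 1],
    [0, 0, 0, 0],
]
print(segment_text_lines(page, 0))
```

For this hypothetical page, rows 1 to 2 form the first text image and row 4 forms the second.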
Another aspect of the present invention provides an apparatus for oblique character correction, wherein the apparatus comprises:
the image processing module is used for obtaining a binary image of an image to be corrected;
the text line correction module is used for carrying out image inclination correction on the binary image, rotating the text lines in the binary image to a first preset direction and obtaining a corrected image;
the text line segmentation module is used for performing text line segmentation on the corrected image and selecting a plurality of text images, wherein each text image has one text line;
and the character correction module is used for carrying out miscut transformation on the text image, transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction, and obtaining the image to be read.
Preferably, the apparatus described above, wherein the text line correction module includes:
the angle rotating unit is used for rotating the binary image by different angles within a preset angle range;
the first projection calculation unit is used for projecting the binary image in a second preset direction after each rotation;
the first standard deviation calculation unit is used for calculating the standard deviation of the obtained projection sequence when the binary image is projected in the second preset direction after each rotation;
and the first determining unit is used for determining the binary image after the corresponding rotation when the standard deviation is maximum as the corrected image.
Preferably, the apparatus described above, wherein the angle rotating unit includes:
a first setting subunit, configured to set the predetermined angle range in which the binary image is angularly rotated to [θ1, θ2], where θ1 < θ2;
a second setting subunit, configured to set the adjustment step size for the angle rotation to s1 and the current rotation angle to t1 = θ1;
and the rotation execution subunit is configured to perform the initial rotation with the current rotation angle t1 = θ1, then add the adjustment step length s1 to the current rotation angle t1, assign the result to t1, and perform the next rotation, where t1 + s1 ≤ θ2.
Preferably, in the apparatus described above, the text line correction module further includes:
the first comparison unit is used for comparing the standard deviation std of the binary image after current rotation with the current maximum standard deviation maxstd;
a first execution unit, configured to, if the standard deviation std of the binary image after the current rotation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the binary image after the current rotation to the current maximum standard deviation maxstd, assign the current rotation angle t1 to an image tilt correction angle α, and perform the next rotation;
a second execution unit, configured to, if a standard deviation std of the binary image after the current rotation is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the image tilt correction angle α unchanged;
when the initial rotation is performed, the current maximum standard deviation maxstd is zero, and the image tilt correction angle α is zero.
Preferably, the apparatus described above, wherein the angle rotating unit further includes:
a first stop determining subunit, configured to, in the process of rotating the binary image by different angles within the predetermined angle range, stop the angle rotation of the binary image if adding the adjustment step s1 to the current rotation angle t1 gives a value greater than θ2;
wherein the first determination unit includes:
a correction angle extraction subunit, configured to extract a current tilt correction angle α of the image;
a first corrected image determining subunit, configured to determine the binary image obtained by rotating the binary image by the current image tilt correction angle α as the corrected image.
Preferably, the apparatus described above, wherein the character correction module includes:
the miscut transformation unit is used for carrying out the miscut transformation of different tangent values in a first preset direction on the text image within a preset tangent value range;
the second projection calculation unit is used for projecting the text image in the first preset direction after each miscut transformation;
the second standard deviation calculating unit is used for calculating the standard deviation of the obtained projection sequence when the text image is projected in the first preset direction after the text image is subjected to the miscut transform every time;
and the second determining unit is used for determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read.
Preferably, the apparatus described above, wherein the miscut transform unit comprises:
a third setting subunit, configured to set the predetermined tangent value range for the miscut transformations of the text image along the first preset direction to [k1, k2], where -1 < k1 < k2 < 1;
a fourth setting subunit, configured to set the adjustment step size of the miscut transform to s2 and the current tangent value to t2 = k1;
and the miscut transformation executing subunit is used for performing the initial miscut transformation with the current tangent value t2 = k1, then adding the adjustment step s2 to the current tangent value t2, assigning the result to t2, and performing the next miscut transformation, wherein t2 + s2 ≤ k2.
Preferably, in the apparatus described above, the character correction module further includes:
the second comparison unit is used for comparing the standard deviation std of the text image after the current miscut transformation with the current maximum standard deviation maxstd;
a third executing unit, configured to, if the standard deviation std of the text image after the current miscut transformation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the text image after the current miscut transformation to the current maximum standard deviation maxstd, assign the current tangent value t2 to a character correction confirmation tangent value tan(β), and perform the next miscut transformation;
a fourth executing unit, configured to, if a standard deviation std of the text image after the current miscut transform is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the character correction confirmation tangent value tan(β) unchanged;
when performing the initial miscut transform, the current maximum standard deviation maxstd is zero, and the character correction confirmation tangent value tan(β) is zero.
Preferably, the apparatus described above, wherein the miscut transform unit further comprises:
a second stop determining subunit, configured to, in the process of performing miscut transformations along the first preset direction on the text image with different tangent values within the predetermined tangent value range, stop the miscut transformation of the text image if adding the adjustment step length s2 to the current tangent value t2 gives a value greater than k2;
wherein the second determination unit includes:
a tangent value extracting subunit operable to extract the current character correction confirmation tangent value tan(β);
and the second corrected image determining subunit is configured to determine the text image obtained by applying the miscut transformation along the first preset direction with the current character correction confirmation tangent value tan(β) as the image to be read.
Preferably, the apparatus described above, wherein the text line segmentation module includes:
the third projection calculation unit is used for projecting the corrected image in a second preset direction;
a third comparing unit, configured to obtain an accumulated value of each projected pixel row, and compare the accumulated value with a first preset value;
the text line determining unit is used for determining the corresponding pixel line as a text line when the accumulated numerical value is greater than the first preset numerical value;
a background row determining unit, configured to determine the corresponding pixel row as a background row when the accumulated value is smaller than the first preset value;
and the interception execution unit is used for intercepting and obtaining the text image according to the determined text line.
Another aspect of the present invention provides an electronic device, including:
at least one processor; and
a memory coupled to the at least one processor; wherein,
the memory stores a program of instructions executable by the at least one processor, the program of instructions being executable by the at least one processor to cause the at least one processor to:
obtaining a binary image of an image to be corrected;
carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
performing text line segmentation on the corrected image, and cropping out a plurality of text images, wherein each text image contains one text line;
and performing a miscut (shear) transformation on each text image, and transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read.
At least one of the above technical solutions of the specific embodiment of the present invention has the following beneficial effects:
According to the method and the device for correcting tilted characters, tilt correction of the text lines is first performed on the whole image to be corrected, rotating the text lines to the first preset direction (for example, so that the text lines are horizontal); a miscut transformation is then applied to each image containing a text line, so that the characters in the text line are brought to the second preset direction (for example, so that the characters are upright). With this method and device, the text line spacing does not need to be determined line by line, which solves the low efficiency of prior-art character correction processes that require determining the text line spacing line by line. In addition, the technical solution of the invention does not require clear, obvious gaps between characters and can correct touching characters, so it has a wide range of application and can effectively correct character tilt caused by italic printing fonts and by projection imaging for languages such as Chinese, English and Korean.
Drawings
FIG. 1 is a flow chart illustrating a method for tilted text correction according to an embodiment of the present invention;
FIG. 2 is a schematic flow chart of step S120 of the method shown in FIG. 1;
FIG. 3 is a schematic flow chart of step S140 of the method shown in FIG. 1;
FIG. 4 is a schematic structural diagram of an apparatus for tilted text correction according to an embodiment of the present invention;
FIG. 5 is a block diagram of an image processing module in the apparatus according to the embodiment of the present invention;
FIG. 6 is a block diagram of a text line correction module in the apparatus according to the embodiment of the present invention;
FIG. 7 is a block diagram of a character correction module in the apparatus according to the embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a text line segmentation module in the apparatus according to the embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Referring to fig. 1, a method for tilted text correction according to an embodiment of the present invention includes the steps of:
S110, obtaining a binary image of an image to be corrected;
S120, carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
S130, performing text line segmentation on the corrected image, and cropping out a plurality of text images, wherein each text image contains one text line;
S140, performing a miscut (shear) transformation on each text image, and transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read.
According to the method for correcting tilted characters, tilt correction is first performed on the whole image to be corrected so that the text lines lie in the first preset direction (for example, the horizontal direction); the text lines are then segmented, and a miscut transformation is applied to the text image formed by each segmented text line, so that the tilted characters are transformed to the second preset direction perpendicular to the first preset direction (for example, the vertical direction). Tilt correction therefore does not have to begin by determining the text line spacing line by line, which solves the low efficiency of the character correction process in the prior art.
In the method according to the embodiment of the present invention, the first preset direction to which the text lines in the binary image are rotated in step S120 may be the horizontal direction; in that case, the second preset direction to which the tilted characters in the text image are transformed in step S140 is the vertical direction. With this processing mode, tilted text lines in the binary image are rotated to horizontal, and after the miscut transformation the characters in each text line are vertical.
Alternatively, the first preset direction in step S120 may be the vertical direction; in that case, the second preset direction in step S140 is the horizontal direction. With this processing mode, tilted text lines in the binary image are rotated to vertical, so that the text in the binary image is arranged in columns, and after the miscut transformation the characters in each column of text are horizontal, that is, perpendicular to the length direction of the column.
Of course, the first preset direction and the second preset direction are not limited to only horizontal and vertical directions, but may be other directions.
Preferably, to make the characters easier to read after the tilt correction, the first preset direction in step S120 is the horizontal direction and the second preset direction in step S140 is the vertical direction.
The following description will be made regarding a specific implementation process of the method according to the embodiment of the present invention, taking the first preset direction as a horizontal direction and the second preset direction as a vertical direction as an example. Specifically, in the method according to the embodiment of the present invention, in step S110, the step of obtaining a binary image of the image to be corrected includes:
performing a binarization operation on the image to be corrected to obtain a binary image of the image to be corrected.
Specifically, the threshold of the binarization operation may be determined by the maximum between-class variance method (i.e., the Otsu method), and the binarization operation is then performed on the image to be corrected with that threshold. That is, the display pixels of the image to be corrected are divided into two parts: display pixels whose gray value is greater than the threshold and display pixels whose gray value is less than the threshold. After the binarization operation, the display pixels with gray values greater than the threshold are converted to white (or black), and the display pixels with gray values less than the threshold are converted to black (or white).
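The threshold selection described above can be sketched as follows. This is a minimal illustration in plain Python, assuming an 8-bit grayscale image stored as a list of rows; the function names otsu_threshold and binarize are hypothetical, and the choice to map brighter pixels to 1 is only one of the two polarities the text allows.

```python
def otsu_threshold(gray):
    """Return the threshold (0..255) that maximizes the between-class variance."""
    hist = [0] * 256
    for row in gray:
        for v in row:
            hist[v] += 1
    total = sum(hist)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_score = 0, -1.0
    w0 = 0    # number of pixels with gray value <= t
    sum0 = 0  # sum of the gray values of those pixels
    for t in range(256):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, variance is undefined
        mu0 = sum0 / w0
        mu1 = (sum_all - sum0) / w1
        # Proportional to the between-class variance (constant factor dropped).
        score = w0 * w1 * (mu0 - mu1) ** 2
        if score > best_score:
            best_score, best_t = score, t
    return best_t


def binarize(gray, threshold):
    """Assumed polarity: pixels brighter than the threshold become 1, others 0."""
    return [[1 if v > threshold else 0 for v in row] for row in gray]
```

For example, for the small image [[10, 12, 200], [11, 210, 205]] the dark and bright pixels form two well-separated classes, and the sketch places the threshold at the top of the dark class.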
Preferably, in order to obtain a clearer binary image, image pre-processing operations of image denoising and contrast stretching are performed in sequence on the image to be corrected before the binarization operation is performed on it.
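The embodiment does not specify how the contrast stretching is carried out. Purely as an assumption, the sketch below uses a linear min-max stretch of an 8-bit grayscale image; the function name stretch_contrast and its parameters are hypothetical, and the denoising step is not shown.

```python
def stretch_contrast(gray, low=0, high=255):
    """Linearly map the darkest pixel to `low` and the brightest to `high`.

    This min-max mapping is an assumed technique, not one named in the text.
    """
    flat = [v for row in gray for v in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:
        # A flat image has no contrast to stretch; return an unchanged copy.
        return [row[:] for row in gray]
    span_in, span_out = hi - lo, high - low
    # Integer arithmetic with round-half-up keeps the result deterministic.
    return [
        [low + ((v - lo) * span_out * 2 + span_in) // (2 * span_in) for v in row]
        for row in gray
    ]
```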
Further, in the method according to the embodiment of the present invention, after obtaining the binary image of the image to be corrected, the method further includes:
marking the binary image, where the text area in the binary image is marked with a first value and the background area in the binary image is marked with a second value. In general image processing practice, the foreground region and the background region of a binary image are marked as 1 and 0, respectively. In the embodiment of the present invention, the text area is marked as 1 and the background area is marked as 0.
Before this marking is applied, the areas occupied by the display pixels of the two colors in the binary image are counted. Because the background area is larger than the text area, marking the display pixels of the color with the smaller area as 1 and the display pixels of the color with the larger area as 0 results in the text area being marked as 1 and the background area being marked as 0.
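The area-based marking rule above can be sketched as follows, assuming the binary image is already a list of rows of 0/1 values; the function name mark_text_as_one is hypothetical.

```python
def mark_text_as_one(binary):
    """Return a copy in which the minority value (assumed to be text) is 1."""
    ones = sum(sum(row) for row in binary)
    total = sum(len(row) for row in binary)
    if ones * 2 > total:
        # The 1-pixels cover the larger area, so they are background: invert.
        return [[1 - v for v in row] for row in binary]
    return [row[:] for row in binary]
```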
When the text line in the binary image is not horizontal, it will cause great difficulty to the subsequent image processing work, so it is necessary to perform image tilt correction on the binary image before extracting characters, and correct the text line in the binary image to horizontal, that is, execute step S120, perform image tilt correction on the binary image, and rotate the text line in the binary image to horizontal, so as to obtain a corrected image.
Specifically, the step S120 includes:
S121, rotating the binary image by different angles within a preset angle range;
S122, projecting the binary image in the vertical direction after each rotation;
S123, calculating the standard deviation of the projection sequence obtained when the binary image is projected in the vertical direction after each rotation;
S124, determining the rotated binary image corresponding to the maximum standard deviation as the corrected image.
With this processing mode, trial rotations at different angles are applied to the binary image. The rotation angle at which the standard deviation of the projection sequence in the vertical direction is largest is the angle by which the binary image should be rotated from its current state, and the binary image rotated by that angle is the corrected image.
The correction process for determining the maximum standard deviation specifically includes the following steps:
1) Setting the initial parameters
The preset angle range for rotating the binary image is set to [θ1, θ2], where θ1 < θ2, in degrees. The tilt angle of an image containing text generally falls within a limited range, so the possible tilt angle range [θ1, θ2] of the image can be chosen empirically, for example [-15, 15];
the adjustment step for the angle rotation is set to s1, and the current rotation angle is set to t1 = θ1;
the current maximum standard deviation maxstd is set to 0, and the image tilt correction angle α is set to 0;
2) performing an image correction process
Rotating the initial binary image (denoted Ibw) by the angle t1 to obtain a new binary image (denoted Irot);
projecting the binary image Irot in the vertical direction to obtain the projection sequence Iproj of the binary image Irot in the vertical direction;
specifically, according to the mark of each display pixel in the binary image, the values of the display pixels in each row of the binary image Irot are summed to obtain the terms of the projection sequence Iproj. That is, the values of the display pixels of the first row in the binary image Irot are summed to give the first term of the projection sequence Iproj; the values of the display pixels of the second row are summed to give the second term of the projection sequence Iproj; and so on. The binary image Irot is scanned row by row until its last row has been summed, which yields the projection sequence Iproj.
In the embodiment of the present invention, the text portion of the binary image Irot is marked as 1 and the background portion is marked as 0, so each term of the projection sequence Iproj obtained in the above manner is the number of text pixels in the corresponding row.
Let the length of the projection sequence Iproj be m, i.e., the number of rows of display pixels in the binary image Irot is m, and let $x_i$ be the i-th element of Iproj. The average $\bar{x}$ of the projection sequence Iproj is calculated as follows:
$$\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$$
Then, based on the average $\bar{x}$ of Iproj, the standard deviation std of the projection sequence Iproj is calculated as follows:
$$\mathrm{std} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_i - \bar{x}\right)^2}$$
The standard deviation std obtained by the above calculation is also the standard deviation std of the currently rotated binary image.
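To illustrate why the standard deviation of the row projection indicates alignment, the following sketch compares a toy image whose text rows are horizontal with one whose text pixels are spread evenly over all rows. The function names are hypothetical, and normalizing by m (rather than m - 1) is an assumption.

```python
import math


def row_projection(image):
    """Iproj: the number of text pixels (value 1) in each pixel row."""
    return [sum(row) for row in image]


def std_dev(seq):
    """Standard deviation of a sequence, normalized by its length m (assumed)."""
    m = len(seq)
    mean = sum(seq) / m
    return math.sqrt(sum((x - mean) ** 2 for x in seq) / m)


# Text pixels concentrated in two rows: peaks and valleys in the projection.
aligned = [[0, 0, 0, 0], [1, 1, 1, 1], [0, 0, 0, 0], [1, 1, 1, 1]]
# The same number of text pixels spread evenly over all rows: flat projection.
spread = [[0, 0, 1, 1], [1, 1, 0, 0], [0, 0, 1, 1], [1, 1, 0, 0]]
```

Here the aligned image gives the projection [0, 4, 0, 4] with standard deviation 2.0, whereas the spread image gives [2, 2, 2, 2] with standard deviation 0.0, so a larger value corresponds to text rows that are better aligned with the pixel rows.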
Then, comparing the standard deviation std of the current rotated binary image with the current maximum standard deviation maxstd, if the standard deviation std of the current rotated binary image is larger than the current maximum standard deviation maxstd, assigning the standard deviation std of the current rotated binary image to the current maximum standard deviation maxstd, assigning the current value of the rotation angle t1 to the image inclination correction angle alpha, and then performing the next rotation; and if the standard deviation std of the current rotated binary image is less than or equal to the current maximum standard deviation maxstd, performing no assignment operation, namely keeping the current maximum standard deviation maxstd unchanged, and keeping the image inclination correction angle alpha unchanged.
Further, if the current rotation angle t1 is less than θ2 and t1 + s1 is not greater than θ2, the adjustment step s1 is added to the current rotation angle t1 and the result is assigned to t1 (i.e., t1 + s1 is assigned to t1), and the image correction process is performed again for the next rotation to obtain a new rotated binary image;
if t1 + s1 > θ2, the current image tilt correction angle α is extracted and the initial binary image is rotated by α. The rotated binary image thus obtained is the corrected image: among all rotations of the binary image within [θ1, θ2], it is the one whose projection sequence in the vertical direction has the largest standard deviation.
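The search over rotation angles described in steps 1) and 2) can be sketched as follows. This is only an illustration under stated assumptions: the nearest-neighbor rotation about the image center, the fixed output size, the normalization of the standard deviation by m, and the function names rotate, projection_std and find_tilt_correction are not taken from the embodiment.

```python
import math


def rotate(image, degrees):
    """Rotate about the image center (nearest neighbor, same size, 0 outside)."""
    h, w = len(image), len(image[0])
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            # Inverse mapping: locate the source pixel of each output pixel.
            sx = int(round(c * (x - cx) + s * (y - cy) + cx))
            sy = int(round(-s * (x - cx) + c * (y - cy) + cy))
            if 0 <= sx < w and 0 <= sy < h:
                out[y][x] = image[sy][sx]
    return out


def projection_std(image):
    """Standard deviation of the row sums (projection sequence Iproj)."""
    proj = [sum(row) for row in image]
    m = len(proj)
    mean = sum(proj) / m
    return math.sqrt(sum((x - mean) ** 2 for x in proj) / m)


def find_tilt_correction(ibw, theta1=-15.0, theta2=15.0, s1=1.0):
    """Return (alpha, corrected image) for the angle with the largest std."""
    maxstd, alpha = 0.0, 0.0
    t1 = theta1
    while True:
        std = projection_std(rotate(ibw, t1))
        if std > maxstd:          # keep the best angle seen so far
            maxstd, alpha = std, t1
        if t1 + s1 > theta2:      # the next step would leave [theta1, theta2]
            break
        t1 += s1
    return alpha, rotate(ibw, alpha)
```

With a step of s1 = 1 the sketch evaluates 31 trial rotations for the range [-15, 15]; a smaller step gives a finer correction angle at a proportionally higher cost.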
Through the execution steps, the text lines in the binary image are rotated to be horizontal, and the corrected image is obtained. On the basis, in order to further perform inclination correction on the characters in the corrected image, the characters to be read in the corrected image need to be intercepted, so as to obtain a text image only including the text line where the characters to be read are located.
Specifically, in step S130, the step of segmenting the text line of the corrected image and selecting a plurality of text images, where each text image has one text line includes:
projecting the corrected image in a vertical direction;
obtaining an accumulated value of each projected pixel row, and comparing the accumulated value with a first preset value;
when the accumulated value is greater than the first preset value, determining the corresponding pixel row as a text line;
when the accumulated value is smaller than the first preset value, determining the corresponding pixel row as a background row;
and according to the determined text line, intercepting and selecting to obtain the text image.
In the step of cropping out the text image according to the determined text lines, a line of characters generally spans several pixel rows. A run of adjacent pixel rows that have each been determined to be text lines therefore together forms the line in which the characters to be read are located, and these pixel rows are cropped out to obtain a text image that includes only the characters to be read.
The specific manner of projecting the corrected image in the vertical direction to obtain the accumulated value of each projected pixel row is the same as the corresponding manner in the correction process of the text lines in the binary image, and is not described here again.
With this processing mode, text lines and background rows are distinguished according to the first preset value, and the text image including only the characters to be read is obtained by cropping.
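The segmentation step can be sketched as follows, assuming the corrected image is a list of rows of 0/1 values. The function name segment_text_lines and the default threshold of 0 are hypothetical; the text does not say how a row whose accumulated value equals the first preset value is treated, and the sketch assumes it counts as background.

```python
def segment_text_lines(corrected, first_preset_value=0):
    """Split a tilt-corrected binary image into one sub-image per text line."""
    projection = [sum(row) for row in corrected]
    text_images, start = [], None
    for y, value in enumerate(projection):
        is_text = value > first_preset_value
        if is_text and start is None:
            start = y                                 # a run of text rows begins
        elif not is_text and start is not None:
            text_images.append(corrected[start:y])    # the run ended at row y - 1
            start = None
    if start is not None:
        text_images.append(corrected[start:])         # run reaches the last row
    return text_images
```

Each returned sub-image contains the adjacent pixel rows of one text line and can be passed on to the character correction of step S140.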
Further, the characters in the text line of the text image are quite likely to be tilted, either because an italic font is used or because of projection imaging, which makes the subsequent character segmentation and recognition difficult. The method of the present invention therefore further includes transforming the tilted characters in the text image to vertical, that is, step S140 of fig. 1.
In the embodiment of the present invention, the tilted characters in the text image are transformed to vertical by performing a miscut transformation in the horizontal direction on the text image.
As shown in fig. 3, specifically, in step S140, the step of performing the miscut transformation in the horizontal direction on the text image and transforming the tilted characters in the text image to vertical to obtain the image to be read includes:
S141, performing miscut transformations in the horizontal direction with different tangent values on the text image within a preset tangent value range;
S142, projecting the text image in the horizontal direction after each miscut transformation;
S143, calculating the standard deviation of the projection sequence obtained when the text image is projected in the horizontal direction after each miscut transformation;
S144, determining the miscut-transformed text image corresponding to the maximum standard deviation as the image to be read.
With this processing mode, the text image is subjected to miscut transformations in the horizontal direction with different tangent values within the preset tangent value range, and the miscut-transformed image whose projection sequence in the horizontal direction has the largest standard deviation is determined to be the image to be read.
The process of performing the shear transformation in the horizontal direction and with different tangent values specifically includes the following steps:
1) initial parameter setting is carried out
The preset tangent value range for the miscut transformations of the text image in the horizontal direction with different tangent values is set to [k1, k2], where -1 < k1 < k2 < 1. The tilt angles of the characters within a text line generally fall within a limited range, so the tangent value range [k1, k2] of the character tilt angle can be chosen empirically, for example [-0.3, 0.3];
according to the required correction accuracy, the adjustment step for changing the tangent value is set to s2, and the current tangent value is set to t2 = k1;
the current maximum standard deviation maxstd is set to 0, and the character correction confirmation tangent value tan(β) is set to 0.
2) Performing a miscut transform process
A miscut transformation in the horizontal direction is performed on the initial text image, i.e., the cropped image that includes only the characters to be translated (denoted Itext). The correspondence of the display pixel coordinates under the miscut transformation is as follows:
$$x_{new} = x_{old} + y_{old}\cdot t2, \qquad y_{new} = y_{old}$$
This yields the miscut-transformed image Ishear, where $x_{new}$, $y_{new}$ are the coordinates in the X direction and Y direction of a display pixel in the miscut-transformed image Ishear, $x_{old}$, $y_{old}$ are the coordinates in the X direction and Y direction of the display pixel in the image before the miscut transformation, and t2 is the current tangent value.
Projecting the miscut-transformed image Ishear in the horizontal direction to obtain the projection sequence Iproj of the miscut-transformed image Ishear in the horizontal direction;
specifically, according to the mark of each display pixel in the binary image, the values of the display pixels in each column of the miscut-transformed image Ishear are summed to obtain the terms of the projection sequence Iproj. That is, the values of the display pixels of the first column in the miscut-transformed image Ishear are summed to give the first term of the projection sequence Iproj; the values of the display pixels of the second column are summed to give the second term of the projection sequence Iproj; and so on. The miscut-transformed image Ishear is scanned column by column until its last column has been summed, which yields the projection sequence Iproj.
In the embodiment of the present invention, the text portion of the miscut-transformed image Ishear is marked as 1 and the background portion is marked as 0, so each term of the projection sequence Iproj obtained in the above manner is the number of text pixels in the corresponding column.
Let the length of the projection sequence Iproj be m, i.e., the number of columns of display pixels in the miscut-transformed image Ishear is m, and let $x_i$ be the i-th element of Iproj. The average $\bar{x}$ of the projection sequence Iproj is calculated as follows:
$$\bar{x} = \frac{1}{m}\sum_{i=1}^{m} x_i$$
Then, based on the average $\bar{x}$ of Iproj, the standard deviation std of the projection sequence Iproj is calculated as follows:
$$\mathrm{std} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(x_i - \bar{x}\right)^2}$$
The standard deviation std obtained by the above process is also the standard deviation std of the current miscut-transformed text image Ishear.
Then, the standard deviation std of the current miscut-transformed text image Ishear is compared with the current maximum standard deviation maxstd. If std is greater than maxstd, std is assigned to the current maximum standard deviation maxstd, the current tangent value t2 is assigned to the character correction confirmation tangent value tan(β), and the next miscut transformation in the horizontal direction is performed. If std is less than or equal to maxstd, no assignment is performed, that is, the current maximum standard deviation maxstd and the character correction confirmation tangent value tan(β) both remain unchanged.
Further, if the current tangent value t2 is less than k2 and t2 + s2 is not greater than k2, the adjustment step s2 is added to the current tangent value t2 and the result is assigned to t2 (i.e., t2 + s2 is assigned to t2), and the miscut transformation is performed again to obtain a new miscut-transformed image;
if t2 + s2 > k2, the current character correction confirmation tangent value tan(β) is extracted, and a miscut transformation in the horizontal direction is performed on the initial text image with tan(β); the miscut-transformed text image thus obtained is the image to be read. Among all miscut transformations of the initial text image in the horizontal direction within the tangent value range [k1, k2], the image to be read is the one whose projection sequence in the horizontal direction has the largest standard deviation.
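The search over tangent values can be sketched in the same way as the rotation search. The following is a minimal illustration under stated assumptions: each pixel row is shifted as a whole by the rounded offset, the Y coordinate is taken as the row index counted from the top, the sign convention x_new = x_old + y_old * k is assumed, the standard deviation is normalized by m, and the names shear_horizontal, column_projection_std and find_shear_correction as well as the default step s2 = 0.05 are hypothetical.

```python
import math


def shear_horizontal(image, k):
    """Shift pixel row y horizontally by round(k * y), widening the image."""
    h, w = len(image), len(image[0])
    shifts = [int(round(k * y)) for y in range(h)]
    offset = -min(shifts)                      # keeps column indices >= 0
    new_w = w + max(shifts) - min(shifts)
    out = [[0] * new_w for _ in range(h)]
    for y in range(h):
        dx = shifts[y] + offset
        out[y][dx:dx + w] = image[y]           # same-length slice assignment
    return out


def column_projection_std(image):
    """Standard deviation of the column sums (projection sequence Iproj)."""
    proj = [sum(col) for col in zip(*image)]
    m = len(proj)
    mean = sum(proj) / m
    return math.sqrt(sum((x - mean) ** 2 for x in proj) / m)


def find_shear_correction(itext, k1=-0.3, k2=0.3, s2=0.05):
    """Return (tan_beta, image to be read) for the tangent with the largest std."""
    maxstd, tan_beta = 0.0, 0.0
    t2 = k1
    while True:
        std = column_projection_std(shear_horizontal(itext, t2))
        if std > maxstd:          # keep the best tangent value seen so far
            maxstd, tan_beta = std, t2
        if t2 + s2 > k2:          # the next step would leave [k1, k2]
            break
        t2 += s2
    return tan_beta, shear_horizontal(itext, tan_beta)
```

Because t2 is accumulated in floating point, a step such as 0.05 may not land exactly on k2 or on 0; choosing a step that is exactly representable in binary (for example 0.125) or iterating over an integer index avoids this.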
Specifically, the correspondence between the coordinates of the display pixels in the initial text image and in the image to be read is as follows:
$$x_{new} = x_{old} + y_{old}\cdot\tan(\beta), \qquad y_{new} = y_{old}$$
where $x_{new}$, $y_{new}$ are the coordinates in the X direction and Y direction of a display pixel in the image to be read, i.e., after the characters have been corrected to the non-tilted state; $x_{old}$, $y_{old}$ are the coordinates in the X direction and Y direction of the display pixel in the initial text image; and tan(β) is the final character correction confirmation tangent value obtained in the above miscut transformation process.
According to the above-mentioned manner and process, the image to be read in which the character is converted into the non-tilt state is obtained, and the image to be read can be used for further character segmentation and reading.
The method for correcting tilted characters can be used for translating characters during reading. When a user is reading, an image of the scene within the user's field of view is captured to obtain an indication image of the character that the user indicates for translation, and the indicated character is determined by image analysis. The method for correcting tilted characters is then applied: text line tilt correction is performed on the captured indication image, the text line in which the indicated character is located is cropped out, and a miscut transformation in the horizontal direction is performed on the image of that text line, so that the indicated character is converted to the non-tilted state for subsequent recognition and translation.
Of course, the method for correcting tilted text according to the embodiment of the present invention is not limited to be applied only to the above-mentioned usage scenarios, and the method can be applied to various processes requiring extraction and recognition of characters in an image file.
The method for correcting the inclined characters can correct the inclined characters in the image to be corrected into the characters without inclination so as to facilitate subsequent character separation and character recognition. The method of the invention firstly carries out the tilt correction of the text line on the whole image to be corrected, rotates the text line to be horizontal, and then carries out the miscut transformation on the image comprising one text line to ensure that the characters in the text line are in a vertical state.
In another aspect, an embodiment of the present invention further provides an apparatus for correcting tilted text, which is shown in fig. 4, and the apparatus includes:
an image processing module 100, configured to obtain a binary image of an image to be corrected;
a text line correction module 200, configured to perform image tilt correction on the binary image, and rotate a text line in the binary image to a first preset direction to obtain a corrected image;
a text line segmentation module 300, configured to perform text line segmentation on the corrected image, and segment a plurality of text images, where there is one text line in each text image;
and the character correction module 400 is configured to perform a miscut transformation on the text image, transform the characters in the text image in an inclined state into a second preset direction perpendicular to the first preset direction, and obtain an image to be read.
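The way the four modules are chained can be sketched as follows. The class name TiltedTextCorrector and its constructor parameters are hypothetical, and each stage is passed in as a callable so that the sketch makes no assumption about how modules 100 to 400 are implemented.

```python
class TiltedTextCorrector:
    """Chains four stage callables in the order of modules 100, 200, 300, 400."""

    def __init__(self, binarize, correct_lines, segment_lines, correct_characters):
        self.binarize = binarize                      # role of module 100
        self.correct_lines = correct_lines            # role of module 200
        self.segment_lines = segment_lines            # role of module 300
        self.correct_characters = correct_characters  # role of module 400

    def run(self, image):
        """Return one image to be read per segmented text line."""
        binary = self.binarize(image)
        corrected = self.correct_lines(binary)
        text_images = self.segment_lines(corrected)
        return [self.correct_characters(t) for t in text_images]


# Usage with trivial stand-in stages (placeholders, not the real modules).
corrector = TiltedTextCorrector(
    binarize=lambda img: [[1 if v < 128 else 0 for v in row] for row in img],
    correct_lines=lambda binary: binary,
    segment_lines=lambda corrected: [[row] for row in corrected if any(row)],
    correct_characters=lambda text_image: text_image,
)
result = corrector.run([[255, 0, 255], [255, 255, 255], [0, 0, 255]])
```

Note that the character correction stage runs once per text image, which reflects that the miscut transformation is applied to each segmented text line separately rather than to the whole page.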
The structure and function of each module are described in detail below by taking the first preset direction as the horizontal direction and the second preset direction as the vertical direction as an example.
In the device for correcting tilted characters, tilt correction of the text lines is first performed on the whole image to be corrected so that the text lines are rotated to horizontal, and a miscut transformation is then performed on each image that includes one text line so that the characters in the text line are vertical. The device does not require clear and distinct gaps between characters and can correct touching characters, so it has a wide range of application. It can effectively correct character tilt caused by italic printed fonts and by projection imaging in languages such as Chinese, English and Korean, and it solves the problem of low efficiency of the character correction process in the prior art.
Referring to fig. 5, in the apparatus according to the embodiment of the present invention, the image processing module 100 includes:
and the first preprocessing unit is used for carrying out binarization operation on the image to be corrected to obtain a binary image of the image to be corrected.
Specifically, the threshold of the binarization operation may be determined by the maximum between-class variance method (i.e., the Otsu method), and the binarization operation is then performed on the image to be corrected with that threshold. That is, the display pixels of the image to be corrected are divided into two parts: display pixels whose gray value is greater than the threshold and display pixels whose gray value is less than the threshold. After the binarization operation, the display pixels with gray values greater than the threshold are converted to white (or black), and the display pixels with gray values less than the threshold are converted to black (or white).
Preferably, the image processing module 100 further comprises:
and the second preprocessing unit is used for respectively carrying out denoising and contrast stretching processing on the image to be corrected before carrying out binarization operation on the image to be corrected.
Performing image denoising and contrast stretching in sequence on the image to be corrected before the binarization operation yields a clearer binary image.
On the other hand, referring to fig. 6, the text line correction module 200 includes:
the angle rotating unit is used for rotating the binary image by different angles within a preset angle range;
the first projection calculation unit is used for projecting the binary image in the vertical direction after each rotation;
the first standard deviation calculation unit is used for calculating the standard deviation of the projection sequence obtained when the binary image is projected in the vertical direction after each rotation;
and the first determining unit is used for determining the binary image after the corresponding rotation when the standard deviation is maximum as the corrected image.
With these processing units, trial rotations at different angles are applied to the binary image. The rotation angle at which the standard deviation of the projection sequence in the vertical direction is largest is the angle by which the binary image should be rotated from its current state, and the binary image rotated by that angle is the corrected image.
Preferably, the angle rotating unit includes:
a first setting subunit, configured to set the predetermined angle range in which the binary image is angularly rotated to [ θ 1, θ 2], where θ 1< θ 2;
a second setting subunit, configured to set the adjustment step for the angle rotation to s1 and the current rotation angle to t1 = θ1;
and the rotation execution subunit is configured to perform the initial rotation at the current rotation angle t1 = θ1, then add the adjustment step s1 to the current rotation angle t1, assign the result to t1, and perform the next rotation, where t1 + s1 is not greater than θ2.
Preferably, the text line correction module further comprises:
the first comparison unit is used for comparing the standard deviation std of the binary image after current rotation with the current maximum standard deviation maxstd;
a first execution unit, configured to, if the standard deviation std of the binary image after the current rotation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the binary image after the current rotation to the current maximum standard deviation maxstd, assign the current rotation angle t1 to an image tilt correction angle α, and perform the next rotation;
a second execution unit, configured to, if a standard deviation std of the binary image after the current rotation is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the image tilt correction angle α unchanged;
when the initial rotation is performed, the current maximum standard deviation maxstd is zero, and the image tilt correction angle α is zero.
Preferably, the angle rotating unit further includes:
a first stop determining subunit, configured to, in a process of performing different angle rotations on the binary image within a predetermined angle range, stop performing the angle rotation on the binary image if the current rotation angle t1 is increased by the adjustment step s1 to obtain a value greater than θ 2;
wherein the first determination unit includes:
a correction angle extraction subunit, configured to extract a current tilt correction angle α of the image;
a first corrected image determining subunit, configured to determine the binary image obtained by rotating the binary image by the current image tilt correction angle α as the corrected image.
The text line correction module 200 including the above structure may specifically perform the text line correction process according to the above description of the method section. By the text line correction module 200, the text lines in the binary image are rotated to be horizontal, and a corrected image is obtained.
On the other hand, as shown in fig. 7, the character correction module 400 includes:
the miscut transformation unit is used for performing miscut transformations in the horizontal direction with different tangent values on the text image within a preset tangent value range;
the second projection calculation unit is used for projecting the text image in the horizontal direction after each miscut transformation;
the second standard deviation calculating unit is used for calculating the standard deviation of the obtained projection sequence when the text image is projected in the horizontal direction after the text image is subjected to the miscut transform every time;
and the second determining unit is used for determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read.
Specifically, the miscut transform unit includes:
a third setting subunit, configured to set the preset tangent value range for the miscut transformations of the text image in the horizontal direction with different tangent values to [k1, k2], where -1 < k1 < k2 < 1;
a fourth setting subunit, configured to set the adjustment step for the miscut transformation to s2 and the current tangent value to t2 = k1;
and the miscut transformation executing subunit is used for performing the initial miscut transformation with the current tangent value t2 = k1, then adding the adjustment step s2 to the current tangent value t2, assigning the result to t2, and performing the next miscut transformation, where t2 + s2 is not greater than k2.
Specifically, the character correction module further includes:
the second comparison unit is used for comparing the standard deviation std of the text image after the current miscut transformation with the current maximum standard deviation maxstd;
a third executing unit, configured to, if the standard deviation std of the text image after the current miscut transformation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the text image after the current miscut transformation to the current maximum standard deviation maxstd, assign the current tangent value t2 to a character correction confirmation tangent value tan (β), and perform the next miscut transformation;
a fourth executing unit, configured to, if a standard deviation std of the text image after the current miscut transform is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the character correction confirmation tangent value tan (β) unchanged;
when performing the initial miscut transform, the current maximum standard deviation maxstd is zero, and the character correction confirmation tangent value tan (β) is zero.
Specifically, the miscut transform unit further includes:
a second stop determining subunit, configured to, in a process of performing, on the text image, a horizontal-direction, different-tangent-value miscut transform within a predetermined tangent-value range, stop performing the miscut transform on the text image if a value obtained by increasing the current tangent value t2 by the adjustment step length s2 is greater than k 2;
wherein the second determination unit includes:
a tangent value extracting subunit operable to extract the current character correction confirmation tangent value tan (β);
a second corrected image determining subunit, configured to determine that, when the text image is subjected to horizontal miscut transform with the current character correction confirmation tangent value tan (β), the text image after the corresponding miscut transform is the image to be read.
In the device according to the embodiment of the present invention, the character correction module having the above structure converts the character into a non-inclined state by horizontal direction miscut transformation, so as to be used for further character segmentation and reading. For a specific implementation process of the horizontal direction miscut transform, reference may be made to the description of the above method, and details are not described here.
Further, referring to fig. 8, the text line segmentation module 300 includes:
a third projection calculation unit, configured to project the corrected image in a vertical direction;
a third comparison unit, configured to obtain the accumulated value of each projected pixel row and compare the accumulated value with a first preset value;
a text line determining unit, configured to determine the corresponding pixel row as a text line when the accumulated value is greater than the first preset value;
a background row determining unit, configured to determine the corresponding pixel row as a background row when the accumulated value is smaller than the first preset value;
and an interception execution unit, configured to intercept and obtain the text images according to the determined text lines.
The text line segmentation module with the above structure distinguishes text lines from background rows according to the first preset value, and intercepts the text image corresponding to each text line. Each of these text images is subsequently subjected to the horizontal miscut transformation, so that the characters in the text line are converted into a non-inclined state.
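The row-projection segmentation described above can be sketched as follows. This is an illustrative sketch only: the function name and the default threshold are assumptions, and a row whose accumulated value equals the first preset value is treated as background, a case the text does not specify.

```python
import numpy as np

def segment_text_lines(corrected, first_preset_value=0):
    """Split a corrected binary image into one cropped image per text line.

    Each pixel row is summed; rows whose sum exceeds the threshold are
    text rows, and each run of consecutive text rows becomes one crop.
    """
    row_sums = corrected.sum(axis=1)          # accumulated value per pixel row
    is_text = row_sums > first_preset_value   # True for text rows

    ranges = []
    start = None
    for y, flag in enumerate(is_text):
        if flag and start is None:
            start = y                         # a text line begins
        elif not flag and start is not None:
            ranges.append((start, y))         # a text line ends at a background row
            start = None
    if start is not None:                     # text line touching the bottom edge
        ranges.append((start, len(is_text)))

    return [corrected[top:bottom] for top, bottom in ranges]
```

Each returned crop contains exactly one run of text rows, which matches the requirement that each text image has one text line.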
Another aspect of the present invention provides an electronic device, including:
at least one processor; and
a memory coupled to the at least one processor; wherein,
the memory stores a program of instructions executable by the at least one processor, the program of instructions being executable by the at least one processor to cause the at least one processor to:
obtaining a binary image of an image to be corrected;
carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
performing text line segmentation on the corrected image, and intercepting and selecting a plurality of text images, wherein each text image has one text line;
and carrying out miscut transformation on the text image, and transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read.
Any of the solutions of the method of the present invention can be implemented by at least one processor of the electronic device calling the relevant instruction program of the memory. The description of the electronic device will not be repeated.
The electronic equipment provided by the embodiment of the invention can be applied to various implementation technologies which need to extract and identify characters in an image file. The electronic equipment can correct inclined characters in the image to be corrected into characters without inclination so as to facilitate subsequent character separation and character recognition.
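The image inclination correction step that the processor performs can be sketched in the same spirit: rotate the binary image through a range of angles and keep the angle whose row projection has the largest standard deviation, as set out in the claims of this document. The sketch below is an assumption-laden illustration: the function names, the default angle range and step, the nearest-neighbour rotation about the image centre, and the sign convention of the angle are all choices made for the example.

```python
import numpy as np

def rotate_nearest(img, angle_deg):
    """Rotate a 2-D image about its centre using nearest-neighbour sampling.

    The output has the same shape as the input; pixels that map outside
    the source image are set to background (zero).
    """
    h, w = img.shape
    theta = np.deg2rad(angle_deg)
    cy, cx = (h - 1) / 2.0, (w - 1) / 2.0
    ys, xs = np.indices((h, w))
    # Inverse mapping: find the source coordinate for every output pixel.
    x0 = np.cos(theta) * (xs - cx) + np.sin(theta) * (ys - cy) + cx
    y0 = -np.sin(theta) * (xs - cx) + np.cos(theta) * (ys - cy) + cy
    xi = np.rint(x0).astype(int)
    yi = np.rint(y0).astype(int)
    valid = (xi >= 0) & (xi < w) & (yi >= 0) & (yi < h)
    out = np.zeros_like(img)
    out[valid] = img[yi[valid], xi[valid]]
    return out

def find_skew_angle(binary, theta1=-10.0, theta2=10.0, s1=1.0):
    """Try angles theta1, theta1 + s1, ... up to theta2 and keep the best one.

    Returns (alpha, maxstd): the rotation angle whose rotated image has the
    largest standard deviation of row sums, and that deviation.
    """
    maxstd = 0.0  # initial maximum standard deviation is zero
    alpha = 0.0   # initial image inclination correction angle is zero
    n_steps = int(np.floor((theta2 - theta1) / s1 + 1e-9))
    for i in range(n_steps + 1):
        t1 = theta1 + i * s1  # current rotation angle
        rotated = rotate_nearest(binary, t1)
        std = float(np.std(rotated.sum(axis=1)))
        if std > maxstd:      # otherwise maxstd and alpha stay unchanged
            maxstd = std
            alpha = t1
    return alpha, maxstd
```

Rotating the binary image by the returned `alpha` would give the corrected image. A production version would likely use a library rotation with interpolation; the hand-written rotation here only keeps the example self-contained.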
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention as defined in the appended claims.
Claims (19)
1. A method for oblique text correction, the method comprising:
obtaining a binary image of an image to be corrected;
carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
performing text line segmentation on the corrected image, and intercepting and selecting a plurality of text images, wherein each text image has one text line;
performing miscut transformation on the text image, and transforming characters in an inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read;
wherein the step of performing text line segmentation on the corrected image, and intercepting and selecting a plurality of text images, comprises:
projecting the corrected image in a second preset direction;
obtaining an accumulated value of each projected pixel row, and comparing the accumulated value with a first preset value;
when the accumulated value is larger than the first preset value, determining the corresponding pixel row as a text line;
when the accumulated value is smaller than the first preset value, determining the corresponding pixel row as a background row;
and according to the determined text line, intercepting and selecting to obtain the text image.
2. The method according to claim 1, wherein the step of performing image tilt correction on the binary image, rotating a text line in the binary image to a first preset direction, and obtaining a corrected image comprises:
rotating the binary image at different angles within a preset angle range;
projecting the binary image after each rotation in a second preset direction;
calculating the standard deviation of the obtained projection sequence when the binary image is projected in a second preset direction after each rotation;
and determining the rotated binary image corresponding to the maximum standard deviation as the corrected image.
3. The method of claim 2, wherein the step of rotating the binary image by different angles within a predetermined angular range comprises:
setting the preset angle range of the binary image for angle rotation to be [θ1, θ2], wherein θ1 < θ2;
determining the adjustment step length for angle rotation as s1 and the current rotation angle t1 as θ1;
and performing an initial rotation with the current rotation angle t1 = θ1, then adding the adjustment step length s1 to the current rotation angle t1, assigning the resulting value to the current rotation angle t1, and performing the next rotation, wherein t1 + s1 is not more than θ2.
4. The method according to claim 3, wherein after the step of calculating the standard deviation of the series of projections obtained when the binary image is projected in the second predetermined direction after each rotation, the method further comprises:
comparing the standard deviation std of the binary image after current rotation with the current maximum standard deviation maxstd;
if the standard deviation std of the binary image after the current rotation is larger than the current maximum standard deviation maxstd, assigning the standard deviation std of the binary image after the current rotation to the current maximum standard deviation maxstd, assigning the current rotation angle t1 to an image inclination correction angle alpha, and performing the next rotation;
if the standard deviation std of the binary image after the current rotation is less than or equal to the current maximum standard deviation maxstd, maintaining the current maximum standard deviation maxstd and the image inclination correction angle alpha unchanged;
when the initial rotation is performed, the current maximum standard deviation maxstd is zero, and the image tilt correction angle α is zero.
5. The method according to claim 4, wherein in the course of performing different angle rotations on the binary image within a predetermined angle range, if the current rotation angle t1 is increased by the adjustment step s1 to obtain a value greater than θ2, the angle rotations on the binary image are stopped;
wherein the step of determining the rotated binary image corresponding to the maximum standard deviation as the corrected image comprises:
extracting the current image inclination correction angle alpha;
and determining that the binary image obtained by rotating the binary image by the current image inclination correction angle alpha is the corrected image.
6. The method according to claim 1, wherein the step of performing the miscut transformation on the text image to transform the characters in an inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain the image to be read comprises:
performing a first preset direction and different tangent value miscut transformation on the text image within a preset tangent value range;
projecting the text image subjected to the miscut transformation in a first preset direction;
calculating the standard deviation of the obtained projection sequence when the text image is projected in a first preset direction after each time of miscut transform;
and determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read.
7. The method of claim 6, wherein the step of subjecting the text image to a first predetermined directional, different tangent value, miscut transform within a predetermined range of tangent values comprises:
setting the predetermined range of tangent values for the text image to undergo a first preset direction, different tangent value miscut transform to [k1, k2], wherein -1 < k1 < k2 < 1;
determining the adjustment step size of the miscut transform to be s2 and the current tangent value t2 to be k1;
and performing an initial miscut transform with the current tangent value t2 = k1, then adding the adjustment step length s2 to the current tangent value t2, assigning the resulting value to the current tangent value t2, and performing the next miscut transform, wherein t2 + s2 is not more than k2.
8. The method of claim 7, wherein after the step of calculating the standard deviation of the obtained projection sequence when the text image is projected in the first preset direction after each time of the miscut transform, the method further comprises:
comparing the standard deviation std of the text image after the current miscut transformation with the current maximum standard deviation maxstd;
if the standard deviation std of the text image after the current miscut transformation is larger than the current maximum standard deviation maxstd, assigning the standard deviation std of the text image after the current miscut transformation to the current maximum standard deviation maxstd, assigning the current tangent value t2 to a character correction confirmation tangent value tan (beta), and performing the next miscut transformation;
if the standard deviation std of the text image after the current miscut transformation is less than or equal to the current maximum standard deviation maxstd, maintaining the current maximum standard deviation maxstd and the character correction confirmation tangent value tan (beta) unchanged;
when performing the initial miscut transform, the current maximum standard deviation maxstd is zero, and the character correction confirmation tangent value tan (β) is zero.
9. The method according to claim 8, wherein in the process of performing the miscut transform on the text image with a first preset direction and different tangent values within a predetermined tangent value range, if the current tangent value t2 is increased by the adjustment step s2 to obtain a value greater than k2, the miscut transform on the text image is stopped;
wherein the step of determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read comprises:
extracting the current character correction confirmation tangent value tan (beta);
and determining that, when the text image is subjected to the miscut transformation in the first preset direction with the current character correction confirmation tangent value tan (beta), the text image after the corresponding miscut transformation is the image to be read.
10. An apparatus for oblique text correction, the apparatus comprising:
the image processing module is used for obtaining a binary image of an image to be corrected;
the text line correction module is used for carrying out image inclination correction on the binary image, rotating the text lines in the binary image to a first preset direction and obtaining a corrected image;
the text line segmentation module is used for performing text line segmentation on the corrected image and selecting a plurality of text images, wherein each text image has one text line;
the character correction module is used for carrying out miscut transformation on the text image, transforming the characters in the inclined state in the text image into a second preset direction perpendicular to the first preset direction, and obtaining an image to be read;
wherein the text line segmentation module comprises:
the third projection calculation unit is used for projecting the corrected image in a second preset direction;
a third comparing unit, configured to obtain an accumulated value of each projected pixel row, and compare the accumulated value with a first preset value;
the text line determining unit is used for determining the corresponding pixel row as a text line when the accumulated value is greater than the first preset value;
a background row determining unit, configured to determine the corresponding pixel row as a background row when the accumulated value is smaller than the first preset value;
and the interception execution unit is used for intercepting and obtaining the text image according to the determined text line.
11. The apparatus of claim 10, wherein the text line correction module comprises:
the angle rotating unit is used for rotating the binary image by different angles within a preset angle range;
the first projection calculation unit is used for projecting the binary image in a second preset direction after each rotation;
the first standard deviation calculation unit is used for calculating the standard deviation of the obtained projection sequence when the binary image is projected in the second preset direction after each rotation;
and the first determining unit is used for determining the binary image after the corresponding rotation when the standard deviation is maximum as the corrected image.
12. The apparatus of claim 11, wherein the angle rotating unit comprises:
a first setting subunit, configured to set the predetermined angle range in which the binary image is angularly rotated to [θ1, θ2], where θ1 < θ2;
a second setting subunit, configured to determine that an adjustment step size for performing angle rotation is s1 and a current rotation angle t1 is θ1;
and a rotation execution subunit, configured to perform an initial rotation with the current rotation angle t1 = θ1, add the adjustment step length s1 to the current rotation angle t1, assign the resulting value to the current rotation angle t1, and perform the next rotation, where t1 + s1 is not more than θ2.
13. The apparatus of claim 12, wherein the text line correction module further comprises:
the first comparison unit is used for comparing the standard deviation std of the binary image after current rotation with the current maximum standard deviation maxstd;
a first execution unit, configured to, if the standard deviation std of the binary image after the current rotation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the binary image after the current rotation to the current maximum standard deviation maxstd, assign the current rotation angle t1 to an image tilt correction angle α, and perform the next rotation;
a second execution unit, configured to, if a standard deviation std of the binary image after the current rotation is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the image tilt correction angle α unchanged;
when the initial rotation is performed, the current maximum standard deviation maxstd is zero, and the image tilt correction angle α is zero.
14. The apparatus of claim 13, wherein the angle rotating unit further comprises:
a first stop determining subunit, configured to, in a process of performing different angle rotations on the binary image within a predetermined angle range, stop performing the angle rotation on the binary image if the current rotation angle t1 is increased by the adjustment step s1 to obtain a value greater than θ2;
wherein the first determination unit includes:
a correction angle extraction subunit, configured to extract a current tilt correction angle α of the image;
a first corrected image determining subunit, configured to determine that the binary image obtained by rotating the binary image by the current image tilt correction angle α is the corrected image.
15. The apparatus of claim 10, wherein the character correction module comprises:
the miscut transformation unit is used for carrying out the miscut transformation of different tangent values in a first preset direction on the text image within a preset tangent value range;
the second projection calculation unit is used for projecting the text image subjected to the miscut transformation in the first preset direction;
the second standard deviation calculating unit is used for calculating the standard deviation of the obtained projection sequence when the text image is projected in the first preset direction after the text image is subjected to the miscut transform every time;
and the second determining unit is used for determining the text image after the corresponding miscut transformation when the standard deviation is maximum as the image to be read.
16. The apparatus of claim 15, wherein the miscut transform unit comprises:
a third setting subunit, configured to set the predetermined tangent value range within which the text image is subjected to the first preset direction, different tangent value miscut transform to [k1, k2], where -1 < k1 < k2 < 1;
a fourth setting subunit, configured to determine that an adjustment step size for performing the miscut transform is s2 and a current tangent value t2 is k1;
and the miscut transformation executing subunit is used for performing an initial miscut transformation with the current tangent value t2 = k1, adding the adjustment step length s2 to the current tangent value t2, assigning the resulting value to the current tangent value t2, and performing the next miscut transformation, wherein t2 + s2 is not more than k2.
17. The apparatus of claim 16, wherein the character correction module further comprises:
the second comparison unit is used for comparing the standard deviation std of the text image after the current miscut transformation with the current maximum standard deviation maxstd;
a third executing unit, configured to, if the standard deviation std of the text image after the current miscut transformation is greater than the current maximum standard deviation maxstd, assign the standard deviation std of the text image after the current miscut transformation to the current maximum standard deviation maxstd, assign the current tangent value t2 to a character correction confirmation tangent value tan (β), and perform the next miscut transformation;
a fourth executing unit, configured to, if a standard deviation std of the text image after the current miscut transform is less than or equal to the current maximum standard deviation maxstd, maintain the current maximum standard deviation maxstd and the character correction confirmation tangent value tan (β) unchanged;
when performing the initial miscut transform, the current maximum standard deviation maxstd is zero, and the character correction confirmation tangent value tan (β) is zero.
18. The apparatus of claim 17, wherein the miscut transform unit further comprises:
a second stop determining subunit, configured to, in a process of performing a first preset direction, different tangent value miscut transform on the text image within a predetermined tangent value range, stop performing the miscut transform on the text image if the current tangent value t2 is increased by the adjustment step length s2 to obtain a value greater than k2;
wherein the second determination unit includes:
a tangent value extracting subunit operable to extract the current character correction confirmation tangent value tan (β);
and the second corrected image determining subunit is configured to determine that, when the text image is subjected to the miscut transform in the first preset direction by using the current character correction confirmation tangent value tan (β), the text image after the corresponding miscut transform is the image to be read.
19. An electronic device, comprising:
at least one processor; and
a memory coupled to the at least one processor; wherein,
the memory stores a program of instructions executable by the at least one processor, the program of instructions being executable by the at least one processor to cause the at least one processor to:
obtaining a binary image of an image to be corrected;
carrying out image inclination correction on the binary image, and rotating a text line in the binary image to a first preset direction to obtain a corrected image;
performing text line segmentation on the corrected image, and intercepting and selecting a plurality of text images, wherein each text image has one text line;
performing miscut transformation on the text image, and transforming characters in an inclined state in the text image into a second preset direction perpendicular to the first preset direction to obtain an image to be read;
wherein the processor performing text line segmentation on the corrected image, and intercepting and selecting a plurality of text images, where each text image has one text line, specifically comprises:
projecting the corrected image in a second preset direction;
obtaining an accumulated value of each projected pixel row, and comparing the accumulated value with a first preset value;
when the accumulated value is larger than the first preset value, determining the corresponding pixel row as a text line;
when the accumulated value is smaller than the first preset value, determining the corresponding pixel row as a background row;
and according to the determined text line, intercepting and selecting to obtain the text image.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610945094.9A CN107992869B (en) | 2016-10-26 | 2016-10-26 | Method and device for correcting tilted characters and electronic equipment |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201610945094.9A CN107992869B (en) | 2016-10-26 | 2016-10-26 | Method and device for correcting tilted characters and electronic equipment |
Publications (2)
Publication Number | Publication Date |
---|---|
CN107992869A (en) | 2018-05-04 |
CN107992869B (en) | 2020-09-22 |
Family ID: 62028772
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201610945094.9A Expired - Fee Related CN107992869B (en) | 2016-10-26 | 2016-10-26 | Method and device for correcting tilted characters and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN107992869B (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN108647681B (en) * | 2018-05-08 | 2019-06-14 | 重庆邮电大学 | A kind of English text detection method with text orientation correction |
CN108681729B (en) * | 2018-05-08 | 2023-06-23 | 腾讯科技(深圳)有限公司 | Text image correction method, device, storage medium and equipment |
CN111723610B (en) * | 2019-03-20 | 2024-03-08 | 北京沃东天骏信息技术有限公司 | Image recognition method, device and equipment |
CN110705546B (en) * | 2019-09-06 | 2023-12-19 | 平安科技(深圳)有限公司 | Text image angle deviation correcting method and device and computer readable storage medium |
CN111967474B (en) * | 2020-09-07 | 2024-04-26 | 凌云光技术股份有限公司 | Text line character segmentation method and device based on projection |
CN112241737B (en) * | 2020-11-12 | 2024-01-26 | 瞬联软件科技(北京)有限公司 | Text image correction method and device |
CN112651401B (en) * | 2020-12-30 | 2024-04-02 | 凌云光技术股份有限公司 | Automatic correction method and system for code spraying character |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN104156718A (en) * | 2014-08-20 | 2014-11-19 | 电子科技大学 | Vehicle license plate image vertical tilt correction method |
CN105069456A (en) * | 2015-07-30 | 2015-11-18 | 北京邮电大学 | License plate character segmentation method and apparatus |
CN105426887A (en) * | 2015-10-30 | 2016-03-23 | 北京奇艺世纪科技有限公司 | Method and device for text image correction |
Also Published As
Publication number | Publication date |
---|---|
CN107992869A (en) | 2018-05-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN107992869B (en) | Method and device for correcting tilted characters and electronic equipment | |
EP3309703B1 (en) | Method and system for decoding qr code based on weighted average grey method | |
EP2270746B1 (en) | Method for detecting alterations in printed document using image comparison analyses | |
CN112183038A (en) | Form identification and typing method, computer equipment and computer readable storage medium | |
CN110647882A (en) | Image correction method, device, equipment and storage medium | |
US8331670B2 (en) | Method of detection document alteration by comparing characters using shape features of characters | |
US20140093172A1 (en) | Image processing device and image processing method | |
CN107066433B (en) | Tables for shifting rotation in images | |
CN109993161B (en) | Text image rotation correction method and system | |
US7969631B2 (en) | Image processing apparatus, image processing method and computer readable medium storing image processing program | |
CN109598185B (en) | Image recognition translation method, device and equipment and readable storage medium | |
JP2009003937A (en) | Method and system for identifying text orientation in digital image, control program and recording medium | |
CN112308063B (en) | Character recognition device, translation pen, image translation method, and image translation device | |
CN105551044B (en) | A kind of picture control methods and device | |
US8538191B2 (en) | Image correction apparatus and method for eliminating lighting component | |
CN106203431A (en) | A kind of image-recognizing method and device | |
US20200242391A1 (en) | Object detection apparatus, object detection method, and computer-readable recording medium | |
CN112419207A (en) | Image correction method, device and system | |
CN109741273A (en) | A kind of mobile phone photograph low-quality images automatically process and methods of marking | |
US20160044196A1 (en) | Image processing apparatus | |
CN102682457A (en) | Rearrangement method for performing adaptive screen reading on print media image | |
CN110991440A (en) | Pixel-driven mobile phone operation interface text detection method | |
CN110610163B (en) | Table extraction method and system based on ellipse fitting in natural scene | |
CN113076952A (en) | Method and device for automatically identifying and enhancing text | |
US20230071008A1 (en) | Computer-readable, non-transitory recording medium containing therein image processing program for generating learning data of character detection model, and image processing apparatus |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20200922 |