WO2019119030A1 - Image analysis - Google Patents
Image analysis
- Publication number: WO2019119030A1 (PCT/AU2018/051347)
- Authority: WIPO (PCT)
- Prior art keywords: indicia, image, feature, resolved, location
- Prior art date: 18 December 2017 (priority date)
Classifications
- G06V 30/133 — Character recognition; evaluation of quality of the acquired characters
- G06V 30/412 — Layout analysis of documents structured with printed lines or input boxes, e.g. business forms or tables
- G06F 16/319 — Information retrieval; indexing structures; inverted lists
- G06F 18/2413 — Pattern recognition; classification techniques based on distances to training or reference patterns
- G06V 30/127 — Character recognition; detection or correction of errors with the intervention of an operator
- G06V 30/148 — Character recognition; segmentation of character regions
- G06V 30/413 — Analysis of document content; classification of content, e.g. text, photographs or tables
- G06V 30/10 — Character recognition
Abstract
An image analysing system (100) includes a conversion module (102) that converts an image to segmented data. An image recognition module (110) selects a first image subset from a first set of images and extracts a second set of indicia from the first image subset. An indicia recognition module (114) recognises indicia in the first and second sets of indicia. The indicia recognition module (114) generates sets of resolved and unresolved indicia. A classifier (120) classifies resolved indicia by comparing the set of resolved indicia with a classification framework and extracting at least one feature that includes at least one indicium from the set of resolved indicia. A feature locator (124) determines at least one indicia location in the image associated with indicia in the set of unresolved indicia, the feature locator bookmarking the at least one indicia location with an indicia bookmark.
Description
"Image Analysis"
Cross-Reference to Related Applications
[0001] The present application claims priority from Australian Provisional Patent
Application No 2017905041 filed on 18 December 2017, the contents of which are incorporated herein by reference in their entirety.
Technical Field
[0002] The present disclosure relates to an image analysing system and a method of analysing an image that includes both text and images.
Background
[0003] Images in electronic format can be challenging to analyse, navigate and/or extract information from. If there is information in an image, there are limited tools available to find that information, or to search the image for the information. Existing optical character recognition (OCR) technology is able to find text within images, but does not always provide an accurate result. Also, there are limited tools with which to navigate through or edit images that have been OCR-ed. If the same type of information is required from different types or styles of images, it can be difficult for a user to find that information when visually inspecting the images.
[0004] Any discussion of documents, acts, materials, devices, articles or the like which has been included in the present specification is not to be taken as an admission that any or all of these matters form part of the prior art base or were common general knowledge in the field relevant to the present disclosure as it existed before the priority date of each claim of this application.
Summary
[0005] In one aspect there is provided an image analysing system which includes: a conversion module that converts an image to segmented data including a first set of indicia and a first set of images; an image recognition module that selects a first image subset from
the first set of images and extracts a second set of indicia from the first image subset; an indicia recognition module that recognises indicia in the first set of indicia and indicia in the second set of indicia, wherein the indicia recognition module generates a set of resolved indicia and a set of unresolved indicia; a classifier that classifies resolved indicia by:
comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia; and a feature locator that determines at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia, wherein the feature locator bookmarks the at least one indicia location with an indicia bookmark.
[0006] The system may further include a user interface enabling a user to: access the at least one indicia location via the indicia bookmark, and manipulate the unresolved indicia to form resolvable indicia.
[0007] The classifier may classify the resolvable indicia to extract at least one further feature from the resolvable indicia.
[0008] The image recognition module may select a second image subset from the first set of images, the feature locator may determine at least one image location in the image associated with one or more images in the second image subset, and the feature locator may bookmark the at least one image location with an image bookmark.
[0009] The user interface may further enable the user to: access the at least one image location via the image bookmark, and manipulate the one or more images in the second image subset to form further resolvable indicia.
[0010] The classifier may classify the further resolvable indicia to extract at least one additional feature from the further resolvable indicia.
[0011] The user interface may display the indicia bookmark and the image bookmark to be visible on the image at the at least one indicia location and the at least one image location respectively.
[0012] Extracting the at least one feature may include displaying the at least one extracted feature on the user interface. The at least one extracted feature may be displayed in a segmented and editable format.
[0013] In another embodiment, the system may include an initial, primary stage configured to determine the relevance of all items of the image.
[0014] The primary stage may include a primary stage classifier configured to analyse the items constituting the image, determine the relevance of each of the items, discard the items which are not of interest and forward items of interest for further processing to the image analysis classifier. The primary stage classifier may employ machine learning to conduct the analysis.
[0015] The primary stage may be configured to remove extraneous matter from an item determined by the primary stage classifier to be an item of interest before the item of interest is forwarded to the image analysis classifier.
[0016] In another aspect there is provided an image analysing system which includes: a conversion module that provides an image that includes segmented data including a first set of indicia; an indicia recognition module that recognises indicia in the first set of indicia, wherein the indicia recognition module generates a set of resolved indicia and a set of unresolved indicia; a classifier that classifies resolved indicia to find at least one feature that includes at least one indicium from the set of resolved indicia; and a feature locator that determines at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia.
[0017] The feature locator may bookmark the at least one indicia location with an indicia bookmark.
[0018] The conversion module may further provide a first set of images; the system may further include an image recognition module that extracts a second set of indicia from the first set of images; and the indicia recognition module may recognise further indicia in the second set of indicia, and may add the further indicia to at least one of the set of resolved indicia and the set of unresolved indicia.
[0019] The classifier may classify the set of resolved indicia including the further indicia.
[0020] The classifier may classify the set of resolved indicia by: comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia.
[0021] Extracting the at least one feature may include displaying the at least one extracted feature on the user interface.
[0022] In another aspect there is provided a method of analysing an image, the method including: providing an image that includes segmented data including a first set of indicia; recognising indicia in the first set of indicia; generating a set of resolved indicia and a set of unresolved indicia; classifying resolved indicia to find at least one feature that includes at least one indicium from the set of resolved indicia; and determining at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia.
[0023] The method may further include bookmarking the at least one indicia location with an indicia bookmark.
[0024] The image may further include a first set of images, and the method may further include: extracting a second set of indicia from the first set of images; recognising further indicia in the second set of indicia; and adding the further indicia to at least one of the set of resolved indicia and the set of unresolved indicia.
[0025] The providing may include converting the image to the segmented data including the first set of indicia and the first set of images.
[0026] The classifying may include: comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia.
[0027] The extracting at least one feature may include displaying the at least one extracted feature on a user interface.
[0028] In an embodiment, the method may include, initially, determining the relevance of all items of the image.
[0029] The method may include analysing the items constituting the image: determining the relevance of each of the items; discarding the items which are not of interest; and forwarding items of interest for further processing.
[0030] The method may include cleansing the items of interest of extraneous material prior to forwarding for further processing.
[0031] Throughout this specification the words “comprise” or “include”, or variations such as “comprises”, “comprising”, “includes” or “including”, will be understood to imply the inclusion of a stated element, integer or step, or group of elements, integers or steps, but not the exclusion of any other element, integer or step, or group of elements, integers or steps.
Brief Description of Drawings
[0032] Embodiments of the disclosure are now described by way of example with reference to the accompanying drawings in which:-
[0033] Fig. 1 is a schematic representation of a first embodiment of an image analysing system;
[0034] Fig. 2 is a representation of an embodiment of a user interface of an image analysing system;
[0035] Fig. 3 is a flow diagram of a first embodiment of a method of analysing an image;
[0036] Fig. 4 is a schematic representation of a second embodiment of an image analysing system; and
[0037] Fig. 5 is a flow diagram of a second embodiment of a method of analysing an image.
Detailed Description of Exemplary Embodiments
[0038] Referring initially to Fig. 1 of the drawings, a first embodiment of an image analysing system 100 includes a conversion module 102 that converts an image 104 to segmented data. The segmented data includes a first set of indicia 106 and a first set of images 108. The system 100 includes an image recognition module 110 that selects a first image subset from the first set of images 108 and extracts a second set of indicia 112 from the first image subset. The image recognition module 110 may include a known optical character recognition (OCR) application such as OmniPage® available from Nuance Communications of 1 Wayside Road, Burlington, MA, 01803, USA.
[0039] The system 100 includes an indicia recognition module 114 that recognises indicia in the first set of indicia 106 and indicia in the second set of indicia 112. The indicia recognition module 114 generates a set of resolved indicia 116 and a set of unresolved indicia 118. The system includes an image analysis classifier, or classifier, 120 that classifies resolved indicia by comparing the set of resolved indicia with a classification framework, and extracting at least one feature 122 that includes at least one indicium from the set of resolved indicia. The system 100 includes a feature locator 124 that determines at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia, and the feature locator 124 bookmarks the at least one indicia location with a bookmark 126, in particular an indicia bookmark.
[0040] The system 100 also includes a user interface 130 that enables a user to access the at least one indicia location via the indicia bookmark 126, and to manipulate the unresolved indicia to form resolvable indicia 132. The classifier 120 then classifies the resolvable indicia 132 to extract at least one further feature 122 from the resolvable indicia 132.
[0041] In some embodiments the image recognition module 110 may select a second image subset from the first set of images, and then the feature locator 124 determines at least one image location in the image associated with one or more images in the second image subset. The feature locator 124 bookmarks the at least one image location with an image bookmark 126. In these embodiments, the user interface 130 further enables the user to access the at least one image location via the image bookmark 126, and to manipulate the one or more images in the second image subset to form further resolvable indicia 132. The classifier 120
classifies the further resolvable indicia 132 to extract at least one additional feature 122 from the further resolvable indicia 132.
[0042] The system 100 may be implemented on a suitable standard computer. The computer may be set up to run a virtual machine that has dedicated CPUs, for example 4 virtual CPUs, each being an Intel Core 2 Duo T7700 at 2.40 GHz, and the virtual machine having at least 8 GB RAM. The system 100 may be implemented using any suitable software, for example Python v3.6 using virtualenv and Python modules as required.
[0043] Fig. 2 of the drawings shows an example of an embodiment of the user interface 130 of the image analysing system 100 that includes a display 200. The image 104 (or a portion of the image 104) is displayed on the user interface 130 at a first display location 201, for example, in the bottom right hand side of the display 200. The image 104 (or portion of the image) is displayed in such a manner that a user can scroll, pan or otherwise navigate around the image 104 to view any part of the image 104. In this example the image 104 includes a scanned electronic document, for example a Portable Document Format (pdf) document. The image 104 is converted to segmented data, for example with the use of known optical character recognition (OCR) technology. As used herein, “segmented data” refers to data including one or more segments of data that can be electronically edited, searched, and/or otherwise processed.
[0044] The segmented data includes indicia in the form of text 202. The indicia recognition module 114 includes text recognition functionality, and as such is able to recognise at least some of the text 202. Text that is recognised forms part of a set of resolved indicia. Text that is not recognised (or not recognised with a certainty above a defined indicia threshold) forms part of a set of unresolved indicia.
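By way of illustration only, the split into resolved and unresolved indicia can be sketched as a confidence-gated partition of OCR output. The sketch below uses pytesseract as a stand-in for the OCR engine (the patent names OmniPage); the threshold value and file name are assumptions, not details from the patent.

```python
# Minimal sketch (assumed tooling): partition OCR output into resolved
# and unresolved indicia using a per-word confidence threshold.
import pytesseract
from PIL import Image

INDICIA_THRESHOLD = 80.0  # assumed confidence threshold (0-100)

data = pytesseract.image_to_data(Image.open("scan.png"),
                                 output_type=pytesseract.Output.DICT)

resolved, unresolved = [], []
for text, conf, left, top in zip(data["text"], data["conf"],
                                 data["left"], data["top"]):
    if not text.strip():
        continue  # skip empty layout tokens
    # Keep the location so unresolved indicia can be bookmarked later.
    token = {"text": text, "conf": float(conf), "location": (left, top)}
    if float(conf) >= INDICIA_THRESHOLD:
        resolved.append(token)
    else:
        unresolved.append(token)
```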
[0045] The classifier 120 classifies one or more features that contain at least one number, letter, symbol, or word of text from the recognised text. Classification includes matching a feature label defined in a classification framework with one or more suitable features present in the image 104. The classifier 120 includes a machine learning module, and classification may be performed using any suitable machine learning process such as a recurrent neural network (RNN), for example named-entity recognition (NER). NER is used for information extraction in order to locate and classify named entities, and the pre-defined categories are defined in the classification framework. NER methods that may be used include known methods such as Stanford NER and/or NeuroNER (available from http://neuroner.com/).
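As a minimal sketch of this classification step, the following uses spaCy's pre-trained entity recogniser in place of Stanford NER or NeuroNER; the mapping from entity types to feature labels, and the sample text, are invented for illustration and do not come from the patent.

```python
# Minimal sketch (assumed tooling): locate and classify named entities
# in resolved text, then map them to feature labels defined in a
# classification framework (the mapping below is illustrative).
import spacy

nlp = spacy.load("en_core_web_sm")  # assumes this model is installed

# Illustrative classification framework: entity type -> feature label.
framework = {"PERSON": "Insured Name",
             "DATE": "Policy Date",
             "MONEY": "Premium Amount"}

resolved_text = "John Smith paid $1,200.00 on 18 December 2017."

for ent in nlp(resolved_text).ents:
    label = framework.get(ent.label_)
    if label:
        # ent.start_char gives the offset to bookmark in the source.
        print(label, "->", ent.text, "at offset", ent.start_char)
```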
[0046] It will be appreciated that the use of a recurrent neural network allows the system 100 to be improved through learning. In particular, the RNN is able to be trained with new data, including data from new images (for example, from new clients), to learn new fields. This enables the system 100 to self-learn and improve accuracy over time.
[0047] Several features 122 are displayed at a second display location 206, for example, on the left hand side of the display 200. Each feature 122 has a feature label 208. As described above, the system 100 extracts the features 122 from the image 104. These extracted features 122 are displayed on the display 200 of the user interface 130, identified by the relevant feature labels 208 associated with the respective features as illustrated.
[0048] When a feature is found and matched to a feature label 208, the feature location where that feature is located in the image 104 is bookmarked and tagged with the feature label 208.
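A bookmark of this kind can be thought of as a small record tying a feature label to the location of the feature within the image. The structure below is an assumption for illustration only; the patent does not specify how bookmarks are represented.

```python
# Minimal sketch (assumed structure): a bookmark ties a feature label
# to the location in the image 104 where the feature was found.
from dataclasses import dataclass

@dataclass
class Bookmark:
    label: str    # feature label 208, e.g. "Policy Date" (illustrative)
    page: int     # page of the image 104 containing the feature
    bbox: tuple   # (left, top, width, height) within that page

bookmarks = [Bookmark(label="Policy Date", page=3, bbox=(120, 340, 80, 14))]

# Selecting a feature label navigates to the bookmarked location.
target = next(b for b in bookmarks if b.label == "Policy Date")
print(target.page, target.bbox)
```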
[0049] The display 200 allows the user to search for and locate instances of the features by selecting the feature label 208 displayed within the second display location 206. The system 100 locates the feature by navigating through the image 104 using the relevant feature bookmark that is tagged with the feature label 208 for which the user searches. An image subset 212 containing an instance of the feature associated with the selected feature label is displayed at a third display location 214. Referring to the image subset 212, the user is able to verify the details of the feature as it appears in the original image 104, and the user can then amend the feature as displayed in the relevant feature field if necessary.
[0050] In some embodiments the features 122 are extracted with an associated measure of accuracy. The measure of accuracy lies above an accuracy threshold for features considered to have been relatively accurately extracted, or the measure of accuracy lies below the accuracy threshold for extracted features that may include an error. The display 200 may include an indicator of the measure of accuracy associated with a particular feature 122. For example, a feature field 210 may display a coloured border indicative of the relevant measure of accuracy. In this example, the feature fields include a green border 230 for features
considered to have been accurately extracted, the feature fields include an amber border 232 for features with an associated measure of accuracy below the accuracy threshold, and the feature fields include a red border 234 for features which have not been extracted.
[0051] The first display location 201 includes a feature locator 216 in the form of a search field. When a user enters a feature label 208 into a feature locator field 218 of the feature locator 216, the user is able to navigate through the image to view instances of features that are associated with the entered feature label 208. In some embodiments, the feature locator 216 allows the user to navigate through the image to view possible instances of features associated with the entered feature label 208, “possible instances” being identified as potential features with an associated measure of accuracy below the defined accuracy threshold.
[0052] The feature locator 216 also allows the user to search for indicia by entering letters, numbers or symbols into the feature locator field 218, and navigating through the image 104 using the navigation controls 240.
[0053] The display 200 allows the user to navigate or browse through the image 104 and to place one or more feature labels 208 at selected locations within the image to associate the placed feature labels 208 with one or more features identified by the user within the image 104. The display 200 also allows the user to navigate or browse through the image 104 and to insert data and/or metadata, for example in the form of text. In this way the user is able to manipulate unresolved indicia to form resolvable indicia, and/or manipulate one or more images (for example a picture or hand-writing) to form further resolvable indicia. These resolvable indicia added by the user are also classified as appropriate based on the feature labels defined in the classification framework. Manipulating the image in this way may be performed, for example, by an annotation tool such as BRAT (available from http://brat.nlplab.org/).
[0054] Unrecognised text in the set of unresolved indicia may include classifiable features. In order to facilitate further processing of the unresolved indicia, the feature locator 124 determines one or more indicia locations associated with one or more unrecognised numbers, letters, symbols, or words of text. In some embodiments, the feature locator 124 bookmarks the indicia locations.
[0055] Where a feature associated with a particular feature label has not been extracted, the user interface 130 allows a user to enter feature data into the relevant feature field 210.
Similarly, the user interface 130 allows a user to amend any of the features displayed in the feature fields.
[0056] The image 104 may include a set of images having one or more images, for example scanned-in hand-written words 204. In embodiments that include an image recognition module 110, a hand-written word, referred to herein as “an image subset”, is selected and processed by the image recognition module 110 in order to extract a set of indicia, i.e. one or more numbers, letters, symbols, or words of text, from the hand-written word. These extracted indicia may include resolved and unresolved indicia, and the resolved indicia are also classified, labelled, and bookmarked.
[0057] Fig. 3 of the drawings illustrates a first embodiment of a method 300 of analysing an image. At 302 an image 104 is provided. The image 104 includes segmented data including a first set of indicia 304. In some embodiments the providing includes converting the image to the segmented data including the first set of indicia and a first set of images. In other embodiments the image 104 is input to the method already segmented, for example, a scanned document may be uploaded. At 308 indicia in the first set of indicia 304 are recognised, and at 310 a set of resolved indicia 312 and a set of unresolved indicia 314 are generated. Steps 308 and 310 may be executed for example using known text recognition or OCR tools.
[0058] At 316 one or more resolved indicia are classified to find at least one feature 318 that includes at least one indicium from the set of resolved indicia 312. The classifying includes comparing one or more indicia in the set of resolved indicia 312 with a classification framework, and extracting at least one feature 318 that includes at least one indicium from the set of resolved indicia 312. When the feature is extracted, the feature is displayed on the user interface 130 as illustrated in Fig. 2 of the drawings. Classification may be performed using any suitable machine learning process, for example named-entity recognition (NER). NER methods that may be used include known methods such as Stanford NER and/or NeuroNER.
[0059] At 320 at least one indicia location 322 in the image associated with one or more indicia in the set of unresolved indicia is determined. In some embodiments, the method
further includes bookmarking 324 the at least one indicia location 322 with an indicia bookmark 326.
[0060] Where the image 104 also includes a set of images 306 (including, for example, hand-written words or other pictures), the method further includes extracting 330 a further set of indicia 332 from the set of images 306, recognising further indicia in the further set of indicia 332 (as at 308, using known text recognition or OCR tools), and adding the further indicia to at least one of the set of resolved indicia 312 and the set of unresolved indicia 314.
[0061] The system and methods described herein facilitate navigating through an image based on the bookmarks and the feature labels. These bookmarks and feature labels are inserted into locations of the image automatically by the system, but bookmarks and feature labels can also be edited or added by a user on inspection of the image or a part of the image. Furthermore, where the system is unable to resolve indicia or accurately identify features within the image, the user interface and the bookmarks facilitate inspection of such indicia or features. By selecting feature labels the user can edit the details of the feature associated with that label. The user is also able to insert feature labels into the image at user-defined locations. Accordingly, the system and methods described herein more efficiently display information about unresolved indicia, and also allow a user to quickly navigate and resolve the indicia to create an updated or new image or document.
[0062] The user interface provides a side by side view of two representations: one part that typically has a familiar layout (e.g. the second display location 206) that a user would be familiar with and know how and where to locate information, and another part (e.g. the first display location 201) with an appearance varying depending on the particular image being considered. Where different images are likely to have different appearances, this two-part display facilitates the analysis and understanding of the content of images.
[0063] Referring now to Figs 4 and 5 of the drawings, a second embodiment of an image analysing system and a method of analysing an image, respectively, are illustrated. With reference to previous drawings, like reference numerals refer to like parts, unless otherwise specified.
[0064] In this embodiment, the system 100 includes a primary stage 402, with the previously described part of the system 100 forming a downstream, secondary stage 404. The secondary stage 404 is as described above and is not described further.
[0065] The primary stage 402 is used initially to determine if all items in an image containing a bundle of items are items of interest. For example, in the case of a multi-page document (the bundle of items), each page constitutes an item. In such multi-page documents, there are numerous pages which contain routine information that does not require analysis or extraction. Thus, the primary stage 402 uses a text classification machine learning algorithm or classifier 406. An example of a suitable classification algorithm is the Naive Bayes classification method. Other examples of machine learning algorithms which could be used as the classifier 406 are Support Vector Machines (SVMs) and Random Forest classifiers.
[0066] The image provided to the classifier 406 is one which has already undergone optical character recognition. In the classifier 406, each page of the image is opened as a text file and the classifier then determines whether the page is to be kept or discarded. Each page is represented as a vector (list) of features, each feature being a unique word that may be seen on the page. The machine learning algorithm is trained on examples of what does and does not constitute a page of interest; in other words, it is trained to recognise patterns and combinations of words that identify pages of interest. In effect, the classifier 406 learns which words are correlated with pages of interest.
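A sketch of such a classifier, assuming scikit-learn, is shown below: pages become bag-of-words vectors and a naive Bayes model is fitted to labelled examples. The training pages and labels here are purely illustrative assumptions.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Hypothetical labelled examples: 1 = page of interest, 0 = routine page.
pages = ["rent review payable annually under the lease",
         "this page intentionally left blank"]
labels = [1, 0]

classifier = make_pipeline(CountVectorizer(), MultinomialNB())
classifier.fit(pages, labels)

def keep_page(page_text):
    """Return True if the classifier 406 deems the page an item of interest."""
    return classifier.predict([page_text])[0] == 1
```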
[0067] Hence, as shown at 502 in Fig. 5 of the drawings, the image, containing multiple items, is provided to the classifier 406 of the primary stage 402. The classifier 406 interrogates each item (page of the document) at 504 to determine if the item is an item of interest or not. If the classifier 406 determines that the item is not an item of interest, it is discarded at 506.
[0068] If the classifier 406 determines that the item is an item of interest, the classifier 406 performs a data cleansing operation at 508 to remove extraneous matter prior to forwarding the image 104 to the secondary stage 404, where the image 104 is processed as described above with reference to Fig. 3 of the drawings.
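A minimal sketch of the cleansing operation 508, assuming simple rule-based removal of extraneous matter (page-number footers, blank lines); the patterns shown are illustrative assumptions, not part of the embodiment.

```python
import re

def cleanse(page_text):
    """Remove extraneous matter before forwarding to the secondary stage 404."""
    kept = []
    for line in page_text.splitlines():
        line = line.strip()
        if re.fullmatch(r"page \d+( of \d+)?", line, re.IGNORECASE):
            continue  # drop page-number footers
        if not line:
            continue  # drop blank lines
        kept.append(line)
    return "\n".join(kept)
```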
[0069] It will be appreciated that, in this embodiment, the conversion module 102 of the secondary stage 404 is operable only to segment the data from the image 104 into the indicia 106 and the images 108, the image 104 already having undergone OCR.
[0070] As a further development of the display 200 of the user interface 130, the second display location 206 is divided into searchable "Required Fields" 250 and "Selected Fields" 252 (Fig. 2). The required fields 250 are as described above. The selected fields 252 comprise a plurality of text boxes 254, one of which is shown in Fig. 2 of the drawings. It will be appreciated that the selected fields 252 will be made up, in use, of a number of text boxes which are able to be populated by a user. In particular, the text boxes 254 are able to be populated with information which the user may wish to bookmark.
[0071] The system 100 incorporates additional searching capabilities using search engine technology. An example of suitable search engine technology is Elasticsearch (available at https://www.elastic.co/), which is built on an inverted index. The selected fields 252 are, as indicated above, client-specific fields populated by the user. The search engine points to where, in the original document, specific words occur.
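The following is a minimal, in-memory sketch of the inverted index that such a search engine relies on: each word maps to the positions at which it occurs in the original document. A production system would delegate this to Elasticsearch; the pure-Python version here is only for illustration.

```python
from collections import defaultdict

def build_inverted_index(pages):
    """Map each word to a list of (page_number, word_position) pairs."""
    index = defaultdict(list)
    for page_no, page_text in enumerate(pages):
        for pos, word in enumerate(page_text.lower().split()):
            index[word].append((page_no, pos))
    return index

# Usage: locate every occurrence of a client-specific selected-field term.
index = build_inverted_index(["annual rent review", "term of the lease"])
print(index["rent"])  # [(0, 1)] -> page 0, word position 1
```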
[0072] The search engine uses a further text classifier 408 (Fig. 4) to distinguish between different parts of the image. The text classifier 408 receives resolved indicia 116 containing words of interest for the selected fields. The text classifier 408 determines the relevance of the location of the words and classifies each passage according to the probability of that relevance. Relevant passages are then sent to the text box 254 as bookmarked information 410 for display. Multiple pieces of bookmarked information are able to be displayed in one text box 254, and a user is able to navigate through those pieces of bookmarked information using "Previous" and "Next" labels 256 and 258, respectively.
[0073] As a further enhancement of the system 100, the inverted index tool employed is also used to cross-reference and check the accuracy of any feature 122 extracted by the recurrent neural network (RNN) of the secondary stage 404. For example, in the case of a lease document, if the RNN extracts a feature regarding particulars of a rent review, the inverted index tool employs a validation routine to assess the accuracy of the extracted feature. If the inverted index tool is unable to locate the relevant feature in the document, an error message or a request for more information is generated. Conversely, if the RNN has failed to extract a feature that the inverted index tool has located, that outcome can be used to train the RNN how, and where, to find the relevant feature in the future. This provides an additional aid in the self-learning and training of the RNN of the system 100.
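A sketch of this cross-check is given below, under the simplifying assumption that a feature checks out when all of its words can be located in the inverted index; the return values mirror the error-message and request-for-information behaviour described above.

```python
def cross_check_feature(feature_text, index):
    """Validate an RNN-extracted feature 122 against the inverted index."""
    words = feature_text.lower().split()
    missing = [w for w in words if w not in index]
    if missing:
        # Inverted index cannot locate the feature: flag it for review.
        return {"ok": False, "message": f"feature words not found: {missing}"}
    return {"ok": True, "message": "feature located in document"}
```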
[0074] It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the above-described embodiments, without departing from the broad general scope of the present disclosure. The present embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.
Claims
1. An image analysing system which includes:
a conversion module that converts an image to segmented data including a first set of indicia and a first set of images;
an image recognition module that selects a first image subset from the first set of images and extracts a second set of indicia from the first image subset;
an indicia recognition module that recognises indicia in the first set of indicia and indicia in the second set of indicia, wherein the indicia recognition module generates a set of resolved indicia and a set of unresolved indicia;
an image analysis classifier that classifies resolved indicia by:
comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia; and
a feature locator that determines at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia, wherein the feature locator bookmarks the at least one indicia location with an indicia bookmark.
2. The system of claim 1, further including a user interface enabling a user to:
access the at least one indicia location via the indicia bookmark, and
manipulate the unresolved indicia to form resolvable indicia.
3. The system of claim 2, wherein the image analysis classifier classifies the resolvable indicia to extract at least one further feature from the resolvable indicia.
4. The system of claim 2 or 3, wherein:
the image recognition module selects a second image subset from the first set of images,
the feature locator determines at least one image location in the image associated with one or more images in the second image subset, and
the feature locator bookmarks the at least one image location with an image bookmark.
5. The system of claim 4, wherein the user interface further enables the user to:
access the at least one image location via the image bookmark, and
manipulate the one or more images in the second image subset to form further resolvable indicia.
6. The system of claim 5, wherein the image analysis classifier classifies the further resolvable indicia to extract at least one additional feature from the further resolvable indicia.
7. The system of any one of claims 4 to 6, wherein the user interface displays the indicia bookmark and the image bookmark to be visible on the image at the at least one indicia location and the at least one image location respectively.
8. The system of any one of claims 2 to 7, wherein extracting the at least one feature includes displaying the at least one extracted feature on the user interface.
9. The system of claim 8, wherein the at least one extracted feature is displayed in a segmented and editable format.
10. The system of any one of the preceding claims which includes an initial, primary stage configured to determine the relevance of all items of the image.
11. The system of claim 10 in which the primary stage includes a primary stage classifier configured to analyse the items constituting the image, determine the relevance of each of the items, discard the items which are not of interest and forward items of interest for further processing to the image analysis classifier.
12. The system of claim 11 in which the primary stage classifier employs machine learning to conduct the analysis.
13. The system of claim 11 or claim 12 in which the primary stage is configured to remove extraneous matter from an item determined by the primary stage classifier to be an item of interest before the item of interest is forwarded to the image analysis classifier.
14. An image analysing system which includes:
a conversion module that provides an image that includes segmented data including a first set of indicia;
an indicia recognition module that recognises indicia in the first set of indicia, wherein the indicia recognition module generates a set of resolved indicia and a set of unresolved indicia;
an image analysis classifier that classifies resolved indicia to find at least one feature that includes at least one indicium from the set of resolved indicia; and
a feature locator that determines at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia.
15. The system of claim 14, wherein the feature locator bookmarks the at least one indicia location with an indicia bookmark.
16. The system of claim 14 or 15 wherein the conversion module further provides a first set of images;
the system further including an image recognition module that extracts a second set of indicia from the first set of images; and wherein the indicia recognition module recognises further indicia in the second set of indicia, and adds the further indicia to at least one of the set of resolved indicia and the set of unresolved indicia.
17. The system of claim 16, wherein the image analysis classifier classifies the set of resolved indicia including the further indicia.
18. The system of any one of claims 14 to 17, wherein the image analysis classifier classifies the set of resolved indicia by:
comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia.
19. The system of claim 18, wherein extracting the at least one feature includes displaying the at least one extracted feature on the user interface.
20. A method of analysing an image, the method including:
providing an image that includes segmented data including a first set of indicia;
recognising indicia in the first set of indicia;
generating a set of resolved indicia and a set of unresolved indicia;
classifying resolved indicia to find at least one feature that includes at least one indicium from the set of resolved indicia; and
determining at least one indicia location in the image associated with one or more indicia in the set of unresolved indicia.
21. The method of claim 20 further including bookmarking the at least one indicia location with an indicia bookmark.
22. The method of claim 20 or 21 wherein the image further includes a first set of images, and the method further including:
extracting a second set of indicia from the first set of images;
recognising further indicia in the second set of indicia; and
adding the further indicia to at least one of the set of resolved indicia and the set of unresolved indicia.
23. The method of claim 22, wherein the providing includes converting the image to the segmented data including the first set of indicia and the first set of images.
24. The method of any one of claims 20 to 23, wherein the classifying includes:
comparing the set of resolved indicia with a classification framework, and extracting at least one feature that includes at least one indicium from the set of resolved indicia.
25. The method of claim 24, wherein the extracting at least one feature includes displaying the at least one extracted feature on a user interface.
26. The method of any one of claims 20 to 25 which includes, initially, determining the relevance of all items of the image.
27. The method of claim 26 which includes:
analysing the items constituting the image;
determining the relevance of each of the items;
discarding the items which are not of interest; and
forwarding items of interest for further processing.
28. The method of claim 27 which includes cleansing the items of interest of extraneous material prior to forwarding for further processing.
Applications Claiming Priority (2)

Application Number | Priority Date | Filing Date | Title
---|---|---|---
AU2017905041 | 2017-12-18 | |
AU2017905041A0 | 2017-12-18 | | Image Analysis
Publications (1)

Publication Number | Publication Date
---|---
WO2019119030A1 | 2019-06-27
Family ID: 61973055

Family Applications (1)

Application Number | Title | Priority Date | Filing Date
---|---|---|---
PCT/AU2018/051347 | Image analysis | 2017-12-18 | 2018-12-17

Country Status (2)

Country | Link
---|---
AU | AU2018100324B4
WO | WO2019119030A1
2018
- 2018-03-15: AU application AU2018100324A — AU2018100324B4 (not active, ceased)
- 2018-12-17: WO application PCT/AU2018/051347 — WO2019119030A1 (active, application filing)
Patent Citations (6)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
US5555362A | 1991-12-18 | 1996-09-10 | International Business Machines Corporation | Method and apparatus for a layout of a document image
US5455875A | 1992-12-15 | 1995-10-03 | International Business Machines Corporation | System and method for correction of optical character recognition with display of image segments according to character data
US7289685B1 | 2002-04-04 | 2007-10-30 | Ricoh Co., Ltd. | Paper based method for collecting digital data
US7293712B2 | 2004-10-05 | 2007-11-13 | Hand Held Products, Inc. | System and method to automatically discriminate between a signature and a dataform
US20110255782A1 | 2010-01-15 | 2011-10-20 | Copanion, Inc. | Systems and methods for automatically processing electronic documents using multiple image transformation algorithms
US20140245120A1 | 2013-02-28 | 2014-08-28 | Ricoh Co., Ltd. | Creating Tables with Handwriting Images, Symbolic Representations and Media Images from Forms

Non-Patent Citations (1)

MOLL, M.A. ET AL.: "Segmentation-based retrieval of document images from diverse collections", Document Recognition and Retrieval XV, vol. 6815, 2008, XP055620370
Cited By (2)

Publication number | Priority date | Publication date | Assignee | Title
---|---|---|---|---
WO2022064409A1 | 2020-09-28 | 2022-03-31 | International Business Machines Corporation | Optimized data collection of relevant medical images
US11380433B2 | 2020-09-28 | 2022-07-05 | International Business Machines Corporation | Optimized data collection of relevant medical images
Also Published As

Publication Number | Publication Date
---|---
AU2018100324A4 | 2018-04-26
AU2018100324B4 | 2018-07-19
Legal Events

Code | Title | Description
---|---|---
121 | EP: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 18890731; Country of ref document: EP; Kind code of ref document: A1
NENP | Non-entry into the national phase | Ref country code: DE
122 | EP: PCT application non-entry in European phase | Ref document number: 18890731; Country of ref document: EP; Kind code of ref document: A1