CN113837216B - Data classification method, training device, medium and electronic equipment - Google Patents

Data classification method, training device, medium and electronic equipment

Info

Publication number: CN113837216B
Authority: CN (China)
Prior art keywords: layer, coding module, module, output, coding
Legal status: Active
Application number: CN202110610877.2A
Other languages: Chinese (zh)
Other versions: CN113837216A
Inventors: 谭维, 李松南
Current Assignee: Tencent Technology (Shenzhen) Co., Ltd.
Original Assignee: Tencent Technology (Shenzhen) Co., Ltd.
Application filed by Tencent Technology (Shenzhen) Co., Ltd.; priority to CN202110610877.2A; publication of application CN113837216A; application granted and publication of CN113837216B.

Classifications

    • G06F18/24 Classification techniques (G Physics → G06 Computing; Calculating or Counting → G06F Electric digital data processing → G06F18/00 Pattern recognition → G06F18/20 Analysing)
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting (→ G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation)

Landscapes

  • Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Artificial Intelligence (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application provide a data classification method, a training method and apparatus, a medium and an electronic device. The data classification method comprises the following steps: acquiring target data to be classified; inputting the target data into a multi-layer tag classification model, wherein the multi-layer tag classification model comprises a feature extraction module and multi-layer coding modules taking the output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and the input of each coding module other than the first-layer coding module comprises the output of the coding module of the previous layer; and acquiring the plurality of classification tags corresponding to the target data that are output by the multi-layer tag classification model. The technical scheme of the embodiments can utilize the hierarchical structure information of the tags, complete the hierarchical tag classification task by training a single model, ensure that no hierarchical-structure error occurs in the prediction result, and reduce the resources consumed in training and in prediction with the model.

Description

Data classification method, training device, medium and electronic equipment
Technical Field
The application relates to the technical field of artificial intelligence, in particular to a data classification method, a training method and device of a multi-layer label classification model, a computer readable medium and electronic equipment.
Background
Currently, a classification task with multi-level labels requires training a separate model for each level of label. With this approach, on the one hand, logic errors can occur in which a lower-level label in the prediction result does not belong to the predicted upper-level label; on the other hand, since each level of label requires an independent model, training and prediction with these models consume considerable resources.
Disclosure of Invention
The embodiments of the application provide a data classification method, a training method and apparatus for a multi-layer label classification model, a computer-readable medium and an electronic device, so that logic errors in label subordination relations can be reduced at least to a certain extent, and the resources consumed in training and in prediction with the model can be reduced.
Other features and advantages of the application will be apparent from the following detailed description, or may be learned by the practice of the application.
According to an aspect of an embodiment of the present application, there is provided a data classification method including: acquiring target data to be classified; inputting the target data into a multi-layer tag classification model, wherein the multi-layer tag classification model comprises a feature extraction module and a multi-layer coding module taking the output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and the input of other layers of coding modules except for a first layer of coding module in the multi-layer coding module comprises the output of a coding module of a previous layer; and acquiring a plurality of classification labels corresponding to the target data, which are output by the multi-layer label classification model.
According to an aspect of the embodiment of the present application, there is provided a training method of a multi-layer tag classification model, including: obtaining a sample data set, wherein sample data in the sample data set comprises a sample and a multi-layer label corresponding to the sample; inputting sample data in the sample data set into a multi-layer tag classification model, wherein the multi-layer tag classification model comprises a feature extraction module and multi-layer coding modules taking output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and input of other layers of coding modules except for a first layer of coding module in the multi-layer coding modules comprises output of coding modules of a previous layer; and adjusting parameters of the multi-layer label classification model according to the output result of the multi-layer label classification model and the loss value between the multi-layer labels corresponding to the samples so as to train the multi-layer label classification model.
According to an aspect of an embodiment of the present application, there is provided a data classification apparatus including: the first acquisition unit is used for acquiring target data to be classified; the input unit is used for inputting the target data into a multi-layer label classification model, wherein the multi-layer label classification model comprises a feature extraction module and multi-layer coding modules taking the output of the feature extraction module as input, and each layer of coding module corresponds to one layer of classification label; when the multi-layer label classification model is trained, the inputs of the coding modules of other layers except the first layer coding module in the multi-layer coding module comprise the output of the coding module of the previous layer; and the second acquisition unit is used for acquiring a plurality of classification labels corresponding to the target data, which are output by the multi-layer label classification model.
In some embodiments of the present application, based on the foregoing, the first obtaining unit is further configured to, before inputting the target data into the multi-layer tag classification model: obtaining a sample data set, wherein sample data in the sample data set comprises a sample and a multi-layer label corresponding to the sample; training the multi-layer tag classification model based on the sample dataset.
In some embodiments of the present application, based on the foregoing solution, the first obtaining unit is configured to: acquiring a sample and a specified level label corresponding to the sample; querying a tag hierarchy table based on the specified hierarchy tag to obtain other hierarchy tags associated with the specified hierarchy tag; generating sample data according to the specified level label, the other level labels and the sample; a sample data set is established from the sample data.
In some embodiments of the present application, based on the foregoing scheme, the other layer coding modules of the multi-layer coding module except for the first layer coding module include a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules.
In some embodiments of the application, based on the foregoing, the input unit is further configured to: and carrying out fusion processing on the output of the first coding unit contained in the other layer coding module, the output of the first layer coding module and the output of the first coding unit contained in the coding module at the level between the first layer coding module and the other layer coding module to obtain the output result of the other layer coding module.
In some embodiments of the present application, based on the foregoing, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the first layer coding module, the second layer coding module and the third layer coding module obtain their output results according to the following formulas:

pre1 = S1(f1)
pre2 = S2(FC2(f1, f2))
pre3 = S3(FC3(f1, f2, f3))

wherein pre1, pre2 and pre3 respectively represent the output results of the first layer, second layer and third layer coding modules; S1, S2 and S3 respectively represent the activation functions in the first layer, second layer and third layer coding modules; f1 denotes the feature obtained after processing by the first layer coding module; f2 denotes the feature obtained after processing by the first coding unit included in the second layer coding module; f3 denotes the feature obtained after processing by the first coding unit included in the third layer coding module; FC2 represents fusing feature f1 and feature f2; FC3 represents fusing features f1, f2 and f3; and A, B and C are parameters (for example, weights applied to f1, f2 and f3 in the fusion).
In some embodiments of the application, based on the foregoing, the input unit is further configured to: generating a loss function corresponding to the first layer coding module according to the difference between the output of the first layer coding module and the first-layer label of the sample data; generating a loss function corresponding to each of the other layer coding modules, except the first layer coding module, in the multi-layer coding modules according to the difference between the output of that coding module and the label of the corresponding level of the sample data, and according to the attribution relation between the output of that coding module and the output of the coding module of the previous layer; and generating the loss function of the multi-layer label classification model according to the loss function corresponding to the first layer coding module and the loss functions corresponding to the other layer coding modules.
In some embodiments of the present application, based on the foregoing, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the loss function of the multi-layer tag classification model is as follows:

L = L1(pre1, gt1) + F21 · L2(pre2, gt2) + F32 · L3(pre3, gt3)

wherein pre1, pre2 and pre3 respectively represent the output results of the first layer, second layer and third layer coding modules; gt1, gt2 and gt3 respectively represent the first-layer, second-layer and third-layer labels of the sample data; L1, L2 and L3 respectively represent the loss functions corresponding to the first layer, second layer and third layer coding modules; and F21 and F32 represent weights: F21 = 1 if pre2 belongs to pre1, and F21 > 1 if pre2 does not belong to pre1; F32 = 1 if pre3 belongs to pre2, and F32 > 1 if pre3 does not belong to pre2.
In some embodiments of the present application, based on the foregoing scheme, each layer of the multi-layer coding modules includes a fully-connected layer, and an input of the fully-connected layer includes an output of the feature extraction module.
In some embodiments of the application, based on the foregoing scheme, the convolution kernels in the fully connected layer perform feature processing based on shared weights.
In some embodiments of the application, based on the foregoing, the input unit is further configured to: after a plurality of classification labels corresponding to the target data output by the multi-layer label classification model are obtained, verifying the hierarchical relationship of the classification labels based on a label hierarchical table; and if the hierarchical relation of the plurality of classification labels passes the verification, outputting the plurality of classification labels.
According to an aspect of an embodiment of the present application, there is provided a training apparatus for a multi-layer tag classification model, including: a sample data set acquisition unit, configured to acquire a sample data set, where sample data in the sample data set includes a sample and a multi-layer tag corresponding to the sample; a sample data input unit, configured to input sample data in the sample data set into a multi-layer tag classification model, where the multi-layer tag classification model includes a feature extraction module, and multi-layer coding modules that take output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and inputs of coding modules of other layers of coding modules except a first layer of coding module in the multi-layer coding module include output of coding modules of a previous layer; and the training unit is used for adjusting parameters of the multi-layer label classification model according to the output result of the multi-layer label classification model and the loss value between the multi-layer labels corresponding to the samples so as to train the multi-layer label classification model.
According to an aspect of the embodiments of the present application, there is provided a computer readable medium having stored thereon a computer program which, when executed by a processor, implements a data classification method as described in the above embodiments.
According to an aspect of an embodiment of the present application, there is provided an electronic apparatus including: one or more processors; and a storage means for storing one or more programs which, when executed by the one or more processors, cause the one or more processors to implement the data classification method as described in the above embodiments.
In the technical scheme provided by some embodiments of the present application, data classification is realized by a multi-layer tag classification model comprising a feature extraction module and multi-layer coding modules, where the multi-layer coding modules take the output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and the input of each coding module other than the first-layer coding module includes the output of the coding module of the previous layer. The model as a whole therefore uses the hierarchical structure information of the tags: data carrying multiple levels of tags can be used to train the same model, and the model can output all corresponding levels of tags at once for a single input. Thus a hierarchical tag classification task can be completed by training one model, hierarchical-structure errors can be avoided in the prediction result, and the resources consumed during training and during prediction with the model can be reduced.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application. It is evident that the drawings in the following description are only some embodiments of the present application and that other drawings may be obtained from these drawings without inventive effort for a person of ordinary skill in the art. In the drawings:
FIG. 1 shows a schematic diagram of an exemplary system architecture to which the technical solution of an embodiment of the application may be applied;
FIG. 2 shows a flow chart of a data classification method according to an embodiment of the application;
FIG. 3 shows a flowchart of the steps preceding step 230 in FIG. 2, according to one embodiment of the application;
FIG. 4 shows a schematic diagram of a page for acquiring sample data, according to one embodiment of the application;
FIG. 5 shows a flow chart of a process for acquiring a sample dataset according to an embodiment of the application;
FIG. 6 illustrates a model architecture diagram for training a multi-layer tag classification model in one manner according to an embodiment of the present application;
FIG. 7 illustrates a model architecture diagram for training a multi-layer tag classification model in another manner according to an embodiment of the present application;
FIG. 8 illustrates a flow chart for verifying output results of a multi-layer tag classification model according to an embodiment of the application;
FIG. 9 illustrates a flowchart of a method of training a multi-layer tag classification model according to an embodiment of the application;
FIG. 10 shows a block diagram of a data classification apparatus according to an embodiment of the application;
FIG. 11 illustrates a block diagram of a training apparatus for a multi-layer tag classification model according to an embodiment of the application;
Fig. 12 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
Detailed Description
Example embodiments will now be described more fully with reference to the accompanying drawings. However, the exemplary embodiments may be embodied in many forms and should not be construed as limited to the examples set forth herein; rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the concept of the example embodiments to those skilled in the art.
Furthermore, the described features, structures, or characteristics may be combined in any suitable manner in one or more embodiments. In the following description, numerous specific details are provided to give a thorough understanding of embodiments of the application. One skilled in the relevant art will recognize, however, that the application may be practiced without one or more of the specific details, or with other methods, components, devices, steps, etc. In other instances, well-known methods, devices, implementations, or operations are not shown or described in detail to avoid obscuring aspects of the application.
The block diagrams depicted in the figures are merely functional entities and do not necessarily correspond to physically separate entities. That is, the functional entities may be implemented in software, or in one or more hardware modules or integrated circuits, or in different networks and/or processor devices and/or microcontroller devices.
The flow diagrams depicted in the figures are exemplary only, and do not necessarily include all of the elements and operations/steps, nor must they be performed in the order described. For example, some operations/steps may be decomposed, and some operations/steps may be combined or partially combined, so that the order of actual execution may be changed according to actual situations.
Classification is one of the important tasks in the fields of machine learning and artificial intelligence, and classification models are widely applied in many scenes.
The training of a classification model is typically performed using sample data containing labels; because the labels are contained in the sample data, such training is actually supervised learning. However, just as the same thing may belong to different categories, the same object may correspond to multiple levels of tags at the same time. The multi-level tag is a tag system with a tree structure, in which a tag at a lower level belongs to the tag at the level above it. For example, dog - medium dog - Husky is a multi-level tag with a hierarchical structure, wherein Husky belongs to the medium dog tag, and medium dog belongs to the dog tag.
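By way of a non-limiting illustration only, such a tree-structured tag system can be represented as a nested mapping; in the following minimal Python sketch, the concrete class names and the helper function label_path are illustrative assumptions and are not part of the patent:

```python
# A minimal sketch of a tree-structured, three-level tag system.
# The concrete class names below are illustrative assumptions.
LABEL_TREE = {
    "dog": {
        "medium dog": ["Husky"],
        "large dog": ["Labrador Retriever"],
    },
    "cat": {
        "short-haired cat": ["British Shorthair"],
    },
}

def label_path(leaf):
    """Return the full path [level-1, level-2, level-3] for a leaf label."""
    for level1, children in LABEL_TREE.items():
        for level2, leaves in children.items():
            if leaf in leaves:
                return [level1, level2, leaf]
    raise KeyError("unknown leaf label: " + leaf)

print(label_path("Husky"))  # ['dog', 'medium dog', 'Husky']
```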
In the related art, only data of the same level label can be used for training to obtain a corresponding classification model, and the classification model can only output the label of the level, so that for data of the multi-level label, only a plurality of classification models can be trained, and each classification model corresponds to one level label. Taking a three-level label classification task as an example, when training a model for completing the task based on the related technology, three models are usually trained independently, and the three models respectively correspond to three level labels.
Although the technical scheme based on the related art can also obtain the multi-level label corresponding to the data, at least the following defects exist:
Firstly, the models corresponding to the labels of each level are independently trained, so that the information of the hierarchical structure is wasted;
Second, when the model is used for prediction, logic errors occur in the prediction result that the lower-level label does not belong to the upper-level label due to the lack of hierarchical structure information;
Third, each level of labels requires an independent model for prediction, and the training and prediction resources are consumed more.
To this end, the present application first provides a data classification method. The data classification method provided by the embodiment of the application can overcome the defects. The data which can be classified by the data classification method provided by the embodiment of the application comprises but is not limited to text data, picture data, audio data, video data and the like, so that the data classification method provided by the embodiment of the application can be applied to tasks such as text classification, picture classification, audio classification and video classification.
Fig. 1 shows a schematic diagram of an exemplary system architecture to which the technical solution of an embodiment of the present application may be applied.
As shown in fig. 1, the system architecture may include a terminal device (such as one or more of the smartphone 101, tablet 102, and portable computer 103 shown in fig. 1, but of course, a desktop computer, etc.), a network 104, and a server 105. The network 104 is the medium used to provide communication links between the terminal devices and the server 105. The network 104 may include various connection types, such as wired communication links, wireless communication links, and the like.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation. For example, the server 105 may be a server cluster formed by a plurality of servers.
In one embodiment of the present application, the users of at least some of the terminal devices are data annotators. An application is deployed on each terminal device; a data annotator can obtain sample data to be annotated from the server 105 through the application, annotate the sample data with the corresponding multi-layer tags by operating the application, and finally send the annotated sample data and the corresponding multi-layer tags to the server 105 through the network 104, so that the server 105 obtains sample data including multi-layer tags.
In one embodiment of the present application, a multi-layer tag classification model is deployed on the server 105, and the server 105 may train the multi-layer tag classification model using the obtained sample data containing the multi-layer tag, where the trained multi-layer tag classification model can output a corresponding multi-layer tag for one data.
In one embodiment of the present application, at least a portion of the terminal devices can send data to be classified to the server 105, and after obtaining the data, the server 105 can classify the data using a trained multi-layer label classification model and output one or more classification labels corresponding to each data, where the classification labels have a hierarchical or hierarchical relationship; after outputting the classification label, the multi-layer label classification model on the server 105 may transmit the classification label to the corresponding terminal device through the network 104.
In one embodiment of the application, the sample data to be marked is picture data, and the multi-layer labels corresponding to the marking of the sample data to be marked are a plurality of categories of the objects recorded in the picture data; the trained multi-layer tag classification model is capable of outputting respective categories of objects recorded in the picture data based on the input of the picture data.
It should be noted that, although in the embodiment of the present application, the sample data and the data to be classified are both from a terminal other than the implementation terminal, in other embodiments of the present application, the sample data and the data to be classified may be both stored locally; although in the embodiment of the present application, the sample data to be marked is picture data, in other embodiments or specific applications of the present application, the sample data to be marked may be other types of data such as text data, video data, and the like. The embodiments of the present application should not be limited in any way, nor should the scope of the application be limited in any way.
Moreover, it is easy to understand that the data classification method provided in the embodiment of the present application is generally executed by the server 105, and accordingly, the data classification device is generally disposed in the server 105. However, in other embodiments of the present application, the terminal device may also have a similar function as the server, so as to execute the data classification scheme provided by the embodiments of the present application.
The embodiment of the application can classify the data from the terminal by the server. The server may be an independent physical server, a server cluster or a distributed system formed by a plurality of physical servers, or a cloud server providing cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, CDNs (Content Delivery Network, content delivery networks), basic cloud computing services such as big data and artificial intelligent platforms, and the like. The terminal may be, but is not limited to, a smart phone, a tablet computer, a notebook computer, a desktop computer, a smart speaker, a smart watch, etc. The terminal and the server may be directly or indirectly connected through wired or wireless communication, and the present application is not limited herein.
The embodiment of the application can be applied to cloud computing technology. Cloud computing is a computing model that distributes computing tasks across a large pool of computers, enabling various application systems to acquire computing power, storage space, and information services as needed. The network that provides the resources is referred to as the "cloud". Resources in the cloud are infinitely expandable from the user's perspective, and can be acquired at any time, used as needed, expanded at any time and paid for per use.
As a basic capability provider of cloud computing, a cloud computing resource pool (abbreviated as a cloud platform, generally referred to as an IaaS (Infrastructure as a Service) platform) is established, and multiple types of virtual resources are deployed in the resource pool for external clients to select and use. The cloud computing resource pool mainly comprises: computing devices (virtualized machines, including operating systems), storage devices, and network devices.
According to the division of logic functions, a PaaS (Platform as a Service) layer can be deployed on the IaaS (Infrastructure as a Service) layer, and a SaaS (Software as a Service) layer can be deployed above the PaaS layer, or the SaaS can be deployed directly on the IaaS. PaaS is a platform on which software runs, such as a database or a web container. SaaS is a wide variety of business software, such as web portals and SMS mass senders. Generally, SaaS and PaaS are upper layers relative to IaaS.
The implementation details of the technical scheme of the embodiment of the application are described in detail below:
fig. 2 shows a flow chart of a data classification method according to an embodiment of the application, which may be performed by a device having computing capability, such as the server 105 shown in fig. 1. Referring to fig. 2, the data classification method at least includes the following steps:
in step 230, target data to be classified is acquired.
The target data may be various types of data such as text data, picture data, audio data, video data, and the like. When the target data to be classified is picture data, the picture data is recorded with objects, and classifying the picture data is to determine the category of the objects recorded in the picture data.
For example, the picture data may be a picture of a Husky; when classifying the picture data, it may be classified as a dog, so classifying the picture data is equivalent to identifying it, i.e., identifying the category of the object in the picture data.
The embodiment of the application can output the multi-layer labels for the target data, and the multi-layer labels corresponding to the same data have the attribution relation.
For example, the multi-layer labels corresponding to a piece of picture data containing a Husky may be dog - medium dog - Husky, wherein the label of each layer belongs to the layer of labels immediately preceding it; for example, Husky belongs to medium dog, and medium dog in turn belongs to dog. Accordingly, the earlier a label appears among the multi-layer labels, the greater its range, covering all layers of labels that follow it, so that each layer's label actually belongs to every label preceding it; for example, Husky belongs both to medium dog and to dog.
The multi-layer labels are built on a tag system with a tree structure: for example, dog, cat and the like may be divided into one layer; large dog, medium dog and the like into another layer; and Pekingese, Husky and the like into a further layer, thereby forming a tree structure.
In the following, unless otherwise indicated, references to layers and levels have the same meaning, namely labels capable of covering different ranges.
In step 240, the target data is input into a multi-layer tag classification model, the multi-layer tag classification model including a feature extraction module and multi-layer encoding modules having an output of the feature extraction module as an input, each layer encoding module corresponding to one layer of classification tag, inputs of encoding modules of other layers than the first layer encoding module in the multi-layer encoding module including an output of encoding modules of a previous layer.
Both the feature extraction module and the multi-layer coding modules of the multi-layer label classification model are built using neural networks.
In one embodiment of the application, the feature extraction module is a pre-trained model.
The pre-trained model may be a BERT model or the like. Because the pre-trained model has been trained in advance on a large-scale dataset, building the multi-layer label classification model on top of it introduces more information into the model, accelerates its training, and saves the resources and cost consumed by training.
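As a hedged sketch of this idea, the snippet below uses the Hugging Face transformers library to load a pre-trained BERT encoder as the feature extraction module; the library, the checkpoint name and the use of the [CLS] vector are illustrative choices and are not prescribed by the patent:

```python
# A sketch only: a pre-trained BERT encoder as the feature extraction
# module. The "bert-base-chinese" checkpoint is an assumed example.
import torch
from transformers import AutoModel, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("bert-base-chinese")
encoder = AutoModel.from_pretrained("bert-base-chinese")

inputs = tokenizer("sample text to classify", return_tensors="pt")
with torch.no_grad():
    # Use the [CLS] hidden state as the feature X fed to the coding modules.
    X = encoder(**inputs).last_hidden_state[:, 0]
print(X.shape)  # torch.Size([1, 768])
```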
The multi-layer encoding module is coupled to the feature extraction module such that the multi-layer encoding module is capable of taking the output of the feature extraction module as an input.
Because each layer of coding module corresponds to one layer of classification label, the multi-layer label classification model of the embodiment of the application can output multi-layer classification labels for the same data.
For example, the multi-layer coding modules may be a 1-layer coding module, a 2-layer coding module, and a 3-layer coding module, where the 1-layer coding module is the first-layer coding module located at the front, and the 2-layer coding module immediately follows the first-layer coding module, and the 3-layer coding module is the last-layer coding module. At this time, the input of the coding modules of the other layers except the first layer coding module among the multi-layer coding modules includes the output of the coding module of the previous layer means: the input of the layer 2 encoding module comprises the output of the layer 1 encoding module and the input of the layer 3 encoding module comprises the output of the layer 2 encoding module.
Because the input of the coding modules of other layers except the first layer coding module in the multi-layer coding module comprises the output of the coding module of the previous layer, and each layer coding module corresponds to one layer of classification label, the hierarchical structure information of the label is utilized in the training and using processes of the multi-layer label classification model, and the prediction result can be ensured not to have hierarchical structure errors.
The multi-layer tag classification model needs to be trained to classify the target data. Next, a training process of the multi-layered tag classification model will be described.
Fig. 3 shows a flow chart of steps preceding step 230 in fig. 2 according to an embodiment of the application. Referring to fig. 3, the following steps may be included before step 230:
in step 210, a sample data set is acquired, the sample data in the sample data set including a sample and a multi-layer label corresponding to the sample.
The sample data set includes a plurality of sample data.
The multi-layer labels corresponding to each sample have the same number of layers, and this number equals the number of layers of coding modules in the multi-layer label classification model; meanwhile, each layer of labels corresponds to exactly one layer of coding module, so that the sample data can be used to train a multi-layer label classification model capable of outputting multi-layer labels.
The samples in the sample data and the multi-layer labels corresponding to the samples may be obtained in a variety of ways. For example, the method can be obtained by a data mining-based mode, a crawling mode from the Internet by utilizing a crawler, and a manual labeling mode.
FIG. 4 shows a schematic diagram of a page for acquiring sample data according to one embodiment of the application. Referring to fig. 4, the page may be a page provided to a data annotator for annotating data. Specifically, a photo containing a dog is displayed on the left side of the page; the photo is a sample. Three text entry boxes are listed on the right side of the page, through which the data annotator can enter the multi-layer labels corresponding to the photo, thereby annotating the sample; in the figure, the entered one-layer label is dog. Once the multi-layer labels are entered, one piece of sample data is obtained. The page also displays a "last" button and a "next" button: after clicking the "last" button, the data annotator can edit the labels already annotated for other samples, and after clicking the "next" button, the annotator can label further samples that have not yet been annotated. The page shown in fig. 4 may be displayed on the terminal used by the data annotator; the page containing the last sample may display a "submit" button, and by clicking it, all samples are sent to the implementation terminal of the embodiment of the present application, thereby obtaining a plurality of sample data.
FIG. 5 shows a flow chart of a process for acquiring a sample dataset according to an embodiment of the application. Referring to fig. 5, the acquisition process of the sample dataset may include the steps of:
in step 510, a sample and a specified hierarchical label corresponding to the sample are obtained.
In one embodiment of the present application, obtaining a sample and a specified level tag corresponding to the sample includes: and acquiring a sample and a last layer of labels corresponding to the sample.
The last-layer label corresponding to a sample belongs to the other layers of labels corresponding to that sample.
The last-layer label is the label with the smallest coverage and is also the label most closely related to the sample; for example, the multi-layer labels corresponding to one sample may be dog - medium dog - Husky, wherein Husky is the last-layer label. Since only the last-layer label corresponding to the sample can be obtained in this step, the other layers of labels corresponding to the sample still need to be obtained to meet the requirements of training the multi-layer label classification model.
In step 520, the tag hierarchy table is queried based on the specified hierarchy tags to obtain other hierarchy tags associated with the specified hierarchy tags.
The tag hierarchy table may be built based on human experience.
One-layer label | Two-layer label | Three-layer label
--------------- | ---------------- | -----------------
Dog             | Medium-sized dog | Husky
Dog             | Large dog        | Scottish Shepherd
Dog             | Large dog        | Labrador Retriever
Dog             | Small dog        | Shiba Inu

Table 1
Table 1 schematically shows a tag hierarchy table. In Table 1, the labels in the same row correspond to one another: the three-layer label is the last-layer label and belongs to the corresponding two-layer label, and the two-layer label belongs to the corresponding one-layer label. Thus, only the three-layer label needs to be obtained; the corresponding one-layer label and two-layer label can then be determined by looking up the tag hierarchy table, i.e., the other hierarchy labels associated with the specified hierarchy label are obtained from the specified hierarchy label.
In step 530, sample data is generated from the specified hierarchical labels, other hierarchical labels, and samples.
The specified hierarchy label and other hierarchy labels may constitute a multi-layer label corresponding to the sample, and the multi-layer label may be combined with the sample to generate sample data.
In step 540, a sample data set is created from the sample data.
After a plurality of sample data is obtained, a sample data set is constructed using the plurality of sample data.
In the embodiment of the application, after the sample is obtained, the user can label the specified level label for the sample only, such as the label of the last layer, and other level labels corresponding to the sample can be obtained automatically by looking up a table. Therefore, the embodiment of the application can greatly improve the acquisition efficiency of the multi-layer label corresponding to the sample, thereby improving the generation efficiency of the sample data and the sample data set.
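As a hedged illustration of this table lookup, the following Python sketch generates sample data from a specified last-layer label; the table contents mirror Table 1, and all function and variable names are illustrative assumptions:

```python
# A minimal sketch: completing a sample's multi-layer labels from its
# specified last-layer label via a tag hierarchy table (per Table 1).
TAG_HIERARCHY = {
    # three-layer label -> (one-layer label, two-layer label)
    "Husky": ("Dog", "Medium-sized dog"),
    "Scottish Shepherd": ("Dog", "Large dog"),
    "Labrador Retriever": ("Dog", "Large dog"),
    "Shiba Inu": ("Dog", "Small dog"),
}

def make_sample_data(sample, third_layer_label):
    """Query the tag hierarchy table and assemble one piece of sample data."""
    level1, level2 = TAG_HIERARCHY[third_layer_label]
    return {"sample": sample, "labels": [level1, level2, third_layer_label]}

sample_data_set = [make_sample_data("husky_001.jpg", "Husky")]
print(sample_data_set[0]["labels"])  # ['Dog', 'Medium-sized dog', 'Husky']
```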
In one embodiment of the present application, the other layer encoding modules other than the first layer encoding module in the multi-layer encoding module include a first encoding unit and a second encoding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of the first layer coding module and the output of the first coding unit contained in the coding module at the level between the first layer coding module and the other layer coding modules.
FIG. 6 illustrates a model architecture diagram for training a multi-layer tag classification model in one manner according to an embodiment of the application. Referring to fig. 6, the model architecture includes a feature extraction module 610 and multi-layer coding modules, where the feature extraction module 610 is a single backbone, i.e., a backbone network for extracting features, such as a pre-trained model. The multi-layer coding modules are a first layer coding module 620, a second layer coding module 630, and a third layer coding module 640, where the first layer coding module 620 is the first-layer coding module. The second layer coding module 630 and the third layer coding module 640 each include two coding units, where the coding unit near the feature extraction module is the first coding unit and the other is the second coding unit. As can be seen from fig. 6, the output of the feature extraction module directly enters each first coding unit; the input of the second coding unit in the second layer coding module 630 includes the output of the first coding unit in the second layer coding module 630 and the output of the first layer coding module 620; and since the second layer coding module 630 lies between the first layer coding module 620 and the third layer coding module 640, the input of the second coding unit in the third layer coding module 640 includes the output of the first coding unit in the second layer coding module 630 in addition to the output of the first coding unit in the third layer coding module 640 and the output of the first layer coding module 620.
In one embodiment of the present application, the data classification method further includes: and carrying out fusion processing on the output of the first coding unit contained in the other layer coding module, the output of the first layer coding module and the output of the first coding unit contained in the coding module at the level between the first layer coding module and the other layer coding module to obtain the output result of the other layer coding module.
With continued reference to fig. 6, if the other layer encoding module is the third layer encoding module 640, the third layer encoding module 640 obtains an output result by performing fusion processing on the output of the first encoding unit in the third layer encoding module 640, the output of the first layer encoding module 620, and the output of the first encoding unit in the second layer encoding module 630 between the first layer encoding module 620 and the third layer encoding module 640.
In one embodiment of the present application, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the first layer coding module, the second layer coding module and the third layer coding module obtain their output results according to the following formulas:

pre1 = S1(f1)
pre2 = S2(FC2(f1, f2))
pre3 = S3(FC3(f1, f2, f3))

wherein pre1, pre2 and pre3 respectively represent the output results of the first layer, second layer and third layer coding modules; S1, S2 and S3 respectively represent the activation functions in the first layer, second layer and third layer coding modules; f1 denotes the feature obtained after processing by the first layer coding module; f2 denotes the feature obtained after processing by the first coding unit included in the second layer coding module; f3 denotes the feature obtained after processing by the first coding unit included in the third layer coding module; FC2 represents fusing feature f1 and feature f2; FC3 represents fusing features f1, f2 and f3; and A, B and C are parameters (for example, weights applied to f1, f2 and f3 in the fusion).
Certain constraints may be imposed on the parameters in the multi-layer coding modules; for example, A + B = 1 may be fixed.
With continued reference to fig. 6, the first layer encoding module 620, the second layer encoding module 630 and the third layer encoding module 640 are arranged in order from top to bottom, and their outputs are pre1, pre2 and pre3, respectively.
In one embodiment of the application, each of the multi-layer encoding modules includes a fully connected layer, and an input of the fully connected layer includes an output of the feature extraction module.
The first layer coding module comprises a full connection layer, and the first coding unit and the second coding unit in the other layer coding modules can comprise full connection layers.
Specifically, in the above formulas, f1 = FC1(X), f2 = FC2(X) and f3 = FC3(X), where X is the output of the feature extraction module, FC1 is the fully connected layer in the first layer coding module, FC2 is the fully connected layer in the first coding unit of the second layer coding module, and FC3 is the fully connected layer in the first coding unit of the third layer coding module. The fusion FC2 in the formula may be the fully connected layer in the second coding unit of the second layer coding module, and the fusion FC3 may be the fully connected layer in the second coding unit of the third layer coding module; S2 and S3 may be activation function layers located in the second coding units, after the fully connected layer.
Although in this embodiment of the application the second coding unit includes a fully connected layer and an activation function layer, it is easy to understand that the second coding unit may also include other layers such as a pooling layer and a Softmax layer, making the model more complete and potentially reducing network complexity. Specifically, the layers in the second coding unit may be, in order: a fully connected layer, a pooling layer, an activation function layer and a Softmax layer, where the activation function may be a ReLU.
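Putting these pieces together, the following PyTorch sketch is one possible reading of the FIG. 6 architecture; the dimensions, the use of concatenation as the fusion operation, and the omission of the pooling layer and of the parameters A, B and C are all simplifying assumptions, since the patent leaves these details open:

```python
import torch
import torch.nn as nn

class MultiLayerTagClassifier(nn.Module):
    """A hedged sketch of the FIG. 6 architecture; all sizes are illustrative."""

    def __init__(self, feat_dim=768, hid=256, n1=10, n2=50, n3=200):
        super().__init__()
        # Fully connected layers fed directly by the feature extractor's output X:
        self.fc1 = nn.Linear(feat_dim, hid)  # first layer coding module
        self.fc2 = nn.Linear(feat_dim, hid)  # first unit, second layer module
        self.fc3 = nn.Linear(feat_dim, hid)  # first unit, third layer module
        self.head1 = nn.Linear(hid, n1)
        # Second coding units: fuse features from this level and earlier ones.
        self.fuse2 = nn.Linear(2 * hid, n2)  # fuses f1 and f2
        self.fuse3 = nn.Linear(3 * hid, n3)  # fuses f1, f2 and f3
        self.act = nn.ReLU()

    def forward(self, X):
        f1 = self.fc1(X)
        f2 = self.fc2(X)
        f3 = self.fc3(X)
        pre1 = torch.softmax(self.head1(self.act(f1)), dim=-1)
        # Concatenation stands in for the patent's fusion FC2/FC3; the
        # parameters A, B, C could instead weight f1, f2, f3 (an assumption).
        pre2 = torch.softmax(self.fuse2(self.act(torch.cat([f1, f2], dim=-1))), dim=-1)
        pre3 = torch.softmax(self.fuse3(self.act(torch.cat([f1, f2, f3], dim=-1))), dim=-1)
        return pre1, pre2, pre3

model = MultiLayerTagClassifier()
pre1, pre2, pre3 = model(torch.randn(4, 768))
print(pre1.shape, pre2.shape, pre3.shape)
```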
With continued reference to fig. 3, in step 220, a multi-layer tag classification model is trained based on the sample dataset.
In one embodiment of the application, the step of training the multi-layer tag classification model based on the sample dataset comprises:
dividing the sample data set into a training data set and a test data set according to a predetermined proportion; the multi-layer tag classification model is trained based on the training dataset.
After the multi-layer tag classification model is trained, the multi-layer tag classification model may be tested using the test dataset.
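A minimal sketch of such a split is given below; the 8:2 proportion and the fixed random seed are assumed examples, as the patent only specifies "a predetermined proportion":

```python
import random

def split_dataset(sample_data, ratio=0.8, seed=42):
    """Split a sample data set into training and test sets
    according to a predetermined proportion (0.8 assumed here)."""
    data = list(sample_data)
    random.Random(seed).shuffle(data)
    cut = int(len(data) * ratio)
    return data[:cut], data[cut:]

train_set, test_set = split_dataset(range(100))
print(len(train_set), len(test_set))  # 80 20
```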
In one embodiment of the application, the convolution kernels in the fully connected layer are feature processed based on shared weights.
Specifically, feature filtering is performed on portions of a feature matrix using convolution kernels having the same weights. With continued reference to fig. 6, the portions of the first layer encoding module 620, the second layer encoding module 630, and the third layer encoding module 640 may share weights.
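One way to realize such weight sharing is sketched below, under the assumption that sharing is implemented by reusing a single module instance across coding modules; the patent does not fix the mechanism:

```python
import torch.nn as nn

# The same nn.Linear instance is reused by several coding modules,
# so their feature processing uses shared weights (an assumed mechanism).
shared_fc = nn.Linear(768, 256)

class CodingHead(nn.Module):
    def __init__(self, num_classes):
        super().__init__()
        self.shared = shared_fc               # same object => shared weights
        self.classify = nn.Linear(256, num_classes)

    def forward(self, x):
        return self.classify(self.shared(x))

head_layer2 = CodingHead(50)
head_layer3 = CodingHead(200)  # both heads update the same shared_fc weights
```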
In one embodiment of the present application, the other layer coding modules except the first layer coding module among the multi-layer coding modules are trained according to the output result of a layer coding module adjacent to and before the layer coding module.
In one embodiment of the present application, the data classification method further includes: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to other layer coding modules according to the difference value between the output of the other layer coding modules except the first layer coding module in the multi-layer coding modules and the label of the corresponding level of the sample data and the attribution relation between the output of the other layer coding modules and the output of the previous layer coding module; and generating the loss function of the multi-layer label classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to other layer coding modules.
In one embodiment of the present application, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the data classification method further comprises the following steps: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the second layer coding module according to the difference value between the output of the second layer coding module and the second layer label of the sample data and the attribution relation between the output of the second layer coding module and the output of the first layer coding module; generating a loss function corresponding to the third layer coding module according to the difference value between the output of the third layer coding module and the third layer label of the sample data and the attribution relation between the output of the third layer coding module and the output of the second layer coding module; and generating a loss function of the multi-layer label classification model according to the loss functions respectively corresponding to the first layer coding module, the second layer coding module and the third layer coding module.
In one embodiment of the present application, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module;
The loss function of the multi-layer tag classification model is as follows:

L = L1(pre1, gt1) + F21 · L2(pre2, gt2) + F32 · L3(pre3, gt3)

wherein pre1, pre2 and pre3 respectively represent the output results of the first layer, second layer and third layer coding modules; gt1, gt2 and gt3 respectively represent the first-layer, second-layer and third-layer labels of the sample data; L1, L2 and L3 respectively represent the loss functions corresponding to the first layer, second layer and third layer coding modules; and F21 and F32 represent weights: F21 = 1 if pre2 belongs to pre1, and F21 > 1 if pre2 does not belong to pre1; F32 = 1 if pre3 belongs to pre2, and F32 > 1 if pre3 does not belong to pre2.
The first-layer label is the label with the largest coverage, the second-layer label is adjacent to the first-layer label and belongs to it, and the third-layer label is adjacent to the second-layer label and belongs to it. For example, the first-layer label may be dog, the second-layer label may be medium dog, and the third-layer label may be Husky.
With continued reference to fig. 6, it can be seen that in the hierarchical loss portion, the first layer encoding module 620, the second layer encoding module 630 and the third layer encoding module 640 each correspond to a loss function, where in fig. 6, F21 is denoted as F(pre2/pre1) and F32 is denoted as F(pre3/pre2).
The parameters of the model are continuously adjusted, so that the loss function is minimized, and further training of the multi-layer label classification model is completed, wherein parameters such as A, B and C can be adjusted in the training process of the model.
As can be seen from the above formula, the training of each layer of coding modules in the multi-layer coding module of the multi-layer label classification model is performed depending on the output of other layers of coding modules, and each layer of coding modules is further trained according to the labels of the corresponding layers, so that the training of the multi-layer label classification model in the embodiment of the application effectively utilizes the hierarchical structure information of the multi-layer labels, and can ensure that no hierarchical structure error occurs in the prediction result.
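As a hedged sketch of this hierarchical loss, the snippet below uses cross-entropy for L1, L2 and L3 and a penalty of 2.0 when an attribution is violated; both choices are assumptions, since the patent only requires F21 and F32 to exceed 1 when the lower-level prediction does not belong to the upper-level one:

```python
import torch
import torch.nn.functional as F

def hierarchy_weight(child_idx, parent_idx, parent_of, penalty=2.0):
    """1.0 if the predicted child class belongs to the predicted parent
    class per the tag hierarchy (parent_of maps child index -> parent
    index, an assumed encoding), otherwise a weight greater than 1."""
    return 1.0 if parent_of.get(child_idx) == parent_idx else penalty

def multilayer_loss(pre1, pre2, pre3, gt1, gt2, gt3, parent21, parent32):
    # pre1..pre3 are treated as logits and a batch size of 1 is assumed,
    # both simplifications made for this sketch.
    L1 = F.cross_entropy(pre1, gt1)
    L2 = F.cross_entropy(pre2, gt2)
    L3 = F.cross_entropy(pre3, gt3)
    p1, p2, p3 = (p.argmax(dim=-1).item() for p in (pre1, pre2, pre3))
    F21 = hierarchy_weight(p2, p1, parent21)
    F32 = hierarchy_weight(p3, p2, parent32)
    return L1 + F21 * L2 + F32 * L3
```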
The hierarchical relationship of the coding modules may be defined in other ways. FIG. 7 illustrates a model architecture diagram for training a multi-layer tag classification model in another manner according to an embodiment of the application. Referring to fig. 7, the first layer encoding module 740, the second layer encoding module 730 and the third layer encoding module 720 are all connected to the feature extraction module 710. For example, pre1, pre2 and pre3 may represent the output results of the first layer encoding module 740, the second layer encoding module 730 and the third layer encoding module 720, respectively; gt1, gt2 and gt3 represent the first-layer, second-layer and third-layer labels of the sample data, respectively. In this case, F(pre2/pre1) and F(pre3/pre2) may represent weights. The meaning of F(pre2/pre1) is: if pre1 includes pre2 (i.e., pre2 is a lower-level label of pre1), then F(pre2/pre1) = 1; if pre1 does not include pre2, then F(pre2/pre1) > 1. The meaning of F(pre3/pre2) is: if pre2 includes pre3 (i.e., pre3 is a lower-level label of pre2), then F(pre3/pre2) = 1; if pre2 does not include pre3, then F(pre3/pre2) > 1.
In one embodiment of the present application, the data classification method further includes: and training the multi-layer coding modules of the multi-layer label classification model from front to back.
In the foregoing embodiments, each layer of coding module is trained according to the attribution relationship between its own output and the output of the coding module of the previous layer. If the output of the previous layer has low accuracy, the error directly affects the training of the current layer, degrading both the training effect and the training progress. In the embodiment of the application, the coding modules are therefore trained sequentially from front to back, which improves both the training effect and the training speed.
In one embodiment of the present application, each layer of coding module in the multi-layer coding modules is trained the same number of times.
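A rough sketch of this front-to-back schedule is given below; the grouping of parameters per layer, the cumulative unfreezing scheme, and the train_epoch helper are illustrative assumptions. Each stage runs the same number of epochs, matching the embodiment above.

```python
def train_front_to_back(model, optimizer, loader, layer_param_groups,
                        epochs_per_layer=5):
    """Train the coding modules sequentially, first layer first (sketch).

    layer_param_groups: lists of parameters, one list per layer of coding
    module, ordered from the first (front) layer to the last (back); the
    feature extraction module could be placed in the first group.
    """
    active = []
    for group in layer_param_groups:
        active.extend(group)
        for p in model.parameters():
            p.requires_grad = False          # freeze everything first
        for p in active:
            p.requires_grad = True           # unfreeze layers trained so far
        for _ in range(epochs_per_layer):    # same count for every layer
            train_epoch(model, optimizer, loader)  # hypothetical epoch helper
```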
Referring to fig. 2, in step 250, a plurality of classification labels corresponding to the target data output by the multi-layer label classification model are obtained.
The plurality of classification labels corresponding to the target data, i.e., the multi-layer labels, have attribution relationships between the layers of labels, similar to the form of dog - medium-sized dog - husky.
FIG. 8 illustrates a flow chart for verifying the output of a multi-layer tag classification model according to an embodiment of the application. Referring to fig. 8, the method comprises the following steps:
In step 810, after the plurality of classification labels corresponding to the target data output by the multi-layer label classification model are acquired, the hierarchical relationship of the plurality of classification labels is verified based on the label hierarchy table.
The label hierarchy table stores the correspondence between classification labels, and thus whether a plurality of classification labels correspond to one another can be determined by querying the label hierarchy table.
In step 820, if the hierarchical relationship of the plurality of classification labels passes the verification, the plurality of classification labels are output.
If querying the label hierarchy table confirms that the plurality of classification labels output by the multi-layer label classification model correspond to one another, the verification passes.
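A minimal sketch of this check, assuming the label hierarchy table is stored as a child-to-parent mapping (the table format and the function name are assumptions):

```python
def verify_hierarchy(labels, hierarchy_table):
    """Return True if each label belongs to the label one layer above it.

    labels: classification labels ordered from the top layer down.
    hierarchy_table: dict mapping each child label to its parent label.
    """
    return all(hierarchy_table.get(child) == parent
               for parent, child in zip(labels, labels[1:]))

# Illustrative usage with a toy table:
table = {"medium-sized dog": "dog", "husky": "medium-sized dog"}
assert verify_hierarchy(["dog", "medium-sized dog", "husky"], table)
assert not verify_hierarchy(["cat", "medium-sized dog", "husky"], table)
```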
In the embodiment of the application, the output result of the multi-layer label classification model is checked against the label hierarchy table, and the classification result is output only after the verification passes, further ensuring the accuracy of the classification result.
The embodiment of the application also provides a training method of the multi-layer label classification model.
FIG. 9 illustrates a flowchart of a method of training a multi-layer tag classification model according to an embodiment of the application. Referring to fig. 9, the steps may include:
In step 910, a sample data set is acquired, the sample data in the sample data set including a sample and a multi-layer label corresponding to the sample.
The sample data set includes a plurality of sample data.
In step 920, sample data in the sample data set is input into a multi-layer tag classification model, the multi-layer tag classification model including a feature extraction module, and multi-layer encoding modules having an output of the feature extraction module as an input, each layer encoding module corresponding to one layer of classification tag, inputs of encoding modules of other layers than the first layer encoding module in the multi-layer encoding module including outputs of encoding modules of previous layers.
The multi-layer label classification model may employ the model architecture shown in fig. 6, and for specific details regarding the multi-layer label classification model, please refer to the solution of the above embodiment.
In step 930, the parameters of the multi-layer label classification model are adjusted according to the loss value between the output result of the multi-layer label classification model and the multi-layer labels corresponding to the samples, so as to train the multi-layer label classification model.
The loss value is minimized by adjusting the parameters of the multi-layer label classification model, so that the trained multi-layer label classification model can accurately output the corresponding multi-layer labels for input data.
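For illustration, one update step might look like the sketch below, reusing the hierarchical_loss sketch given earlier; that the model returns the three per-layer outputs is likewise an assumption.

```python
def training_step(model, optimizer, feats, gt1, gt2, gt3, parent2, parent3):
    """One gradient step minimizing the loss of the multi-layer model (sketch)."""
    pre1, pre2, pre3 = model(feats)
    loss = hierarchical_loss(pre1, pre2, pre3, gt1, gt2, gt3, parent2, parent3)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```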
According to the technical scheme provided by the embodiment of the application, the multi-layer label classification model can be trained end to end, which yields higher prediction accuracy and ensures the hierarchical accuracy of the prediction result.
The following describes an embodiment of the apparatus of the present application, which may be used to perform the data classification method of the above embodiment of the present application. For details not disclosed in the embodiments of the apparatus of the present application, please refer to the embodiments of the data classification method of the present application.
Fig. 10 shows a block diagram of a data sorting apparatus according to an embodiment of the application.
Referring to fig. 10, a data sorting apparatus 1000 according to an embodiment of the present application includes: a first acquisition unit 1010, an input unit 1020, and a second acquisition unit 1030.
Wherein, the first obtaining unit 1010 is configured to obtain target data to be classified; the input unit 1020 is configured to input the target data into a multi-layer tag classification model, where the multi-layer tag classification model includes a feature extraction module, and a multi-layer encoding module that takes an output of the feature extraction module as an input, where each layer of encoding module corresponds to a layer of classification tag; when the multi-layer label classification model is trained, the inputs of the coding modules of other layers except the first layer coding module in the multi-layer coding module comprise the output of the coding module of the previous layer; the second obtaining unit 1030 is configured to obtain a plurality of classification labels corresponding to the target data output by the multi-layer label classification model.
In some embodiments of the present application, based on the foregoing scheme, the first obtaining unit 1010 is further configured to, before inputting the target data into the multi-layer tag classification model: obtaining a sample data set, wherein sample data in the sample data set comprises a sample and a multi-layer label corresponding to the sample; training the multi-layer tag classification model based on the sample dataset.
In some embodiments of the present application, based on the foregoing scheme, the first obtaining unit 1010 is configured to: acquiring a sample and a specified level label corresponding to the sample; querying a tag hierarchy table based on the specified hierarchy tag to obtain other hierarchy tags associated with the specified hierarchy tag; generating sample data according to the specified level label, the other level labels and the sample; a sample data set is established from the sample data.
In some embodiments of the present application, based on the foregoing scheme, the other layer coding modules of the multi-layer coding module except for the first layer coding module include a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules.
In some embodiments of the present application, based on the foregoing scheme, the input unit 1020 is further configured to: and carrying out fusion processing on the output of the first coding unit contained in the other layer coding module, the output of the first layer coding module and the output of the first coding unit contained in the coding module at the level between the first layer coding module and the other layer coding module to obtain the output result of the other layer coding module.
In some embodiments of the present application, based on the foregoing, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the first layer coding module, the second layer coding module and the third layer coding module obtain an output result according to the following formula:
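The formula itself appears as an image in the original publication; a plausible reconstruction from the symbol definitions that follow (the exact placement of the parameters A, B and C is an assumption) is:

$$\mathrm{pre1} = S_1(A \cdot f_1), \quad \mathrm{pre2} = S_2\big(B \cdot FC_2(f_1, f_2)\big), \quad \mathrm{pre3} = S_3\big(C \cdot FC_3(f_1, f_2, f_3)\big)$$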
Wherein pre1, pre2 and pre3 respectively represent the output results of the first layer coding module, the second layer coding module and the third layer coding module; S1, S2 and S3 respectively represent the activation functions in the first layer coding module, the second layer coding module and the third layer coding module; f1 denotes the feature obtained after processing by the first layer coding module; f2 denotes the feature obtained after processing by the first coding unit included in the second layer coding module; f3 denotes the feature obtained after processing by the first coding unit included in the third layer coding module; FC2 represents fusing feature f1 and feature f2; FC3 represents fusing feature f1, feature f2 and feature f3; A, B and C are parameters.
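Under those definitions, a minimal PyTorch-style sketch of the three coding modules might look as follows; the feature dimension, the class counts per layer, the use of linear layers for the coding units, and concatenation followed by a linear layer as the fusion operations FC2 and FC3 are all illustrative assumptions:

```python
import torch
import torch.nn as nn

class MultiLayerCodingModules(nn.Module):
    """Sketch of three-layer coding modules over an extracted feature vector."""
    def __init__(self, feat_dim=512, n1=10, n2=50, n3=200):
        super().__init__()
        self.enc1 = nn.Linear(feat_dim, n1)        # first layer coding module -> f1
        self.enc2_a = nn.Linear(feat_dim, n2)      # first coding unit of layer 2 -> f2
        self.enc2_b = nn.Linear(n1 + n2, n2)       # second coding unit: FC2(f1, f2)
        self.enc3_a = nn.Linear(feat_dim, n3)      # first coding unit of layer 3 -> f3
        self.enc3_b = nn.Linear(n1 + n2 + n3, n3)  # second coding unit: FC3(f1, f2, f3)

    def forward(self, feats):
        # feats: output of the feature extraction module, shape (batch, feat_dim)
        f1 = self.enc1(feats)
        f2 = self.enc2_a(feats)
        f3 = self.enc3_a(feats)
        z2 = self.enc2_b(torch.cat([f1, f2], dim=1))
        z3 = self.enc3_b(torch.cat([f1, f2, f3], dim=1))
        return f1, z2, z3  # apply S1-S3 (e.g. softmax) to obtain pre1-pre3
```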
In some embodiments of the present application, based on the foregoing scheme, the input unit 1020 is further configured to: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the other layer coding module according to the difference value between the output of the other layer coding modules except the first layer coding module and the label of the corresponding level of the sample data in the multi-layer coding module and the attribution relation between the output of the other layer coding module and the output of the previous layer coding module; and generating the loss function of the multi-layer label classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to the other-layer coding modules.
In some embodiments of the present application, based on the foregoing, the multi-layer coding module includes a first layer coding module, a second layer coding module, and a third layer coding module; the loss function of the multi-layer tag classification model is as follows:
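This formula is likewise an image in the original publication; reconstructed from the symbol definitions below, it plausibly reads:

$$L = L_1(\mathrm{pre1}, \mathrm{gt1}) + F_{21} \cdot L_2(\mathrm{pre2}, \mathrm{gt2}) + F_{32} \cdot L_3(\mathrm{pre3}, \mathrm{gt3})$$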
Wherein pre1, pre2 and pre3 respectively represent the output results of the first layer coding module, the second layer coding module and the third layer coding module; gt1, gt2 and gt3 respectively represent the first layer label, the second layer label and the third layer label of the sample data; L1, L2 and L3 respectively represent the loss functions corresponding to the first layer coding module, the second layer coding module and the third layer coding module; F21 and F32 represent weights: F21 = 1 if pre2 belongs to pre1, and F21 > 1 if pre2 does not belong to pre1; F32 = 1 if pre3 belongs to pre2, and F32 > 1 if pre3 does not belong to pre2.
In some embodiments of the present application, based on the foregoing scheme, each layer of the multi-layer coding modules includes a fully-connected layer, and an input of the fully-connected layer includes an output of the feature extraction module.
In some embodiments of the application, based on the foregoing scheme, the convolution kernels in the fully connected layer perform feature processing based on shared weights.
In some embodiments of the present application, based on the foregoing scheme, the input unit 1020 is further configured to: after a plurality of classification labels corresponding to the target data output by the multi-layer label classification model are obtained, verifying the hierarchical relationship of the classification labels based on a label hierarchical table; and if the hierarchical relation of the plurality of classification labels passes the verification, outputting the plurality of classification labels.
FIG. 11 illustrates a block diagram of a training apparatus for a multi-layer tag classification model according to an embodiment of the application.
Referring to fig. 11, a training apparatus 1100 of a multi-layered tag classification model according to an embodiment of the present application includes: a sample data set acquisition unit 1110, a sample data input unit 1120, and a training unit 1130.
The sample data set obtaining unit 1110 is configured to obtain a sample data set, where sample data in the sample data set includes a sample and a multi-layer tag corresponding to the sample; the sample data input unit 1120 is configured to input sample data in the sample data set into a multi-layer tag classification model, where the multi-layer tag classification model includes a feature extraction module, and a multi-layer encoding module that takes an output of the feature extraction module as an input, where each layer of encoding module corresponds to one layer of classification tag, and inputs of encoding modules of other layers of the multi-layer encoding module except a first layer of encoding module include outputs of encoding modules of a previous layer; the training unit 1130 is configured to adjust parameters of the multi-layer tag classification model according to the output result of the multi-layer tag classification model and a loss value between the multi-layer tags corresponding to the samples, so as to train the multi-layer tag classification model.
Fig. 12 shows a schematic diagram of a computer system suitable for use in implementing an embodiment of the application.
It should be noted that, the computer system 1200 of the electronic device shown in fig. 12 is only an example, and should not impose any limitation on the functions and the application scope of the embodiments of the present application.
As shown in fig. 12, the computer system 1200 includes a central processing unit (Central Processing Unit, CPU) 1201 that can perform various appropriate actions and processes, such as performing the methods described in the above embodiments, according to a program stored in a read-only memory (Read-Only Memory, ROM) 1202 or a program loaded from a storage section 1208 into a random access memory (Random Access Memory, RAM) 1203. The RAM 1203 also stores various programs and data required for system operation. The CPU 1201, the ROM 1202 and the RAM 1203 are connected to each other through a bus 1204. An input/output (Input/Output, I/O) interface 1205 is also connected to the bus 1204.
The following components are connected to the I/O interface 1205: an input section 1206 including a keyboard, a mouse, and the like; an output section 1207 including a cathode ray tube (CRT), a liquid crystal display (Liquid Crystal Display, LCD), a speaker, and the like; a storage section 1208 including a hard disk or the like; and a communication section 1209 including a network interface card such as a LAN (Local Area Network) card, a modem, or the like. The communication section 1209 performs communication processing via a network such as the internet. The drive 1210 is also connected to the I/O interface 1205 as needed. A removable medium 1211, such as a magnetic disk, an optical disk, a magneto-optical disk, or a semiconductor memory, is mounted on the drive 1210 as needed, so that a computer program read therefrom is installed into the storage section 1208 as needed.
In particular, according to embodiments of the present application, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present application include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the methods shown in the flowcharts. In such an embodiment, the computer program can be downloaded and installed from a network via the communication section 1209, and/or installed from the removable medium 1211. When the computer program is executed by the central processing unit (CPU) 1201, the various functions defined in the system of the present application are performed.
It should be noted that the computer readable medium shown in the embodiments of the present application may be a computer readable signal medium or a computer readable storage medium, or any combination of the two. The computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a random access memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a flash memory, an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the context of this document, a computer readable storage medium may be any tangible medium that contains or stores a program for use by or in connection with an instruction execution system, apparatus, or device. In the present application, a computer readable signal medium may include a data signal propagated in baseband or as part of a carrier wave, with computer readable program code embodied therein. Such a propagated data signal may take any of a variety of forms, including, but not limited to, an electromagnetic signal, an optical signal, or any suitable combination of the foregoing. A computer readable signal medium may also be any computer readable medium other than a computer readable storage medium that can send, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wired, or any suitable combination of the foregoing.
The flowcharts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. Where each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
The units involved in the embodiments of the present application may be implemented by software or by hardware, and the described units may also be provided in a processor. In some cases, the names of the units do not constitute a limitation of the units themselves.
As an aspect, the present application also provides a computer-readable medium that may be contained in the electronic device described in the above embodiment; or may exist alone without being incorporated into the electronic device. The computer-readable medium carries one or more programs which, when executed by the electronic device, cause the electronic device to implement the methods described in the above embodiments.
It should be noted that although in the above detailed description several modules or units of a device for action execution are mentioned, such a division is not mandatory. Indeed, the features and functions of two or more modules or units described above may be embodied in one module or unit in accordance with embodiments of the application. Conversely, the features and functions of one module or unit described above may be further divided into a plurality of modules or units to be embodied.
From the above description of embodiments, those skilled in the art will readily appreciate that the example embodiments described herein may be implemented in software, or may be implemented in software in combination with the necessary hardware. Thus, the technical solution according to the embodiments of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (may be a CD-ROM, a U-disk, a mobile hard disk, etc.) or on a network, and includes several instructions to cause a computing device (may be a personal computer, a server, a touch terminal, or a network device, etc.) to perform the method according to the embodiments of the present application.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the embodiments disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (14)

1. A method of classifying data, comprising:
Acquiring target data to be classified;
inputting the target data into a multi-layer tag classification model, wherein the multi-layer tag classification model comprises a feature extraction module and a multi-layer coding module taking the output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and the input of other layers of coding modules except for a first layer of coding module in the multi-layer coding module comprises the output of a coding module of a previous layer; the other layer coding modules except the first layer coding module in the multi-layer coding module comprise a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules; the loss function of the multi-layer tag classification model is generated by: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the other layer coding module according to the difference value between the output of the other layer coding modules except the first layer coding module and the label of the corresponding level of the sample data in the multi-layer coding module and the attribution relation between the output of the other layer coding module and the output of the previous layer coding module; generating a loss function of the multi-layer tag classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to the other-layer coding modules;
and acquiring a plurality of classification labels corresponding to the target data, which are output by the multi-layer label classification model.
2. The data classification method according to claim 1, wherein prior to inputting the target data into a multi-layer tag classification model, the method further comprises:
Obtaining a sample data set, wherein sample data in the sample data set comprises a sample and a multi-layer label corresponding to the sample;
Training the multi-layer tag classification model based on the sample dataset.
3. The data classification method of claim 2, wherein the acquiring a sample dataset comprises:
acquiring a sample and a specified level label corresponding to the sample;
querying a tag hierarchy table based on the specified hierarchy tag to obtain other hierarchy tags associated with the specified hierarchy tag;
generating sample data according to the specified level label, the other level labels and the sample;
a sample data set is established from the sample data.
4. The data classification method according to claim 1, characterized in that the data classification method further comprises:
And carrying out fusion processing on the output of the first coding unit contained in the other layer coding module, the output of the first layer coding module and the output of the first coding unit contained in the coding module at the level between the first layer coding module and the other layer coding module to obtain the output result of the other layer coding module.
5. The data classification method of claim 4, wherein the multi-layer coding module comprises a first layer coding module, a second layer coding module, and a third layer coding module;
the first layer coding module, the second layer coding module and the third layer coding module obtain an output result according to the following formula:
Wherein pre1, pre2 and pre3 respectively represent the output results of the first layer coding module, the second layer coding module and the third layer coding module; S1, S2 and S3 respectively represent the activation functions in the first layer coding module, the second layer coding module and the third layer coding module; f1 denotes the feature obtained after processing by the first layer coding module; f2 denotes the feature obtained after processing by the first coding unit included in the second layer coding module; f3 denotes the feature obtained after processing by the first coding unit included in the third layer coding module; FC2 represents fusing feature f1 and feature f2; FC3 represents fusing feature f1, feature f2 and feature f3; A, B and C are parameters.
6. The data classification method of claim 1, wherein the multi-layer coding module comprises a first layer coding module, a second layer coding module, and a third layer coding module; the loss function of the multi-layer tag classification model is as follows:
Wherein pre1, pre2 and pre3 respectively represent the output results of the first layer coding module, the second layer coding module and the third layer coding module; gt1, gt2 and gt3 respectively represent the first layer label, the second layer label and the third layer label of the sample data; L1, L2 and L3 respectively represent the loss functions corresponding to the first layer coding module, the second layer coding module and the third layer coding module; F21 and F32 represent weights: F21 = 1 if pre2 belongs to pre1, and F21 > 1 if pre2 does not belong to pre1; F32 = 1 if pre3 belongs to pre2, and F32 > 1 if pre3 does not belong to pre2.
7. The data classification method according to any one of claims 1 to 6, wherein each of the multi-layer encoding modules comprises a fully connected layer, an input of the fully connected layer comprising an output of the feature extraction module.
8. The data classification method of claim 7, wherein the convolution kernels in the fully connected layer perform feature processing based on shared weights.
9. The data classification method according to any one of claims 1 to 6, characterized in that the data classification method further comprises:
After a plurality of classification labels corresponding to the target data output by the multi-layer label classification model are obtained, verifying the hierarchical relationship of the classification labels based on a label hierarchical table;
And if the hierarchical relation of the plurality of classification labels passes the verification, outputting the plurality of classification labels.
10. A method for training a multi-layer tag classification model, comprising:
Obtaining a sample data set, wherein sample data in the sample data set comprises a sample and a multi-layer label corresponding to the sample;
Inputting sample data in the sample data set into a multi-layer tag classification model, wherein the multi-layer tag classification model comprises a feature extraction module and multi-layer coding modules taking output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and input of other layers of coding modules except for a first layer of coding module in the multi-layer coding modules comprises output of coding modules of a previous layer; the other layer coding modules except the first layer coding module in the multi-layer coding module comprise a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules; the loss function of the multi-layer tag classification model is generated by: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the other layer coding module according to the difference value between the output of the other layer coding modules except the first layer coding module and the label of the corresponding level of the sample data in the multi-layer coding module and the attribution relation between the output of the other layer coding module and the output of the previous layer coding module; generating a loss function of the multi-layer tag classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to the other-layer coding modules;
And adjusting parameters of the multi-layer label classification model according to the output result of the multi-layer label classification model and the loss value between the multi-layer labels corresponding to the samples so as to train the multi-layer label classification model.
11. A data sorting apparatus, comprising:
The first acquisition unit is used for acquiring target data to be classified;
The input unit is used for inputting the target data into a multi-layer label classification model, wherein the multi-layer label classification model comprises a feature extraction module and multi-layer coding modules taking the output of the feature extraction module as input, and each layer of coding module corresponds to one layer of classification label; when the multi-layer label classification model is trained, the inputs of the coding modules of other layers except the first layer coding module in the multi-layer coding module comprise the output of the coding module of the previous layer; the other layer coding modules except the first layer coding module in the multi-layer coding module comprise a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules;
The input unit is further configured to: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the other layer coding module according to the difference value between the output of the other layer coding modules except the first layer coding module and the label of the corresponding level of the sample data in the multi-layer coding module and the attribution relation between the output of the other layer coding module and the output of the previous layer coding module; generating a loss function of the multi-layer tag classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to the other-layer coding modules;
And the second acquisition unit is used for acquiring a plurality of classification labels corresponding to the target data, which are output by the multi-layer label classification model.
12. A training device for a multi-layer tag classification model, comprising:
A sample data set acquisition unit, configured to acquire a sample data set, where sample data in the sample data set includes a sample and a multi-layer tag corresponding to the sample;
a sample data input unit, configured to input sample data in the sample data set into a multi-layer tag classification model, where the multi-layer tag classification model includes a feature extraction module, and multi-layer coding modules that take output of the feature extraction module as input, each layer of coding module corresponds to one layer of classification tag, and inputs of coding modules of other layers of coding modules except a first layer of coding module in the multi-layer coding module include output of coding modules of a previous layer; the other layer coding modules except the first layer coding module in the multi-layer coding module comprise a first coding unit and a second coding unit; the input of the first coding unit comprises the output of the feature extraction module, and the input of the second coding unit comprises the output of the first coding unit, the output of a first layer coding module and the output of the first coding unit contained in a coding module at a level between the first layer coding module and the other layer coding modules; the loss function of the multi-layer tag classification model is generated by: generating a loss function corresponding to the first layer coding module according to the difference value between the output of the first layer coding module and the first layer label of the sample data; generating a loss function corresponding to the other layer coding module according to the difference value between the output of the other layer coding modules except the first layer coding module and the label of the corresponding level of the sample data in the multi-layer coding module and the attribution relation between the output of the other layer coding module and the output of the previous layer coding module; generating a loss function of the multi-layer tag classification model according to the loss function corresponding to the first-layer coding module and the loss functions corresponding to the other-layer coding modules;
And the training unit is used for adjusting parameters of the multi-layer label classification model according to the output result of the multi-layer label classification model and the loss value between the multi-layer labels corresponding to the samples so as to train the multi-layer label classification model.
13. A computer readable medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the data classification method according to any one of claims 1 to 9.
14. An electronic device, comprising:
One or more processors;
storage means for storing one or more programs which when executed by the one or more processors cause the one or more processors to implement the data classification method of any of claims 1 to 9.
CN202110610877.2A 2021-06-01 2021-06-01 Data classification method, training device, medium and electronic equipment Active CN113837216B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110610877.2A CN113837216B (en) 2021-06-01 2021-06-01 Data classification method, training device, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN113837216A CN113837216A (en) 2021-12-24
CN113837216B true CN113837216B (en) 2024-05-10

Family

ID=78962562

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115577106B (en) * 2022-10-14 2023-12-19 北京百度网讯科技有限公司 Text classification method, device, equipment and medium based on artificial intelligence

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11748613B2 (en) * 2019-05-10 2023-09-05 Baidu Usa Llc Systems and methods for large scale semantic indexing with deep level-wise extreme multi-label learning

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2021087985A1 (en) * 2019-11-08 2021-05-14 深圳市欢太科技有限公司 Model training method and apparatus, storage medium, and electronic device
CN111177371A (en) * 2019-12-05 2020-05-19 腾讯科技(深圳)有限公司 Classification method and related device
CN111177569A (en) * 2020-01-07 2020-05-19 腾讯科技(深圳)有限公司 Recommendation processing method, device and equipment based on artificial intelligence
CN111783861A (en) * 2020-06-22 2020-10-16 北京百度网讯科技有限公司 Data classification method, model training device and electronic equipment
CN111626063A (en) * 2020-07-28 2020-09-04 浙江大学 Text intention identification method and system based on projection gradient descent and label smoothing
CN111737521A (en) * 2020-08-04 2020-10-02 北京微播易科技股份有限公司 Video classification method and device
CN112353402A (en) * 2020-10-22 2021-02-12 平安科技(深圳)有限公司 Training method of electrocardiosignal classification model, electrocardiosignal classification method and device
CN112182229A (en) * 2020-11-05 2021-01-05 江西高创保安服务技术有限公司 Text classification model construction method, text classification method and device
CN112417150A (en) * 2020-11-16 2021-02-26 建信金融科技有限责任公司 Industry classification model training and using method, device, equipment and medium
CN112232524A (en) * 2020-12-14 2021-01-15 北京沃东天骏信息技术有限公司 Multi-label information identification method and device, electronic equipment and readable storage medium


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant