JP2023044336A - Learning apparatus, learning method, and program

Info

Publication number: JP2023044336A
Application number: JP2021152315A
Authority: JP
Inventors: 恭史国定; Yasushi Kunisada
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2021-09-17
Filing date: 2021-09-17
Publication date: 2023-03-30

Abstract

To provide a technology capable of obtaining a model with high interpretability and accuracy while suppressing human cost.
SOLUTION: A learning apparatus includes: an input unit which acquires first input data and a ground truth value of the first input data; an inference unit which outputs first inference values corresponding to each of a plurality of inference models, on the basis of the first input data and the inference models; an explanation unit which outputs first explanatory information corresponding to each of the inference models that indicates a degree of contribution of the first input data to the first inference values; an inference evaluation unit which obtains an inference evaluation result on the basis of the ground truth value and the first inference value; an explanation evaluation unit which obtains an explanation evaluation result on the basis of a matching degree between the pieces of first explanatory information corresponding to each of the inference models; and an update unit which updates a first weight parameter of the inference models, on the basis of the inference evaluation result and the explanation evaluation result.
SELECTED DRAWING: Figure 1

Description

The present invention relates to a learning device, a learning method, and a program.

A neural network (hereinafter also referred to as "NN") achieves high performance in tasks such as image recognition. However, an NN generally consists of an enormous number of parameters and a complex model, which makes it difficult to interpret the relationship between the NN's parameters and its outputs. To address this problem, several techniques for obtaining a highly interpretable NN have been proposed. Note that "highly interpretable" can also be rephrased as "highly consistent with human perception."

For example, a known technique obtains a model that is easy for humans to interpret by manually attaching heat map labels that indicate the regions the NN model should attend to when making a judgment, and then training the model so that its heat maps match those labels (see, for example, Non-Patent Document 1). In addition, when the interpretability of a heat map obtained from a model is low, a more interpretable model can be obtained by retraining the model so that it does not match that heat map.

Another known technique improves the accuracy of an NN by introducing into the network a mechanism that extracts, from the input data, the region of interest the NN uses to make its judgment (see, for example, Non-Patent Document 2). The interpretability and accuracy of the NN can be improved by having a human correct the region of interest obtained in this way and retraining the NN so that its region of interest matches the corrected one.

Non-Patent Document 1: Andrew Ross and 2 others, "Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations", [online], [searched September 8, 2021], Internet <https://arxiv.org/abs/1703.03717>
Non-Patent Document 2: Masahiro Mitsuhara and 6 others, "Embedding Human Knowledge into Deep Neural Network via Attention Map", [online], [searched September 8, 2021], Internet <https://arxiv.org/abs/1905.03540>
Non-Patent Document 3: "Grad-CAM: Visual Explanations from Deep Networks via Gradient-based Localization", [online], [searched September 8, 2021], Internet <https://arxiv.org/abs/1610.02391v3>

However, the techniques described in Non-Patent Document 1 and Non-Patent Document 2, in which heat map labels are prepared manually, incur a large human cost for labeling.

On the other hand, one technique that does not require labeling is the one described in Non-Patent Document 1, in which a model is retrained so that it does not match the heat map of the trained model. However, this technique has the problem that retraining is highly likely to lower accuracy. A further problem is that the degree of heat map matching is lowered uniformly across all data, so that for individual pieces of data the interpretability of the heat map may actually become worse.

It is therefore desirable to provide a technique capable of obtaining a model with high interpretability and accuracy while suppressing human cost.

To solve the above problem, according to one aspect of the present invention, there is provided a learning device including: an input unit that acquires first input data and a correct value of the first input data; an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of the contribution of the first input data to the first inference value; an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value; an explanation evaluation unit that obtains an explanation evaluation result based on the degree of matching between the pieces of first explanation information corresponding to the plurality of inference models; and an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.

The input unit may acquire second input data; the inference unit may output, based on the second input data and a plurality of trained models (the plurality of inference models after the first weight parameters have been updated), a second inference value corresponding to each of the plurality of trained models; the explanation unit may output second explanation information corresponding to each of the plurality of trained models, the second explanation information indicating the magnitude of the contribution of the second input data to the second inference value; and the learning device may include a presentation control unit that controls presentation of the second explanation information to a user.

The presentation control unit may control presentation of the second inference value and the second explanation information to the user.

The learning device may include a recording control unit that controls recording of information indicating one or more trained models selected by the user from the plurality of trained models.

The explanation evaluation result may take a smaller value as the degree of matching between the pieces of first explanation information corresponding to the plurality of inference models becomes larger.

The explanation evaluation unit may obtain the explanation evaluation result based on inner products of vectors obtained by normalizing the first explanation information corresponding to each of the plurality of inference models.
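The inner-product variant can be illustrated with a short NumPy sketch. This is only one possible reading: the function name, the averaging over model pairs, and the `1 - similarity` sign convention (chosen so that closer agreement gives a smaller value) are assumptions made for illustration and are not taken from the publication.

```python
import numpy as np


def explanation_loss_inner_product(heatmaps):
    """Return a value that shrinks as the heat maps agree with one another.

    heatmaps: array of shape (n_models, ...), one heat map per inference model
    (n_models must be at least 2).
    """
    n = heatmaps.shape[0]
    flat = heatmaps.reshape(n, -1).astype(float)
    # Normalize each heat map to unit length (the floor guards all-zero maps).
    norms = np.linalg.norm(flat, axis=1, keepdims=True)
    unit = flat / np.maximum(norms, 1e-12)
    # Inner products between every pair of normalized heat maps.
    gram = unit @ unit.T
    pair_idx = np.triu_indices(n, k=1)
    mean_similarity = gram[pair_idx].mean()
    # Assumed sign convention: higher agreement -> smaller evaluation value.
    return float(1.0 - mean_similarity)
```

With non-negative heat maps the value lies between 0 (all maps point in the same direction) and 1 (all maps are mutually orthogonal).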

The explanation evaluation unit may, for each of the plurality of inference models, binarize the first explanation information to generate a mask, calculate the product of the first explanation information corresponding to that inference model and the masks generated from the first explanation information corresponding to the other inference models, and obtain the explanation evaluation result based on the sum of those products over the plurality of inference models.
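A minimal NumPy sketch of this mask-based variant follows. The binarization threshold, the function name, and the negation of the summed overlap (so that stronger agreement yields a smaller value) are assumptions for illustration; the text above only specifies binarization, the product with the other models' masks, and the sum.

```python
import numpy as np


def explanation_loss_mask(heatmaps, threshold=0.5):
    """Mask-based agreement measure between the heat maps of several models.

    heatmaps: array of shape (n_models, ...), one heat map per inference model.
    threshold: assumed cut-off used to binarize a heat map into a 0/1 mask.
    """
    n = heatmaps.shape[0]
    flat = heatmaps.reshape(n, -1).astype(float)
    # One binary mask per model.
    masks = (flat >= threshold).astype(float)
    overlap = 0.0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            # Product of model i's heat map with the mask from another model j.
            overlap += float(np.sum(masks[j] * flat[i]))
    # Assumed sign convention: larger overlap -> smaller evaluation value.
    return -overlap
```

Heat maps that highlight the same region produce a large overlap and therefore a smaller value than heat maps that highlight disjoint regions.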

The explanation unit may include a function through which errors can be backpropagated.

The explanation unit may have a second weight parameter, and the update unit may update the second weight parameter by error backpropagation.

At least one of the plurality of inference models may include a neural network. Note that a neural network is merely one example of a machine learning algorithm, so another machine learning algorithm may be used in its place.

The update unit may update the first weight parameters based on the sum of the inference evaluation result and the explanation evaluation result.

The first explanation information may be a heat map indicating the magnitude of the contribution of the first input data to the first inference value.

According to another aspect of the present invention, there is provided a learning method including: acquiring first input data and a correct value of the first input data; outputting, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; outputting first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of the contribution of the first input data to the first inference value; obtaining an inference evaluation result based on the correct value and the first inference value; obtaining an explanation evaluation result based on the degree of matching between the pieces of first explanation information corresponding to the plurality of inference models; and updating first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.

According to yet another aspect of the present invention, there is provided a program that causes a computer to function as a learning device including: an input unit that acquires first input data and a correct value of the first input data; an inference unit that outputs, based on the first input data and a plurality of inference models, a first inference value corresponding to each of the plurality of inference models; an explanation unit that outputs first explanation information corresponding to each of the plurality of inference models, the first explanation information indicating the magnitude of the contribution of the first input data to the first inference value; an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value; an explanation evaluation unit that obtains an explanation evaluation result based on the degree of matching between the pieces of first explanation information corresponding to the plurality of inference models; and an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.

As described above, the present invention provides a technique capable of obtaining a model with high interpretability and accuracy while suppressing human cost.

FIG. 1 is a diagram showing a functional configuration example of a learning device according to an embodiment of the present invention. The remaining drawings are, in order: a diagram for explaining a technique for obtaining an explanation evaluation result by multiplying a mask, obtained by binarizing a heat map, with another heat map; a flowchart showing an operation example in the learning stage of the learning device according to the embodiment; a flowchart showing an operation example in the test stage of the learning device according to the embodiment; and a diagram showing the hardware configuration of an information processing device as an example of the learning device.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings. In this specification and the drawings, constituent elements having substantially the same functional configuration are given the same reference numerals, and redundant description of them is omitted.

In this specification and the drawings, a plurality of constituent elements having substantially the same functional configuration may be distinguished by appending different numbers to the same reference numeral. When there is no particular need to distinguish them, only the common reference numeral is used. Likewise, similar constituent elements of different embodiments may be distinguished by appending different letters to the same reference numeral. When there is no particular need to distinguish them, only the common reference numeral is used.

(0. Outline of embodiment)
An outline of an embodiment of the present invention is described first. The embodiment concerns a learning device that trains neural networks based on combinations of input data (learning data) and correct values. A neural network, however, is merely one example of a machine learning algorithm, so another machine learning algorithm may be used in its place. For example, an SVM (Support Vector Machine) may be used.

(1. Details of embodiment)
An embodiment of the present invention is described in detail below.

(1.1. Configuration example of learning device)
FIG. 1 is a diagram showing a functional configuration example of a learning device 10 according to an embodiment of the present invention. As shown in FIG. 1, the learning device 10 includes an input unit 101, an inference unit 102, an explanation unit 103, an inference evaluation unit 104, an explanation evaluation unit 105, an update unit 106, a presentation control unit 107, a recording control unit 108, a display unit 121, and an operation unit 122.

The embodiment mainly assumes that the inference unit 102 includes n inference models (n is an integer greater than 1), that is, a "first inference model" through an "n-th inference model". It also mainly assumes that each of the first to n-th inference models includes a neural network. In the following, a neural network is also written as "NN".

The NN included in each of the first to n-th inference models uses weight parameters 110 (first weight parameters). The NNs included in the first to n-th inference models may share a common structure and differ only in the weight parameters 110 (first weight parameters) they use. Alternatively, the NNs may have different structures.

Note that it is sufficient for at least one of the first to n-th inference models to include an NN. For example, some of the first to n-th inference models may include an NN, while the others include another machine learning algorithm instead of an NN.

Furthermore, the embodiment mainly assumes that the explanation unit 103 includes an NN. The NN included in the explanation unit 103 uses weight parameters (second weight parameters).

The data set 100, the weight parameters 110 (first weight parameters) of the first to n-th inference models, and the weight parameters (second weight parameters) of the explanation unit 103 are stored in a storage unit (not shown). The storage unit may be a memory such as a RAM (Random Access Memory), a hard disk drive, or a flash memory.

The input unit 101, the inference unit 102, the explanation unit 103, the inference evaluation unit 104, the explanation evaluation unit 105, the update unit 106, the presentation control unit 107, and the recording control unit 108 include an arithmetic device such as a CPU (Central Processing Unit) or a GPU (Graphics Processing Unit). Their functions can be realized by the arithmetic device loading a program stored in a ROM (Read Only Memory) into the RAM and executing it. A computer-readable recording medium on which the program is recorded may also be provided. Alternatively, these blocks may be implemented with dedicated hardware or with a combination of multiple pieces of hardware. Data required for the calculations performed by the arithmetic device is stored as appropriate in the storage unit (not shown).

In the initial state, initial values are set for the weight parameters 110 of the first to n-th inference models and for the weight parameters of the explanation unit 103. These initial values may be random values, for example, but any values may be used. For example, they may be trained values obtained in advance through learning.
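As a rough illustration of models that share one structure but start from different random weight parameters, the sketch below initializes n single-layer classifiers. The dictionary layout, the layer shape, the standard deviation of 0.01, and the function name are hypothetical choices, not details given in the publication.

```python
import numpy as np


def init_weight_parameters(n_models, n_inputs, n_classes, seed=0):
    """Create one randomly initialized weight set per inference model.

    Every model has the same (hypothetical) single-layer structure; only the
    random values of the weights differ between models.
    """
    rng = np.random.default_rng(seed)
    return [
        {
            "W": rng.normal(0.0, 0.01, size=(n_classes, n_inputs)),
            "b": np.zeros(n_classes),
        }
        for _ in range(n_models)
    ]
```

Replacing the random draw with values loaded from an earlier training run would correspond to the pre-trained initialization mentioned above.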

(Data set 100)
The data set 100 includes a plurality of pieces of input data (first input data) used in the learning stage and the correct value of each of them. The input data used in the learning stage corresponds to learning data. The data set 100 also includes a plurality of pieces of input data (second input data) used in the test stage. The input data used in the test stage corresponds to test data.

It is mainly assumed that the test data is prepared separately from the learning data. However, the test data may include part of the learning data.

The embodiment mainly assumes that the input data is image data (in particular, still image data). However, the type of input data is not particularly limited, and data other than image data may be used. For example, the input data may be moving image data containing a plurality of frames, or it may be audio data.

(Input unit 101)
In the learning stage, the input unit 101 sequentially acquires, from the data set 100, the combinations of input data and correct values used in the learning stage, and sequentially outputs them to the inference unit 102. In the test stage, the input unit 101 sequentially acquires, from the data set 100, the input data used in the test stage, and sequentially outputs it to the inference unit 102.

For example, once the input unit 101 has acquired and output all of the combinations of input data and correct values used in the learning stage, it may start acquiring the combinations again from the beginning and output them again, repeating this a predetermined number of times. In that case, the blocks downstream of the input unit 101 may also repeat their respective processing sequentially based on the repeated input. In the test stage, by contrast, the input unit 101 may stop acquiring input data once it has acquired and output all of the input data used in the test stage.
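The difference between the two stages can be sketched with two small Python generators. The function names and the representation of the data set as a list of `(input, correct value)` pairs are assumptions made for illustration.

```python
def iterate_training_pairs(dataset, n_epochs):
    """Yield (input, correct value) pairs, restarting from the top each epoch.

    n_epochs plays the role of the predetermined number of repetitions.
    """
    for _ in range(n_epochs):
        for x, y in dataset:
            yield x, y


def iterate_test_inputs(test_inputs):
    """Yield each test input exactly once, then stop."""
    for x in test_inputs:
        yield x
```

For example, `list(iterate_training_pairs([("a", 0), ("b", 1)], 2))` walks through both pairs twice, whereas `iterate_test_inputs` ends after a single pass.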

(Inference unit 102)
In the learning stage, the inference unit 102 obtains an inference value (first inference value) corresponding to each of the first to n-th inference models, based on the input data received from the input unit 101 and on the first to n-th inference models. Similarly, in the test stage, the inference unit 102 obtains an inference value (second inference value) corresponding to each of the first to n-th inference models, based on the input data received from the input unit 101 and on the first to n-th inference models.

The weight parameters 110 used by the first to n-th inference models are stored in the storage unit (not shown). The inference unit 102 therefore acquires the weight parameters 110 from the storage unit and performs inference with the first to n-th inference models based on the acquired weight parameters 110 and the input data received from the input unit 101.

In this specification, obtaining an output from an NN based on an input to the NN is broadly referred to as "inference".

As an example, if Fi (i is an integer from 1 to n) denotes the function representing the i-th inference model and x denotes the input to the i-th inference model, the output of the i-th inference model can be expressed as Fi(x).

As described later, some of the explanation methods (that is, methods of generating explanation information) that the explanation unit 103 may use require, in addition to the inference value, information such as feature quantities (intermediate features) output from each of the first to n-th inference models. In such a case, the inference unit 102 may output to the explanation unit 103, together with the inference values, the feature quantities output from the intermediate layers of each of the first to n-th inference models.

The specific configurations of the first to n-th inference models are not particularly limited. However, the output format of each of the first to n-th inference models should be chosen to match the format of the correct values corresponding to the input data. For example, if the correct value is a class in a classification problem, the output of each of the first to n-th inference models may be a one-hot vector whose length equals the number of classes.
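The matching of output format and correct-value format can be sketched as follows. The helper names are hypothetical, and the use of a softmax over raw scores is an assumption: a softmax output has the same length as the one-hot correct value, and taking its largest entry recovers a one-hot prediction.

```python
import numpy as np


def one_hot(label, n_classes):
    """Encode a class index as a one-hot vector of length n_classes."""
    v = np.zeros(n_classes)
    v[label] = 1.0
    return v


def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    shifted = logits - np.max(logits)  # subtract the max for numerical stability
    e = np.exp(shifted)
    return e / e.sum()


def to_one_hot_prediction(probs):
    """Collapse a probability vector to a one-hot vector at its largest entry."""
    return one_hot(int(np.argmax(probs)), probs.shape[0])
```

Because both vectors have one entry per class, they can be compared directly when the inference value is evaluated against the correct value.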

In the learning stage, the inference unit 102 outputs the inference values corresponding to the first to n-th inference models to both the explanation unit 103 and the inference evaluation unit 104. In the test stage, the inference unit 102 outputs the inference values corresponding to the first to n-th inference models to both the explanation unit 103 and the presentation control unit 107.

(Explanation unit 103)
For each of the first to n-th inference models, the explanation unit 103 generates explanation information that explains the basis for the inference value received from the inference unit 102.

Here, the explanation information is information indicating the magnitude of the contribution of the input data to the inference value received from the inference unit 102. The following description mainly covers the case where the explanation information is a heat map that shows the magnitude of the contribution of the input data to the inference value for each region (for example, each pixel of an image) or each variable. A heat map can thus show which regions or variables of the input data were important to the judgment.

When the input data is image data or the like, the heat map can be represented as a two-dimensional vector (array). When the input data is tabular data or the like, the heat map can be represented as a one-dimensional vector.

The heat map may be generated in any manner. For example, the explanation unit 103 may generate the heat map based on the inference value received from the inference unit 102. Alternatively, as noted above, the inference unit 102 may pass feature quantities as well as the inference value to the explanation unit 103. In that case, the explanation unit 103 may generate the heat map based on both the inference value and the feature quantities received from the inference unit 102.

For example, the explanation unit 103 may include a function through which errors can be backpropagated. In that case, as described later, the update unit 106 can update the weight parameters of the explanation unit 103 by error backpropagation. That is, the explanation unit 103 may generate the heat map using the weight parameters as updated by error backpropagation.

As an explanation method that generates a heat map using weight parameters updated by error backpropagation, the method known as Grad-CAM, described in Non-Patent Document 3, can be applied, for example. Grad-CAM is an explanation method that outputs a heat map showing the regions of the NN's input that contribute strongly to the inference value. Various other explanation methods, such as Vanilla Gradient and SmoothGrad, can also be applied.

上記したように、ｉ番目の推論モデルに対応する推論値はＦｉ（ｘ）と表現され得るため、一例として、ヒートマップの生成処理を示す関数をＧとすると、説明部１０３によって生成されるｉ番目の推論モデルに対応するヒートマップＴｉ（ｘ）は、以下の式（１）のように表現され得る。 As described above, the inference value corresponding to the i-th inference model can be expressed as Fi(x). Therefore, as an example, letting G denote the function representing the heat map generation process, the heat map Ti(x) generated by the explanation unit 103 for the i-th inference model can be expressed as in Equation (1) below.

Ｔｉ（ｘ）＝Ｇ（Ｆｉ（ｘ））・・・（１） Ti(x)=G(Fi(x)) (1)
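
As an illustration of Equation (1) for one-dimensional (tabular) input, the following is a minimal sketch rather than the method of the embodiment. It assumes a hypothetical two-class linear model `toy_model` in the role of Fi, and it swaps in central finite differences for the backpropagated gradient that Vanilla Gradient would use. Because a gradient-based explanation needs the model and the input, the hypothetical function `saliency`, which plays the role of G, receives both instead of only the inference value.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [v / total for v in exps]


def toy_model(weights, x):
    """Hypothetical stand-in for F_i: one linear layer followed by softmax."""
    logits = [sum(w * xi for w, xi in zip(row, x)) for row in weights]
    return softmax(logits)


def saliency(weights, x, eps=1e-5):
    """Hypothetical stand-in for G: |d(top-class score) / d(x_e)| per element.

    Uses central finite differences instead of backpropagation (assumption).
    """
    probs = toy_model(weights, x)
    top = probs.index(max(probs))  # class with the highest inference value
    heat = []
    for e in range(len(x)):
        hi = list(x)
        lo = list(x)
        hi[e] += eps
        lo[e] -= eps
        grad = (toy_model(weights, hi)[top] - toy_model(weights, lo)[top]) / (2 * eps)
        heat.append(abs(grad))
    return heat
```

An element whose weights are zero for every class receives a heat value of zero, since perturbing it leaves the inference value unchanged.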

説明部１０３は、学習段階において、生成したｎ個のヒートマップ（第１の説明情報）を説明評価部１０５に出力する。一方、説明部１０３は、テスト段階において、生成したｎ個のヒートマップ（第２の説明情報）を提示制御部１０７に出力する。 Explanation unit 103 outputs the generated n heat maps (first explanation information) to explanation evaluation unit 105 in the learning stage. On the other hand, the explanation unit 103 outputs the generated n heat maps (second explanation information) to the presentation control unit 107 in the test stage.

（推論評価部１０４）
推論評価部１０４は、推論部１０２から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と入力部１０１によって取得された正解値とに基づいて、推論評価結果を得る。より詳細に、推論評価部１０４は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と入力部１０１によって取得された正解値とを比較することによって、推論評価結果を得る。 (Inference evaluation unit 104)
The inference evaluation unit 104 obtains an inference evaluation result based on the inference values corresponding to the first to n-th inference models input from the inference unit 102 and the correct value acquired by the input unit 101. More specifically, the inference evaluation unit 104 obtains the inference evaluation result by comparing the inference value corresponding to each of the first to n-th inference models with the correct value acquired by the input unit 101.

本発明の実施形態では、推論評価部１０４が、推論値と正解値とに応じた損失関数の第１推論モデルから第ｎ推論モデルまでについての和を推論評価結果の例としての損失関数Ｌ１として算出する場合を想定する。ここで、推論値と正解値とに応じた損失関数は特定の関数に限定されず、一般的なニューラルネットワークにおいて用いられる損失関数と同様の損失関数が用いられてよい。例えば、推論値と正解値とに応じた損失関数は、正解値と推論値との差分に基づくクロスエントロピー誤差であってもよい。 In the embodiment of the present invention, it is assumed that the inference evaluation unit 104 calculates, as a loss function L1 serving as an example of the inference evaluation result, the sum over the first to n-th inference models of a loss function that depends on the inference value and the correct value. Here, the loss function that depends on the inference value and the correct value is not limited to a specific function, and a loss function similar to those used in general neural networks may be used. For example, it may be a cross-entropy error based on the difference between the correct value and the inference value.
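
The sum described above can be sketched as follows, assuming a classification task in which each inference model outputs a list of class probabilities and the correct value is a class index. The function names `cross_entropy` and `inference_loss` are hypothetical and chosen for illustration only.

```python
import math


def cross_entropy(probs, label, eps=1e-12):
    """Cross-entropy of one model's class probabilities against the correct class."""
    return -math.log(max(probs[label], eps))  # eps guards against log(0)


def inference_loss(all_probs, label):
    """L1: sum of the per-model losses over the first to n-th inference models."""
    return sum(cross_entropy(probs, label) for probs in all_probs)
```

For example, `inference_loss([[0.5, 0.5], [0.25, 0.75]], 1)` adds the loss of a model that assigns probability 0.5 to the correct class and the loss of a model that assigns it 0.75.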

推論評価部１０４は、推論評価結果を更新部１０６に出力する。 The inference evaluation unit 104 outputs the inference evaluation result to the update unit 106 .

（説明評価部１０５）
説明評価部１０５は、説明部１０３から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応するヒートマップに基づいて説明評価結果を得る。より詳細に、説明評価部１０５は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応するヒートマップ同士を比較する。そして、説明評価部１０５は、比較結果としてのｎ個のヒートマップ同士の一致度に基づいて、説明評価結果を得る。 (Explanation evaluation unit 105)
The explanation evaluation unit 105 obtains an explanation evaluation result based on the heat maps corresponding to the first to n-th inference models input from the explanation unit 103. More specifically, the explanation evaluation unit 105 compares the heat maps corresponding to the first to n-th inference models with one another. The explanation evaluation unit 105 then obtains the explanation evaluation result based on the degree of matching among the n heat maps found by that comparison.

本発明の実施形態では、ｎ個のヒートマップ同士の一致度が大きいほど説明評価結果が小さい値を取る損失関数である場合を主に想定する。なお、ｎ個のヒートマップ同士の一致度は、ｎ個のヒートマップ同士がどの程度乖離しているかを示す乖離度と換言されてもよい。かかる場合には、ｎ個のヒートマップ同士の乖離度が小さいほど説明評価結果が小さい値を取る損失関数であってよい。 In the embodiment of the present invention, it is mainly assumed that the loss function takes a smaller value for the explanation evaluation result as the degree of matching between the n heat maps increases. Note that the degree of matching between the n heat maps may be rephrased as the degree of divergence indicating how much the n heat maps diverge from each other. In such a case, the loss function may be such that the smaller the degree of divergence between the n heat maps, the smaller the explanation evaluation result.

ｎ個のヒートマップから説明評価結果を得る手法は限定されない。ここでは、説明評価結果を得る手法として、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法、および、正規化されたヒートマップ同士の内積によって説明評価結果を得る手法について順に説明する。 The method of obtaining the explanation evaluation result from the n heat maps is not limited. Two methods are described here in order: one that multiplies a mask obtained by binarizing a heat map with the other heat maps, and one that takes the inner product of normalized heat maps.

図２は、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法について説明するための図である。図２に示された例では、説明を簡便にするため、ｎ＝２である場合、すなわち、推論部１０２が、第１推論モデルおよび第２推論モデルを有する場合を想定する。 FIG. 2 is a diagram for explaining a method of obtaining an explanation evaluation result by multiplying a mask obtained by binarizing a heat map and another heat map. In the example shown in FIG. 2, to simplify the explanation, it is assumed that n=2, that is, the reasoning unit 102 has the first reasoning model and the second reasoning model.

図２を参照すると、第１推論モデルからは、推論値とヒートマップＨ１とが出力されている。一方、第２推論モデルからは、推論値とヒートマップＨ２とが出力されている。図２では、ヒートマップＨ１およびヒートマップＨ２において、入力データのうち推論値への寄与が大きい領域ほど濃い色によって示されている。 Referring to FIG. 2, the inference values and the heat map H1 are output from the first inference model. On the other hand, an inference value and a heat map H2 are output from the second inference model. In FIG. 2, in the heat map H1 and the heat map H2, areas of the input data that contribute more to the inference value are indicated by darker colors.

説明評価部１０５は、ヒートマップＨ１の二値化を行ってマスクＭ１を生成するとともに、ヒートマップＨ２の二値化を行ってマスクＭ２を生成する。なお、二値化は、閾値ｃ以上である要素（例えば、ヒートマップを構成するピクセル）の値を１とし、閾値ｃよりも小さい要素の値を０とすることによって実行され得る。図２においては、二値のうち１が黒によって示され、０が白によって示されている。 The explanation evaluation unit 105 binarizes the heat map H1 to generate the mask M1, and binarizes the heat map H2 to generate the mask M2. Binarization can be performed by setting the value of each element (for example, each pixel of the heat map) that is equal to or greater than a threshold c to 1, and the value of each element smaller than the threshold c to 0. In FIG. 2, of the two values, 1 is shown in black and 0 in white.

説明評価部１０５は、第１推論モデルから出力されたヒートマップＨ１と、第２の推論モデルから出力されたヒートマップＨ２から生成したマスクＭ２との積を、要素ごとに計算する。同様に、説明評価部１０５は、第２推論モデルから出力されたヒートマップＨ２と、第１の推論モデルから出力されたヒートマップＨ１から生成したマスクＭ１との積を、要素ごとに計算する。これによって、各要素に対応する積の集合が推論モデルごとに得られる。 The explanation evaluation unit 105 calculates the product of the heat map H1 output from the first inference model and the mask M2 generated from the heat map H2 output from the second inference model for each element. Similarly, the explanation evaluation unit 105 calculates the product of the heat map H2 output from the second inference model and the mask M1 generated from the heat map H1 output from the first inference model for each element. This yields a set of products corresponding to each element for each inference model.

説明評価部１０５は、各要素に対応する積を全部の推論モデルについて足し合わせることによって積の和を計算する。そして、説明評価部１０５は、このようにして計算した積の和を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 The explanation evaluation unit 105 calculates the sum of the products by adding the products corresponding to each element for all the inference models. Then, the explanation evaluation unit 105 calculates a total value by adding the sums of products calculated in this way for all the elements. The explanation evaluation unit 105 uses this total value as a loss function L2 as an example of the explanation evaluation result.

図２を参照しながらｎ＝２である場合について説明した。ｎを１より大きい任意の整数であるとして説明すると、以下の通りである。 The case of n=2 has been described with reference to FIG. If n is an arbitrary integer greater than 1, it is as follows.

すなわち、説明評価部１０５は、ｉ＝１～ｎについて、ヒートマップＴｉ（ｘ）の各要素の値を二値化したマスクＭｉ（ｘ）を生成する。次に、説明評価部１０５は、推論モデルごとに、自身の推論モデルから出力されたヒートマップＴｉ（ｘ）と、自身以外の推論モデルに対応するヒートマップから生成したマスクＭ１（ｘ）～Ｍｉ－１（ｘ）、Ｍｉ＋１（ｘ）～Ｍｎ（ｘ）の和との積を要素ごとに計算する。 That is, for i = 1 to n, the explanation evaluation unit 105 generates a mask Mi(x) by binarizing the value of each element of the heat map Ti(x). Next, for each inference model, the explanation evaluation unit 105 calculates, element by element, the product of the heat map Ti(x) output from that inference model and the sum of the masks M1(x) to Mi−1(x) and Mi+1(x) to Mn(x) generated from the heat maps of the other inference models.

説明評価部１０５は、各要素に対応する積を第１推論モデルから第ｎ推論モデルまでについて足し合わせることによって積の和を計算する。そして、説明評価部１０５は、このようにして計算した積の和に基づいて、説明評価結果を得る。より詳細に、説明評価部１０５は、積の和を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 The explanation evaluation unit 105 calculates the sum of the products by adding the products corresponding to the respective elements for the first to n-th inference models. Then, the explanation evaluation unit 105 obtains an explanation evaluation result based on the sum of products calculated in this way. More specifically, the explanation evaluation unit 105 calculates the total value by adding the sum of products for all the elements. The explanation evaluation unit 105 uses this total value as a loss function L2 as an example of the explanation evaluation result.
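
The mask-based calculation described above can be sketched as follows. This is a minimal illustration that assumes each heat map has been flattened into a list of floats of equal length; the function names `binarize` and `mask_overlap_loss` are hypothetical.

```python
def binarize(heatmap, c):
    """Mask M_i: 1 where the heat map is at or above threshold c, else 0."""
    return [1 if v >= c else 0 for v in heatmap]


def mask_overlap_loss(heatmaps, c):
    """L2: each heat map times the summed masks of the other models, totalled."""
    masks = [binarize(h, c) for h in heatmaps]
    total = 0.0
    for e in range(len(heatmaps[0])):          # loop over elements
        mask_sum = sum(m[e] for m in masks)    # masks of all models at element e
        for i, heatmap in enumerate(heatmaps):
            others = mask_sum - masks[i][e]    # exclude the model's own mask
            total += heatmap[e] * others
    return total
```

With n = 2, heat maps `[0.9, 0.1, 0.8, 0.0]` and `[0.2, 0.7, 0.6, 0.1]`, and c = 0.5, the masks are `[1, 0, 1, 0]` and `[0, 1, 1, 0]`. The first heat map contributes 0.1 + 0.8 under the second mask, and the second contributes 0.2 + 0.6 under the first, for a total of about 1.7. Two heat maps whose high-value regions do not overlap and are zero elsewhere give a loss of 0.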

この損失関数Ｌ２は、各ヒートマップにおいて、自身以外のヒートマップにおいて閾値以上の値を持つ領域の合計値である。この損失関数Ｌ２の値を小さくするように学習が行われることによって、ヒートマップの一致度が小さいｎ個の推論モデルが得られる。なお、このときの損失関数Ｌ２は、以下の式（２）のように表現され得る。式（２）において、ｅは、要素番号を示す。ここで、ヒートマップＴｉ（ｘ）は、ヒートマップＴｉ（ｘ）の大きさ｜Ｔｉ（ｘ）｜で割るなどして正規化してもよい。また、ヒートマップＴｉ（ｘ）にはsigmoidなどの活性化関数をかけてもよい。 For each heat map, this loss function L2 totals the values of that heat map in the regions where the other heat maps have values equal to or greater than the threshold. By performing learning so as to reduce the value of this loss function L2, n inference models whose heat maps have a low degree of matching are obtained. The loss function L2 in this case can be expressed as in Equation (2) below. In Equation (2), e denotes the element number. Here, the heat map Ti(x) may be normalized, for example by dividing it by its magnitude |Ti(x)|. An activation function such as sigmoid may also be applied to the heat map Ti(x).
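
The body of Equation (2) does not appear in this text. The following is a reconstruction from the prose description above, offered as a reading aid and not as the published formula. The subscript e denotes the element number and c the binarization threshold.

```latex
L_2 = \sum_{e} \sum_{i=1}^{n} T_i(x)_e \sum_{\substack{j=1 \\ j \neq i}}^{n} M_j(x)_e ,
\qquad
M_j(x)_e =
\begin{cases}
1 & \text{if } T_j(x)_e \ge c, \\
0 & \text{otherwise.}
\end{cases}
```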

図２を参照しながら、ヒートマップを二値化したマスクと他のヒートマップとの掛け合わせによって説明評価結果を得る手法について説明した。続いて、正規化されたヒートマップ同士の内積によって説明評価結果を得る手法について説明する。 With reference to FIG. 2, a method of obtaining explanatory evaluation results by multiplying a mask obtained by binarizing a heat map with another heat map has been described. Next, a method of obtaining an explanation evaluation result by an inner product of normalized heat maps will be described.

説明評価部１０５は、ｉ＝１～ｎについて、ヒートマップＴｉ（ｘ）をヒートマップＴｉ（ｘ）の大きさ｜Ｔｉ（ｘ）｜で割ることによって正規化して、ｉ＝１～ｎについての正規化したベクトルを生成する。そして、説明評価部１０５は、ｉ＝１～ｎについての正規化したベクトルの内積に基づいて説明評価結果を得る。より詳細に、説明評価部１０５は、内積を全要素について足し合わせることによって合計値を計算する。説明評価部１０５は、この合計値を説明評価結果の例としての損失関数Ｌ２とする。 For i = 1 to n, the explanation evaluation unit 105 normalizes the heat map Ti(x) by dividing it by its magnitude |Ti(x)|, generating a normalized vector for each i. The explanation evaluation unit 105 then obtains the explanation evaluation result based on the inner products of these normalized vectors. More specifically, the explanation evaluation unit 105 calculates a total value by summing the inner products over all elements. The explanation evaluation unit 105 uses this total value as a loss function L2 serving as an example of the explanation evaluation result.
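
The inner-product calculation described above can be sketched as follows. Two points are assumptions, since the text does not fix them: the magnitude |Ti(x)| is taken to be the Euclidean norm, and the inner products are summed once per unordered pair of inference models. The function names `normalize` and `inner_product_loss` are hypothetical, and heat maps are flattened lists of floats.

```python
import math


def normalize(heatmap):
    """Divide a heat map by its Euclidean norm (assumed meaning of |T_i(x)|)."""
    norm = math.sqrt(sum(v * v for v in heatmap))
    if norm == 0.0:
        return list(heatmap)  # an all-zero heat map is left unchanged
    return [v / norm for v in heatmap]


def inner_product_loss(heatmaps):
    """L2: sum of inner products between normalized heat maps of model pairs."""
    units = [normalize(h) for h in heatmaps]
    total = 0.0
    for i in range(len(units)):
        for j in range(i + 1, len(units)):  # each unordered pair once
            total += sum(a * b for a, b in zip(units[i], units[j]))
    return total
```

Two identical heat maps give a loss of about 1, and two heat maps with no overlapping non-zero elements give 0, so minimizing this value pushes the heat maps apart.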

正規化したベクトルの内積が大きいほど、この損失関数Ｌ２は、大きい値となる。正規化したベクトルの内積が大きいことは、ヒートマップ同士の一致度が高いことを意味する。したがって、この損失関数Ｌ２の値を小さくするように学習が行われることによって、ヒートマップの一致度が小さいｎ個の推論モデルが得られる。なお、このときの損失関数Ｌ２は、以下の式（３）のように表現され得る。式（３）において、ｅは、要素番号を示す。 The larger the inner product of the normalized vectors, the larger the value of this loss function L2. A large inner product of normalized vectors means a high degree of matching between the heat maps. Therefore, by performing learning so as to reduce the value of this loss function L2, n inference models with a small degree of matching in the heat map are obtained. It should be noted that the loss function L2 at this time can be expressed as in Equation (3) below. In formula (3), e indicates an element number.
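
The body of Equation (3) does not appear in this text. One plausible reconstruction from the prose is given below; it is not the published formula. It assumes that the inner products are summed once per pair of inference models; summing over ordered pairs instead would only double the value.

```latex
L_2 = \sum_{1 \le i < j \le n} \sum_{e}
      \frac{T_i(x)_e}{|T_i(x)|} \cdot \frac{T_j(x)_e}{|T_j(x)|}
```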

説明評価部１０５は、説明評価結果を更新部１０６に出力する。 The explanation evaluation unit 105 outputs the explanation evaluation result to the update unit 106.

（更新部１０６）
更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とに基づいて、第１推論モデルから第ｎ推論モデルまでのそれぞれが使用する重みパラメータ１１０の更新を行う。これによって、第１推論モデルから第ｎ推論モデルまでのそれぞれから出力される推論値が正解値に近づくように、かつ、説明部１０３から出力されるｎ個のヒートマップ同士の一致度が小さくなるように、重みパラメータ１１０が更新され得る。重みパラメータ１１０は、誤差逆伝播法（バックプロパゲーション）によって更新されてよい。 (Update unit 106)
Based on the inference evaluation result input from the inference evaluation unit 104 and the explanation evaluation result input from the explanation evaluation unit 105, the update unit 106 updates the weight parameters 110 used by each of the first to n-th inference models. As a result, the weight parameters 110 can be updated so that the inference values output from each of the first to n-th inference models approach the correct value and the degree of matching among the n heat maps output from the explanation unit 103 decreases. The weight parameters 110 may be updated by error backpropagation.

例えば、更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とを加算し、加算結果に基づいて、重みパラメータ１１０の更新を行えばよい。このとき、更新部１０６は、計算した加算結果を誤差として、誤差逆伝播法（バックプロパゲーション）によって重みパラメータ１１０を更新すればよい。上記のように、推論評価結果が損失関数Ｌ１と表現され、説明評価結果が損失関数Ｌ２と表現される場合、加算結果は、Ｌ１＋Ｌ２である。 For example, the update unit 106 may add the inference evaluation result input from the inference evaluation unit 104 and the explanation evaluation result input from the explanation evaluation unit 105, and update the weight parameters 110 based on the sum. In this case, the update unit 106 may update the weight parameters 110 by error backpropagation, using the calculated sum as the error. As described above, when the inference evaluation result is expressed as the loss function L1 and the explanation evaluation result as the loss function L2, the sum is L1+L2.

さらに、更新部１０６は、説明部１０３が有する重みパラメータを更新してよい。より詳細に、説明部１０３が、誤差逆伝播が可能な関数を含む場合、更新部１０６は、推論評価結果と説明評価結果とに基づいて、誤差逆伝播法（バックプロパゲーション）によって、説明部１０３が有する重みパラメータを更新してよい。 Furthermore, the update unit 106 may update the weight parameters of the explanation unit 103. More specifically, when the explanation unit 103 includes a function through which error backpropagation is possible, the update unit 106 may update the weight parameters of the explanation unit 103 by error backpropagation based on the inference evaluation result and the explanation evaluation result.

なお、学習の終了条件（すなわち、重みパラメータ更新の終了条件）は特に限定されず、第１推論モデルから第ｎ推論モデルまでの学習がある程度行われたことを示す条件であればよい。具体的に、学習の終了条件は、損失関数Ｌ１＋Ｌ２の値が閾値よりも小さいという条件を含んでもよい。あるいは、学習の終了条件は、損失関数Ｌ１＋Ｌ２の値の変化が閾値よりも小さいという条件（損失関数Ｌ１＋Ｌ２の値が収束状態になったという条件）を含んでもよい。あるいは、学習の終了条件は、重みパラメータの更新が所定の回数行われたという条件を含んでもよい。あるいは、推論評価部１０４によって正解値と推論値とに基づいて精度（例えば、正答率など）が算出される場合、学習の終了条件は、精度が所定の割合（例えば、９０％など）を超えるという条件を含んでもよい。 Note that the learning end condition (that is, the end condition for weight parameter updates) is not particularly limited, as long as it indicates that the first to n-th inference models have been trained to some extent. Specifically, the learning end condition may include a condition that the value of the loss function L1+L2 is smaller than a threshold. Alternatively, the learning end condition may include a condition that the change in the value of the loss function L1+L2 is smaller than a threshold (that is, a condition that the value of the loss function L1+L2 has converged). Alternatively, the learning end condition may include a condition that the weight parameters have been updated a predetermined number of times. Alternatively, when the inference evaluation unit 104 calculates an accuracy (for example, a correct answer rate) based on the correct value and the inference value, the learning end condition may include a condition that the accuracy exceeds a predetermined rate (for example, 90%).
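
The end conditions listed above can be combined in a single check, sketched below. The function name `should_stop` and all parameter names are hypothetical. Treating the conditions as alternatives, any one of which ends learning, is an assumption; the text only says that the end condition may include each of them.

```python
def should_stop(loss_history, accuracy=None, *, loss_threshold=None,
                delta_threshold=None, max_updates=None, accuracy_target=None):
    """Return True if any configured learning end condition holds.

    loss_history holds the value of L1 + L2 after each weight update.
    Conditions left as None are not checked.
    """
    if not loss_history:
        return False
    latest = loss_history[-1]
    # Condition 1: L1 + L2 is smaller than a threshold.
    if loss_threshold is not None and latest < loss_threshold:
        return True
    # Condition 2: the change in L1 + L2 is smaller than a threshold (convergence).
    if (delta_threshold is not None and len(loss_history) >= 2
            and abs(latest - loss_history[-2]) < delta_threshold):
        return True
    # Condition 3: the weights have been updated a predetermined number of times.
    if max_updates is not None and len(loss_history) >= max_updates:
        return True
    # Condition 4: accuracy exceeds a predetermined rate (for example, 0.9).
    if (accuracy_target is not None and accuracy is not None
            and accuracy > accuracy_target):
        return True
    return False
```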

（提示制御部１０７）
提示制御部１０７は、テスト段階において、推論部１０２から入力された第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と、説明部１０３から入力されたｎ個のヒートマップとが、ユーザに提示されるように制御する。より詳細に、提示制御部１０７は、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値と、ｎ個のヒートマップとが表示されるように表示部１２１を制御する。なお、ｎ個のヒートマップは表示されるが、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する推論値は表示されない形態も想定され得る。 (Presentation control unit 107)
In the test stage, the presentation control unit 107 performs control so that the inference values corresponding to the first to n-th inference models input from the inference unit 102 and the n heat maps input from the explanation unit 103 are presented to the user. More specifically, the presentation control unit 107 controls the display unit 121 so that the inference values corresponding to the first to n-th inference models and the n heat maps are displayed. A form in which the n heat maps are displayed but the inference values corresponding to the first to n-th inference models are not displayed is also conceivable.

（表示部１２１）
表示部１２１は、ディスプレイによって構成され、提示制御部１０７による制御に従って各種情報の表示を行う機能を有する。例えば、表示部１２１は、ｎ個の推論値とｎ個のヒートマップとを表示することが可能である。ここで、表示部１２１の形態は特に限定されない。例えば、表示部１２１は、液晶ディスプレイ（ＬＣＤ）装置であってもよいし、ＯＬＥＤ（ＯｒｇａｎｉｃＬｉｇｈｔＥｍｉｔｔｉｎｇＤｉｏｄｅ）装置であってもよいし、ランプなどの表示装置であってもよい。 (Display unit 121)
The display unit 121 is configured by a display, and has a function of displaying various information under the control of the presentation control unit 107 . For example, the display unit 121 can display n inference values and n heat maps. Here, the form of the display unit 121 is not particularly limited. For example, the display unit 121 may be a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, or a display device such as a lamp.

（操作部１２２）
操作部１２２は、ユーザによる操作を受け付ける。例えば、ユーザがｎ個の推論値とｎ個のヒートマップとを参照しながら、ｎ個の推論モデルから解釈性の高い１または複数の推論モデル（以下、「選択モデル」とも言う。）を見つけたとする。このとき、ユーザは、選択モデルを示す情報（以下、「選択モデル情報」とも言う。）を操作部１２２に入力し、操作部１２２は、選択モデル情報１２３を受け付ける。例えば、選択モデル情報１２３は、選択モデルを示す番号であってよい。 (Operation unit 122)
The operation unit 122 accepts operations by the user. For example, suppose that the user, referring to the n inference values and the n heat maps, finds one or more highly interpretable inference models (hereinafter also referred to as "selected models") among the n inference models. In this case, the user inputs information indicating the selected model (hereinafter also referred to as "selected model information") to the operation unit 122, and the operation unit 122 accepts the selected model information 123. For example, the selected model information 123 may be a number indicating the selected model.

なお、本発明の実施形態では、操作部１２２がマウスおよびキーボードである場合を主に想定する。しかし、操作部１２２の形態は特に限定されない。例えば、操作部１２２は、タッチパネルであってもよいし、他の入力装置であってもよい。 Note that the embodiment of the present invention mainly assumes that the operation unit 122 is a mouse and a keyboard. However, the form of the operation unit 122 is not particularly limited. For example, the operation unit 122 may be a touch panel or other input device.

（記録制御部１０８）
記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３の記録を制御する。より詳細に、記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３を図示しない記憶部に記憶させる。選択モデル情報１２３は、図示しない記憶部から後に取得され、選択モデル情報１２３によって示される選択モデルが、解釈性の高い学習済みモデルとして用いられ得る。 (Recording control unit 108)
The recording control unit 108 controls recording of the selected model information 123 received from the user through the operation unit 122. More specifically, the recording control unit 108 stores the selected model information 123 received from the user through the operation unit 122 in a storage unit (not shown). The selected model information 123 is later retrieved from the storage unit, and the selected model it indicates can be used as a highly interpretable trained model.

なお、テストの終了条件は特に限定されず、ユーザにとって十分な回数のテストが行われたことを示す条件であればよい。具体的に、テストの終了条件は、テスト段階においてユーザによって推論結果の確認が所定の回数以上行われたという条件を含んでもよい。 Note that the conditions for ending the test are not particularly limited, and may be any condition that indicates that the test has been performed a sufficient number of times for the user. Specifically, the end condition of the test may include a condition that the user confirms the inference result more than a predetermined number of times in the test stage.

以上、本発明の実施形態に係る学習装置１０の構成例について説明した。 The configuration example of the learning device 10 according to the embodiment of the present invention has been described above.

（１．２．学習段階における動作）
図３を参照しながら、本発明の実施形態に係る学習装置１０の学習段階における動作の流れについて説明する。図３は、本発明の実施形態に係る学習装置１０の学習段階における動作例を示すフローチャートである。 (1.2. Operation in the learning stage)
The flow of operations in the learning stage of the learning device 10 according to the embodiment of the present invention will be described with reference to FIG. FIG. 3 is a flow chart showing an operation example in the learning stage of the learning device 10 according to the embodiment of the present invention.

まず、図３に示されたように、入力部１０１は、データセット１００から入力データ（すなわち、学習用データ）および正解値の組み合わせを取得する。さらに、推論部１０２は、ｎ個の推論モデルそれぞれに対応する重みパラメータ１１０を取得する（Ｓ１１）。推論部１０２は、入力部１０１によって取得された入力データとｎ個の推論モデルとに基づいて推論を行い（Ｓ１２）、推論によって得られたｎ個の推論値を推論評価部１０４および説明部１０３それぞれに出力する。 First, as shown in FIG. 3, the input unit 101 acquires a combination of input data (that is, learning data) and a correct value from the dataset 100. The inference unit 102 also acquires the weight parameters 110 corresponding to each of the n inference models (S11). The inference unit 102 performs inference based on the input data acquired by the input unit 101 and the n inference models (S12), and outputs the n inference values obtained by the inference to both the inference evaluation unit 104 and the explanation unit 103.

説明部１０３は、推論部１０２から入力されたｎ個の推論値に基づいて、ｎ個の推論値それぞれの判断根拠を説明するヒートマップを生成する（Ｓ１３）。説明部１０３は、生成したｎ個のヒートマップを説明評価部１０５に出力する。 Based on the n inference values input from the inference unit 102, the explanation unit 103 generates a heat map explaining the basis for judgment of each of the n inference values (S13). Explanation unit 103 outputs the generated n heat maps to explanation evaluation unit 105 .

推論評価部１０４は、入力部１０１によって取得された正解値に基づいて、推論部１０２から入力されたｎ個の推論値を評価して推論評価結果を得る。より詳細に、推論評価部１０４は、正解値とｎ個の推論値とに応じた損失関数を推論評価結果として算出する。推論評価部１０４は、算出した推論評価結果を更新部１０６に出力する。 The inference evaluation unit 104 evaluates n inference values input from the inference unit 102 based on the correct values obtained by the input unit 101 to obtain inference evaluation results. More specifically, the inference evaluation unit 104 calculates a loss function corresponding to the correct value and the n inference values as an inference evaluation result. The inference evaluation unit 104 outputs the calculated inference evaluation result to the update unit 106 .

説明評価部１０５は、説明部１０３から入力されたｎ個のヒートマップの一致度に基づいて、説明評価結果を得る。より詳細に、説明評価部１０５は、説明部１０３から入力されたｎ個のヒートマップ同士の一致度に応じた損失関数を説明評価結果として算出する。説明評価部１０５は、算出した説明評価結果を更新部１０６に出力する（Ｓ１４）。 The explanation evaluation unit 105 obtains an explanation evaluation result based on the degree of matching among the n heat maps input from the explanation unit 103. More specifically, the explanation evaluation unit 105 calculates, as the explanation evaluation result, a loss function that depends on the degree of matching among the n heat maps input from the explanation unit 103. The explanation evaluation unit 105 outputs the calculated explanation evaluation result to the update unit 106 (S14).

更新部１０６は、推論評価部１０４から入力された推論評価結果と、説明評価部１０５から入力された説明評価結果とに基づいて、第１推論モデルから第ｎ推論モデルまでのそれぞれに対応する重みパラメータ１１０の更新を行う（Ｓ１５）。より詳細に、更新部１０６は、推論評価結果と説明評価結果とに基づいて、誤差逆伝播法によって、重みパラメータ１１０を更新する。さらに、更新部１０６は、推論評価結果と説明評価結果とに基づく誤差逆伝播法によって説明部１０３が有する重みパラメータの更新を行う。 Based on the inference evaluation result input from the inference evaluation unit 104 and the explanation evaluation result input from the explanation evaluation unit 105, the update unit 106 updates the weight parameters 110 corresponding to each of the first to n-th inference models (S15). More specifically, the update unit 106 updates the weight parameters 110 by error backpropagation based on the inference evaluation result and the explanation evaluation result. Furthermore, the update unit 106 updates the weight parameters of the explanation unit 103 by error backpropagation based on the inference evaluation result and the explanation evaluation result.

更新部１０６は、入力データに基づく重みパラメータの更新が終わるたびに、学習の終了条件が満たされたか否かを判断する（Ｓ１６）。学習の終了条件が満たされていないと判断した場合には（Ｓ１６において「ＮＯ」）、Ｓ１１に動作が移行され、入力部１０１によって次の入力データが取得され、推論部１０２、説明部１０３、推論評価部１０４、説明評価部１０５および更新部１０６それぞれによって、当該次の入力データに基づく各自の処理が再度実行される。一方、更新部１０６によって、学習の終了条件が満たされたと判断された場合には（Ｓ１６において「ＹＥＳ」）、学習が終了される。 The update unit 106 determines whether or not the learning termination condition is satisfied each time the update of the weight parameter based on the input data is completed (S16). If it is determined that the learning end condition is not satisfied ("NO" in S16), the operation proceeds to S11, the next input data is acquired by the input unit 101, the inference unit 102, the explanation unit 103, The inference evaluation unit 104, the explanation evaluation unit 105, and the update unit 106 each perform their respective processes again based on the next input data. On the other hand, when update unit 106 determines that the end condition for learning is satisfied ("YES" in S16), learning ends.
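
The flow from S11 to S16 can be summarized as the loop skeleton below. It is a sketch of the control flow only: every callable passed in (`infer`, `explain`, `loss_l1`, `loss_l2`, `update`, `stop`) is a hypothetical placeholder for the corresponding unit, and `update` stands in for the backpropagation step, which is not implemented here.

```python
def train(batches, models, infer, explain, loss_l1, loss_l2, update, stop):
    """Run the learning stage until the data runs out or stop(history) is True."""
    history = []
    for x, y in batches:                                   # S11: next input and correct value
        preds = [infer(m, x) for m in models]              # S12: n inference values
        heatmaps = [explain(m, x, p)                       # S13: n heat maps
                    for m, p in zip(models, preds)]
        total = loss_l1(preds, y) + loss_l2(heatmaps)      # S14: L1 + L2
        models = update(models, total)                     # S15: placeholder for backpropagation
        history.append(total)
        if stop(history):                                  # S16: end condition
            break
    return models, history
```

Keeping each unit behind a callable mirrors the block structure of the device, so a different explanation method or loss can be swapped in without touching the loop.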

以上、本発明の実施形態に係る学習装置１０の学習段階における動作の流れについて説明した。 The flow of operations in the learning stage of the learning device 10 according to the embodiment of the present invention has been described above.

（１．３．テスト段階における動作）
図４を参照しながら、本発明の実施形態に係る学習装置１０のテスト段階における動作の流れについて説明する。図４は、本発明の実施形態に係る学習装置１０のテスト段階における動作例を示すフローチャートである。 (1.3. Operation in the test stage)
The flow of operations in the test stage of the learning device 10 according to the embodiment of the present invention will be described with reference to FIG. FIG. 4 is a flow chart showing an operation example in the test stage of the learning device 10 according to the embodiment of the present invention.

まず、図４に示されたように、入力部１０１は、データセット１００から入力データ（すなわち、テスト用データ）および正解値の組み合わせを取得する。さらに、推論部１０２は、ｎ個の推論モデルそれぞれに対応する重みパラメータ１１０を取得する（Ｓ２１）。推論部１０２は、入力部１０１によって取得された入力データとｎ個の推論モデルとに基づいて推論を行い（Ｓ２２）、推論によって得られたｎ個の推論値を説明部１０３および提示制御部１０７それぞれに出力する。 First, as shown in FIG. 4, the input unit 101 acquires a combination of input data (that is, test data) and a correct value from the dataset 100. The inference unit 102 also acquires the weight parameters 110 corresponding to each of the n inference models (S21). The inference unit 102 performs inference based on the input data acquired by the input unit 101 and the n inference models (S22), and outputs the n inference values obtained by the inference to both the explanation unit 103 and the presentation control unit 107.

説明部１０３は、推論部１０２から入力されたｎ個の推論値に基づいて、ｎ個の推論値それぞれの判断根拠を説明するヒートマップを生成する（Ｓ２３）。説明部１０３は、生成したｎ個のヒートマップを提示制御部１０７に出力する。 Based on the n inference values input from the inference unit 102, the explanation unit 103 generates a heat map explaining the basis for the determination of each of the n inference values (S23). The explanation unit 103 outputs the generated n heat maps to the presentation control unit 107 .

提示制御部１０７は、推論部１０２から入力されたｎ個の推論値と、説明部１０３から入力されたｎ個のヒートマップとがユーザに提示されるように表示部１２１を制御する。表示部１２１は、提示制御部１０７による制御に従って、ｎ個の推論値と、ｎ個のヒートマップとを表示する（Ｓ２４）。 The presentation control unit 107 controls the display unit 121 so that the n inference values input from the inference unit 102 and the n heat maps input from the explanation unit 103 are presented to the user. The display unit 121 displays n inference values and n heat maps under the control of the presentation control unit 107 (S24).

操作部１２２は、ｎ個の推論モデルから解釈性が高いと判断された１または複数の推論モデルを示す情報（選択モデル情報１２３）をユーザから受け付ける。記録制御部１０８は、操作部１２２によってユーザから受け付けられた選択モデル情報１２３の記録を制御する（Ｓ２５）。図示しない記憶部は、記録制御部１０８による制御に従って、選択モデル情報１２３を記憶する。 The operation unit 122 receives information (selected model information 123) indicating one or a plurality of inference models determined to have high interpretability from the n inference models from the user. The recording control unit 108 controls recording of the selected model information 123 received from the user by the operation unit 122 (S25). A storage unit (not shown) stores the selected model information 123 under the control of the recording control unit 108 .

記録制御部１０８は、入力データに基づく選択モデル情報１２３の記録制御が終わるたびに、テストの終了条件が満たされたか否かを判断する（Ｓ２６）。テストの終了条件が満たされていないと判断した場合には（Ｓ２６において「ＮＯ」）、Ｓ２１に動作が移行され、入力部１０１によって次の入力データが取得され、推論部１０２、説明部１０３、提示制御部１０７および記録制御部１０８それぞれによって、当該次の入力データに基づく各自の処理が再度実行される。一方、記録制御部１０８によって、テストの終了条件が満たされたと判断された場合には（Ｓ２６において「ＹＥＳ」）、テストが終了される。 Each time recording control of the selected model information 123 based on the input data is finished, the recording control unit 108 determines whether or not the end condition of the test is satisfied (S26). If it is determined that the end condition of the test is not satisfied ("NO" in S26), the operation proceeds to S21, the next input data is acquired by the input unit 101, the inference unit 102, the explanation unit 103, Each of the presentation control unit 107 and the recording control unit 108 re-executes its own processing based on the next input data. On the other hand, when the recording control unit 108 determines that the end condition of the test is satisfied ("YES" in S26), the test is ended.

以上、本発明の実施形態に係る学習装置１０のテスト段階における動作の流れについて説明した。 The flow of operations in the test stage of the learning device 10 according to the embodiment of the present invention has been described above.

（１．４．実施形態の効果）
以上に説明したように、本発明の実施形態によれば、第１推論モデルから第ｎ推論モデルまでのそれぞれから出力される推論値が正解値に近づくように、かつ、説明情報として出力されるｎ個のヒートマップ同士の一致度が小さくなるように、学習が行われ得る。これによって、互いに異なる複数のヒートマップを出力する推論モデルを得ることができる。これによって、ユーザは、ｎ個のモデルの中からより解釈性の高いヒートマップを出力するモデルを選んで使用することができる。 (1.4. Effect of Embodiment)
As described above, according to the embodiment of the present invention, learning can be performed so that the inference values output from each of the first to n-th inference models approach the correct value and so that the degree of matching among the n heat maps output as explanatory information decreases. This yields inference models that output heat maps differing from one another. The user can therefore select and use, from among the n models, a model that outputs a more interpretable heat map.

以上、本発明の実施形態が奏する効果について説明した。 The effects of the embodiment of the present invention have been described above.

（２．ハードウェア構成例）
続いて、本発明の実施形態に係る学習装置１０のハードウェア構成例について説明する。以下では、本発明の実施形態に係る学習装置１０のハードウェア構成例として、情報処理装置９００のハードウェア構成例について説明する。なお、以下に説明する情報処理装置９００のハードウェア構成例は、学習装置１０のハードウェア構成の一例に過ぎない。したがって、学習装置１０のハードウェア構成は、以下に説明する情報処理装置９００のハードウェア構成から不要な構成が削除されてもよいし、新たな構成が追加されてもよい。 (2. Hardware configuration example)
Next, a hardware configuration example of the learning device 10 according to the embodiment of the present invention will be described. A hardware configuration example of the information processing device 900 will be described below as a hardware configuration example of the learning device 10 according to the embodiment of the present invention. Note that the hardware configuration example of the information processing device 900 described below is merely an example of the hardware configuration of the learning device 10 . Therefore, as for the hardware configuration of the learning device 10, unnecessary configurations may be deleted from the hardware configuration of the information processing device 900 described below, or a new configuration may be added.

図５は、本発明の実施形態に係る学習装置１０の例としての情報処理装置９００のハードウェア構成を示す図である。情報処理装置９００は、ＣＰＵ（ＣｅｎｔｒａｌＰｒｏｃｅｓｓｉｎｇＵｎｉｔ）９０１と、ＲＯＭ（ＲｅａｄＯｎｌｙＭｅｍｏｒｙ）９０２と、ＲＡＭ（ＲａｎｄｏｍＡｃｃｅｓｓＭｅｍｏｒｙ）９０３と、ホストバス９０４と、ブリッジ９０５と、外部バス９０６と、インタフェース９０７と、入力装置９０８と、出力装置９０９と、ストレージ装置９１０と、通信装置９１１と、を備える。 FIG. 5 is a diagram showing the hardware configuration of an information processing device 900 as an example of the learning device 10 according to the embodiment of the present invention. The information processing device 900 includes a CPU (Central Processing Unit) 901, a ROM (Read Only Memory) 902, a RAM (Random Access Memory) 903, a host bus 904, a bridge 905, an external bus 906, and an interface 907. , an input device 908 , an output device 909 , a storage device 910 and a communication device 911 .

ＣＰＵ９０１は、演算処理装置および制御装置として機能し、各種プログラムに従って情報処理装置９００内の動作全般を制御する。また、ＣＰＵ９０１は、マイクロプロセッサであってもよい。ＲＯＭ９０２は、ＣＰＵ９０１が使用するプログラムや演算パラメータ等を記憶する。ＲＡＭ９０３は、ＣＰＵ９０１の実行において使用するプログラムや、その実行において適宜変化するパラメータ等を一時記憶する。これらはＣＰＵバス等から構成されるホストバス９０４により相互に接続されている。 The CPU 901 functions as an arithmetic processing device and a control device, and controls general operations within the information processing device 900 according to various programs. Alternatively, the CPU 901 may be a microprocessor. The ROM 902 stores programs, calculation parameters, and the like used by the CPU 901 . The RAM 903 temporarily stores programs used in the execution of the CPU 901, parameters that change as appropriate during the execution, and the like. These are interconnected by a host bus 904 comprising a CPU bus or the like.

The host bus 904 is connected via the bridge 905 to the external bus 906, such as a PCI (Peripheral Component Interconnect/Interface) bus. Note that the host bus 904, the bridge 905, and the external bus 906 do not necessarily have to be configured separately; their functions may be implemented in a single bus.

The input device 908 includes input means with which a user inputs information, such as a mouse, keyboard, touch panel, button, microphone, switch, and lever, and an input control circuit that generates an input signal based on the user's input and outputs it to the CPU 901. By operating the input device 908, a user of the information processing device 900 can input various data to the information processing device 900 and instruct it to perform processing operations.

The output device 909 includes, for example, display devices such as a CRT (Cathode Ray Tube) display device, a liquid crystal display (LCD) device, an OLED (Organic Light Emitting Diode) device, and a lamp, and audio output devices such as a speaker.

The storage device 910 is a device for data storage. The storage device 910 may include a storage medium, a recording device that records data on the storage medium, a reading device that reads data from the storage medium, a deletion device that deletes data recorded on the storage medium, and the like. The storage device 910 is configured by, for example, an HDD (Hard Disk Drive). The storage device 910 drives a hard disk and stores programs executed by the CPU 901 and various data.

The communication device 911 is, for example, a communication interface configured with a communication device or the like for connecting to a network. The communication device 911 may support either wireless communication or wired communication.

The hardware configuration example of the learning device 10 according to the embodiment of the present invention has been described above.

(3. Summary)
Although the preferred embodiments of the present invention have been described in detail above with reference to the accompanying drawings, the present invention is not limited to these examples. It is clear that a person having ordinary knowledge in the technical field to which the present invention belongs could conceive of various alterations or modifications within the scope of the technical idea described in the claims, and it is understood that these also naturally belong to the technical scope of the present invention.

For example, the above description mainly assumes that the learning device 10 trains n inference models simultaneously. However, the learning device 10 does not have to train all of the n inference models at the same time. For example, an already trained inference model may be used as some of the n inference models. In that case, the weight parameters of the trained inference model may be fixed at constant values without being updated.
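The variation above, in which some of the n inference models are already trained and keep fixed weight parameters, can be sketched as follows. This is a minimal illustration only: the class `InferenceModel`, the `trainable` flag, the function `update_models`, and the plain gradient-descent step are hypothetical names and choices made for this sketch, not details given in the document.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class InferenceModel:
    """Hypothetical stand-in for one of the n inference models."""
    weights: List[float]
    trainable: bool = True  # False for an already trained model whose weights stay fixed


def update_models(models, grads, lr=0.1):
    """Apply one gradient-descent step to the trainable models only.

    grads[i][j] is assumed to be the gradient of the combined loss with
    respect to models[i].weights[j]; computing it is outside this sketch.
    """
    for model, grad in zip(models, grads):
        if not model.trainable:
            continue  # trained model: weight parameters are left unchanged
        model.weights = [w - lr * g for w, g in zip(model.weights, grad)]


models = [
    InferenceModel(weights=[1.0, 2.0]),                    # trained now
    InferenceModel(weights=[0.5, -0.5], trainable=False),  # trained in advance, fixed
]
update_models(models, grads=[[1.0, 1.0], [1.0, 1.0]], lr=0.5)
```

In such a setup the fixed model would still supply inference values and explanation information for the evaluation; only its own update is skipped.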

Further, the above description mainly assumes that the explanation unit 103 uses a single type of heat map generation method. However, the explanation unit 103 may use a plurality of types of heat map generation methods. In that case, the explanation unit 103 may output to the update unit 106, as an example of the explanation evaluation result, the total over the plurality of heat map generation methods of the losses based on the degree of matching between heat maps.
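Summing the matching-based loss over several heat map generation methods could look like the sketch below. The per-method loss used here, one minus the inner product of L2-normalized heat maps averaged over model pairs, is an assumption chosen so that the loss shrinks as the heat maps agree; the function names and the method labels `method_a` and `method_b` are likewise hypothetical, and heat maps are assumed to be flattened into lists.

```python
import math
from itertools import combinations


def normalize(heat_map):
    """Scale a flattened heat map to unit L2 norm (unchanged if all zeros)."""
    norm = math.sqrt(sum(v * v for v in heat_map))
    return [v / norm for v in heat_map] if norm > 0 else list(heat_map)


def matching_loss(heat_maps):
    """Loss for one generation method, given one heat map per inference model.

    Assumption: 1 - inner product of normalized maps, averaged over all
    model pairs, so higher agreement gives a smaller loss.
    """
    pairs = list(combinations(heat_maps, 2))
    if not pairs:
        return 0.0  # fewer than two models: nothing to compare
    total = 0.0
    for a, b in pairs:
        na, nb = normalize(a), normalize(b)
        total += 1.0 - sum(x * y for x, y in zip(na, nb))
    return total / len(pairs)


def explanation_evaluation(heat_maps_by_method):
    """Total the matching loss over all heat map generation methods."""
    return sum(matching_loss(maps) for maps in heat_maps_by_method.values())


heat_maps_by_method = {
    "method_a": [[1.0, 0.0], [1.0, 0.0]],  # the two models agree
    "method_b": [[1.0, 0.0], [0.0, 1.0]],  # the two models disagree
}
result = explanation_evaluation(heat_maps_by_method)
```

With these toy inputs the first method contributes no loss and the second contributes the maximum for non-negative maps, so the total reflects only the method on which the models disagree.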

10 learning device
100 data set
101 input unit
102 inference unit
103 explanation unit
104 inference evaluation unit
105 explanation evaluation unit
106 update unit
107 presentation control unit
108 recording control unit
110 weight parameter
121 display unit
122 operation unit
123 selected model information

Claims

1. A learning device comprising:
an input unit that acquires first input data and a correct value of the first input data;
an inference unit that outputs a first inference value corresponding to each of a plurality of inference models based on the first input data and the plurality of inference models;
an explanation unit that outputs, for each of the plurality of inference models, first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value;
an explanation evaluation unit that obtains an explanation evaluation result based on a degree of matching between the pieces of first explanation information corresponding to the respective inference models; and
an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.

2. The learning device according to claim 1, wherein
the input unit acquires second input data,
the inference unit outputs a second inference value corresponding to each of a plurality of trained models, which are the plurality of inference models after the update of the first weight parameters, based on the second input data and the plurality of trained models,
the explanation unit outputs, for each of the plurality of trained models, second explanation information indicating the magnitude of contribution of the second input data to the second inference value, and
the learning device further comprises a presentation control unit that controls presentation of the second explanation information to a user.

3. The learning device according to claim 2, wherein
the presentation control unit controls presentation of the second inference value and the second explanation information to the user.

4. The learning device according to claim 2 or 3, further comprising
a recording control unit that controls recording of information indicating one or more trained models selected by the user from the plurality of trained models.

5. The learning device according to any one of claims 1 to 4, wherein
the explanation evaluation result takes a smaller value as the degree of matching between the pieces of first explanation information corresponding to the respective inference models increases.

6. The learning device according to claim 5, wherein
the explanation evaluation unit obtains the explanation evaluation result based on an inner product of vectors obtained by normalizing the first explanation information corresponding to each of the plurality of inference models.

7. The learning device according to claim 5, wherein
the explanation evaluation unit generates, for each of the plurality of inference models, a mask by binarizing the first explanation information, calculates the product of the mask generated from the first explanation information corresponding to an inference model other than the inference model itself and the first explanation information corresponding to the inference model itself, and obtains the explanation evaluation result based on the sum of the products over the plurality of inference models.

8. The learning device according to any one of claims 1 to 7, wherein
the explanation unit includes a function through which errors can be backpropagated.

9. The learning device according to claim 8, wherein
the explanation unit has a second weight parameter, and
the update unit updates the second weight parameter by error backpropagation.

10. The learning device according to any one of claims 1 to 9, wherein
at least one of the plurality of inference models includes a neural network.

11. The learning device according to any one of claims 1 to 10, wherein
the update unit updates the first weight parameters based on the result of adding the inference evaluation result and the explanation evaluation result.

12. The learning device according to any one of claims 1 to 11, wherein
the first explanation information is a heat map showing the magnitude of contribution of the first input data to the first inference value.

13. A learning method comprising:
acquiring first input data and a correct value of the first input data;
outputting a first inference value corresponding to each of a plurality of inference models based on the first input data and the plurality of inference models;
outputting, for each of the plurality of inference models, first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
obtaining an inference evaluation result based on the correct value and the first inference value;
obtaining an explanation evaluation result based on a degree of matching between the pieces of first explanation information corresponding to the respective inference models; and
updating first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.

14. A program for causing a computer to function as a learning device comprising:
an input unit that acquires first input data and a correct value of the first input data;
an inference unit that outputs a first inference value corresponding to each of a plurality of inference models based on the first input data and the plurality of inference models;
an explanation unit that outputs, for each of the plurality of inference models, first explanation information indicating the magnitude of contribution of the first input data to the first inference value;
an inference evaluation unit that obtains an inference evaluation result based on the correct value and the first inference value;
an explanation evaluation unit that obtains an explanation evaluation result based on a degree of matching between the pieces of first explanation information corresponding to the respective inference models; and
an update unit that updates first weight parameters of the plurality of inference models based on the inference evaluation result and the explanation evaluation result.