JPH10254475A - Speech recognition method - Google Patents
Speech recognition method
- Publication number
- JPH10254475A
- Authority
- JP
- Japan
- Prior art keywords
- voice
- speech
- detecting
- speech recognition
- length
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Description
[0001] [Technical field of the invention] The present invention relates to a speech recognition method that automatically detects the start and end of speech and recognizes the speech, and in particular to a speech recognition method that copes with long non-speech intervals between words.
[0002] [Prior art] A conventional speech recognition method is described with reference to FIG. 3. In FIG. 3, 30 denotes a start-point detection unit that detects the start of the input speech, 31 an end-point detection unit that detects the end of the input speech, 32 a speech recognition unit that performs speech recognition on the speech signal between the detected start and end, 33 the reference patterns against which the feature values extracted from the speech signal are matched during recognition, and 34 the non-speech interval length that serves as the threshold for detecting the end of speech.
[0003] For the input speech, the start-point detection unit 30 detects the start of speech. When a non-speech interval is detected after the start of speech, the end-point detection unit 31 compares it with the preset non-speech interval length 34. If the non-speech interval in the input is longer than the non-speech interval length 34, the point at which that interval began is detected as the end of speech. The speech recognition unit 32 then computes, for the speech from the start to the end, the similarity to each speech model prepared in advance as the reference patterns 33, and outputs a recognition result based on those values.
[0004] [Problem to be solved by the invention] In the conventional speech recognition method, recognition is performed on the speech delimited by the detected start and end. End-point detection judges that speech is no longer being uttered when a non-speech interval of a fixed length is observed, and takes that point as the end of speech. With this method, however, when a long non-speech interval is inserted between words, as when the speaker is thinking while speaking or hesitates, a point in the middle of the utterance is detected as the end of speech, so an accurate recognition result cannot be obtained.
[0005] The present invention provides a speech recognition method that solves the above problem.
[0006] [Means for solving the problem] According to the present invention, after the start and end of speech are detected, the speech recognition unit performs recognition, and the recognition result confirmation unit computes the reliability of each recognized word. If the computed reliability satisfies a certain condition, the result is output. If only some of the words have high reliability, the non-speech interval threshold is lengthened and end-point detection is performed again; the resulting segment is recognized again by the speech recognition unit. If no word has sufficient reliability, processing returns to start-point detection and the above operations are repeated.
[0007] These operations are repeated until the reliability of every word is sufficient or the non-speech interval threshold exceeds a certain length. When the threshold exceeds that length, processing returns to start-point detection.
[0008] [Embodiments of the invention] FIG. 1 is a block diagram showing a configuration example of the present invention. In the figure, 10 denotes a start-point detection unit that detects the start of the input speech, 11 an end-point detection unit that detects the end of the input speech, 12 a speech recognition unit that performs speech recognition on the speech signal between the detected start and end, 13 the reference patterns against which the feature values extracted from the speech signal are matched during recognition, 14 a recognition result confirmation unit that computes the reliability of the recognition result and checks it, 15 the word length used to check the recognition result, and 16 the non-speech interval length that serves as the threshold for detecting the end of speech.
[0009] Here, an example of recognizing a four-digit number is described. First, the start-point detection unit 10 detects the start of speech in the input. One possible technique is to take as the start of speech the point at which the following likelihood measure exceeds a certain threshold.
[0010]
$$D = \log P(O_t \mid \text{speech}) - \log P(O_t \mid \text{environmental noise})$$
Here, $P(O_t \mid \text{speech})$ is the likelihood that the feature vector $O_t$ of the input at time $t$ is speech, and $P(O_t \mid \text{environmental noise})$ is the likelihood that the same feature vector $O_t$ is environmental noise.
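The start-point rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names `log_likelihood_ratio` and `find_speech_start` are hypothetical, and the per-frame likelihoods are assumed to have been computed elsewhere.

```python
import math


def log_likelihood_ratio(p_speech: float, p_noise: float) -> float:
    """D = log P(O_t | speech) - log P(O_t | environmental noise)."""
    return math.log(p_speech) - math.log(p_noise)


def find_speech_start(d_values, threshold):
    """Return the index of the first frame whose D exceeds the threshold.

    Returns None when no frame exceeds it (no speech start found).
    """
    for t, d in enumerate(d_values):
        if d > threshold:
            return t
    return None
```

For a sequence of D values, the first frame above the threshold is reported as the start of speech; the threshold value itself is not given in the source and would have to be tuned.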
[0011] The first likelihood, $P(O_t \mid \text{speech})$, is the likelihood for a speech HMM (Hidden Markov Model) that covers the entire vocabulary. This model is trained on speech containing all of the target vocabulary and is expected to give a high likelihood for speech signals in the target vocabulary and a low likelihood for other signals. The second likelihood, $P(O_t \mid \text{environmental noise})$, is the likelihood for a non-speech HMM trained on signal segments outside the recognition vocabulary, such as silent intervals; it is expected to give a high likelihood in silent intervals and a low likelihood for speech signals. Each HMM can have a very simple structure, so the likelihoods can be computed quickly.
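To make the two-model idea concrete, the following sketch replaces each HMM with a single diagonal-covariance Gaussian. That substitution is a deliberate simplification, not the method described in the source, and the model parameters (`SPEECH_MODEL`, `NOISE_MODEL`) are hypothetical values chosen only for illustration.

```python
import math


def gaussian_log_likelihood(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at feature vector x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


# Hypothetical one-dimensional models standing in for the two HMMs.
SPEECH_MODEL = {"mean": [5.0], "var": [4.0]}
NOISE_MODEL = {"mean": [0.0], "var": [1.0]}


def frame_score(x):
    """D for one frame: log-likelihood under speech minus under noise."""
    return (gaussian_log_likelihood(x, **SPEECH_MODEL)
            - gaussian_log_likelihood(x, **NOISE_MODEL))
```

A frame close to the speech model's mean yields a positive score and a frame close to the noise model's mean yields a negative one, which is the behaviour the paragraph expects from the two models.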
[0012] Alternatively, a threshold on speech power could be used to detect the start and end of speech. Next, the end-point detection unit 11 detects the beginning of an interval judged to be non-speech as the end of speech when that interval is longer than the threshold given by the non-speech interval length 16. That is, when the interval over which the likelihood measure D stays at or below a predetermined threshold is longer than the non-speech interval length 16, the beginning of that interval is taken as the end of speech. To avoid losing signal information around the detected start and end, the start may be moved earlier by a fixed length from the detected start, and the end may be moved later by a fixed length from the detected end.
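A minimal sketch of this end-point rule and of the padding step, assuming frame-level D values are already available. The names `find_speech_end` and `pad_segment` and the frame-based units are assumptions made for the example; the source does not specify them.

```python
def find_speech_end(d_values, start, d_threshold, min_silence_frames):
    """Return the first frame of the first non-speech run that is longer
    than min_silence_frames, searching from `start`; None if there is none.

    A frame counts as non-speech when its D value is <= d_threshold.
    """
    run_start = None
    for t in range(start, len(d_values)):
        if d_values[t] <= d_threshold:
            if run_start is None:
                run_start = t
            if t - run_start + 1 > min_silence_frames:
                return run_start
        else:
            run_start = None
    return None


def pad_segment(start, end, margin, n_frames):
    """Move the start earlier and the end later by `margin` frames,
    clamped to the valid frame range [0, n_frames]."""
    return max(0, start - margin), min(n_frames, end + margin)
```

With a short pause inside the utterance and a longer one after it, only the longer run is reported as the end of speech; raising `min_silence_frames` makes the detector tolerate longer pauses, which is the knob the method later adjusts.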
[0013] The speech segment cut out by the start and end points is passed to the speech recognition unit 12 and compared with the reference patterns 13, which are set in advance as reference models. The speech recognition unit 12 outputs the recognition result, the likelihood of the input speech for the reference models, and the word lengths.
[0014] From these likelihoods and word lengths, the recognition result confirmation unit 14 checks the reliability of each recognized word and classifies each digit into one of three grades: "good", "acceptable", or "unacceptable". Here, the following three measures are used as the reliability.
[0015]
- Digit length: a digit whose length is shorter than a certain threshold is classified as "unacceptable".
- Difference between the likelihood of the input speech for the environmental noise model and its likelihood for the digit model: if this difference is 0 or less, the digit is classified as "unacceptable".
[0016]
- Difference between the likelihood of the input speech for the speech model and its likelihood for the digit model: based on this difference, the digit is labeled "unacceptable", "acceptable", or "good".
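The three checks can be combined into a single grading function, sketched below. Several points are assumptions rather than statements from the source: the sign convention of each difference (digit-model log-likelihood minus noise-model log-likelihood, and speech-model log-likelihood minus digit-model log-likelihood), the numeric thresholds `min_frames`, `good_gap`, and `ok_gap`, and the `DigitHypothesis` structure itself.

```python
from dataclasses import dataclass


@dataclass
class DigitHypothesis:
    label: str        # recognized digit
    n_frames: int     # digit length in frames
    ll_digit: float   # log-likelihood under the digit model
    ll_noise: float   # log-likelihood under the environmental noise model
    ll_speech: float  # log-likelihood under the general speech model


def grade_digit(h, min_frames=10, good_gap=1.0, ok_gap=3.0):
    """Grade one digit as 'good', 'acceptable', or 'unacceptable'.

    All thresholds are hypothetical placeholders.
    """
    # Check 1: digits that are too short are rejected.
    if h.n_frames < min_frames:
        return "unacceptable"
    # Check 2 (assumed direction): the digit model must beat the noise model.
    if h.ll_digit - h.ll_noise <= 0:
        return "unacceptable"
    # Check 3 (assumed direction): a small gap between the general speech
    # model and the digit model indicates a well-matched digit.
    gap = h.ll_speech - h.ll_digit
    if gap <= good_gap:
        return "good"
    if gap <= ok_gap:
        return "acceptable"
    return "unacceptable"
```

In practice the thresholds would be tuned on held-out data; the sketch only shows how the three measures could feed one three-way decision.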
[0017] FIG. 2 shows the processing procedure that uses these results. In FIG. 2, steps S4 to S8 are the procedure performed by the recognition result confirmation unit 14. The start-point detection unit 10 detects the start of speech in the input (S1), the end-point detection unit 11 detects the end of speech (S2), and the speech recognition unit 12 performs recognition on the speech signal between them (S3). The speech recognition unit 12 passes the recognition result, the likelihood of the input speech for the reference models, and the word lengths to the recognition result confirmation unit 14.
[0018] If all four digits in the recognition result are graded "acceptable" or better (S4), the recognition result confirmation unit 14 outputs the recognition result (S5). If the result does not contain four digits graded "acceptable" or better and also contains no digit graded "good" (S6), the unit assumes that the extracted speech contains no digits, rejects the recognition result (S7), and returns to step S1 to start again from start-point detection. If at least one digit is graded "good" (S6), the unit assumes that the speech contains a long non-speech interval, increases the non-speech interval length threshold (S8), and performs end-point detection again. In this example, step S8 adds 320 ms to the current threshold. This operation continues until the non-speech interval length threshold reaches a certain value.
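The decision logic of steps S4 to S8 can be sketched as a small retry loop. Only the 320 ms increment comes from the text; the initial threshold `initial_ms`, the upper limit `max_ms`, and the callables `detect_end` and `recognize` are hypothetical stand-ins for the end-point detection unit and the speech recognition unit.

```python
def decide(grades, n_digits=4):
    """Map the per-digit grades to 'output', 'reject', or 'extend'."""
    if len(grades) == n_digits and all(g != "unacceptable" for g in grades):
        return "output"   # S4 -> S5: every digit is acceptable or better
    if "good" not in grades:
        return "reject"   # S6 -> S7: no digit is graded good
    return "extend"       # S6 -> S8: at least one good digit


def recognize_with_retry(detect_end, recognize, start,
                         initial_ms=320, step_ms=320, max_ms=1600):
    """Repeat end-point detection with a growing non-speech threshold.

    detect_end(start, threshold_ms) -> end position           (S2)
    recognize(start, end) -> (digits, grades)                 (S3)
    Returns the digits, or None when the caller should go back to
    start-point detection (S1).
    """
    threshold_ms = initial_ms
    while threshold_ms <= max_ms:
        end = detect_end(start, threshold_ms)
        digits, grades = recognize(start, end)
        action = decide(grades)
        if action == "output":
            return digits
        if action == "reject":
            return None
        threshold_ms += step_ms   # S8: add 320 ms and try again
    return None                   # threshold limit reached
```

Returning None in both the reject case and the limit case leaves the restart at S1 to the caller, which keeps the loop itself free of any audio-capture concerns.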
[0019] Besides the examples given here, measures based on speech power and the like can also be used as the reliability of the recognition result.
[0020] [Example] To examine the effect of the invention, a digit speech recognition experiment was conducted. The evaluation data consisted of four-digit numbers uttered by 27 speakers. The error rate was 19.2% with the conventional method, whereas with the embodiment of the present invention it improved to 11.4%.
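As a worked reading of these two figures (the derived numbers below are computed from the reported error rates and do not appear in the source), the absolute and relative reductions in error rate are:

$$19.2\% - 11.4\% = 7.8 \text{ percentage points}, \qquad \frac{19.2 - 11.4}{19.2} = \frac{7.8}{19.2} \approx 0.406$$

That is, the error rate falls by roughly 40.6% relative to the conventional method.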
[0021] [Effect of the invention] As described above, according to the present invention, speech recognition can be performed even when an utterance contains long non-speech intervals caused by hesitation or confusion, and a high recognition rate can be achieved.
[FIG. 1] A block diagram showing a configuration example of the present invention.
[FIG. 2] A flowchart of an embodiment of the present invention.
[FIG. 3] A block diagram showing a conventional example.
[Description of reference numerals] 10: start-point detection unit; 11: end-point detection unit; 12: speech recognition unit; 13: reference patterns; 14: recognition result confirmation unit; 15: word length; 16: non-speech interval length
Claims (1)
1. A speech recognition method in a speech recognition apparatus that detects a speech interval from an input signal and performs speech recognition, the method comprising: a step of detecting the start of speech from the speech input signal; a step of detecting the end of speech from the speech input signal using a non-speech interval length threshold; a step of recognizing the speech from the detected start of speech to the detected end of speech; and a step of increasing, in accordance with the reliability of the recognized result, the non-speech interval length threshold used to detect the end of speech and causing the end of speech to be detected again.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
JP9060237A JP3069531B2 (en) | 1997-03-14 | 1997-03-14 | Voice recognition method |
Publications (2)
Publication Number | Publication Date |
---|---|
JPH10254475A | 1998-09-25 |
JP3069531B2 | 2000-07-24 |
Family
ID=13136374
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
JP9060237A Expired - Lifetime JP3069531B2 (en) | 1997-03-14 | 1997-03-14 | Voice recognition method |
Country Status (1)
Country | Link |
---|---|
JP (1) | JP3069531B2 (en) |
Cited By (25)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2002538514A (en) * | 1999-03-05 | 2002-11-12 | パナソニック テクノロジーズ, インコーポレイテッド | Speech detection method using stochastic reliability in frequency spectrum |
JP4745502B2 (en) * | 1999-03-05 | 2011-08-10 | マツシタ エレクトリック コーポレーション オブ アメリカ | Speech detection method using probabilistic reliability in frequency spectrum |
JP2004510209A (en) * | 2000-09-29 | 2004-04-02 | テレフオンアクチーボラゲット エル エム エリクソン(パブル) | Method and apparatus for analyzing spoken number sequences |
US7260527B2 (en) | 2001-12-28 | 2007-08-21 | Kabushiki Kaisha Toshiba | Speech recognizing apparatus and speech recognizing method |
WO2011070972A1 (en) * | 2009-12-10 | 2011-06-16 | 日本電気株式会社 | Voice recognition system, voice recognition method and voice recognition program |
JPWO2011070972A1 (en) * | 2009-12-10 | 2013-04-22 | 日本電気株式会社 | Speech recognition system, speech recognition method, and speech recognition program |
JP5621783B2 (en) * | 2009-12-10 | 2014-11-12 | 日本電気株式会社 | Speech recognition system, speech recognition method, and speech recognition program |
US9002709B2 (en) | 2009-12-10 | 2015-04-07 | Nec Corporation | Voice recognition system and voice recognition method |
JP2019194733A (en) * | 2015-09-03 | 2019-11-07 | グーグル エルエルシー | Method, system, and computer-readable storage medium for enhanced utterance endpoint designation |
CN112735422A (en) * | 2015-09-03 | 2021-04-30 | 谷歌有限责任公司 | Enhanced speech endpoint determination |
JP2022095683A (en) * | 2015-09-03 | 2022-06-28 | グーグル エルエルシー | Method, system and computer-readable storage medium for enhanced utterance endpoint specification |
US10269341B2 (en) | 2015-10-19 | 2019-04-23 | Google Llc | Speech endpointing |
JP2017078869A (en) * | 2015-10-19 | 2017-04-27 | グーグル インコーポレイテッド | Speech endpointing |
US11062696B2 (en) | 2015-10-19 | 2021-07-13 | Google Llc | Speech endpointing |
US11710477B2 (en) | 2015-10-19 | 2023-07-25 | Google Llc | Speech endpointing |
JP2018155779A (en) * | 2017-03-15 | 2018-10-04 | ヤマハ株式会社 | Information providing method and information providing system |
US11551709B2 (en) | 2017-06-06 | 2023-01-10 | Google Llc | End of query detection |
US10593352B2 (en) | 2017-06-06 | 2020-03-17 | Google Llc | End of query detection |
US11676625B2 (en) | 2017-06-06 | 2023-06-13 | Google Llc | Unified endpointer using multitask and multidomain learning |
JP2019008274A (en) * | 2017-06-26 | 2019-01-17 | フェアリーデバイセズ株式会社 | Voice information processing system, control method of voice information processing system, program of voice information processing system and storage medium |
JP2019207329A (en) * | 2018-05-29 | 2019-12-05 | シャープ株式会社 | Electronic apparatus, control device for controlling electronic apparatus, control program and control method |
JP2022003415A (en) * | 2020-11-03 | 2022-01-11 | ベイジン バイドゥ ネットコム サイエンス テクノロジー カンパニー リミテッド | Voice control method and voice control device, electronic apparatus, and storage medium |
US11893988B2 (en) | 2020-11-03 | 2024-02-06 | Beijing Baidu Netcom Science And Technology Co., Ltd. | Speech control method, electronic device, and storage medium |
CN113362827A (en) * | 2021-06-24 | 2021-09-07 | 未鲲(上海)科技服务有限公司 | Speech recognition method, speech recognition device, computer equipment and storage medium |
CN113362827B (en) * | 2021-06-24 | 2024-02-13 | 上海风和雨网络科技有限公司 | Speech recognition method, device, computer equipment and storage medium |
Also Published As
Publication number | Publication date |
---|---|
JP3069531B2 (en) | 2000-07-24 |
Legal Events
Code | Title | Description |
---|---|---|
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20090519; Year of fee payment: 9 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20100519; Year of fee payment: 10 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20110519; Year of fee payment: 11 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20120519; Year of fee payment: 12 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20130519; Year of fee payment: 13 |
FPAY | Renewal fee payment (event date is renewal date of database) | PAYMENT UNTIL: 20140519; Year of fee payment: 14 |
S531 | Written request for registration of change of domicile | JAPANESE INTERMEDIATE CODE: R313531 |
R350 | Written notification of registration of transfer | JAPANESE INTERMEDIATE CODE: R350 |
EXPY | Cancellation because of completion of term | |