US20140278393A1 - Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System - Google Patents
- Publication number
- US20140278393A1 (Application No. US 13/955,186)
- Authority
- US
- United States
- Prior art keywords
- noise
- audio signal
- estimator
- voice
- response
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/20—Speech recognition techniques specially adapted for robustness in adverse environments, e.g. in noise, of stress induced speech
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0272—Voice signal separating
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0316—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude
- G10L21/0364—Speech enhancement, e.g. noise reduction or echo cancellation by changing the amplitude for improving intelligibility
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L25/84—Detection of presence or absence of voice signals for discriminating voice from noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
- G10L2021/02161—Number of inputs available containing the signal or the noise to be suppressed
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/78—Detection of presence or absence of voice signals
- G10L2025/783—Detection of presence or absence of voice signals based on threshold decision
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L21/00—Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
- G10L21/02—Speech enhancement, e.g. noise reduction or echo cancellation
- G10L21/0208—Noise filtering
- G10L21/0216—Noise filtering characterised by the method used for estimating noise
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/21—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
Definitions
- the present disclosure relates generally to voice signal processing and more particularly to voice signal processing for voice recognition systems.
- Mobile devices such as, but not limited to, mobile phones, smart phones, personal digital assistants (PDAs), tablets, laptops or other electronic devices, etc., increasingly include voice recognition systems to provide hands free voice control of the devices.
- Although voice recognition technologies have been improving, accurate voice recognition remains a technical challenge.
- a particular challenge when implementing voice recognition systems on mobile devices is that, as the mobile device moves or is positioned in certain ways, the acoustic environment of the mobile device changes accordingly thereby changing the sound perceived by the mobile device's voice recognition system.
- Voice sound that may be recognized by the voice recognition system under one acoustic environment may be unrecognizable under certain changed conditions due to mobile device motion or positioning.
- Various other conditions in the surrounding environment can add noise, echo or cause other acoustically undesirable conditions that also adversely impact the voice recognition system.
- the mobile device acoustic environment impacts the operation of signal processing components such as microphone arrays, noise suppressors, echo cancellation systems and signal conditioning that is used to improve voice recognition performance.
- Such signal processing operations for voice recognition improvement are not power efficient and increase the drain on battery power. Because users expect voice recognition systems to be available as needed, various voice recognition system programs, processes or services may be required to run continuously, resulting in further increased power consumption.
- FIG. 1 is a schematic block diagram of an apparatus in accordance with the embodiments.
- FIG. 2 is a flow chart providing an example method of operation of the apparatus of FIG. 1 in accordance with various embodiments.
- FIG. 3 is a flow chart showing a method of operation related to voice signal detection in accordance with various embodiments.
- FIG. 4 is a flow chart showing a method of operation related to selection of signal processing in accordance with various embodiments.
- FIG. 5 is a flow chart showing a method of operation in accordance with various embodiments.
- FIG. 6 is a flow chart showing a method of operation in accordance with various embodiments.
- the disclosed embodiments detect when conditions require the use of accurate, and thus less power efficient, signal processing to assist in voice recognition. Such power intensive signal processing is turned off or otherwise disabled to conserve battery power for as long as possible.
- the disclosed embodiments achieve a progressive increase of accuracy by running more computationally efficient signal processing on fewer resources and making determinations of when to invoke more sophisticated signal processing based on detected changes of conditions. More particularly, based on information obtained from signal observations, decisions may be made to power-off hardware that is not needed. In other words, when conditions improve from the standpoint of voice recognition performance, the amount of signal processing is ramped down which results in more efficient use of resources and decreased battery power consumption.
- power consumption is minimized by optimizing voice recognition system operation in every software and hardware layer, including switching off non-essential hardware, running power efficient signal processing and relying on accurate, less power efficient signal processing only when needed to accommodate acoustic environment conditions.
- a disclosed method of operation includes monitoring an audio signal energy level with a plurality of signal processing components deactivated, and activating at least one signal processing component of the plurality of signal processing components in response to a detected change in the audio signal energy level.
- the method may further include activating and running a voice activity detector on the audio signal in response to the detected change in the audio energy level where the voice activity detector is one of the signal processing components that is otherwise kept deactivated.
- a method of operation includes monitoring an audio signal energy level while having a noise suppressor deactivated to conserve battery power, buffering the audio signal in response to a detected change or increase in the audio energy level, activating and running a voice activity detector on the audio signal in response to the detected change or increase in the audio energy level and activating and running a noise estimator in response to voice being detected in the audio signal by the voice activity detector.
- the method may further include activating and running the noise suppressor only if the noise estimator determines that noise suppression is required.
- the method may further include activating and running a noise type classifier to determine the noise type based on information received from the noise estimator and selecting a noise suppressor algorithm, from a group of available noise suppressor algorithms, based on the noise type.
- the selected noise suppressor algorithm may also be selected based on power consumption efficiency for the noise type.
- the method may further include determining, by the noise estimator, that noise suppression is not required, and performing voice recognition on the buffered audio signal without activating the noise suppressor.
- the method may also include applying gain to the buffered audio signal prior to performing voice recognition.
- the method may include activating additional microphones to receive audio in response to the detected increase in the audio energy level.
- the method of operation may deactivate the additional microphones and return to a single microphone configuration in response to voice not being detected in the audio signal by the voice activity detector.
- the energy estimator calculates a long term energy baseline and a short term deviation from it, and monitors the audio signal energy level while having a noise suppressor, or other signal processing components, deactivated to conserve battery power.
- the method of operation may include buffering the audio signal in response to a detected short term deviation.
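- For illustration only, the following Python sketch shows one way the long-term energy baseline and short-term deviation described above could be computed using exponential moving averages over frame energies; the frame size, smoothing constants and threshold are assumptions made for the example, not values taken from the disclosure.

```python
import numpy as np

class EnergyEstimator:
    """Minimal sketch of a long-term energy baseline with short-term deviation
    detection. All constants are illustrative assumptions, not values from the patent."""

    def __init__(self, long_alpha=0.001, short_alpha=0.1, threshold_db=6.0):
        self.long_alpha = long_alpha      # slow smoothing -> long-term baseline
        self.short_alpha = short_alpha    # fast smoothing -> short-term level
        self.threshold_db = threshold_db  # deviation that wakes downstream components
        self.long_term = None
        self.short_term = None

    def update(self, frame):
        """Consume one audio frame (numpy array of samples) and return True when the
        short-term energy deviates from the long-term baseline by more than the
        threshold, i.e. when further signal processing should be activated."""
        energy = float(np.mean(frame.astype(np.float64) ** 2)) + 1e-12
        if self.long_term is None:
            self.long_term = energy
            self.short_term = energy
            return False
        self.short_term += self.short_alpha * (energy - self.short_term)
        self.long_term += self.long_alpha * (energy - self.long_term)
        deviation_db = 10.0 * np.log10(self.short_term / self.long_term)
        return deviation_db > self.threshold_db


# Example: feed 20 ms frames of 16 kHz audio and report the first triggering frame.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    estimator = EnergyEstimator()
    quiet = rng.normal(0.0, 0.01, 16000)   # 1 s of low-level background noise
    loud = rng.normal(0.0, 0.2, 3200)      # 200 ms burst, e.g. a speech onset
    signal = np.concatenate([quiet, loud])
    for start in range(0, len(signal) - 320, 320):  # 320 samples = 20 ms frames
        if estimator.update(signal[start:start + 320]):
            print("deviation detected at sample", start)
            break
```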
- a disclosed apparatus in one embodiment includes a noise suppressor, a voice activity detector and an energy estimator that is operatively coupled to the voice activity detector.
- the energy estimator is operative to monitor an audio signal energy level with at least the noise suppressor and the voice activity detector deactivated.
- Upon detecting a change in the audio signal energy level, the energy estimator is operative to activate at least the voice activity detector in response to the detected change.
- a disclosed apparatus includes voice recognition logic, a noise suppressor operatively coupled to the voice recognition logic, an energy estimator operative to monitor an audio signal energy level while the noise suppressor is deactivated to conserve battery power, and a voice activity detector operatively coupled to the energy estimator.
- the voice activity detector is operative to activate in response to a first activation control signal from the energy estimator.
- a noise estimator is operatively coupled to the voice activity detector. The noise estimator is operative to activate in response to a second activation control signal from the voice activity detector.
- the apparatus may include a buffer that is operatively coupled to the voice recognition logic and the energy estimator.
- the buffer is operative to receive a control signal from the energy estimator and to buffer the audio signal in response to the control signal.
- the energy estimator may be further operative to send the first activation control signal to the voice activity detector in response to a detected change or increase in the audio signal energy level.
- the voice activity detector is operative to send the second activation control signal to the noise estimator in response to detecting voice in the audio signal.
- the apparatus may include a switch that is operatively coupled to the voice recognition logic, the noise suppressor and the noise estimator.
- the noise estimator may actuate the switch to switch the audio signal sent to the voice recognition logic from a buffered audio signal to a noise suppressed audio signal output by the noise suppressor.
- the apparatus may further include a noise suppressor algorithms selector, operatively coupled to the noise estimator and to the noise suppressor. The noise suppressor algorithms selector is operative to activate and run the noise suppressor in response to a noise estimator control signal sent when the noise estimator determines that noise suppression is required.
- the apparatus may further include a noise type classifier, operatively coupled to the noise estimator and to the noise suppressor algorithms selector.
- the noise type classifier is operative to activate and run in response to a control signal from the noise estimator, and is operative to determine the noise type based on information received from the noise estimator.
- the noise suppressor algorithms selector may be further operative to select a noise suppressor algorithm, from a group of available noise suppressor algorithms, where the selected noise suppressor algorithm is the most power consumption efficient for the noise type.
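- As one purely illustrative realization of such a selector, the sketch below performs a lookup keyed by noise type with candidates ordered by power cost; the noise types, algorithm names, SNR limits and cost figures are hypothetical placeholders, since the disclosure does not enumerate specific algorithms.

```python
# Hypothetical selection table: algorithm names, SNR limits, and relative power
# costs are placeholders for illustration; the patent does not list specific algorithms.
NOISE_SUPPRESSOR_TABLE = {
    # noise type: list of (algorithm name, minimum SNR in dB it can handle, relative power cost)
    "stationary": [("single_mic_spectral_subtraction", 5.0, 1.0),
                   ("dual_mic_beamform_postfilter", -5.0, 3.0)],
    "babble":     [("single_mic_wiener", 10.0, 1.5),
                   ("dual_mic_beamform_postfilter", 0.0, 3.0)],
    "wind":       [("dual_mic_coherence_filter", -10.0, 2.5)],
}

def select_noise_suppressor(noise_type, snr_db):
    """Return the most power-efficient algorithm rated for the observed noise type
    and SNR, or None when no candidate fits (e.g. suppression is not needed)."""
    candidates = NOISE_SUPPRESSOR_TABLE.get(noise_type, [])
    usable = [c for c in candidates if snr_db >= c[1]]
    if not usable:
        return None
    # Prefer the cheapest candidate that can still cope with the conditions.
    return min(usable, key=lambda c: c[2])[0]

print(select_noise_suppressor("stationary", 8.0))   # -> single_mic_spectral_subtraction
print(select_noise_suppressor("stationary", -2.0))  # -> dual_mic_beamform_postfilter
```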
- the noise estimator may also be operative to determine that noise suppression is not required and actuate the switch to switch the audio signal sent to the voice recognition logic from a noise suppressed audio signal output by the noise suppressor to a buffered audio signal.
- the apparatus includes a plurality of microphones and microphone configuration logic that includes, among other things, switch logic operative to turn each microphone on or off.
- the energy estimator is further operative to control the microphone configuration logic to turn on additional microphones in response to a detected change or increase in the audio signal energy level.
- the voice activity detector may be further operative to deactivate the additional microphones and return to a single microphone configuration, or to a low power mode of that microphone, in response to voice not being detected in the audio signal by the voice activity detector.
- FIG. 1 is a schematic block diagram of an apparatus 100 which is a voice recognition system in accordance with various embodiments.
- the apparatus 100 may be incorporated into and used in various battery-powered electronic devices that employ voice-recognition. That is, the apparatus 100 may be used in any of various mobile devices such as, but not limited to, a mobile telephone, smart phone, camera, video camera, tablet, laptop, audio recorder or some other battery-powered electronic device, etc.
- FIG. 1 schematic block diagram is limited, for the purpose of clarity, to showing only those components useful to describe the features and advantages of the various embodiments, and to describe how to make and use the various embodiments to those of ordinary skill. It is therefore to be understood that various other components, circuitry, and devices etc. may be present in order to implement an apparatus and that those various other components, circuitry, devices, etc., are understood to be present by those of ordinary skill.
- the apparatus may include inputs for receiving power from a power source and a power bus that may be connected to a battery housed within one of the various battery powered electronic devices such as mobile devices, etc. to provide power to the apparatus 100 or to distribute power to the various components of the apparatus 100 .
- the apparatus 100 may also include an internal communication bus, for providing operative coupling between the various components, circuitry, and devices.
- operatively coupled refers to coupling that enables operational and/or functional communication and relationships between the various components, circuitry, devices etc. described as being operatively coupled and may include any intervening items (i.e. buses, connectors, other components, circuitry, devices etc.) used to enable such communication such as, for example, internal communication buses such as data communication buses or any other intervening items that one of ordinary skill would understand to be present.
- other intervening items may be present between “operatively coupled” items even though such other intervening items are not necessary to the functional communication facilitated by the operative coupling.
- a data communication bus may be present in various embodiments and may provide data to several items along a pathway along which two or more items are operatively coupled, etc. Such operative coupling is shown generally in FIG. 1 described herein.
- the apparatus 100 may include a group of microphones 110 that provide microphone outputs and that are operatively coupled to microphone configuration logic 120 .
- Although the example of FIG. 1 shows three microphones, the embodiments are not limited to three microphones and any number of microphones may be used in the embodiments.
- the group of microphones 110 are shown using a dotted line in FIG. 1 because the group of microphones 110 is not necessarily a part of the apparatus 100 . In other words, the group of microphones 110 may be part of a mobile device or some other device into which the apparatus 100 is incorporated.
- the apparatus 100 is operatively coupled to the group of microphones 110 , which are located within the mobile device, via a suitable communication bus or suitable connectors, etc., such that the group of microphones 110 are operatively coupled to the microphone configuration logic 120 .
- the microphone configuration logic 120 may include various front end processing, such as, but not limited to, signal amplification, analog-to-digital conversion/digital audio sampling, echo cancellation, etc., which may be applied to the microphone M 1 , M 2 , M 3 outputs prior to performing additional, less power efficient signal processing such as noise suppression.
- the microphone configuration logic 120 may also include switch logic operatively coupled to the group of microphones 110 and operative to respond to control signals to turn each of microphones M 1 , M 2 or M 3 on or off so as to save power consumption by not using the front end processing of the microphone configuration logic 120 for those microphones that are turned off. Additionally, in some embodiments, the microphone configuration logic 120 may be operative to receive control signals from other components of the apparatus 100 to adjust front end processing parameters such as, for example, amplifier gain.
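- For illustration, the per-microphone switch logic and gain control just described could take a form like the following sketch, in which hardware control is mocked with prints and the microphone identifiers and method names are assumptions made for the example.

```python
class MicrophoneConfigLogic:
    """Sketch of microphone configuration logic: per-microphone on/off switching plus
    an adjustable front-end gain. Real hardware control is mocked with prints."""

    def __init__(self, mic_ids=("M1", "M2", "M3")):
        self.enabled = {mic: False for mic in mic_ids}
        self.front_end_gain_db = 0.0

    def set_microphone(self, mic_id, on):
        """Turn one microphone (and, implicitly, its front-end processing) on or off."""
        if self.enabled[mic_id] != on:
            self.enabled[mic_id] = on
            print(f"{mic_id} {'powered up' if on else 'powered down'}")

    def single_microphone_mode(self, keep="M3"):
        """Default low-power state: only one microphone active."""
        for mic_id in self.enabled:
            self.set_microphone(mic_id, mic_id == keep)

    def set_front_end_gain(self, gain_db):
        """Respond to a control signal requesting a front-end gain adjustment."""
        self.front_end_gain_db = gain_db
        print(f"front-end gain set to {gain_db:.1f} dB")


config = MicrophoneConfigLogic()
config.single_microphone_mode()      # conserve power: only M3 active
config.set_microphone("M1", True)    # energy deviation detected: add a microphone
config.set_front_end_gain(6.0)       # e.g. boost gain rather than de-noise
```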
- the microphone configuration logic 120 is operatively coupled to a history buffer 130 , to provide the three microphone outputs M 1 , M 2 and M 3 to the history buffer 130 .
- Microphone configuration logic 120 is also operatively coupled to an energy estimator 140 and provides a single microphone output M 3 to the energy estimator 140 .
- the energy estimator 140 is operatively coupled to the history buffer 130 and to a voice activity detector 150 .
- the energy estimator 140 provides a control signal 115 to the history buffer 130 , a control signal 117 to the voice activity detector 150 and a control signal 121 to the microphone configuration logic 120 .
- the voice activity detector 150 is also operatively coupled to the microphone configuration logic 120 to receive the microphone M 3 output and to provide a control signal 123 to microphone configuration logic 120 .
- the voice activity detector 150 is further operatively coupled to a signal-to-noise ratio (SNR) estimator 160 and provides a control signal 119 .
- the signal-to-noise ratio (SNR) estimator 160 is operatively coupled to the history buffer 130 , a noise type classifier 170 , a noise suppressor algorithms selector 180 , and a switch 195 .
- the various signal processing components such as the voice activity detector 150 , SNR estimator 160 , noise type classifier 170 , noise suppressor algorithms selector 180 and noise suppressor 190 are kept in a deactivated state until needed and are progressively activated according to various decisions which may also be made progressively. Likewise, activated signal components are progressively deactivated when no longer needed in accordance with the embodiments.
- the SNR estimator 160 receives a buffered voice signal 113 from the history buffer 130 and provides control signal 127 to the switch 195 , control signal 129 to noise type classifier 170 , and control signal 135 to the noise suppressor algorithms selector 180 .
- the noise type classifier 170 is operatively coupled to the history buffer 130 , the SNR estimator 160 and the noise suppressor algorithms selector 180 .
- the noise type classifier 170 receives a buffered voice signal 111 from the history buffer 130 and provides a control signal 131 to the noise suppressor algorithms selector 180 .
- the noise suppressor algorithms selector 180 is operatively coupled to the SNR estimator 160 , the noise type classifier 170 , the microphone configuration logic 120 , a noise suppressor 190 and system memory 107 .
- the noise suppressor algorithms selector 180 provides a control signal 125 to the microphone configuration logic 120 and a control signal 137 to a noise suppressor 190 .
- the noise suppressor algorithms selector 180 is also operatively coupled to system memory 107 by a read-write connection 139 .
- the noise suppressor 190 receives the buffered voice signal 111 from the history buffer 130 and provides a noise suppressed voice signal 133 to the switch 195 .
- the noise suppressor 190 may also be operatively coupled to system memory 107 by a read-write connection 143 in some embodiments.
- the switch 195 is operatively coupled to the noise suppressor 190 and to automatic gain control (AGC) 105 , and provides voice signal 141 to the AGC 105 .
- Voice command recognition logic 101 is operatively coupled to AGC 105 and to the system control 103 , which may be any type of voice controllable system control depending on the mobile device such as, but not limited to, a voice controlled dialer of a mobile telephone, a video recorder system control, an application control of a mobile telephone, smartphone, tablet, laptop, etc., or any other type of voice controllable system control.
- the AGC 105 adjusts the voice signal 141 received from the switch 195 and provides a gain adjusted voice signal 145 to the voice command recognition logic 101 .
- the voice command recognition logic 101 sends a control signal 147 to the system control 103 in response to detected command words or command phrases received on the voice signal 145 .
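- For illustration, a minimal automatic gain control of the kind performed by the AGC 105 could scale each frame toward a target level as in the following sketch; the target level and gain cap are illustrative assumptions.

```python
import numpy as np

def apply_agc(voice_signal, target_rms=0.1, max_gain=20.0):
    """Minimal automatic gain control sketch: scale the frame so its RMS approaches a
    target level, with the gain capped. Target and cap are illustrative assumptions."""
    rms = float(np.sqrt(np.mean(voice_signal.astype(np.float64) ** 2))) + 1e-12
    gain = min(target_rms / rms, max_gain)
    return np.clip(voice_signal * gain, -1.0, 1.0)

quiet_frame = 0.01 * np.random.default_rng(1).standard_normal(320)
boosted = apply_agc(quiet_frame)
print(round(float(np.sqrt(np.mean(boosted ** 2))), 3))  # close to the 0.1 target
```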
- a transceiver 197 may also be present and may be operatively coupled to receive either the gain adjusted voice signal 145 as shown, or to receive the voice signal 141 .
- the transceiver 197 may be a wireless transceiver for wireless communication using any wireless technology and may utilize the received voice signal as an uplink (i.e. send) transmission portion of a wireless duplex communication channel in embodiments where the apparatus 100 is used in a mobile telephone or smartphone, or etc.
- either the gain adjusted voice signal 145 or the voice signal 141 may also be provided to a transceiver external to the apparatus 100 using appropriate connectivity between the apparatus 100 and a device into which the apparatus 100 is incorporated.
- the transceiver 197 may be used to transmit voice commands to a remote voice command recognition system.
- the system memory 107 is a non-volatile, non-transitory memory, and may be accessible by other components of the apparatus 100 for various settings, stored applications, etc.
- system memory 107 may store a database of noise suppression algorithms 109 , which may be accessed by noise suppressor algorithms selector 180 , over read-write connection 139 .
- the noise suppressor 190 accesses system memory 107 over read-write connection 143 and may retrieve selected noise suppression algorithms from the database of noise suppression algorithms 109 for execution.
- the switch 195 is operative to respond to the control signal 127 from the SNR estimator 160 , to switch its output voice signal 141 between the buffered voice signal 111 and the noise suppressor 190 noise suppressed voice signal 133 .
- switch 195 operates as a changeover switch.
- the output voice signal 141 from switch 195 is provided to the AGC 105 .
- the disclosed embodiments employ voice activity detector 150 to distinguish voice activity from noise and accordingly enable the voice command recognition logic 101 and noise reduction as needed to improve voice recognition performance.
- the embodiments also utilize a low power noise estimator, SNR estimator 160 , to determine when to enable or disable noise reduction thereby saving battery power. For example, under low noise conditions, the noise reduction can be disabled accordingly. Also, some microphones may be turned off during low noise conditions which also conserves battery power.
- voice activity detector 150 may trigger operation of noise suppressor 190 or may send control signal 123 to the microphone configuration logic 120 to increase front end processing gain, rather than invoke the noise suppressor 190 , initially for low noise conditions.
- dual-microphone noise reduction may be enabled.
- a single microphone may be used, and the energy estimator 140 may create a long term energy base line from which rapid deviations will trigger the noise suppressor 190 and voice activity detector (VAD) 150 to analyze the voice signal and to decide when noise reduction should be applied.
- an absolute ambient noise measurement may be used to decide if noise reduction should be applied and, if so, the type of noise reduction best suited for the condition. That is, because the noise suppressor algorithms selected will impact power consumption, selectively running or not running certain noise suppressor algorithms serves to minimize battery power consumption.
- the energy estimator 140 is operative to detect deviations from a baseline that may be an indicator of voice being present in a received audio signal, received, for example, from microphone M 3 . If such deviations are detected, the energy estimator 140 may send control signal 117 to activate VAD 150 to determine if voice is actually present in the received audio signal.
- An example method of operation of the apparatus 100 may be understood in view of the flowchart of FIG. 2 .
- the method of operation begins in operation block 201 which represents a default state in which the microphone configuration logic 120 is controlled to use a single microphone configuration in order to conserve battery power. Any front end processing of the microphone configuration logic 120 for other microphones of the group of microphones 110 is therefore turned off.
- the energy estimator 140 determines an energy baseline. The energy estimator 140 first calculates the signal level and long term power estimates, and short-term deviation from the long-term baseline. Short-term deviations exceeding a threshold invoke powering multiple microphones and buffering the signals.
- the energy estimator 140 monitors the audio output from one microphone such as microphone M 3 and looks for changes in the audio signal energy level. If an observed short-term deviation exceeds the threshold in decision block 205 , the energy estimator 140 sends control signal 121 to the microphone configuration logic 120 to turn on at least one additional microphone as shown in operation block 207 . In operation block 213 , the energy estimator 140 also sends control signal 115 to history buffer 130 to invoke buffering of audio signals from the activated microphones since the buffered audio may need to have noise suppression applied in operation block 229 . Also, in operation block 209 , energy estimator 140 sends control signal 117 to VAD 150 to activate VAD 150 to determine if speech is present in the M 3 audio signal.
- the energy estimator 140 continues to monitor the single microphone as in operation block 201 .
- the energy estimator 140 operates to monitor an audio signal from at least one of the microphones while other signal processing components remain deactivated.
- a deactivated signal processing component is one that is powered down or placed in a low power mode such as a sleep state where the signal processing component operates with either no power consumption or with reduced power consumption. The signal processing component is therefore activated when it is either powered up or is awakened from a low power mode such as a sleep state.
- In decision block 211 , if the VAD 150 does not detect speech, the VAD 150 sends control signal 123 to the microphone configuration logic 120 and returns the system to a lower power state. For example, in operation block 231 , the control signal 123 may turn off any additional microphones so that only a single microphone is used. If voice (i.e. speech activity) is detected in decision block 211 , then VAD 150 sends control signal 119 to activate SNR estimator 160 . In operation block 215 , the SNR estimator 160 proceeds to estimate the short-term signal-to-noise ratio and signal levels in order to determine if de-noising is needed.
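- For illustration, a short-term signal-to-noise ratio estimate of the kind computed in operation block 215 could be formed from the energies of speech-flagged frames versus noise-flagged frames, as in the following sketch; the frame labeling and constants are assumptions made for the example.

```python
import numpy as np

def estimate_snr_db(frames, speech_flags, floor=1e-12):
    """Sketch of a short-term SNR estimate: average energy of frames flagged as speech
    versus average energy of frames flagged as noise. 'frames' is a list of numpy
    arrays and 'speech_flags' a parallel list of booleans (e.g. from a VAD)."""
    energies = np.array([np.mean(f.astype(np.float64) ** 2) for f in frames])
    flags = np.array(speech_flags, dtype=bool)
    speech = energies[flags].mean() if flags.any() else floor
    noise = energies[~flags].mean() if (~flags).any() else floor
    return 10.0 * np.log10(max(speech, floor) / max(noise, floor))

rng = np.random.default_rng(2)
noise_frames = [0.01 * rng.standard_normal(320) for _ in range(10)]
speech_frames = [0.1 * rng.standard_normal(320) for _ in range(10)]
frames = noise_frames + speech_frames
flags = [False] * 10 + [True] * 10
print(round(estimate_snr_db(frames, flags), 1), "dB")  # roughly 20 dB in this example
```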
- the SNR estimator 160 may send control signal 127 to the switch 195 to maintain the apparatus 100 in a low power state, i.e. bypassing and not using the noise suppressor 190 .
- the apparatus 100 may also be returned to a single microphone mode of operation.
- the noise suppressor algorithms selector 180 may send control signal 125 to the microphone configuration logic 120 to switch off any additional microphones.
- the voice signal 141 is provided to the AGC 105 and is gained up to obtain the level required and the gain adjusted voice signal 145 is sent to the voice command recognition logic 101 .
- the voice command recognition logic 101 then operates on the gain adjusted voice signal 145 and, if command words or command phrases are detected, may send control signal 147 to the system control 103 .
- the method of operation then ends. If noise reduction is determined to be necessary by the SNR estimator 160 in decision block 217 , then the SNR estimator 160 sends control signal 129 to activate noise type classifier 170 as shown in operation block 223 .
- the noise type classifier 170 receives the buffered voice signal 111 , and may also receive signal-to-noise ratio information from SNR estimator 160 via control signal 129 .
- the noise type classifier 170 assigns a noise type and sends the noise type information by control signal 131 to noise suppressor algorithms selector 180 .
- the noise suppressor algorithms selector 180 may also receive information from SNR estimator 160 via control signal 135 .
- the noise suppressor algorithms selector 180 proceeds to select an appropriate noise suppressor algorithm for the observed conditions (i.e. observed SNR and noise type). This may be accomplished, in some embodiments, by accessing system memory 107 over read-write connection 139 .
- the system memory 107 may store the database of noise suppression algorithms 109 and any other useful information such as an associated memory table that can be used to compare observed SNR and noise types to select a suitable noise suppression algorithm.
- the noise suppressor algorithms selector 180 may then send control signal 137 to activate noise suppressor 190 and to provide a pointer to the location in system memory 107 of the selected noise suppression algorithm.
- the noise suppressor algorithms selector 180 may also send control signal 125 to the microphone configuration logic to make any adjustments that might be needed in relation to the selected noise suppressor algorithm.
- the noise suppressor 190 may access system memory 107 and the database of noise suppression algorithms 109 over read-write connection 143 to access the selected noise suppression algorithm and execute it accordingly.
- the SNR estimator 160 will also send control signal 127 to switch 195 to switch to receive the noise suppressed voice signal 133 output from noise suppressor 190 , rather than the buffered voice signal 111 .
- the noise suppressor 190 receives the buffered voice signal 111 , applies the selected noise suppression algorithm and provides the noise suppressed voice signal 133 to switch 195 .
- the method of operation then again proceeds to operation block 219 where the voice signal 141 is provided to the AGC 105 and is gained up to obtain the level required and the gain adjusted voice signal 145 is sent to the voice command recognition logic 101 .
- the voice command recognition logic 101 operates on the gain adjusted voice signal 145 and the method of operation ends as shown.
- the apparatus 100 may then return to single microphone operation and the method of operation beginning at operation block 201 may continue.
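- The decision sequence walked through above can be summarized, purely for illustration, by the following sketch in which simple inputs stand in for the outputs of the energy estimator 140 , VAD 150 and SNR estimator 160 ; the SNR threshold and action labels are assumptions made for the example.

```python
def voice_recognition_flow(deviation_detected, speech_detected, snr_db,
                           clean_snr_threshold_db=15.0):
    """Sketch of the FIG. 2 style decision sequence showing which components would be
    active at each step. The inputs stand in for the energy estimator, VAD, and SNR
    estimator outputs; the threshold is an illustrative assumption."""
    active = ["energy estimator (single microphone)"]
    if not deviation_detected:
        return active, "keep monitoring in low-power state"

    active += ["additional microphone(s)", "history buffer", "voice activity detector"]
    if not speech_detected:
        return active, "deactivate extras and return to single-microphone monitoring"

    active += ["SNR estimator"]
    if snr_db >= clean_snr_threshold_db:
        active += ["AGC", "voice command recognition"]
        return active, "recognize buffered audio without noise suppression"

    active += ["noise type classifier", "noise suppressor algorithms selector",
               "noise suppressor", "AGC", "voice command recognition"]
    return active, "noise-suppress buffered audio, then recognize"


print(voice_recognition_flow(True, True, 20.0))   # favorable SNR: suppressor stays off
print(voice_recognition_flow(True, False, 0.0))   # no speech: fall back to low power
```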
- a noise suppressor algorithm is invoked based on an attempt to determine the type of noise present in the environment, that is, based on the noise type and the signal-to-noise ratio. As the noise conditions worsen, different noise suppression algorithms can be used, with progressively increased complexity and power consumption cost. As discussed above with respect to decision block 211 , the system returns to a low power state after a negative VAD 150 decision or, in some embodiments, after some time-out period.
- the apparatus 100 may run a continuous, single-microphone-powered, long-term noise estimator/classifier which can store a set of noise estimates to be used by the noise reduction system to help speed up convergence.
- alternatively, a continuously run VAD may be employed to look for speech activity. In either of these embodiments, the apparatus will remain in an elevated power state, returning from voice recognition invocation to VAD estimation.
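- One way to picture storing long-term noise estimates so that a later noise-reduction stage can be warm-started is sketched below; the per-noise-type spectral averaging scheme is an assumption made for the example, not a structure taken from the disclosure.

```python
import numpy as np

class LongTermNoiseStore:
    """Sketch of keeping long-term noise estimates per noise type so a noise-reduction
    stage can be seeded instead of converging from scratch. The per-type averaging
    scheme is an illustrative assumption."""

    def __init__(self, alpha=0.05):
        self.alpha = alpha
        self.spectra = {}          # noise type -> smoothed magnitude spectrum

    def update(self, noise_type, frame):
        """Fold one noise-only frame into the stored estimate for its noise type."""
        spectrum = np.abs(np.fft.rfft(frame.astype(np.float64)))
        if noise_type not in self.spectra:
            self.spectra[noise_type] = spectrum
        else:
            prev = self.spectra[noise_type]
            self.spectra[noise_type] = (1.0 - self.alpha) * prev + self.alpha * spectrum

    def initial_estimate(self, noise_type):
        """Return a stored estimate to seed the noise suppressor, or None if unseen."""
        return self.spectra.get(noise_type)


store = LongTermNoiseStore()
rng = np.random.default_rng(3)
for _ in range(50):                                  # background monitoring phase
    store.update("stationary", 0.01 * rng.standard_normal(320))
seed = store.initial_estimate("stationary")
print(None if seed is None else seed.shape)          # (161,) for 320-sample frames
```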
- the various components, circuitry, devices etc. described with respect to FIG. 1 including, but not limited to, those described using the term “logic,” such as the microphone configuration logic 120 , history buffer 130 , energy estimator 140 , VAD 150 , SNR estimator 160 , noise type classifier 170 , noise suppressor algorithms selector 180 , noise suppressor 190 , switch 195 , AGC 105 , voice command recognition logic 101 , or system control 103 may be implemented in various ways such as by software and/or firmware executing on one or more programmable processors such as a central processing unit (CPU) or the like, or by ASICs, DSPs, FPGAs, hardwired circuitry (logic circuitry), or any combinations thereof.
- control signals may be implemented in various ways such as using application programming interfaces (APIs) between the various components. Therefore, in some embodiments, components may be operatively coupled using APIs rather than a hardware communication bus if such components are implemented by software and/or firmware executing on one or more programmable processors.
- the noise suppressor algorithms selector 180 and the noise suppressor 190 may be software and/or firmware executing on a single processor and may communicate and interact with each other using APIs.
- operations involving the system memory 107 may be implemented using pointers where the components such as, but not limited to, the noise suppressor algorithms selector 180 or the noise suppressor 190 , access the system memory 107 as directed by control signals which may include pointers to memory locations or database access commands that access the database of noise suppression algorithms 109 .
- such operations may be accomplished in the various embodiments using application programming interfaces (APIs).
- FIG. 3 is a flow chart showing a method of operation related to voice signal detection in accordance with various embodiments.
- an apparatus uses a microphone signal level as a measure to determine if pre-processing is needed.
- the apparatus runs a detector for energy deviations from a long term base-line and invokes VAD/noise estimators to make decisions as to when voice recognition logic should operate.
- the apparatus detects the need for signal conditioning based on a low-power noise estimator (i.e. by running the noise estimator only).
- the apparatus uses a VAD to distinguish voice activity from noise and to determine whether or not to run noise suppression, or voice recognition, and runs one or the other only when needed.
- the apparatus will classify the noise type, and based on noise type, will invoke appropriate noise suppression or other appropriate signal conditioning.
- FIG. 4 is a flow chart showing a method of operation related to selection of signal processing in accordance with various embodiments.
- the apparatus determines which microphones are not needed (as well as any associated circuitry such as amplifiers, A/D converters etc.) and turns off the microphones (and any associated circuitry) accordingly.
- the apparatus uses a single microphone for continuously running triggers/estimators.
- the apparatus uses an ultra-low-power microphone for monitoring only (or uses lower power mode for one of the microphones).
- the apparatus stores data in a history buffer, and when triggered processes only data in the history buffer, rather than continuously.
- the history buffer maintains an audio signal of interest while decisions are made as to whether voice is present in the audio signal and, subsequently, whether further signal processing components should be invoked such as noise suppression. If further signal processing components such as the noise suppressor are not required, the buffered audio signal may be sent directly to the voice command recognition logic 101 .
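- For illustration, such a history buffer could be a simple ring buffer as in the following sketch; the two-second capacity is an assumption made for the example.

```python
from collections import deque

import numpy as np

class HistoryBuffer:
    """Sketch of the history buffer: keep the most recent audio frames so that, once a
    decision is made, only this stored window is processed instead of a continuous
    stream. The two-second capacity is an illustrative assumption."""

    def __init__(self, max_frames=100):          # e.g. 100 x 20 ms = 2 s of audio
        self.frames = deque(maxlen=max_frames)

    def append(self, frame):
        self.frames.append(np.asarray(frame))

    def contents(self):
        """Return the buffered audio as one array, e.g. for the noise suppressor or
        for direct hand-off to voice recognition when suppression is not needed."""
        if not self.frames:
            return np.zeros(0)
        return np.concatenate(list(self.frames))


buffer = HistoryBuffer()
for _ in range(5):
    buffer.append(np.zeros(320))
print(buffer.contents().shape)  # (1600,)
```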
- the apparatus uses no noise suppression in quiet conditions, single-microphone noise suppression when, for example, the SNR and noise type are favorable, and multiple-microphone noise suppression only when the observed conditions require it.
- the apparatus determines the signal level and its SNR dependency, and maximizes gain in high SNR conditions (i.e. if favorable conditions exist, it applies gain to boost the signal rather than de-noising it).
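- For illustration, the boost-rather-than-de-noise decision could be as simple as the following sketch; the threshold and the returned action labels are assumptions made for the example.

```python
def condition_audio(snr_db, high_snr_threshold_db=15.0):
    """Sketch of the 'boost rather than de-noise' decision: in favorable SNR conditions
    apply gain only; otherwise fall back to noise suppression. The threshold and the
    action labels are illustrative assumptions."""
    if snr_db >= high_snr_threshold_db:
        return "apply_gain"          # cheap path: AGC only, suppressor stays off
    return "run_noise_suppressor"    # costly path: wake the suppressor chain

print(condition_audio(20.0))  # apply_gain
print(condition_audio(5.0))   # run_noise_suppressor
```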
- the apparatus uses voice recognition specially trained with a power-efficient noise-reduction pre-processing algorithm, and runs the power-efficient noise reduction front end on the portable device (i.e. a mobile device in which the apparatus is incorporated).
- the apparatus uses long-term noise estimates to configure apparatus components such as voice recognition and signal conditioning components, and uses the short-term estimate to select optimal configurations and switch between those.
- an audio signal energy level is monitored while having other signal processing components deactivated.
- at least one of the other signal processing components is activated in response to a detected change in the audio signal energy level. For example, if the energy level changes, this may be an indication that a device operator is speaking and attempting to command the device.
- a VAD may be activated as the at least one other signal processing component in some embodiments. If the VAD detects the presence of voice in the audio signal, further signal processing components, such as a noise suppressor, may be activated. In another embodiment, a noise estimator may be activated initially using the assumption that voice is present in the audio signal.
- the flowchart of FIG. 6 provides a method of operation where a VAD is activated in response to changes in the audio signal level as shown in operation block 601 .
- Other signal processing components are deactivated initially.
- In operation block 603 , if voice is detected by the VAD, other signal processing components are activated in order to analyze the audio signal and determine whether noise suppression should be applied. Noise suppression is then either applied, or not applied, accordingly.
- In operation block 605 , various audio signal processing components are either activated or deactivated as audio signal conditions change or when voice is no longer detected.
- the apparatus may be returned from a multi-microphone configuration to a single, low-power microphone configuration, and noise suppressors, etc. may be deactivated.
Abstract
A disclosed method includes monitoring an audio signal energy level while having a plurality of signal processing components deactivated and activating at least one signal processing component in response to a detected change in the audio signal energy level. The method may include activating and running a voice activity detector on the audio signal in response to the detected change where the voice activity detector is the at least one signal processing component. The method may further include activating and running the noise suppressor only if a noise estimator determines that noise suppression is required. The method may activate and run a noise type classifier to determine the noise type based on information received from the noise estimator and may select a noise suppressor algorithm, from a group of available noise suppressor algorithms, where the selected noise suppressor algorithm is the most power consumption efficient.
Description
- The present application claims priority to U.S. Provisional Patent Application No. 61/827,797, filed May 28, 2013, entitled “APPARATUS AND METHOD FOR POWER EFFICIENT SIGNAL CONDITIONING IN A VOICE RECOGNITION SYSTEM,” and further claims priority to U.S. Provisional Patent Application No. 61/798,097, filed Mar. 15, 2013, entitled “VOICE RECOGNITION FOR A MOBILE DEVICE,” and further claims priority to U.S. Provisional Pat. App. No. 61/776,793, filed Mar. 12, 2013, entitled “VOICE RECOGNITION FOR A MOBILE DEVICE,” all of which are assigned to the same assignee as the present application, and all of which are hereby incorporated by reference herein in their entirety.
- The present disclosure relates generally to voice signal processing and more particularly to voice signal processing for voice recognition systems.
- Mobile devices such as, but not limited to, mobile phones, smart phones, personal digital assistants (PDAs), tablets, laptops or other electronic devices, etc., increasingly include voice recognition systems to provide hands free voice control of the devices. Although voice recognition technologies have been improving, accurate voice recognition remains a technical challenge.
- A particular challenge when implementing voice recognition systems on mobile devices is that, as the mobile device moves or is positioned in certain ways, the acoustic environment of the mobile device changes accordingly thereby changing the sound perceived by the mobile device's voice recognition system. Voice sound that may be recognized by the voice recognition system under one acoustic environment may be unrecognizable under certain changed conditions due to mobile device motion or positioning. Various other conditions in the surrounding environment can add noise, echo or cause other acoustically undesirable conditions that also adversely impact the voice recognition system.
- More specifically, the mobile device acoustic environment impacts the operation of signal processing components such as microphone arrays, noise suppressors, echo cancellation systems and signal conditioning that is used to improve voice recognition performance. Such signal processing operations for voice recognition improvement are not power efficient and increase the drain on battery power. Because users expect voice recognition systems to be available as needed, various voice recognition system programs, processes or services may be required to run continuously, resulting in further increased power consumption.
- FIG. 1 is a schematic block diagram of an apparatus in accordance with the embodiments.
- FIG. 2 is a flow chart providing an example method of operation of the apparatus of FIG. 1 in accordance with various embodiments.
- FIG. 3 is a flow chart showing a method of operation related to voice signal detection in accordance with various embodiments.
- FIG. 4 is a flow chart showing a method of operation related to selection of signal processing in accordance with various embodiments.
- FIG. 5 is a flow chart showing a method of operation in accordance with various embodiments.
- FIG. 6 is a flow chart showing a method of operation in accordance with various embodiments.
- Briefly, the disclosed embodiments detect when conditions require the use of accurate, and thus less power efficient, signal processing to assist in voice recognition. Such power-intensive signal processing is turned off or otherwise disabled to conserve battery power for as long as possible. The disclosed embodiments achieve a progressive increase of accuracy by running more computationally efficient signal processing on fewer resources and making determinations of when to invoke more sophisticated signal processing based on detected changes of conditions. More particularly, based on information obtained from signal observations, decisions may be made to power off hardware that is not needed. In other words, when conditions improve from the standpoint of voice recognition performance, the amount of signal processing is ramped down, which results in more efficient use of resources and decreased battery power consumption.
- Among other advantages of the disclosed embodiments, power consumption is minimized by optimizing voice recognition system operation in every software and hardware layer, including switching off non-essential hardware, running power efficient signal processing and relying on accurate, less power efficient signal processing only when needed to accommodate acoustic environment conditions.
- A disclosed method of operation includes monitoring an audio signal energy level with a plurality of signal processing components deactivated, and activating at least one signal processing component of the plurality of signal processing components in response to a detected change in the audio signal energy level. The method may further include activating and running a voice activity detector on the audio signal in response to the detected change in the audio energy level where the voice activity detector is one of the signal processing components that is otherwise kept deactivated. In one embodiment, a method of operation includes monitoring an audio signal energy level while having a noise suppressor deactivated to conserve battery power, buffering the audio signal in response to a detected change or increase in the audio energy level, activating and running a voice activity detector on the audio signal in response to the detected change or increase in the audio energy level and activating and running a noise estimator in response to voice being detected in the audio signal by the voice activity detector. In some embodiments, the method may further include activating and running the noise suppressor only if the noise estimator determines that noise suppression is required. In some embodiments, the method may further include activating and running a noise type classifier to determine the noise type based on information received from the noise estimator and selecting a noise suppressor algorithm, from a group of available noise suppressor algorithms, based on the noise type. The selected noise suppressor algorithm may also be selected based on power consumption efficiency for the noise type. The method may further include determining, by the noise estimator, that noise suppression is not required, and performing voice recognition on the buffered audio signal without activating the noise suppressor.
- The method may also include applying gain to the buffered audio signal prior to performing voice recognition. The method may include activating additional microphones to receive audio in response to the detected increase in the audio energy level. The method of operation may deactivate the additional microphones and return to a single microphone configuration in response to voice not being detected in the audio signal by the voice activity detector. The energy estimator calculates a long term energy baseline and a short term deviation from it, and monitors the audio signal energy level while having a noise suppressor, or other signal processing components, deactivated to conserve battery power. The method of operation may include buffering the audio signal in response to a detected short term deviation.
- A disclosed apparatus in one embodiment includes a noise suppressor, a voice activity detector and an energy estimator that is operatively coupled to the voice activity detector. The energy estimator is operative to monitor an audio signal energy level with at least the noise suppressor and the voice activity detector deactivated. Upon detecting a change in the audio signal energy level, the energy estimator is operative to activate at least the voice activity detector in response to the detected change. In one embodiment, a disclosed apparatus includes voice recognition logic, a noise suppressor operatively coupled to the voice recognition logic, an energy estimator operative to monitor an audio signal energy level while the noise suppressor is deactivated to conserve battery power, and a voice activity detector operatively coupled to the energy estimator. The voice activity detector is operative to activate in response to a first activation control signal from the energy estimator. A noise estimator is operatively coupled to the voice activity detector. The noise estimator is operative to activate in response to a second activation control signal from the voice activity detector.
- In various embodiments, the apparatus may include a buffer that is operatively coupled to the voice recognition logic and the energy estimator. The buffer is operative to receive a control signal from the energy estimator and to buffer the audio signal in response to the control signal. The energy estimator may be further operative to send the first activation control signal to the voice activity detector in response to a detected change or increase in the audio signal energy level. The voice activity detector is operative to send the second activation control signal to the noise estimator in response to detecting voice in the audio signal.
- In various embodiments, the apparatus may include a switch that is operatively coupled to the voice recognition logic, the noise suppressor and the noise estimator. The noise estimator may actuate the switch to switch the audio signal sent to the voice recognition logic from a buffered audio signal to a noise suppressed audio signal output by the noise suppressor. The apparatus may further include a noise suppressor algorithms selector, operatively coupled to the noise estimator and to the noise suppressor. The noise suppressor algorithms selector is operative to activate and run the noise suppressor in response to a noise estimator control signal sent when the noise estimator determines that noise suppression is required.
- In some embodiments, the apparatus may further include a noise type classifier, operatively coupled to the noise estimator and to the noise suppressor algorithms selector. The noise type classifier is operative to activate and run in response to a control signal from the noise estimator, and is operative to determine noise type based on information received from the noise estimator. The noise suppressor algorithms selector may be further operative to select a noise suppressor algorithm, from a group of available noise suppressor algorithms, where the selected noise suppressor algorithm is the most power consumption efficient for the noise type. The noise estimator may also be operative to determine that noise suppression is not required and actuate the switch to switch the audio signal sent to the voice recognition logic from a noise suppressed audio signal output by the noise suppressor to a buffered audio signal.
- In some embodiments, the apparatus includes a plurality of microphones and microphone configuration logic that includes, among other things, switch logic operative to turn each microphone on or off. The energy estimator is further operative to control the microphone configuration logic to turn on additional microphones in response to a detected change or increase in the audio signal energy level. The voice activity detector may be further operative to deactivate the additional microphones and return to a single microphone configuration, or to a low power mode of that microphone, in response to voice not being detected in the audio signal by the voice activity detector.
- Turning now to the drawings,
FIG. 1 is a schematic block diagram of an apparatus 100 which is a voice recognition system in accordance with various embodiments. The apparatus 100 may be incorporated into and used in various battery-powered electronic devices that employ voice recognition. That is, the apparatus 100 may be used in any of various mobile devices such as, but not limited to, a mobile telephone, smart phone, camera, video camera, tablet, laptop, audio recorder or some other battery-powered electronic device, etc. - It is to be understood that the
FIG. 1 schematic block diagram is limited, for the purpose of clarity, to showing only those components useful to describe the features and advantages of the various embodiments, and to describe how to make and use the various embodiments to those of ordinary skill. It is therefore to be understood that various other components, circuitry, and devices etc. may be present in order to implement an apparatus and that those various other components, circuitry, devices, etc., are understood to be present by those of ordinary skill. For example, the apparatus may include inputs for receiving power from a power source and a power bus that may be connected to a battery housed within one of the various battery powered electronic devices such as mobile devices, etc. to provide power to the apparatus 100 or to distribute power to the various components of the apparatus 100. - Another example is that the
apparatus 100 may also include an internal communication bus, for providing operative coupling between the various components, circuitry, and devices. The terminology "operatively coupled" as used herein refers to coupling that enables operational and/or functional communication and relationships between the various components, circuitry, devices etc. described as being operatively coupled and may include any intervening items (i.e. buses, connectors, other components, circuitry, devices etc.) used to enable such communication such as, for example, internal communication buses such as data communication buses or any other intervening items that one of ordinary skill would understand to be present. Also, it is to be understood that other intervening items may be present between "operatively coupled" items even though such other intervening items are not necessary to the functional communication facilitated by the operative coupling. For example, a data communication bus may be present in various embodiments and may provide data to several items along a pathway along which two or more items are operatively coupled, etc. Such operative coupling is shown generally in FIG. 1 described herein. - In
FIG. 1 the apparatus 100 may include a group of microphones 110 that provide microphone outputs and that are operatively coupled to microphone configuration logic 120. Although the example of FIG. 1 shows three microphones, the embodiments are not limited to three microphones and any number of microphones may be used in the embodiments. It is to be understood that the group of microphones 110 are shown using a dotted line in FIG. 1 because the group of microphones 110 is not necessarily a part of the apparatus 100. In other words, the group of microphones 110 may be part of a mobile device or some other device into which the apparatus 100 is incorporated. In that case, the apparatus 100 is operatively coupled to the group of microphones 110, which are located within the mobile device, via a suitable communication bus or suitable connectors, etc., such that the group of microphones 110 are operatively coupled to the microphone configuration logic 120. - The
microphone configuration logic 120 may include various front end processing, such as, but not limited to, signal amplification, analog-to-digital conversion/digital audio sampling, echo cancellation, etc., which may be applied to the microphone M1, M2, M3 outputs prior to performing additional, less power efficient signal processing such as noise suppression. The microphone configuration logic 120 may also include switch logic operatively coupled to the group of microphones 110 and operative to respond to control signals to turn each of microphones M1, M2 or M3 on or off so as to save power consumption by not using the front end processing of the microphone configuration logic 120 for those microphones that are turned off. Additionally, in some embodiments, the microphone configuration logic 120 may be operative to receive control signals from other components of the apparatus 100 to adjust front end processing parameters such as, for example, amplifier gain.
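A minimal sketch of such per-microphone switching and gain adjustment is shown below; the class name, default microphone count and method names are assumptions made for illustration, not the patented interface.

```python
class MicrophoneConfig:
    """Per-microphone on/off switching plus an adjustable front-end gain."""

    def __init__(self, num_mics=3, gain_db=0.0):
        self.enabled = [False] * num_mics
        self.enabled[num_mics - 1] = True   # e.g. keep only one microphone on
        self.gain_db = gain_db

    def set_mic(self, index, on):
        # Turning a microphone off also implies skipping its front end
        # processing (amplification, A/D conversion, echo cancellation).
        self.enabled[index] = on

    def set_gain(self, gain_db):
        self.gain_db = gain_db

    def active_outputs(self, mic_frames):
        # mic_frames: one sample buffer per microphone for the current frame.
        return [f for f, on in zip(mic_frames, self.enabled) if on]
```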
- The microphone configuration logic 120 is operatively coupled to a history buffer 130, to provide the three microphone outputs M1, M2 and M3 to the history buffer 130. Microphone configuration logic 120 is also operatively coupled to an energy estimator 140 and provides a single microphone output M3 to the energy estimator 140. The energy estimator 140 is operatively coupled to the history buffer 130 and to a voice activity detector 150. The energy estimator 140 provides a control signal 115 to the history buffer 130, a control signal 117 to the voice activity detector 150 and a control signal 121 to the microphone configuration logic 120. - The
voice activity detector 150 is also operatively coupled to the microphone configuration logic 120 to receive the microphone M3 output and to provide a control signal 123 to microphone configuration logic 120. The voice activity detector 150 is further operatively coupled to a signal-to-noise ratio (SNR) estimator 160 and provides a control signal 119. The signal-to-noise ratio (SNR) estimator 160 is operatively coupled to the history buffer 130, a noise type classifier 170, a noise suppressor algorithms selector 180, and a switch 195. In the various embodiments, the various signal processing components such as the voice activity detector 150, SNR estimator 160, noise type classifier 170, noise suppressor algorithms selector 180 and noise suppressor 190 are kept in a deactivated state until needed and are progressively activated according to various decisions which may also be made progressively. Likewise, activated signal components are progressively deactivated when no longer needed in accordance with the embodiments. - The
SNR estimator 160 receives a buffered voice signal 113 from the history buffer 130 and provides control signal 127 to the switch 195, control signal 129 to noise type classifier 170, and control signal 135 to the noise suppressor algorithms selector 180. The noise type classifier 170 is operatively coupled to the history buffer 130, the SNR estimator 160 and the noise suppressor algorithms selector 180. - The
noise type classifier 170 receives a buffered voice signal 111 from the history buffer 130 and provides a control signal 131 to the noise suppressor algorithms selector 180. The noise suppressor algorithms selector 180 is operatively coupled to the SNR estimator 160, the noise type classifier 170, the microphone configuration logic 120, a noise suppressor 190 and system memory 107. The noise suppressor algorithms selector 180 provides a control signal 125 to the microphone configuration logic 120 and a control signal 137 to the noise suppressor 190. The noise suppressor algorithms selector 180 is also operatively coupled to system memory 107 by a read-write connection 139. - The
noise suppressor 190 receives the buffered voice signal 111 from the history buffer 130 and provides a noise suppressed voice signal 133 to the switch 195. The noise suppressor 190 may also be operatively coupled to system memory 107 by a read-write connection 143 in some embodiments. The switch 195 is operatively coupled to the noise suppressor 190 and to automatic gain control (AGC) 105, and provides voice signal 141 to the AGC 105. Voice command recognition logic 101 is operatively coupled to AGC 105 and to the system control 103, which may be any type of voice controllable system control depending on the mobile device such as, but not limited to, a voice controlled dialer of a mobile telephone, a video recorder system control, an application control of a mobile telephone, smartphone, tablet, laptop, etc., or any other type of voice controllable system control. The AGC 105 adjusts the voice signal 141 received from the switch 195 and provides a gain adjusted voice signal 145 to the voice command recognition logic 101. The voice command recognition logic 101 sends a control signal 147 to the system control 103 in response to detected command words or command phrases received on the voice signal 145. In some embodiments, a transceiver 197 may also be present and may be operatively coupled to receive either the gain adjusted voice signal 145 as shown, or to receive the voice signal 141. The transceiver 197 may be a wireless transceiver for wireless communication using any wireless technology and may utilize the received voice signal as an uplink (i.e. send) transmission portion of a wireless duplex communication channel in embodiments where the apparatus 100 is used in a mobile telephone or smartphone, etc. In alternative embodiments, either the gain adjusted voice signal 145 or the voice signal 141 may also be provided to a transceiver external to the apparatus 100 using appropriate connectivity between the apparatus 100 and a device into which the apparatus 100 is incorporated. In some embodiments, the transceiver 197 may be used to transmit voice commands to a remote voice command recognition system. - The
system memory 107 is a non-volatile, non-transitory memory, and may be accessible by other components of the apparatus 100 for various settings, stored applications, etc. In some embodiments, system memory 107 may store a database of noise suppression algorithms 109, which may be accessed by noise suppressor algorithms selector 180 over read-write connection 139. In some embodiments, the noise suppressor 190 may access system memory 107 over read-write connection 143 and may retrieve selected noise suppression algorithms from the database of noise suppression algorithms 109 for execution. - The
switch 195 is operative to respond to the control signal 127 from the SNR estimator 160, to switch its output voice signal 141 between the buffered voice signal 111 and the noise suppressor 190 noise suppressed voice signal 133. In other words, switch 195 operates as a changeover switch. The output voice signal 141 from switch 195 is provided to the AGC 105. - The disclosed embodiments employ
voice activity detector 150 to distinguish voice activity from noise and accordingly enable the voice command recognition logic 101 and noise reduction as needed to improve voice recognition performance. The embodiments also utilize a low power noise estimator, SNR estimator 160, to determine when to enable or disable noise reduction, thereby saving battery power. For example, under low noise conditions, the noise reduction can be disabled accordingly. Also, some microphones may be turned off during low noise conditions, which also conserves battery power. - Various actions may be triggered or invoked in the embodiments based on voice activity or other criteria that progressively ramp up the application of signal processing requiring increased power consumption. For example, the
voice activity detector 150 may trigger operation of noise suppressor 190 or may send control signal 123 to the microphone configuration logic 120 to increase front end processing gain, rather than invoke the noise suppressor 190, initially for low noise conditions. - For a high noise environment, dual-microphone noise reduction may be enabled. For low noise environments, a single microphone may be used, and the
energy estimator 140 may create a long term energy baseline from which rapid deviations will trigger the noise suppressor 190 and voice activity detector (VAD) 150 to analyze the voice signal and to decide when noise reduction should be applied. For example, an absolute ambient noise measurement may be used to decide if noise reduction should be applied and, if so, the type of noise reduction best suited for the condition. That is, because the noise suppressor algorithms selected will impact power consumption, selectively running or not running certain noise suppressor algorithms serves to minimize battery power consumption. - Thus, the
energy estimator 140 is operative to detect deviations from a baseline that may be an indicator of voice being present in a received audio signal, received, for example, from microphone M3. If such deviations are detected, the energy estimator 140 may send control signal 117 to activate VAD 150 to determine if voice is actually present in the received audio signal. - An example method of operation of the
apparatus 100 may be understood in view of the flowchart of FIG. 2. The method of operation begins in operation block 201, which represents a default state in which the microphone configuration logic 120 is controlled to use a single microphone configuration in order to conserve battery power. Any front end processing of the microphone configuration logic 120 for other microphones of the group of microphones 110 is therefore turned off. In operation block 203 the energy estimator 140 determines an energy baseline. The energy estimator 140 first calculates the signal level and long term power estimates, and short-term deviation from the long-term baseline. Short-term deviations exceeding a threshold invoke powering multiple microphones and buffering the signals.
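A minimal sketch of this baseline-and-deviation check follows, assuming exponential smoothing for the long-term and short-term estimates and a 6 dB threshold; both are assumptions, since the disclosure only calls for a long-term estimate and a thresholded short-term deviation.

```python
import math

class EnergyEstimator:
    def __init__(self, long_alpha=0.001, short_alpha=0.1, threshold_db=6.0):
        self.long_alpha = long_alpha      # slow tracker: long-term baseline
        self.short_alpha = short_alpha    # fast tracker: short-term level
        self.threshold_db = threshold_db
        self.long_term = None
        self.short_term = None

    def deviation_detected(self, frame):
        energy = sum(x * x for x in frame) / max(len(frame), 1)
        if self.long_term is None:
            self.long_term = self.short_term = energy
        self.long_term += self.long_alpha * (energy - self.long_term)
        self.short_term += self.short_alpha * (energy - self.short_term)
        deviation_db = 10.0 * math.log10(
            (self.short_term + 1e-12) / (self.long_term + 1e-12))
        # True: power additional microphones, start buffering, wake the VAD.
        return deviation_db > self.threshold_db
```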
- Specifically, in decision block 205, the energy estimator 140 monitors the audio output from one microphone such as microphone M3 and looks for changes in the audio signal energy level. If an observed short-term deviation exceeds the threshold in decision block 205, the energy estimator 140 sends control signal 121 to the microphone configuration logic 120 to turn on at least one additional microphone as shown in operation block 207. In operation block 213, the energy estimator 140 also sends control signal 115 to history buffer 130 to invoke buffering of audio signals from the activated microphones, since the buffered audio may need to have noise suppression applied in operation block 229. Also, in operation block 209, energy estimator 140 sends control signal 117 to VAD 150 to activate VAD 150 to determine if speech is present in the M3 audio signal. If the short-term deviation observed by the energy estimator 140 does not exceed the threshold in decision block 205, the energy estimator 140 continues to monitor the single microphone as in operation block 201. In other words, the energy estimator 140 operates to monitor an audio signal from at least one of the microphones while other signal processing components remain deactivated. A deactivated signal processing component is one that is powered down or placed in a low power mode such as a sleep state, where the signal processing component operates with either no power consumption or with reduced power consumption. The signal processing component is therefore activated when it is either powered up or is awakened from a low power mode such as a sleep state.
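The activation model described above (powered down, sleeping, or active) can be captured in a small wrapper; the enum values and method names below are illustrative assumptions rather than part of the disclosure.

```python
from enum import Enum

class PowerState(Enum):
    POWERED_DOWN = 0   # no power consumption
    SLEEP = 1          # reduced power consumption
    ACTIVE = 2

class SignalProcessingComponent:
    def __init__(self):
        self.state = PowerState.POWERED_DOWN

    def activate(self):
        # Power up, or wake from the low power sleep state.
        self.state = PowerState.ACTIVE

    def deactivate(self, sleep=True):
        self.state = PowerState.SLEEP if sleep else PowerState.POWERED_DOWN
```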
- In decision block 211, if the VAD 150 does not detect speech, the VAD 150 sends control signal 123 to the microphone configuration logic 120 and returns the system to a lower power state. For example, in operation block 231, the control signal 123 may turn off any additional microphones so that only a single microphone is used. If voice (i.e. speech activity) is detected in decision block 211, then VAD 150 sends control signal 119 to activate SNR estimator 160. In operation block 215, the SNR estimator 160 proceeds to estimate short-term signal-to-noise ratio and signal levels in order to determine if de-noising is needed.
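The disclosure does not prescribe a particular VAD algorithm; purely to illustrate the decision made in decision block 211, the sketch below uses a simple frame-energy plus zero-crossing-rate heuristic, with thresholds that are assumptions.

```python
def simple_vad(frame, energy_threshold=1e-4, zcr_range=(0.02, 0.35)):
    n = max(len(frame), 1)
    energy = sum(x * x for x in frame) / n
    zero_crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))
    zcr = zero_crossings / n
    # Speech-like frames tend to have moderate zero-crossing rates and
    # energy above the ambient floor.
    return energy > energy_threshold and zcr_range[0] <= zcr <= zcr_range[1]
```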
- If noise reduction is not needed in decision block 217, the SNR estimator 160 may send control signal 127 to the switch 195 to maintain the apparatus 100 in a low power state, i.e. bypassing and not using the noise suppressor 190. The apparatus 100 may also be returned to a single microphone mode of operation. For example, the noise suppressor algorithms selector 180 may send control signal 125 to the microphone configuration logic 120 to switch off any additional microphones. In operation block 219, the voice signal 141 is provided to the AGC 105 and is gained up to obtain the level required, and the gain adjusted voice signal 145 is sent to the voice command recognition logic 101. In operation block 221, the voice command recognition logic 101 operates on the voice signal 145 and, if command words or command phrases are detected, may send control signal 147 to the system control 103. The method of operation then ends. If noise reduction is determined to be necessary by the SNR estimator 160 in decision block 217, then the SNR estimator 160 sends control signal 129 to activate noise type classifier 170 as shown in operation block 223.
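The bypass-or-denoise decision of blocks 215 through 219 can be sketched as follows; the SNR estimate, the 15 dB threshold and the target level used for the gain-up step are illustrative assumptions.

```python
import math

def estimate_snr_db(frame, noise_power):
    signal_power = sum(x * x for x in frame) / max(len(frame), 1)
    return 10.0 * math.log10((signal_power + 1e-12) / (noise_power + 1e-12))

def condition_for_recognition(frame, noise_power, snr_needed_db=15.0,
                              target_rms=0.1):
    if estimate_snr_db(frame, noise_power) >= snr_needed_db:
        # Low-noise case: skip noise suppression and simply gain the buffered
        # signal up to the level required (the AGC-like step of block 219).
        rms = math.sqrt(sum(x * x for x in frame) / max(len(frame), 1)) or 1e-12
        gain = target_rms / rms
        return [x * gain for x in frame], False      # (audio, needs_suppression)
    return frame, True                               # hand off to the suppressor
```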
- In operation block 223, the noise type classifier 170 receives the buffered voice signal 111, and may also receive signal-to-noise ratio information from SNR estimator 160 via control signal 129. The noise type classifier 170 assigns a noise type and sends the noise type information by control signal 131 to noise suppressor algorithms selector 180. The noise suppressor algorithms selector 180 may also receive information from SNR estimator 160 via control signal 135. In operation block 225, the noise suppressor algorithms selector 180 proceeds to select an appropriate noise suppressor algorithm for the observed conditions (i.e. observed SNR and noise type). This may be accomplished, in some embodiments, by accessing system memory 107 over read-write connection 139. The system memory 107 may store the database of noise suppression algorithms 109 and any other useful information such as an associated memory table that can be used to compare observed SNR and noise types to select a suitable noise suppression algorithm. The noise suppressor algorithms selector 180 may then send control signal 137 to activate noise suppressor 190 and to provide a pointer to the location in system memory 107 of the selected noise suppression algorithm. In operation block 227, the noise suppressor algorithms selector 180 may also send control signal 125 to the microphone configuration logic to make any adjustments that might be needed in relation to the selected noise suppressor algorithm.
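One way to realize the memory table described above is a lookup keyed by noise type and SNR band, as in the sketch below; the noise types, SNR bands and algorithm names are made up for illustration and are not the contents of the actual database of noise suppression algorithms 109.

```python
# Hypothetical selection table: (noise_type, snr_band) -> algorithm name.
ALGORITHM_TABLE = {
    ("stationary", "low_snr"): "single_mic_spectral_subtraction",
    ("stationary", "mid_snr"): "single_mic_wiener",
    ("babble", "low_snr"): "dual_mic_beamforming",
    ("wind", "low_snr"): "dual_mic_wind_reduction",
}

def select_algorithm(noise_type, snr_db):
    band = "low_snr" if snr_db < 10.0 else "mid_snr"
    # Fall back to the cheapest (most power efficient) option when no entry
    # matches, consistent with minimizing power consumption.
    return ALGORITHM_TABLE.get((noise_type, band),
                               "single_mic_spectral_subtraction")
```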
- In operation block 229, the noise suppressor 190 may access system memory 107 and the database of noise suppression algorithms 109 over read-write connection 143 to access the selected noise suppression algorithm and execute it accordingly. The SNR estimator 160 will also send control signal 127 to switch 195 to switch over to the noise suppressed voice signal 133 output from noise suppressor 190, rather than the buffered voice signal 111. That is, the noise suppressor 190 receives the buffered voice signal 111, applies the selected noise suppression algorithm and provides the noise suppressed voice signal 133 to switch 195. The method of operation then again proceeds to operation block 219 where the voice signal 141 is provided to the AGC 105 and is gained up to obtain the level required, and the gain adjusted voice signal 145 is sent to the voice command recognition logic 101. In operation block 221, the voice command recognition logic 101 operates on the gain adjusted voice signal 145 and the method of operation ends as shown. The apparatus 100 may then return to single microphone operation and the method of operation beginning at operation block 201 may continue.
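The disclosure leaves the suppression technique itself to the selected algorithm; as one commonly used, relatively inexpensive example (an assumption, not necessarily what the database contains), magnitude spectral subtraction over the buffered signal looks roughly like this:

```python
import numpy as np

def spectral_subtraction(buffered_signal, noise_magnitude, floor=0.05):
    spectrum = np.fft.rfft(buffered_signal)
    magnitude = np.abs(spectrum)
    phase = np.angle(spectrum)
    # Subtract the estimated noise magnitude, keeping a small spectral floor
    # to limit musical-noise artifacts.
    cleaned = np.maximum(magnitude - noise_magnitude, floor * magnitude)
    return np.fft.irfft(cleaned * np.exp(1j * phase), n=len(buffered_signal))
```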
- Initially, in the embodiments, a noise suppressor algorithm is invoked based on an attempt to determine the type of noise present in the environment, that is, based on the noise type and the signal-to-noise ratio. As the noise conditions worsen, different noise suppression algorithms can be used, with progressively increased complexity and power consumption cost. As discussed above with respect to decision block 211, the system returns to a low power state after a negative VAD 150 decision or, in some embodiments, after some time-out period. - In another embodiment, the
apparatus 100 may run a continuous, single-microphone-powered, long-term noise estimator/classifier which can store a set of noise estimates to be used by the noise reduction system to help speed up convergence. In yet another embodiment, a continuously running VAD may be employed to look for speech activity. In both embodiments, the apparatus will remain in an elevated power state, returning from voice recognition invocation to VAD estimation.
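A sketch of the long-term noise estimate cache suggested by this embodiment follows; the per-noise-type smoothing and the seeding interface are assumptions intended only to show how stored estimates could speed up convergence.

```python
class LongTermNoiseCache:
    def __init__(self, alpha=0.01):
        self.alpha = alpha
        self.estimates = {}                  # noise_type -> smoothed noise power

    def update(self, noise_type, noise_power):
        prev = self.estimates.get(noise_type, noise_power)
        self.estimates[noise_type] = prev + self.alpha * (noise_power - prev)

    def seed_for(self, noise_type, default=1e-6):
        # Gives the noise reduction stage a starting estimate instead of
        # having it converge from scratch.
        return self.estimates.get(noise_type, default)
```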
- It is to be understood that the various components, circuitry, devices etc. described with respect to FIG. 1 including, but not limited to, those described using the term "logic," such as the microphone configuration logic 120, history buffer 130, energy estimator 140, VAD 150, SNR estimator 160, noise type classifier 170, noise suppressor algorithms selector 180, noise suppressor 190, switch 195, AGC 105, voice command recognition logic 101, or system control 103 may be implemented in various ways such as by software and/or firmware executing on one or more programmable processors such as a central processing unit (CPU) or the like, or by ASICs, DSPs, FPGAs, hardwired circuitry (logic circuitry), or any combinations thereof. - Also, it is to be understood that the various "control signals" described herein with respect to
FIG. 1 and the various aforementioned components, may be implemented in various ways such as using application programming interfaces (APIs) between the various components. Therefore, in some embodiments, components may be operatively coupled using APIs rather than a hardware communication bus if such components are implemented as software and/or firmware executing on one or more programmable processors. For example, the noise suppressor algorithms selector 180 and the noise suppressor 190 may be software and/or firmware executing on a single processor and may communicate and interact with each other using APIs. - Additionally, operations involving the
system memory 107 may be implemented using pointers, where components such as, but not limited to, the noise suppressor algorithms selector 180 or the noise suppressor 190 access the system memory 107 as directed by control signals, which may include pointers to memory locations or database access commands that access the database of noise suppression algorithms 109. In other words, such operations may be accomplished in the various embodiments using application programming interfaces (APIs). - Further methods of operation of various embodiments are illustrated by the flowcharts of
FIG. 3 and FIG. 4. FIG. 3 is a flow chart showing a method of operation related to voice signal detection in accordance with various embodiments. In operation block 301, an apparatus uses a microphone signal level as a measure to determine if pre-processing is needed. In operation block 303, the apparatus runs a detector for energy deviations from a long term baseline and invokes VAD/noise estimators to make decisions as to when voice recognition logic should operate. In operation block 305, the apparatus detects the need for signal conditioning based on a low-power noise estimator (i.e. by running the noise estimator only). In operation block 307, the apparatus uses a VAD to determine voice activity from noise and to determine whether or not to run noise suppression, or voice recognition, and runs one or the other only when needed. In operation block 309, the apparatus will classify the noise type, and based on noise type, will invoke appropriate noise suppression or other appropriate signal conditioning. -
FIG. 4 is a flow chart showing a method of operation related to selection of signal processing in accordance with various embodiments. In operation block 401, the apparatus determines which microphones are not needed (as well as any associated circuitry such as amplifiers, A/D converters etc.) and turns off the microphones (and any associated circuitry) accordingly. In operation block 403, the apparatus uses a single microphone for continuously running triggers/estimators. In operation block 405, the apparatus uses an ultra-low-power microphone for monitoring only (or uses a lower power mode for one of the microphones). In operation block 407, the apparatus stores data in a history buffer, and when triggered processes only data in the history buffer, rather than continuously. That is, the history buffer maintains an audio signal of interest while decisions are made as to whether voice is present in the audio signal and, subsequently, whether further signal processing components should be invoked such as noise suppression. If further signal processing components such as the noise suppressor are not required, the buffered audio signal may be sent directly to the voice command recognition logic 101. In operation block 409, the apparatus uses no noise suppression (in quiet conditions), single-microphone noise suppression (for example in favorable SNR and noise types), or multiple-microphone noise suppression as per conditions observed and only when needed. In operation block 411, the apparatus determines signal level and SNR dependency, and maximizes gain in high SNR conditions (i.e. if favorable conditions exist, it applies gain to boost the signal rather than de-noise the signal). In operation block 413, the apparatus uses voice recognition specially trained with a power-efficient noise-reduction pre-processing algorithm, and runs the power efficient noise reduction front end on the portable (i.e. a mobile device in which the apparatus is incorporated). In operation block 415, the apparatus uses long-term noise estimates to configure apparatus components such as voice recognition and signal conditioning components, and uses the short-term estimate to select optimal configurations and switch between those.
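The triggered history-buffer usage of operation block 407 can be sketched with a bounded buffer that is written continuously but read only when a trigger fires; the buffer length and method names are assumptions.

```python
from collections import deque

class HistoryBuffer:
    def __init__(self, max_frames=100):
        self.frames = deque(maxlen=max_frames)   # bounded: old audio falls out

    def append(self, frame):
        self.frames.append(frame)                # cheap, always-on write path

    def snapshot(self):
        # Called only when a trigger (energy deviation, VAD decision) fires,
        # so heavier processing runs on buffered data rather than continuously.
        return [sample for frame in self.frames for sample in frame]
```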
- The flowcharts of FIG. 5 and FIG. 6 provide methods of operation for the various embodiments described above. In FIG. 5, operation block 501, an audio signal energy level is monitored while having other signal processing components deactivated. In operation block 503, at least one of the other signal processing components is activated in response to a detected change in the audio signal energy level. For example, if the energy level changes, this may be an indication that a device operator is speaking and attempting to command the device. In response, a VAD may be activated as the at least one other signal processing component in some embodiments. If the VAD detects the presence of voice in the audio signal, further signal processing components, such as a noise suppressor, may be activated. In another embodiment, a noise estimator may be activated initially using the assumption that voice is present in the audio signal. - The flowchart of
FIG. 6 provides a method of operation where a VAD is activated in response to changes in the audio signal level as shown in operation block 601. Other signal processing components are deactivated initially. In operation block 603, if voice is detected by the VAD, other signal processing components are activated in order to analyze the audio signal and determine if noise suppression should be applied or not. Noise suppression is then either applied, or not applied, accordingly. In operation block 605, various audio signal processing components are either activated or deactivated as audio signal conditions change or when voice is no longer detected. For example, the apparatus may be returned from a multi-microphone configuration to a single, low-power microphone configuration and noise suppressors, etc. may be deactivated. - While various embodiments have been illustrated and described, it is to be understood that the invention is not so limited. Numerous modifications, changes, variations, substitutions and equivalents will occur to those skilled in the art without departing from the scope of the present invention as defined by the appended claims.
Claims (22)
1. A method comprising:
monitoring an audio signal energy level with a plurality of signal processing components deactivated; and
activating at least one signal processing component of the plurality of signal processing components in response to a detected change in the audio signal energy level.
2. The method of claim 1 , wherein activating at least one signal processing component comprises:
activating and running a voice activity detector on the audio signal in response to the detected change in the audio energy level, the voice activity detector being one of the plurality of signal processing components.
3. The method of claim 2 , further comprising:
activating and running a noise estimator in response to voice being detected in the audio signal by the voice activity detector.
4. The method of claim 3 , further comprising:
determining, by the noise estimator, that noise suppression is required for the audio signal; and
activating and running a noise suppressor on the audio signal in response to the noise estimator determination.
5. The method of claim 3 , further comprising:
activating and running a noise type classifier to determine a noise type based on information received from the noise estimator; and
selecting a noise suppressor algorithm based on the determined noise type.
6. The method of claim 3 , further comprising:
determining, by the noise estimator, that noise suppression is not required for the audio signal; and
performing voice recognition on the audio signal without activating a noise suppressor.
7. The method of claim 1 , further comprising:
activating at least one additional microphone to receive the audio signal in response to the detected change in the audio signal energy level.
8. The method of claim 7 , further comprising:
deactivating the at least one additional microphone and returning to a single microphone configuration in response to voice not being detected in the audio signal by a voice activity detector or a second detected change in the audio signal energy level.
9. The method of claim 1 , further comprising:
calculating, by an energy estimator, a long term energy baseline and a short term deviation wherein monitoring the audio signal energy level is performed by the energy estimator.
10. The method of claim 9 , further comprising:
buffering the audio signal in response to a detected short term deviation.
11. An apparatus comprising:
a noise suppressor;
a voice activity detector; and
an energy estimator, operatively coupled to the voice activity detector, the energy estimator operative to monitor an audio signal energy level with at least the noise suppressor and the voice activity detector deactivated and to activate at least the voice activity detector in response to a detected change in the audio signal energy level.
12. The apparatus of claim 11 , further comprising:
a noise estimator, operatively coupled to the voice activity detector, wherein the voice activity detector is operative to activate the noise estimator in response to voice being detected in the audio signal.
13. The apparatus of claim 12 , further comprising:
a buffer, operatively coupled to the energy estimator, the buffer operative to receive a buffer control signal from the energy estimator and to buffer the audio signal in response to the buffer control signal, the energy estimator operative to send the buffer control signal in response to the detected change in the audio signal energy level.
14. The apparatus of claim 13 , further comprising:
a switch, operatively coupled to the noise suppressor and to the noise estimator to receive a switch control signal, the switch operative to change over between a noise suppressed audio signal output from the noise suppressor and a buffered audio signal output from the buffer, according to the switch control signal; and
wherein the noise estimator is operative to send the switch control signal.
15. The apparatus of claim 14 , further comprising:
a noise suppressor algorithms selector, operatively coupled to the noise estimator and to the noise suppressor, the noise suppressor algorithms selector operative to activate and run the noise suppressor in response to a noise estimator control signal sent when the noise estimator determines that noise suppression is required.
16. The apparatus of claim 15 , further comprising:
a noise type classifier, operatively coupled to the noise estimator and to the noise suppressor algorithms selector, the noise type classifier operative to activate and run in response to a control signal from the noise estimator, and operative to determine noise type based on information received from the noise estimator.
17. The apparatus of claim 16 , wherein the noise suppressor algorithms selector is further operative to select a noise suppressor algorithm based on the noise type determined by the noise type classifier.
18. The apparatus of claim 14 , where the noise estimator is further operative to determine that noise suppression is not required and send the switch control signal to change over from the noise suppressed audio signal output from the noise suppressor to the buffered audio signal output from the buffer.
19. The apparatus of claim 11 , further comprising:
a plurality of microphones; and
microphone configuration logic operative to turn each microphone on or off; and
wherein the energy estimator is further operative to control the microphone configuration logic to turn on at least one additional microphone in response to the detected change in the audio signal energy level.
20. The apparatus of claim 19 , wherein the voice activity detector is operative to deactivate the at least one additional microphone and return to a single microphone configuration in response to voice not being detected in the audio signal by the voice activity detector.
21. The apparatus of claim 14 , further comprising:
voice command recognition logic, having an input operatively coupled to the switch.
22. The apparatus of claim 21 , further comprising:
a transceiver having an input operatively coupled to the switch.
Priority Applications (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US13/955,186 US20140278393A1 (en) | 2013-03-12 | 2013-07-31 | Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System |
PCT/US2014/014371 WO2014143438A1 (en) | 2013-03-12 | 2014-02-03 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
EP14705000.9A EP2973548B1 (en) | 2013-03-12 | 2014-02-03 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US15/977,397 US10909977B2 (en) | 2013-03-12 | 2018-05-11 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US17/143,472 US11735175B2 (en) | 2013-03-12 | 2021-01-07 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
Applications Claiming Priority (4)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US201361776793P | 2013-03-12 | 2013-03-12 | |
US201361798097P | 2013-03-15 | 2013-03-15 | |
US201361827797P | 2013-05-28 | 2013-05-28 | |
US13/955,186 US20140278393A1 (en) | 2013-03-12 | 2013-07-31 | Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System |
Related Child Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/977,397 Continuation US10909977B2 (en) | 2013-03-12 | 2018-05-11 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
Publications (1)
Publication Number | Publication Date |
---|---|
US20140278393A1 true US20140278393A1 (en) | 2014-09-18 |
Family
ID=51531813
Family Applications (3)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US13/955,186 Abandoned US20140278393A1 (en) | 2013-03-12 | 2013-07-31 | Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System |
US15/977,397 Active US10909977B2 (en) | 2013-03-12 | 2018-05-11 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US17/143,472 Active 2034-05-10 US11735175B2 (en) | 2013-03-12 | 2021-01-07 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
Family Applications After (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
US15/977,397 Active US10909977B2 (en) | 2013-03-12 | 2018-05-11 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US17/143,472 Active 2034-05-10 US11735175B2 (en) | 2013-03-12 | 2021-01-07 | Apparatus and method for power efficient signal conditioning for a voice recognition system |
Country Status (3)
Country | Link |
---|---|
US (3) | US20140278393A1 (en) |
EP (1) | EP2973548B1 (en) |
WO (1) | WO2014143438A1 (en) |
Cited By (27)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20150065199A1 (en) * | 2013-09-05 | 2015-03-05 | Saurin Shah | Mobile phone with variable energy consuming speech recognition module |
US20150127335A1 (en) * | 2013-11-07 | 2015-05-07 | Nvidia Corporation | Voice trigger |
US20150249881A1 (en) * | 2014-02-28 | 2015-09-03 | Texas Instruments Incorporated | Power control for multichannel signal processing circuit |
US20150341716A1 (en) * | 2014-05-21 | 2015-11-26 | Mark Desmarais | Battery charging adaptor for a wireless microphone |
US9678954B1 (en) * | 2015-10-29 | 2017-06-13 | Google Inc. | Techniques for providing lexicon data for translation of a single word speech input |
WO2017123814A1 (en) * | 2016-01-14 | 2017-07-20 | Knowles Electronics, Llc | Systems and methods for assisting automatic speech recognition |
US9734845B1 (en) * | 2015-06-26 | 2017-08-15 | Amazon Technologies, Inc. | Mitigating effects of electronic audio sources in expression detection |
US9769550B2 (en) | 2013-11-06 | 2017-09-19 | Nvidia Corporation | Efficient digital microphone receiver process and system |
US20170309275A1 (en) * | 2014-11-26 | 2017-10-26 | Panasonic Intellectual Property Corporation Of America | Method and apparatus for recognizing speech by lip reading |
US10048936B2 (en) * | 2015-08-31 | 2018-08-14 | Roku, Inc. | Audio command interface for a multimedia device |
US10204639B2 (en) * | 2014-07-28 | 2019-02-12 | Huawei Technologies Co., Ltd. | Method and device for processing sound signal for communications device |
WO2019050849A1 (en) | 2017-09-06 | 2019-03-14 | Realwear, Incorporated | Multi-mode noise cancellation for voice detection |
US10249323B2 (en) | 2017-05-31 | 2019-04-02 | Bose Corporation | Voice activity detection for communication headset |
US10311889B2 (en) | 2017-03-20 | 2019-06-04 | Bose Corporation | Audio signal processing for noise reduction |
US10366708B2 (en) | 2017-03-20 | 2019-07-30 | Bose Corporation | Systems and methods of detecting speech activity of headphone user |
US10424315B1 (en) | 2017-03-20 | 2019-09-24 | Bose Corporation | Audio signal processing for noise reduction |
US10438605B1 (en) | 2018-03-19 | 2019-10-08 | Bose Corporation | Echo control in binaural adaptive noise cancellation systems in headsets |
CN110473542A (en) * | 2019-09-06 | 2019-11-19 | 北京安云世纪科技有限公司 | Awakening method, device and the electronic equipment of phonetic order execution function |
US10499139B2 (en) | 2017-03-20 | 2019-12-03 | Bose Corporation | Audio signal processing for noise reduction |
JP2020034683A (en) * | 2018-08-29 | 2020-03-05 | 富士通株式会社 | Voice recognition device, voice recognition program and voice recognition method |
DE102015117380B4 (en) * | 2014-10-22 | 2020-04-09 | GM Global Technology Operations LLC (n. d. Gesetzen des Staates Delaware) | Selective noise cancellation during automatic speech recognition |
CN111739550A (en) * | 2019-03-25 | 2020-10-02 | 恩智浦有限公司 | Audio processing system for speech enhancement |
US10902853B2 (en) * | 2019-01-11 | 2021-01-26 | Wistron Corporation | Electronic device and voice command identification method thereof |
US20210125607A1 (en) * | 2013-03-12 | 2021-04-29 | Google Technology Holdings LLC | Apparatus and method for power efficient signal conditioning for a voice recognition system |
CN112885323A (en) * | 2021-02-22 | 2021-06-01 | 联想(北京)有限公司 | Audio information processing method and device and electronic equipment |
US11074910B2 (en) * | 2017-01-09 | 2021-07-27 | Samsung Electronics Co., Ltd. | Electronic device for recognizing speech |
US11355105B2 (en) * | 2018-12-27 | 2022-06-07 | Samsung Electronics Co., Ltd. | Home appliance and method for voice recognition thereof |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113452855B (en) * | 2021-06-03 | 2022-05-27 | 杭州网易智企科技有限公司 | Howling processing method, howling processing device, electronic equipment and storage medium |
Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4630304A (en) * | 1985-07-01 | 1986-12-16 | Motorola, Inc. | Automatic background noise estimator for a noise suppression system |
US20050060149A1 (en) * | 2003-09-17 | 2005-03-17 | Guduru Vijayakrishna Prasad | Method and apparatus to perform voice activity detection |
US20050165604A1 (en) * | 2002-06-12 | 2005-07-28 | Toshiyuki Hanazawa | Speech recognizing method and device thereof |
US6968064B1 (en) * | 2000-09-29 | 2005-11-22 | Forgent Networks, Inc. | Adaptive thresholds in acoustic echo canceller for use during double talk |
US20070021958A1 (en) * | 2005-07-22 | 2007-01-25 | Erik Visser | Robust separation of speech signals in a noisy environment |
US20080159560A1 (en) * | 2006-12-30 | 2008-07-03 | Motorola, Inc. | Method and Noise Suppression Circuit Incorporating a Plurality of Noise Suppression Techniques |
US20080201138A1 (en) * | 2004-07-22 | 2008-08-21 | Softmax, Inc. | Headset for Separation of Speech Signals in a Noisy Environment |
US20080249779A1 (en) * | 2003-06-30 | 2008-10-09 | Marcus Hennecke | Speech dialog system |
US20090198492A1 (en) * | 2008-01-31 | 2009-08-06 | Rod Rempel | Adaptive noise modeling speech recognition system |
US20090290718A1 (en) * | 2008-05-21 | 2009-11-26 | Philippe Kahn | Method and Apparatus for Adjusting Audio for a User Environment |
US20100191525A1 (en) * | 1999-04-13 | 2010-07-29 | Broadcom Corporation | Gateway With Voice |
US8345890B2 (en) * | 2006-01-05 | 2013-01-01 | Audience, Inc. | System and method for utilizing inter-microphone level differences for speech enhancement |
US8566086B2 (en) * | 2005-06-28 | 2013-10-22 | Qnx Software Systems Limited | System for adaptive enhancement of speech signals |
US8744844B2 (en) * | 2007-07-06 | 2014-06-03 | Audience, Inc. | System and method for adaptive intelligent noise suppression |
US8924204B2 (en) * | 2010-11-12 | 2014-12-30 | Broadcom Corporation | Method and apparatus for wind noise detection and suppression using multiple microphones |
US9094744B1 (en) * | 2012-09-14 | 2015-07-28 | Cirrus Logic, Inc. | Close talk detector for noise cancellation |
Family Cites Families (78)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4409435A (en) | 1980-10-03 | 1983-10-11 | Gen Engineering Co., Ltd. | Hearing aid suitable for use under noisy circumstance |
US8085959B2 (en) | 1994-07-08 | 2011-12-27 | Brigham Young University | Hearing compensation system incorporating signal processing techniques |
US6070140A (en) * | 1995-06-05 | 2000-05-30 | Tran; Bao Q. | Speech recognizer |
JP3674990B2 (en) * | 1995-08-21 | 2005-07-27 | セイコーエプソン株式会社 | Speech recognition dialogue apparatus and speech recognition dialogue processing method |
DE19533541C1 (en) | 1995-09-11 | 1997-03-27 | Daimler Benz Aerospace Ag | Method for the automatic control of one or more devices by voice commands or by voice dialog in real time and device for executing the method |
US5737695A (en) * | 1996-12-21 | 1998-04-07 | Telefonaktiebolaget Lm Ericsson | Method and apparatus for controlling the use of discontinuous transmission in a cellular telephone |
US6035408A (en) | 1998-01-06 | 2000-03-07 | Magnex Corp. | Portable computer with dual switchable processors for selectable power consumption |
US7124079B1 (en) | 1998-11-23 | 2006-10-17 | Telefonaktiebolaget Lm Ericsson (Publ) | Speech coding with comfort noise variability feature for increased fidelity |
US6757367B1 (en) * | 1999-09-20 | 2004-06-29 | Broadcom Corporation | Packet based network exchange with rate synchronization |
US6778959B1 (en) | 1999-10-21 | 2004-08-17 | Sony Corporation | System and method for speech verification using out-of-vocabulary models |
US8583427B2 (en) * | 1999-11-18 | 2013-11-12 | Broadcom Corporation | Voice and data exchange over a packet based network with voice detection |
US7263074B2 (en) * | 1999-12-09 | 2007-08-28 | Broadcom Corporation | Voice activity detection based on far-end and near-end statistics |
US7561700B1 (en) | 2000-05-11 | 2009-07-14 | Plantronics, Inc. | Auto-adjust noise canceling microphone with position sensor |
US7457750B2 (en) | 2000-10-13 | 2008-11-25 | At&T Corp. | Systems and methods for dynamic re-configurable speech recognition |
US7219058B1 (en) | 2000-10-13 | 2007-05-15 | At&T Corp. | System and method for processing speech recognition results |
US6876966B1 (en) | 2000-10-16 | 2005-04-05 | Microsoft Corporation | Pattern recognition training method and apparatus using inserted noise followed by noise reduction |
US6915262B2 (en) | 2000-11-30 | 2005-07-05 | Telesector Resources Group, Inc. | Methods and apparatus for performing speech recognition and using speech recognition results |
JP2002334097A (en) | 2001-02-09 | 2002-11-22 | Seiko Epson Corp | Service providing system, management terminal, moving body, service providing program, and method for providing service |
US6820054B2 (en) * | 2001-05-07 | 2004-11-16 | Intel Corporation | Audio signal processing for speech communication |
US6959276B2 (en) | 2001-09-27 | 2005-10-25 | Microsoft Corporation | Including the category of environmental noise when processing speech signals |
US6950796B2 (en) | 2001-11-05 | 2005-09-27 | Motorola, Inc. | Speech recognition by dynamical noise model adaptation |
JP4195267B2 (en) | 2002-03-14 | 2008-12-10 | インターナショナル・ビジネス・マシーンズ・コーポレーション | Speech recognition apparatus, speech recognition method and program thereof |
US7224981B2 (en) | 2002-06-20 | 2007-05-29 | Intel Corporation | Speech recognition of mobile devices |
US6805633B2 (en) | 2002-08-07 | 2004-10-19 | Bally Gaming, Inc. | Gaming machine with automatic sound level adjustment and method therefor |
US7283956B2 (en) | 2002-09-18 | 2007-10-16 | Motorola, Inc. | Noise suppression |
JP4109063B2 (en) | 2002-09-18 | 2008-06-25 | パイオニア株式会社 | Speech recognition apparatus and speech recognition method |
JP4352790B2 (en) | 2002-10-31 | 2009-10-28 | セイコーエプソン株式会社 | Acoustic model creation method, speech recognition device, and vehicle having speech recognition device |
US7457745B2 (en) | 2002-12-03 | 2008-11-25 | Hrl Laboratories, Llc | Method and apparatus for fast on-line automatic speaker/environment adaptation for speech/speaker recognition in the presence of changing environments |
EP1443498B1 (en) * | 2003-01-24 | 2008-03-19 | Sony Ericsson Mobile Communications AB | Noise reduction and audio-visual speech activity detection |
US7392188B2 (en) | 2003-07-31 | 2008-06-24 | Telefonaktiebolaget Lm Ericsson (Publ) | System and method enabling acoustic barge-in |
JP4548646B2 (en) | 2003-09-12 | 2010-09-22 | 株式会社エヌ・ティ・ティ・ドコモ | Noise model noise adaptation system, noise adaptation method, and speech recognition noise adaptation program |
US7634095B2 (en) | 2004-02-23 | 2009-12-15 | General Motors Company | Dynamic tuning of hands-free algorithm for noise and driving conditions |
US7454332B2 (en) | 2004-06-15 | 2008-11-18 | Microsoft Corporation | Gain constrained noise suppression |
US8027833B2 (en) | 2005-05-09 | 2011-09-27 | Qnx Software Systems Co. | System for suppressing passing tire hiss |
US20060262938A1 (en) | 2005-05-18 | 2006-11-23 | Gauger Daniel M Jr | Adapted audio response |
CN101517550B (en) | 2005-11-29 | 2013-01-02 | 谷歌公司 | Social and interactive applications for mass media |
US9185487B2 (en) | 2006-01-30 | 2015-11-10 | Audience, Inc. | System and method for providing noise suppression utilizing null processing noise subtraction |
FR2898209B1 (en) | 2006-03-01 | 2008-12-12 | Parrot Sa | METHOD FOR DEBRUCTING AN AUDIO SIGNAL |
KR101313170B1 (en) * | 2006-09-12 | 2013-09-30 | 삼성전자주식회사 | Terminal for removing noise of phone call and method thereof |
US8041568B2 (en) | 2006-10-13 | 2011-10-18 | Google Inc. | Business listing search |
US7890326B2 (en) | 2006-10-13 | 2011-02-15 | Google Inc. | Business listing search |
US8275611B2 (en) | 2007-01-18 | 2012-09-25 | Stmicroelectronics Asia Pacific Pte., Ltd. | Adaptive noise suppression for digital speech signals |
US7941189B2 (en) | 2007-02-07 | 2011-05-10 | Denso Corporation | Communicating road noise control system, in-vehicle road noise controller, and server |
GB2441835B (en) | 2007-02-07 | 2008-08-20 | Sonaptic Ltd | Ambient noise reduction system |
US20090030687A1 (en) | 2007-03-07 | 2009-01-29 | Cerra Joseph P | Adapting an unstructured language model speech recognition system based on usage |
US8121837B2 (en) | 2008-04-24 | 2012-02-21 | Nuance Communications, Inc. | Adjusting a speech engine for a mobile computing device based on background noise |
US8024596B2 (en) | 2008-04-29 | 2011-09-20 | Bose Corporation | Personal wireless network power-based task distribution |
US8085941B2 (en) | 2008-05-02 | 2011-12-27 | Dolby Laboratories Licensing Corporation | System and method for dynamic sound delivery |
CN102077274B (en) * | 2008-06-30 | 2013-08-21 | 杜比实验室特许公司 | Multi-microphone voice activity detector |
US8285208B2 (en) * | 2008-07-25 | 2012-10-09 | Apple Inc. | Systems and methods for noise cancellation and power management in a wireless headset |
EP2164066B1 (en) | 2008-09-15 | 2016-03-09 | Oticon A/S | Noise spectrum tracking in noisy acoustical signals |
CN102239705B (en) | 2008-12-05 | 2015-02-25 | 应美盛股份有限公司 | Wind noise detection method and system |
US8688445B2 (en) | 2008-12-10 | 2014-04-01 | Adobe Systems Incorporated | Multi-core processing for parallel speech-to-text processing |
US8416964B2 (en) | 2008-12-15 | 2013-04-09 | Gentex Corporation | Vehicular automatic gain control (AGC) microphone system and method for post processing optimization of a microphone signal |
JP5127754B2 (en) | 2009-03-24 | 2013-01-23 | 株式会社東芝 | Signal processing device |
US8477973B2 (en) * | 2009-04-01 | 2013-07-02 | Starkey Laboratories, Inc. | Hearing assistance system with own voice detection |
JP5651923B2 (en) | 2009-04-07 | 2015-01-14 | ソニー株式会社 | Signal processing apparatus and signal processing method |
JP4809454B2 (en) * | 2009-05-17 | 2011-11-09 | 株式会社半導体理工学研究センター | Circuit activation method and circuit activation apparatus by speech estimation |
US8391524B2 (en) | 2009-06-02 | 2013-03-05 | Panasonic Corporation | Hearing aid, hearing aid system, walking detection method, and hearing aid method |
US8571231B2 (en) | 2009-10-01 | 2013-10-29 | Qualcomm Incorporated | Suppressing noise in an audio signal |
CN102044243B (en) * | 2009-10-15 | 2012-08-29 | 华为技术有限公司 | Method and device for voice activity detection (VAD) and encoder |
US8589163B2 (en) | 2009-12-04 | 2013-11-19 | At&T Intellectual Property I, L.P. | Adapting language models with a bit mask for a subset of related words |
US8265928B2 (en) | 2010-04-14 | 2012-09-11 | Google Inc. | Geotagged environmental audio for enhanced speech recognition accuracy |
US8539318B2 (en) | 2010-06-04 | 2013-09-17 | École Polytechnique Fédérale De Lausanne (Epfl) | Power and pin efficient chip-to-chip communications with common-mode rejection and SSO resilience |
US8468012B2 (en) | 2010-05-26 | 2013-06-18 | Google Inc. | Acoustic model adaptation using geographic information |
US8423357B2 (en) | 2010-06-18 | 2013-04-16 | Alon Konchitsky | System and method for biometric acoustic noise reduction |
US9711162B2 (en) | 2011-07-05 | 2017-07-18 | Texas Instruments Incorporated | Method and apparatus for environmental noise compensation by determining a presence or an absence of an audio event |
US8903722B2 (en) | 2011-08-29 | 2014-12-02 | Intel Mobile Communications GmbH | Noise reduction for dual-microphone communication devices |
US9368096B2 (en) | 2011-12-20 | 2016-06-14 | Texas Instruments Incorporated | Method and system for active noise cancellation according to a type of noise |
US9070374B2 (en) | 2012-02-20 | 2015-06-30 | JVC Kenwood Corporation | Communication apparatus and condition notification method for notifying a used condition of communication apparatus by using a light-emitting device attached to communication apparatus |
US9183845B1 (en) | 2012-06-12 | 2015-11-10 | Amazon Technologies, Inc. | Adjusting audio signals based on a specific frequency range associated with environmental noise characteristics |
US9703378B2 (en) | 2012-06-13 | 2017-07-11 | Immersion Corporation | Method and apparatus for representing user interface metaphors as physical changes on a shape-changing device |
US20140003635A1 (en) | 2012-07-02 | 2014-01-02 | Qualcomm Incorporated | Audio signal processing device calibration |
US9043210B1 (en) * | 2012-10-02 | 2015-05-26 | Voice Security Systems, Inc. | Biometric voice command and control switching device and method of use |
US9704486B2 (en) * | 2012-12-11 | 2017-07-11 | Amazon Technologies, Inc. | Speech recognition power management |
US20140278393A1 (en) * | 2013-03-12 | 2014-09-18 | Motorola Mobility Llc | Apparatus and Method for Power Efficient Signal Conditioning for a Voice Recognition System |
WO2017029550A1 (en) * | 2015-08-20 | 2017-02-23 | Cirrus Logic International Semiconductor Ltd | Feedback adaptive noise cancellation (anc) controller and method having a feedback response partially provided by a fixed-response filter |
US10089989B2 (en) * | 2015-12-07 | 2018-10-02 | Semiconductor Components Industries, Llc | Method and apparatus for a low power voice trigger device |
-
2013
- 2013-07-31 US US13/955,186 patent/US20140278393A1/en not_active Abandoned
-
2014
- 2014-02-03 EP EP14705000.9A patent/EP2973548B1/en active Active
- 2014-02-03 WO PCT/US2014/014371 patent/WO2014143438A1/en active Application Filing
-
2018
- 2018-05-11 US US15/977,397 patent/US10909977B2/en active Active
-
2021
- 2021-01-07 US US17/143,472 patent/US11735175B2/en active Active
Patent Citations (16)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US4630304A (en) * | 1985-07-01 | 1986-12-16 | Motorola, Inc. | Automatic background noise estimator for a noise suppression system |
US20100191525A1 (en) * | 1999-04-13 | 2010-07-29 | Broadcom Corporation | Gateway With Voice |
US6968064B1 (en) * | 2000-09-29 | 2005-11-22 | Forgent Networks, Inc. | Adaptive thresholds in acoustic echo canceller for use during double talk |
US20050165604A1 (en) * | 2002-06-12 | 2005-07-28 | Toshiyuki Hanazawa | Speech recognizing method and device thereof |
US20080249779A1 (en) * | 2003-06-30 | 2008-10-09 | Marcus Hennecke | Speech dialog system |
US20050060149A1 (en) * | 2003-09-17 | 2005-03-17 | Guduru Vijayakrishna Prasad | Method and apparatus to perform voice activity detection |
US20080201138A1 (en) * | 2004-07-22 | 2008-08-21 | Softmax, Inc. | Headset for Separation of Speech Signals in a Noisy Environment |
US8566086B2 (en) * | 2005-06-28 | 2013-10-22 | Qnx Software Systems Limited | System for adaptive enhancement of speech signals |
US20070021958A1 (en) * | 2005-07-22 | 2007-01-25 | Erik Visser | Robust separation of speech signals in a noisy environment |
US8345890B2 (en) * | 2006-01-05 | 2013-01-01 | Audience, Inc. | System and method for utilizing inter-microphone level differences for speech enhancement |
US20080159560A1 (en) * | 2006-12-30 | 2008-07-03 | Motorola, Inc. | Method and Noise Suppression Circuit Incorporating a Plurality of Noise Suppression Techniques |
US8744844B2 (en) * | 2007-07-06 | 2014-06-03 | Audience, Inc. | System and method for adaptive intelligent noise suppression |
US20090198492A1 (en) * | 2008-01-31 | 2009-08-06 | Rod Rempel | Adaptive noise modeling speech recognition system |
US20090290718A1 (en) * | 2008-05-21 | 2009-11-26 | Philippe Kahn | Method and Apparatus for Adjusting Audio for a User Environment |
US8924204B2 (en) * | 2010-11-12 | 2014-12-30 | Broadcom Corporation | Method and apparatus for wind noise detection and suppression using multiple microphones |
US9094744B1 (en) * | 2012-09-14 | 2015-07-28 | Cirrus Logic, Inc. | Close talk detector for noise cancellation |
Cited By (41)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20210125607A1 (en) * | 2013-03-12 | 2021-04-29 | Google Technology Holdings LLC | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US11735175B2 (en) * | 2013-03-12 | 2023-08-22 | Google Llc | Apparatus and method for power efficient signal conditioning for a voice recognition system |
US9251806B2 (en) * | 2013-09-05 | 2016-02-02 | Intel Corporation | Mobile phone with variable energy consuming speech recognition module |
US20150065199A1 (en) * | 2013-09-05 | 2015-03-05 | Saurin Shah | Mobile phone with variable energy consuming speech recognition module |
US9769550B2 (en) | 2013-11-06 | 2017-09-19 | Nvidia Corporation | Efficient digital microphone receiver process and system |
US9454975B2 (en) * | 2013-11-07 | 2016-09-27 | Nvidia Corporation | Voice trigger |
US20150127335A1 (en) * | 2013-11-07 | 2015-05-07 | Nvidia Corporation | Voice trigger |
US20150249881A1 (en) * | 2014-02-28 | 2015-09-03 | Texas Instruments Incorporated | Power control for multichannel signal processing circuit |
US9774949B2 (en) * | 2014-02-28 | 2017-09-26 | Texas Instruments Incorporated | Power control for multichannel signal processing circuit |
US20150341716A1 (en) * | 2014-05-21 | 2015-11-26 | Mark Desmarais | Battery charging adaptor for a wireless microphone |
US10038948B2 (en) * | 2014-05-21 | 2018-07-31 | Revo Labs | Battery charging adaptor for a wireless microphone |
US10204639B2 (en) * | 2014-07-28 | 2019-02-12 | Huawei Technologies Co., Ltd. | Method and device for processing sound signal for communications device |
DE102015117380B4 (en) * | 2014-10-22 | 2020-04-09 | GM Global Technology Operations LLC (under the laws of the State of Delaware) | Selective noise cancellation during automatic speech recognition |
US9997159B2 (en) * | 2014-11-26 | 2018-06-12 | Panasonic Intellectual Property Corporation Of America | Method and apparatus for recognizing speech by lip reading |
US20170309275A1 (en) * | 2014-11-26 | 2017-10-26 | Panasonic Intellectual Property Corporation Of America | Method and apparatus for recognizing speech by lip reading |
US9734845B1 (en) * | 2015-06-26 | 2017-08-15 | Amazon Technologies, Inc. | Mitigating effects of electronic audio sources in expression detection |
US10871942B2 (en) | 2015-08-31 | 2020-12-22 | Roku, Inc. | Audio command interface for a multimedia device |
US10048936B2 (en) * | 2015-08-31 | 2018-08-14 | Roku, Inc. | Audio command interface for a multimedia device |
US12112096B2 (en) | 2015-08-31 | 2024-10-08 | Roku, Inc. | Audio command interface for a multimedia device |
US9678954B1 (en) * | 2015-10-29 | 2017-06-13 | Google Inc. | Techniques for providing lexicon data for translation of a single word speech input |
WO2017123814A1 (en) * | 2016-01-14 | 2017-07-20 | Knowles Electronics, Llc | Systems and methods for assisting automatic speech recognition |
US11074910B2 (en) * | 2017-01-09 | 2021-07-27 | Samsung Electronics Co., Ltd. | Electronic device for recognizing speech |
US10499139B2 (en) | 2017-03-20 | 2019-12-03 | Bose Corporation | Audio signal processing for noise reduction |
US10762915B2 (en) | 2017-03-20 | 2020-09-01 | Bose Corporation | Systems and methods of detecting speech activity of headphone user |
US10311889B2 (en) | 2017-03-20 | 2019-06-04 | Bose Corporation | Audio signal processing for noise reduction |
US10366708B2 (en) | 2017-03-20 | 2019-07-30 | Bose Corporation | Systems and methods of detecting speech activity of headphone user |
US10424315B1 (en) | 2017-03-20 | 2019-09-24 | Bose Corporation | Audio signal processing for noise reduction |
US10249323B2 (en) | 2017-05-31 | 2019-04-02 | Bose Corporation | Voice activity detection for communication headset |
US10706868B2 (en) | 2017-09-06 | 2020-07-07 | Realwear, Inc. | Multi-mode noise cancellation for voice detection |
EP3679573A4 (en) * | 2017-09-06 | 2021-05-12 | Realwear, Incorporated | Multi-mode noise cancellation for voice detection |
CN111095405A (en) * | 2017-09-06 | 2020-05-01 | 瑞欧威尔公司 | Multi-mode noise cancellation for voice detection |
WO2019050849A1 (en) | 2017-09-06 | 2019-03-14 | Realwear, Incorporated | Multi-mode noise cancellation for voice detection |
US10438605B1 (en) | 2018-03-19 | 2019-10-08 | Bose Corporation | Echo control in binaural adaptive noise cancellation systems in headsets |
US11183180B2 (en) * | 2018-08-29 | 2021-11-23 | Fujitsu Limited | Speech recognition apparatus, speech recognition method, and a recording medium performing a suppression process for categories of noise |
JP7167554B2 (en) | 2018-08-29 | 2022-11-09 | 富士通株式会社 | Speech recognition device, speech recognition program and speech recognition method |
JP2020034683A (en) * | 2018-08-29 | 2020-03-05 | 富士通株式会社 | Voice recognition device, voice recognition program and voice recognition method |
US11355105B2 (en) * | 2018-12-27 | 2022-06-07 | Samsung Electronics Co., Ltd. | Home appliance and method for voice recognition thereof |
US10902853B2 (en) * | 2019-01-11 | 2021-01-26 | Wistron Corporation | Electronic device and voice command identification method thereof |
CN111739550A (en) * | 2019-03-25 | 2020-10-02 | 恩智浦有限公司 | Audio processing system for speech enhancement |
CN110473542A (en) * | 2019-09-06 | 2019-11-19 | 北京安云世纪科技有限公司 | Wake-up method and apparatus for executing a function by voice command, and electronic device |
CN112885323A (en) * | 2021-02-22 | 2021-06-01 | 联想(北京)有限公司 | Audio information processing method and device and electronic equipment |
Also Published As
Publication number | Publication date |
---|---|
US20210125607A1 (en) | 2021-04-29 |
US10909977B2 (en) | 2021-02-02 |
WO2014143438A1 (en) | 2014-09-18 |
US20180268811A1 (en) | 2018-09-20 |
US11735175B2 (en) | 2023-08-22 |
EP2973548B1 (en) | 2017-01-25 |
EP2973548A1 (en) | 2016-01-20 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11735175B2 (en) | Apparatus and method for power efficient signal conditioning for a voice recognition system | |
US10229697B2 (en) | Apparatus and method for beamforming to obtain voice and noise signals | |
US10332524B2 (en) | Speech recognition wake-up of a handheld portable electronic device | |
US9703350B2 (en) | Always-on low-power keyword spotting | |
US9406313B2 (en) | Adaptive microphone sampling rate techniques | |
US20160134966A1 (en) | Reduced microphone power-up latency | |
US10955898B2 (en) | Electronic device with a wake up module distinct from a core domain | |
US11437021B2 (en) | Processing audio signals | |
KR102492727B1 (en) | Electronic apparatus and the control method thereof | |
US20180174574A1 (en) | Methods and systems for reducing false alarms in keyword detection | |
US10332543B1 (en) | Systems and methods for capturing noise for pattern recognition processing | |
CN115699173A (en) | Voice activity detection method and device | |
US20230122089A1 (en) | Enhanced noise reduction in a voice activated device | |
US12057138B2 (en) | Cascade audio spotting system | |
WO2021248350A1 (en) | Audio gain selection | |
WO2017119901A1 (en) | System and method for speech detection adaptation | |
CN116416977A (en) | Sensitivity mode for an audio localization system |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
AS | Assignment | Owner name: MOTOROLA MOBILITY LLC, ILLINOIS; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNORS:IVANOV, PLAMEN A;BASTYR, KEVIN J;CLARK, JOEL A;AND OTHERS;SIGNING DATES FROM 20130821 TO 20130903;REEL/FRAME:031134/0065 |
AS | Assignment | Owner name: GOOGLE TECHNOLOGY HOLDINGS LLC, CALIFORNIA; Free format text: ASSIGNMENT OF ASSIGNORS INTEREST;ASSIGNOR:MOTOROLA MOBILITY LLC;REEL/FRAME:037428/0876; Effective date: 20151029 |
STCB | Information on status: application discontinuation | Free format text: ABANDONED -- AFTER EXAMINER'S ANSWER OR BOARD OF APPEALS DECISION |