CN106887230A - Voiceprint recognition method based on feature space - Google Patents
- Publication number
- CN106887230A CN106887230A CN201510947369.8A CN201510947369A CN106887230A CN 106887230 A CN106887230 A CN 106887230A CN 201510947369 A CN201510947369 A CN 201510947369A CN 106887230 A CN106887230 A CN 106887230A
- Authority
- CN
- China
- Prior art keywords
- sequence number
- frequency range
- data group
- sequence
- identification feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L9/00—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
- H04L9/32—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
- H04L9/3226—Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
- H04L9/3231—Biological data, e.g. fingerprint, voice or retina
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Computer Security & Cryptography (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Biodiversity & Conservation Biology (AREA)
- Life Sciences & Earth Sciences (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Machine Translation (AREA)
Abstract
The invention discloses a voiceprint recognition method based on feature space, belonging to the technical field of biometric identification. The method comprises: presetting a first frequency band (high band) and a second frequency band (low band), and then performing the following operations on each band separately: dividing speech into multiple recognition segments; applying a feature transform to each segment to obtain identification feature points, which together form an identification feature space; partitioning the identification feature space into several subspaces; applying the feature transform to training sentences to obtain temporal feature points, assigning each point to a subspace, and forming a first sequence from the subspace sequence numbers, which in turn yields a training identification feature. A test identification feature is obtained from test sentences in the same way. Finally, the test identification feature is compared with the training identification feature, and the voiceprint recognition result is derived from the comparison. The beneficial effect of this technical solution is that the computational load of voiceprint recognition is small, saving storage and computing resources.
Description
Technical field
The present invention relates to the technical field of biometric identification, and more particularly to a voiceprint recognition method based on feature space.
Background technology
Voiceprint recognition, like fingerprint, iris, and face recognition, is a form of biometric identification, and is considered the most natural biometric identity authentication method. Voiceprint recognition makes it easy to verify a speaker's identity, and this verification method offers strong privacy, because a voiceprint is usually difficult to copy fraudulently or to steal. Voiceprint recognition therefore has outstanding application advantages in many fields, especially in smart devices.
The basic pipeline of voiceprint recognition is speech acquisition, feature extraction, and classification modeling. A common feature extraction method exploits the short-term stationarity of speech and converts speech into an identification feature set with the Mel-cepstrum transform; a classification model of the speaker is then built from the speaker's speech through a learning process, and the recognition result is obtained from the various identification models. However, this process has several problems: (1) such a recognition model needs many training samples before it can be applied; (2) the computation performed with such a model is complex; (3) the amount of model data produced is large. In summary, for resource-constrained intelligent systems, these problems limit the application of prior-art voiceprint recognition algorithms.
Summary of the invention
In view of the above problems in the prior art, a technical solution for a voiceprint recognition method is now provided, specifically comprising:
A voiceprint recognition method based on feature space, wherein a first frequency band and a second frequency band are preset, the first frequency band being higher than the second frequency band, the method further comprising:
Step S1, within the first frequency band or the second frequency band respectively, dividing speech recorded under different backgrounds and from different voices into recognition segments of a specific length;
Step S2, applying a feature transform to each recognition segment to obtain corresponding identification feature points, and using all the identification feature points associated with all the recognition segments to form the identification feature space of the first frequency band, or the identification feature space of the second frequency band;
Step S3, partitioning the identification feature space into a plurality of subspaces, describing each partitioned subspace with description information, and assigning each subspace a corresponding sequence number;
Step S4, within the first frequency band or the second frequency band respectively, applying the feature transform to every training sentence associated with the training model to obtain a temporal feature point set containing the corresponding temporal feature points; assigning each temporal feature point to one of the subspaces under the same frequency band; forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a first sequence associated with the first frequency band or the second frequency band; and thereby forming the corresponding training identification feature;
Step S5, within the first frequency band or the second frequency band respectively, applying the feature transform to every test sentence associated with the test model to obtain a temporal feature point set; assigning each temporal feature point to one of the subspaces; forming, from the sequence numbers of the subspaces corresponding to the temporal feature points, a second sequence associated with the first frequency band or the second frequency band; and thereby forming the corresponding test identification feature;
Step S6, comparing whether the training identification feature associated with the first frequency band is similar to the test identification feature, and deriving a confirmation result of voiceprint recognition from the comparison; or comparing whether the training identification feature associated with the second frequency band is similar to the test identification feature, and deriving a confirmation result of voiceprint recognition from the comparison.
Preferably, in the voiceprint recognition method, in step S4, each temporal feature point is assigned to a subspace according to the nearest-neighbor rule.
Preferably, in the voiceprint recognition method, in step S4, the subspaces to which the temporal feature points are assigned form a spatial sequence according to their sequence numbers, and this spatial sequence serves as the first sequence, forming the training identification feature.
Preferably, in the voiceprint recognition method, in step S5, the subspaces to which the temporal feature points are assigned form a spatial sequence according to their sequence numbers, and this spatial sequence serves as the second sequence, forming the test identification feature.
Preferably, in the voiceprint recognition method, in step S4, the spatial sequence comprises data groups associated with the subspaces, one data group corresponding to one sequence number;
after the spatial sequence is formed, the method further comprises a first data compression performed on the spatial sequence within the first frequency band or the second frequency band respectively, specifically:
Step S41, recording the sequence number of each data group, and recording the repetition count associated with each sequence number;
Step S42, judging whether any sequence number has a repetition count of 1, and turning to step S43 when a data group with a repetition count of 1 exists;
Step S43, deleting the data group whose sequence number has a repetition count of 1;
Step S44, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding data group and the following data group;
if different, retaining both the preceding data group and the following data group;
all the data groups in the spatial sequence form the first sequence after the first data compression has been performed.
Preferably, in the voiceprint recognition method, in step S5, the spatial sequence comprises data groups associated with the subspaces, one data group corresponding to one sequence number;
after the spatial sequence is formed, the method further comprises a second data compression performed on the spatial sequence within the first frequency band or the second frequency band respectively, specifically:
Step S51, recording the sequence number of each data group, and recording the repetition count associated with each sequence number;
Step S52, judging whether any sequence number has a repetition count of 1, and turning to step S53 when a data group with a repetition count of 1 exists;
Step S53, deleting the data group whose sequence number has a repetition count of 1;
Step S54, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding data group and the following data group;
if different, retaining both the preceding data group and the following data group;
all the data groups in the spatial sequence form the second sequence after the second data compression has been performed.
Preferably, in the voiceprint recognition method, the feature transform is the Mel-cepstrum transform.
Preferably, in the voiceprint recognition method, during the Mel-cepstrum transform, each sentence is divided into 20 ms frames with a 10 ms frame shift to obtain the sentence frames associated with that sentence; silence is then removed frame by frame, 12 coefficients are kept per frame after the Mel-cepstrum transform of the sentence frames, and these 12 coefficients constitute an identification feature point.
Preferably, in the voiceprint recognition method, in step S3, the identification feature space is divided into several subspaces with the K-means algorithm, and for each subspace after division the K-means cluster center is recorded as the description information of that subspace.
The beneficial effect of the above technical solution is that it provides a voiceprint recognition method based on feature space in which the computational load of voiceprint recognition is small, storage and computing resources are saved, and the problems of modeling methods based on probability statistics are overcome, making it suitable for intelligent systems with limited system resources. In addition, a first frequency band representing child speakers and a second frequency band representing adult speakers are preset and compared separately, further improving the accuracy of voiceprint recognition.
Brief description of the drawings
Fig. 1 is an overall flowchart of a voiceprint recognition method based on feature space in a preferred embodiment of the invention;
Fig. 2 is a schematic flowchart of the first data compression in a preferred embodiment of the invention;
Fig. 3 is a schematic flowchart of the second data compression in a preferred embodiment of the invention.
Specific embodiment
The technical solutions in the embodiments of the present invention are described below clearly and completely with reference to the accompanying drawings. Obviously, the described embodiments are only some, and not all, of the embodiments of the invention. All other embodiments obtained by persons of ordinary skill in the art from the embodiments of the invention without creative work fall within the protection scope of the invention.
It should be noted that, in the absence of conflict, the embodiments of the invention and the features in the embodiments may be combined with one another.
The invention is further described below with reference to the drawings and specific embodiments, which do not limit the invention.
In a preferred embodiment of the invention, in view of the above problems in the prior art, a voiceprint recognition method based on feature space is provided. The method is applicable to smart devices with voice control functions, for example intelligent robots used in personal spaces.
In this voiceprint recognition method, a first frequency band and a second frequency band are first preset, the first frequency band being higher than the second frequency band. Specifically, different users may speak at different frequencies; a rough division by frequency yields a lower band corresponding to adult speakers and a higher band corresponding to child speakers.
Further, voiceprint recognition may differ between adult and child speakers, in particular in the extraction of voiceprint features and the structure of the corresponding voiceprint models. Therefore, in the technical solution of the invention, two speech reception bands are set, and the speech of adults and children is distinguished according to these two bands, further improving recognition accuracy. In other words, the first frequency band can be used to represent the voice band of child speakers, and the second frequency band the voice band of adult speakers. Accordingly, in preferred embodiments of the invention, the two bands can be adjusted as experimental data accumulate, so that they accurately represent the voice bands of adult and child speakers respectively.
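The patent leaves the concrete band boundaries to experimental tuning, so any routing logic is only an illustration. The sketch below routes an utterance to the first (child) or second (adult) band by a crude autocorrelation estimate of its fundamental frequency; the band limits, the `estimate_f0` helper, and the synthetic test tones are all assumptions, not part of the disclosure:

```python
import numpy as np

# Hypothetical band boundaries in Hz; the patent leaves the exact
# cut-offs to experimental tuning, so these values are assumptions.
CHILD_BAND = (250.0, 450.0)   # "first frequency band" (higher)
ADULT_BAND = (80.0, 250.0)    # "second frequency band" (lower)

def estimate_f0(signal, sr):
    """Crude autocorrelation-based fundamental-frequency estimate."""
    signal = signal - np.mean(signal)
    corr = np.correlate(signal, signal, mode="full")[len(signal) - 1:]
    # search for the strongest peak at a lag within the plausible F0 range
    lo, hi = int(sr / CHILD_BAND[1]), int(sr / ADULT_BAND[0])
    lag = lo + int(np.argmax(corr[lo:hi]))
    return sr / lag

def route_band(f0):
    """Assign an utterance to the child or adult band by its mean F0."""
    return "first (child)" if f0 >= CHILD_BAND[0] else "second (adult)"

sr = 16000
t = np.arange(4000) / sr                    # 0.25 s of signal
child_like = np.sin(2 * np.pi * 300 * t)    # 300 Hz tone
adult_like = np.sin(2 * np.pi * 120 * t)    # 120 Hz tone
print(route_band(estimate_f0(child_like, sr)))  # first (child)
print(route_band(estimate_f0(adult_like, sr)))  # second (adult)
```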
In a preferred embodiment of the invention, as shown in Fig. 1, the voiceprint recognition method specifically comprises:
Step S1, within the first frequency band or the second frequency band respectively, dividing speech recorded under different backgrounds and from different voices into recognition segments of a specific length;
Step S2, applying a feature transform to each recognition segment to obtain corresponding identification feature points, and using all the identification feature points associated with all the recognition segments to form the identification feature space of the first frequency band, or that of the second frequency band;
Step S3, partitioning the identification feature space into a plurality of subspaces, describing each partitioned subspace with description information, and assigning each subspace a corresponding sequence number;
Step S4, within the first frequency band or the second frequency band respectively, applying the feature transform to every training sentence associated with the training model to obtain a temporal feature point set containing the corresponding temporal feature points; assigning each temporal feature point to a subspace under the same frequency band; forming, from the sequence numbers of the corresponding subspaces, a first sequence associated with the first or the second frequency band; and thereby forming the corresponding training identification feature;
Step S5, within the first frequency band or the second frequency band respectively, applying the feature transform to every test sentence associated with the test model to obtain a temporal feature point set; assigning each temporal feature point to a subspace; forming, from the sequence numbers of the corresponding subspaces, a second sequence associated with the first or the second frequency band; and thereby forming the corresponding test identification feature;
Step S6, comparing whether the training identification feature associated with the first frequency band is similar to the test identification feature, and deriving the confirmation result of voiceprint recognition from the comparison; or comparing whether the training identification feature associated with the second frequency band is similar to the test identification feature, and deriving the confirmation result of voiceprint recognition from the comparison.
In a preferred embodiment of the invention, on the basis of the above presetting, steps S1 and S2 first obtain speech under the first or the second frequency band from different backgrounds and different voices, and divide this speech into recognition segments of a specific length. Specifically, each sentence from the different backgrounds and voices can be divided into 20 ms frames with a 10 ms frame shift; silence is then removed frame by frame, the Mel-cepstrum transform is applied to the speech frames, and 12 coefficients are kept per frame, these 12 coefficients constituting an identification feature point. The identification feature points of all speech segments form the identification feature set, i.e. the corresponding identification feature space.
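The 20 ms / 10 ms framing, frame-wise silence removal, and 12 Mel-cepstrum coefficients per frame described above can be sketched as follows. The FFT size, filterbank size, and silence threshold are assumed values the patent does not specify:

```python
import numpy as np

def frame_signal(x, sr, frame_ms=20, shift_ms=10):
    """Split a signal into 20 ms frames with a 10 ms shift, as in the patent."""
    flen, fshift = int(sr * frame_ms / 1000), int(sr * shift_ms / 1000)
    n = 1 + max(0, (len(x) - flen) // fshift)
    return np.stack([x[i * fshift:i * fshift + flen] for i in range(n)])

def remove_silence(frames, rel_thresh=0.01):
    """Drop frames whose energy falls below a fraction of the peak energy."""
    e = (frames ** 2).sum(axis=1)
    return frames[e > rel_thresh * e.max()]

def mel_cepstrum(frames, sr, n_mels=26, n_ceps=12):
    """Minimal Mel cepstrum: power spectrum -> mel filterbank -> log -> DCT,
    keeping 12 coefficients per frame as the patent specifies."""
    nfft = 512
    spec = np.abs(np.fft.rfft(frames * np.hamming(frames.shape[1]), nfft)) ** 2
    # triangular mel filterbank
    mel = lambda f: 2595 * np.log10(1 + f / 700)
    imel = lambda m: 700 * (10 ** (m / 2595) - 1)
    pts = imel(np.linspace(mel(0), mel(sr / 2), n_mels + 2))
    bins = np.floor((nfft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_mels, nfft // 2 + 1))
    for i in range(n_mels):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    logmel = np.log(spec @ fb.T + 1e-10)
    # DCT-II over the mel channels, keeping the first 12 coefficients
    k = np.arange(n_mels)
    dct = np.cos(np.pi * np.outer(np.arange(n_ceps), 2 * k + 1) / (2 * n_mels))
    return logmel @ dct.T

sr = 16000
x = np.sin(2 * np.pi * 220 * np.arange(sr) / sr)  # 1 s synthetic tone
feats = mel_cepstrum(remove_silence(frame_signal(x, sr)), sr)
print(feats.shape)  # one 12-dimensional feature point per retained frame
```

Each row of `feats` corresponds to one identification feature point in the sense of step S2.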
In a preferred embodiment of the invention, in step S3, the identification feature space is partitioned into a plurality of subspaces with the K-means algorithm; for each subspace after division, the K-means cluster center is recorded as the data description of that subspace, each subspace is numbered, and the description information of each subspace is recorded together with its sequence number. These steps are performed separately for the identification feature spaces under the first and the second frequency band.
In a preferred embodiment of the invention, the operation of step S4 is performed on the subspaces under the first or the second frequency band respectively: the feature transform is applied to every training sentence associated with the training model to obtain a temporal feature point set containing the corresponding temporal feature points; each temporal feature point is assigned to a subspace under the same frequency band; a first sequence associated with the first or the second frequency band is formed from the sequence numbers of the corresponding subspaces; and the corresponding training identification feature is thereby formed.
Specifically, in a preferred embodiment of the invention, a training sentence is a sentence that, after repeated training, is stored inside the system as part of the training model against which the system performs comparisons.
Specifically, in a preferred embodiment of the invention, in step S4, each temporal feature point is assigned, according to the nearest-neighbor rule, to one of the subspaces under the same frequency band (the first or the second frequency band), and the sequence number of the subspace corresponding to each temporal feature point is recorded. This ultimately forms a first sequence composed of the sequence numbers of the subspaces, for example (2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5), from which the corresponding training identification feature is formed.
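Steps S3 and S4 — partitioning the feature space with K-means and mapping a sentence's temporal feature points to subspace sequence numbers by the nearest-neighbor rule — can be sketched as follows, using random vectors as stand-ins for real cepstral features. The dimension 12 follows the feature extraction described above; the number of subspaces (k = 8) is an assumption:

```python
import numpy as np

def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's algorithm: the cluster centers serve as the
    'description information' of the subspaces (step S3)."""
    rng = np.random.default_rng(seed)
    centers = points[rng.choice(len(points), k, replace=False)]
    for _ in range(iters):
        labels = np.argmin(((points[:, None] - centers) ** 2).sum(-1), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = points[labels == j].mean(axis=0)
    return centers

def to_sequence(temporal_points, centers):
    """Step S4: map each temporal feature point to the sequence number of
    its nearest subspace (nearest-neighbor rule)."""
    d = ((temporal_points[:, None] - centers) ** 2).sum(-1)
    return np.argmin(d, axis=1).tolist()

rng = np.random.default_rng(1)
# stand-in for the identification feature space: 12-dim cepstral points
space = rng.normal(size=(500, 12))
centers = kmeans(space, k=8)
# stand-in for the temporal feature points of one training sentence
sentence = rng.normal(size=(11, 12))
first_sequence = to_sequence(sentence, centers)
print(len(first_sequence))  # 11 subspace sequence numbers, one per point
```

The same `to_sequence` call applied to a test sentence's points yields the second sequence of step S5.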
In a preferred embodiment of the invention, similarly, in step S5 the following operations are performed on the subspaces under the first or the second frequency band respectively: the feature transform is applied to the test sentences associated with the test model to obtain a temporal feature point set; each temporal feature point is assigned to a subspace; a second sequence associated with the first or the second frequency band is formed from the sequence numbers of the corresponding subspaces; and the corresponding test identification feature is thereby formed.
In a preferred embodiment of the invention, a test sentence is a sentence associated with the test model, i.e. the sentence to be compared.
Specifically, in a preferred embodiment of the invention, in step S5, each temporal feature point in the test sentence is likewise assigned, according to the nearest-neighbor rule, to one of the subspaces under the same frequency band (the first or the second frequency band), and the sequence number of the subspace corresponding to each temporal feature point is recorded. This ultimately forms a second sequence, likewise composed of the sequence numbers of the subspaces, for example (2, 3, 3, 5, 5, 8, 6, 6, 6, 4, 4), from which the corresponding test identification feature is formed. In a preferred embodiment of the invention, steps S4 and S5 do not depend on each other (i.e. the execution of step S5 is not premised on step S4 having finished), so steps S4 and S5 can be performed simultaneously. Fig. 1 nevertheless shows an embodiment in which steps S4 and S5 are performed in sequence.
In a preferred embodiment of the invention, in step S6, the training identification feature formed above is compared with the test identification feature, and the final voiceprint recognition result is derived from the comparison. Specifically, in step S6 the comparison is likewise made per band: the test identification feature under the first frequency band is compared with the training identification feature under the first frequency band, and the voiceprint recognition result is derived from the comparison. Similarly, the test identification feature under the second frequency band is compared with the training identification feature under the second frequency band, and the result is derived from that comparison.
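The patent does not specify the similarity measure used in step S6. One plausible choice for comparing two sequences of subspace numbers is a normalized edit distance with an acceptance threshold; both the metric and the threshold below are assumptions, sketched purely as an illustration:

```python
def edit_distance(a, b):
    """Classic Levenshtein distance between two sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (x != y))
    return dp[-1]

def similar(train_seq, test_seq, threshold=0.5):
    """Accept the speaker if the normalized edit distance between the
    training and test sequences is below the (assumed) threshold."""
    dist = edit_distance(train_seq, test_seq)
    return dist / max(len(train_seq), len(test_seq)) < threshold

train = [2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5]   # example first sequence
test_ok = [2, 4, 8, 8, 5, 5, 5, 5]          # similar utterance
test_bad = [1, 3, 3, 7, 7, 6, 6, 6]         # different speaker
print(similar(train, test_ok), similar(train, test_bad))  # True False
```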
Further, in a preferred embodiment of the invention, in step S4, the spatial sequence comprises data groups associated with the subspaces, one data group corresponding to one sequence number.
After the spatial sequence is formed, a first data compression is performed on the spatial sequence within the first or the second frequency band respectively; as shown in Fig. 2, the process is:
Step S41, recording the sequence number of each data group, and recording the repetition count associated with each sequence number;
Step S42, judging whether any sequence number has a repetition count of 1, and turning to step S43 when a data group with a repetition count of 1 exists;
Step S43, deleting the data group whose sequence number has a repetition count of 1;
Step S44, judging whether the sequence number of the data group preceding the deleted data group is identical to the sequence number of the data group following it:
if identical, merging the preceding and following data groups;
if different, retaining both the preceding and the following data groups;
all data groups in the spatial sequence form the first sequence after the first data compression has been performed.
Specifically, in a preferred embodiment of the invention, during the first data compression, the sequence numbers of the subspaces and the counts of identical consecutive sequence numbers are recorded, each sequence number and its count being arranged as one data group; when the count of a sequence number is 1, that data group is removed. For example, in a preferred embodiment of the invention, if the data group with sequence number 4 has a count of 1, that group is deleted during the first data compression.
If, after a group is removed, the sequence number of the group before it is identical to the sequence number of the group after it, the two groups are merged: the sequence number of the newly formed group is the same as that of the group preceding the deleted group, and its count is the sum of the counts of the preceding and following groups. Otherwise, if the sequence numbers of the preceding and following groups differ, both groups are retained unchanged. For example, in a preferred embodiment of the invention, after the data group with sequence number 4 is removed, the preceding group has sequence number 2 and the following group has sequence number 8; since 2 and 8 differ, the original data groups are retained.
In a preferred embodiment of the invention, the first sequence after the first data compression is the training identification feature.
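The first data compression of steps S41–S44 (and the identical second compression of steps S51–S54) amounts to run-length encoding followed by removal of length-1 runs and merging of newly adjacent runs with equal sequence numbers; a direct sketch:

```python
def compress(seq):
    """First/second data compression (steps S41-S44): run-length encode,
    drop runs of length 1, then merge newly adjacent runs that share a
    sequence number."""
    # S41: run-length encode into [sequence number, repetition count] groups
    groups = []
    for s in seq:
        if groups and groups[-1][0] == s:
            groups[-1][1] += 1
        else:
            groups.append([s, 1])
    # S42-S43: delete data groups whose repetition count is 1
    groups = [g for g in groups if g[1] > 1]
    # S44: merge neighbours that became adjacent and share a sequence number
    merged = []
    for g in groups:
        if merged and merged[-1][0] == g[0]:
            merged[-1][1] += g[1]
        else:
            merged.append(g)
    return merged

# example from the description: group 4 has count 1 and is removed;
# its neighbours (2 and 8) differ, so they are retained as-is
seq = [2, 2, 4, 8, 8, 8, 5, 5, 5, 5, 5]
print(compress(seq))  # [[2, 2], [8, 3], [5, 5]]
```

When the neighbours do match, as in `[1, 1, 7, 1, 1]`, deleting the lone 7 leaves two runs of 1s that are merged into a single group `[1, 4]`.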
Correspondingly, in a preferred embodiment of the invention, in the above step S5 the spatial sequence includes data groups associated with each subspace, one data group corresponding to one sequence number;
after the spatial sequence is formed, a second data compression is also performed on the spatial sequence in the first frequency range or the second frequency range; as shown in Figure 3, the process is:
Step S51: record the sequence number of each data group, and record the count of repeated sequence numbers associated with each sequence number;
Step S52: judge whether any sequence number has a repeat count of 1, and turn to step S53 when a data group whose repeat count is 1 exists;
Step S53: delete the data group whose repeat count is 1;
Step S54: judge whether the sequence number of the data group in front of the deleted data group is identical to the sequence number of the data group behind it:
if identical, merge the previous data group and the latter data group into one;
if they differ, retain both the previous data group and the latter data group;
after the second data compression has been performed on all data groups in the spatial sequence, the second sequence is formed.
Specifically, similarly to the steps described in the above step S4, in step S5 the sequence numbers of the subspaces and the counts of identical sequence numbers are likewise recorded, and each sequence number is arranged together with its count as one data group. When the count of a sequence number is 1, that data group is removed.
If, after a data group has been removed, the sequence number of the data group in front of it is identical to the sequence number of the data group behind it, the two groups are merged into one. The newly formed data group carries the same sequence number as the group in front of the deleted group, and its count is the sum of the counts of the groups in front of and behind the deleted group. Otherwise, if after the deletion the sequence numbers of the front and rear groups differ, both groups are retained unchanged. For example, in a preferred embodiment of the invention, after the data group with sequence number 4 is removed, the data group in front of it has sequence number 2 and the data group behind it has sequence number 8; since 2 and 8 differ, the original data groups are retained.
Similarly, in a preferred embodiment of the invention, the second sequence obtained through the second data compression serves as the test identification feature.
Then, as described above, in the above step S6 the training identification feature and the test identification feature under the same frequency range (the first frequency range or the second frequency range) are finally compared, and the comparison result is processed to obtain the final voiceprint recognition result.
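The patent does not spell out how the two identification features are compared in step S6. Purely as an illustration, one could score similarity as one minus the normalized edit distance between the sequence numbers of the two compressed sequences; the function names and the threshold below are assumptions, not taken from the text:

```python
def sequence_similarity(train_feature, test_feature):
    """Illustrative similarity in [0, 1] between two compressed
    (sequence number, count) features: 1 minus the normalized edit
    distance over their sequence numbers."""
    a = [num for num, _ in train_feature]
    b = [num for num, _ in test_feature]
    m, n = len(a), len(b)
    if max(m, n) == 0:
        return 1.0
    # Classic dynamic-programming Levenshtein distance.
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,         # deletion
                           dp[i][j - 1] + 1,         # insertion
                           dp[i - 1][j - 1] + cost)  # substitution
    return 1.0 - dp[m][n] / max(m, n)

def confirm_speaker(train_feature, test_feature, threshold=0.8):
    # threshold is a hypothetical tuning parameter
    return sequence_similarity(train_feature, test_feature) >= threshold
```

Because the counts are ignored here, two utterances that visit the same subspaces in the same order score as identical even if they dwell in each subspace for different durations; a real implementation would have to decide how much weight the counts deserve.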
Performing the above steps keeps the computational load of voiceprint recognition small, improves the recognition rate, and also keeps the amount of data to be processed relatively small.
The above are only preferred embodiments of the present invention and do not thereby limit its embodiments or scope of protection. Those skilled in the art should appreciate that all equivalent schemes obtained by using the description and drawings of the present invention, together with obvious variations thereof, fall within the scope of protection of the present invention.
Claims (9)
1. A voiceprint recognition method based on feature space, characterized in that: a first frequency range and a second frequency range are preset, the first frequency range being higher than the second frequency range, the method further comprising:
Step S1: in the first frequency range or the second frequency range, dividing speech recorded under different backgrounds and of different lengths into identification segments of a specific length;
Step S2: performing a feature transform on each identification segment to obtain a plurality of corresponding identification features, and using all the identification features associated with all the identification segments to form an identification feature space corresponding to the first frequency range, or an identification feature space corresponding to the second frequency range;
Step S3: dividing the identification feature space into a plurality of subspaces, describing each divided subspace with description information, and assigning a corresponding sequence number to each subspace;
Step S4: performing the feature transform on each training sentence of a training model in the first frequency range or in the second frequency range to obtain a temporal feature point set comprising corresponding temporal feature points, allocating each temporal feature point to one of the subspaces under the same frequency range, forming, according to the sequence numbers of the subspaces corresponding to the temporal feature points, a first sequence associated with the first frequency range or the second frequency range, and thereby forming a corresponding training identification feature;
Step S5: performing the feature transform on each test sentence of a test model in the first frequency range or in the second frequency range to obtain a temporal feature point set, allocating each temporal feature point to one of the subspaces, forming, according to the sequence numbers of the subspaces corresponding to the temporal feature points, a second sequence associated with the first frequency range or the second frequency range, and thereby forming a corresponding test identification feature;
Step S6: comparing whether the training identification feature and the test identification feature associated with the first frequency range are similar, and processing the comparison result to obtain a confirmation result of the voiceprint recognition; or comparing whether the training identification feature and the test identification feature associated with the second frequency range are similar, and processing the comparison result to obtain a confirmation result of the voiceprint recognition.
2. The voiceprint recognition method of claim 1, characterized in that in the step S4, each temporal feature point is allocated to a subspace according to the nearest-neighbor rule.
3. The voiceprint recognition method of claim 1, characterized in that in the step S4, the subspaces to which the temporal feature points are allocated form a spatial sequence according to their sequence numbers, and the spatial sequence serves as the first sequence, so as to form the training identification feature.
4. The voiceprint recognition method of claim 1, characterized in that in the step S5, the subspaces to which the temporal feature points are allocated form a spatial sequence according to their sequence numbers, and the spatial sequence serves as the second sequence, so as to form the test identification feature.
5. The voiceprint recognition method of claim 3, characterized in that in the step S4, the spatial sequence includes data groups associated with each subspace, one data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a first data compression process performed on the spatial sequence in the first frequency range or the second frequency range, specifically:
Step S41: recording the sequence number of each data group, and recording the count of repeated sequence numbers associated with each sequence number;
Step S42: judging whether any sequence number has a repeat count of 1, and turning to step S43 when a data group whose repeat count is 1 exists;
Step S43: deleting the data group whose repeat count is 1;
Step S44: judging whether the sequence number of the data group in front of the deleted data group is identical to the sequence number of the data group behind it:
if identical, merging the previous data group and the latter data group into one;
if they differ, retaining both the previous data group and the latter data group;
all the data groups in the spatial sequence forming the first sequence after the first data compression has been performed.
6. The voiceprint recognition method of claim 4, characterized in that in the step S5, the spatial sequence includes data groups associated with each subspace, one data group corresponding to one sequence number;
after the spatial sequence is formed, the method further includes a second data compression process performed on the spatial sequence in the first frequency range or the second frequency range, specifically:
Step S51: recording the sequence number of each data group, and recording the count of repeated sequence numbers associated with each sequence number;
Step S52: judging whether any sequence number has a repeat count of 1, and turning to step S53 when a data group whose repeat count is 1 exists;
Step S53: deleting the data group whose repeat count is 1;
Step S54: judging whether the sequence number of the data group in front of the deleted data group is identical to the sequence number of the data group behind it:
if identical, merging the previous data group and the latter data group into one;
if they differ, retaining both the previous data group and the latter data group;
all the data groups in the spatial sequence forming the second sequence after the second data compression has been performed.
7. The voiceprint recognition method of claim 1, characterized in that the feature transform is a Mel cepstral transform.
8. The voiceprint recognition method of claim 7, characterized in that, when performing the Mel cepstral transform, each sentence is divided into frames of 20 ms each, taken at a shift of 10 ms, to obtain the sentence frames associated with that sentence;
then silence is removed frame by frame, 12 coefficients per frame are retained after the Mel cepstral transform is applied to the sentence frames, and the 12 coefficients constitute the identification feature.
9. The voiceprint recognition method of claim 1, characterized in that in the step S3, a K-means algorithm is used to divide the identification feature space into a plurality of subspaces, and the K-means center point of each divided subspace is recorded as the description information of that subspace.
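The framing scheme of claim 8 (20 ms frames taken every 10 ms) can be sketched as follows; the 16 kHz sample rate and the function name are illustrative assumptions, and the silence removal and the Mel cepstral transform itself are omitted:

```python
def frame_signal(signal, sample_rate=16000, frame_ms=20, shift_ms=10):
    """Split a sampled sentence into 20 ms frames taken every 10 ms,
    yielding the sentence frames described in claim 8."""
    frame_len = sample_rate * frame_ms // 1000   # 320 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000       # 160 samples at 16 kHz
    frames = []
    start = 0
    while start + frame_len <= len(signal):
        frames.append(signal[start:start + frame_len])
        start += shift
    return frames
```

One second of audio at 16 kHz yields 99 overlapping frames of 320 samples each; a Mel cepstral transform keeping 12 coefficients per frame would then map the sentence to a 99 x 12 feature matrix, whose rows are the temporal feature points of steps S4 and S5.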
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510947369.8A CN106887230A (en) | 2015-12-16 | 2015-12-16 | A kind of method for recognizing sound-groove in feature based space |
Publications (1)
Publication Number | Publication Date |
---|---|
CN106887230A true CN106887230A (en) | 2017-06-23 |
Family
ID=59176730
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201510947369.8A Pending CN106887230A (en) | 2015-12-16 | 2015-12-16 | A kind of method for recognizing sound-groove in feature based space |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN106887230A (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111785291A (en) * | 2020-07-02 | 2020-10-16 | 北京捷通华声科技股份有限公司 | Voice separation method and voice separation device |
Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6130949A (en) * | 1996-09-18 | 2000-10-10 | Nippon Telegraph And Telephone Corporation | Method and apparatus for separation of source, program recorded medium therefor, method and apparatus for detection of sound source zone, and program recorded medium therefor |
CN101661754A (en) * | 2003-10-03 | 2010-03-03 | 旭化成株式会社 | Data processing unit, method and control program |
CN101944359A (en) * | 2010-07-23 | 2011-01-12 | 杭州网豆数字技术有限公司 | Voice recognition method facing specific crowd |
CN102354496A (en) * | 2011-07-01 | 2012-02-15 | 中山大学 | PSM-based (pitch scale modification-based) speech identification and restoration method and device thereof |
CN102623008A (en) * | 2011-06-21 | 2012-08-01 | 中国科学院苏州纳米技术与纳米仿生研究所 | Voiceprint identification method |
CN103943104A (en) * | 2014-04-15 | 2014-07-23 | 海信集团有限公司 | Voice information recognition method and terminal equipment |
CN104185868A (en) * | 2012-01-24 | 2014-12-03 | 澳尔亚有限公司 | Voice authentication and speech recognition system and method |
CN104392718A (en) * | 2014-11-26 | 2015-03-04 | 河海大学 | Robust voice recognition method based on acoustic model array |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN106971737A (en) | A kind of method for recognizing sound-groove spoken based on many people | |
CN109817246B (en) | Emotion recognition model training method, emotion recognition device, emotion recognition equipment and storage medium | |
CN108597496B (en) | Voice generation method and device based on generation type countermeasure network | |
CN102509547B (en) | Method and system for voiceprint recognition based on vector quantization based | |
CN104167208B (en) | A kind of method for distinguishing speek person and device | |
KR101963993B1 (en) | Identification system and method with self-learning function based on dynamic password voice | |
CN108122556A (en) | Reduce the method and device that driver's voice wakes up instruction word false triggering | |
CN107767861B (en) | Voice awakening method and system and intelligent terminal | |
Khan et al. | Principal component analysis-linear discriminant analysis feature extractor for pattern recognition | |
CN110164452A (en) | A kind of method of Application on Voiceprint Recognition, the method for model training and server | |
CN106448684A (en) | Deep-belief-network-characteristic-vector-based channel-robust voiceprint recognition system | |
CN107274905B (en) | A kind of method for recognizing sound-groove and system | |
CN109473105A (en) | The voice print verification method, apparatus unrelated with text and computer equipment | |
CN106898355B (en) | Speaker identification method based on secondary modeling | |
CN101540170B (en) | Voiceprint recognition method based on biomimetic pattern recognition | |
CN110415701A (en) | The recognition methods of lip reading and its device | |
CN104091602A (en) | Speech emotion recognition method based on fuzzy support vector machine | |
Fong | Using hierarchical time series clustering algorithm and wavelet classifier for biometric voice classification | |
CN105845141A (en) | Speaker confirmation model, speaker confirmation method and speaker confirmation device based on channel robustness | |
CN111091809B (en) | Regional accent recognition method and device based on depth feature fusion | |
CN113129927A (en) | Voice emotion recognition method, device, equipment and storage medium | |
CN109273011A (en) | A kind of the operator's identification system and method for automatically updated model | |
CN106971727A (en) | A kind of verification method of Application on Voiceprint Recognition | |
CN106971730A (en) | A kind of method for recognizing sound-groove based on channel compensation | |
CN102623008A (en) | Voiceprint identification method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication |
Application publication date: 20170623 |