CN102867513A - Pseudo-Zernike moment based voice content authentication method - Google Patents
Pseudo-Zernike moment based voice content authentication method
- Publication number
- CN102867513A (application CN201210278724.3)
- Authority
- CN
- China
- Prior art keywords
- frame
- watermark
- pseudo
- voice
- zernike
- Prior art date
- Legal status
- Granted
Landscapes
- Editing Of Facsimile Originals (AREA)
Abstract
The invention discloses a pseudo-Zernike moment based voice content authentication method. During watermark embedding, the original voice signal A is divided into P frames and each frame into N segments; a watermark W is generated from the mean magnitude of the n-th order pseudo-Zernike moments of the discrete cosine transform (DCT) low-frequency coefficients of the first N/2 segments of each frame; the watermark is then embedded by quantizing the pseudo-Zernike moments of the DCT low-frequency coefficients of the last N/2 segments of each frame, yielding a watermarked voice signal A'. Because the magnitudes of the pseudo-Zernike moments of the DCT low-frequency coefficients are closely tied to the voice content yet robust to common signal processing, the method remains sensitive to malicious tampering while tolerating conventional voice signal processing operations.
Description
Technical field
The present invention relates to speech signal processing, and in particular to a solution to the problem of authenticating the authenticity and integrity of voice content.
Background technology
In recent years, with the rapid development of digital voice communication, the wide spread of voice products, and the emergence of powerful audio processing software, the transmission and use of digital speech have become increasingly frequent and widespread. At the same time, tampering with voice content during transmission and storage has become relatively easy. For example, if the key parts of an important piece of court-testimony audio are maliciously tampered with during storage or transmission, the consequences are easy to imagine. Questions such as how to determine whether an important or sensitive piece of voice content has been tampered with, where it was tampered with, and whether the recording source is genuine and trustworthy all concern the authenticity of digital speech, and they have attracted great research interest from scholars at home and abroad. Audio watermarking has attracted attention since its emergence in the 1990s and, as a technical means of protecting audio, has become a focus of research in the field of information security.
Compared with general audio signals, voice signals have a lower sampling rate and are more sensitive to common signal processing. Many existing audio content authentication algorithms therefore cannot be used for voice content authentication, or perform poorly when they are. In practice, the main concern for audio is copyright protection, whereas for voice it is the authentication of content authenticity and integrity. In digital-watermarking-based voice content authentication, if the embedded watermark is unrelated to the voice content itself, the amount of transmitted information increases and a certain security risk arises; voice authentication algorithms that generate the watermark from the voice's own features or content therefore have greater research significance and practical value.
The magnitude of pseudo-Zernike moments (and of Zernike moments) is rotation invariant, a property that has been widely used in image representation, image retrieval and image watermarking but rarely applied to audio. The paper "Robust audio watermarking based on low-order Zernike moments" (Xiang Shi-jun, Huang Ji-wu, Yang Rui, 5th International Workshop on Digital Watermarking, pp. 226-240, Oct. 2006) first maps the audio from one dimension to two dimensions and then applies the Zernike transform to the resulting two-dimensional signal. It shows experimentally that the magnitude of the Zernike moments is highly robust to common signal processing, analyses the linear relationship between the Zernike moment magnitude and the audio sample values, and on that basis proposes a robust audio watermarking algorithm based on low-order Zernike moments. The paper "A pseudo-Zernike moments based audio watermarking scheme robust against desynchronization attacks" (Wang Xiang-yang, Ma Tian-xiao, Niu Pan-pan, Computers and Electrical Engineering, vol. 37, no. 4, pp. 425-443, July 2011) first embeds a synchronization code in the time domain based on a statistical mean and then embeds the watermark by quantizing the magnitude of the pseudo-Zernike moments, yielding an audio watermarking algorithm resistant to desynchronization attacks. These pseudo-Zernike (and Zernike) moment based watermarking algorithms have two drawbacks. On the one hand, the pseudo-Zernike moments of all sample points must be computed, which is computationally expensive and time-consuming; moreover, the watermark is embedded by proportionally scaling the sample values of each audio segment, and analysis shows that directly scaling the audio samples changes the original audio considerably and noticeably degrades its quality. On the other hand, the embedding position and method of the watermark are public, and the computation of the feature of each audio frame (its pseudo-Zernike moments) is also known. An attacker can therefore locate each audio frame, compute its feature, and re-quantize the pseudo-Zernike moments to remove the embedded watermark, so that the algorithm no longer protects the copyright. Alternatively, the attacker can replace a watermarked audio segment with other audio and then quantize the replacement so that it still satisfies the condition for correct watermark extraction, thereby attacking the content. Research on content-based voice content authentication algorithms with strong resistance to such attacks is therefore of significant practical importance.
Summary of the invention
In view of the deficiencies of the prior art, the object of the present invention is to provide a pseudo-Zernike moment based voice content authentication algorithm that can effectively distinguish common signal processing operations from malicious attacks and can effectively locate maliciously tampered voice content, thereby realizing the authentication of the authenticity and integrity of voice content.
To achieve this object, the present invention designs a new watermark generation and embedding method based on the robustness of the pseudo-Zernike moment magnitudes of the DCT low-frequency coefficients to common signal processing.
The pseudo-Zernike moment based voice content authentication method can effectively distinguish common signal processing operations from malicious attacks while effectively localizing malicious tampering, thereby realizing the authentication of the authenticity and integrity of voice content. It comprises the following steps:
(1) Watermark embedding: starting from the K-th sample of the voice signal, the original voice signal A is divided into P frames (K serves as the key of the watermarking system), and each frame is divided into N segments. The sum of the magnitudes of the n-th order pseudo-Zernike moments of the DCT low-frequency coefficients of the first N/2 segments of each frame is then computed, the mean of these magnitude sums is obtained, and the watermark W is generated from the mean. The watermark is embedded into the last N/2 segments of each frame by quantizing the pseudo-Zernike moments of their DCT low-frequency coefficients; the resulting watermarked voice signal is denoted A'.
(2) Voice content authentication: similarly to the watermark embedding process, the voice signal to be authenticated A* is divided into P frames starting from its k_1-th sample, and each frame is divided into N segments. The sum of the magnitudes of the n-th order pseudo-Zernike moments of the DCT low-frequency coefficients of the first N/2 segments of each frame is computed and its mean is obtained, from which a watermark W' is generated. The magnitudes of the n-th order pseudo-Zernike moments of the DCT low-frequency coefficients of the last N/2 segments of each frame are computed, and a watermark W* is extracted from these magnitudes. W* and W' are compared; the positions where they differ are the positions at which the voice signal has been tampered with, thereby realizing the authentication of the authenticity and integrity of the voice content.
Compared with existing voice watermarking algorithms for content authentication, the present invention generates the watermark from the voice content itself, so the receiving end obtains the embedded watermark at the same time as it receives the voice signal. This reduces the transmission bandwidth and saves resources, while also strengthening the security of watermark transmission. Watermark embedding only requires a pseudo-Zernike transform of the DCT low-frequency coefficients, which improves the efficiency of the algorithm and the tolerance of the watermark to common signal processing. The present invention is therefore well suited to practical application.
Description of drawings
Fig. 1 is the watermarked voice signal of the embodiment of the invention.
Fig. 2 is the voice signal of Fig. 1 after a muting attack on part of the voice content.
Fig. 3 is the voice signal of Fig. 1 after a substitution attack on part of the voice content.
Fig. 4 is the tampering localization result for Fig. 2.
Fig. 5 is the tampering localization result for Fig. 3.
Fig. 6 lists the imperceptibility test results.
Fig. 7 lists the robustness test results for common signal processing.
Embodiment
The technical scheme of the present invention is further described below with reference to the accompanying drawings and an embodiment.
1. Watermark generation and embedding:
(1) Framing of the voice data and division of each frame into segments. The original voice signal A = {a(l), 1 ≤ l ≤ L_A + K} is divided into P frames starting from the K-th sample (K serves as the key of the watermarking system); each frame has length I = L_A / P, and the i-th frame is denoted A(i) (i = 1, 2, ..., P). Each frame is divided into N segments of length I/N; the j-th segment of the i-th frame is denoted A(i, j), 1 ≤ i ≤ P, 1 ≤ j ≤ N.
(2) DCT. A DCT is applied to A(i, j); D(i, j) denotes the DCT coefficients of the j-th segment of the i-th frame. The DCT coefficients of the first N/2 segments of the i-th frame are denoted D_1(i, j).
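As an illustration only, the following Python sketch shows one way the framing, segmentation and per-segment DCT described above could be implemented; the row-major arrangement of the first m × m low-frequency coefficients into a two-dimensional block is an assumed detail, not something specified in this text.

```python
import numpy as np
from scipy.fft import dct

def frame_and_segment(signal, K, P, N):
    """Split the voice signal into P frames of N segments each, starting
    from sample index K (K acts as the key of the watermarking system).
    Assumes the usable length is divisible by P*N; returns segments[i, j] = A(i, j)."""
    usable = np.asarray(signal, dtype=float)[K:]
    seg_len = len(usable) // (P * N)
    trimmed = usable[:P * N * seg_len]
    return trimmed.reshape(P, N, seg_len)

def segment_dct_lowfreq(segment, m):
    """DCT of one segment; return its first m*m low-frequency coefficients
    arranged (row-major, an assumed detail) as an m x m block."""
    coeffs = dct(segment, norm='ortho')
    return coeffs[:m * m].reshape(m, m)
```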
(3) Computation of the n-th order, m-repetition pseudo-Zernike moments. The first m_1 × m_1 low-frequency coefficients of D_1(i, j) are rearranged into a two-dimensional signal, and its n-th order, m-repetition pseudo-Zernike moments are computed as follows.
Let {V_nm} denote the pseudo-Zernike polynomials, a set of complex-valued polynomials that form a complete orthogonal basis on the unit circle, defined by
V_nm(x, y) = V_nm(ρ, θ) = R_nm(ρ)exp(imθ)
where n is a non-negative integer, m is an integer satisfying |m| ≤ n, l is the vector from the origin to the point (x, y), ρ = |l|, θ is the counter-clockwise angle from the positive x-axis to the vector l, and R_nm(ρ) is the radial polynomial
R_nm(ρ) = Σ_{s=0}^{n−|m|} (−1)^s · (2n+1−s)! / [s!(n+|m|+1−s)!(n−|m|−s)!] · ρ^{n−s}.
A two-dimensional signal f(x, y) on the unit disc (x² + y² ≤ 1) can be expressed as a linear combination of the V_nm(x, y):
f(x, y) = Σ_n Σ_m A_nm V_nm(x, y)
where V_nm*(x, y) denotes the complex conjugate of V_nm(x, y) and A_nm is the n-th order, m-repetition pseudo-Zernike moment, defined as
A_nm = ((n+1)/π) ∫∫_{x²+y²≤1} f(x, y) V_nm*(x, y) dx dy.
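The following is a minimal numerical sketch (not taken from the patent) of how the n-th order, m-repetition pseudo-Zernike moment of a small two-dimensional block could be approximated by mapping the block onto the unit disc and discretizing the integral above; the placement of sample points on the disc is an assumption.

```python
import numpy as np
from math import factorial

def radial_poly(n, m, rho):
    """Pseudo-Zernike radial polynomial R_nm(rho)."""
    m = abs(m)
    r = np.zeros_like(rho)
    for s in range(n - m + 1):
        c = ((-1) ** s * factorial(2 * n + 1 - s)
             / (factorial(s) * factorial(n + m + 1 - s) * factorial(n - m - s)))
        r += c * rho ** (n - s)
    return r

def pseudo_zernike_moment(block, n, m):
    """Approximate the n-th order, m-repetition pseudo-Zernike moment A_nm of
    a square 2-D block by discretizing the unit-disc integral."""
    size = block.shape[0]
    coords = (2 * np.arange(size) + 1) / size - 1   # map sample centres to [-1, 1]
    x, y = np.meshgrid(coords, coords)
    rho = np.sqrt(x ** 2 + y ** 2)
    theta = np.arctan2(y, x)
    mask = rho <= 1.0                               # keep samples inside the unit disc
    v_conj = radial_poly(n, m, rho) * np.exp(-1j * m * theta)
    cell_area = (2.0 / size) ** 2                   # area of one sample cell
    return (n + 1) / np.pi * np.sum(block[mask] * v_conj[mask]) * cell_area
```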
(4) Generation of the voice watermark. The watermark is generated from the first N/2 segments of each frame. Let C_1(i, j), 1 ≤ i ≤ P, 1 ≤ j ≤ N/2, denote the sum of the magnitudes of the n-th order pseudo-Zernike moments of the j-th segment, and compute the mean of C_1(i, j) over the N/2 segments. Let M_1(i) be the most significant digit of this mean and let W_1(i) = {w_1(i, t), 1 ≤ t ≤ N/2} be the binary representation of M_1(i); W_1(i) is the watermark generated for the i-th frame.
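A small sketch of this watermark-generation step, under the assumption that the binary representation of the most significant digit M_1(i) is simply its binary expansion zero-padded to N/2 bits (the exact digit-to-bit mapping is not reproduced in this text):

```python
def most_significant_digit(value):
    """Most significant decimal digit of a positive number."""
    s = f"{abs(value):.10e}"        # e.g. '3.1415926536e+02' -> '3'
    return int(s[0])

def generate_watermark(c1_sums, n_bits):
    """Watermark bits W_1(i) for one frame from the mean of the pseudo-Zernike
    magnitude sums C_1(i, j) of its first N/2 segments."""
    mean_c1 = sum(c1_sums) / len(c1_sums)
    msd = most_significant_digit(mean_c1)
    # assumed mapping: binary expansion of the digit, zero-padded to n_bits bits
    return [int(b) for b in format(msd, f"0{n_bits}b")]
```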
(5) Watermark embedding. The DCT coefficients of the last N/2 segments of the i-th frame are denoted D_2(i, j), N/2+1 ≤ j ≤ N. The first m_2 × m_2 low-frequency coefficients of D_2(i, j) are rearranged into a two-dimensional signal and the sum of the magnitudes of its n-th order pseudo-Zernike moments is computed, denoted C_2(i, j). Let M_2(i, j) be the most significant digit of C_2(i, j). The watermark is embedded as follows: when w_1(i, t) = 1, M_2'(i, j) is obtained from M_2(i, j) by the first quantization rule, and when w_1(i, t) = 0 by the second; in both rules, when M_2(i, j) = 9, M_2'(i, j) = M_2(i, j) − 1, and j = t + N/2, 1 ≤ t ≤ N/2. M_2'(i, j) then replaces the most significant digit of the integer part of C_2(i, j), the next most significant digit is quantized to 5, and the resulting value is denoted C_2'(i, j).
The first m_2 × m_2 low-frequency coefficients of D_2(i, j) are then scaled by a factor α_2(i, j), computed from C_2(i, j) and C_2'(i, j); the scaled coefficients are denoted D_2'(i, j).
An inverse DCT is applied to D_2'(i, j); the resulting signal forms the second half of the i-th frame, and the first half together with this second half constitute the watermarked i-th frame.
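Because the two quantization formulas and the expression for α_2(i, j) are not reproduced in this text, the sketch below substitutes plausible rules: a parity-style quantization of the most significant digit (w = 1 gives an odd digit, w = 0 an even digit, with the 9 to 8 exception stated above) and a linear scaling α_2 = C_2'/C_2, which relies on the moments being linear in the signal. Both are assumptions, not the patent's exact formulas; pseudo_zernike_moment is the helper from the earlier sketch.

```python
def embed_bit(c2_value, bit):
    """Quantize the magnitude sum C_2(i, j) to carry one watermark bit.
    Assumed rule: make the most significant digit of the integer part odd
    for bit = 1 and even for bit = 0 (9 becomes 8), then set the next
    digit to 5."""
    digits = list(str(int(c2_value)))
    msd = int(digits[0])
    if msd == 9:
        new_msd = 8
    elif (msd % 2 == 1) != (bit == 1):
        new_msd = msd + 1
    else:
        new_msd = msd
    digits[0] = str(new_msd)
    if len(digits) > 1:
        digits[1] = "5"                 # quantize the next digit to 5
    return float("".join(digits))

def embed_in_segment(d2_block, bit, n, m_list):
    """Scale the m2 x m2 low-frequency DCT block so that its pseudo-Zernike
    magnitude sum takes the quantized value; alpha = C_2'/C_2 is an assumed
    scaling rule (the moments are linear in the block, so scaling the block
    scales the magnitude sum by the same factor)."""
    c2 = sum(abs(pseudo_zernike_moment(d2_block, n, m)) for m in m_list)
    c2_quantized = embed_bit(c2, bit)
    alpha = c2_quantized / c2
    return d2_block * alpha
```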
(6) The above embedding is performed on each of the P speech frames in turn; once all frames have been processed, the watermarked voice signal A' is obtained.
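A sketch of this frame-by-frame loop, reusing the helpers from the sketches above (frame_and_segment, segment_dct_lowfreq, pseudo_zernike_moment, generate_watermark, embed_in_segment); the choice of m_list (the repetitions summed for order n) and the reassembly of samples are assumptions.

```python
import numpy as np
from scipy.fft import dct, idct

def embed_all_frames(signal, K, P, N, n, m_list, m1, m2):
    """Embed the watermark frame by frame and reassemble the watermarked
    signal A'; the first N/2 segments of each frame stay untouched and the
    last N/2 carry the watermark bits."""
    segments = frame_and_segment(signal, K, P, N)
    half = N // 2
    frames_out = []
    for i in range(P):
        # regenerate the frame watermark from the first N/2 segments
        c1 = [sum(abs(pseudo_zernike_moment(segment_dct_lowfreq(segments[i, j], m1), n, m))
                  for m in m_list) for j in range(half)]
        bits = generate_watermark(c1, half)
        out = [segments[i, j] for j in range(half)]          # first half unchanged
        for t, j in enumerate(range(half, N)):
            coeffs = dct(segments[i, j], norm='ortho')
            block = coeffs[:m2 * m2].reshape(m2, m2)
            coeffs[:m2 * m2] = embed_in_segment(block, bits[t], n, m_list).ravel()
            out.append(idct(coeffs, norm='ortho'))           # watermarked second half
        frames_out.append(np.concatenate(out))
    return np.concatenate(frames_out)
```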
2. Voice content authentication:
(1) As in steps (1)–(4) of watermark generation and embedding, the voice signal to be authenticated A* is divided into P frames starting from its K-th sample, and each frame is divided into N segments; the i-th frame is denoted A*(i) (i = 1, 2, ..., P) and its j-th segment A*(i, j), 1 ≤ j ≤ N. A DCT is applied to A*(i, j) and the resulting coefficients are denoted D*(i, j). The DCT coefficients of the first N/2 segments of the i-th frame are denoted D_1*(i, j); their first m_1 × m_1 low-frequency coefficients are rearranged into a two-dimensional signal and the sum of the magnitudes of its n-th order pseudo-Zernike moments is computed, denoted C_1*(i, j), 1 ≤ j ≤ N/2. The mean of C_1*(i, j) over 1 ≤ j ≤ N/2 is computed, its most significant digit M_1*(i) is taken, and its binary representation W'(i) is the watermark regenerated for the i-th frame.
(2) The DCT coefficients of the last N/2 segments of the i-th frame are denoted D_2*(i, j); their first m_2 × m_2 low-frequency coefficients are rearranged into a two-dimensional signal and the sum of the magnitudes of its n-th order pseudo-Zernike moments is computed, denoted C_2*(i, j), N/2+1 ≤ j ≤ N. Let M_2*(i, j) be the most significant digit of C_2*(i, j); the extracted watermark W*(i) is computed from M_2*(i, j).
(3) The authentication sequence TA(i) is defined by comparing W'(i) with W*(i): TA(i) = 0 if they are identical, and TA(i) = 1 otherwise. TA(i) = 0 indicates that the content of the i-th frame is authentic; TA(i) = 1 indicates that the content of the i-th frame has been tampered with.
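A sketch of the per-frame authentication decision, operating on already-computed pseudo-Zernike magnitude sums and reusing generate_watermark from the earlier sketch; the extraction rule (parity of the most significant digit) mirrors the assumed embedding rule and is likewise an assumption.

```python
def extract_bit(c2_value):
    """Assumed extraction rule, mirroring the embedding sketch above:
    the parity of the most significant digit of the integer part."""
    return int(str(int(c2_value))[0]) % 2

def authenticate_frame(c1_sums, c2_sums):
    """TA(i) for one frame: 0 if the watermark W'(i) regenerated from the
    first N/2 segments matches the watermark W*(i) extracted from the last
    N/2 segments, 1 if the frame appears tampered."""
    w_regenerated = generate_watermark(c1_sums, len(c1_sums))   # W'(i)
    w_extracted = [extract_bit(c) for c in c2_sums]             # W*(i)
    return 0 if w_regenerated == w_extracted else 1
```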
The effectiveness of the method of the invention can be verified by the following performance evaluation:
1. Imperceptibility
A monophonic voice signal with a sampling rate of 22.05 kHz, a length of 1,024,078 samples and 16-bit quantization is used for the imperceptibility test. Fig. 6 gives the SNR values for three types of voice; the test results show that the proposed algorithm has good imperceptibility.
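The SNR reported in Fig. 6 is the standard signal-to-noise ratio between the original and the watermarked signal; a minimal sketch:

```python
import numpy as np

def snr_db(original, watermarked):
    """Signal-to-noise ratio in dB between the original and the watermarked
    voice signal, used here as the imperceptibility measure."""
    original = np.asarray(original, dtype=float)
    noise = np.asarray(watermarked, dtype=float) - original
    return 10.0 * np.log10(np.sum(original ** 2) / np.sum(noise ** 2))
```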
2. Robustness to common signal processing
The robustness of the proposed algorithm to common signal processing is tested with the bit error rate (BER), defined as BER = E / T, where E is the number of erroneous bits in the extracted watermark and T is the total number of watermark bits embedded in the voice signal. The smaller the BER, the stronger the robustness of the algorithm to common signal processing.
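A direct transcription of the BER definition:

```python
def bit_error_rate(extracted_bits, embedded_bits):
    """BER = E / T: the fraction of embedded watermark bits that are
    extracted incorrectly."""
    errors = sum(int(a != b) for a, b in zip(extracted_bits, embedded_bits))
    return errors / len(embedded_bits)
```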
Fig. 7 lists the BER values for an adult male voice after several common signal processing operations (the results for the other voice types are similar); it can be seen that the method of the invention is robust to common voice signal processing such as MP3 compression, low-pass filtering and resampling.
3. Malicious tampering localization
The watermarked voice signal shown in Fig. 1 was subjected to a muting attack and to a substitution attack. The attacked voice signals are shown in Fig. 2 and Fig. 3 respectively, and the corresponding tampering localization results in Fig. 4 and Fig. 5. In Fig. 4 and Fig. 5, frames with TA(i) = 1 mark the parts that were maliciously attacked, and frames with TA(i) = 0 mark the parts that were not. The localization results show that the method of the invention localizes malicious tampering effectively.
The above description of the preferred embodiment is quite specific. Those of ordinary skill in the art will appreciate that the embodiment described here is intended to help the reader understand the principles of the present invention, and that the scope of protection of the invention is not limited to such specific statements and embodiments.
Claims (1)
1. A pseudo-Zernike moment based voice content authentication method for distinguishing common signal processing operations from malicious attacks while effectively localizing malicious tampering, comprising the following steps:
(1) watermark embedding: starting from the K-th sample of the voice signal, the original voice signal A is divided into P frames and each frame is divided into N segments; the sum of the magnitudes of the n-th order pseudo-Zernike moments of the discrete cosine transform (DCT) low-frequency coefficients of the first N/2 segments of each frame is computed, the mean of the pseudo-Zernike moment magnitudes is obtained, and a watermark W is generated from the mean; the watermark is embedded into the last N/2 segments of each frame by quantizing the pseudo-Zernike moments of their DCT low-frequency coefficients, yielding the watermarked voice signal A';
(2) voice content authentication: similarly to the watermark embedding process, starting from the k_1-th sample of the voice signal to be authenticated A*, the signal is divided into P frames and each frame is divided into N segments; the sum of the magnitudes of the n-th order pseudo-Zernike moments of the DCT low-frequency coefficients of the first N/2 segments of each frame is computed, its mean is obtained, and a watermark W' is generated from the mean; the magnitudes of the n-th order pseudo-Zernike moments of the DCT low-frequency coefficients of the last N/2 segments of each frame are computed, and a watermark W* is extracted from these magnitudes; W* and W' are compared, and the positions where they differ are the positions at which the voice signal has been tampered with, thereby realizing the authentication of the authenticity and integrity of the voice content.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201210278724.3A CN102867513B (en) | 2012-08-07 | 2012-08-07 | Pseudo-Zernike moment based voice content authentication method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102867513A (en) | 2013-01-09
CN102867513B CN102867513B (en) | 2014-02-19 |
Family
ID=47446337
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201210278724.3A Expired - Fee Related CN102867513B (en) | 2012-08-07 | 2012-08-07 | Pseudo-Zernike moment based voice content authentication method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102867513B (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7280970B2 (en) * | 1999-10-04 | 2007-10-09 | Beepcard Ltd. | Sonic/ultrasonic authentication device |
CN101609675A (en) * | 2009-07-27 | 2009-12-23 | 西南交通大学 | A kind of fragile audio frequency watermark method based on barycenter |
Non-Patent Citations (3)
Title |
---|
Xiang Shi-jun et al., "Robust Audio Watermarking Based on Low-Order Zernike Moments", Digital Watermarking, Lecture Notes in Computer Science |
Wang Xiang-yang et al., "A pseudo-Zernike moments based audio watermarking scheme robust against desynchronization attacks", Computers and Electrical Engineering |
Xiong Yi-qun et al., "An Audio Zero-watermark Algorithm Combined DCT with Zernike Moments", International Conference on Cyberworlds 2008 |
Cited By (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103456308A (en) * | 2013-08-05 | 2013-12-18 | 西南交通大学 | Restorable ciphertext domain speech content authentication method |
CN103456308B (en) * | 2013-08-05 | 2015-08-19 | 西南交通大学 | A kind of recoverable ciphertext domain voice content authentication method |
CN106157972A (en) * | 2015-05-12 | 2016-11-23 | 恩智浦有限公司 | Use the method and apparatus that local binary pattern carries out acoustics situation identification |
CN105304091A (en) * | 2015-06-26 | 2016-02-03 | 信阳师范学院 | Voice tamper recovery method based on DCT |
CN105304091B (en) * | 2015-06-26 | 2018-10-26 | 信阳师范学院 | A kind of voice tamper recovery method based on DCT |
CN111213203A (en) * | 2017-10-20 | 2020-05-29 | 思睿逻辑国际半导体有限公司 | Secure voice biometric authentication |
CN111213203B (en) * | 2017-10-20 | 2021-03-02 | 思睿逻辑国际半导体有限公司 | Secure voice biometric authentication |
CN107886956A (en) * | 2017-11-13 | 2018-04-06 | 广州酷狗计算机科技有限公司 | Audio identification methods, device and computer-readable storage medium |
Also Published As
Publication number | Publication date |
---|---|
CN102867513B (en) | 2014-02-19 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20140219; Termination date: 20160807 |