CN113113019A - Voice library generating system and method - Google Patents
Voice library generating system and method
- Publication number
- CN113113019A (application number CN202110328947.5A)
- Authority
- CN
- China
- Prior art keywords
- voice
- module
- server
- data
- instruction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
- G10L15/30—Distributed recognition, e.g. in client-server systems, for mobile phones or network applications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/08—Speech classification or search
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/28—Constructional details of speech recognition systems
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/225—Feedback of the input speech
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Debugging And Monitoring (AREA)
Abstract
The invention discloses a voice library generating system and method, belonging to the technical field of voice library systems. The system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server: it collects voice data and transmits the collected data to the server, which compares and stores the data. The client is also connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.
Description
Technical Field
The invention relates to the technical field of voice library systems, in particular to a voice library generating system and a voice library generating method.
Background
With the development of voice recognition technology, digital equipment and multimedia technology, voice endpoint detection technology has become well developed. Voice endpoint detection is a technique for detecting voice segments in a continuous signal, and it can be combined with automatic voice recognition systems and voiceprint recognition systems. So that instructions can be issued to multiple devices directly by voice, the voice library needs to be further improved.
Disclosure of Invention
Embodiments of the invention provide a voice library generating system and method, which aim to solve the technical problem that voice libraries in the prior art need to be further improved.
The embodiments of the invention adopt the following technical scheme: a voice library generating system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server; voice data are collected by the voice recording system and transmitted to the server, which compares and stores the data. The client is connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.
Furthermore, the server is composed of a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module. The voice recording system is connected to the voice matching classification module, which classifies the voice data and instructions recorded by the voice recording system; the classified voice data are transmitted to the voice data repository for storage. A voice instruction sent by the client enters the voice database through the voice receiving module and is compared with the voice data in the voice data repository by the voice comparison module, so that the instruction corresponding to the matched entry is output through the instruction output end.
Furthermore, the server is also provided with an invalid voice library, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, so invalid voice data are kept in the invalid voice library within the voice database. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review the voice data in the invalid voice library for debugging.
Furthermore, an error feedback module is arranged in the server. When the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module, which makes it convenient to improve the server according to client requirements.
Furthermore, the voice recording system comprises a task deployment module and a plurality of recording ends. The task deployment module deploys tasks to the recording ends, the recording ends record voice data according to those tasks, and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.
Furthermore, each recording end is provided with a network uploading module, through which the sound data recorded by that recording end are transmitted to the server. This can greatly improve recording efficiency and allows different sound data to be recorded from various locations.
A voice library generation method comprises the following steps:
S1: The task deployment module deploys tasks to the plurality of recording ends.
S2: Each recording end records the specified voice data according to the tasks assigned by the task deployment module and uploads the voice data to the server through the network uploading module.
S3: The voice matching classification module matches and classifies the sound data; the classified sound data are transmitted to the voice data repository for storage, and unrecognized voice is transmitted to the invalid voice library.
S4: The client transmits the instruction sound to the voice comparison module through the voice receiving module, and the voice comparison module compares the instruction sound with the sound data in the voice data repository to obtain a matching instruction.
S5: The instruction sound is output through the instruction output end.
S6: When the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module.
The embodiments of the invention adopt at least one technical scheme that can achieve the following beneficial effects:
First, the server compares and stores data, the client inputs voice instructions into the server, and after the server has compared a voice instruction the instruction is output through the instruction output end. A voice instruction sent by the client enters the voice database through the voice receiving module and is compared with the voice data in the voice data repository by the voice comparison module, so that the instruction corresponding to the matched entry is output through the instruction output end.
Secondly, an invalid voice library is arranged in the system, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, so invalid voice data are kept in the invalid voice library within the voice database. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review the voice data in the invalid voice library for debugging.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:
FIG. 1 is a system architecture diagram of the present invention;
FIG. 2 is an architecture diagram of the voice recording system according to the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The technical solutions provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.
A voice library generating system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server; voice data are collected by the voice recording system and transmitted to the server, which compares and stores the data. The client is connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.
Preferably, the server is composed of a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module. The voice recording system is connected to the voice matching classification module, which classifies the voice data and instructions recorded by the voice recording system; the classified voice data are transmitted to the voice data repository for storage. A voice instruction sent by the client enters the voice database through the voice receiving module and is compared with the voice data in the voice data repository by the voice comparison module, so that the instruction corresponding to the matched entry is output through the instruction output end.
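The storage and comparison path described above can be illustrated with a minimal sketch. The patent does not specify how voice data are represented or compared, so everything here is an assumption made for illustration: samples are short feature vectors, the comparison uses cosine similarity with a fixed threshold, and the names `VoiceDataRepository`, `compare` and the instruction labels are hypothetical.

```python
from __future__ import annotations

import math
from dataclasses import dataclass, field
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


@dataclass
class VoiceDataRepository:
    """Hypothetical repository: instruction label -> stored feature vectors."""
    samples: dict[str, list[list[float]]] = field(default_factory=dict)

    def store(self, label: str, features: list[float]) -> None:
        # Classified voice data are filed under their instruction label.
        self.samples.setdefault(label, []).append(features)


def compare(repo: VoiceDataRepository, features: list[float],
            threshold: float = 0.9) -> Optional[str]:
    """Return the label of the most similar stored sample, or None if no
    sample reaches the (assumed) similarity threshold."""
    best_label, best_score = None, threshold
    for label, vectors in repo.samples.items():
        for stored in vectors:
            score = cosine_similarity(stored, features)
            if score >= best_score:
                best_label, best_score = label, score
    return best_label


repo = VoiceDataRepository()
repo.store("lights_on", [1.0, 0.0, 0.2])
repo.store("lights_off", [0.0, 1.0, 0.1])

matched = compare(repo, [0.9, 0.1, 0.2])     # close to the "lights_on" sample
unmatched = compare(repo, [0.5, 0.5, -1.0])  # not close to any stored sample
```

In this sketch a client instruction that resembles no stored sample yields `None`, leaving it to the caller to decide what the instruction output end should do in that case.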
Preferably, the server is further provided with an invalid voice library, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, so invalid voice data are kept in the invalid voice library within the voice database. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review and debug the voice data in the invalid voice library.
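One way to picture this routing is the sketch below, in which each recording goes either to the repository or to the invalid voice library depending on whether a classifier recognizes it. The classifier is a deliberately trivial stand-in, and the names `Recording`, `Stores`, `route` and `toy_classifier` are hypothetical; the patent does not describe how recognition is performed.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Recording:
    recording_id: str
    audio: bytes


@dataclass
class Stores:
    # label -> recordings that were recognized and classified
    repository: dict[str, list[Recording]] = field(default_factory=dict)
    # recordings the classifier could not recognize, kept for manual review
    invalid_library: list[Recording] = field(default_factory=list)


def route(recording: Recording,
          classify: Callable[[Recording], Optional[str]],
          stores: Stores) -> Optional[str]:
    """File a recording under its label, or in the invalid library if unrecognized."""
    label = classify(recording)
    if label is None:
        stores.invalid_library.append(recording)
    else:
        stores.repository.setdefault(label, []).append(recording)
    return label


def toy_classifier(recording: Recording) -> Optional[str]:
    # Assumption for illustration only: the first byte encodes the instruction.
    known = {1: "start", 2: "stop"}
    if not recording.audio:
        return None
    return known.get(recording.audio[0])


stores = Stores()
for rec in [Recording("r1", b"\x01abc"),
            Recording("r2", b"\x09xyz"),
            Recording("r3", b"")]:
    route(rec, toy_classifier, stores)

# IDs a manager would see when reviewing the invalid voice library.
pending_review = [r.recording_id for r in stores.invalid_library]
```

Keeping unrecognized recordings in a separate list means the repository only ever holds labeled data, which is the space-saving effect the paragraph above describes.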
Preferably, the server is provided with an error feedback module. When the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module, so that the server can be improved according to client requirements.
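A feedback module of this kind could be as simple as a log of mismatches that is later summarized. The sketch below assumes each report carries the instruction that was output and the instruction the client actually intended; the class name `ErrorFeedbackModule`, its methods and the example labels are hypothetical, since the patent does not define the feedback format.

```python
from __future__ import annotations

from collections import Counter
from dataclasses import dataclass, field


@dataclass
class ErrorFeedbackModule:
    """Hypothetical feedback log: (request id, output instruction, intended instruction)."""
    reports: list[tuple[str, str, str]] = field(default_factory=list)

    def report(self, request_id: str, output: str, intended: str) -> None:
        # Called when the output did not match what the client wanted.
        self.reports.append((request_id, output, intended))

    def confusion_counts(self) -> Counter:
        # Count how often each (output, intended) pair was reported.
        return Counter((out, exp) for _, out, exp in self.reports)


feedback = ErrorFeedbackModule()
feedback.report("req-1", "lights_off", "lights_on")
feedback.report("req-2", "lights_off", "lights_on")
feedback.report("req-3", "volume_up", "volume_down")

most_confused, count = feedback.confusion_counts().most_common(1)[0]
```

Summarizing reports by pair shows which instructions are most often mistaken for each other, which is one plausible input for improving the stored voice data.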
Preferably, the voice recording system is composed of a task deployment module and a plurality of recording ends. The task deployment module deploys tasks to the recording ends, the recording ends record voice data according to those tasks, and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.
Preferably, each recording end is provided with a network uploading module, through which the sound data recorded by that recording end are transmitted to the server. This can greatly improve recording efficiency and allows different sound data to be recorded from various locations.
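The deployment and upload flow can be sketched as follows. The patent does not say how tasks are divided among recording ends, so round-robin assignment is an assumption, as are the function names `deploy_tasks` and `record_and_upload`, the phrase list and the in-memory `server_inbox` that stands in for the network upload.

```python
from __future__ import annotations

from itertools import cycle


def deploy_tasks(phrases: list[str], recording_ends: list[str]) -> dict[str, list[str]]:
    """Assign phrases to recording ends in round-robin order (assumed policy)."""
    if not recording_ends:
        raise ValueError("at least one recording end is required")
    assignments: dict[str, list[str]] = {end: [] for end in recording_ends}
    for phrase, end in zip(phrases, cycle(recording_ends)):
        assignments[end].append(phrase)
    return assignments


def record_and_upload(assignments: dict[str, list[str]],
                      server_inbox: list[dict]) -> None:
    """Simulate each recording end recording its phrases and uploading them."""
    for end, phrases in assignments.items():
        for phrase in phrases:
            # Placeholder payload: a real recording end would attach captured audio.
            server_inbox.append({"end": end, "phrase": phrase, "audio": b""})


phrases = ["turn on", "turn off", "volume up", "volume down", "pause"]
assignments = deploy_tasks(phrases, ["end-A", "end-B"])

server_inbox: list[dict] = []
record_and_upload(assignments, server_inbox)
```

Here the server ends up with one upload per phrase, each tagged with the recording end that produced it, so the centralized processing step can tell where each sample came from.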
A voice library generation method comprises the following steps:
S1: The task deployment module deploys tasks to the plurality of recording ends.
S2: Each recording end records the specified voice data according to the tasks assigned by the task deployment module and uploads the voice data to the server through the network uploading module.
S3: The voice matching classification module matches and classifies the sound data; the classified sound data are transmitted to the voice data repository for storage, and unrecognized voice is transmitted to the invalid voice library.
S4: The client transmits the instruction sound to the voice comparison module through the voice receiving module, and the voice comparison module compares the instruction sound with the sound data in the voice data repository to obtain a matching instruction.
S5: The instruction sound is output through the instruction output end.
S6: When the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module.
The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.
Claims (7)
1. A voice library generating system, comprising a client, a voice recording system, a server and an instruction output end, wherein the voice recording system is connected to the server, voice data are collected by the voice recording system, the collected voice data are transmitted to the server, and data comparison and storage are performed by the server; the client is connected to the server, a voice instruction is input into the server by the client, and after being compared by the server the voice instruction is output through the instruction output end, thereby completing the voice instruction output.
2. A voice library generating system according to claim 1, wherein: the server comprises a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module; the voice recording system is connected to the voice matching classification module, and the voice data and instructions recorded by the voice recording system are correspondingly classified by the voice matching classification module; the classified voice data are transmitted to the voice data repository for storage; a voice instruction sent by the client is input into the voice database through the voice receiving module, and the voice instruction input by the client is compared with the voice data in the voice data repository by the voice comparison module, so that the instruction corresponding to the matched entry is output through the instruction output end.
3. A voice library generating system according to claim 1, wherein: the server is also provided with an invalid voice library, and the voice matching classification module and the voice data repository are connected to the invalid voice library; unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, and invalid voice data are input into the invalid voice library in the voice database, so that the space occupied by invalid voice data in the voice data repository can be reduced and a manager can periodically review the voice data in the invalid voice library for debugging.
4. A voice library generating system according to claim 1, wherein: the server is provided with an error feedback module, and when the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module, so that the server can be improved according to client requirements.
5. A voice library generating system according to claim 1, wherein: the voice recording system is composed of a task deployment module and a plurality of recording ends; tasks are deployed to the plurality of recording ends by the task deployment module, the plurality of recording ends record voice data according to the tasks deployed by the task deployment module, and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.
6. A voice library generating system according to claim 1, wherein: a network uploading module is arranged at each recording end, and the sound data recorded by the recording end are transmitted to the server through the network uploading module, so that the recording efficiency of the recording end can be greatly improved and different sound data can be recorded from different places.
7. A method for a voice library generating system according to claims 1-6, comprising the following steps:
S1: the task deployment module deploys tasks to the plurality of recording ends;
S2: each recording end records the specified voice data according to the tasks assigned by the task deployment module and uploads the voice data to the server through the network uploading module;
S3: the voice matching classification module matches and classifies the sound data, the classified sound data are transmitted to the voice data repository for storage, and unrecognized voice is transmitted to the invalid voice library;
S4: the client transmits the instruction sound to the voice comparison module through the voice receiving module, and the voice comparison module compares the instruction sound with the sound data in the voice data repository to obtain a matching instruction;
S5: the instruction sound is output through the instruction output end;
S6: when the sound data output for the client does not match what the client intended, feedback can be given through the error feedback module.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110328947.5A CN113113019A (en) | 2021-03-27 | 2021-03-27 | Voice library generating system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110328947.5A CN113113019A (en) | 2021-03-27 | 2021-03-27 | Voice library generating system and method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN113113019A | 2021-07-13 |
Family
ID=76712393
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110328947.5A Pending CN113113019A (en) | 2021-03-27 | 2021-03-27 | Voice library generating system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113113019A (en) |
Citations (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101000767A (en) * | 2006-01-09 | 2007-07-18 | 杭州世导科技有限公司 | Speech recognition equipment and method |
US20090210221A1 (en) * | 2008-02-20 | 2009-08-20 | Shin-Ichi Isobe | Communication system for building speech database for speech synthesis, relay device therefor, and relay method therefor |
CN101847406A (en) * | 2010-05-18 | 2010-09-29 | 中国农业大学 | Speech recognition query method and system |
CN102708858A (en) * | 2012-06-27 | 2012-10-03 | 厦门思德电子科技有限公司 | Voice bank realization voice recognition system and method based on organizing way |
CN203456091U (en) * | 2013-04-03 | 2014-02-26 | 中金数据系统有限公司 | Construction system of speech corpus |
CN103927006A (en) * | 2014-04-08 | 2014-07-16 | 弗徕威智能机器人科技(上海)有限公司 | Robot based information interaction system and method |
CN105206260A (en) * | 2015-08-31 | 2015-12-30 | 努比亚技术有限公司 | Terminal voice broadcasting method, device and terminal voice operation method |
CN109102807A (en) * | 2018-10-18 | 2018-12-28 | 珠海格力电器股份有限公司 | Personalized voice database creation system, voice recognition control system and terminal |
CN109389969A (en) * | 2018-10-29 | 2019-02-26 | 百度在线网络技术(北京)有限公司 | Corpus optimization method and device |
CN109471931A (en) * | 2018-11-22 | 2019-03-15 | 平安科技(深圳)有限公司 | Corpus collection method, device, computer equipment and storage medium |
CN109801628A (en) * | 2019-02-11 | 2019-05-24 | 龙马智芯(珠海横琴)科技有限公司 | A kind of corpus collection method, apparatus and system |
Non-Patent Citations (1)
Title |
---|
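Wang Nan (王楠): "Application of Corpora in Pharmaceutical English Vocabulary Teaching", Journal of Hubei University of Science and Technology (《湖北科技学院学报》) * |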
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11018885B2 (en) | Summarization system | |
WO2020238209A1 (en) | Audio processing method, system and related device | |
Havard et al. | Speech-coco: 600k visually grounded spoken captions aligned to mscoco data set | |
US9595255B2 (en) | Single interface for local and remote speech synthesis | |
US20200004878A1 (en) | System and method for generating dialogue graphs | |
Sangwan et al. | 'houston, we have a solution': using NASA apollo program to advance speech and language processing technology. | |
CN111798833A (en) | Voice test method, device, equipment and storage medium | |
WO2022074869A1 (en) | System and method for producing metadata of an audio signal | |
CN117762464A (en) | Cloud computing-based software operation and maintenance system and method | |
CN112734604A (en) | Device for providing multi-mode intelligent case report and record generation method thereof | |
CN113113019A (en) | Voice library generating system and method | |
CN101950564A (en) | Remote digital voice acquisition, analysis and identification system | |
KR102307249B1 (en) | Storage system of voice recording information based on blockchain | |
JP2545914B2 (en) | Speech recognition method | |
JP2005196020A (en) | Speech processing apparatus, method, and program | |
CN108170669A (en) | Power dispatching network command issuing method, system and voice recognition and verification unit module thereof | |
CN112270922B (en) | Automatic filling method and device for scheduling log | |
US10915715B2 (en) | System and method for identifying and tagging assets within an AV file | |
US11392639B2 (en) | Method and apparatus for automatic speaker diarization | |
CN113763949A (en) | Speech recognition correction method, electronic device, and computer-readable storage medium | |
US8831940B2 (en) | Hierarchical quick note to allow dictated code phrases to be transcribed to standard clauses | |
CN111914777B (en) | Method and system for identifying robot instruction in cross-mode manner | |
CN118438441A (en) | Intelligent voice management system of scenic spot self-service robot | |
CN111785260B (en) | Clause method and device, storage medium and electronic equipment | |
CN113066507B (en) | End-to-end speaker separation method, system and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | Application publication date: 20210713 |