CN113113019A - Voice library generating system and method

Info

Publication number: CN113113019A
Application number: CN202110328947.5A
Authority: CN
Inventors: 尤文杰; 邬锡敏
Original assignee: Shanghai Hongzhen Information Science & Technology Co ltd
Current assignee: Shanghai Hongzhen Information Science & Technology Co ltd
Priority date: 2021-03-27
Filing date: 2021-03-27
Publication date: 2021-07-13

Abstract

The invention discloses a voice library generating system and method, belonging to the technical field of voice library systems. The system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server: it collects voice data and transmits the collected data to the server, which compares and stores them. The client is also connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.

Description

Voice library generating system and method

Technical Field

The invention relates to the technical field of voice library systems, in particular to a voice library generating system and a voice library generating method.

Background

With the development of voice recognition technology, digital equipment and multimedia technology, voice endpoint detection has matured. Voice endpoint detection is a technique for detecting voice segments in a continuous signal, and it can be combined with automatic voice recognition systems and voiceprint recognition systems. For instructions to be issued directly by voice across multiple devices, however, the voice library needs further improvement.

Disclosure of Invention

The embodiment of the invention provides a system and a method for generating a voice library, which aim to solve the technical problem that the voice library in the prior art needs to be further improved.

The embodiment of the invention adopts the following technical scheme: a voice library generating system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server: it collects voice data and transmits the collected data to the server, which compares and stores them. The client is also connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.

Furthermore, the server is composed of a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module. The voice recording system is connected to the voice matching classification module, which classifies the voice data and instructions recorded by the voice recording system; the classified voice data are transmitted to the voice data repository for storage. A voice instruction sent by the client is input into the voice data repository through the voice receiving module, and the voice comparison module compares it with the voice data stored there, so that the corresponding matched instruction is output through the instruction output end.
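As an illustration only, the classify, store and compare flow described above can be sketched as follows. The patent does not say how voice data are represented or compared; this sketch assumes each recording has already been reduced to a small feature vector and uses cosine similarity with a threshold. The names `VoiceServer`, `classify_and_store`, `receive` and the instruction labels are hypothetical.

```python
import math
from dataclasses import dataclass, field


def cosine(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


@dataclass
class VoiceServer:
    # Voice data repository: instruction label -> stored feature vectors.
    repository: dict = field(default_factory=dict)

    def classify_and_store(self, label, features):
        """Stand-in for the voice matching classification module:
        file a recording under the instruction it belongs to."""
        self.repository.setdefault(label, []).append(features)

    def receive(self, features, threshold=0.9):
        """Stand-in for the voice receiving and voice comparison modules:
        return the best-matching instruction label, or None if nothing
        reaches the (assumed) similarity threshold."""
        best_label, best_score = None, threshold
        for label, samples in self.repository.items():
            for sample in samples:
                score = cosine(features, sample)
                if score >= best_score:
                    best_label, best_score = label, score
        return best_label  # would be handed to the instruction output end


server = VoiceServer()
server.classify_and_store("lights_on", [0.9, 0.1, 0.0])
server.classify_and_store("lights_off", [0.1, 0.9, 0.0])
matched = server.receive([0.8, 0.2, 0.0])
unmatched = server.receive([0.0, 0.0, 1.0])
```

Here `matched` is the label of the closest stored sample and `unmatched` is `None`, because the query is orthogonal to everything in the repository. A real system would replace the toy vectors with acoustic features or a recognizer.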

Furthermore, the server is also provided with an invalid voice library, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, and invalid voice data in the voice data repository are likewise moved into it. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review the voice data in the invalid voice library for debugging.
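The routing into the invalid voice library might look like the following minimal sketch. It assumes a hypothetical `recognize` function that returns an instruction label or `None`; the patent does not describe how recognition is performed, so the recognizer here simply reads a label attached to the recording.

```python
from dataclasses import dataclass, field


@dataclass
class Stores:
    repository: dict = field(default_factory=dict)       # label -> recordings
    invalid_library: list = field(default_factory=list)  # unrecognized recordings


def recognize(recording, known_labels):
    """Hypothetical recognizer: returns the recording's label if it is known."""
    label = recording.get("label")
    return label if label in known_labels else None


def route(recording, stores, known_labels):
    """Keep recognized recordings in the repository and send the rest
    to the invalid voice library, so they do not take up repository space."""
    label = recognize(recording, known_labels)
    if label is None:
        stores.invalid_library.append(recording)
        return False
    stores.repository.setdefault(label, []).append(recording)
    return True


stores = Stores()
known = {"lights_on", "lights_off"}
route({"label": "lights_on", "audio": b"\x01\x02"}, stores, known)
route({"label": None, "audio": b"\x00"}, stores, known)
```

Keeping the two stores separate means a manager can inspect `stores.invalid_library` on its own schedule without scanning the main repository.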

Furthermore, an error feedback module is arranged in the server. When the sound data output for the client does not match what the client intended, the client can report this through the error feedback module, so that the server can be improved according to the client's requirements.
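One possible shape for such feedback, sketched under the assumption that a report pairs the instruction the server returned with the one the client wanted. The class and field names are hypothetical, and the patent does not specify what a feedback record contains.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Feedback:
    returned: str  # instruction the server output
    expected: str  # instruction the client actually wanted


@dataclass
class ErrorFeedbackModule:
    reports: list = field(default_factory=list)

    def report(self, returned, expected):
        """Record a mismatch; correct outputs are not stored."""
        if returned != expected:
            self.reports.append(Feedback(returned, expected))

    def most_confused(self, n=3):
        """Most frequent (returned, expected) pairs, for a manager to review."""
        counts = Counter((f.returned, f.expected) for f in self.reports)
        return counts.most_common(n)


feedback = ErrorFeedbackModule()
feedback.report("lights_off", "lights_on")
feedback.report("lights_off", "lights_on")
feedback.report("volume_up", "volume_up")  # matches, so nothing is recorded
```

Aggregating reports by pair is a design choice of this sketch: it surfaces which instructions are most often confused, which is one way the server could be improved from client feedback.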

Further, the voice recording system comprises a task deployment module and a plurality of recording ends. The task deployment module deploys tasks to the recording ends, each recording end records voice data according to its deployed tasks, and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.

Furthermore, each recording end is provided with a network uploading module, through which the sound data it records are transmitted to the server. This greatly improves the recording efficiency of the recording ends and allows different sound data to be recorded from various locations.
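The task deployment and upload arrangement above can be sketched as follows. This is an assumption-laden illustration: tasks are taken to be phrases to record, they are distributed round-robin (the patent does not state a distribution policy), and the network upload is simulated by appending to an in-memory list. `RecordingEnd`, `deploy_tasks`, `site_a` and `site_b` are hypothetical names.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class RecordingEnd:
    name: str
    tasks: list = field(default_factory=list)

    def record_and_upload(self, server_inbox):
        """Record each assigned phrase and upload it.
        The upload is simulated by a list append instead of a network call."""
        for phrase in self.tasks:
            audio = f"<audio of '{phrase}' from {self.name}>"  # placeholder for real audio
            server_inbox.append({"phrase": phrase, "source": self.name, "audio": audio})


def deploy_tasks(phrases, ends):
    """Stand-in for the task deployment module: assign phrases round-robin."""
    for phrase, end in zip(phrases, cycle(ends)):
        end.tasks.append(phrase)


ends = [RecordingEnd("site_a"), RecordingEnd("site_b")]
deploy_tasks(["lights on", "lights off", "volume up"], ends)

server_inbox = []
for end in ends:
    end.record_and_upload(server_inbox)
```

After this runs, `server_inbox` holds one entry per deployed phrase, each tagged with the recording end it came from, which is what lets the server process recordings from different locations centrally.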

A voice library generation method comprises the following steps:

S1: The task deployment module deploys tasks to the plurality of recording ends.

S2: Each recording end records the specified voice data according to the tasks assigned by the task deployment module and uploads the voice data to the server through the network uploading module.

S3: The voice matching classification module matches and classifies the sound data; the classified sound data are transmitted to the voice data repository for storage, and unrecognized voice is transmitted to the invalid voice library.

S4: The client transmits the instruction sound to the voice comparison module through the voice receiving module, and the voice comparison module compares the instruction sound with the sound data in the voice data repository to obtain the matched instruction.

S5: The instruction sound is output through the instruction output end.

S6: When the sound data output for the client does not match the client's intent, the client can give feedback through the error feedback module.
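The ordering of steps S1 to S6 can be traced in a single function. This is only a sketch of the control flow: plain text strings stand in for audio, an exact match on normalized text stands in for voice comparison, and a missing capture stands in for unrecognized voice. None of these choices come from the patent, and `run_pipeline` and its parameters are hypothetical.

```python
def run_pipeline(tasks, recordings, query, expected):
    """Walk through S1-S6 once.

    'recordings' maps a task to the text a recording end captured for it;
    a task with no capture is treated as unrecognized.
    """
    repository, invalid_library, feedback = {}, [], []

    # S1: deploy tasks to the recording ends (one simulated end per task).
    deployed = list(tasks)

    # S2: each end records its task and uploads the result to the server.
    uploads = [(task, recordings.get(task)) for task in deployed]

    # S3: classify uploads; unrecognized ones go to the invalid voice library.
    for task, captured in uploads:
        if captured:
            repository[captured.lower().strip()] = task
        else:
            invalid_library.append(task)

    # S4: compare the client's instruction sound with the repository.
    matched = repository.get(query.lower().strip())

    # S5: output the matched instruction through the instruction output end.
    output = matched

    # S6: if the output is not what the client intended, file feedback.
    if output != expected:
        feedback.append({"query": query, "output": output, "expected": expected})

    return output, invalid_library, feedback


output, invalid, fb = run_pipeline(
    tasks=["lights_on", "lights_off"],
    recordings={"lights_on": "Turn on the lights"},
    query=" turn on the lights ",
    expected="lights_on",
)
```

In this run the query matches the one stored recording, the task without a capture lands in the invalid list, and no feedback is filed because the output equals the expected instruction.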

The embodiment of the invention adopts at least one technical scheme which can achieve the following beneficial effects:

Firstly, the server compares and stores data: the client inputs a voice instruction into the server, and the server compares the voice instruction and outputs it through the instruction output end. Specifically, the voice instruction sent by the client is input into the voice data repository through the voice receiving module, and the voice comparison module compares it with the voice data in the voice data repository, so that the corresponding matched instruction is output through the instruction output end.

Secondly, an invalid voice library is arranged in the system, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, and invalid voice data in the voice data repository are likewise moved into it. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review the voice data in the invalid voice library for debugging.

Drawings

The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention and not to limit the invention. In the drawings:

FIG. 1 is a system architecture diagram of the present invention;

FIG. 2 is an architecture diagram of the voice recording system of the present invention.

Detailed Description

In order to make the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the specific embodiments of the present invention and the accompanying drawings. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.

The technical solutions provided by the embodiments of the present invention are described in detail below with reference to the accompanying drawings.

A voice library generating system comprises a client, a voice recording system, a server and an instruction output end. The voice recording system is connected to the server: it collects voice data and transmits the collected data to the server, which compares and stores them. The client is also connected to the server and inputs voice instructions into it. After the server has compared a voice instruction, the instruction is output through the instruction output end, which completes the voice instruction output.

Preferably, the server is composed of a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module. The voice recording system is connected to the voice matching classification module, which classifies the voice data and instructions recorded by the voice recording system; the classified voice data are transmitted to the voice data repository for storage. A voice instruction sent by the client is input into the voice data repository through the voice receiving module, and the voice comparison module compares it with the voice data stored there, so that the corresponding matched instruction is output through the instruction output end.

Preferably, the server is further provided with an invalid voice library, to which both the voice matching classification module and the voice data repository are connected. Unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library, and invalid voice data in the voice data repository are likewise moved into it. This reduces the space that invalid voice data occupy in the voice data repository, and a manager can periodically review and debug the voice data in the invalid voice library.

Preferably, the server is provided with an error feedback module. When the sound data output for the client does not match what the client intended, the client can report this through the error feedback module, so that the server can be improved according to the client's requirements.

Preferably, the voice recording system is composed of a task deployment module and a plurality of recording ends. The task deployment module deploys tasks to the recording ends, each recording end records voice data according to its deployed tasks, and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.

Preferably, each recording end is provided with a network uploading module, through which the sound data it records are transmitted to the server. This greatly improves the recording efficiency of the recording ends and allows different sound data to be recorded from various locations.

A voice library generation method comprises the following steps:

S5: The instruction sound is output through the instruction output end.

The above description is only an example of the present invention, and is not intended to limit the present invention. Various modifications and alterations to this invention will become apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present invention should be included in the scope of the claims of the present invention.

Claims

1. A voice library generating system, comprising a client, a voice recording system, a server and an instruction output end, wherein: the voice recording system is connected to the server, collects voice data and transmits the collected voice data to the server, and the server compares and stores the data; and the client is connected to the server and inputs a voice instruction into the server, and the voice instruction is output through the instruction output end after being compared by the server, thereby completing the voice instruction output.

2. The voice library generating system according to claim 1, wherein: the server comprises a voice matching classification module, a voice data repository, a voice receiving module and a voice comparison module; the voice recording system is connected to the voice matching classification module, which classifies the voice data and instructions recorded by the voice recording system; the classified voice data are transmitted to the voice data repository for storage; a voice instruction sent by the client is input into the voice data repository through the voice receiving module; and the voice comparison module compares the voice instruction input by the client with the voice data in the voice data repository, so that the corresponding matched instruction is output through the instruction output end.

3. The voice library generating system according to claim 1, wherein: the server is further provided with an invalid voice library; the voice matching classification module and the voice data repository are both connected to the invalid voice library; unrecognized voice recorded in the voice matching classification module is transmitted to the invalid voice library; and invalid voice data in the voice data repository are moved into the invalid voice library, so that the space occupied by invalid voice data in the voice data repository is reduced and a manager can periodically review the voice data in the invalid voice library for debugging.

4. The voice library generating system according to claim 1, wherein: the server is provided with an error feedback module, and when the sound data output for the client does not match the client's intent, feedback can be given through the error feedback module, so that the server can be improved according to the client's requirements.

5. The voice library generating system according to claim 1, wherein: the voice recording system is composed of a task deployment module and a plurality of recording ends; the task deployment module deploys tasks to the plurality of recording ends; the recording ends record voice data according to the tasks deployed by the task deployment module; and the recorded voice data are transmitted to the server for centralized processing and storage, so that voice data are collected and learned.

6. The voice library generating system according to claim 1, wherein: each recording end is provided with a network uploading module, and the sound data recorded by the recording end are transmitted to the server through the network uploading module, so that the recording efficiency of the recording end is greatly improved and different sound data can be recorded from different places.

7. A method for the voice library generating system according to any one of claims 1-6, comprising the following steps:

S1: the task deployment module deploys tasks to the plurality of recording ends;

S2: each recording end records the specified voice data according to the tasks assigned by the task deployment module and uploads the voice data to the server through the network uploading module;

S3: the voice matching classification module matches and classifies the sound data, the classified sound data are transmitted to the voice data repository for storage, and unrecognized voice is transmitted to the invalid voice library;

S4: the client transmits the instruction sound to the voice comparison module through the voice receiving module, and the voice comparison module compares the instruction sound with the sound data in the voice data repository to obtain the matched instruction;

S5: the instruction sound is output through the instruction output end;