CN109753906A

CN109753906A - Public place anomaly detection method based on domain migration

Info

Publication number: CN109753906A
Application number: CN201811594841.4A
Authority: CN
Inventors: 王琦; 李学龙; 林维
Original assignee: Northwestern Polytechnical University
Current assignee: Northwestern Polytechnical University
Priority date: 2018-12-25
Filing date: 2018-12-25
Publication date: 2019-05-14
Anticipated expiration: 2038-12-25
Also published as: CN109753906B

Abstract

The present invention relates to a public place anomaly detection method based on domain migration. A large number of virtual abnormal-event videos are created by simulation in a virtual world, which addresses the problem that abnormal events are diverse while data are scarce. The virtual data are then transferred to the real domain by a domain migration method, which improves the adaptability of the classification and detection network to real surveillance video and effectively improves the usability of the trained network.

Description

Public place anomaly detection method based on domain migration

Technical field

The invention belongs to the fields of computer vision and video surveillance. For public places under video surveillance, it detects abnormal behaviors occurring in the video, such as fighting and fleeing.

Background art

Cameras throughout urban public areas are constantly generating vast amounts of surveillance video. If abnormal behavior in the collected video could be detected automatically, this would have a strong preventive effect on public safety incidents. However, because abnormal behavior occurs far less frequently than normal behavior, and because abnormal behavior is diverse, detecting abnormal events is extremely difficult.

At present there are two kinds of methods for detecting abnormal behavior in public places. The first is the social-force-model-based method proposed by R. Mehran et al. in "R. Mehran, A. Oyama, and M. Shah, Abnormal crowd behavior detection using social force model, Computer Vision and Pattern Recognition, 2009. CVPR 2009. IEEE Conference on, pp. 935-942, 2009." It treats each pedestrian as a moving particle and the interactions between people as forces between particles, and detects abnormal behavior in the video by finding abnormal particle movement.

The second kind is based on optical flow, for example the method proposed in "Y. Yu, W. Shen, H. Huang, and Z. Zhang, Abnormal event detection in crowded scenes using two sparse dictionaries with saliency, Journal of Electronic Imaging, vol. 26, no. 3, pp. 033013, 2017." It obtains appearance and motion features of pedestrians by combining multi-scale histograms of optical flow with multi-scale histograms of gradients, and builds a dictionary by adding abnormal features to the traditional sparse model that contains only normal features. In addition, the saliency of a test sample is combined with its sparse reconstruction costs on the normal dictionary and the abnormal dictionary to measure how normal the test sample is.

These methods have their limitations: the particle model cannot capture people's motion features, and the optical-flow-based feature dictionary cannot guarantee that every abnormal behavior is present in the dictionary.

Summary of the invention

Technical problems to be solved

To overcome the shortcomings of the prior art, the present invention proposes a public place abnormal behavior detection method based on domain migration.

Technical solution

A public place anomaly detection method based on domain migration, characterized by the following steps:

Step 1: generate virtual abnormal data using existing virtual imaging products. The virtual abnormal data comprise data of different abnormal categories and of the normal category, with the same amount of data in each category;

Step 2: train a video classification network with the virtual abnormal data generated in step 1 to obtain a virtual abnormal data classification network;

Step 3: train a domain migration network using the generated virtual abnormal data and collected real data, and obtain the real-domain video data corresponding to the virtual abnormal video data. The domain migration network is an improved cycle-GAN; the improvement is that all 2D convolution structures in the cycle-GAN network are replaced with 3D convolution structures suited to video data. The 3D convolution structure is computed as follows:

where P, Q, R denote the length, width and height of the feature map output by the layer, and m denotes the number of feature maps output by the network. The convolution module W is finally applied to compute the corresponding feature map V in the next layer; b is the bias; i, j denote the j-th 3D convolution structure of the i-th layer; and x, y, z are the coordinates along the length, width and height;

Step 4: further train the virtual abnormal data classification network obtained in step 2 with the real-domain abnormal data obtained in step 3, using the same training process as in step 2, to obtain the real-domain abnormal video classification network;

Step 5: input the real abnormal data to be tested into the network model trained in step 4, use the softmax function to obtain the probability that the input video belongs to each abnormal category, and take the category with the maximum probability as the abnormality type of that video segment.

The video classification network in step 2 is a 3D ResNet or a spatio-temporal two-stream video classification network.
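The formula that Step 3 refers to is not reproduced in this text. A form consistent with the symbol definitions above, assuming the widely used 3D convolution formulation, would be the following. The activation function f, the layer subscript on P, Q, R, and the reading of P_i, Q_i, R_i as the extents over which W is summed are assumptions rather than statements from the source.

```latex
v_{ij}^{xyz} = f\!\left( b_{ij} + \sum_{m} \sum_{p=0}^{P_i-1} \sum_{q=0}^{Q_i-1} \sum_{r=0}^{R_i-1}
    w_{ijm}^{pqr} \, v_{(i-1)m}^{(x+p)(y+q)(z+r)} \right)
```

Here v_{ij}^{xyz} is the value at coordinate (x, y, z) of the j-th feature map V in layer i, w_{ijm}^{pqr} is the weight of W connecting it to the m-th feature map of the previous layer, and b_{ij} is the bias b.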

Beneficial effects

The public place anomaly detection method based on domain migration proposed by the present invention creates a large number of virtual abnormal-event videos by simulation in a virtual world, which addresses the problem that abnormal events are diverse while data are scarce. It then uses a domain migration method to transfer the virtual data to the real domain, which improves the adaptability of the classification and detection network to real surveillance video and effectively improves the usability of the trained network.

Brief description of the drawings

Fig. 1 is the model and data flow diagram of the invention;

Fig. 2 is the data flow diagram of the domain migration network.

Detailed description of the embodiments

The invention is further described below with reference to the embodiments and the drawings:

The present invention proposes a public-scene abnormal behavior detection method based on domain migration, to address the difficulty that the diversity and low frequency of abnormal behavior cause for abnormal behavior detection. The overall technical solution comprises the following steps:

1. Use existing virtual imaging products, such as games and CG, to create virtual scenes, tasks, models and abnormality-related actions, and record the abnormal behavior in the virtual world.

2. After capturing a large amount of recorded virtual video data, use these data to train a deep neural network for video classification. This network can effectively distinguish the abnormal behavior categories (such as fighting and fleeing) from the normal case in the virtual dataset.

3. Use some real-world surveillance videos; these videos need not contain any abnormal events. Using the mutual conversion relationship between these videos and the existing virtual videos, learn a domain migration network and perform unsupervised video-domain migration, transferring the virtual videos into a realistic real-video domain that closely resembles real scenes, thereby obtaining a large number of surveillance videos containing abnormal behavior.

4. Using the migrated videos as the dataset, retrain the classification neural network obtained in (2), to improve the network's adaptability after crossing domains, i.e. in the real data domain, and to improve its detection capability when applied to real video surveillance.

5. In practical use, surveillance video of a fixed time length is fed into the trained neural network in real time each time. For each captured short clip, the classification probabilities for each abnormal category and for the normal case are obtained, and the category with the highest probability is taken as the category of that clip. The detection result, i.e. which abnormal or normal category the clip belongs to, is used to determine whether abnormal behavior has occurred under surveillance.
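The deployment loop in item 5 can be sketched as follows. This is a minimal illustration only: `split_into_clips`, `detect`, the `classify` callable and the label names are hypothetical stand-ins, and the trained network is represented by any function that returns one probability per category.

```python
from typing import Callable, List, Sequence, Tuple


def split_into_clips(frames: Sequence, clip_len: int) -> List[Sequence]:
    """Cut a frame sequence into consecutive, non-overlapping clips of clip_len frames.

    A trailing remainder shorter than clip_len is dropped (an assumption; the
    source only says that clips have a fixed time length).
    """
    if clip_len <= 0:
        raise ValueError("clip_len must be positive")
    n_full = len(frames) // clip_len
    return [frames[k * clip_len:(k + 1) * clip_len] for k in range(n_full)]


def detect(
    frames: Sequence,
    clip_len: int,
    classify: Callable[[Sequence], List[float]],
    labels: List[str],
    normal_label: str = "normal",
) -> List[Tuple[str, bool]]:
    """Classify each fixed-length clip and flag it when the top category is not normal."""
    results = []
    for clip in split_into_clips(frames, clip_len):
        probs = classify(clip)  # one probability per entry in labels
        best = max(range(len(labels)), key=lambda k: probs[k])
        results.append((labels[best], labels[best] != normal_label))
    return results
```

Each returned pair holds the predicted category of a clip and a flag indicating whether that category is abnormal.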

The specific implementation steps of the invention are as follows:

Step 1, first prepare an unsupervised domain migration network. The network is of the cycle-GAN type described in "J. Zhu, T. Park, P. Isola, and A. A. Efros, Unpaired image-to-image translation using cycle-consistent adversarial networks, arXiv preprint, 2017." The difference is that some modifications are made so that it can handle video-domain data (cycle-GAN can only handle images). The modification is to replace all 2D convolution structures in the cycle-GAN network with 3D convolution structures suited to video data. The 3D convolution structure is computed as follows:

where P, Q, R denote the length, width and height of the feature map output by the layer, and m denotes the number of feature maps output by the network. The convolution module W is finally applied to compute the corresponding feature map V in the next layer. At the same time, relevant abnormal-event video data are simulated and recorded in the virtual world, shown as the cornered boxes in the figure, i.e. the virtual abnormal video data. These data cover different abnormal categories, such as fighting, chasing, fleeing, shooting, running and arrest, as well as normal-category data. The amount of data in each category is roughly the same. Finally, a portion of real video surveillance data is also needed to express what surveillance video in real scenes looks like; these data need not be labeled, and there is no restriction on the video content.
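To make the 3D convolution in Step 1 concrete, the following pure-Python sketch computes one output feature map from m input feature maps. It is an illustration under assumptions: no padding, stride 1, no activation function, and one kernel per input map; the function name `conv3d_valid` and the nested-list layout are hypothetical.

```python
from typing import List

Volume = List[List[List[float]]]  # a 3D feature map indexed as [x][y][z]


def conv3d_valid(inputs: List[Volume], kernels: List[Volume], bias: float = 0.0) -> Volume:
    """Compute one output feature map from m input maps (stride 1, no padding).

    inputs  -- the m feature maps of the previous layer, all of the same size
    kernels -- one P x Q x R kernel per input map
    bias    -- the offset b added at every output position
    """
    P, Q, R = len(kernels[0]), len(kernels[0][0]), len(kernels[0][0][0])
    X, Y, Z = len(inputs[0]), len(inputs[0][0]), len(inputs[0][0][0])
    out: Volume = []
    for x in range(X - P + 1):
        plane = []
        for y in range(Y - Q + 1):
            row = []
            for z in range(Z - R + 1):
                acc = bias
                # Sum over every input map m and every kernel offset (p, q, r).
                for vol, ker in zip(inputs, kernels):
                    for p in range(P):
                        for q in range(Q):
                            for r in range(R):
                                acc += ker[p][q][r] * vol[x + p][y + q][z + r]
                row.append(acc)
            plane.append(row)
        out.append(plane)
    return out
```

In PyTorch, which the simulation conditions name as the framework, the analogous layer is `torch.nn.Conv3d`, taking the place of `torch.nn.Conv2d` in an image network.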

Step 2, initialize a video classification network. This network may be a 3D ResNet, a spatio-temporal two-stream video classification network, or another existing video classification network. Here we use the existing 3D ResNet, from "K. Hara, H. Kataoka, and Y. Satoh, Learning spatio-temporal features with 3D residual networks for action recognition, Proceedings of the ICCV Workshop on Action, Gesture, and Emotion Recognition, vol. 2, no. 3, pp. 4, 2017". This network is a modified version of ResNet, a network structure proposed in 2015; its modification is the same as that described in step 1, i.e. the 2D convolution structures are replaced with 3D convolution structures.

Step 3, train a domain migration network using the collected virtual abnormal data and arbitrary real data, and obtain the real-domain video data corresponding to the virtual abnormal video data. As shown in Fig. 2, let S_real and R_real be the collected virtual abnormal data and the arbitrary real data, respectively. They are fed into the generator networks G_StoR and G_RtoS to obtain R_fake and S_fake, which are then passed into G_RtoS and G_StoR respectively to obtain videos corresponding to S_real and R_real. The consistency comparison and the judgments of the discriminators D_R and D_S improve the fidelity of the videos after domain migration.
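The data flow of Fig. 2 can be sketched with toy generators. This is only an illustration: videos are flattened to lists of numbers, the generators are arbitrary callables, and the L1 distance used for the consistency comparison is an assumption borrowed from the cited cycle-GAN paper. The names `cycle_pass` and `l1_distance` are hypothetical.

```python
from typing import Callable, List, Tuple

Video = List[float]  # a video flattened to pixel values, for illustration only


def l1_distance(a: Video, b: Video) -> float:
    """Mean absolute difference between two equally sized videos."""
    return sum(abs(u - v) for u, v in zip(a, b)) / len(a)


def cycle_pass(
    s_real: Video,
    r_real: Video,
    g_s_to_r: Callable[[Video], Video],
    g_r_to_s: Callable[[Video], Video],
) -> Tuple[Video, Video, float]:
    """Run one forward cycle in both directions and return the fakes and the consistency cost."""
    r_fake = g_s_to_r(s_real)   # virtual -> real domain
    s_fake = g_r_to_s(r_real)   # real -> virtual domain
    s_rec = g_r_to_s(r_fake)    # should reconstruct s_real
    r_rec = g_s_to_r(s_fake)    # should reconstruct r_real
    cycle_cost = l1_distance(s_rec, s_real) + l1_distance(r_rec, r_real)
    return r_fake, s_fake, cycle_cost
```

When the two generators are exact inverses of each other the consistency cost is zero; training pushes the generators toward that state while the discriminators judge how realistic `r_fake` and `s_fake` look.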

The whole process can be expressed by the following formula:

That is, when training the generators, the aim is to minimize the discriminator value and maximize the consistency; when training the discriminators, the discriminator value is maximized. The resulting R_fake can be regarded as the real-domain video data corresponding to the virtual abnormal videos in Fig. 1.
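The formula referred to above is not reproduced in this text. Assuming it follows the standard objective of the cited cycle-GAN paper, rewritten in the notation of Step 3, it would take the following form. The weight \lambda and the L1 norm come from that paper and are assumptions here.

```latex
\begin{aligned}
\mathcal{L}_{GAN}(G_{StoR}, D_R) &= \mathbb{E}_{r \sim R_{real}}\big[\log D_R(r)\big]
    + \mathbb{E}_{s \sim S_{real}}\big[\log\big(1 - D_R(G_{StoR}(s))\big)\big] \\
\mathcal{L}_{GAN}(G_{RtoS}, D_S) &= \mathbb{E}_{s \sim S_{real}}\big[\log D_S(s)\big]
    + \mathbb{E}_{r \sim R_{real}}\big[\log\big(1 - D_S(G_{RtoS}(r))\big)\big] \\
\mathcal{L}_{cyc} &= \mathbb{E}_{s \sim S_{real}}\big[\lVert G_{RtoS}(G_{StoR}(s)) - s \rVert_1\big]
    + \mathbb{E}_{r \sim R_{real}}\big[\lVert G_{StoR}(G_{RtoS}(r)) - r \rVert_1\big] \\
G_{StoR}^{*}, G_{RtoS}^{*} &= \arg\min_{G_{StoR},\, G_{RtoS}} \; \max_{D_R,\, D_S} \;
    \mathcal{L}_{GAN}(G_{StoR}, D_R) + \mathcal{L}_{GAN}(G_{RtoS}, D_S) + \lambda \, \mathcal{L}_{cyc}
\end{aligned}
```

In this form, maximizing consistency corresponds to minimizing \mathcal{L}_{cyc}.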

Step 4, use the real-domain abnormal data obtained in step 3 to further train the network obtained in step 2 for classification; the process is the same as in step 2, yielding the real-domain abnormal video classification network.

Step 5, during actual testing, the real abnormal data are input into the network model trained in step 4, the softmax function is used to obtain the probability that the input video belongs to each abnormal category, and the category with the maximum probability is taken as the abnormality type of that video segment.
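The softmax step in Step 5 can be sketched as follows. The raw network outputs (logits) and the category names are hypothetical inputs, and `predict_category` is an illustrative helper rather than part of the described method.

```python
import math
from typing import List, Tuple


def softmax(logits: List[float]) -> List[float]:
    """Convert raw network outputs into probabilities that sum to 1."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_category(logits: List[float], labels: List[str]) -> Tuple[str, float]:
    """Return the category with the highest softmax probability and that probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs[best]
```

For example, `predict_category([0.0, 2.0, 1.0], ["normal", "fighting", "fleeing"])` selects "fighting", because the second logit is the largest.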

The effect of the invention is further illustrated by the following simulation experiment.

1. Simulation conditions

The experiments were run on four GeForce GTX 1080 Ti GPUs under 64-bit Ubuntu 16.04 LTS, with Python 3.5.4, PyTorch 0.4.1 and CUDA 9.2 as the software environment.

2. Simulation content

First, training is carried out according to Fig. 1 using the simulated virtual video dataset and video data taken from some video datasets, finally yielding the real-domain abnormal video classification network. The network of "K. Hara, H. Kataoka, and Y. Satoh, Learning spatio-temporal features with 3D residual networks for action recognition, Proceedings of the ICCV Workshop on Action, Gesture, and Emotion Recognition, vol. 2, no. 3, pp. 4, 2017." is compared with our self-designed network, and models trained without domain-migrated data are compared with models trained with domain-migrated data. There are two evaluation criteria: the first is the video classification accuracy, and the second is the misclassification severity (MISE). The latter grades the abnormal categories by severity and then computes the severity after misclassification. The results are as follows:

Table 1: Test results of the four models on the real dataset

Accuracy (%)	3D ResNet	The present invention
Before domain migration	19.51	17.07
After domain migration	21.14	26.02

As can be seen from Table 1, the classification accuracy of the network of the present invention on the real dataset improves markedly after domain migration, from 17.07% to 26.02%. The domain migration technique proposed by the present invention also improves the performance of 3D ResNet to some extent, from 19.51% to 21.14%, giving it higher prediction accuracy for abnormal behavior detection in public places.

Table 2: Misclassification severity of the four models on the real dataset

MISE	3D ResNet	The present invention
Before domain migration	3.48	3.45
After domain migration	3.45	2.74

As can be seen from Table 2, our method after domain migration also has the lowest misclassification severity (2.74), confirming that the present invention yields less severe misclassifications in public place abnormal behavior detection.
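The two evaluation criteria reported in Tables 1 and 2 can be illustrated with a short sketch. The accuracy computation is standard. The text does not give the exact MISE formula, so `misclassification_severity` below is only one plausible reading: it averages the gap between the severity grades of the true and predicted categories over the misclassified clips. The severity grades in the usage note are hypothetical.

```python
from typing import Dict, List


def accuracy(y_true: List[str], y_pred: List[str]) -> float:
    """Percentage of clips whose predicted category matches the true category."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return 100.0 * correct / len(y_true)


def misclassification_severity(
    y_true: List[str], y_pred: List[str], severity: Dict[str, int]
) -> float:
    """Mean absolute gap in severity grade over misclassified clips.

    ASSUMPTION: this definition is illustrative; the source only states that
    categories are graded by severity and that the severity after
    misclassification is then computed.
    """
    gaps = [abs(severity[t] - severity[p]) for t, p in zip(y_true, y_pred) if t != p]
    return sum(gaps) / len(gaps) if gaps else 0.0
```

With hypothetical grades `{"normal": 0, "running": 1, "fighting": 3, "shooting": 5}`, confusing "fighting" with "running" costs 2, while confusing "shooting" with "normal" would cost 5, so a lower value means the errors that do occur are less serious.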

Claims

1. A public place anomaly detection method based on domain migration, characterized by the following steps:

Step 1: generate virtual abnormal data using existing virtual imaging products, the virtual abnormal data comprising data of different abnormal categories and of the normal category, with the same amount of data in each category;

Step 2: train a video classification network with the virtual abnormal data generated in step 1 to obtain a virtual abnormal data classification network;

Step 3: train a domain migration network using the generated virtual abnormal data and collected real data, and obtain the real-domain video data corresponding to the virtual abnormal video data; the domain migration network is an improved cycle-GAN, the improvement being that all 2D convolution structures in the cycle-GAN network are replaced with 3D convolution structures suited to video data; the 3D convolution structure is computed as follows:

where P, Q, R denote the length, width and height of the feature map output by the layer, and m denotes the number of feature maps output by the network; the convolution module W is finally applied to compute the corresponding feature map V in the next layer; b is the bias; i, j denote the j-th 3D convolution structure of the i-th layer; and x, y, z are the coordinates along the length, width and height;

Step 4: further train the virtual abnormal data classification network obtained in step 2 with the real-domain abnormal data obtained in step 3, using the same training process as in step 2, to obtain the real-domain abnormal video classification network;

Step 5: input the real abnormal data to be tested into the network model trained in step 4, use the softmax function to obtain the probability that the input video belongs to each abnormal category, and take the category with the maximum probability as the abnormality type of that video segment.

2. The public place anomaly detection method based on domain migration according to claim 1, characterized in that the video classification network in step 2 is a 3D ResNet or a spatio-temporal two-stream video classification network.