This paper presents a method that attains recoverable privacy protection for video content distribution. The method is based on the discrete wavelet transform (DWT), which generates scaling coefficients and wavelet coefficients. In our method, the scaling coefficients, which can be regarded as a low-resolution version of the original image, are used to produce a privacy-protected image. The wavelet coefficients, which can be regarded as the privacy information, are embedded into the privacy-protected image via an information hiding technique. Therefore, the original image can be recovered from the privacy-protected image by authorized viewers if necessary. The proposed method is analyzed through experiments from the viewpoints of the amount of embedded privacy information, the deterioration due to the embedding, and the computational time.
1. Introduction
Recently, video surveillance has received a lot of attention as a useful technology for crime deterrence and investigation and has been widely deployed in places such as airports, convenience stores, and banks. Video surveillance allows us to remotely monitor a live or recorded video feed, which often includes objects such as people. Although video surveillance contributes to realizing a secure and safe community, it also exposes the privacy of the objects in the video.
Over the past few years, many techniques for privacy protection in video surveillance systems have been proposed [1–7]. Newton et al. proposed an algorithm that protects the privacy of individuals in video surveillance data by deidentifying faces. Kitahara et al. proposed a video capturing system called Stealth Vision, which protects the privacy of objects by blurring or pixelizing their images. Wickramasuriya et al. protect an object's privacy based on the authority of either the object or the viewers. Boyle et al. considered face obscuring for privacy protection and discussed the effects of blurring and pixelizing. Crowley et al. proposed a method for privacy protection that replaces a socially inappropriate original image with a socially acceptable image using an eigen-space coding technique. Chinomi et al. proposed a privacy-protected video surveillance system called PriSurv, which adaptively protects objects' privacy based on their privacy policies, which are determined according to the closeness between objects and viewers.
Although these techniques fulfill some requirements of privacy protection, they also have a potential security flaw when the privacy-protected videos they produce are distributed on the Internet, because they provide no means of recovering the original videos from the privacy-protected videos. For example, suppose that a surveillance camera is installed along a school route and normally distributes a privacy-protected video on the Internet. When a crime occurs along the school route, the police want to observe the original image of a suspect in the privacy-protected video. In addition, when parents want to check on their children, they require the video as it is. Thus, in order to improve the security of a privacy-protected surveillance system, privacy protection that can recover the original image from the privacy-protected image is strongly required. We refer to such privacy protection as recoverable privacy protection.
Concerning recoverable privacy protection, several techniques have been proposed [8–11]. Dufaux and Ebrahimi and Dufaux et al. proposed a method based on transform-domain scrambling of regions of interest in a video sequence. A pioneering work was done by Zhang et al., who proposed a method that stores the original privacy information in the video using an information hiding technique and can recover it if necessary. However, the method has the drawback that a large amount of privacy information must be embedded to recover the original image, since the privacy information is obtained from the whole information of the object regions. Even if all the privacy information could be embedded using a data compression technique, doing so requires a huge computational load. Yu and Babaguchi proposed another method to realize recoverable privacy protection. Their method masks a real face (privacy information) with a virtual face (a newly generated face for anonymity). To deal with the huge payload problem of privacy information hiding, the method uses a statistical active appearance model (AAM) for privacy information extraction and recovery. It is shown that the method can embed the privacy information into video without affecting its visual quality while keeping its practical usefulness. However, the method requires a set of face images for training the statistical AAM.
In this paper, we propose a method for recoverable privacy protection based on the discrete wavelet transform (DWT). It is well known that DWT is a useful tool for multiresolution analysis. DWT generates scaling coefficients and wavelet coefficients. Since an image consisting of scaling coefficients can be regarded as a reduced-size version of its original, we refer to it as a low-resolution image. A low-resolution image is used for producing a privacy-protected image by expanding it to the size of the original image. Using the wavelet coefficients together with the low-resolution image, one can recover the original image. Therefore, the wavelet coefficients are regarded as privacy information. In order to prevent unauthorized viewers from recovering the original image, our method embeds the wavelet coefficients into the privacy-protected image via an information hiding technique. As a result, the original image can be recovered only by authorized viewers if necessary. Furthermore, it is shown that the amount of the privacy information of the object can be significantly reduced compared to Zhang's method. In addition, in contrast with Yu's method, our method requires no training beforehand.
Some results of this paper have already been reported in an earlier paper, where a method for bitmap images was developed. In this paper, we extend the method to deal with compression techniques such as JPEG and JPEG2000 for content distribution on the Internet and provide detailed algorithms for privacy information extraction, hiding, and recovery. Furthermore, we analyze the effectiveness of the proposed method through numerical experiments from the viewpoints of the amount of embedded privacy information, the deterioration due to the embedding, and the computational time.
This paper is organized as follows. In Section 2, we show the architecture of the proposed system. In Section 3, the discrete wavelet transform is introduced and the new image processing method for privacy information extraction is proposed. The privacy information hiding and recovering are described in Section 4. Experimental results on the proposed method are presented in Section 5. Conclusions are made in Section 6.
2. System Architecture
Figure 1 shows the architecture of the proposed system. In the encoding procedure, the object region is extracted using an adaptive Gaussian mixture model [16, 17], where the object region is defined as the smallest rectangular area containing a human body. Then, the privacy-protected image of the object region is produced by expanding the low-resolution image obtained by DWT. Next, the privacy information of the object is extracted. Here, the privacy information of the object region is defined as the information from which the person corresponding to the region can be identified, that is, the set of wavelet coefficients obtained by DWT for the region. Finally, the extracted privacy information and region information are embedded into the surveillance video using an information hiding scheme based on amplitude modulo modulation. The locations and the order of the embedded pixels are determined by a secret key.
Figure 1. Schematic diagram of the system.
When there are multiple object regions in a single image, the obscuration for privacy protection may differ from one object region to another. However, for simplicity, we do not deal with this issue in this paper; we have considered it in separate work.
In the decoding procedure, the privacy information and region information are extracted by the information hiding method using the secret key. Then the original image of the objects can be recovered by applying the encoding process in reverse order.
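As a minimal illustration of the object region used in this section (the smallest rectangle containing the detected body), the following Python sketch computes a bounding box from a binary foreground mask. The mask is assumed to come from some background subtraction step, which is not implemented here; the function name `bounding_box` and the `(top, left, bottom, right)` convention are hypothetical and are not taken from the paper.

```python
from typing import List, Optional, Tuple


def bounding_box(mask: List[List[bool]]) -> Optional[Tuple[int, int, int, int]]:
    """Return (top, left, bottom, right) of the smallest rectangle that
    contains every foreground pixel, or None if the mask has no foreground.

    The mask is a hypothetical stand-in for the output of a background
    subtraction step (True = foreground).
    """
    # Rows that contain at least one foreground pixel, in ascending order.
    rows = [r for r, line in enumerate(mask) if any(line)]
    if not rows:
        return None
    # Column indices of all foreground pixels.
    cols = [c for line in mask for c, value in enumerate(line) if value]
    return rows[0], min(cols), rows[-1], max(cols)
```

The two corner coordinates returned here correspond to the region information that the system embeds together with the privacy information.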
3. Privacy Information Extraction
In our system, the original image of the object region is transformed into two sets of data: a low-resolution image and a set of wavelet coefficients. This process is carried out by using discrete wavelet transform. In what follows, first, the discrete wavelet transform is introduced and then, the proposed method is described.
3.1. Discrete Wavelet Transform
The discrete wavelet transform (DWT) is computed by successive lowpass and highpass filtering of discrete signal. We use a Haar discrete wavelet transform to extract privacy information. The Haar DWT of image is given by the following equations:
where , , . The sequences and , which correspond to the impulse responses of lowpass and highpass filters, respectively, are defined as follows:
is the scaling coefficients of level , which is extracted by the lowpass filter . The image composed of the scaling coefficients is referred to as the low-resolution image of level . , , and are, respectively, the wavelet coefficients of level in vertical, horizontal, and diagonal direction, which are extracted by the highpass filter . We refer to these wavelet coefficients as a set of wavelet coefficients of level . Figure 2 shows the result of level DWT, which is composed of a low-resolution image of level and a set of wavelet coefficients from level to level . As defined above, low-resolution image given by DWT can be regarded as a down sampling of the original image. Therefore, if we expand the low-resolution image to the size of the original image, a mosaic image could be obtained, in which the original information can be protected. If the level of DWT is large enough, the low-resolution image of the object region can protect the object's privacy, although the amount of the set of wavelet coefficients to be embedded becomes large. Figure 3 shows the results of low-resolution images, which are expanded to the original image size.
Figure 2. Result of level DWT.
Figure 3. Results of low-resolution images, which are expanded to the original image size. (a) original image; (b) result of level DWT; (c) result of level DWT.
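The decomposition and the mosaic-like expansion described in this subsection can be sketched in Python as follows. This is a sketch under stated assumptions, not the authors' implementation: it uses an averaging normalization (each filter tap has magnitude 1/2), which may differ from the filter definition in the paper, and the function names `haar_dwt`, `haar_idwt`, and `expand` as well as the labels of the three detail subbands are chosen here for illustration only.

```python
from typing import List, Tuple

Image = List[List[float]]


def _zeros(rows: int, cols: int) -> Image:
    return [[0.0] * cols for _ in range(rows)]


def haar_dwt(img: Image) -> Tuple[Image, Image, Image, Image]:
    """One level of 2-D Haar analysis on an image with even height and width.

    Returns (s, w_h, w_v, w_d): scaling coefficients and three detail
    subbands. Averaging normalization is an assumption of this sketch.
    """
    rows, cols = len(img) // 2, len(img[0]) // 2
    s, w_h, w_v, w_d = (_zeros(rows, cols) for _ in range(4))
    for m in range(rows):
        for n in range(cols):
            a, b = img[2 * m][2 * n], img[2 * m][2 * n + 1]
            c, d = img[2 * m + 1][2 * n], img[2 * m + 1][2 * n + 1]
            s[m][n] = (a + b + c + d) / 4    # block average (lowpass both ways)
            w_h[m][n] = (a - b + c - d) / 4  # difference between columns
            w_v[m][n] = (a + b - c - d) / 4  # difference between rows
            w_d[m][n] = (a - b - c + d) / 4  # diagonal difference
    return s, w_h, w_v, w_d


def haar_idwt(s: Image, w_h: Image, w_v: Image, w_d: Image) -> Image:
    """Invert haar_dwt: rebuild each 2x2 block from its four coefficients."""
    rows, cols = len(s), len(s[0])
    img = _zeros(2 * rows, 2 * cols)
    for m in range(rows):
        for n in range(cols):
            p, h, v, d = s[m][n], w_h[m][n], w_v[m][n], w_d[m][n]
            img[2 * m][2 * n] = p + h + v + d
            img[2 * m][2 * n + 1] = p - h + v - d
            img[2 * m + 1][2 * n] = p + h - v - d
            img[2 * m + 1][2 * n + 1] = p - h - v + d
    return img


def expand(s: Image, factor: int) -> Image:
    """Nearest-neighbour expansion: every coefficient becomes a
    factor x factor block, which yields the mosaic-like image."""
    return [[s[r // factor][c // factor] for c in range(len(s[0]) * factor)]
            for r in range(len(s) * factor)]
```

For a higher level, `haar_dwt` would be applied again to the returned scaling coefficients, and the low-resolution image of level k would be expanded by a factor of 2^k to reach the original size.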
3.2. DWT-Based Privacy Information Extraction
We extract the bounding box of the object in the surveillance video using the background subtraction method based on an adaptive Gaussian mixture model [16, 17]. The bounding box is referred to as an object region. We transform the object region into YCbCr color space, where Y is the luminance component, and Cb and Cr are the blue and red chrominance components, respectively. Since human eyes are far more sensitive to luminance than to chrominance, the sensitive privacy information is regarded as being included only in the Y image. Therefore, we apply the DWT-based method presented in Section 3.1 to Y images and produce the low-resolution image and the set of wavelet coefficients . When image is given by and level DWT is employed, and are defined as follows:
where and . Since is a one-dimensional array consisting of wavelet coefficients, we also express as . Let be the image obtained by expanding image to the size of image . The privacy-protected image is produced by replacing by and transforming the result back to the original color space. Finally, the set of wavelet coefficients and the coordinates of the top-left and bottom-right pixels of the object region are embedded by applying amplitude modulo modulation. Images are usually compressed, for example by JPEG, before transmission over the Internet. To protect the embedded data from the image compression, we embed the privacy information and region information into the frequency domain of the privacy-protected image after quantization. JPEG and JPEG2000 are used as compression formats in our method. Figure 4 shows the structure of the privacy-protected image compression and information hiding. The details of the embedding method are described in the next section.
Figure 4. The structure of the privacy-protected image compression and information hiding.
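The color-space step of this subsection can be sketched as follows, assuming that the source pixels are 8-bit RGB and that the luminance/chrominance space is the full-range YCbCr used by JPEG (JFIF). Both are assumptions of this sketch, as are the function names; `y_mosaic` stands for the value of the expanded low-resolution luminance image at the pixel in question.

```python
from typing import Tuple

Pixel = Tuple[float, float, float]


def rgb_to_ycbcr(r: float, g: float, b: float) -> Pixel:
    """Full-range RGB -> YCbCr conversion with the JFIF coefficients."""
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return y, cb, cr


def ycbcr_to_rgb(y: float, cb: float, cr: float) -> Pixel:
    """Inverse conversion; the result is not clamped to [0, 255]."""
    r = y + 1.402 * (cr - 128.0)
    g = y - 0.344136 * (cb - 128.0) - 0.714136 * (cr - 128.0)
    b = y + 1.772 * (cb - 128.0)
    return r, g, b


def protect_pixel(rgb: Pixel, y_mosaic: float) -> Pixel:
    """Replace only the luminance of a pixel and keep its chrominance."""
    _, cb, cr = rgb_to_ycbcr(*rgb)
    return ycbcr_to_rgb(y_mosaic, cb, cr)
```

Because only the luminance is replaced, the wavelet coefficients of the luminance image are all that is needed later to undo the protection; a practical implementation would also round and clamp the output values.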
4. Privacy Information Hiding and Recovering
Let the size of image be and let the level of DWT be . Then, the encoded data sequence which is embedded in privacy-protected image is generated by the set of wavelet coefficients . The process is shown in Algorithm 1.
Algorithm 1: Generation of the encoded data sequence.
Input: the set of wavelet coefficients and the quantization step size of the privacy information.
Step 1: Generate the quantized data sequence by quantizing the wavelet coefficients with the quantization step size, using the floor operation, that is, the largest integer that does not exceed its argument.
Step 2: Find the intervals of the quantized data sequence that consist of successive zeros, considering only intervals whose number of zeros exceeds a given threshold. Record the number of such intervals, the starting points of the intervals in ascending order, and the end points of the intervals in ascending order. For this calculation, we suppose .
Step 3: Scan the quantized data sequence from its beginning.
  If the current position is the starting point of one of the intervals: encode the data sequence of the interval with run length coding, add it to the output data sequence, and go to the next position after the interval.
  Else, if the current element is nonnegative: encode it to binary bits and add the binary bits to the output data sequence with a delimiter digit.
  Else (the current element is negative): encode its absolute value to binary bits and add the binary bits, sandwiched by the sign bit and a delimiter digit, to the output data sequence.
Output: the data sequence to be embedded.
We apply run length coding to the intervals consisting of successive zeros because, as shown in Figure 5, the histogram of the wavelet coefficients of an image is generally concentrated around zero with small variance.
Figure 5. Histogram of wavelet coefficients.
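The idea behind Algorithm 1 (quantize, then run-length code long runs of zeros and code the remaining values individually) can be sketched in Python as follows. This is an illustration under assumptions rather than the authors' exact encoder: the quantizer is assumed to be the floor of the coefficient divided by the step size, the output is a list of tokens instead of the bit-level format with sign bits and delimiter digits, and the names `quantize`, `encode_runs`, `decode_runs`, and `min_run` are hypothetical.

```python
import math
from typing import List, Tuple

Token = Tuple[str, int]  # ("Z", run_length) or ("V", value)


def quantize(coeffs: List[float], step: float) -> List[int]:
    """Assumed quantizer: largest integer not exceeding coefficient / step."""
    return [math.floor(c / step) for c in coeffs]


def encode_runs(seq: List[int], min_run: int) -> List[Token]:
    """Emit ("Z", n) for each run of more than min_run zeros and
    ("V", x) for every other element, in order."""
    tokens: List[Token] = []
    i = 0
    while i < len(seq):
        j = i
        while j < len(seq) and seq[j] == 0:
            j += 1
        run = j - i
        if run > min_run:
            tokens.append(("Z", run))   # long zero run: store its length only
            i = j
        else:
            tokens.append(("V", seq[i]))  # single value (may be a short-run zero)
            i += 1
    return tokens


def decode_runs(tokens: List[Token]) -> List[int]:
    """Inverse of encode_runs."""
    seq: List[int] = []
    for kind, value in tokens:
        if kind == "Z":
            seq.extend([0] * value)
        else:
            seq.append(value)
    return seq
```

Since most quantized wavelet coefficients are zero, the long-run tokens account for much of the reduction in the amount of data to be embedded.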
Next, the encoded data sequence is embedded into the frequency domain of the privacy-protected image after quantization via amplitude modulo modulation (AMM) according to the following equation:
where is the set of integers given by . And and are the frequency components at frequency before and after embedding, respectively, and the frequencies for embedding are described by a secret key. Therefore, only the viewer who has the secret key can extract the embedded information from the image. The embedded color component is in the order of , , . Finally, we can obtain a compressed privacy-protected image after the entropy coding of JPEG/JPEG2000.
In the recovering procedure, the privacy information and region information are extracted by taking the corresponding pixel values modulo 3, the modulus used by the amplitude modulo modulation. Then the original image of the object is obtained by the recovering process of Figure 1.
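One simple embedding of this kind can be sketched as follows. It is consistent with the modulo-3 congruence and the maximum change of 2 stated in Section 5.1, but it is not necessarily identical to the authors' embedding equation. In this sketch each selected integer coefficient is moved to the value in its block of three consecutive integers whose residue equals the symbol to be embedded. The use of Python's seeded `random` module as a stand-in for the secret key, the function names, and the assumption that the number of embedded symbols is known at extraction time are all hypothetical.

```python
import random
from typing import List

MODULUS = 3  # three residues, e.g. bit 0, bit 1, and a delimiter digit (assumed mapping)


def _positions(total: int, count: int, key: int) -> List[int]:
    """Key-dependent embedding positions (illustrative, not cryptographically secure)."""
    return random.Random(key).sample(range(total), count)


def embed(coeffs: List[int], symbols: List[int], key: int) -> List[int]:
    """Embed symbols (each in 0..2) into key-selected integer coefficients."""
    if len(symbols) > len(coeffs):
        raise ValueError("not enough coefficients to carry the data sequence")
    if any(s not in range(MODULUS) for s in symbols):
        raise ValueError("symbols must be 0, 1, or 2")
    out = list(coeffs)
    for pos, sym in zip(_positions(len(coeffs), len(symbols), key), symbols):
        # Python's % is nonnegative for a positive modulus, so this maps the
        # coefficient to the largest multiple of 3 not exceeding it, plus sym.
        out[pos] = out[pos] - (out[pos] % MODULUS) + sym
    return out


def extract(coeffs: List[int], count: int, key: int) -> List[int]:
    """Read the symbols back by taking residues modulo 3 at the same positions."""
    return [coeffs[pos] % MODULUS for pos in _positions(len(coeffs), count, key)]
```

A viewer without the key does not know which coefficients carry data or in which order, which is the property the scheme relies on; a deployed system would need a cryptographically sound way of deriving the positions from the key.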
5. Experimental Results
In this section, we evaluate the performance of the proposed method through several experiments. The video sequences used for the experiments are ice (), ice (), and deadline (). In the experiments, we apply JPEG for compression. An original image, a privacy-protected image with privacy information embedded, and a recovered image of ice are shown in Figures 6(a), 6(b), and 6(c), respectively. In a similar manner, the corresponding images of deadline are shown in Figure 7.
Figure 6. Video sequence: ice (), DWT level: , quantization step size: , compression: JPEG.
Figure 7. Video sequence: deadline (), DWT level: , quantization step size: , compression: JPEG.
For each video sequence, the average number of pixels of object regions per frame and the average number of bits for embedded data sequences per frame at different DWT levels are shown, respectively, in Tables 1 and 2.
Table 1. Average number of pixels of object regions per frame (pixel/frame).
Table 2. Average number of bits for embedded data sequences per frame (bit/frame).
From Tables 1 and 2, we can observe that the number of pixels of object regions and the amount of embedded data sequence of ice are about four times larger than those of ice . This is quite natural because the resolution of ice is four times larger than that of ice . We can also observe that the amount of embedded data sequences of deadline is larger than that of ice , whereas the numbers of pixels of object regions of deadline and ice are similar.
In the following, we consider the influence of the above values on the performance of the proposed method. We employ the following three measures for performance evaluation: API, PSNR, and processing time.
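API is an abbreviation of Average of Privacy Information and is defined as follows:
$$\mathrm{API} = \frac{\text{total number of bits of the embedded data sequences}}{\text{total number of pixels in the object regions}} \quad \text{(bit/pixel)}. \tag{5}$$
Namely, API is the average number of bits of the data sequences required for recovering one pixel in the object regions and is regarded as a measure of the amount of privacy information which should be embedded. API is also calculated by using the following equation, which is equivalent to (5):
$$\mathrm{API} = \frac{\text{average number of bits of the embedded data sequences per frame}}{\text{average number of pixels of the object regions per frame}}. \tag{6}$$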
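PSNR (Peak Signal-to-Noise Ratio) is used as a measure of the deterioration of the recovered image and is also used for evaluating the influence of the embedded data sequence on the privacy-protected image. PSNR between two images $f$ and $g$ is defined as follows:
$$\mathrm{PSNR} = 10 \log_{10} \frac{255^2}{\mathrm{MSE}} \quad \text{(dB)}, \tag{7}$$
where MSE (Mean Square Error) is defined by
$$\mathrm{MSE} = \frac{1}{MN} \sum_{i=1}^{M} \sum_{j=1}^{N} \bigl( f(i,j) - g(i,j) \bigr)^2 \tag{8}$$
for images of $M \times N$ pixels.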
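The two measures can be computed with a few lines of Python. The sketch below assumes 8-bit images (peak value 255) stored as equally sized lists of rows; the function names `mse` and `psnr` are chosen here for illustration.

```python
import math
from typing import List, Sequence


def mse(f: List[Sequence[float]], g: List[Sequence[float]]) -> float:
    """Mean square error between two images of identical size."""
    total = 0.0
    count = 0
    for row_f, row_g in zip(f, g):
        for x, y in zip(row_f, row_g):
            total += (x - y) ** 2
            count += 1
    return total / count


def psnr(f: List[Sequence[float]], g: List[Sequence[float]],
         peak: float = 255.0) -> float:
    """Peak signal-to-noise ratio in dB; infinite for identical images."""
    err = mse(f, g)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)
```

A larger PSNR means less deterioration; identical images have zero MSE, which this sketch reports as an infinite PSNR rather than dividing by zero.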
5.1. Evaluation of the Amount of Privacy Information and the Deterioration Due to Embedding
APIs of each video sequence for different levels of DWT under the condition are shown in Figure 8. From Figure 8, we can observe that API tends to be large as DWT level increases. However, API hardly increases when the level is larger than and does not exceed bit/pixel. Therefore, we could embed all the privacy information into three color channels , , and of the privacy-protected image, even if the level of DWT is large and the object region size is equal to the size of the whole image. On the other hand, using the method of , API becomes bit/pixel ( bit/pixel/channel channel) when the data to be embedded consist of whole information of the object regions.
Figure 8. API for the different level of DWT.
Next, we consider the deterioration of the privacy-protected image due to the privacy information embedding. The deterioration due to embedding can be estimated by (7) and (8). Since the amplitude modulo modulation in Section 4 uses congruence modulo 3, the absolute difference between the values before and after embedding in (8) is less than or equal to 2. Therefore, an upper bound of MSE can be calculated as follows:
$$\mathrm{MSE} \le 2^2 = 4.$$
By this inequality, we obtain
$$\mathrm{PSNR} \ge 10 \log_{10} \frac{255^2}{4} \approx 42.1 \ \text{dB}.$$
This result implies that the deterioration of the privacy-protected image due to the embedding of privacy information is small enough that we can ignore the influence of the embedding.
5.2. Evaluation of the Deterioration of the Recovered Image
Here, we evaluate the deterioration of the recovered image by the PSNR between the original image and the recovered image. Figures 9 and 10 show the PSNR of the recovered image for ice and deadline at different quantization step sizes and different DWT levels. From Figures 9 and 10, we observe that PSNR becomes small as the quantization step size becomes large. Almost all PSNR values are larger than 30 dB. Therefore, the proposed method can recover the image with low deterioration through an appropriate choice of DWT level and quantization step size. We can also observe that the PSNR of deadline is worse than that of ice . This is due to the fact that the embedded data sequence of deadline is larger than that of ice as shown in Table 2.
5.3. Evaluation of Computational Time
Computational time for generating recoverable privacy-protected image and that for recovering image are shown in Tables 3 and 4, respectively, for each video sequence at different DWT levels under the condition . From Tables 3 and 4, together with Tables 1 and 2, we observe that the influence of the resolution on computational time is dominant, whereas DWT level, the amount of embedded data sequence, and the number of pixels of object regions have an insignificant effect on the computational time.
Table 3. Average CPU time for generating recoverable privacy-protected image per frame (sec/frame).
Table 4. Average CPU time for recovering image per frame (sec/frame).
The generation of recoverable privacy-protected image consists of the following four processes: object extraction, expanding low-resolution image after DWT, JPEG compression, and privacy data embedding. As for the image recovering, we have the following three processes, that is, privacy data extraction, JPEG decompression, and IDWT after shrinking low-resolution image. The rate of each process in generating recoverable privacy-protected image and that for image recovering are shown in Figures 11 and 12, respectively. From Figures 11 and 12, we can observe that computational costs for object extraction, privacy data embedding, privacy data extraction, and image recovering are very small compared to the costs for image compression. Therefore, our proposed method can be applied for real time processing, provided that the computational time for compression can be small.
6. Conclusions
In this paper, we have presented a method that attains recoverable privacy protection for video content distribution. With the proposed method, all the privacy information can be embedded into the privacy-protected image even if the level of DWT is large and the object region size is equal to the size of the whole image. We have also shown that the proposed method recovers the original image from the privacy-protected image with low deterioration, and that the computational time for privacy protection and image recovery is small.
The proposed method is based on the idea that the privacy information needed for recovering video sequence is embedded in the video sequence itself (which is referred to as self-recoverable), and only authorized viewers can extract the privacy information. An alternate approach is that the privacy information for recovering video sequence is stored outside (e.g., in a server), and only authorized viewers can access the privacy information. Such a system would be more secure than self-recoverable system, although the system is inferior with respect to the convenience. It is desirable to develop the system that can deal with both methods for protecting privacy information.
Currently, we use the same DWT level for each object region in a single image. However, it is better to change the DWT level adaptively according to the size of the object region, since the permissible visible detail of each object is not identical. Realizing this function is part of our future work. In our current method, when JPEG2000 is applied for image compression, we have to calculate DWT twice: once for image compression using the 5–3 filter and once for privacy protection using Haar bases. If these two DWTs can be unified, the process of recoverable privacy protection becomes much simpler, and the computational time will be further reduced. The development of such methods is also left for future work.
This work was supported in part by a Grant-in-Aid for scientific research from the Japan Society for the Promotion of Science and by SCOPE from the Ministry of Internal Affairs and Communications, Japan.
EM Newton, L Sweeney, B Malin, Preserving privacy by de-identifying face images. IEEE Transactions on Knowledge and Data Engineering 17(2), 232–243 (2005)
J Wickramasuriya, M Datt, S Mehrotra, N Venkatasubramanian, Privacy protecting data collection in media spaces. Proceedings of the 12th ACM International Conference on Multimedia (ACM Multimedia '04), October 2004, New York, NY, USA, 48–55
M Boyle, C Edwards, S Greenberg, The effects of filtered video on awareness and privacy. Proceedings of the ACM Conference on Computer Supported Cooperative Work, December 2000, Philadelphia, Pa, USA, 1–10
JL Crowley, J Coutaz, F Bérard, Things that see. Communications of the ACM 43(3), 54–64 (2000)
K Chinomi, N Nitta, Y Ito, N Babaguchi, PriSurv: privacy protected video surveillance system using adaptive visual abstraction. Proceedings of the 14th International Multimedia Modeling Conference (MMM '08), January 2008, Kyoto, Japan, Lecture Notes in Computer Science 4903, 144–154
X Yu, K Chinomi, T Koshimizu, N Nitta, Y Ito, N Babaguchi, Privacy protecting visual processing for secure video surveillance. Proceedings of the International Conference on Image Processing (ICIP '08), October 2008, San Diego, Calif, USA, 1672–1675
F Dufaux, M Ouaret, Y Abdeljaoued, A Navarro, F Vergnenègre, T Ebrahimi, Privacy enabling technology for video surveillance. Mobile Multimedia/Image Processing for Military and Security Applications, April 2006, Kissimmee, Fla, USA, Proceedings of SPIE 6250
W Zhang, S-CS Cheung, M Chen, Hiding privacy information in video surveillance system. Proceedings of the International Conference on Image Processing (ICIP '05), September 2005, Genova, Italy 3, 868–871
X Yu, N Babaguchi, Privacy preserving: hiding a face in a face. Proceedings of the 8th Asian Conference on Computer Vision (ACCV '07), November 2007, Tokyo, Japan, Lecture Notes in Computer Science 4844, 651–661
G Li, Y Ito, X Yu, N Nitta, N Babaguchi, A discrete wavelet transform based recoverable image processing for privacy protection. Proceedings of the International Conference on Image Processing (ICIP '08), October 2008, San Diego, Calif, USA, 1372–1375
GK Wallace, The JPEG still picture compression standard. IEEE Transactions on Consumer Electronics 38(1), 18–34 (1992)
A Skodras, C Christopoulos, T Ebrahimi, The JPEG 2000 still image compression standard. IEEE Signal Processing Magazine 18(5), 36–58 (2001)
WEL Grimson, C Stauffer, R Romano, L Lee, Using adaptive tracking to classify and monitor activities in a site. Proceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 1998, Santa Barbara, Calif, USA, 22–29
C Stauffer, WEL Grimson, Learning patterns of activity using real-time tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 22(8), 747–757 (2000)