Panoramic human structure maintenance based on invariant features of video frames
© Chang et al.; licensee BioMed Central Ltd. 2013
Received: 2 August 2013
Accepted: 22 August 2013
Published: 5 September 2013
Panoramic photography has become a popular and commonly available feature on mobile handheld devices. In traditional panoramic photography, the human structure often becomes distorted if the person changes position in the scene or when the human structure is combined with the natural background. In this paper, we present an effective method for panorama creation that maintains the main structure of the human in the panorama. The proposed method uses automatic feature matching, and the energy map of seam carving is used to avoid overlapping the human with the natural background. The contributions of this proposal are an automated panorama creation method and a solution to the human ghost problem in panoramas, which maintains the structure of the human by means of an energy map. Experimental results show that the proposed system can be used effectively to compose panoramic photographs and maintain human structure in the panorama.
Keywords: ASIFT algorithm; Human structure maintenance; Panoramic creation
The generation of a panorama from a set of individual photos has been a useful and attractive research topic for several years. Although researchers initially focused on personal-computer-based solutions, much attention is now being directed to mobile platforms, which makes panorama creation a convenient and attractive application for users. For example, many recent smartphones are equipped with applications capable of generating even a 360° panorama of a scene. The panorama generation solution presented by Yingen Xiong [1] consumes less processing time because the processing is done in memory. Wang Meng [2] presented an approach to create a single-viewpoint, full-view panoramic photograph from an image sequence. Individually ordered frames extracted from a panning video sequence are used as the input, which makes both shooting and stitching simple. Going a step further, Wagner Daniel et al. [3] presented a method for the real-time creation and tracking of panoramic maps on mobile phones. In particular, the generated maps are accurate and allow drift-free rotation tracking. However, most current panorama generation technologies target natural landscapes. Hence, when human objects appear in the background, the resulting panorama may contain blurred human objects, because the structure of a human object cannot be detected very precisely via feature extraction, which in turn results in a low-quality panorama. With regular feature extraction methods, defining feature points on a human object is very difficult unless there are obvious feature points on the clothes. Therefore, in this paper we present our efforts to generate a panorama that shows the landscape and the human objects in the background without any blurring.
On the other hand, panoramic photography captures more information about the natural scenery and buildings that we want to photograph. Hence, panoramic photography is best suited to cases where the user wants more natural scenery in one picture. Panoramic photos can be created with commercially available image processing tools in several steps, by segmenting the human objects and combining the relevant background features from the source frames, but this is very time-consuming manual work and the results are not satisfactory. In the combining step, most of the images cannot be combined by simple manual methods even within the same scene, because camera lenses always introduce cylindrical distortion, which is difficult for the user to recognize in the source images. Therefore, the proposed method also includes an automated calibration mechanism, which reduces the steps and time required by manual methods.
In summary, the main goal of our work is to develop a system for taking panoramic photographs that eliminates the blurring caused by combining the human objects in the frames with the background. The presented solution also reduces the number of steps compared with manual methods, allowing the user to obtain a panoramic photograph from video via our panning shooting method.
Feature extraction can be done by matching similar objects between different images. We can regulate and track objects using the information obtained from feature extraction. Even though the human eye can detect features in different images, this is not an easy task for computers. One well-known method for detecting features is the Scale-Invariant Feature Transform (SIFT) algorithm, as used by Matthew Brown [4] and Saeid Fazli [5]. The SIFT algorithm is a very robust method that can detect and describe local features in an image and can find the same features in different images. It uses the Difference of Gaussians (DoG) function and an image pyramid to find extreme values in different scale spaces. A linear least-squares solution and a threshold value are then used to keep high-contrast feature points and discard low-contrast ones, and the gradient direction and strength at each feature point are used to assign an orientation to it. Therefore, the feature information is very reliable and can be used to calibrate images using a calibration matrix.
Although the SIFT algorithm describes local features very robustly, its processing time is very large, and some of the features are not very important or apparent in the image. To reduce this cost, Yingen Xiong [1] and Zhengyou Zhang [6] presented the Speeded Up Robust Features (SURF) algorithm, which is faster than the SIFT algorithm. However, it extracts fewer features than the SIFT algorithm.
In recent years, panorama creation has attracted many researchers around the world, who have developed very robust solutions. Matthew Brown et al. [4] used the SIFT algorithm for feature matching in source images that, in their research, were not ordered. Hemant B. Kekre et al. [7] presented a panorama generation approach that nullifies the effect of rotating partial images on the process of vista creation. Their method is capable of resolving the missing region in the vista caused by the rotation of the partial image parts used during vista creation. Image inpainting is used during the process to fill the missing region. This missing view regeneration method is also able to overcome the problem of missing views in the vista due to cropping, irregular boundaries of partial image parts, and errors in digitization. Helmut Dersch [8] presented open source panorama creation software that creates a panorama via the parameters of its functions. Wang Meng [2] presented an approach to create a single-viewpoint, full-view panoramic photograph. Song Baosen et al. [9] then presented further panorama generation research to enlarge the horizontal and vertical angles of view of an image.
To serve the fast-developing mobile device market, panorama creation solutions for mobile devices have been presented in recent years. Yingen Xiong et al. [1] proposed a fast method of panorama creation for mobile devices. To reduce the processing time, they used the default direction of photography instead of a calibration method. A smoothly varying affine stitching field, which is flexible enough to handle parallax while retaining the good extrapolation and occlusion handling properties of parametric transforms, was presented by Wen-Yan Lin et al. [10]. Their algorithm, which jointly estimates both the stitching field and the correspondence, permits the stitching of general motion source images, provided the scenes do not contain abrupt protrusions.
Human structure maintenance
Preserving the full human
To solve the problem of creating panoramas that contain humans, we use inpainting, because general panorama creation may produce a panorama with a blurred human object or a human object with wrong structural details. We therefore obtain the position of the human from the source images, then find the position of the largest human region and recover it into the panorama. The panoramic image usually has the largest dimensions (image height and width) and the most information. The complete human object is available in the panorama after the relevant parts from the source images are merged. Finally, since we have the information about the human, we can use panorama creation to produce the natural landscape without the human object, and then recover the largest human object into the empty region of the panoramic image.
Under normal circumstances, in the pictures or videos taken for panorama generation, human objects do not move within a short time, and the background also has similar regions in different source images. However, we cannot obtain the background information that is occluded by the human. Therefore, we could use patches from the surrounding region to fill the human structure. This concept is easy and fast, but it does not guarantee the structure in the repaired regions: the structure cannot be retained and the resulting panorama becomes unnatural. To maintain the structure in the repaired regions, we use the image inpainting method [11]. Image inpainting can retain the structure in an area specified by the user. The repair patches take the similarity of the background structure into account, and incorrect structure is filtered out by the inpainting method.
An important limitation of the inpainting used in the proposed method is that the background structure cannot be repaired very accurately in the regions to be repaired, because image inpainting has to choose the order in which regions are repaired according to the similarity of structure. Therefore, we treat the structure of the boundary as a distinguishing feature: the complete human is captured from the source images and recovered into the panoramic image.
After this step, the repaired structure around the human boundary is consistent, which prevents cluttered structure in the repaired regions. With image inpainting alone, the repaired regions of the panoramic image become disordered and unnatural, as shown in the example in Figure 5. The steps of the proposed method are presented in the following algorithm, and a sample result is shown in Figure 6.
Algorithm: Human panoramic creation - Preserving the Full Human
Data: Human source images
Result: Human inpainting image
- I. Select the source images and find the largest human region in each source image.
- II. Use the panoramic creation method to produce the panorama with a hole in the human region.
- III. Apply dilation and expansion to the largest human region to obtain the correspondence information of the human boundary.
- IV. Differentiate the foreground and background regions of the human boundary.
- V. Use the image inpainting method to repair the human boundary.
- VI. Recover the human region into the panoramic image.
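The region selection and the dilation step of this algorithm can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes each human region is already available as a binary mask (a list of rows of 0/1 values), and the function names `dilate`, `boundary_band` and `largest_region_index`, as well as the square structuring element, are hypothetical choices made for the example.

```python
def dilate(mask, radius=1):
    """Binary dilation of a 0/1 mask with a square (2*radius+1) window."""
    h, w = len(mask), len(mask[0])
    out = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            if mask[y][x]:
                # Switch on every in-bounds neighbour inside the window.
                for dy in range(-radius, radius + 1):
                    for dx in range(-radius, radius + 1):
                        yy, xx = y + dy, x + dx
                        if 0 <= yy < h and 0 <= xx < w:
                            out[yy][xx] = 1
    return out


def boundary_band(mask, radius=1):
    """Pixels added by the dilation: a ring around the human region.

    Assumption: this ring is the human boundary that is later repaired
    by inpainting before the human is pasted back.
    """
    grown = dilate(mask, radius)
    return [[g - m for g, m in zip(grow_row, mask_row)]
            for grow_row, mask_row in zip(grown, mask)]


def largest_region_index(masks):
    """Index of the source mask that contains the most human pixels."""
    areas = [sum(map(sum, m)) for m in masks]
    return max(range(len(masks)), key=areas.__getitem__)
```

For a single human pixel in the middle of a 5×5 mask, `boundary_band` with `radius=1` returns the eight surrounding pixels; a larger radius gives the inpainting step a wider band to work with.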
Preserving the incomplete human
In most cases, the user cannot control the distance between the camera and the human. When the human is close to the camera and more background is required in the panorama, the human structure becomes incomplete in some frames. We cannot use the method described above for the full human, because the incomplete human may be as tall as the frames themselves. To solve the problem of the incomplete human, the proposed method uses an energy map and finds a seam.
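The energy map and seam idea can be illustrated with a short sketch in the spirit of seam carving [13]. The code below is an assumption-laden example rather than the authors' implementation: it takes a grayscale image as a list of rows, uses the simple gradient energy |dI/dx| + |dI/dy|, and finds one minimum-energy vertical seam by dynamic programming. The function names are hypothetical.

```python
def energy_map(gray):
    """Gradient energy |dI/dx| + |dI/dy| with clamped image borders."""
    h, w = len(gray), len(gray[0])
    energy = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            dx = gray[y][min(x + 1, w - 1)] - gray[y][max(x - 1, 0)]
            dy = gray[min(y + 1, h - 1)][x] - gray[max(y - 1, 0)][x]
            energy[y][x] = abs(dx) + abs(dy)
    return energy


def min_vertical_seam(energy):
    """Return one column index per row for the lowest-energy 8-connected seam."""
    h, w = len(energy), len(energy[0])
    cost = [row[:] for row in energy]
    # Accumulate the cheapest path cost from the top row downwards.
    for y in range(1, h):
        for x in range(w):
            lo, hi = max(x - 1, 0), min(x + 1, w - 1)
            cost[y][x] += min(cost[y - 1][lo:hi + 1])
    # Backtrack from the cheapest pixel in the bottom row.
    x = min(range(w), key=cost[h - 1].__getitem__)
    seam = [x]
    for y in range(h - 1, 0, -1):
        lo, hi = max(x - 1, 0), min(x + 1, w - 1)
        x = min(range(lo, hi + 1), key=cost[y - 1].__getitem__)
        seam.append(x)
    seam.reverse()
    return seam
```

Because a person usually produces strong gradients against the background, a seam found this way tends to run through flat background and leaves the human structure on one side of the cut.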
Some panorama creation methods use the average RGB value (or the value in another color space) over the overlapping region when combining source images. Averaging is an even-handed choice, but it may produce ghost effects in the panoramic image. It relies on robust panorama registration and accurate camera parameters, and there should not be any moving or prominent object in the source images. Averaging is therefore not reliable enough for this step of our method. Hence, the proposed method uses image stitching, with the steps given below.
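The ghost effect of averaging can be seen in a toy example. The sketch below is only an illustration under simple assumptions: two already-registered grayscale overlap regions are given as lists of rows, the person (value 200) has shifted by one pixel between frames, and the seam position is supplied by hand. The function names `blend_average` and `blend_seam` are hypothetical.

```python
def blend_average(a, b):
    """Average two registered overlap regions pixel by pixel."""
    return [[(pa + pb) / 2 for pa, pb in zip(row_a, row_b)]
            for row_a, row_b in zip(a, b)]


def blend_seam(a, b, seam):
    """Per row, take pixels left of the seam column from a and the rest from b."""
    return [row_a[:s] + row_b[s:] for row_a, row_b, s in zip(a, b, seam)]


# One image row; the person occupies columns 2-3 in frame A and 3-4 in frame B.
frame_a = [[10, 10, 200, 200, 10, 10]]
frame_b = [[10, 10, 10, 200, 200, 10]]

ghosted = blend_average(frame_a, frame_b)   # half-intensity fringes at columns 2 and 4
clean = blend_seam(frame_a, frame_b, [5])   # seam placed in the background
```

In the averaged row, the columns where only one frame shows the person take an intermediate value, which is the ghost. With the seam placed in the background, every output pixel comes from a single frame and the person keeps its structure.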
Matching position of images
To establish a complete panorama, one important factor is finding the correct structure in the source images. Most current methods mark the structure points in the images manually. This is very time-consuming when there is a large number of images. Another shortcoming of manual marking is the accuracy of the resulting matches, because the positions of the marks differ between images when there are visual differences or human errors in marking the structure points. However, identifying the correct structure or graphics is also a difficult goal for automatic computer processing. Therefore, many useful methods based on feature matching and structure identification have been proposed.
In some cases, where there are many similar features in one image, the ASIFT algorithm may still produce wrong feature matches. Hence, we use a simple concept to filter out wrong feature matches via their slope. For example, assume that frame A and frame B share a feature that is displaced by 10 pixels along the Y axis. In a subsequent frame C, we find the same feature as in frames A and B, but displaced by 5 pixels along the Y axis and 5 pixels along the X axis. In this case, a fixed displacement cannot be used to filter out wrong feature matches. Therefore, we use the slope S of the coordinates of every feature match: because the video is captured in the same scene with the same direction of camera motion, most correct feature matches between two frames correspond in the same way and share nearly the same slope. With the slope concept, we can filter out wrong feature matches between two frames and retain the true feature matching information. At the same time, we reduce the processing time for computing the calibration parameter matrix. The steps of the slope concept are given in the following algorithm.
Algorithm: Matching Position of Images
Data: Source frames
Result: Feature coordinate information
- I. Use the ASIFT algorithm to find matching information between two images. Let (X_A, Y_A) and (X_B, Y_B) denote the matching coordinates in sources A and B, and compute the slope S (Equation 2).
- II. If S ≠ 0, use the SAD method within a small range bounded by a 3×3 pixel block to compute the number of matches.
- III. Using the coordinates of the matches that share the most frequent value of S, compute the calibration matrix, then repeat for all source frames.
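The slope-based filtering can be sketched as a simple voting scheme. Since Equation (2) is not reproduced here, the slope definition below is an assumption: frame B is imagined to be drawn to the right of frame A (offset by the frame width), and S is the slope of the line joining the two matched points. The bin size and the function names are likewise hypothetical, and the SAD check of step II is omitted.

```python
from collections import Counter


def match_slope(point_a, point_b, width):
    """Slope of the line joining a match when B is placed to the right of A.

    Assumes both points lie inside frames of the given width, so the
    horizontal distance is always positive.
    """
    (xa, ya), (xb, yb) = point_a, point_b
    return (yb - ya) / (xb + width - xa)


def filter_by_slope(matches, width, bin_size=0.02):
    """Keep only the matches whose slope falls in the most frequent bin."""
    bins = [round(match_slope(pa, pb, width) / bin_size) for pa, pb in matches]
    dominant, _ = Counter(bins).most_common(1)[0]
    return [m for m, b in zip(matches, bins) if b == dominant]


matches = [
    ((50, 20), (40, 20)),   # consistent horizontal camera motion
    ((70, 35), (60, 35)),
    ((30, 80), (20, 80)),
    ((55, 10), (45, 50)),   # wrong match: large vertical jump
]
kept = filter_by_slope(matches, width=100)   # the last match is dropped
```

Quantizing the slope into bins is one simple way to find the dominant value; slopes close to a bin edge may be split between neighbouring bins, so a real implementation would likely need a more tolerant vote.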
To guarantee the structure and avoid distortion in the panorama, we have to fix how the panorama is photographed. In the proposed method, the scene is captured along a circular path to obtain a short source video. A set of frames is then extracted from the short video and used as the source frame set. After that, the ASIFT algorithm is used to obtain the coordinates of matching features in the source images. After this step, we compute the camera parameter matrix [6, 15] and the transformation matrix in order to compensate for the distortions between adjacent frames, even though the source videos were captured as smoothly as possible.
Moreover, according to the characteristics of the homography matrix, these points in three-dimensional space must lie on the same plane. Thus, the following algorithm clusters all feature points and calculates the best homography matrix.
Algorithm: Finding Optimal Homography Matrix
Data: Coordinates of matching information
Result: Homography matrix
- I. Cluster the feature points according to color features via the mean-shift algorithm.
  - (a) Transform the color space into CIELuv.
  - (b) Create a 2D array arrayLU indexed by the L and U dimensions of CIELuv.
  - (c) Perform the clustering process according to arrayLU and eliminate small regions by merging them with neighboring regions.
- II. For each group, calculate the homography matrix using the feature points within the group. At least four pairs of corresponding points are needed to solve for it; if a group does not contain enough points, it is neglected.
- III. The homography matrix of each group is fed into Equation (3) to calculate the value of H*m, and the deviation dev between the actual m’ and the calculated H*m is computed (Equation 7), where num is the number of matching feature pairs.
- IV. The optimal homography matrix is defined as the one with the smallest deviation dev.
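The last two steps, scoring each candidate homography and keeping the one with the smallest deviation, can be sketched as follows. Equation (7) is not reproduced here, so the deviation used below (the mean Euclidean distance between m' and the projected H*m over the num pairs) is an assumption, as are the function names. Estimating the candidate matrices themselves (step II) is outside the scope of this sketch.

```python
import math


def apply_homography(H, point):
    """Project a 2D point with a 3x3 homography given as nested lists."""
    x, y = point
    u = H[0][0] * x + H[0][1] * y + H[0][2]
    v = H[1][0] * x + H[1][1] * y + H[1][2]
    w = H[2][0] * x + H[2][1] * y + H[2][2]
    return u / w, v / w


def deviation(H, pairs):
    """Assumed form of dev: mean distance between m' and H*m over all pairs."""
    total = sum(math.dist(apply_homography(H, m), m_prime)
                for m, m_prime in pairs)
    return total / len(pairs)


def best_homography(candidates, pairs):
    """Return the candidate matrix with the smallest deviation."""
    return min(candidates, key=lambda H: deviation(H, pairs))
```

As a usage example, if the matched pairs are related by a pure shift of 5 pixels in X, a translation matrix `[[1, 0, 5], [0, 1, 0], [0, 0, 1]]` scores a deviation of 0 while the identity matrix scores 5, so the translation is selected.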
In this section, the results of the experiment are discussed. The input videos were captured without any supporting device for the camera (such as a tripod) to simulate a regular user with a regular camera. As mentioned previously, the main human did not move during the short capture time. Each video lasts about 12 to 16 seconds, which is the time we tried to keep for one circular sweep of the camera. We captured video only outdoors, because users often want to retain the natural landscape and the human in one image, although the proposed method can also be used for interior scenes.
In the photography environment, we place few restrictions on distance and brightness, because we transform video to panorama under the assumption that the camera does not move fast. When the camera moves too fast, the frames are heavily blurred and the resulting panorama is of low quality. Several selected experimental results are shown in Figure 14(A) to 14(P). Videos S1~S4 were captured outdoors, and the resulting panoramas clearly show the main person. Videos S5 and S7 were captured outdoors, and the resulting panoramas clearly show the two main persons. Videos S6 and S8 were also captured outdoors, and the resulting panoramas clearly show the three main persons.
This paper proposes a novel method for generating a panoramic image from a video captured with a simple digital camera by a novice user. It further details how to compose a human panoramic image that provides more scenery information in one image. The main concepts of the proposed method are the use of inpainting and of an energy map for human structure maintenance in panorama creation. The user does not need to tag or label the source images. We also combined the advantages of traditional panorama creation and image stitching in the proposed method, and the results in the experimental results section show that the proposed method is effective.
In panorama creation, attention must be paid to reducing the processing time. The feature matching step often requires the most attention, because the feature information is important for matching image positions and for computing the homography matrix. All source images therefore have to go through the same feature matching step, which increases the time complexity when the number of input source images is large. The authors are working on a solution to remove the empty black regions around the boundary of the panorama and on extending the proposed solution to mobile devices.
Shih-Ming Chang is a PhD student at the Department of Computer Science and Information Engineering of Tamkang University, Taiwan. He received his Master's degree from the Department of Computer Science and Information Engineering of Tamkang University, Taiwan, in 2009. His research interests are in the areas of computer vision, interactive multimedia, and multimedia processing.
Hon-Hang Chang is a PhD student currently studying at the Department of Computer Science and Information Engineering, National Central University (NCU), Taiwan (R.O.C.). He received his Master’s degree from the Department of Photonics and Communication Engineering of Asia University, Taiwan, in 2011. His research fields are image processing, information hiding, and watermarking.
Shwu-Huey Yen is currently an associate professor in Computer Science and Information Engineering (CSIE) department of Tamkang University, New Taipei City, Taiwan. She is also an author of over 50 journal papers and conference papers. Her academic interests are signal processing, multimedia processing and medical imaging.
Timothy K. Shih is a Professor of the Department of Computer Science and Information Engineering, National Central University, Taiwan. He was a Department Chair of the CSIE Department at Tamkang University, Taiwan. Dr. Shih is a Fellow of the Institution of Engineering and Technology (IET). In addition, he is a senior member of ACM and a senior member of IEEE. Dr. Shih also joined the Educational Activities Board of the Computer Society. His current research interests include Multimedia Computing and Distance Learning. Dr. Shih has edited many books and published over 440 papers and book chapters, as well as participated in many international academic activities, including the organization of more than 60 international conferences. He was the founder and co-editor-in-chief of the International Journal of Distance Education Technologies, published by the Idea Group Publishing, USA. Dr. Shih is an associate editor of the ACM Transactions on Internet Technology and an associate editor of the IEEE Transactions on Learning Technologies. He was also an associate editor of the IEEE Transactions on Multimedia. Dr. Shih has received many research awards, including research awards from National Science Council of Taiwan, IIAS research award from Germany, HSSS award from Greece, Brandon Hall award from USA, and several best paper awards from international conferences. Dr. Shih has been invited to give more than 30 keynote speeches and plenary talks in international conferences, as well as tutorials in IEEE ICME 2001 and 2006, and ACM Multimedia 2002 and 2007.
[1] Xiong Y, Pulli K: Fast image stitching and editing for panorama painting on mobile phones. In IEEE Comput Soc Conf Comput Vis Pattern Recogn Workshops (CVPRW). San Francisco, CA; 2010.
[2] Wang M: Panorama Painting: With a Bare Digital Camera. In Image and Graphics, 2009. ICIG'09. Fifth International Conference. Xi'an, Shanxi; 2009.
[3] Wagner D, Mulloni A, Langlotz T, Schmalstieg D: Real-time panoramic mapping and tracking on mobile phones. Waltham, MA: Virtual Reality Conference (VR); 2010.
[4] Brown M, Lowe DG: Automatic panoramic image stitching using invariant features. Int J Comput Vis 2007, 74(1):59–73. 10.1007/s11263-006-0002-3
[5] Fazli S, Pour HM, Bouzari H: Particle filter based object tracking with sift and color feature. Dubai: International Conference on Machine Vision; 2009.
[6] Zhang Z: A flexible new technique for camera calibration. IEEE Trans Pattern Anal Mach Intell 2000, 22(11):1330–1334. 10.1109/34.888718
[7] Kekre HB, Thepade SD: Rotation invariant fusion of partial image parts in vista creation using missing view regeneration. WASET Int J Electr Comput Eng Syst (IJECSE) 2008, 47: 660.
[8] Helmut D: Panorama Tools. Open source software for immersive imaging. International VR photography conference, 2007. http://webuser.fhfurtwangen.de/~dersch/IVRPA.pdf, Accessed June 15–20 2007
[9] Song B, Yongqing F, Wang J: Automatic panorama creation using multi-row images. Inf Technol J 2011, 10: 1977–1982.
[10] Wen-Yan L, Siying L, Yasuyuki M, Tian-Tsong N, Loong-Fah C: Smoothly varying affine stitching. Computer vision and pattern recognition (CVPR). Providence, RI: IEEE Conference; 2011.
[11] Criminisi A, Perez P, Toyama K: Object removal by exemplar-based inpainting. IEEE Comput Soc Conf Comput Vis Pattern Recogn 2004, 2: 721–728.
[12] Matthew B, Lowe DG: Recognising Panoramas. In Proceedings of the 9th International Conference on Computer Vision (ICCV2003). Nice, France; 2003:1218–1225.
[13] Avidan S, Shamir A: Seam carving for content-aware image resizing. ACM Transactions on Graphics (TOG) 2007, 26(3):10. 10.1145/1276377.1276390
[14] Morel J-M, Guoshen Y: ASIFT: A new framework for fully affine invariant image comparison. SIAM J Imag Sci 2009, 2(2):438–469. 10.1137/080732730
[15] Criminisi A, Reid I, Zisserman A: A plane measuring device. Image Vis Comput 1999, 17(8):625–634. 10.1016/S0262-8856(98)00183-8
[16] Sun J, Jia J, Tang C-K, Shum H-Y: Poisson matting. ACM Trans Graph 2004, 23(3):315–321. 10.1145/1015706.1015721
[17] Pérez P, Gangnet M, Blake A: Poisson image editing. ACM Trans Graph 2003, 22(3):313–318. 10.1145/882262.882269
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.