Keywords for State-of-the-art and future challenges in video scene detection: a survey: computer science and engineering, media data processing, video processing
By Manfred Del Fabro and Laszlo Böszörmenyi. In: Multimedia
Systems, Vol. 19, No. 5 (2013), pp. 427-454,
doi:10.1007/s00530-013-0306-4
Keywords: video segmentation; scene detection;
non-sequential video; survey.
Topics:
multimedia information systems; computer communication
networks; operating systems; data storage representation; data
encryption; computer graphics.
Industry sectors: electronics; IT & software;
telecommunications.
Abstract
In the last 15 years much effort has been made in the field of
segmentation of videos into scenes. We give a comprehensive
overview of the published approaches and classify them into
seven groups based on three basic classes of low-level features
used for the segmentation process: (1) visual-based, (2)
audio-based, (3) text-based, (4) audio-visual-based, (5)
visual-textual-based, (6) audio-textual-based and (7) hybrid
approaches. We try to make video scene detection approaches
easier to assess and compare by categorizing
the evaluation strategies used. This includes the size and type of
the dataset used as well as the evaluation metrics.
Furthermore, to help the reader make use of the survey,
we list eight possible application scenarios, including a dedicated
section for interactive video scene segmentation, and identify
those algorithms that can be applied to them. At the end,
current challenges for scene segmentation algorithms are
discussed. In the appendix the most important characteristics
of the algorithms presented in this paper are summarized in
table form.
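The seven groups named in the abstract correspond to the non-empty combinations of the three low-level feature classes (2^3 - 1 = 7). The short Python sketch below enumerates them. It is only an illustration of that counting argument: the names FEATURE_CLASSES, GROUP_NAMES, and feature_groups are hypothetical and do not come from the survey.

```python
from itertools import combinations

# The three basic classes of low-level features named in the abstract.
FEATURE_CLASSES = ("visual", "audio", "text")

# Group labels as listed in the abstract, keyed by feature combination.
# The mapping itself is an illustrative assumption, not code from the survey.
GROUP_NAMES = {
    frozenset({"visual"}): "visual-based",
    frozenset({"audio"}): "audio-based",
    frozenset({"text"}): "text-based",
    frozenset({"visual", "audio"}): "audio-visual-based",
    frozenset({"visual", "text"}): "visual-textual-based",
    frozenset({"audio", "text"}): "audio-textual-based",
    frozenset({"visual", "audio", "text"}): "hybrid",
}


def feature_groups():
    """Return (group name, feature combination) pairs for every non-empty subset."""
    groups = []
    for size in range(1, len(FEATURE_CLASSES) + 1):
        for combo in combinations(FEATURE_CLASSES, size):
            groups.append((GROUP_NAMES[frozenset(combo)], combo))
    return groups


if __name__ == "__main__":
    for name, combo in feature_groups():
        print(f"{name}: {', '.join(combo)}")
```

Enumerating by subset size happens to reproduce the order (1) to (7) used in the abstract: single-feature groups first, then the three pairs, then the hybrid group that uses all three classes.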
Cover Date: 2013-10-01;
Print ISSN: 0942-4962;
Online ISSN: 1432-1882;
Publisher: Springer Berlin Heidelberg.
Author Affiliations:
Institute of Information Technology (ITEC),
Alpen-Adria-Universität Klagenfurt, Klagenfurt, Austria
Classification of Scene Segmentation Approaches
Scene Segmentation Methods
Rule-Based Methods: 180-degree rule, action matching rule, film
tempo rule, shot/reverse shot rule, establishment/breakdown
rule
Graph-Based Methods
Stochastic-Based Methods
Hierarchical and Full vs. Partial Decomposition
Video Scene Segmentation: State-of-the-Art
Visual-Based Segmentation
Visual-Based Full Segmentation
Visual-Based Partial Segmentation
Visual Graph-Based Full Segmentation
Visual Stochastic-Based Full Segmentation
Audio-Based Segmentation
Audio-Based Full Segmentation
Audio-Based Partial Segmentation
Text-Based Full Segmentation
Audio-Visual Full Segmentation
Audio-Visual Graph-Based Full Segmentation
Audio-Visual Stochastic-Based Full Segmentation
Audio-Visual Stochastic-Based Partial Segmentation
Hybrid Full Segmentation
Visual-Textual Full Segmentation
Audio-Textual Full Segmentation
Hybrid Partial Segmentation
Audio-Textual Partial Segmentation
Evaluation of Video Segmentation Approaches
Datasets and Video Genres
Evaluation Methods
Strategies for Video Scene Segmentation Problems
Movies
Presented approaches for movies
Possible approaches for movies
TV series or sitcoms
News
Presented approaches for news videos
Possible approaches for news videos
Game and TV show videos
Presented approaches for game and TV show videos
Possible approaches for game and TV show videos
Sports videos
Presented approaches for sports videos
Possible approaches for sports videos
Single-shot videos
Possible approaches for single-shot videos
Black-and-white videos
Presented approaches for black-and-white videos
Possible approaches for black-and-white videos
Interactive scene segmentation
Future Challenges in Video Scene Detection
References