TY - JOUR
T1 - Dynamic mode decomposition via dictionary learning for foreground modeling in videos
AU - Ul Haq, Israr
AU - Fujii, Keisuke
AU - Kawahara, Yoshinobu
N1 - Funding Information:
This work was supported by JSPS KAKENHI, Japan (Grant Number 18H03287) and JST CREST, Japan (Grant Number JPMJCR1913).
Publisher Copyright:
© 2020 The Authors
PY - 2020/10
Y1 - 2020/10
N2 - Accurate extraction of foregrounds in videos is one of the challenging problems in computer vision. In this study, we propose dynamic mode decomposition via dictionary learning (dl-DMD), which extracts moving objects by separating a sequence of video frames into foreground and background information with a dictionary learned from block patches of the video frames. Dynamic mode decomposition (DMD) decomposes spatiotemporal data into spatial modes, each of whose temporal behavior is characterized by a single frequency and growth/decay rate, and can therefore be used to split a video into its foreground and background. In dl-DMD, DMD is applied to coefficient matrices estimated over a learned dictionary, which enables accurate estimation of the dynamical information in videos. Owing to this scheme, dl-DMD can analyze the dynamics of individual regions in a video based on the amplitudes and temporal evolution estimated over patches. Results on synthetic data show that dl-DMD outperforms standard DMD and compressed DMD (cDMD) based methods. In addition, an empirical evaluation of foreground extraction from videos in a publicly available dataset demonstrates the effectiveness of the proposed dl-DMD algorithm, which achieves performance comparable to that of state-of-the-art foreground extraction techniques.
AB - Accurate extraction of foregrounds in videos is one of the challenging problems in computer vision. In this study, we propose dynamic mode decomposition via dictionary learning (dl-DMD), which extracts moving objects by separating a sequence of video frames into foreground and background information with a dictionary learned from block patches of the video frames. Dynamic mode decomposition (DMD) decomposes spatiotemporal data into spatial modes, each of whose temporal behavior is characterized by a single frequency and growth/decay rate, and can therefore be used to split a video into its foreground and background. In dl-DMD, DMD is applied to coefficient matrices estimated over a learned dictionary, which enables accurate estimation of the dynamical information in videos. Owing to this scheme, dl-DMD can analyze the dynamics of individual regions in a video based on the amplitudes and temporal evolution estimated over patches. Results on synthetic data show that dl-DMD outperforms standard DMD and compressed DMD (cDMD) based methods. In addition, an empirical evaluation of foreground extraction from videos in a publicly available dataset demonstrates the effectiveness of the proposed dl-DMD algorithm, which achieves performance comparable to that of state-of-the-art foreground extraction techniques.
UR - http://www.scopus.com/inward/record.url?scp=85086822043&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85086822043&partnerID=8YFLogxK
U2 - 10.1016/j.cviu.2020.103022
DO - 10.1016/j.cviu.2020.103022
M3 - Article
AN - SCOPUS:85086822043
SN - 1077-3142
VL - 199
JO - Computer Vision and Image Understanding
JF - Computer Vision and Image Understanding
M1 - 103022
ER -