TY - GEN
T1 - ToonMeet
T2 - 35th IEEE International Conference on Tools with Artificial Intelligence, ICTAI 2023
AU - Chen, Chenhao
AU - Fukushima, Shogo
AU - Nakamura, Yugo
AU - Arakawa, Yutaka
N1 - Publisher Copyright: © 2023 IEEE.
PY - 2023
Y1 - 2023
N2 - In this paper, we propose ToonMeet, a hybrid framework for high-resolution and style-controllable online meeting toonification that ensures real-time operation speed. ToonMeet applies video frame interpolation to traditional portrait toonification pipelines, allowing for the synthesis of intermediate frames between adjacent toonified keyframes, significantly accelerating the overall process and saving computational resources. However, this approach brings a new problem, where prevailing flow-based video frame interpolation methods tend to cause more ghost and blur artifacts in toonified scenes compared to non-toonified scenes, especially when fast-moving objects exist. We study this previously undiscussed problem and explore its causes. To address this, we introduce a new dataset called TM3B (Toonified Multi-modal Meeting Behaviors), offering high-resolution and cross-platform multi-modal stylized meeting data of Japanese youth in various scenarios. Then, we fine-tune ToonMeet on these tailored data and the resulting model presents improved optical flow estimation ability on toonified videos. Extensive experiments demonstrate that ToonMeet can achieve great spatiotemporal performance and perform high-quality toonification of online meetings with real-time operation speed.
UR - http://www.scopus.com/inward/record.url?scp=85182402331&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=85182402331&partnerID=8YFLogxK
U2 - 10.1109/ICTAI59109.2023.00013
DO - 10.1109/ICTAI59109.2023.00013
M3 - Conference contribution
AN - SCOPUS:85182402331
T3 - Proceedings - International Conference on Tools with Artificial Intelligence, ICTAI
SP - 30
EP - 37
BT - Proceedings - 2023 IEEE 35th International Conference on Tools with Artificial Intelligence, ICTAI 2023
PB - IEEE Computer Society
Y2 - 6 November 2023 through 8 November 2023
ER -