Image Generation from Text and Segmentation

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

Although text-to-image generation has advanced rapidly in recent years, the controlled generation of images that represent a layout of multiple complex objects remains a challenging problem. Specifically, the GLIDE text-to-image model struggles with controlling the number of objects, scale conversion, unstable generation (objects mentioned in the text do not appear in the image), and unnatural object structure. In addition, generating images from textual information alone is so unconstrained that the relationship between text and image is difficult to learn, so training requires a large amount of data. In fact, current generative models are often trained on hundreds of millions of text-image pairs, and the large scale of the models themselves makes training enormously expensive. In this study, we propose and validate a new image generation method that takes both segmentation and text as input; that is, image generation from text is assisted by segmentation information. This is intended to address the issues that GLIDE had difficulty with: controlling the number of objects, scale conversion, unstable generation, and unnatural object structure. Furthermore, we aim to reduce the amount of training data required by making the relationship between textual information and the generated image easier to learn. In our experiments, the model achieved an FID-10k score of 17.12 after training on only about 120,000 samples, and we confirmed that it can handle complex layouts and maintain natural object structure even when the image contains a large number of objects.

Original language: English
Title of host publication: Proceedings - 2022 10th International Symposium on Computing and Networking Workshops, CANDARW 2022
Publisher: Institute of Electrical and Electronics Engineers Inc.
Pages: 206-211
Number of pages: 6
ISBN (Electronic): 9781665475327
DOI
Publication status: Published - 2022
Event: 10th International Symposium on Computing and Networking Workshops, CANDARW 2022 - Himeji, Japan
Duration: Nov 21 2022 → Nov 22 2022

Publication series

Name: Proceedings - 2022 10th International Symposium on Computing and Networking Workshops, CANDARW 2022

Conference

Conference: 10th International Symposium on Computing and Networking Workshops, CANDARW 2022
Country/Territory: Japan
City: Himeji
Period: 11/21/22 → 11/22/22

All Science Journal Classification (ASJC) codes

  • Computer Networks and Communications
  • Computer Science Applications
  • Hardware and Architecture
  • Information Systems and Management
