AACP: Aesthetics assessment of children's paintings based on self-supervised learning (2403.07578v1)

Published 12 Mar 2024 in cs.CV

Abstract: The Aesthetics Assessment of Children's Paintings (AACP) is an important branch of image aesthetics assessment (IAA) and plays a significant role in children's education. The task presents unique challenges, such as limited available data and the need for evaluation metrics from multiple perspectives. Previous approaches, however, rely on training with large datasets and then assign an aesthetics score to the image, which is not applicable to AACP. To solve this problem, we construct an aesthetics assessment dataset of children's paintings and a model based on self-supervised learning. 1) We build a novel dataset composed of two parts: the first part contains more than 20k unlabeled images of children's paintings; the second part contains 1.2k images of children's paintings, each annotated with eight attributes by multiple design experts. 2) We design a pipeline that includes a feature extraction module, perception modules, and a disentangled evaluation module. 3) We conduct both qualitative and quantitative experiments on the AACP dataset to compare our model's performance with five other methods. Our experiments show that our method accurately captures aesthetic features and achieves state-of-the-art performance.

Authors (6)
  1. Shiqi Jiang
  2. Ning Li
  3. Chen Shi
  4. Liping Guo
  5. Changbo Wang
  6. Chenhui Li
