Open Access

Style classification and visualization of art painting’s genre using self-organizing maps

Human-centric Computing and Information Sciences 2016, 6:7

DOI: 10.1186/s13673-016-0063-4

Received: 25 February 2016

Accepted: 21 March 2016

Published: 5 June 2016

Abstract

With the spread of digitized art paintings, research applying scientific methods to painting images has become active. This paper proposes a method for classifying painting styles by extracting various features from paintings. Global features are extracted through color-based statistical computation, and composition-based local features are extracted by segmenting the objects within each painting. Based on the extracted features, the paintings are classified by style using a self-organizing map (SOM), and the results are analyzed through visualization on the map. The experiments demonstrate the feasibility of the proposed method, and the objective features of paintings can contribute to research on art history and aesthetics.

Keywords

Art painting style; Image feature; Self-organizing map; Watershed segmentation; Classification

Background

In art history, the field that studies the history of art paintings, painting style has changed continuously over time, and individual factors, in particular changes in a painter's psychological state, also alter individual painting styles. Styles have developed differently in Western and Oriental painting, and they are crucial determinants in attributing a painting to its author. Many such factors affect the process of categorizing paintings, and these factors can be redefined as features.

Every painter has his or her own characteristics that are displayed in each painting. Art historians have analyzed and classified paintings based on the accumulated literature on aesthetics, but because such analysis reflects subjective opinion, it is limited in interpreting features that are not visually apparent. Therefore, there is a need to develop objective features to classify art paintings based on scientific methods [1, 2].

Recently, art paintings have been photographed with digital cameras and used in diverse ways [3]. There is also diverse research on the characterization of art paintings that applies signal processing theory, image processing, and machine learning algorithms to digitized paintings [4–7]. Such research extracts various features, such as color, brushstroke, and texture, to classify paintings by attributes such as painter and period.

Related works

Brushstroke can be a strong feature that displays the characteristics of a painter. Li et al. [8] automatically extracted and analyzed the brushstrokes of van Gogh. Brushstrokes were extracted through a combination of edge detection and clustering-based segmentation, and diverse feature values were calculated statistically from them, including the number and orientation standard deviation of brushstrokes in a neighborhood, size, length, broadness homogeneity, and straightness. His unique brushstroke style was thereby supported with scientific evidence. There are also numerous studies on the works of van Gogh through the analysis of his strongly rhythmic brushstrokes [9, 10]. Such brushstroke analysis, however, is limited to painters with a highly distinctive style like van Gogh.

There is also research on the classification of art paintings by style. Lombardi [11] extracted features based on light, line, and color and classified paintings using k-nearest neighbor. Visualization and analysis of the classification performance were conducted through unsupervised learning methods such as hierarchical clustering and self-organizing maps. Jafarpour et al. [12] proposed the stylistic analysis of paintings through dual-tree complex wavelet transforms and Hidden Markov Trees.

Zujovic et al. [13] proposed the classification method for five genres: abstract expressionism, cubism, impressionism, pop art, and realism. Features of paintings are extracted using the edge map as the gray-level feature and histogram as color feature in HSV space. The classification performance was analyzed from the extracted features using KNN, ANN, SVM, and AdaBoost.

Cetinic and Grgic [14] proposed a painter recognition method. Painters were recognized through various classifiers, including MLP and random forest, using information and color-based histograms as well as texture-based features from GLCM and DWT.

Shamir et al. [15] proposed a method of recognizing painters and schools of art, centering on Impressionism, Expressionism, and Surrealism. Because the sizes of the paintings vary, unlike conventional research, each painting was resized so that its smaller dimension was 600 pixels and then cropped to 600 × 600 to establish the dataset. Eleven types of features were calculated from each image, and a total of 3658 image descriptors were extracted. The most informative descriptors were selected using the Fisher score and used as feature vectors.

Li et al. [16] proposed an aesthetic visual quality assessment method for paintings. A total of 40 global and local features were extracted from characteristics including color, brightness, and composition. Since aesthetic visual quality assessment is highly subjective, the works of famous painters were scored visually in a survey. Based on the average score, the paintings were labeled as “high-quality” or “low-quality.” Naive Bayes and adaptive boosting were used as classifiers to analyze the performance. There is also research on using painting recognition in the field of mobile augmented reality [17].

Motivated by the research described above on extracting and analyzing various features of paintings, this paper classifies art works by style into expressionism, impressionism, post-impressionism, and surrealism. A tool for visualizing and analyzing the classification results is then proposed.

Paper structure

The rest of the paper is organized as follows. The "Feature extraction for art paintings" section describes the proposed feature extraction method, including color features and composition features. The "Classification and visualization" section describes the classification and visualization of painting styles. The "Experiments" section analyzes the experimental results of the proposed method. The "Conclusions" section summarizes the proposed approach and discusses future work.

Feature extraction for art paintings

Digital images of paintings differ in resolution and aspect ratio. To extract feature values from paintings of different sizes, the images must first be normalized to a fixed size. However, normalization can distort the aspect ratio and features such as texture. The definition of a photographed painting also varies with the environment in which it was captured. Clear, high-definition pictures provide more information and are suitable for extracting detailed texture features such as brushstrokes. Low-definition images, however, contain fewer of the high-frequency components that express fine detail, so they are less suitable for extracting texture information. Therefore, we focus on extracting feature values that are independent of the definition and resolution of the image, which vary with the environment in which the paintings are photographed.

Painters express their emotions in paintings using the basic elements of point, line, area, and color. In this paper, the features of paintings are divided into color and composition. Color is treated as a global feature reflecting the shades used or preferred by the painter, while composition is a local feature extracted from the regions obtained by segmenting the painting.

Color features

Color is important as a global feature of paintings. When we look at a painting, we tend to look at the overall color first. Then, we look into the composition and the detailed aspects of the painting. Each painter has his or her favorite colors, and color tones differ by painting style. In this paper, the color features are extracted through statistical computation over all pixels of the painting image. Painting images are generally stored in the RGB color model. However, RGB is not intuitive with respect to human visual perception. Therefore, we use the HSV color model, which is more intuitive, when extracting features on the use of color in paintings.

The averages of hue and saturation are calculated to capture the rough statistical characteristics of a painting. Hue shows which colors have been used, and saturation shows whether the painting is light or dark according to the amount of white pigment. The average hue and average saturation features are calculated by Eqs. 1 and 2.
$$\begin{aligned} f_{ 1 }=\frac{ 1 }{ MN } \sum _{ n }^{ }{ \sum _{ m }^{ }{ { I }_{ H }(m,n) } } \end{aligned}$$
(1)
$$\begin{aligned} f_{ 2 }=\frac{ 1 }{ MN } \sum _{ n }^{ }{ \sum _{ m }^{ }{ { I }_{ S }(m,n) } } \end{aligned}$$
(2)
where M and N are the numbers of rows and columns of the image, and \({ I }_{ H }(m,n)\) and \({ I }_{ S }(m,n)\) are the hue value and saturation value at the pixel \((m,n)\), respectively.
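As an illustration of Eqs. 1 and 2, the following is a minimal sketch rather than the authors' implementation. It assumes the image is given as rows of 8-bit RGB tuples and uses the standard-library `colorsys` conversion, which returns hue and saturation in [0, 1]; the function name `mean_hue_saturation` and the toy image are hypothetical.

```python
import colorsys

def mean_hue_saturation(rgb_image):
    """Return (f1, f2): mean hue and mean saturation over all pixels.

    rgb_image is a list of rows; each pixel is an (r, g, b) tuple in 0..255.
    Hue and saturation come from colorsys and lie in [0, 1].
    """
    hue_sum = 0.0
    sat_sum = 0.0
    count = 0
    for row in rgb_image:
        for r, g, b in row:
            h, s, _v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
            hue_sum += h
            sat_sum += s
            count += 1
    return hue_sum / count, sat_sum / count

# Toy 2 x 2 image (assumed data): pure red, pure green, white, black.
image = [[(255, 0, 0), (0, 255, 0)],
         [(255, 255, 255), (0, 0, 0)]]
f1, f2 = mean_hue_saturation(image)
```

Because both channels are already in [0, 1], the two averages need no further scaling to serve as normalized features.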
The number of dominant colors expresses the colorfulness of a painting. While some painters prefer combinations of few hues, others express various colors using many hues. The number of hues used is measured by the hue count, calculated as follows. From the image in the HSV color model, the hue component is divided into k bins to compute the histogram \({ h }_{ H }(i)\). The hue count feature is then extracted as shown in Eq. 3; \(k=20\) and \(c=0.1\) are used in this paper.
$$f_{ 3 }=number \, of \, \{ { i|h_{ H }(i)>c\cdot q\} }$$
(3)
where i is the bin index, q is the largest value of the histogram, and c is a constant (\(0<c\le 1\)).
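The hue count of Eq. 3 can be sketched as below, under the assumption that hue values are already scaled to [0, 1]. The helper names `hue_histogram` and `hue_count` and the sample hue list are hypothetical; only \(k=20\) and \(c=0.1\) come from the text.

```python
def hue_histogram(hues, k=20):
    """Count hue values (each in [0, 1]) into k equal-width bins."""
    hist = [0] * k
    for h in hues:
        # min() keeps h == 1.0 inside the last bin.
        hist[min(int(h * k), k - 1)] += 1
    return hist

def hue_count(hist, c=0.1):
    """Number of bins whose count exceeds c times the largest bin (Eq. 3)."""
    q = max(hist)
    return sum(1 for value in hist if value > c * q)

# Assumed toy data: two dominant hues and a few stray pixels of a third.
hues = [0.02] * 50 + [0.52] * 30 + [0.91] * 3
hist = hue_histogram(hues)
f3 = hue_count(hist)
```

In this toy case the third hue falls below the threshold \(c\cdot q\) and is not counted, which is the purpose of the constant c: small amounts of a hue do not add to the colorfulness measure.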
Then the hue distribution feature is extracted to show how the pixels of the image are distributed over the hue components. Each bin of the histogram \(h_{H}(i)\) above is divided by the total number of pixels in the image, and the normalized bin values are used as features, as shown in Eq. 4.
$$f_{ 3+i }=h_{ H }(i),\qquad i=1,2,\ldots ,20$$
(4)
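The normalization behind the hue distribution feature can be sketched as follows. This is an illustrative reading of the text, with a hypothetical function name and an assumed 20-bin histogram as input.

```python
def hue_distribution(hist):
    """Divide each histogram bin by the total pixel count (features f4..f23)."""
    total = sum(hist)
    return [count / total for count in hist]

# Assumed 20-bin hue histogram of a 100-pixel image.
hist = [0] * 20
hist[0], hist[10], hist[18] = 50, 30, 20
dist = hue_distribution(hist)
```

The resulting 20 values sum to 1, so each lies in [0, 1] regardless of the image resolution.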

Composition features

The color features described in the "Color features" section capture the overall visual characteristics of paintings. The composition of paintings can also differ by style. By analyzing the structural elements of a painting, its composition can be characterized through local features. To extract the structural elements, the objects within the image must be distinguished. Image processing offers diverse segmentation methods for isolating objects of interest. In this paper, each painting image is segmented into objects as follows.

Assuming that the objects within the image are expressed in similar colors, the dominant colors are first extracted from the image by k-means clustering of the HSV components. After magnification by cluster to extract the locations of the objects divided by color, the image is segmented by cluster using morphological operations and connected-component computation. The larger 50 % of the resulting components by size are then selected and used as markers for watershed segmentation. Watershed segmentation is carried out with the markers obtained for each cluster to finally segment the objects in the painting image. Figure 1 shows the process of color-based object segmentation in paintings using the watershed algorithm proposed in this paper.
Fig. 1 Illustration of the proposed watershed segmentation method
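The marker selection step of the segmentation described above can be sketched as follows. This is a simplified illustration, not the authors' pipeline: it assumes a cluster-label map (such as k-means on HSV pixels might produce) is already available, uses 4-connectivity, ranks all components together rather than per cluster, and leaves out the morphology and the watershed transform itself. All function names are hypothetical.

```python
from collections import deque

def connected_components(labels):
    """Split a 2-D cluster-label map into 4-connected components.

    Returns a list of (cluster_id, pixels), where pixels is a list of
    (row, col) positions that share the cluster id and touch each other.
    """
    rows, cols = len(labels), len(labels[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r0 in range(rows):
        for c0 in range(cols):
            if seen[r0][c0]:
                continue
            cluster = labels[r0][c0]
            seen[r0][c0] = True
            queue = deque([(r0, c0)])
            pixels = []
            while queue:
                r, c = queue.popleft()
                pixels.append((r, c))
                for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                    if (0 <= nr < rows and 0 <= nc < cols
                            and not seen[nr][nc] and labels[nr][nc] == cluster):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            components.append((cluster, pixels))
    return components

def select_markers(components, keep_fraction=0.5):
    """Keep the largest components (top half by pixel count) as markers."""
    ranked = sorted(components, key=lambda comp: len(comp[1]), reverse=True)
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]

# Assumed toy label map with cluster ids 0..2.
labels = [
    [0, 0, 0, 1, 1],
    [0, 0, 2, 1, 1],
    [0, 0, 0, 1, 1],
    [2, 0, 0, 0, 1],
]
markers = select_markers(connected_components(labels))
```

In the toy map the two single-pixel components of cluster 2 are discarded, so only the two large regions would seed a subsequent watershed step.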

Composition features consist of shape and color features computed per segment. For the shape features, the mass center coordinates, variance, and skewness are calculated for each of the three largest segments. The following equations are used to extract the 12 shape features.
$${ f }_{ 23+j }=\frac{ \sum _{ k\in { Region }_{ j } }^{ }{ { x }_{ k } } }{ { area\, of\, Region }_{ j } }$$
(5)
$${ f }_{ 26+j }=\frac{ \sum _{ k\in { Region }_{ j } }^{ }{ y_{ k } } }{ { area\, of\, Region }_{ j } }$$
(6)
$${ f }_{ 29+j }=\frac{ \sum _{ k\in { Region }_{ j } }^{ }{ \left[ { { (x }_{ k }-\bar{ x } ) }^{ 2 }+{ { (y }_{ k }-\bar{ y } ) }^{ 2 } \right] } }{ { area\, of\, Region }_{ j } }$$
(7)
$${ f }_{ 32+j }=\frac{ \sum _{ k\in { Region }_{ j } }^{ }{ \left[ { { (x }_{ k }-\bar{ x } ) }^{ 3 }+{ { (y }_{ k }-\bar{ y } ) }^{ 3 } \right] } }{ { area\, of\, Region }_{ j } }$$
(8)
where \(j\,(j=1,2,3)\) is the index of the largest three regions, \(({ x }_{ k },{ y }_{ k })\) are the normalized coordinates of a pixel, and \((\overline{ x } ,\overline{ y } )\) are the normalized coordinates of the center of mass of the corresponding region.
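The four shape features of Eqs. 5–8 can be sketched for a single region as follows. The text does not state how coordinates are normalized, so dividing by the image width and height is an assumption here, and the function name is hypothetical.

```python
def shape_features(pixels, width, height):
    """Mass center, variance and skewness of one region (Eqs. 5-8).

    pixels is a list of (x, y) pixel positions; coordinates are divided by
    the image width and height (an assumed normalization).
    """
    area = len(pixels)
    xs = [x / width for x, _ in pixels]
    ys = [y / height for _, y in pixels]
    cx = sum(xs) / area
    cy = sum(ys) / area
    variance = sum((x - cx) ** 2 + (y - cy) ** 2 for x, y in zip(xs, ys)) / area
    skewness = sum((x - cx) ** 3 + (y - cy) ** 3 for x, y in zip(xs, ys)) / area
    return cx, cy, variance, skewness

# Assumed toy region: a 2 x 2 square in the top-left corner of a 4 x 4 image.
region = [(0, 0), (1, 0), (0, 1), (1, 1)]
cx, cy, variance, skewness = shape_features(region, width=4, height=4)
```

For a symmetric region such as this square the skewness term vanishes, while an elongated or lopsided region would yield a nonzero value.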
For the color features, the average of each HSV component is calculated for each of the five largest segments. The following equations are used to extract the 15 color features of the segments (three components for each of five regions).
$${ f }_{ 35+j }=\frac{ \sum _{ (m,n)\in { Region }_{ j } }^{ }{ { I }_{ H }(m,n) } }{ { area\, of\, Region }_{ j } }$$
(9)
$${ f }_{ 40+j }=\frac{ \sum _{ (m,n)\in { Region }_{ j } }^{ }{ { I }_{ S }(m,n) } }{ { area\, of\, Region }_{ j } }$$
(10)
$${ f }_{ 45+j }=\frac{ \sum _{ (m,n)\in { Region }_{ j } }^{ }{ { I }_{ V }(m,n) } }{ { area\, of\, Region }_{ j } }$$
(11)
where \(j\,(j=1,2,3,4,5)\) is the index of the largest five regions.
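The per-region averages of Eqs. 9–11 can be sketched as below. The sketch assumes an HSV image stored as rows of (h, s, v) tuples in [0, 1] and a region given as a list of pixel positions; both the representation and the function name are hypothetical.

```python
def region_mean_hsv(hsv_image, pixels):
    """Average H, S and V over the pixels of one region (Eqs. 9-11).

    hsv_image[row][col] is an (h, s, v) tuple with components in [0, 1];
    pixels is a list of (row, col) positions belonging to the region.
    """
    area = len(pixels)
    sums = [0.0, 0.0, 0.0]
    for row, col in pixels:
        for channel, value in enumerate(hsv_image[row][col]):
            sums[channel] += value
    return tuple(total / area for total in sums)

# Assumed toy HSV image and a two-pixel region.
hsv_image = [
    [(0.0, 1.0, 1.0), (0.5, 0.5, 0.5)],
    [(0.5, 0.5, 1.0), (0.2, 0.2, 0.2)],
]
region = [(0, 1), (1, 0)]
mean_h, mean_s, mean_v = region_mean_hsv(hsv_image, region)
```

Applying the function to each of the five largest regions yields the 15 values \(f_{36}\) to \(f_{50}\).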
Table 1 lists the 50 features extracted in the "Color features" and "Composition features" sections. All feature values are normalized to the range from 0 to 1 regardless of the image definition. The 50 feature values are assembled into a vector, which is used as the input vector described in the "Classification and visualization" section.
Table 1 The proposed features for painting image

| Category | Feature | Characteristics | Meaning of feature |
|---|---|---|---|
| Global | \(f_{ 1 }\) | Color | Average hue for the whole image |
| Global | \(f_{ 2 }\) | Color | Average saturation for the whole image |
| Global | \(f_{ 3 }\) | Color | Number of quantized hues |
| Global | \(f_{ 4 }\)–\(f_{ 23 }\) | Color | Hue distribution |
| Local | \(f_{ 24 }\)–\(f_{ 26 }\) | Composition | Horizontal coordinate of the mass center |
| Local | \(f_{ 27 }\)–\(f_{ 29 }\) | Composition | Vertical coordinate of the mass center |
| Local | \(f_{ 30 }\)–\(f_{ 32 }\) | Composition | Mass variance for the segment |
| Local | \(f_{ 33 }\)–\(f_{ 35 }\) | Composition | Mass skewness for the segment |
| Local | \(f_{ 36 }\)–\(f_{ 40 }\) | Color | Average hue for the segment |
| Local | \(f_{ 41 }\)–\(f_{ 45 }\) | Color | Average saturation for the segment |
| Local | \(f_{ 46 }\)–\(f_{ 50 }\) | Brightness | Average brightness for the segment |

Classification and visualization

In this paper, the self-organizing map (SOM) proposed by Kohonen is used to classify styles from the features extracted in the "Feature extraction for art paintings" section. SOM is an unsupervised learning method in which the input space of the training samples is represented on a map through competitive learning. Its benefit is its usefulness for visualizing high-dimensional data in a low-dimensional view. This benefit is used here to classify the paintings, link them to the map for visual presentation, and analyze the correlation between painting styles.

Figure 2 illustrates the process of training the SOM and classifying with it using the feature values extracted in the "Feature extraction for art paintings" section. Feature values normalized between 0 and 1 are extracted from each training sample and assembled into a vector that is used as the SOM input. The SOM is trained with these input vectors. Once training is complete, the best matching unit (BMU) for each input is found, and each node is assigned a class number. At the same time, the list of painting images linked to each node is saved.
Fig. 2 Training and classification processes using SOM
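The training and BMU lookup outlined above can be sketched with a small SOM written from scratch. This is a generic textbook-style SOM under assumed settings (rectangular grid, Gaussian neighborhood, linearly decaying learning rate and radius, fixed random seed); the paper does not state its map size or training parameters, so every value and name below is hypothetical.

```python
import math
import random

def best_matching_unit(weights, x):
    """Return the (row, col) of the node whose weight vector is closest to x."""
    best, best_dist = None, float("inf")
    for r, row in enumerate(weights):
        for c, w in enumerate(row):
            dist = sum((wi - xi) ** 2 for wi, xi in zip(w, x))
            if dist < best_dist:
                best, best_dist = (r, c), dist
    return best

def train_som(data, rows, cols, epochs=50, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood."""
    rng = random.Random(seed)
    dim = len(data[0])
    weights = [[[rng.random() for _ in range(dim)] for _ in range(cols)]
               for _ in range(rows)]
    sigma0 = max(rows, cols) / 2.0
    for epoch in range(epochs):
        # Learning rate and neighborhood radius shrink linearly over time.
        decay = 1.0 - epoch / epochs
        lr = lr0 * decay
        sigma = max(sigma0 * decay, 0.5)
        for x in rng.sample(data, len(data)):
            br, bc = best_matching_unit(weights, x)
            for r in range(rows):
                for c in range(cols):
                    grid_dist2 = (r - br) ** 2 + (c - bc) ** 2
                    h = math.exp(-grid_dist2 / (2.0 * sigma ** 2))
                    w = weights[r][c]
                    for i in range(dim):
                        # Pull the node toward the sample, scaled by lr and h.
                        w[i] += lr * h * (x[i] - w[i])
    return weights

# Assumed toy feature vectors, already normalized to [0, 1].
data = [[0.1, 0.2, 0.1], [0.15, 0.1, 0.2], [0.9, 0.8, 0.9], [0.85, 0.9, 0.8]]
weights = train_som(data, rows=3, cols=3)
bmus = [best_matching_unit(weights, x) for x in data]
```

In the setting of this paper, each input would be the 50-dimensional feature vector of one painting, and the BMU coordinates returned for it determine the node to which the painting is linked.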

Figure 3 shows the execution screen of the developed visualization tool that uses SOM. Through training and classification, each node of the map is linked to a list of painting images. Selecting a node on the map in the upper left-hand corner of the tool displays the painting images linked to that node in the upper right-hand corner. The number shown in each node of the map is the class number of the node. The lower left-hand corner shows detailed information on the image list linked to each node, and the lower right-hand corner displays other information needed for analysis.
Fig. 3 Execution screen of the developed tool: a parameter settings window of SOM, b train results analysis screen

Experiments

Image dataset

To conduct experiments on four painting styles (expressionism, impressionism, post-impressionism, and surrealism), representative painters of each style were selected randomly, and images of their paintings were used as experimental data. A total of 1633 paintings by 19 painters were collected from [18]. The collected images have different resolutions; samples are shown in Fig. 4. Table 2 shows the number of images collected by style and painter. For each painter, a randomly selected 50 % of the images was used for training, and the remaining 50 % was used for testing.
Fig. 4 Collected image samples by painting style: a expressionism, b impressionism, c post-impressionism, d surrealism samples

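Table 2 Number of data by style and painter

| Style | Painter | Number of images |
|---|---|---|
| Expressionism | Edvard Munch | 40 |
| Expressionism | Ernst Ludwig Kirchner | 68 |
| Expressionism | Franz Marc | 52 |
| Expressionism | Max Beckmann | 20 |
| Expressionism | Wassily Kandinsky | 65 |
| Impressionism | Alfred Sisley | 43 |
| Impressionism | Claude Monet | 196 |
| Impressionism | Edgar Degas | 100 |
| Impressionism | Edouard Manet | 106 |
| Impressionism | Pierre-Auguste Renoir | 105 |
| Post-impressionism | Georges Seurat | 32 |
| Post-impressionism | Paul Cezanne | 102 |
| Post-impressionism | Paul Gauguin | 150 |
| Post-impressionism | Paul Signac | 24 |
| Post-impressionism | Vincent van Gogh | 346 |
| Surrealism | Joan Miro | 94 |
| Surrealism | Max Ernst | 46 |
| Surrealism | Rene Magritte | 18 |
| Surrealism | Salvador Dali | 26 |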
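The per-painter 50 % split between training and test data described for this dataset can be sketched as follows. The file names, the function name, and the fixed seed are hypothetical; only the split ratio and the two image counts (taken from Table 2) come from the text.

```python
import random

def split_by_painter(images_by_painter, train_fraction=0.5, seed=0):
    """Randomly split each painter's images into training and test lists."""
    rng = random.Random(seed)
    train, test = [], []
    for painter, images in images_by_painter.items():
        shuffled = list(images)
        rng.shuffle(shuffled)
        cut = int(len(shuffled) * train_fraction)
        train += [(painter, name) for name in shuffled[:cut]]
        test += [(painter, name) for name in shuffled[cut:]]
    return train, test

# Hypothetical file names; the counts follow two rows of Table 2.
images_by_painter = {
    "Edvard Munch": [f"munch_{i:03d}.jpg" for i in range(40)],
    "Rene Magritte": [f"magritte_{i:03d}.jpg" for i in range(18)],
}
train, test = split_by_painter(images_by_painter)
```

Splitting within each painter, rather than over the pooled images, keeps painters with few works, such as Magritte, represented in both sets.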

Learning results

For each input sample, the best matching unit (BMU) is found, and the sample casts a vote for its class number at that node. The class number of a sample is given by its painting style. The class number with the most votes within a node is assigned as the class number of that node. The training performance is evaluated as the proportion of training samples whose class matches the class number of their BMU node. Binary classification was performed on each pair of two styles drawn from the four styles. Table 3 shows the training performance of each experiment. Although it differs depending on the class pair, the average training precision is about 0.93.
Table 3 Learning performance using train dataset by painting style combination

| Class pair | Precision |
|---|---|
| Expressionism/impressionism | 0.95 |
| Expressionism/post-impressionism | 0.91 |
| Expressionism/surrealism | 0.95 |
| Impressionism/post-impressionism | 0.88 |
| Impressionism/surrealism | 0.94 |
| Post-impressionism/surrealism | 0.93 |
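The node labeling by majority vote and the precision measure used for these results can be sketched as below. The sketch assumes the BMU coordinates of each sample are already known (for example from a trained SOM); the sample data and function names are hypothetical.

```python
from collections import Counter, defaultdict

def label_nodes(bmus, classes):
    """Give each node the class that received the most votes from its samples."""
    votes = defaultdict(Counter)
    for node, cls in zip(bmus, classes):
        votes[node][cls] += 1
    return {node: counter.most_common(1)[0][0] for node, counter in votes.items()}

def precision(bmus, classes, node_labels):
    """Fraction of samples whose class equals the label of their BMU node."""
    hits = sum(1 for node, cls in zip(bmus, classes)
               if node_labels.get(node) == cls)
    return hits / len(classes)

# BMU coordinates and style labels for six hypothetical training samples.
bmus = [(0, 0), (0, 0), (0, 0), (2, 1), (2, 1), (2, 1)]
classes = ["expressionism", "expressionism", "impressionism",
           "impressionism", "impressionism", "impressionism"]
node_labels = label_nodes(bmus, classes)
train_precision = precision(bmus, classes, node_labels)
```

For test data, the same `precision` function would be called with the node labels obtained from the training samples, so a test sample that lands on a node with no training votes counts as a miss.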

Classification results analysis

The classification performance is verified by feeding the test data into the trained map. Figure 5 shows that the paintings are well separated by style, and Table 4 shows the classification performance on the test data. As shown in Fig. 5a, expressionist paintings are clustered at the bottom right of the map trained on expressionism and post-impressionism. Likewise, as shown in Fig. 5b, expressionist paintings are clustered at the top center of the map for the expressionism/impressionism pair. The location of each cluster may of course change with the training result of the SOM. Nevertheless, the visualization shows that paintings with similar features are clustered at nearby locations on the map.
Table 4 Classification performance using painting style combination test dataset

| Class pair | Precision |
|---|---|
| Expressionism/impressionism | 0.95 |
| Expressionism/post-impressionism | 0.91 |
| Expressionism/surrealism | 0.93 |
| Impressionism/post-impressionism | 0.85 |
| Impressionism/surrealism | 0.93 |
| Post-impressionism/surrealism | 0.93 |

Fig. 5 Samples of classification by painting style through the dispersion on the map: a result of Expressionism and Post-Impressionism, b result of Expressionism and Impressionism

Conclusions

In this paper, we proposed a method of classifying paintings by style. Through statistical computation, the features of paintings are expressed as objective values derived from pixels. Fifty global and local features were extracted, and the extracted feature values were clustered using an unsupervised learning method. Experiments verified that paintings can be classified by style, and the SOM was visualized to enable analysis of the correlation between the painting styles of the works.

Since each painter produced a limited number of works, it is difficult to build a large dataset. Therefore, images digitized in diverse environments should be collected to improve the learning performance by increasing the amount of data. In the future, more features should be extracted, and the classification performance should be enhanced using machine learning methods. With further research, scientific features for classifying painting styles can be established and used as base data for research on art history and aesthetics. This work can also serve as a basis for a system that suggests paintings of similar style or by the same author when a certain image is given.

Declarations

Authors’ contributions

SGL and EYC designed the study, developed the methodology, collected the data, performed the analysis, and wrote the manuscript together. Both authors read and approved the final manuscript.

Competing interests

The authors declare that they have no competing interests.

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made.

Authors’ Affiliations

(1)
Dept. Electrical and Computer Engineering, Pusan National University

References

  1. Barni M, Pelagotti A, Piva A (2005) Image processing for the analysis and conservation of paintings: opportunities and challenges. IEEE Signal Process Mag 22:141–144
  2. Lyu S, Rockmore D, Farid H (2004) A digital technique for art authentication. Proc Natl Acad Sci USA 101(49):17006–17010
  3. Barni M, Pelagotti A, Piva A (2005) Image processing for the analysis and conservation of paintings: opportunities and challenges. IEEE Signal Process Mag 22(5):141–144
  4. Ivanova K, Stanchev P, Velikova E, Vanhoof K, Depaire B, Kannan R, Mitov I, Markov K (2012) Data engineering and management: second International Conference, ICDEM 2010, Tiruchirappalli, India, July 29–31, 2010. Revised selected papers, ch. Features for art painting classification based on vector quantization of MPEG-7 descriptors. Springer, Berlin, pp 146–153
  5. Rigau J, Feixas M, Sbert M (2008) "Informational dialogue with van gogh’s paintings," in Proceedings of the fourth eurographics conference on computational aesthetics in graphics, visualization and imaging, computational aesthetics’08. Eurographics Association, Aire-la-Ville
  6. Saleh B, Elgammal A (2015) A unified framework for painting classification. IEEE International Conference on Data Mining Workshop (ICDMW), pp. 1254–1261
  7. Vieira V, Fabbri R, Sbrissa D, da Fontoura Costa L, Travieso G (2015) A quantitative approach to painting styles. Phys A Stat Mech Appl 417:110–129
  8. Li J, Yao L, Hendriks E, Wang JZ (2012) Rhythmic brushstrokes distinguish van Gogh from his contemporaries: findings via automated brushstroke extraction. IEEE Transact Pattern Anal Mach Intell 34(6):1159–1176
  9. Berezhnoy I, Postma E, van den Herik J (2007) Computer analysis of Van Gogh’s complementary colours. Pattern Recognit Lett 28:703–709
  10. Johnson CR, Hendriks E, Berezhnoy IJ, Brevdo E, Hughes SM, Daubechies I, Li J, Postma E, Wang JZ (2008) Image processing for artist identification. IEEE Signal Process Mag 25(4):37–48
  11. Lombardi TE (2005) The classification of style in fine-art painting. PhD thesis, Pace University, New York
  12. Jafarpour S, Polatkan G, Brevdo E, Hughes S, Brasoveanu A, Daubechies I (2009) "Stylistic analysis of paintings using wavelets and machine learning," in signal processing conference, 17th European. pp. 1220–1224
  13. Zujovic J, Gandy L, Friedman S, Pardo B, Pappas TN (2009) "Classifying paintings by artistic genre: an analysis of features and classifiers," multimedia signal processing, 2009. IEEE International Workshop on MMSP ’09, pp. 1–5
  14. Cetinic E, Grgic S (2013) "Automated painter recognition based on image feature extraction," in ELMAR, 2013 55th International Symposium, pp. 19–22
  15. Shamir L, Macura T, Orlov N, Eckley DM, Goldberg IG (2010) Impressionism, expressionism, surrealism: automated recognition of painters and schools of art. Transact Appl Percept 7:8
  16. Li C, Chen T (2009) Aesthetic visual quality assessment of paintings. IEEE J Sel Top Signal Process 3(2):236–252
  17. Martinel N, Micheloni C, Foresti GL (2013) Robust painting recognition and registration for mobile augmented reality. IEEE Signal Process Lett 20(11):1022–1025
  18. Kren E, Marx D Web gallery of art. http://www.wga.hu/index1.html

Copyright

© Lee and Cha. 2016