When we look at a painting—say, a centuries-old depiction of a funeral—the human mind undergoes a complex, deeply personal process. We don’t simply "sense" an emotion like a thermometer reading temperature. We infer. We synthesize cultural context, personal history, and the subtle interplay of light and shadow to construct a feeling. It might be grief, or perhaps a strange, quiet relief. But when we ask the world’s most advanced artificial intelligence to label that same artwork, the response is often startlingly uniform: "Calm."
This isn’t just a quirk of a single model. It is a systematic, algorithmic flattening of human culture. As AI systems become the primary interfaces through which we consume information, their inability to grasp the nuance of human emotion—and their tendency to default to Western-centric, low-arousal labels—threatens to standardize our understanding of art, history, and ourselves.

The Genesis of a Default Setting
The issue came to the forefront in the summer of 2025, when researchers in Taiwan released "EmoArt," a massive dataset comprising 132,664 images. Designed to train models for art therapy applications, the dataset faced a logistical bottleneck: it was impossible for human psychologists to manually annotate every image. The researchers turned to GPT-4 to bridge the gap, subsequently reporting a 91.47% alignment between the model’s labels and human expectations.
However, a closer inspection reveals a sobering reality. According to the EmoArt data, 55.95% of all artworks are labeled "calm." That share is hard to square with the diversity of the art world, and it is not a statistical fluke: a chi-square test on the label distribution returns a statistic of 449,027, far beyond anything chance variation could produce. The bias is not random; it is systemic.
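To make the chi-square figure concrete, here is a minimal sketch of a goodness-of-fit calculation. The source does not say which expected distribution or how many emotion labels the test used, so the uniform baseline, the eight-label split, and the label names below are assumptions for illustration; only the dataset size and the "calm" share come from the EmoArt figures, and the sketch is not expected to reproduce 449,027.

```python
def chi_square_uniform(counts):
    """Pearson chi-square statistic against a uniform expected distribution."""
    total = sum(counts.values())
    expected = total / len(counts)  # every label equally likely under the null
    return sum((observed - expected) ** 2 / expected for observed in counts.values())


TOTAL_IMAGES = 132_664  # from the article
CALM_SHARE = 0.5595     # from the article

# ASSUMPTION: seven other labels share the remainder equally (hypothetical names).
other_labels = ["joy", "awe", "sadness", "fear", "anger", "tension", "melancholy"]
calm_count = TOTAL_IMAGES * CALM_SHARE
counts = {"calm": calm_count}
for label in other_labels:
    counts[label] = (TOTAL_IMAGES - calm_count) / len(other_labels)

statistic = chi_square_uniform(counts)
print(f"chi-square statistic: {statistic:,.0f}")
```

For fixed proportions the statistic grows in step with sample size, so with more than 130,000 images even a moderate skew yields an enormous value. The number tells us the skew is not chance; the 55.95% share is what tells us how large it is.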

A Chronology of Algorithmic Reductionism
To understand how we arrived at a point where AI views more than half of human artistic expression as "calm," we must look at the evolution of these models:
- Pre-2023: AI models were primarily text-based, struggling to bridge the gap between visual aesthetics and semantic labels.
- 2024: The rise of sophisticated multimodal models allowed for the integration of vision and language, building on massive, automatically collected image-text datasets such as LAION-5B (released in 2022).
- 2025: The EmoArt project attempts to quantify emotional resonance in art, using GPT-4 as the primary arbiter. This marks the moment where automated "sentiment analysis" becomes a benchmark for cultural interpretation.
- 2026: Independent researchers, including data scientist Nastassia Shaveika, begin conducting comparative audits of these models, uncovering the "calm" default and the lack of cultural nuance in non-Western art styles.

The Anatomy of the Bias: Valence and Arousal
The problem lies in how these systems apply core affect theory. Emotions are mapped onto a two-dimensional plane: valence (positive vs. negative) and arousal (high-energy vs. low-energy). The model has essentially learned that the vast majority of art is "safe."

Data analysis of the EmoArt set reveals that 87.9% of artworks are assigned a positive valence, while 76.4% are categorized as low arousal. The intersection of these two categories is the quadrant where "calm" sits. The AI has essentially been trained to favor art that is "happy, but not too happy," creating a sterilized, middle-of-the-road emotional landscape that ignores the chaotic, raw, and often uncomfortable reality of human creative output.
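A quick arithmetic check on these percentages, offered as a sketch: assuming both are computed over the same set of images, the share that is both positive and low-arousal can be no smaller than 87.9% + 76.4% - 100% = 64.3%. That floor is higher than the 55.95% labeled "calm," which suggests the quadrant also holds neighboring labels such as "contentment." The quadrant names in `quadrant` are illustrative assumptions, not EmoArt's actual label set.

```python
def quadrant(valence, arousal):
    """Name the valence/arousal quadrant for scores in [-1, 1].

    ASSUMPTION: the quadrant names are illustrative examples,
    not the labels used by EmoArt.
    """
    if valence >= 0:
        return "excited" if arousal >= 0 else "calm"
    return "tense" if arousal >= 0 else "tired"


def min_overlap(share_a, share_b):
    """Smallest possible share of items that fall in both groups."""
    return max(0.0, share_a + share_b - 1.0)


POSITIVE_VALENCE = 0.879  # from the article
LOW_AROUSAL = 0.764       # from the article
CALM_SHARE = 0.5595       # from the article

floor = min_overlap(POSITIVE_VALENCE, LOW_AROUSAL)
print(quadrant(0.6, -0.4))              # positive, low arousal -> "calm"
print(f"minimum overlap: {floor:.1%}")  # 64.3%
print(floor > CALM_SHARE)               # True
```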

The Specter of Orientalism
The bias becomes even more pronounced when we move away from Western art styles. Edward Said’s concept of "Orientalism," the projection of Western myths and hierarchies onto the East, is alive and well in the training data of modern LLMs.

When researchers split the EmoArt dataset by origin, the disparity is stark. The model has learned seven times more color-emotion associations from Western art than from Eastern traditions. In Western traditions, red might signal alarm or passion; in Chinese culture, it is the color of celebration and good fortune. Black, often associated with mystery or mourning in the West, represents discipline and mastery in Japanese ink wash painting.
Because the models have ingested so little non-Western art, they lack the "training" to interpret it. Consequently, Chinese ink paintings—which feature everything from sweeping landscapes to intense political satire—are labeled as "calm" nearly 90% of the time. The entropy of these labels—a measure of diversity—is exceptionally low for Eastern traditions, effectively meaning the AI has stopped classifying and started guessing based on a generic, Western-centric template.
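Label entropy can be sketched in a few lines. The example below uses Shannon entropy in bits; the article does not specify which entropy measure or label set the analysis used, so the label names and counts here are hypothetical, chosen only to mirror a roughly 90% "calm" share.

```python
import math
from collections import Counter


def label_entropy(labels):
    """Shannon entropy (in bits) of a list of categorical labels."""
    counts = Counter(labels)
    total = len(labels)
    return sum((c / total) * math.log2(total / c) for c in counts.values())


# ASSUMPTION: hypothetical label sets, 100 artworks each.
skewed = ["calm"] * 90 + ["joy"] * 4 + ["sadness"] * 3 + ["awe"] * 3
even = ["calm"] * 25 + ["joy"] * 25 + ["sadness"] * 25 + ["awe"] * 25

skewed_entropy = label_entropy(skewed)
even_entropy = label_entropy(even)

print(f"skewed labels: {skewed_entropy:.2f} bits")
print(f"even labels:   {even_entropy:.2f} bits")  # 2.00 for four equally common labels
```

The closer the value sits to zero, the less the labels distinguish one artwork from another; a classifier that always answered "calm" would score exactly zero.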

Experimental Evidence: The "Calm" Default
In a controlled experiment involving 23 diverse artworks and several leading models (OpenAI’s GPT-4/5.1, Anthropic’s Claude Sonnet/Haiku 4.5, and Google’s Gemini 2.5 Flash), the tendency to default to "calm" was pervasive.
When shown Gérard Schneider’s Opus 110—a work defined by kinetic energy and heavy black forms—Claude Sonnet 4.5 described it as having "profound quietude." When shown Thomas Hart Benton’s Midwest, a painting of farmers during the Great Depression, the models largely missed the context of collective struggle, with several labeling the scene "tired" or "excited" while ignoring the nuance of the era’s socio-economic weight.

Perhaps most tellingly, when prompted to "not default to ‘calm’ or ‘contentment’," the models often struggled to articulate an alternative, sometimes failing entirely to capture the spirit resonance (qiyun shengdong) of traditional Chinese works.

Implications for Society: From Museums to Algorithms
The consequences of this bias are not merely academic. We are increasingly deploying AI-driven sentiment analysis in public and private spheres. The Cleveland Museum of Art’s "ArtLens" exhibition, for example, uses facial recognition to sort visitors’ expressions into fixed categories. By reinforcing the idea that human emotion can be mapped into a narrow, pre-defined grid, we are training ourselves to view our own complexity through the limited lens of a machine.

The "calm" label may seem harmless, but the automation of emotional interpretation carries significant risks:
- Cultural Homogenization: As AI becomes the gatekeeper for cultural metadata, non-Western artistic traditions risk being re-categorized into generic, Western-friendly emotional buckets.
- Diminished Emotional Literacy: By relying on AI to tell us how to "feel" about art, we risk losing the capacity for deep, independent, and perhaps uncomfortable reflection.
- Automated Bias: If these models are integrated into therapeutic or educational tools, they may misdiagnose, misinterpret, or invalidate the lived experiences of users whose emotional expressions do not align with the "calm" baseline.

A Call for Transparency
The developers of these models—OpenAI, Anthropic, Google, and others—have yet to provide a robust solution for the cultural and emotional myopia of their systems. As it stands, the AI acts as an oracle: it gives us an answer, and we are inclined to believe it.

The path forward requires a shift in how we build and audit these systems. We must move away from the "black box" approach to training and demand datasets that are as diverse as the human experience they claim to interpret. Furthermore, we must acknowledge that AI is not a neutral observer. It is a reflection of its training data—a snapshot of the Western-dominated internet.
As Nastassia Shaveika, whose research highlights these systemic failures, puts it, the goal is not to "fight" AI but to be honest about its limitations. We must stop treating these models as objective truth-tellers. If we continue to accept the "calm" default, we aren’t just letting machines interpret art; we are allowing them to define the boundaries of human emotion. The next time you find yourself before a painting, ask yourself what you feel before you let an algorithm tell you that it’s just "calm."

