A qualitative and quantitative evaluation conducted by OpenAI (OpenAI 2024, 44–60) showed that one of its most popular models (i.e., GPT-4) carries various risks, including hallucinations; harmful content; harms of representation, allocation, and quality of service; disinformation and influence operations; proliferation of conventional and unconventional weapons; privacy breaches; and cybersecurity threats. Although hallucinations were reduced significantly (by 45–65%) in the most recent flagship model (i.e., GPT-5), it still makes factual errors (Cirra AI 2025, 7, 14) and it “doesn’t truly understand or reason like a human in an unbounded way, it can make mistakes or weird outputs, and it requires massive compute” (Cirra AI 2025, 30). This, however, is not unique to OpenAI but a generic problem of all contemporary AI models.

Hallucination, for example, is the production of nonsensical or untruthful content. It becomes particularly harmful as models grow ever more convincing and their responses are perceived as truthful: trust built in such a model can turn into overreliance that later proves unjustified. Hallucination is especially dangerous when one is not an expert on a topic and cannot tell which part of the content is misleading. For this reason, AI companies have introduced an extra software layer for “grounding” that tries to eliminate, or at least minimize, such deviations. The root of the problem, however, lies in the rush to release new models and improvements without fully fixing earlier issues.

To better understand how AI works, we give a short introduction to large language models (LLMs), the type of machine learning model behind solutions such as ChatGPT, Google Gemini, and Anthropic Claude.

3.1 A Short Introduction to Large Language Models

A large language model (LLM) is a type of machine learning model that uses giant artificial neural networks trained on vast amounts of data to “understand” and “generate” human language. It is built on predicting the next word from the ones that precede it. This is done by assigning probabilities to sequences of words, word fragments, or symbols, collectively called tokens. The model simply learns the relationships between words from the provided training data: it learns how likely different words are to follow other words, which allows it to judge whether a sentence is fluent and natural (a minimal illustrative sketch is given below). Transformer models (Vaswani et al. 2017, 3–4; StatQuest with Josh Starmer 2023) such as the popular GPT architecture, which is pre-trained to predict the next token in a document, are the state of the art for many natural language processing (NLP) tasks, including machine translation, text generation, speech recognition, and more. The following sections highlight some of the shortcomings that LLMs must first overcome in order to be considered trustworthy (Han et al. 2025, 82).

3.2 The Alignment and Moderation Problem of Using AI

“The alignment problem… refers to the potential discrepancy between what we ask our algorithms to optimize and what we actually want them to do for us; this has raised various concerns in the context of AI ethics and AI safety” (Murphy 2022, 28). Researchers have proposed implementing Reinforcement Learning from Human Feedback (RLHF) to better align LLMs with human values, intentions, and preferences (Ouyang et al. 2022; Peng et al. 2023; OpenAI 2023; Brown et al. 2020) and to prevent undesired behaviours.
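The gap between optimizing a statistical objective and doing what we actually want is easier to appreciate once bare next-token prediction is made concrete. The following sketch is a toy bigram model written in Python purely for illustration; the tiny corpus and every name in it are our own assumptions, and real LLMs use transformer networks with billions of parameters over sub-word tokens. The underlying question, however, is the same: given the text so far, which token is most likely to come next?

```python
# Toy illustration of next-token prediction: a bigram model assigns a
# probability to each possible next token given the previous token.
from collections import Counter, defaultdict

# A deliberately tiny, made-up "training corpus".
corpus = [
    "the model predicts the next token",
    "the model assigns a probability to each token",
    "the next token is chosen from a probability distribution",
]

# Count how often each token follows each other token.
follows = defaultdict(Counter)
for sentence in corpus:
    tokens = sentence.split()
    for prev, curr in zip(tokens, tokens[1:]):
        follows[prev][curr] += 1

def next_token_probs(prev_token):
    """Estimate P(next token | previous token) from the toy corpus."""
    counts = follows[prev_token]
    total = sum(counts.values())
    return {tok: n / total for tok, n in counts.items()} if total else {}

def continue_text(start, steps=4):
    """Greedy continuation: always pick the most probable next token."""
    out = [start]
    for _ in range(steps):
        probs = next_token_probs(out[-1])
        if not probs:
            break
        out.append(max(probs, key=probs.get))
    return " ".join(out)

print(next_token_probs("the"))   # {'model': 0.5, 'next': 0.5}
print(continue_text("the"))      # "the model predicts the model"
```

Even at this toy scale, the continuation is fluent-looking yet says nothing true or meaningful. Techniques such as RLHF and moderation layers are attempts to constrain this purely statistical core after the fact, and, as the following example shows, they do not always succeed.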
Microsoft’s Bing chatbot (Microsoft n.d.), for example, sparked public concern shortly after its release due to a number of disturbing responses, leading Microsoft to restrict the chatbot’s interactions with users (Greshake et al. 2023, 2). At the same time, an AI agent equipped with moderation and alignment to human values can act as a double-edged sword. On the one hand, it may help filter out disruptive content gathered from unreliable sources. On the other hand, when an AI agent serves as a spiritual guide, it may inadvertently distort concepts or ideas from the original scriptures by misinterpreting the author’s intent. This issue arises easily because ancient scriptures often depict conflicts and wars that sometimes involve acts of cruelty, such as Bhima’s actions in the Mahabharata. GitaGPT (GitaGPT n.d.), for example, condoned violence and claimed that killing someone is entirely fine if it is one’s dharma or duty (Shivji 2023). Without proper context, understanding, and alignment, the authentic message of such stories can easily become obscured, altered, or even completely distorted, potentially leading aspirants toward a fundamental misunderstanding of the genuine transcendental teachings.
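To make this double-edged nature of moderation more tangible, the following minimal sketch shows the kind of application-level filter described above: the model’s raw reply is checked before it reaches the user. Everything in it is an assumption made for illustration only; the blocked-phrase list, the violates_policy rule, and the stubbed-out fake_model stand in for the trained classifiers, curated policies, and real LLM calls that production systems use.

```python
# Minimal sketch of an application-level moderation layer: a raw model
# reply is checked before being shown to the user. The keyword rule and
# the stubbed-out "model" below are hypothetical placeholders.
from typing import Callable

BLOCKED_PHRASES = ("killing someone is entirely fine",)  # hypothetical policy

def violates_policy(text: str) -> bool:
    """Naive stand-in for a trained content classifier."""
    lowered = text.lower()
    return any(phrase in lowered for phrase in BLOCKED_PHRASES)

def moderated_reply(prompt: str, generate: Callable[[str], str]) -> str:
    """Wrap an arbitrary text generator with a post-hoc moderation check."""
    candidate = generate(prompt)
    if violates_policy(candidate):
        # The filter can only suppress output; it cannot supply missing
        # context or recover the intended meaning of the source text.
        return "I cannot provide that response."
    return candidate

def fake_model(prompt: str) -> str:
    # Hypothetical stand-in echoing the GitaGPT reply reported above.
    return "Killing someone is entirely fine if it is your dharma."

print(moderated_reply("What does the Gita say about duty?", fake_model))
# -> "I cannot provide that response."
```

A filter this crude would also suppress a legitimate scholarly quotation of the very same sentence, which illustrates the double-edged sword described above: moderation can remove harmful output, but it cannot by itself restore context, authorial intent, or the authentic message of the scripture.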