Experiential Reinforcement Learning for Stock Trading: An LLM-Based Agent Comparison of Mistral and Qwen Without Gradient Updates

Sumedha Arya

doi:10.71097/IJSAT.v17.i1.10419


Author(s)	Ms. Sumedha Arya
Country	India
Abstract	Large Language Models (LLMs) have shown strong performance in decision-making tasks. However, they often struggle in financial markets, where information about profits and losses is delayed and uncertain. In this study, we adapt the Experiential Reinforcement Learning (ERL) framework to single-asset stock trading using Dow Jones Industrial Average (DJIA) data and financial news from 2015 to 2020. Unlike traditional reinforcement learning (RL), our method does not update model weights; instead, learning happens through structured self-reflection, FAISS-based memory storage, and reuse of successful trades as few-shot examples. We implement the ERL cycle (first decision, simulated outcome, reflection, improved second decision, and selective memory storage) using Mistral-7B-Instruct-v0.3 and Qwen2.5-7B-Instruct in a custom trading simulator with $10,000 initial capital. Results show that Mistral produced weak reward signals and ended with a −2.13% return, while Qwen stored more useful reflections and achieved around +2–3% return in partial runs, showing more stable improvement. Overall, the study highlights the importance of model capability and clean reward signals, and demonstrates that ERL can be an effective no-gradient alternative to traditional RL for delayed-reward financial trading tasks.
Keywords	Experiential Reinforcement Learning, Large Language Models, Stock Trading, Reflection, Prompt-based Learning, Zero-Gradient Learning
Published In	Volume 17, Issue 1, January-March 2026
Published On	2026-03-03
DOI	https://doi.org/10.71097/IJSAT.v17.i1.10419
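
The ERL cycle named in the abstract (first decision, simulated outcome, reflection, improved second decision, selective memory storage) could be sketched roughly as follows. This is a minimal illustration and not the paper's implementation: the names `Experience`, `Memory`, `simulate`, `erl_step`, `stub_decide`, and `stub_reflect` are hypothetical, a plain Python list stands in for the FAISS index, the reward is a toy one-step return, and the two stub functions stand in for calls to an LLM such as Mistral or Qwen.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Experience:
    state: str
    action: str
    reward: float
    reflection: str


@dataclass
class Memory:
    """Plain list standing in for the FAISS-based store (an assumption)."""
    items: List[Experience] = field(default_factory=list)

    def store_if_useful(self, exp: Experience, threshold: float = 0.0) -> bool:
        # Selective storage: keep only experiences whose reward beats the threshold.
        if exp.reward > threshold:
            self.items.append(exp)
            return True
        return False

    def few_shot(self, k: int = 3) -> List[Experience]:
        # Reuse the k highest-reward stored experiences as few-shot examples.
        return sorted(self.items, key=lambda e: e.reward, reverse=True)[:k]


def simulate(action: str, price_now: float, price_next: float) -> float:
    """Toy reward (assumption): one-step return implied by the action."""
    change = (price_next - price_now) / price_now
    return {"BUY": change, "SELL": -change, "HOLD": 0.0}[action]


def stub_decide(state: str, examples: List[Experience], reflection: str) -> str:
    # Hypothetical stand-in for an LLM call: follow the reflection if present.
    if reflection.startswith("prefer "):
        return reflection.split()[1]
    return "BUY"


def stub_reflect(state: str, action: str, reward: float) -> str:
    # Hypothetical stand-in for LLM self-reflection on the simulated outcome.
    if reward < 0:
        return "prefer SELL" if action == "BUY" else "prefer BUY"
    return f"prefer {action}"


def erl_step(decide: Callable, reflect: Callable, memory: Memory,
             state: str, price_now: float, price_next: float) -> Experience:
    first = decide(state, memory.few_shot(), "")       # first decision
    r1 = simulate(first, price_now, price_next)        # simulated outcome
    note = reflect(state, first, r1)                   # reflection
    second = decide(state, memory.few_shot(), note)    # improved second decision
    r2 = simulate(second, price_now, price_next)
    exp = Experience(state, second, r2, note)
    memory.store_if_useful(exp)                        # selective memory storage
    return exp
```

Calling `erl_step(stub_decide, stub_reflect, Memory(), "DJIA falls on weak earnings news", 100.0, 95.0)` runs one pass of the cycle. No model weights are touched at any point; whatever is learned lives only in the stored experiences that later prompts can draw on.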


International Journal on Science and Technology

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal
