International Journal on Science and Technology

E-ISSN: 2229-7677     Impact Factor: 9.88


Proximal Policy Optimization (PPO)–Driven Reinforcement Learning Model for Automatic Stock Trading using Trend–Volume–Volatility Integration

Author(s) Mr. Suryansh Kumar, Mr. Arup Kadia, Mr. Aditya Sharma, Mr. Rajraushan Kumar
Country India
Abstract Stock market trading is fraught with uncertainty: prices change frequently, markets are noisy, and volatility shifts over time, so signals derived from a single technical indicator are often unreliable. This paper proposes a robust multi-indicator trading system that combines the Simple Moving Average Crossover (SMAC) for trend identification, Average Traded Volume (ATV) for validating market participation, and Bollinger Bands (BB) for volatility-based price confirmation. The SMAC indicator forms the backbone of the buy and sell signals, while ATV-based confirmation of trading activity reduces the risk of acting on false breakouts. Bollinger Bands suppress trades during extreme overbought or oversold conditions and thereby improve volatility-aware entry and exit timing. The indicator framework is further enhanced by a reinforcement learning agent based on Proximal Policy Optimization (PPO) that interacts with historical market data to learn profitable trading actions. The agent considers the trend direction, volume strength, position within the Bollinger Bands, and current portfolio status when deciding whether to buy, sell, or hold (see the illustrative sketch below). Experimental evaluation on historical stock market data shows that the proposed SMAC–ATV–BB–PPO strategy achieves higher trade accuracy, fewer false signals, and better risk-adjusted returns than conventional SMAC-only trading strategies.
Keywords Reinforcement Learning, Proximal Policy Optimization, Stock Trading, Simple Moving Average Crossover, Average Traded Volume, Bollinger Bands.
Field Computer > Artificial Intelligence / Simulation / Virtual Reality
Published In Volume 17, Issue 1, January-March 2026
Published On 2026-01-16
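
The paper itself does not publish source code; the following is a minimal, illustrative Python sketch of how the three indicator features described in the abstract (SMAC trend, ATV volume strength, Bollinger-Band position) might be computed from OHLCV data before being passed to a PPO agent. The window lengths, column names, and feature definitions are assumptions made for illustration, not the authors' published configuration.

```python
# Illustrative sketch only: window sizes, column names, and the feature
# definitions below are assumptions, not the authors' published configuration.
import numpy as np
import pandas as pd


def build_features(df: pd.DataFrame,
                   fast: int = 20, slow: int = 50,
                   vol_win: int = 20, bb_win: int = 20,
                   bb_k: float = 2.0) -> pd.DataFrame:
    """Derive SMAC trend, ATV volume strength, and Bollinger-Band position
    from an OHLCV frame with 'Close' and 'Volume' columns."""
    out = pd.DataFrame(index=df.index)

    # Trend: sign of the fast/slow simple moving average spread
    # (+1 = uptrend, -1 = downtrend); warm-up NaN rows are dropped below.
    sma_fast = df["Close"].rolling(fast).mean()
    sma_slow = df["Close"].rolling(slow).mean()
    out["trend"] = np.sign(sma_fast - sma_slow)

    # Volume: current volume relative to its rolling average (ATV);
    # values above 1 indicate above-average participation and can be used
    # to filter out weak breakouts.
    atv = df["Volume"].rolling(vol_win).mean()
    out["vol_strength"] = df["Volume"] / atv

    # Volatility: %B position inside the Bollinger Bands
    # (~0 near the lower band / oversold, ~1 near the upper band / overbought).
    mid = df["Close"].rolling(bb_win).mean()
    std = df["Close"].rolling(bb_win).std()
    upper, lower = mid + bb_k * std, mid - bb_k * std
    out["bb_pos"] = (df["Close"] - lower) / (upper - lower)

    return out.dropna()
```

These per-step features, together with portfolio state such as cash and shares held, would form the observation of a custom trading environment; a PPO implementation such as stable-baselines3's PPO("MlpPolicy", env) could then be trained on that environment to choose among the discrete buy, sell, and hold actions described in the abstract.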
