International Journal on Science and Technology
E-ISSN: 2229-7677
Proximal Policy Optimization (PPO)–Driven Reinforcement Learning Model for Automatic Stock Trading Using Trend–Volume–Volatility Integration
| Author(s) | Mr. Suryansh Kumar, Mr. Arup Kadia, Mr. Aditya Sharma, Mr. Rajraushan Kumar |
|---|---|
| Country | India |
| Abstract | Stock market trading is fraught with uncertainty: prices change rapidly, markets are noisy, and volatility shifts over time, so signals derived from a single technical indicator are unreliable. This paper proposes a robust multi-indicator trading system that combines the Simple Moving Average Crossover (SMAC) for trend identification, Average Traded Volume (ATV) for validating market participation, and Bollinger Bands (BB) for volatility-based price confirmation. SMAC generates the core buy and sell signals, while ATV confirmation of trading activity reduces the risk of false breakouts. Bollinger Bands suppress trades during extreme overbought or oversold market conditions, improving the volatility-aware timing of entries and exits. The indicator framework is further enhanced by a reinforcement learning agent based on Proximal Policy Optimization (PPO) that interacts with historical market data to learn optimal trading actions. The agent considers the trend direction, volume strength, volatility position, and current portfolio status when deciding whether to buy, sell, or hold. Experimental evaluation on historical stock market data shows that the proposed SMAC–ATV–BB–PPO strategy achieves higher trade accuracy, fewer false signals, and better risk-adjusted returns than conventional SMAC-based trading strategies. |
| Keywords | Reinforcement Learning, Proximal Policy Optimization, Stock Trading, Simple Moving Average Crossover, Average Traded Volume, Bollinger Bands. |
| Field | Computer > Artificial Intelligence / Simulation / Virtual Reality |
| Published In | Volume 17, Issue 1, January-March 2026 |
| Published On | 2026-01-16 |
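To make the indicator-fusion logic in the abstract concrete, the following is a minimal sketch, not the authors' implementation: SMAC supplies the raw buy/sell direction, an ATV check confirms participation, and Bollinger Bands veto entries at volatility extremes. All parameter values (window lengths, band width `bb_k`) are assumed defaults chosen for illustration; the paper does not specify them here.

```python
# Illustrative sketch (not the authors' code) of the SMAC–ATV–BB filter
# described in the abstract. Window lengths and band width are assumptions.

def sma(prices, n):
    """Simple moving average; None until n samples are available."""
    return [None if i + 1 < n else sum(prices[i + 1 - n:i + 1]) / n
            for i in range(len(prices))]

def signal(prices, volumes, fast=5, slow=20, vol_n=20, bb_n=20, bb_k=2.0):
    """Return 'buy', 'sell', or 'hold' for the latest bar.

    SMAC gives the raw direction, ATV confirms market participation,
    and Bollinger Bands veto trades at volatility extremes.
    """
    fast_ma = sma(prices, fast)[-1]
    slow_ma = sma(prices, slow)[-1]
    if fast_ma is None or slow_ma is None:
        return "hold"  # not enough history for the crossover

    # ATV filter: current volume must exceed its recent average,
    # otherwise the crossover may be a false breakout.
    atv = sum(volumes[-vol_n:]) / min(vol_n, len(volumes))
    if volumes[-1] <= atv:
        return "hold"

    # Bollinger Bands: skip trades when price closes outside the bands
    # (extreme overbought/oversold conditions).
    window = prices[-bb_n:]
    mid = sum(window) / len(window)
    std = (sum((p - mid) ** 2 for p in window) / len(window)) ** 0.5
    upper, lower = mid + bb_k * std, mid - bb_k * std
    if prices[-1] > upper or prices[-1] < lower:
        return "hold"

    return "buy" if fast_ma > slow_ma else "sell"
```

In the full system described by the abstract, this rule-based signal would not act directly; instead the trend, volume, and volatility readings (together with portfolio state) would form the observation fed to the PPO agent, which learns when to follow or override them.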
A CrossRef DOI is assigned to each research paper published in this journal; the IJSAT DOI prefix is 10.71097/IJSAT.
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.