
International Journal on Science and Technology
E-ISSN: 2229-7677
Impact Factor: 9.88
A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

From Web to File: Creating a Scraper for Structured E-commerce Product Data
Author(s) | Mustafa Sultan, Manavlal Nagdev, Md Muaviya Ansari, Maya Baniya, Vineeta Rathore |
---|---|
Country | India |
Abstract | Acquiring structured product data remains a major obstacle in the fast-changing world of e-commerce. The growing complexity of modern websites, which rely on dynamic content and anti-scraping features, makes the problem worse. To address the shortcomings of current approaches, this paper presents a methodology for building a reliable web scraper designed specifically for Indian e-commerce platforms. To handle both static and dynamic content efficiently, the proposed approach combines Beautiful Soup and Selenium with Flask and React.js. The main contributions are overcoming anti-scraping mechanisms, ensuring data accuracy through advanced preprocessing techniques, and offering actionable insights through data visualization. The study also covers scalability for managing large datasets across multiple e-commerce platforms, ethical scraping practices, and compliance with robots.txt directives. Experimental results confirm the scraper's ability to extract, clean, and analyze data, providing a scalable and ethically sound option for automated e-commerce data extraction. |
Keywords | Web scraping, e-commerce, data preprocessing, Selenium, Beautiful Soup, data visualization, anti-scraping techniques, scalability, ethical scraping |
Field | Computer > Data / Information |
Published In | Volume 16, Issue 2, April-June 2025 |
Published On | 2025-05-24 |
DOI | https://doi.org/10.71097/IJSAT.v16.i2.5423 |
Short DOI | https://doi.org/g9mq73 |
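
The abstract mentions compliance with robots.txt and preprocessing of scraped data but gives no implementation details. The following is a minimal sketch of what those two steps might look like, not the authors' code. It uses only the Python standard library; the sample robots.txt content, the `example.com` URLs, the `demo-scraper` user agent, and the helper names `is_allowed` and `clean_price` are all assumptions made for illustration.

```python
import re
from typing import Optional
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content; a real scraper would download it from the target site.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /checkout/
Allow: /products/
"""


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if the given robots.txt rules permit user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


def clean_price(raw: str) -> Optional[float]:
    """Convert a scraped price string such as '₹1,29,999.00' to a float.

    Returns None when the string contains no digits (for example 'Out of stock').
    """
    match = re.search(r"\d[\d,]*(?:\.\d+)?", raw)
    if match is None:
        return None
    # Drop thousands/lakh separators before converting.
    return float(match.group().replace(",", ""))


if __name__ == "__main__":
    print(is_allowed(SAMPLE_ROBOTS_TXT, "demo-scraper", "https://example.com/products/phone-123"))
    print(is_allowed(SAMPLE_ROBOTS_TXT, "demo-scraper", "https://example.com/checkout/cart"))
    print(clean_price("₹1,29,999.00"))
```

In this sketch the robots.txt check runs before any page is requested, and price normalization runs on the raw text afterwards, so it applies equally to text extracted from static HTML or from a browser-rendered page.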
All research papers published on this website are licensed under Creative Commons Attribution-ShareAlike 4.0 International License, and all rights belong to their respective authors/researchers.
