International Journal on Science and Technology

E-ISSN: 2229-7677     Impact Factor: 9.88

A Widely Indexed Open Access Peer Reviewed Multidisciplinary Bi-monthly Scholarly International Journal

From Web to File: Creating a Scraper for Structured E-commerce Product Data

Author(s) Mustafa Sultan, Manavlal Nagdev, Md Muaviya Ansari, Maya Baniya, Vineeta Rathore
Country India
Abstract Acquiring structured product data remains a significant obstacle in the fast-changing world of e-commerce. The problem is compounded by the growing complexity of modern websites, which rely on dynamic content and anti-scraping features. To address the shortcomings of current approaches, this paper presents a methodology for building a reliable web scraper designed specifically for Indian e-commerce platforms. To handle both static and dynamic content efficiently, the proposed approach combines Beautiful Soup and Selenium with Flask and React.js. The main contributions are overcoming anti-scraping mechanisms, ensuring data accuracy through advanced preprocessing, and offering actionable insights through data visualization. The study also addresses scalability for large datasets across multiple e-commerce platforms, ethical scraping practices, and compliance with robots.txt directives. Experimental findings confirm the scraper's ability to extract, clean, and analyze data, providing a scalable and ethically sound option for automated e-commerce data extraction.
Keywords Web scraping, e-commerce, data preprocessing, Selenium, Beautiful Soup, data visualization, anti-scraping techniques, scalability, ethical scraping
Field Computer > Data / Information
Published In Volume 16, Issue 2, April-June 2025
Published On 2025-05-24
DOI https://doi.org/10.71097/IJSAT.v16.i2.5423
Short DOI https://doi.org/g9mq73
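
The abstract mentions compliance with robots.txt directives as part of ethical scraping. The paper's own implementation is not reproduced on this page, so the following is only a minimal sketch of how such a check might look using Python's standard urllib.robotparser module. The robots.txt content, the shop.example URLs, the "example-scraper" user agent, and the helper names build_parser and is_allowed are all hypothetical and chosen for illustration.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content (an assumption for this sketch).
# A real scraper would download this file from the target site first.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /cart/
Disallow: /checkout/
Crawl-delay: 2
"""


def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text that has already been fetched."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser


def is_allowed(parser: RobotFileParser, url: str,
               user_agent: str = "example-scraper") -> bool:
    """Return True if the rules permit this user agent to fetch the URL."""
    return parser.can_fetch(user_agent, url)


parser = build_parser(SAMPLE_ROBOTS_TXT)

# Product pages are not disallowed by the sample rules; cart pages are.
product_ok = is_allowed(parser, "https://shop.example/product/123")
cart_ok = is_allowed(parser, "https://shop.example/cart/view")

# Delay in seconds the site asks crawlers to wait between requests.
delay = parser.crawl_delay("example-scraper")
```

A scraper built along these lines would call is_allowed before each request and sleep for the reported crawl delay between requests. The fetching and parsing steps with Selenium and Beautiful Soup would follow only for permitted URLs.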
