The project was to fetch product information from a supplier website for which no API was available. An eGrove developer was hired to carry out the data mining with a Selenium-based script. The script fetches product information and seller profile information from the supplier's wholesale website. It was designed to get through the site's CAPTCHA and authentication layers while complying with all of the site's policies. Using this automated catalog-building script, product information such as category, name, item ID, attributes, price, variations and images is fetched for a specific seller store, together with the seller's profile information, and added to a database. The fetched product and seller information for a seller store can also be exported as a CSV file. The CSV format of the product information can be customized to support product import into eCommerce marketplace websites.
- Algorithm to pass all CAPTCHA and seller login authentication layers.
- Data mining of product information for each category available in the seller store.
- Image URLs are built for the images associated with each product, and the images are saved via server-to-server download.
- Seller store and product information can be exported as separate CSV files compatible with the Magento store import format.
- Product information and seller details are mapped through a generated unique ID, which provides a way to associate products with sellers in Magento.
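
The ID mapping, image URL building and CSV export listed above can be illustrated with a minimal Python sketch. This is not the project's actual code: the base URL, the field names (`store_name`, `item_id`, `image_path`) and the CSV columns are hypothetical placeholders, the columns are not the real Magento import schema, and deriving the ID with `uuid.uuid5` is only one possible way to generate a stable unique ID.

```python
import csv
import io
import uuid
from urllib.parse import urljoin

# Hypothetical base URL and ID namespace, used for illustration only.
BASE_URL = "https://supplier.example/"
ID_NAMESPACE = uuid.uuid5(uuid.NAMESPACE_URL, BASE_URL)


def make_id(kind, source_key):
    """Derive a stable unique ID from a record type and its key on the source site."""
    return str(uuid.uuid5(ID_NAMESPACE, f"{kind}:{source_key}"))


def link_products(seller, products):
    """Attach generated IDs so every product row points back to its seller."""
    seller_row = dict(seller, seller_id=make_id("seller", seller["store_name"]))
    product_rows = []
    for product in products:
        row = dict(product)
        row["product_id"] = make_id("product", product["item_id"])
        row["seller_id"] = seller_row["seller_id"]
        # Turn a site-relative image path into an absolute, downloadable URL.
        row["image_url"] = urljoin(BASE_URL, row.pop("image_path"))
        product_rows.append(row)
    return seller_row, product_rows


def to_csv(rows, columns):
    """Serialize rows to CSV text, keeping only the requested columns."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=columns, extrasaction="ignore")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


if __name__ == "__main__":
    seller = {"store_name": "acme-wholesale", "contact": "sales@acme.example"}
    products = [
        {"item_id": "1001", "name": "Desk Lamp", "price": "19.90",
         "image_path": "/img/1001.jpg"},
    ]
    seller_row, product_rows = link_products(seller, products)
    # Two separate CSV exports: one for the seller store, one for its products.
    print(to_csv([seller_row], ["seller_id", "store_name", "contact"]))
    print(to_csv(product_rows,
                 ["product_id", "seller_id", "name", "price", "image_url"]))
```

Because the ID is derived from the source key rather than generated at random, re-running the script on the same store yields the same `seller_id`, so previously imported products stay associated with the right seller. The column list passed to `to_csv` is where the output would be adapted to a particular store's import format.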
Magento 1.9, Selenium, Scrapy, CAPTCHA, CSV, MySQL, Beautiful Soup