Mining stops before the requested/total number of pages are scraped
This usually happens when WebHarvy is unable to load the next page of data by clicking the next page link selected during configuration. Please try the following.
- 1. Increase the 'Page Load Timeout' and 'Script Load Wait Time' values in Miner Settings so that the page gets enough time to load all data before it is scraped. Increasing these values will slow down mining but will minimize page load timeouts.
- 2. If the website displays separate links to load pages 1, 2, 3 etc., click on the direct link to load page number 2 and set it as the next page link.
- 3. Try URL based pagination or pagination via JavaScript.
- 4. When mining aborts before completion, you can click the Start button again (without closing the Miner window) and WebHarvy will try to resume mining from where it stopped.
- 5. You can also change the starting URL of the configuration directly, so that mining starts from the page where it stopped rather than from the page it was originally configured for.
- 6. Websites may also block you if you access their pages via software for a long time or extract large amounts of data. The solution is to scrape via proxy servers or a VPN so that you remain anonymous and avoid getting blocked. Try using proxy servers with WebHarvy.
- 7. Try again after enabling the 'Disable cookies while mining' option and/or the 'Use separate browser engines for mining links' option in Browser Settings.
- 8. If you are trying to scrape a relatively large number of records, please refer to 'How to scrape large amounts of data?'
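
For URL based pagination (step 3) and for changing the starting URL to resume mining (step 5), it helps to work out what the URL of a given page looks like. The Python sketch below is only an illustration and is not part of WebHarvy. It assumes the site identifies pages with a query parameter named page, which is a hypothetical name; check the address bar while browsing the site to find the real pattern. The function names page_url and resume_urls and the example.com address are likewise made up for this example.

```python
from urllib.parse import urlencode, urlparse, parse_qs, urlunparse


def page_url(base_url, page, param="page"):
    """Return base_url with the page-number query parameter set to `page`.

    `param` is an assumed parameter name; many sites use something else
    (for example "p" or "pg"), or put the page number in the path instead.
    """
    parts = urlparse(base_url)
    query = parse_qs(parts.query)      # existing query parameters, if any
    query[param] = [str(page)]         # add or overwrite the page number
    new_query = urlencode(query, doseq=True)
    return urlunparse(parts._replace(query=new_query))


def resume_urls(base_url, last_done, total_pages, param="page"):
    """List the URLs of the pages that still need to be mined."""
    return [
        page_url(base_url, p, param)
        for p in range(last_done + 1, total_pages + 1)
    ]


if __name__ == "__main__":
    start = "https://example.com/list?cat=books&page=1"
    # Mining stopped after page 4 of 6, so the next starting URL is page 5.
    print(page_url(start, 5))
    # https://example.com/list?cat=books&page=5
    for url in resume_urls(start, last_done=4, total_pages=6):
        print(url)
```

The first URL printed is the one you would set as the new starting URL of the configuration. The full list can serve as a checklist of the pages that remain.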