If you plan to scrape an entire website, or data on the order of several hundred thousand records, do not try to mine the whole database in a single session. Instead, split the task into smaller, manageable chunks of, say, a few thousand records each. The following methods help with this.


1. Change the starting URL of the configuration, as explained at Edit Starting URL, so that it starts mining at a different page from the one it was originally configured for. Each session can then cover a different part of the site.

2. Enable the Auto Save Mined Data option (see Miner Settings) so that mined data is not lost if the program terminates unexpectedly. Also enable the Inject Pauses during mining option in Miner Settings to avoid sending requests to the web server continuously over a long period.

3. Because the target website may block your IP address during large mining sessions, you may need to scrape data anonymously via proxy servers or a VPN so that sessions are not aborted prematurely.

4. If you are using the Category or Keyword scraping features with a large number of keywords or category links, split them into smaller sets by editing the keyword list and the URL list associated with the configuration.
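
The splitting described in item 4 can be prepared outside the program before pasting each set into the configuration. The following is a minimal Python sketch of that idea, not a WebHarvy feature or API: the function name `split_into_batches`, the sample keywords, and the batch size of 4 are all assumptions made for illustration.

```python
from typing import Iterator, List


def split_into_batches(items: List[str], batch_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of `items`, each with at most `batch_size` entries."""
    if batch_size <= 0:
        raise ValueError("batch_size must be a positive integer")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical keyword list; in practice this would be your own keywords
# or category URLs, for example read from a text file.
keywords = [f"keyword-{n}" for n in range(1, 11)]

# Batch size of 4 is an arbitrary illustrative value.
batches = list(split_into_batches(keywords, batch_size=4))
for number, batch in enumerate(batches, start=1):
    print(f"Batch {number}: {len(batch)} keywords")
```

Each resulting batch can then be used for a separate mining session, so that a failure or block affects only one small set rather than the whole list.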


Reference: https://www.webharvy.com/articles/howto.html