Next Page on Crunchbase - Gets to 20 then restarts at 2

Hi,

I created a recipe to scrape the data from a Crunchbase search that has 22k results.

Crunchbase shows 50 rows per page, hence 22,000 / 50 = 440 pages.

I selected my rows and columns, the easy Nav finder found the “Next” button on the page without trouble, and I set it going.

After trying it once or twice and not getting the output I was expecting, I watched the log (see below) and noticed that it progresses until it reaches pageId=20 and then restarts at 2.

So it's not that the Nav/Next page isn't working (it goes from 1 to 20 OK), and it's not happening at the rollover from page 9 to 10 (1 to 2 digits) or page 99 to 100 (2 to 3 digits), so I am perplexed as to why DataMiner is doing this.

So now I have burnt through a bunch of the pages-per-month allowance in my subscription re-scraping the same pages, and I still don't have the data I need.

(Trying to post this, I hit a limit of only 2 URLs per post, so I have replaced https://www.crunchbase.com/ with … below.)

Has anyone seen this behaviour before, and do you know how to fix it, please?

Thanks

Ian

Here is the output from the Log

Scraping logs

Page: “…discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499”

22:03:15 Not Started

22:03:25 Scrape Requested.

22:03:25 Scraping Data: Crunchbase Sg Private Company

22:03:25 Scraped Page.

22:03:27 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=2_a_d94ecad4-dc36-4bfa-9f28-461c9b1244b4”

22:03:37 Scraped Page.

22:03:39 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=3_a_00a1b342-5f4d-4dbc-0687-dc4c5ef900bc”

22:03:52 Scraped Page.

22:03:53 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=4_a_d75aee19-b425-1a27-eaa9-18cd45c821b8”

22:04:07 Scraped Page.

22:04:08 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=5_a_9cc14b9c-da70-2da3-3643-9056ea285e07”

22:04:21 Scraped Page.

22:04:23 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=6_a_61aadb9c-a60d-45d9-976b-cdcf71700604”

22:04:36 Scraped Page.

22:04:39 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=7_a_abb252ed-9deb-4f84-87c3-42b2e59597f4”

22:04:52 Scraped Page.

22:04:55 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=8_a_0096880d-1427-ada2-c5c4-6996f2f6639d”

22:05:06 Scraped Page.

22:05:08 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=9_a_eff4a653-b90d-4b02-a551-abebd10f5b4e”

22:05:20 Scraped Page.

22:05:21 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=10_a_9b49dc21-99a4-4ba2-bb6d-d7295004f5d3”

22:05:35 Scraped Page.

22:05:36 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=11_a_78dfd53e-4519-4bde-a6e8-f39e92c58ceb”

22:05:46 Scraped Page.

22:05:48 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=12_a_aedfe7d0-0348-4d5c-ba93-743b36a6f71a”

22:06:00 Scraped Page.

22:06:01 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=13_a_114bcec0-35d8-462a-88e7-d66055abcef6”

22:06:11 Scraped Page.

22:06:13 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=14_a_cdfe812d-8f2e-47ad-bdc4-f2e391dcdb96”

22:06:23 Scraped Page.

22:06:24 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=15_a_f4d737cf-a282-42b5-89ae-5f4aec6cb6b0”

22:06:38 Scraped Page.

22:06:40 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=16_a_8df40707-365a-4121-97a8-feec5c631bfd”

22:06:51 Scraped Page.

22:06:53 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=17_a_6385c794-4a95-40e0-a90b-667fe7917985”

22:07:04 Scraped Page.

22:07:07 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=18_a_085714a3-e5a4-408e-9601-94e3324528da”

22:07:18 Scraped Page.

22:07:20 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=19_a_34f18b86-9aa2-456c-94df-1731ca4dcb99”

22:07:32 Scraped Page.

22:07:33 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=20_a_cb46bb4c-7023-4ae2-9f17-48c8f7aa5788”

22:07:48 Scraped Page.

22:07:49 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499”

22:08:02 Scraped Page.

22:08:03 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=2_a_d94ecad4-dc36-4bfa-9f28-461c9b1244b4”

22:08:16 Scraped Page.

22:08:18 Waiting…

Page: “…/discover/saved/sg-private-companies/efc04efd-e14e-45df-99c2-5dd2089f4499?pageId=3_a_00a1b342-5f4d-4dbc-0687-dc4c5ef900bc”
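In case it helps anyone check their own log for the same wrap-around, here is a rough Python sketch (my own, not part of DataMiner) that pulls the page number out of each "Page:" line and reports the first page visited twice. It assumes the pageId value always starts with the page number followed by an underscore, as in the log above, and it treats a URL with no pageId as page 1; both are assumptions based only on this log. The function names and the shortened sample URLs are made up for illustration.

```python
import re

# Assumption: pageId values look like "17_a_<uuid>", so the leading digits are the page number.
PAGE_ID = re.compile(r"[?&]pageId=(\d+)_")


def page_numbers(log_lines):
    """Yield one page number per 'Page:' line; a URL without pageId is treated as page 1."""
    for line in log_lines:
        if not line.startswith("Page:"):
            continue  # skip timestamped status lines such as "22:03:25 Scraped Page."
        match = PAGE_ID.search(line)
        yield int(match.group(1)) if match else 1


def first_repeat(log_lines):
    """Return (position, page) for the first page seen twice, or None if no page repeats."""
    seen = set()
    for position, page in enumerate(page_numbers(log_lines)):
        if page in seen:
            return position, page
        seen.add(page)
    return None


# Hypothetical, shortened log lines in the same shape as the real log.
sample = [
    "Page: “…/discover/saved/x”",
    "22:03:25 Scraped Page.",
    "Page: “…/discover/saved/x?pageId=2_a_aaa”",
    "Page: “…/discover/saved/x?pageId=3_a_bbb”",
    "Page: “…/discover/saved/x”",
]
print(first_repeat(sample))  # (3, 1): the fourth "Page:" line is page 1 again
```

Run against the full log text (one line per list item), this should flag the point where the scrape returns to a page it has already visited, so the run can be stopped before it uses up more of the page allowance.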