I have a mid-to-large data scraping project. It involves scraping roughly 3 million web pages from a large, well-known site. About 400,000 of these pages will require following a link on the page and scraping a second, related page. The site lists the pages in sequential order (i.e. [login to view URL], then [login to view URL], etc.), so it is very easy to get the pages; there is no need to enter a search term. The site will likely block the scraping, so you must understand how to use proxies to get around site blocking. This should be a pretty simple project. You will be required to use your own server and deliver the data, not the scraping code. If you have experience scraping, this should be easy; if not, please do not reply to this post. After receiving your initial bid and looking at your background, I will share more information.
The data will require some sorting and will be delivered in Excel and MySQL formats, and may also be transferred to a server.
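
To illustrate the kind of setup I have in mind, here is a rough Python sketch of walking the sequential page IDs while rotating through a proxy list. The URL pattern, proxy addresses, and function name are hypothetical placeholders, not the real site's details, and the actual fetching and parsing are left out.

```python
from itertools import cycle
from typing import Iterator, List, Tuple

# Hypothetical placeholders: the real site and URL pattern are not given here.
BASE_URL = "https://example.com/item/{page_id}"
PROXIES = ["http://proxy-a.example:8080", "http://proxy-b.example:8080"]


def plan_requests(start: int, stop: int, proxies: List[str]) -> Iterator[Tuple[str, str]]:
    """Yield (url, proxy) pairs for page IDs start..stop-1, rotating proxies round-robin."""
    if not proxies:
        raise ValueError("at least one proxy is required")
    pool = cycle(proxies)  # endlessly repeats the proxy list in order
    for page_id in range(start, stop):
        yield BASE_URL.format(page_id=page_id), next(pool)


if __name__ == "__main__":
    # Print the plan for the first three pages; no network requests are made.
    for url, proxy in plan_requests(1, 4, PROXIES):
        print(url, "via", proxy)
```

Each pair could then be handed to whatever HTTP client you prefer, with retries on a different proxy when a request is blocked.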