Check for Valid URLs Rapidly - Using Rotating IP and Proxies - Mac OS

Closed, posted 7 years ago, paid on delivery

I need a piece of software, preferably desktop-based for Mac OS, that cycles through an increasing integer and inserts it into a URL. The software should check whether the URL leads to a valid product page and, if so, save the integer in a list within the software's GUI.

The general template would require the base site address and the starting integer. Both values are defined by the user. For instance:

http://(sitenamehere)/cart/(integer here):1
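
To make the template concrete, here is a minimal Python sketch of how the URLs could be generated. The function names `build_url` and `candidate_urls` and the host `example.com` are illustrative assumptions, not part of the posting; only the `/cart/<integer>:1` pattern comes from the template above.

```python
from itertools import count, islice


def build_url(site: str, product_id: int) -> str:
    """Fill the posting's template: http://(sitenamehere)/cart/(integer here):1"""
    return f"http://{site}/cart/{product_id}:1"


def candidate_urls(site: str, start: int):
    """Yield (integer, url) pairs, increasing the integer by one each time."""
    for product_id in count(start):
        yield product_id, build_url(site, product_id)


if __name__ == "__main__":
    # "example.com" and the starting integer 1000 are placeholder inputs.
    for product_id, url in islice(candidate_urls("example.com", 1000), 3):
        print(product_id, url)
```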

So (sitenamehere) would be the site we're scraping info on, and (integer here) would be the number we are starting at. The software should increase the integer by one for every check, and should use rotating IPs and proxies to avoid bans. I would like the software to perform these checks as quickly as possible; I am looking to check about 800k to 1 million product pages per day.
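
As a rough illustration of the check described above, the following Python sketch cycles through a proxy list round-robin and records the integers whose URL answers with HTTP 200. Several things here are assumptions rather than requirements from the posting: treating status 200 as "valid product page" (a site that redirects missing products to another page would need a different test), the helper names `fetch_status`, `find_valid_ids`, and `required_rate`, and the proxy addresses. The daily target works out to roughly 800,000 / 86,400 ≈ 9.3 to 1,000,000 / 86,400 ≈ 11.6 checks per second, sustained.

```python
import urllib.error
import urllib.request
from itertools import cycle

SECONDS_PER_DAY = 86_400


def fetch_status(url, proxy, timeout=10.0):
    """Return the HTTP status code for url fetched via proxy, or None on a network failure."""
    opener = urllib.request.build_opener(
        urllib.request.ProxyHandler({"http": proxy})
    )
    try:
        with opener.open(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code  # e.g. 404 for a missing product
    except (urllib.error.URLError, OSError):
        return None  # proxy down, timeout, DNS failure, ...


def find_valid_ids(site, start, how_many, proxies, fetch=fetch_status):
    """Check how_many consecutive integers, using the next proxy for each request."""
    proxy_pool = cycle(proxies)  # round-robin rotation
    valid = []
    for product_id in range(start, start + how_many):
        url = f"http://{site}/cart/{product_id}:1"
        # Assumption: a 200 response means the product page exists.
        if fetch(url, next(proxy_pool)) == 200:
            valid.append(product_id)
    return valid


def required_rate(pages_per_day):
    """Average checks per second needed to reach a daily page target."""
    return pages_per_day / SECONDS_PER_DAY


if __name__ == "__main__":
    print(round(required_rate(800_000), 2), round(required_rate(1_000_000), 2))
    # Placeholder proxies; real addresses would come from a proxy provider.
    # print(find_valid_ids("example.com", 1000, 20, ["http://10.0.0.1:8080", "http://10.0.0.2:8080"]))
```

This loop is sequential for clarity. At the rates above, each request would have to finish in about 100 ms, so a real tool would most likely need to run many checks concurrently.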

PHP, Software Architecture, Web Scraping

Project ID: #11128270

About the project

7 proposals, remote project, active 7 years ago

7 freelancers are bidding an average of $152 for this job

e3d

Hi, it's not a problem to make such software; however, sites usually ban your IP after a few hundred-thousand attempts. This can be avoided by using proxies, but it's not cheap to hold your own proxies, so I can quote yo...

$147 USD in 3 days
(344 reviews)
9.0
phpXpertbd

Dear Sir, I'm delighted to let you know that I have done data scraping from many sites with PHP-cURL, PhantomJS, Node.js, and Selenium. I just scraped the data from a web site and then wrote the data to a MySQL database...

$250 USD in 5 days
(101 reviews)
7.5
hAbd0u

Hello, your project was very clear, but I have some questions: 1) "The general template would require the base site address, and the starting integer. Both numbers defined by the user. For instance: http://(sitename...

$190 USD in 10 days
(7 reviews)
4.5
developers786

I have developed this kind of scraper before, for yellowpages, justdial and dx.com. Auto-increment through pages to scrape the next page's data and a proxy script are already implemented. I can provide you the software after minor changes. b...

$133 USD in 10 days
(6 reviews)
3.1
paristracy

A proposal has not yet been provided

$155 USD in 1 day
(0 reviews)
0.0