Crawl Domain Find Expired Domains -- 2

Closed Posted 4 years ago Paid on delivery

We are looking for a crawler to crawl every page of a website looking for external links pointing to expired domains.
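As a rough first pass at the "expired domain" check, a failed DNS lookup can flag a domain as a candidate. This is only a sketch under that assumption: a domain that does not resolve may still be registered, so candidates would need confirming through WHOIS or a registrar lookup, which the brief does not specify. The function names and the injectable `resolve` parameter are illustrative, not part of the brief.

```python
import socket


def looks_unregistered(domain, resolve=socket.gethostbyname):
    """Heuristic only: True if the domain fails to resolve in DNS.

    A non-resolving domain is a candidate, not a confirmed available domain.
    `resolve` is injectable so the check can be exercised without a network.
    """
    try:
        resolve(domain)
    except (socket.gaierror, UnicodeError):
        return True
    return False


def check_unique(domains, resolve=socket.gethostbyname):
    """Look up each distinct domain once and map it to the heuristic result."""
    return {d: looks_unregistered(d, resolve) for d in sorted(set(domains))}
```

Passing the input through `set()` means each domain is looked up once, however many pages link to it.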

The user should define a list of sites to crawl in a text file. The crawler should work by following links to reach every page of a site and must not depend on a sitemap. Only unique external domains should be logged, to prevent duplicate domain availability lookups.
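One way to read this requirement is a breadth-first crawl that follows internal links and collects external hostnames in a set. The sketch below uses only the standard library. `crawl_site`, `host_of`, the `max_pages` cap, and the caller-supplied `fetch` function are all assumptions made for illustration. It compares hostnames with a leading `www.` stripped, so other subdomains of the crawled site count as external, and mapping hostnames to registrable domains would need a Public Suffix List lookup.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkParser(HTMLParser):
    """Collect the href of every <a> tag in a page."""

    def __init__(self):
        super().__init__()
        self.hrefs = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value:
                    self.hrefs.append(value)


def host_of(url):
    """Return the lowercased hostname of a URL without a leading 'www.'."""
    host = (urlparse(url).hostname or "").lower()
    return host[4:] if host.startswith("www.") else host


def crawl_site(start_url, fetch, max_pages=1000):
    """Breadth-first crawl of one site by following its internal links.

    `fetch` is a hypothetical caller-supplied function that takes a URL and
    returns the page HTML, or None on failure. No sitemap is consulted.
    Returns the set of unique external hostnames linked from the site.
    """
    site = host_of(start_url)
    queue = deque([start_url])
    seen_pages = {start_url}
    external = set()
    fetched = 0
    while queue and fetched < max_pages:
        url = queue.popleft()
        fetched += 1
        html = fetch(url)
        if html is None:
            continue
        parser = LinkParser()
        parser.feed(html)
        for href in parser.hrefs:
            # Resolve relative links and drop fragments.
            link = urljoin(url, href).split("#", 1)[0]
            if urlparse(link).scheme not in ("http", "https"):
                continue
            host = host_of(link)
            if not host:
                continue
            if host == site:
                if link not in seen_pages:
                    seen_pages.add(link)
                    queue.append(link)
            else:
                external.add(host)  # the set keeps each domain once
    return external
```

In practice `fetch` could wrap `urllib.request.urlopen` with a timeout and error handling, and the function would be called once per line of the site list.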

The user should also be able to define a list of URLs to skip when checking availability, e.g. [login to view URL], etc. These domains should be defined by the user in a blacklist text file.
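The blacklist could be handled as sketched below, assuming one domain per line in the text file, with blank lines and `#` comments ignored. That file format, the function names, and the choice to treat subdomains of a blacklisted domain as blacklisted are assumptions, since the brief does not spell them out.

```python
from pathlib import Path


def load_list(path):
    """Read one entry per line, skipping blank lines and '#' comments."""
    entries = set()
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip().lower()
        if line and not line.startswith("#"):
            entries.add(line)
    return entries


def is_blacklisted(host, blacklist):
    """True if host equals a blacklisted domain or is a subdomain of one."""
    host = host.lower()
    return any(host == d or host.endswith("." + d) for d in blacklist)


def filter_candidates(hosts, blacklist):
    """Return the hosts that still need an availability lookup, sorted."""
    return sorted(h for h in hosts if not is_blacklisted(h, blacklist))
```

The same `load_list` helper would also work for reading the list of sites to crawl.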

Results should be delivered in a CSV file listing the linking domain and the available domain.
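A minimal sketch of the output step, using Python's standard `csv` module. The column names `linking_domain` and `available_domain` and both function names are assumptions based on the wording above, not a format the client confirmed.

```python
import csv
import io

# Assumed header names; the brief only says "linking domain and available domain".
FIELDNAMES = ["linking_domain", "available_domain"]


def write_results(rows, stream):
    """Write (linking_domain, available_domain) pairs as CSV, deduplicated."""
    writer = csv.writer(stream)
    writer.writerow(FIELDNAMES)
    for linking, available in sorted(set(rows)):
        writer.writerow([linking, available])


def results_as_text(rows):
    """Render the CSV into a string, handy for previews and tests."""
    buf = io.StringIO(newline="")
    write_results(rows, buf)
    return buf.getvalue()
```

When writing to disk, open the file with `open(path, "w", newline="", encoding="utf-8")` so the `csv` module controls line endings.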

Python Web Scraping

Project ID: #19765484

About the project

6 proposals Remote project Active 4 years ago

6 freelancers are bidding an average of $28 for this job

chirgeo

Hi. I read the project description and have a few questions. 1. Do you need the script as well, or the data only? 2. What is the format of the output data? Is CSV OK? We can do other formats as well. 3. Which fields do...

$100 USD in 5 days
(110 reviews)
7.3
smsaurabhv

Hi, I have gone through your requirement to scrape lots of websites. I am an EXPERT in building scraping tools and scripts, so I can SURELY work on your project. I have 4 YEARS of EXPERIENCE in developing PHP-PYTHO...

$15 USD in 3 days
(51 reviews)
4.9
techlinesols6

Dear Prospective Hiring Manager, thank you for giving me a chance to bid on your project. I am a serious bidder, I have already worked on a similar project, and I can deliver as you have mentioned. "I can do th...

$13 USD in 7 days
(1 review)
0.0
hienhdt32

I have experience crawling data with Python using bs4, Scrapy, ..., and extracting data to XML, JSON, ... Contact me!

$20 USD in 2 days
(0 reviews)
0.0
junadatar947

Hi there, JUNA here. I understand that you need a crawler, or a spider, for scraping expired domains, but my question is: will you provide the list of domains that you need to check for availability? Otherwise this sho...

$12 USD in 3 days
(0 reviews)
0.0