Scrape a website & insert into database & perform some tasks with the information

$250-750 CAD

Closed
Posted almost 8 years ago

Paid on delivery
I need someone to write software that will archive every listing posted on a particular website and use that information as described in the Features section of this post.

Basic logic of the program:
1. Send a request to a website that returns listings in XML format.
2. Check each listing against a MySQL database.
3. Send a web request to each new listing individually to get all of its information.
4. Run features 1, 2 and 3 (explained in detail below).
5. Upload images from the listings to Amazon S3.
6. Add the information for each listing to a MySQL database.
7. Sleep before looping back to step 1 (see feature 4).

Limitations:
The website returns at most 20 listings per request (step 1). If every listing on a page is new, keep requesting the next page of listings until previously archived listings are found, so that no listings are missed. (During peak times it is possible for more than 20 listings to be posted within the minimum sleep period of 2 minutes.)

Features:
1. Create a table that tracks listings that are from the same user (identified by two values found in the listing). Keep a tally of how many listings that user has posted and a tally of how many of those listings are unique. (I suggest this is done on a separate thread so as not to slow down the scraping.)
2. If enabled, check each new listing's price against comparable listings on another website (web request to an API), and calculate the average value for comparable listings using the archive of listings in my database. Use these figures to decide whether the listing is undervalued by a configurable amount or percentage, and if so send an alert (Amazon SNS and database entry). (This must be done on a separate thread so as not to slow down the scraping.)
3. Check each listing against search criteria, which can be configured by adding rows of criteria to a MySQL database, and send an alert (Amazon SNS and database entry) if a new listing satisfies those criteria. (These will be simple criteria, such as whether the listing's price is >100, or whether the listing is a specific model, etc.) (This must be done on a separate thread so as not to slow down the scraping.)
4. Adjust the sleep time automatically so as to minimize the number of pages requested before finding previous listings (explained in Limitations). The sleep time before looping has a minimum of 2 minutes, a maximum of 15 minutes from 7AM-11PM, and a maximum of 2 hours from 11PM-7AM.
5. Once daily, check each active listing in the database against the website to see whether the listing has been updated or deleted. If it has been updated, save the changes to the database as a new row. If it has been deleted, change the status in the database so the listing will not be checked again. (I suggest this be a separate script run by a cron job.)

Requirements:
1. Must run on a Linux server.
2. Error handling (website down, website responds with unexpected data, etc.).
3. Log activity and errors in a text file. Send an alert if errors occur (Amazon SNS and entry into database).

The program can be coded in any language that can run on a Linux VPS and take advantage of the multiple IP addresses the server has. PHP would be preferred.
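One way to read the paging limitation and feature 4 together is sketched below in Python, purely for illustration (the post prefers PHP). Everything named here is an assumption and not part of the post: `fetch_page` is a hypothetical callable that returns up to 20 listing IDs for a page, newest first, and the halve/grow-by-25% rule in `next_sleep` is just one possible heuristic for adjusting the sleep time. Only the 2-minute minimum, the 15-minute daytime maximum and the 2-hour night-time maximum come from the description.

```python
from datetime import time
from typing import Callable, List, Set, Tuple

MIN_SLEEP = 2 * 60              # minimum sleep: 2 minutes (from the post)
MAX_SLEEP_DAY = 15 * 60         # maximum from 7AM to 11PM (from the post)
MAX_SLEEP_NIGHT = 2 * 60 * 60   # maximum from 11PM to 7AM (from the post)


def collect_new_listings(
    fetch_page: Callable[[int], List[int]],  # hypothetical page fetcher
    known_ids: Set[int],
    max_pages: int = 50,                     # assumed safety cap
) -> Tuple[List[int], int]:
    """Request pages until one contains an already-archived listing.

    Returns the new listing IDs (newest first) and the number of pages requested.
    """
    new_ids: List[int] = []
    collected: Set[int] = set()
    pages_requested = 0
    for page in range(1, max_pages + 1):
        ids = fetch_page(page)
        pages_requested += 1
        for listing_id in ids:
            # Skip archived listings and duplicates caused by page shifts.
            if listing_id not in known_ids and listing_id not in collected:
                collected.add(listing_id)
                new_ids.append(listing_id)
        # Stop on an empty page or once a previously archived listing appears.
        if not ids or any(listing_id in known_ids for listing_id in ids):
            break
    return new_ids, pages_requested


def max_sleep_for(now: time) -> int:
    """Upper bound on the sleep time for the given time of day."""
    return MAX_SLEEP_DAY if time(7, 0) <= now < time(23, 0) else MAX_SLEEP_NIGHT


def next_sleep(current: int, pages_requested: int, now: time) -> int:
    """Assumed heuristic: halve the sleep if more than one page was needed,
    otherwise grow it by 25%, then clamp to the limits from the post."""
    proposed = current / 2 if pages_requested > 1 else current * 1.25
    return int(max(MIN_SLEEP, min(proposed, max_sleep_for(now))))
```

A main loop would call `collect_new_listings`, process the new IDs, and then sleep for `next_sleep(current, pages_requested, now)` seconds. The heuristic shortens the wait whenever a cycle needed a second page and lengthens it slowly otherwise, so the scraper tends toward roughly one page per cycle.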
Project ID: 10186611

About the project

11 proposals
Remote project
Active 8 years ago

11 freelancers are bidding an average of $386 CAD for this job

Hi, I have read the description and would like to discuss. I have good web scraping experience and reviews, and can develop web scraping scripts in Python and C#. Hope we can discuss details.
$250 CAD in 3 days
5.0 (149 reviews)
6.8

We are a team (19 operators and 2 quality checkers) that has been providing research services worldwide for the last 4 years with best-quality output. I have gone through your project description. It is a really interesting job, and our operators are experienced enough in research that they can easily collect the data from several sources through deep investigation, but it is a somewhat time-consuming job, not a copy-paste one. We would like to talk in detail and outline how we will do this job if you need. Let's talk here to discuss the job. Thanks, Dg
$250 CAD in 10 days
4.7 (221 reviews)
7.1

I have reviewed your bid request and I am very interested in your project. I was trained overseas and have an extensive customer service record so contact me so we can discuss further or begin. I work in milestones and the "payment for time" option. If payment is by deliverables, then the milestones are 50% payment once the initial work/draft is done and the remaining can be paid if/when revisions are needed and completed. Bonuses welcomed and much appreciated. I've done many jobs on freelancer.com and hope for many with you and if nothing else add me to your coder list and notify me of your future jobs. Thanks.
$261 CAD in 7 days
2.5 (1 review)
1.7

I have great expertise in web scraping in PHP. I have built up a personal library that lets me accomplish every request easily. I can handle sessions, proxies and avoid anti-scraping controls.
$250 CAD in 3 days
0.0 (0 reviews)
0.0

I am new to Freelancer, but I have been working with a company and did really well there. I have made a few apps and done more than 1K data entry projects. I have a typing speed of almost 95 WPM and can assure you that I will complete your work and give you the best I can.
$277 CAD in 10 days
0.0 (0 reviews)
0.0

A proposal has not yet been provided
$555 CAD in 10 days
0.0 (0 reviews)
0.0

About the client

Bangladesh
0.0
0
Member since Apr 11, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)