Web Scrape multiple retail sites

Cancelled Posted Jul 21, 2006 Paid on delivery

I have an existing marketplace website that is currently under development. I would like to web-scrape multiple retail websites, cleanse the data, and store it in the database. I will provide you with a copy of our database. See the attached use case for a detailed description of this functionality.

I need a very experienced team with multiple completed work references in web scraping and data cleansing. I will need a clear and precise description of how you intend to implement the solution and which software packages (web scraping and data cleansing) you intend to use. If I don't get work references for this kind of solution (web scraping), a detailed description of how you intend to build it, and the languages and packages you intend to use, I will not look at your bid.

I will need a timeframe for when you intend to complete the job. This has to be a solid timeframe. I will also need to know the number of full-time developers who will work on this project. I expect at least 2 developers.

Please note that I don't expect you to implement any part of our existing website's front end. You are only loading our website's database with data that you will web-scrape from other websites.

The way I envision this working is:

1. The scraping jobs scrape the websites listed in the original document. Some sites may have RSS feeds.
2. Scraping scripts load the data into a staging area.
3. Scraping scripts cleanse the data from the staging area and store the cleansed data in a database dedicated to scraped data.
4. Load scripts load the cleansed data into a search engine index database via XML calls.
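To make the four steps above concrete for bidders, here is a minimal Python sketch of the flow for a site that exposes an RSS feed. Everything in it is an assumption for illustration only: the sample feed, the table names `staging` and `product`, the use of an in-memory SQLite database as a stand-in for the real database, and the `<add><doc>` XML layout for the search index. The actual schema, target sites, and index format are defined by the attached use case, not by this sketch.

```python
import sqlite3
import xml.etree.ElementTree as ET
from decimal import Decimal, InvalidOperation

# Hypothetical feed content; a real job would fetch this over HTTP.
SAMPLE_RSS = """<rss version="2.0"><channel>
<item><title>  Blue Widget </title><link>http://example.com/p/1</link>
<description>$19.99</description></item>
<item><title>Red Widget</title><link>http://example.com/p/2</link>
<description>price unavailable</description></item>
<item><title>Blue Widget</title><link>http://example.com/p/1</link>
<description>USD 19.99</description></item>
</channel></rss>"""


def scrape_rss(rss_text):
    """Step 1: pull the raw title, link, and description from each RSS item."""
    root = ET.fromstring(rss_text)
    for item in root.iter("item"):
        yield (item.findtext("title", ""),
               item.findtext("link", ""),
               item.findtext("description", ""))


def parse_price(raw):
    """Return a Decimal price found in free text, or None if there is none."""
    cleaned = "".join(ch for ch in raw if ch.isdigit() or ch == ".")
    try:
        return Decimal(cleaned) if cleaned else None
    except InvalidOperation:
        return None


def run_pipeline(rss_text):
    # SQLite in memory stands in for the real staging and scraped-data databases.
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE staging (title TEXT, url TEXT, raw_price TEXT)")
    db.execute("CREATE TABLE product (url TEXT PRIMARY KEY, title TEXT, price TEXT)")

    # Step 2: load the raw rows into the staging area untouched.
    db.executemany("INSERT INTO staging VALUES (?, ?, ?)", scrape_rss(rss_text))

    # Step 3: cleanse (trim text, parse price, drop unusable rows and duplicates).
    rows = db.execute("SELECT title, url, raw_price FROM staging").fetchall()
    for title, url, raw_price in rows:
        price = parse_price(raw_price)
        if price is None or not url.strip():
            continue  # skip rows with no usable price or URL
        db.execute("INSERT OR IGNORE INTO product VALUES (?, ?, ?)",
                   (url.strip(), " ".join(title.split()), str(price)))

    # Step 4: build the XML payload a load script would send to the search index.
    add = ET.Element("add")
    for url, title, price in db.execute(
            "SELECT url, title, price FROM product ORDER BY url"):
        doc = ET.SubElement(add, "doc")
        for name, value in (("url", url), ("title", title), ("price", price)):
            field = ET.SubElement(doc, "field", name=name)
            field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    print(run_pipeline(SAMPLE_RSS))
```

Keeping the staging table as an untouched copy of the scraped data means cleansing rules can be changed and re-run without scraping the source sites again. Sites without RSS feeds would need an HTML parser in place of `scrape_rss`, which this sketch does not cover.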

## Deliverables

1) Complete and fully-functional working program(s) in executable form as well as complete source code of all work done.

2) Deliverables must be in ready-to-run condition, as follows (depending on the nature of the deliverables):

a) For web sites or other server-side deliverables intended to only ever exist in one place in the Buyer's environment--Deliverables must be installed by the Seller in ready-to-run condition in the Buyer's environment.

b) For all others including desktop software or software the buyer intends to distribute: A software installation package that will install the software in ready-to-run condition on the platform(s) specified in this bid request.

3) All deliverables will be considered "work made for hire" under U.S. Copyright law. Buyer will receive exclusive and complete copyrights to all work purchased. (No GPL, GNU, 3rd party components, etc. unless all copyright ramifications are explained AND AGREED TO by the buyer on the site per the coder's Seller Legal Agreement).

## Platform

OS - Windows Server 2003

Database Administration, Engineering, MySQL, PHP, Software Architecture, Software Testing, SQL, Web Hosting, Website Management, Website Testing

Project ID: #3664092

About the project

Remote project Active Jul 22, 2006