The requirement is to develop a system that crawls a specific website, indexes all of its pages, and saves the references in a SQL database hosted on Amazon's cloud.
• Crawl a specific website and index all pages on the site.
• Develop a feature that generates a unique numeric page ID for each page on the website.
• Save a reference to every page (the page ID) in a database.
• The database must contain one record per page holding the page ID and the page URL, plus an extra field that can hold up to 2000 characters of plain text for that page.
• The database must be a MySQL database that can be hosted in an Amazon cloud solution.
• It must be possible to restart the crawling process and save any new pages into the database as new records.
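One way these requirements could fit together is sketched below in Python. It is only an illustration under stated assumptions, not a finished design: the table name `pages`, the column names, and the helpers `open_db`, `fetch_http`, and `crawl` are all hypothetical, and SQLite from the standard library stands in for MySQL so the example runs without a server. On MySQL the same idea would presumably use an `AUTO_INCREMENT` primary key for the numeric page ID, a `VARCHAR(2000)` text column, a unique index on the URL, and `INSERT IGNORE` in place of `INSERT OR IGNORE`.

```python
import sqlite3
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse
from urllib.request import urlopen


class LinkParser(HTMLParser):
    """Collects <a href> values and the text nodes of one page."""

    def __init__(self):
        super().__init__()
        self.links, self.text = [], []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        # Simplification: script and style contents are not filtered out.
        if data.strip():
            self.text.append(data.strip())


def open_db(path=":memory:"):
    """Create the (hypothetical) pages table if it does not exist yet."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS pages ("
        " page_id INTEGER PRIMARY KEY AUTOINCREMENT,"  # unique numeric ID
        " url TEXT NOT NULL UNIQUE,"
        " body_text TEXT)"  # truncated to 2000 characters on insert
    )
    return db


def fetch_http(url):
    """Default fetcher: download a page and decode it as UTF-8."""
    with urlopen(url, timeout=10) as resp:
        return resp.read().decode("utf-8", errors="replace")


def crawl(start_url, db, fetch=fetch_http):
    """Breadth-first crawl of one host; returns the number of new rows."""
    host = urlparse(start_url).netloc
    before = db.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
    queue, seen = deque([start_url]), {start_url}
    while queue:
        url = queue.popleft()
        try:
            html = fetch(url)
        except Exception:
            continue  # skip pages that cannot be fetched
        parser = LinkParser()
        parser.feed(html)
        text = " ".join(parser.text)[:2000]
        # Already-known URLs are ignored, so a restart only adds new pages.
        db.execute(
            "INSERT OR IGNORE INTO pages (url, body_text) VALUES (?, ?)",
            (url, text),
        )
        for href in parser.links:
            link = urldefrag(urljoin(url, href)).url  # absolute, no #fragment
            if urlparse(link).netloc == host and link not in seen:
                seen.add(link)
                queue.append(link)
    db.commit()
    after = db.execute("SELECT COUNT(*) FROM pages").fetchone()[0]
    return after - before
```

Calling `crawl("https://example.com/", open_db("pages.db"))` a second time would revisit the site but insert only pages whose URLs are not yet stored, which is one simple reading of the restart requirement. The `fetch` parameter exists so the crawler can be exercised against canned HTML instead of the network.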
19 freelancers are bidding an average of €314 for this job
Hello, I have more than a year of experience in web scraping, and I can deliver software that meets your requirements in 10 days. I look forward to hearing from you.