Small PHP, Python or Java program to grep contents
$500-900 HKD
In progress
Posted more than 12 years ago
Paid on delivery
I need one Python- or Java-based Google App Engine program to grep contents from a website. PHP is not preferred but will be considered if the developer's bid is competitive and the developer has created similar programs before.
The website accepts queries via the URL and returns a webpage with the search results. The website is quite simple, with no exotic CSS or JavaScript, and is easy to parse. If there are more than 50 hits, the search results are divided across several webpages that a user has to click through in order to see all the results. The program should be able to perform the following functions:
1. accept a query by the GET method. The query may contain quotation marks and punctuation.
2. send the search request to the website based on the query. The syntax of the search request will be provided by the Employer.
3. collect all the search results (the search results, as described above, may be divided into multiple pages).
4. return the search results in JSON format. The JSON format will be provided by the Employer.
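As a rough illustration of steps 1 to 4, the Python sketch below encodes the query, walks through the result pages and serialises the hits. Everything specific in it is an assumption made for the example: the `SEARCH_URL` address, the `q` and `page` parameter names, the "a short page is the last page" stopping rule and the JSON layout are all hypothetical, since the real request syntax and JSON format are to be provided by the Employer. `fetch_page` is a placeholder for the code that downloads one result page and parses its hits.

```python
import json
from urllib.parse import urlencode

# Hypothetical values: the real request syntax and JSON layout are to be
# provided by the Employer.
SEARCH_URL = "https://example.com/search"
PAGE_SIZE = 50  # the posting says results are split once there are more than 50 hits


def build_search_url(query, page=1):
    """Percent-encode the query so quotation marks and punctuation survive."""
    return SEARCH_URL + "?" + urlencode({"q": query, "page": page})


def collect_all_results(query, fetch_page):
    """Call fetch_page(url) for page 1, 2, ... until a short page comes back.

    fetch_page is a placeholder for "download the page and parse its hits";
    it must return a list of hits for the given URL.
    """
    results = []
    page = 1
    while True:
        hits = fetch_page(build_search_url(query, page))
        results.extend(hits)
        if len(hits) < PAGE_SIZE:  # a short (or empty) page is treated as the last one
            return results
        page += 1


def to_json(query, results):
    """Serialise in an assumed layout; the real format comes from the Employer."""
    return json.dumps({"query": query, "count": len(results), "results": results})
```

With this stopping rule, a result set whose last page holds exactly 50 hits costs one extra request that returns an empty page. If the website exposes a "next page" link or a total hit count, following that would be a more reliable way to detect the last page.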
The program should also be able to accept date fields and a frequency field and then add them to the query in step 2 above. Therefore, the program should modify the query itself. For example, the program receives from the URL a query:
query=happy&freq=week&from_date=23&from_month=6&from_year=2005&to_date=22&to_month=6&to_year=2006
Then, the program will send a modified query to the website indicating that the search should be based on a date range. In the above example, as the date range is from 23 June 2005 to 22 June 2006, the program should send 52 queries (one for each of the 52 weeks) for searching "happy" to the website. Therefore, if there are date fields and a frequency field, the steps will become:
1. accept a query by GET method. The query may contain quotation marks.
2. send the search request to the website based on the query.
3. collect all the search results (the search results, as described above, may be divided into multiple pages).
3b. store the search results, and go back to step 2 for the next query until all the search results have been collected.
4. return the search results in JSON format.
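The date handling could be sketched in Python roughly as follows, using the field names from the example query above. Two things here are assumptions, not part of the specification: the `FREQ_DAYS` table (only `week` appears in the posting, `day` is added for illustration) and the boundary rule, where each window starts where the previous one ended and the last window is clipped to the end date. That rule happens to yield the 52 weekly queries mentioned in the example.

```python
from datetime import date, timedelta
from urllib.parse import parse_qs

# Assumed mapping from the freq field to a window length in days; only
# "week" appears in the posting, "day" is a guess added for illustration.
FREQ_DAYS = {"day": 1, "week": 7}


def parse_request(query_string):
    """Turn the GET query string into (query, freq, from_date, to_date)."""
    params = {key: values[0] for key, values in parse_qs(query_string).items()}
    start = date(int(params["from_year"]), int(params["from_month"]), int(params["from_date"]))
    end = date(int(params["to_year"]), int(params["to_month"]), int(params["to_date"]))
    return params["query"], params["freq"], start, end


def date_windows(start, end, freq):
    """Yield (window_start, window_end) pairs covering start..end.

    Assumption: each window begins where the previous one ended, and the
    last window is clipped to `end`.
    """
    step = timedelta(days=FREQ_DAYS[freq])
    current = start
    while current < end:
        yield current, min(current + step, end)
        current += step


request = ("query=happy&freq=week&from_date=23&from_month=6&from_year=2005"
           "&to_date=22&to_month=6&to_year=2006")
query, freq, start, end = parse_request(request)
windows = list(date_windows(start, end, freq))
print(len(windows))  # 52
```

Each window would then drive one pass through steps 2 to 3b, with the hits from every pass stored before the combined JSON is returned in step 4. How the website expects the date range to be written into the request is not stated here and depends on the syntax provided by the Employer.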
All rights in the program will be assigned and transferred to the Employer.
More information can be provided if necessary.