OVERVIEW
------------------
This project involves taking a list of 1057 movies that were shown in US theaters between 2003 and 2010 and collecting information about their trailers. The main goal is to get, for each movie, the list of its trailers and each trailer's release date; additional information for each trailer, such as number of views and user ratings, will also be collected. I will provide a list of 8 websites from which to collect this information (the information should be scraped and assembled in a format I specify). All scripts and output files must be submitted.
Important note for bidders: some of the sites will require non-standard scraping or data assembly. This will involve:
- using an API on one site
- extracting text from a Flash file on one site
- using the Internet Archive ([login to view URL]) for two others
If you have experience with these issues or sites (or similar non-standard sources) please let me know in your bid and/or in a private message.
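For bidders unfamiliar with the Internet Archive step, here is a minimal Python sketch of how an archived copy of a page can be addressed. It relies on the Wayback Machine's public URL pattern (a timestamp followed by the original address) and its availability endpoint. The page address `example.com/trailers` and the date are placeholders only; the real site URLs are in the attached instructions, and the function names are illustrative, not part of any required design.

```python
from datetime import date
from urllib.parse import urlencode

# Public Wayback Machine entry points.
WAYBACK_PREFIX = "https://web.archive.org/web"
AVAILABILITY_ENDPOINT = "https://archive.org/wayback/available"


def wayback_timestamp(day: date) -> str:
    """Format a date as the YYYYMMDD prefix of a Wayback timestamp."""
    return day.strftime("%Y%m%d")


def snapshot_url(page_url: str, day: date) -> str:
    """Address of the archived copy of page_url closest to the given day."""
    return f"{WAYBACK_PREFIX}/{wayback_timestamp(day)}/{page_url}"


def availability_query(page_url: str, day: date) -> str:
    """Query URL asking which snapshot is closest to the given day."""
    params = urlencode({"url": page_url, "timestamp": wayback_timestamp(day)})
    return f"{AVAILABILITY_ENDPOINT}?{params}"


if __name__ == "__main__":
    # Placeholder page and date, for illustration only.
    print(snapshot_url("example.com/trailers", date(2008, 7, 18)))
    print(availability_query("example.com/trailers", date(2008, 7, 18)))
```

The sketch only builds addresses and makes no network requests. A real script would fetch them, probably around each movie's US release date, and would need to handle pages that were never archived.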
DETAILS
------------------------------------
(1) ATTACHED FILES (detailed instructions and examples)
- [login to view URL]
- [login to view URL]
(2) INSTRUCTIONS
(A) SCRAPING
- take the list of N=1057 movies (I will provide it to the winning bidder). It will include movie title, US release date, IMDB code, and other metadata.
- for each movie on the list, I want a list of trailers, each trailer's release date, and additional metadata
- this information will be collected from N=8 websites (the specific variables needed from each site, and the known difficulties, are listed in [login to view URL]). The sites are:
(i) Google Trends ([login to view URL])
(ii) HD Trailers ([login to view URL])
(iii) Trailer Addict ([login to view URL])
Need to extract some variables from a Flash file
(iv) Spike ([login to view URL])
(v) Internet Video Archive ([login to view URL])
Has an API
(vi) Apple iTunes ([login to view URL])
Collect data from the Internet Archive ([login to view URL])
(vii) Moviefone ([login to view URL])
Collect data from the Internet Archive ([login to view URL])
(viii) YouTube ([login to view URL])
- NOTES:
- you will have to make sure the list of trailers does not include bad matches
- each movie title will likely have multiple trailers
- the list of trailers might differ between the websites
- the release dates or metadata for a trailer might differ between the websites
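One possible way to screen out bad matches, sketched in Python under stated assumptions: compare each scraped trailer title with the movie title after stripping common trailer words, and keep only close matches. The 0.85 threshold, the list of stripped words, and all function names are assumptions for illustration. Title similarity alone will not catch every bad match, so a check against the release date or IMDB code would still be needed.

```python
import re
from difflib import SequenceMatcher


def normalize(title: str) -> str:
    """Lowercase a title and strip trailer boilerplate and punctuation."""
    lowered = title.lower()
    # Assumed boilerplate words; extend as needed for each site.
    lowered = re.sub(r"\b(official|teaser|trailer|hd)\b", " ", lowered)
    lowered = re.sub(r"[^a-z0-9]+", " ", lowered)
    return " ".join(lowered.split())


def title_similarity(movie_title: str, candidate_title: str) -> float:
    """Similarity in [0, 1] between the two normalized titles."""
    matcher = SequenceMatcher(None, normalize(movie_title), normalize(candidate_title))
    return matcher.ratio()


def keep_good_matches(movie_title, candidates, threshold=0.85):
    """Return the candidates whose similarity meets the (assumed) threshold."""
    return [c for c in candidates if title_similarity(movie_title, c) >= threshold]


if __name__ == "__main__":
    scraped = ["The Dark Knight - Official Trailer (HD)", "Batman Begins Trailer"]
    print(keep_good_matches("The Dark Knight", scraped))
```

Borderline cases, such as sequels or remakes with near-identical titles, are better flagged for manual review than dropped automatically.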
(B) ASSEMBLING DATA
- the data requested in the attached [login to view URL] should be extracted from the websites
- the data should be entered in the attached spreadsheet ([login to view URL]) using the listed variables (columns)
Note that there are separate tabs for each data source (and two tabs for Google trends)
- the attached [login to view URL] includes data for two sample movie titles.
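To illustrate the one-tab-per-source layout, here is a minimal Python sketch that groups scraped rows by source and renders each group as CSV text, which could then be pasted or imported into the matching tab. The column names below are hypothetical stand-ins; the real variables are those listed in the attached spreadsheet, and the sample values are placeholders.

```python
import csv
import io
from collections import defaultdict
from dataclasses import asdict, dataclass, fields


@dataclass
class TrailerRow:
    # Hypothetical columns; replace with the variables from the spreadsheet.
    imdb_code: str
    movie_title: str
    source: str
    trailer_title: str
    trailer_release_date: str  # empty string when the site gives no date


def group_by_source(rows):
    """Split rows into one list per data source (one list per tab)."""
    grouped = defaultdict(list)
    for row in rows:
        grouped[row.source].append(row)
    return dict(grouped)


def to_csv_text(rows):
    """Render rows as CSV text with a header line."""
    buffer = io.StringIO()
    names = [f.name for f in fields(TrailerRow)]
    writer = csv.DictWriter(buffer, fieldnames=names, lineterminator="\n")
    writer.writeheader()
    for row in rows:
        writer.writerow(asdict(row))
    return buffer.getvalue()


if __name__ == "__main__":
    rows = [
        TrailerRow("tt0000000", "Sample Movie", "YouTube", "Sample Movie Trailer", "2008-01-15"),
        TrailerRow("tt0000000", "Sample Movie", "Spike", "Sample Movie Teaser", ""),
    ]
    for source, group in group_by_source(rows).items():
        print(f"--- {source} ---")
        print(to_csv_text(group))
```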
(3) DELIVERABLES
- [login to view URL] with all specified data entered for all N=1057 movie titles
- if any scripts are used, please submit them
Hi. Expert web scraper/data miner here. Interested in your project. I assure you 100% accurate and good quality work. Ready to start. Have a look at PM. Regards
Dear Buyer, I am really interested in this project. I can start work immediately. Thank you very much for giving us an opportunity to bid for your project. Please check PMB for more details. Thanks, SHAN
Hello Sir/Madam
I have 3+ years of experience in web scraping. I have scraped more than 900 websites.
I can do this for you.
Kindly Check your PMB
Thanks and regards