Scrape sections of text from webpages and count words (ideally, in R)

Completed, posted 5 years ago, paid on delivery

For a research project, I need to scrape text from a list of 359 webpages; an Excel file with the links is attached. The text on each webpage is divided into sections. For each section, I need the total number of words and the number of occurrences of each word in my dictionary. Ideally, you would also send me each section as a separate text file.

Each webpage has a table of contents. For example, the first webpage should be divided into the text before the table of contents, a section with the header "Prospectus Summary", a section with the header "Summary Financial Data", a section with the header "Risk Factors", and so forth. There are usually around 20 sections, and the text on each webpage is between 50 and 180 pages long.

The difficulty I ran into when trying to scrape the text by section is that the webpages have slightly different structures. Most webpages have section headers in capital letters with the same names, but some have different headers and a slightly different structure.
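A minimal sketch of the section split and per-section word count, in base R. It rests on assumptions that are not in the posting: the page text has already been extracted into a character vector with one element per line, each section header stands alone on its line in capital letters, and the header names are known in advance. `split_sections` and `count_words` are hypothetical helper names, and pages with unusual headers would still need manual handling.

```r
# Hypothetical helper: split page text (one element per line) into sections.
# Assumption: a header is a line that, once trimmed, equals a known header
# name written in capital letters.
split_sections <- function(lines, headers) {
  stripped <- trimws(lines)
  is_header <- stripped %in% toupper(headers)
  # cumsum() increases by one at every header line, so index 0 is the
  # text that comes before the first header.
  idx <- cumsum(is_header)
  labels <- c("PREAMBLE", stripped[is_header])
  # Drop the header lines themselves; fixed factor levels keep empty sections.
  body <- split(lines[!is_header],
                factor(idx[!is_header], levels = 0:sum(is_header)))
  names(body) <- labels
  # Collapse each section into a single string.
  vapply(body, paste, character(1), collapse = "\n")
}

# Hypothetical helper: count words, treating any run of characters other
# than letters, digits, and apostrophes as a separator.
count_words <- function(text) {
  words <- unlist(strsplit(text, "[^[:alnum:]']+"))
  sum(nzchar(words))
}

# Toy input standing in for one scraped page.
lines <- c("Acme Corp prospectus",
           "PROSPECTUS SUMMARY",
           "We make widgets.",
           "RISK FACTORS",
           "Sales may decline.",
           "Results could vary.")
headers <- c("Prospectus Summary", "Risk Factors")

sections <- split_sections(lines, headers)
word_counts <- vapply(sections, count_words, integer(1))
print(word_counts)
```

Each element of `sections` could then be written to its own text file with `writeLines()`. Matching only capitalised lines is a deliberate choice here: it is meant to keep table-of-contents entries written in title case from being mistaken for section starts, which may not hold on every page.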

I am also attaching a list of uncertain words. For each word and each section in each prospectus, I need a word count (number of occurrences). The count should be case-insensitive, so that a word is counted whether it appears in upper or lower case letters.
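The case-insensitive dictionary count could be sketched in base R as follows. This assumes each section is already available as a single string and that every dictionary entry is a single word; `count_dictionary` is a hypothetical helper name, and the sample sections and word list are made up for illustration.

```r
# Hypothetical helper: count occurrences of each dictionary word in a text,
# ignoring case. Assumes single-word dictionary entries.
count_dictionary <- function(text, dictionary) {
  # Lower-case the text, then split on anything that is not a letter,
  # digit, or apostrophe.
  tokens <- unlist(strsplit(tolower(text), "[^[:alnum:]']+"))
  tokens <- tokens[nzchar(tokens)]
  # A factor with fixed levels makes table() report zero counts too.
  counts <- table(factor(tokens, levels = unique(tolower(dictionary))))
  setNames(as.integer(counts), names(counts))
}

# Made-up sections and word list standing in for the real data.
sections <- c(
  "RISK FACTORS" = "Results may vary. May we note that outcomes are UNCERTAIN and may depend on demand.",
  "BUSINESS" = "We sell widgets."
)
dictionary <- c("may", "uncertain", "depend")

# One row per section, one column per dictionary word.
result <- t(sapply(sections, count_dictionary, dictionary = dictionary))
print(result)
```

Counting whole tokens rather than substrings means "may" does not match inside a longer word such as "dismay"; whether that is the intended behaviour depends on how the word list is meant to be applied.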

I am attaching some R code that I used to scrape the entire webpage (not divided into sections) and count the words, in case it is helpful. It would be great to have your code in R so that I could continue to use it. I need the output as soon as possible (by Monday).

I would very much appreciate your help and look forward to hearing from you!

R Programming Language, Web Scraping

Project ID: #18742555

About the project

12 proposals, remote project, active 5 years ago

Awarded to:

zekovicm

Hi there, I am a web scraping expert from Bosnia & Herzegovina, Europe. I have carefully gone through your requirements and I can scrape all those webpages for you! I can start immediately and finish it within the a…

€177 EUR in 2 days
(96 reviews)
7.2

12 freelancers are bidding an average of €124 for this job

polarjin2017

Hello, how are you? I have seen the project "Scrape sections of text from webpages and count words (ideally, in R)." As you can see from my profile in these fields (R Programming Language, Web Scraping), I have been…

€155 EUR in 3 days
(43 reviews)
6.4
schoudhary1553

Hello, I have gone through your job posting and am very interested in working with you. I am an expert in this field. I have already completed several projects like this. For evidence, you can see my profile.

€250 EUR in 4 days
(31 reviews)
6.2
dreamci

Hello, our full-stack expert development team is ready to serve you. We work at an hourly rate of 35 USD/h. Please check my profile and message me for more details. Thanks.

€155 EUR in 3 days
(9 reviews)
4.5
engineeringexp

A Data Scientist with experience in Python, R programming, R Shiny, RStudio, and anything related to data science and Python. Master in Engineering, Electrical and Electronic Engineer, who is dynamic, reliable, resou…

€30 EUR in 3 days
(15 reviews)
4.6
kkc1985612

Hi, I have a lot of experience in scraping data from websites using Python, Node.js, JavaScript and PHP. I can save the scraped data in xlsx, csv, and sql files, or in other databases, according to the user's requirements. Also I can…

€100 EUR in 3 days
(10 reviews)
4.2
developer2581

Hello Sir, we have gone through the details you have provided and would be pleased to work on this with you to deliver the results you expect. We are sure you will not be disappointed if you giv…

€155 EUR in 3 days
(1 review)
3.4
yuriyes43

Hi there! I am very interested in your job. I have been working as a full-stack web developer for over 5 years. I am highly skilled in web scraping with PHP cURL and the "puppeteer" Node module, so I feel confident tha…

€155 EUR in 3 days
(2 reviews)
3.5
fastestJohn

Hello. I am interested in your project. I have extensive experience in web scraping. Please look at my portfolio and reviews. I am new to freelancing, so I have few reviews. But I think my skills, honesty and integr…

€77 EUR in 3 days
(3 reviews)
3.1
engrranakaleem21

Hello, I have already gone through your project details. I think the job is well suited to me, and I expect to earn excellent feedback from you. I am available from now on. I assure you I will provide better-quality w…

€30 EUR in 1 day
(2 reviews)
2.1
mishrakajal19951

Hi there. I have more than 2.5 years of experience in web scraping and work as a web/app scraper at a company, so I will work on this with ASP.NET C# and will provide you with quality data. Thanks.

€55 EUR in 5 days
(1 review)
0.0