Scrape data from SEC EDGAR website (from Form 10-K; 82,955 .txt links provided)

Completed | Posted 5 years ago | Paid on delivery

I’m interested in collecting information about employee unionization from public company annual reports (Form 10-K). Each 10-K contains several standardized sections. The labor union info I’m interested in is located in “Item 1. Business” and “Item 1A. Risk Factors”.

Step 1: Access the links to the Form 10-K .txt files (N = 82,955) in the attached read file and search ONLY Item 1 and Item 1A for the keywords below…

KEYWORDS: collective bargaining, collective-bargaining, CBA, labo(u)r union(s), labo(u)r agreement(s), labo(u)r contract(s), labo(u)r organization(s), union agreement(s), union contract(s), union organization(s), or union(s)
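One way to express the keyword list as a regular expression is sketched below. This is only an illustration, not a required approach: the names UNION_PATTERN, CBA_PATTERN and has_union_keyword are hypothetical, and treating "CBA" as case-sensitive is an assumption made to avoid matching lowercase text.

```python
import re

# Assumption: all phrases match case-insensitively on word boundaries.
# Note that the bare keyword "union(s)" also hits unrelated phrases such as
# "European Union" or "credit union", so matches still need a manual check.
UNION_PATTERN = re.compile(
    r"\bcollective[\s-]+bargaining\b"
    r"|\blabou?r\s+(?:unions?|agreements?|contracts?|organizations?)\b"
    r"|\bunion\s+(?:agreements?|contracts?|organizations?)\b"
    r"|\bunions?\b",
    re.IGNORECASE,
)

# Assumption: the abbreviation is only meaningful in upper case.
CBA_PATTERN = re.compile(r"\bCBA\b")


def has_union_keyword(text: str) -> bool:
    """Return True if any of the listed keywords appears in the text."""
    return bool(UNION_PATTERN.search(text) or CBA_PATTERN.search(text))
```

The word boundaries keep words like "reunion" from matching, and `labou?r` covers both spellings.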

Step 2: If one of the above keywords matches the text in Item 1 or 1A, add the entire sentence (or paragraph, whichever is easier) containing the match to a new field/column in the read file. Perhaps the output file could have one field for any Item 1 output and a second field for any Item 1A output.
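A minimal sketch of Steps 1 and 2 on a single filing follows. It assumes the filing has already been downloaded and reduced to plain text (many 10-K .txt files contain HTML that would need stripping first). The helper names split_items and matching_sentences are hypothetical, and taking the longest span per item is just one heuristic for skipping table-of-contents entries.

```python
import re

# Headings that start or end the two sections of interest.
ITEM_HEADING = re.compile(
    r"^\s*item\s+(1a|1b|1|2)\s*[.:\-]", re.IGNORECASE | re.MULTILINE
)

# Compact form of the keyword list; "union agreement", "labor union" etc.
# are already covered by the bare "union(s)" alternative.
KEYWORDS = re.compile(
    r"(?i:\bcollective[\s-]+bargaining\b|\bunions?\b"
    r"|\blabou?r\s+(?:agreements?|contracts?|organizations?)\b)"
    r"|\bCBA\b"
)


def split_items(text: str) -> dict:
    """Return the text of Item 1 and Item 1A, keeping the longest span
    found for each (table-of-contents entries are short)."""
    marks = [(m.group(1).upper(), m.start()) for m in ITEM_HEADING.finditer(text)]
    marks.append(("END", len(text)))
    sections = {"1": "", "1A": ""}
    for (label, start), (_, end) in zip(marks, marks[1:]):
        if label in sections and end - start > len(sections[label]):
            sections[label] = text[start:end]
    return sections


def matching_sentences(section: str) -> list:
    """Naive sentence split; keep only sentences containing a keyword."""
    flat = re.sub(r"\s+", " ", section).strip()
    sentences = re.split(r"(?<=[.!?])\s+", flat)
    return [s for s in sentences if KEYWORDS.search(s)]
```

The two lists returned for sections "1" and "1A" could then be joined and written to two separate columns of the output file. The sentence splitter is deliberately crude (it breaks on abbreviations such as "Inc."), so extracting whole paragraphs may be the safer option.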

Appendix C and Appendix D of the attached research paper (pp. 36-37) provide some examples of the union-related text that I’m looking for.

Step 3: (If possible) Create 3 different union-related variables (binary, percentage, number) from the extracted Item 1 sentences/paragraphs, and a separate set of 3 union-related variables from the Item 1A text. First, identify whether the union-related statement is positive or negative (i.e., employees are represented/covered by a union vs. employees are NOT in a union, e.g. "none of our employees are represented") with a binary variable (=1 for (some) union representation and =0 for no representation). Second, extract the percentage of employees covered, if available. Third, extract the number of employees covered, if available.
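A rough sketch of how the three variables might be derived from one extracted sentence is shown below. Everything here is a heuristic assumption rather than a tested method: the function name union_variables, the negation cue words, and the rule of taking the first number followed by "employees" are all illustrative, and the output would still need manual review.

```python
import re

# Assumption: a negation word followed (in the same sentence) by a
# representation cue signals "no union representation".
NEGATION = re.compile(
    r"\b(?:none|no|not|neither)\b[^.]*?"
    r"\b(?:represented|covered|unionized|members?|party|subject)\b",
    re.IGNORECASE,
)
PERCENT = re.compile(r"(\d+(?:\.\d+)?)\s*(?:%|percent\b)", re.IGNORECASE)
# A number, optionally "of our/the", up to two words, then "employees".
COUNT = re.compile(
    r"\b(\d{1,3}(?:,\d{3})+|\d+)\s+(?:of\s+(?:our|the)\s+)?"
    r"(?:[a-z-]+\s+){0,2}?employees\b",
    re.IGNORECASE,
)


def union_variables(sentence: str) -> dict:
    """Return the binary, percentage and number variables for one sentence."""
    pct = PERCENT.search(sentence)
    # Blank out percentages so "35 percent of employees" is not read as a count.
    rest = PERCENT.sub(" ", sentence)
    cnt = COUNT.search(rest)
    return {
        "union": 0 if NEGATION.search(sentence) else 1,
        "percent": float(pct.group(1)) if pct else None,
        "number": int(cnt.group(1).replace(",", "")) if cnt else None,
    }
```

Known weak spots include sentences that state total headcount before the covered headcount (the first number wins) and negations that refer to something other than representation.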

I realize this last part is tricky to do mechanically. I’ll have to check this part manually anyways, so any progress here with a reasonable error rate will be appreciated.

Skills: Data Mining, Web Scraping

Project ID: #17182236

About the project

15 proposals | Remote project | Active 5 years ago

Awarded to:

$83 USD in 5 days
(3 reviews)
3.0

15 freelancers are bidding an average of $148 for this job

schoudhary1553

Hello, I can help you with your project, Scrape data from SEC EDGAR website. I have more than 5 years of experience in data mining and web scraping. We have worked on several similar projects before! We have worked …

$250 USD in 3 days
(65 reviews)
6.6
PhpWebD

I'm a developer with extensive experience in building high-quality sites and apps. I have experience in Ionic framework, React Native, NativeScript, PHP, JavaScript, and UI design. I know how to do apps in native iOS …

$85 USD in 2 days
(7 reviews)
6.0
osmanaydin

Hello sir, I have completed web/data scraping jobs many times. I am interested in your project as well. I would like to discuss details via PM. I look forward to hearing from you soon. Best regards,

$84 USD in 3 days
(48 reviews)
5.9
aspnetprogr

Hi, I have good experience in web scraping. I will provide an application where you can add a list of URLs, match them with keywords automatically, and search the results. I already have an application developed using C#.NET and s …

$111 USD in 2 days
(2 reviews)
3.3
gbemidata

Hello, I am a web search and data entry specialist. I will do a quality job for you that meets your requirements and expectations, in full compliance with the time limit. Hope to hear from you. Relevant Skills …

$250 USD in 3 days
(2 reviews)
1.4
ibrahimder0

I have 6 years of experience on the Freelancer, Upwork, Fiverr, and 99designs marketplaces. I have seen your project, which I can do easily because I have a lot of experience in graphic design, web design, web development, and programming. So I cou …

$155 USD in 3 days
(0 reviews)
0.0
ITCristRo

How are you, sir? I am an ultimate developer who has rich experience in this field. If you contact me, we will both be happy. Thank you in advance for your reply. Scrape data from SEC EDGAR website (from Form 10-K …

$155 USD in 1 day
(0 reviews)
0.0
shopmagia

Good day, I am an expert web scraper with a lot of experience. I usually use PHP and cURL and succeed almost every time; I do not support CAPTCHA; my code is well written, fast, and long-lasting. I have p …

$155 USD in 3 days
(0 reviews)
0.8