
Clean Up Text - Print all Sentences Without Proper Nouns and convert to UTF-8

$10-30 USD

In progress
Posted over 7 years ago

Paid on delivery
I have a large corpus of text. I need every sentence that contains no proper nouns (capitalized nouns, which should be detectable with a regex) pulled out of the folder of text files and printed to a new file. The files include some unusually formatted Creative Commons books. They were too big to upload in full, so the rest will be sent upon request. Here are a few examples:
Project ID: 11579309
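
The filtering step described in the posting could be sketched roughly as follows. This is only an illustration under stated assumptions, not the client's or any bidder's actual solution: a sentence is assumed to end at `.`, `!` or `?` followed by whitespace, and a "proper noun" is assumed to be any capitalized word other than the first word of the sentence or the pronoun "I". The function names are hypothetical.

```python
import re
import string

# Assumption: a sentence ends at ., ! or ? followed by whitespace.
# Abbreviations such as "Mr." will be split incorrectly by this rule.
SENTENCE_END = re.compile(r"(?<=[.!?])\s+")

# Punctuation stripped from both ends of a word before checking its first letter.
EDGE_PUNCTUATION = string.punctuation + "“”‘’"


def split_sentences(text):
    """Split text into sentences using the simple end-of-sentence rule above."""
    return [s.strip() for s in SENTENCE_END.split(text) if s.strip()]


def has_proper_noun(sentence):
    """Return True if any word after the first starts with a capital letter.

    Assumption: the first word is always capitalized, so it is skipped, and
    the pronoun "I" (including contractions like "I'm") is not a proper noun.
    """
    for word in sentence.split()[1:]:
        token = word.strip(EDGE_PUNCTUATION)
        if not token:
            continue
        if token == "I" or token.startswith(("I'", "I’")):
            continue
        if token[0].isupper():
            return True
    return False


def sentences_without_proper_nouns(text):
    """Return the sentences of text that contain no capitalized word mid-sentence."""
    return [s for s in split_sentences(text) if not has_proper_noun(s)]


if __name__ == "__main__":
    sample = "The cat sat on the mat. Then Alice went to Paris. Where is the dog?"
    for sentence in sentences_without_proper_nouns(sample):
        print(sentence)
```

A capitalization rule like this cannot tell whether the first word of a sentence is itself a proper noun, so a sentence such as "Alice slept." would still be kept. Catching those cases would need a word list or a part-of-speech tagger rather than a regex alone.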

About the project

6 proposals
Remote project
Active 8 years ago

6 freelancers are bidding an average of $31 USD for this job
Hi. So, you need only the proper nouns extracted and printed to a new file, one per line? Also, I see that you have .rft files as well? It would be great if you could provide more files with data. Also, do you need the script or only the result? Thx
$30 USD in 0 days
5.0 (121 reviews)
7.6
Hello, I am Hermann, a scientist and software developer. I made an offer to solve your problem in 5 days' time, as I don't know how much text you need formatted, but I guess I'll be able to finish the task much faster. I would need to know a couple of things from you to do the task:
1) I guess your definition of a sentence is a string of text starting either from the beginning of a file or from a full stop and running on to another full stop, right?
2) I'm not sure I understood correctly: is a bad noun one with a capitalized first letter, or is a bad sentence one without any word with a capitalized first letter, other than the first word of the sentence?
3) If I understand correctly, you need two files for each text: the file containing only "good" sentences, and the file containing only "bad" sentences, right? What further information, if any, is required to connect both texts?
4) The best way to parse multiple files would be to convert them to a common format. I can do that easily by creating a script that converts the files and reformats the results locally on my Linux machine, but I can't create anything that will run on a Windows machine, because Windows sucks hard at such tasks unless you have bought specialized software for them. Would that be OK with you?
Best regards,
Hermann
$35 USD in 5 days
5.0 (2 reviews)
2.2
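
The "convert them to a common format" step from point 4, together with the UTF-8 requirement in the project title, might look something like the sketch below. It is a guess at one workable approach, not the bidder's script. The list of candidate encodings, the `*.txt` pattern, and the function names are all assumptions; the real corpus may need different encodings or a detection library.

```python
from pathlib import Path

# Assumption: source files are UTF-8 (with or without a BOM) or a common
# Western single-byte encoding. latin-1 can decode any byte sequence, so it
# acts as a last resort and may produce wrong characters for other encodings.
CANDIDATE_ENCODINGS = ("utf-8-sig", "cp1252", "latin-1")


def decode_bytes(raw, encodings=CANDIDATE_ENCODINGS):
    """Decode raw bytes with the first candidate encoding that succeeds."""
    for encoding in encodings:
        try:
            return raw.decode(encoding)
        except UnicodeDecodeError:
            continue
    raise ValueError("none of the candidate encodings could decode the data")


def convert_folder_to_utf8(src_dir, dst_dir, pattern="*.txt"):
    """Re-encode every matching file in src_dir as UTF-8 inside dst_dir.

    Returns the list of files written, in sorted order.
    """
    src, dst = Path(src_dir), Path(dst_dir)
    dst.mkdir(parents=True, exist_ok=True)
    written = []
    for path in sorted(src.glob(pattern)):
        text = decode_bytes(path.read_bytes())
        out = dst / path.name
        # Write bytes directly so line endings are preserved unchanged.
        out.write_bytes(text.encode("utf-8"))
        written.append(out)
    return written
```

Only the standard library is used here, so a script along these lines should run the same way on Linux and Windows.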
I am pleased to inform you that I have been doing similar work for quite some time.
$30 USD in 5 days
0.0 (0 reviews)
0.0

About this client

Jersey City, United States
3.7
2
Payment method verified
Member since Apr 4, 2016

Freelancer ® is a registered Trademark of Freelancer Technology Pty Limited (ACN 142 189 759)
Copyright © 2024 Freelancer Technology Pty Limited (ACN 142 189 759)