Modify Python script to allow for more concurrent tasks (Debian Linux) (repost)

Completed, posted Nov 6, 2011, paid on delivery

I have a Python script that creates multiple threads. When I start more than 500 threads, I get this output:

Traceback (most recent call last):
  File "./[url removed, login to view]", line 391, in <module>
    [url removed, login to view]()
  File "/usr/lib/python2.6/[url removed, login to view]", line 474, in start
    _start_new_thread(self.__bootstrap, ())
[url removed, login to view]: can't start new thread

At 500 threads, CPU usage is low and 15903092k of RAM is free.

The purpose of the script is to download websites and scan them for keywords; essentially, it is a web crawler.

It appears that the limiting factors are currently stack size and the global interpreter lock.
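One way to see why thread creation can fail while RAM is still free: on Linux, each thread reserves its own stack (by default sized from the `ulimit -s` soft limit), and the kernel also caps the number of threads per user. The following is a minimal sketch for inspecting those limits, assuming Python 3 on Linux. `describe_thread_limits` is a hypothetical helper name, and the posting does not say which of these limits the script actually hits.

```python
import resource
import threading


def describe_thread_limits():
    """Return the limits that commonly cap how many threads can start."""
    stack_soft, _stack_hard = resource.getrlimit(resource.RLIMIT_STACK)
    nproc_soft, _nproc_hard = resource.getrlimit(resource.RLIMIT_NPROC)
    return {
        # 0 means Python is using the platform default thread stack size.
        "python_thread_stack_bytes": threading.stack_size(),
        # Soft stack limit in bytes; resource.RLIM_INFINITY means unlimited.
        "rlimit_stack_soft": stack_soft,
        # Soft cap on processes per user; on Linux, threads count too.
        "rlimit_nproc_soft": nproc_soft,
    }


if __name__ == "__main__":
    for name, value in describe_thread_limits().items():
        print(name, value)
```

With a large default stack, a few hundred threads can reserve several gigabytes of address space, which may matter more than free RAM on a 32-bit interpreter.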

This project is to:

1. Remove the need to change the stack size and to set a maximum thread limit in the code. I suggest doing this by moving away from a threaded design, but I'm open to discussion about this.

2. Overcome the global interpreter lock's limitation to one CPU. The script must run on 8 or more CPUs.

3. Certain websites currently cause threads to segfault or hang. You need to implement appropriate error handling so that the script logs the error and continues.
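The three goals above could be approached together by running each page scan in a child process instead of a thread. The sketch below is one possible design, not the approach chosen for this project. It assumes Python 3 on Linux with the `fork` start method (the original script targets Python 2.6), and `crawl_all`, `scan`, `max_procs`, and `timeout` are hypothetical names. The parent keeps at most `max_procs` children alive, so no stack size tuning is needed. The children run on separate CPUs, so the global interpreter lock does not serialize them. A child that raises, crashes, or hangs is logged and the loop continues.

```python
import logging
import multiprocessing as mp
import time
from multiprocessing.connection import wait

log = logging.getLogger("crawler")
_ctx = mp.get_context("fork")  # assumption: Linux, as in the posting


def _child(scan, url, conn):
    """Run scan(url) in the child and send the outcome to the parent."""
    try:
        conn.send(("ok", scan(url)))
    except Exception as exc:  # ordinary Python errors raised by scan
        conn.send(("error", repr(exc)))
    finally:
        conn.close()


def crawl_all(urls, scan, max_procs=8, timeout=30.0):
    """Return {url: (status, detail)}; status is ok/error/crashed/timeout."""
    pending = list(urls)
    running = {}  # read end of pipe -> (process, url, deadline)
    results = {}
    while pending or running:
        # Top up to max_procs children; this is the only concurrency limit.
        while pending and len(running) < max_procs:
            url = pending.pop()
            reader, writer = _ctx.Pipe(duplex=False)
            proc = _ctx.Process(target=_child, args=(scan, url, writer))
            proc.start()
            writer.close()  # parent keeps only the read end
            running[reader] = (proc, url, time.monotonic() + timeout)
        # Wait until a child reports, exits, or the nearest deadline passes.
        nearest = min(deadline for _, _, deadline in running.values())
        ready = wait(list(running), max(0.0, nearest - time.monotonic()))
        now = time.monotonic()
        for reader in list(running):
            proc, url, deadline = running[reader]
            if reader in ready:
                try:
                    results[url] = reader.recv()
                except EOFError:  # child died without sending, e.g. segfault
                    proc.join()
                    results[url] = ("crashed", "exit code %s" % proc.exitcode)
            elif now >= deadline:
                proc.terminate()  # hung child
                results[url] = ("timeout", "no result after %ss" % timeout)
            else:
                continue
            proc.join()
            reader.close()
            del running[reader]
            if results[url][0] != "ok":
                log.error("%s: %s %s", url, *results[url])
    return results
```

Starting one process per URL is simple but has a cost. If process startup dominates, each child could handle a batch of URLs instead, at the price of losing the whole batch when one page crashes the child.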

Linux, Python, System Administration

Project ID: #3678220

About the project

1 bid, remote project, active Nov 8, 2011

Awarded to:

gabrielpsl

See private message.

$63.75 USD in 8 days
(7 reviews)
4.2