multithreading - Python multiprocessing Pool vs multiprocessing ThreadPool

I have a list of image paths that I want to divide between processes or threads, so that each one handles part of the list. Processing includes loading an image from disk, doing some computation, and returning the result. I'm using Python 2.7's multiprocessing.Pool.
Here's how I create the worker processes:

    def processparallel(classifier, path):
        files = glob.glob(path + "\\*.png")
        files_sorted = sorted(files, key=lambda file_name: int(file_name.split('--')[1]))
        p = multiprocessing.Pool(processes=4, initializer=initializer, initargs=(classifier,))
        data = p.map(loadandclassify, files_sorted)
        return data
The issue I'm facing: when I log the initialization time in the initializer function, I found that the workers aren't initialized in parallel; rather, each worker is initialized with a gap of about 5 seconds. Here are the logs for reference:

    2016-08-08 12:38:32,043 - custom_logging - INFO - worker started
    2016-08-08 12:38:37,647 - custom_logging - INFO - worker started
    2016-08-08 12:38:43,187 - custom_logging - INFO - worker started
    2016-08-08 12:38:48,634 - custom_logging - INFO - worker started
I've tried using multiprocessing.pool.ThreadPool instead, and it starts all the workers at the same time.
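For comparison, here is a minimal ThreadPool sketch (the `classify` function and file names are placeholders, not the code from the question). Because ThreadPool runs workers as threads inside one process, they all start immediately, and no main guard or pickling of the work function is needed:

```python
from multiprocessing.pool import ThreadPool

def classify(path):
    # stand-in for the real load-and-classify work on one image path
    return path.upper()

pool = ThreadPool(processes=4)  # 4 threads, not 4 processes
results = pool.map(classify, ["a.png", "b.png"])
pool.close()
pool.join()
print(results)  # ['A.PNG', 'B.PNG']
```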
I know how multiprocessing on Windows works, and that you have to place a main guard to
protect your code from spawning infinite processes. The issue in my case: I've hosted the script on IIS using FastCGI, so the script isn't `__main__`; it's being run by the FastCGI process (there's a wfastcgi.py script responsible for that). There is a main guard inside wfastcgi.py, and the logs indicate that I'm not creating an infinite number of processes.
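As a reminder of what that main guard looks like, here is a minimal sketch (the `work` function and pool size are placeholders). Pool creation is deferred until we know the module is running as the main program, so children that re-import the module on Windows don't recurse:

```python
import multiprocessing

def work(x):
    # stand-in for the per-image computation; must be defined at module
    # level so child processes can import and unpickle it
    return x * x

def run_pool():
    # only call this from the main process
    pool = multiprocessing.Pool(processes=2)
    try:
        return pool.map(work, range(4))
    finally:
        pool.close()
        pool.join()

if __name__ == '__main__':
    # child processes spawned on Windows re-import this module with a
    # different __name__, so this block does not run in them
    print(run_pool())  # [0, 1, 4, 9]
```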
Now I want to know the reason behind the multiprocessing Pool not creating its workers simultaneously. I'd appreciate any help.
Edit 1: here's the initializer function:

    def initializer(classifier):
        global indexing_classifier
        logger.info('worker started')
        indexing_classifier = classifier
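The same initializer-plus-global pattern also works with ThreadPool, since threads share module globals; a minimal self-contained sketch (the classifier value and file name below are made up for illustration):

```python
from multiprocessing.pool import ThreadPool

indexing_classifier = None

def initializer(classifier):
    # runs once in each worker thread; threads share this module global
    global indexing_classifier
    indexing_classifier = classifier

def loadandclassify(path):
    # the worker can reach the classifier set up by the initializer
    return (indexing_classifier, path)

pool = ThreadPool(processes=2, initializer=initializer,
                  initargs=('my-classifier',))
results = pool.map(loadandclassify, ['img--1--.png'])
pool.close()
pool.join()
```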
I had many issues trying to run multiprocessing under CGI/WSGI; it works fine locally, but not on real web servers... it simply isn't compatible. If you need multiprocessing, send the jobs off asynchronously to Celery instead.