Multiprocessing/threading: Data Appending & Output Return
Solution 1:
Multiprocessing spawns a separate process with its own copy of the global variables from the current environment. Changes made to a variable in that process are not reflected in the parent process. To exchange data, you need memory that is shared between the processes; variables placed in shared memory can be read and modified by all of them.
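For illustration, here is a minimal sketch (not from your code) showing that a change a child process makes to a plain global is never seen by the parent:
from multiprocessing import Process

counter = 0  # plain global: each process gets its own copy

def worker():
    global counter
    counter += 1
    print('child sees counter =', counter)    # prints 1

if __name__ == '__main__':
    p = Process(target=worker)
    p.start()
    p.join()
    print('parent still sees counter =', counter)  # prints 0; the change stayed in the child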
You can use multiprocessing.Manager to create a shared object such as a list or dictionary, and manipulate that object from the worker processes.
Processes are assigned to different cores/threads of your processor. If you have a 4-core/8-thread system, spawn a maximum of 7 processes to maximize performance; beyond that, processes start competing with each other and reduce the CPU time available to your OS, which can slow everything down or make the system unstable. The rule of thumb is cpu cores (or threads) minus one worker processes, leaving at least one core free for the OS to handle other operations. (See the sketch below.)
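A quick sketch of that rule of thumb (the max(1, ...) guard is my addition, so single-core machines still get one worker):
import multiprocessing

# leave at least one core/thread free for the OS
n_workers = max(1, multiprocessing.cpu_count() - 1)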
You can modify your code like this:
from multiprocessing import Process, Manager
import multiprocessing
import time

def run(list_, trace):
    list_.append(trace)  # appending to the managed list is visible to the parent

if __name__ == "__main__":
    jobs = []
    gen_count = 0
    leaked_count = 0
    system_count = 0
    with Manager() as manager:
        list_ = manager.list()
        for i in range(multiprocessing.cpu_count() - 1):
            # note args must be a tuple: args=(list_) without the comma would fail
            p = Process(target=run, args=(list_, 'trace%d' % i))
            jobs.append(p)
            p.start()
        while True:  # stops the main process from completing execution
            time.sleep(5)  # wait 5 seconds before checking if the workers terminated
            if all(not x.is_alive() for x in jobs):  # all processes terminated
                break  # breaks the loop
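As an aside, instead of polling is_alive() in a sleep loop you can simply join() each worker, which blocks until that process exits; a minimal sketch:
for p in jobs:
    p.join()  # returns once the worker process has exited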
Solution 2:
The way multiprocessing works, each subtask runs in its own memory space and gets its own copy of any global variables. A common way around this limitation, so that data is effectively shared, is to use a multiprocessing.Manager to coordinate concurrent access to it and transparently prevent any problems that concurrent access might cause.
Below is an example of doing that, based on your sample code. It also uses a multiprocessing.Pool(), which makes it easy to create a fixed-size collection of process objects, each of which can provide results asynchronously from its subtask (or you can wait until all of them are finished before retrieving the results, as is done here).
from functools import partial
import multiprocessing

def run(data, i):
    data.append('trace%d' % i)
    return 1, 2, 3  # values to add to gen_count, leaked_count, and system_count

if __name__ == '__main__':
    N = 10
    manager = multiprocessing.Manager()  # create SyncManager
    data = manager.list()  # create a shared list
    pool = multiprocessing.Pool()
    async_result = pool.map_async(partial(run, data), range(N))
    values = tuple(zip(*async_result.get()))  # transpose results into per-field columns
    gen_count = sum(values[0])
    leaked_count = sum(values[1])
    system_count = sum(values[2])
    print(data)
    print('Totals: gen_count {}, leaked_count {}, system_count {}'.format(
        gen_count, leaked_count, system_count))
Output:
['trace0', 'trace1', 'trace2', 'trace4', 'trace3', 'trace5', 'trace8', 'trace6', 'trace7', 'trace9']
Totals: gen_count 10, leaked_count 20, system_count 30
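As mentioned above, a Pool can also deliver each subtask's result as it becomes available rather than all at once. Here is a sketch of that variant using apply_async() (the names run, data, and N are reused from the example above; this is not part of the original answer):
from functools import partial
import multiprocessing

def run(data, i):
    data.append('trace%d' % i)
    return 1, 2, 3

if __name__ == '__main__':
    N = 10
    manager = multiprocessing.Manager()
    data = manager.list()
    with multiprocessing.Pool() as pool:
        results = [pool.apply_async(partial(run, data), (i,)) for i in range(N)]
        for res in results:
            print(res.get())  # blocks only until this particular task finishes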