Manage Python Multiprocessing With MongoDB
Solution 1:
db.authenticate has to talk to the MongoDB server, so it opens a connection. Even though connect=False is being used, calling db.authenticate therefore requires a connection to be open. Why not create the MongoClient instance after the fork? That looks like the easiest solution.
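One way to create the client after the fork is to build it lazily and remember which process built it. The following is a minimal sketch, assuming pymongo is installed; get_client, make_client, and the URI are hypothetical names used for illustration and do not come from the original question.

```python
import os

_client = None       # client owned by the current process
_client_pid = None   # PID of the process that created _client


def make_client():
    """Hypothetical factory; the URI is a placeholder."""
    # Imported here so the rest of the sketch has no third-party dependency.
    from pymongo import MongoClient
    return MongoClient("mongodb://localhost:27017")


def get_client(factory=make_client):
    """Return a client that was created in the calling process."""
    global _client, _client_pid
    pid = os.getpid()
    if _client is None or _client_pid != pid:
        # First use in this process, or a forked child that inherited the
        # parent's client: build a fresh one instead of reusing it.
        _client = factory()
        _client_pid = pid
    return _client
```

Worker functions would then call get_client() instead of touching a module-level client, so no connection is opened until the code is already running inside the subprocess.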
Solution 2:
Since db.authenticate must open the MongoClient and connect to the server, it creates connections which won't work in the forked subprocess. Hence the error message. Try this instead:
db = MongoClient('mongodb://user:password@localhost', connect=False).database
Also, delete the Lock l. Acquiring a lock in one subprocess has no effect on other subprocesses.
Solution 3:
Here is how I did it for my problem:
import time

import pathos.pools as pp

import db_access


class MultiprocessingTest(object):
    def __init__(self):
        pass

    def test_mp(self):
        # Each item is one record: [form, form_number, client_id]
        data = [[form, 'form_number', 'client_id'] for form in range(5000)]
        pool = pp.ProcessPool(4)  # four worker processes
        pool.map(db_access.insertData, data)


if __name__ == '__main__':
    time_i = time.time()
    mp = MultiprocessingTest()
    mp.test_mp()
    time_f = time.time()
    print('Time Taken: ', time_f - time_i)
Here is db_access.py:
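from pymongo import MongoClient


def insertData(form):
    # Create the client inside the worker so each process opens its own connections.
    client = MongoClient()
    db = client['TEST_001']
    # insert_one replaces Collection.insert, which was removed in PyMongo 4.
    db.initialization.insert_one({
        "form": form[0],
        "form_number": form[1],
        "client_id": form[2]
    })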
This is happening in your code because you are instantiating MongoClient() once and sharing it with all the subprocesses. MongoClient is not fork-safe, so creating the client inside the function that each worker runs avoids the problem. Let me know if there are other solutions.
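Creating MongoClient() inside insertData is safe, but it builds a new client for every record. A possible refinement is to create one client per worker process with a pool initializer. The sketch below swaps pathos for the standard library's multiprocessing.Pool and assumes pymongo is installed; init_worker, insert_data, run, and the optional collection parameter are hypothetical names, not part of the answer above.

```python
import multiprocessing

_collection = None  # set once per worker process by init_worker


def init_worker():
    """Runs once in each worker, after the worker process has started."""
    global _collection
    # Imported here so the rest of the sketch has no third-party dependency.
    from pymongo import MongoClient
    _collection = MongoClient()["TEST_001"]["initialization"]


def insert_data(form, collection=None):
    """Insert one record; `collection` can be passed explicitly for testing."""
    target = collection if collection is not None else _collection
    target.insert_one({
        "form": form[0],
        "form_number": form[1],
        "client_id": form[2],
    })


def run(data, workers=4):
    """Map insert_data over data using one MongoClient per worker."""
    with multiprocessing.Pool(workers, initializer=init_worker) as pool:
        pool.map(insert_data, data)
```

Call run(data) from under an if __name__ == '__main__': guard, as in the script above. Each of the four workers then opens its own client once and reuses it for every record it handles.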