_id and MongoId can be a source of problems that turn a seemingly trivial operation into a complicated one.
MongoId is not as predictable or safe as MySQL's auto increment (an example that most PHP developers will be familiar with). _id is generated by the client rather than the server, so it is not guaranteed to be collision free.
By comparison, the server-side auto_increment mechanisms that PHP programmers are typically used to won't collide until every single id has been used, and with 64 bits you can be confident this will almost never happen. You will also know when your table is getting full, and you can predict the rate. Most importantly, no matter the mechanism, being server side guarantees that two clients won't collide. Mongo's behaviour is different.
Generally speaking, inserting without specifying _id will tend to work, but there are some cases where it can fail or is particularly prone to failure.
The total size is 96 bits (12 bytes). This might seem like a lot, but the value is not created randomly. It is generated like this:
$unixtime . $machine_id . $pid . $counter
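To make the layout concrete, here is a minimal sketch in plain PHP that packs the four fields into a 24-character hex string. It assumes the commonly documented field widths (4-byte timestamp, 3-byte machine id, 2-byte pid, 3-byte counter). sketch_object_id() and the sample machine id are hypothetical names made up for illustration; this is not the driver's own code.

```php
<?php
// Hypothetical helper (not part of the Mongo driver): builds a 96-bit id
// as hex from the four fields, assuming widths of 4 + 3 + 2 + 3 bytes.
function sketch_object_id($unixtime, $machineId, $pid, $counter)
{
    return sprintf(
        '%08x%06x%04x%06x',
        $unixtime & 0xFFFFFFFF, // 4 bytes: seconds since the epoch
        $machineId & 0xFFFFFF,  // 3 bytes: machine identifier
        $pid & 0xFFFF,          // 2 bytes: process id
        $counter & 0xFFFFFF     // 3 bytes: counter
    );
}

// 0xABCDEF is an arbitrary stand-in for the machine id.
$id = sketch_object_id(time(), 0xABCDEF, (int) getmypid(), 0);

echo $id, "\n";                  // 24 hex characters
echo strlen($id) * 4, " bits\n"; // prints "96 bits"
```

Only the counter changes between two ids created by the same process in the same second, which is why the rest of this note focuses on where the counter lives.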
The counter starts from zero and is attached to each instance of MongoClient, so two MongoClient connections to the same server from the same script will almost certainly produce a collision.
Suppose a wrapper class, say MongoWrapper, opens a new MongoClient for every insert instead of using a singleton for the connection or something to the same effect. The second insert will most likely have the same unixtime. It will certainly have the same machine_id, pid and counter. The insert will fail.
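The failure described above can be sketched without a database. SketchClient below is a hypothetical stand-in that keeps its own counter, as this note says MongoClient does; it is not the driver's API, and the machine id, pid and timestamp are made-up values.

```php
<?php
// Hypothetical stand-in for a client with a per-instance counter.
final class SketchClient
{
    private $machineId;
    private $pid;
    private $counter = 0; // every new instance starts again from zero

    public function __construct($machineId, $pid)
    {
        $this->machineId = $machineId;
        $this->pid = $pid;
    }

    // Returns the next id as hex and advances this instance's counter.
    public function nextId($unixtime)
    {
        return sprintf('%08x%06x%04x%06x', $unixtime, $this->machineId, $this->pid, $this->counter++);
    }
}

$now = 1700000000;     // fixed timestamp: both inserts fall in the same second
$machineId = 0xABCDEF; // arbitrary sample values
$pid = 1234;

// A new client per insert: same time, machine, pid and counter.
$first  = (new SketchClient($machineId, $pid))->nextId($now);
$second = (new SketchClient($machineId, $pid))->nextId($now);
var_dump($first === $second); // bool(true): the ids collide

// One shared client: the counter advances between inserts.
$shared = new SketchClient($machineId, $pid);
$third  = $shared->nextId($now);
$fourth = $shared->nextId($now);
var_dump($third === $fourth); // bool(false)
```

Sharing one client across the script, which is what a singleton gives you, is what keeps the counter moving.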
If you are not using a singleton, this will work:
You may also have difficulties in a multiple machine environment.
machine_id is a hash of gethostname(). This is not guaranteed to be unique across machines. Some people do not set hostnames at all. If two machines share a hostname and both run a script that inserts in the same second, the unixtime, machine_id and counter will all match, so only the pid separates the two ids. The second insert then has a 1 in 2^15 chance of colliding (assuming the most common PID max of 32768). Depending on how the system handles pids, the probability may actually be a little less. In short, make sure every host accessing your mongodb has a hostname that is unique among all the hosts accessing it.
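A small sketch of why shared hostnames matter. sketch_machine_id() is hypothetical, and taking the first three bytes of an md5 hash is an assumption for illustration only; the driver's actual hash may differ. The point holds for any deterministic hash: equal hostnames give equal machine ids.

```php
<?php
// Hypothetical: derive a 3-byte machine id from a hostname.
// The md5 choice is an assumption, not the driver's documented behaviour.
function sketch_machine_id($hostname)
{
    return substr(md5($hostname), 0, 6); // first 3 bytes as 6 hex characters
}

// Two hosts that were never given distinct hostnames.
$hostA = 'localhost';
$hostB = 'localhost';
var_dump(sketch_machine_id($hostA) === sketch_machine_id($hostB)); // bool(true)

// With time, machine id and counter equal, only the pid can differ.
$pidMax = 32768; // 2^15, the PID max assumed in this note
printf("chance of a matching pid: 1 in %d\n", $pidMax);

// Check what this host would report.
echo gethostname(), "\n";
```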
I've seen some specs say that the counter should start from a random value, but I strongly recommend against this as it merely hides the problem.