A MongoDB server can contain multiple databases. A database is an independent container for data. Each database can contain multiple collections and a collection can contain multiple documents. It is important not to confuse the term database in MongoDB with the term database in the relational world.
A collection is a collection of documents (objects). One of the key differences between MongoDB and relational databases is that all of the documents in a collection do not need to have the same schema. Contrast this to a relational database where all rows in a table must have the same schema.
A document is a single record in a collection. Typically all documents in a collection have the same structure, although it is not strictly necessary. An advantage of the document approach to data is that it is possible to vary the schema of in a collection between documents. This means that a document identified by a key ABC-123 might have name, address, and phone number fields. But the document identified by the key DEF-456 has name, address, fax, and email address fields. This flexibility makes it possible to store data without a predefined schema.
Although MongoDB is extremely fast, it is only able to do so by loading its "working-set" into RAM, and so if your working-set is larger than your available RAM, things will start to slow down. This is where sharding comes into play, allowing you to load-balance the required RAM by better distributing and indexing the working-set. Although MongoPress is able to support sharding, it does not perform any specific tasks to do so, and is left to the person managing the MongoDB server to create. In the case of replica-sets, all configuration is once again done on the MongoDB server, but there is a setting that needs to be applied directly within the actions taking place in PHP, so options for this can be configured through the config.php file.
By default, MongoPress indexes objects and slugs by their respective MongoIDs, which in-turn incorporate their own timestamps that are used to help the server sort files into more manageable chunks before flushing RAM and read / writing to your hard-disk. This sorting roughly translates to chunks that represent recently created objects. When sharding MongoPress, if you are using htaccess, it is worth considering that slugs could be better optimized by indexing and sharding the slug itself, as opposed to the ID, but in the case of objects, it will remain optimal to index by ID. In environments that rely heavily upon user multi-user input, User IDs would be more suitable.
GridFS is a native MongoDB function that we were planning to wait until 0.2 before showcasing, but decided to squeeze it into our initial release. It basically stores and serves media directly from the MongoDB server and is replicated and sharded in the same way data is, by being split into smaller chunks of data. MongoPress also keeps track of how many times each image has been viewed, or how many times the ZIP file has been downloaded.