Tutorial#

This tutorial is intended as an introduction to working with MongoDB and PyMongo.

Prerequisites#

Before we start, make sure that you have the PyMongo distribution installed. In the Python shell, the following should run without raising an exception:

>>> import pymongo

This tutorial also assumes that a MongoDB instance is running on the default host and port. Assuming you have downloaded and installed MongoDB, you can start it like so:

$ mongod

Making a Connection with MongoClient#

The first step when working with PyMongo is to create a MongoClient for the running mongod instance. Doing so is easy:

>>> from pymongo import MongoClient
>>> client = MongoClient()

The above code will connect to the default host and port. We can also specify the host and port explicitly, as follows:

>>> client = MongoClient("localhost", 27017)

Or use the MongoDB URI format:

>>> client = MongoClient("mongodb://localhost:27017/")
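All three calls above should target the same server. As an illustration only, the sketch below uses the standard library's urllib.parse to show which host and port the URI form names. It is not PyMongo's own URI parser, which also handles options, credentials, and multiple hosts, and the helper name split_mongo_uri is hypothetical.

```python
from urllib.parse import urlsplit


def split_mongo_uri(uri):
    """Return (host, port) for a simple single-host MongoDB URI.

    Hypothetical helper for illustration only. Falls back to 27017,
    MongoDB's default port, when the URI does not name one.
    """
    parts = urlsplit(uri)
    if parts.scheme != "mongodb":
        raise ValueError(f"not a mongodb:// URI: {uri!r}")
    return parts.hostname, parts.port or 27017


host, port = split_mongo_uri("mongodb://localhost:27017/")
print(host, port)
```

The same pair, "localhost" and 27017, is what the explicit two-argument form passes directly.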

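Getting a Database#

A single instance of MongoDB can support multiple independent databases. When working with PyMongo, you access databases using attribute style access on MongoClient instances: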

>>> db = client.test_database

If your database name is such that attribute style access won't work (like test-database), you can use dictionary style access instead:

>>> db = client["test-database"]
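Attribute style access fails for test-database because Python reads the hyphen as subtraction. The sketch below is one way to decide which style a name needs; needs_dict_access is a hypothetical helper, and it does not cover names that collide with MongoClient's own attributes or methods, where dictionary style is also the safer choice.

```python
import keyword


def needs_dict_access(name):
    """Return True if `name` cannot be written as `client.<name>` in Python source."""
    return not name.isidentifier() or keyword.iskeyword(name)


for name in ("test_database", "test-database", "class"):
    style = 'client["{0}"]' if needs_dict_access(name) else "client.{0}"
    print(style.format(name))
```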

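Getting a Collection#

A collection is a group of documents stored in MongoDB, and can be thought of as roughly the equivalent of a table in a relational database. Getting a collection in PyMongo works the same as getting a database: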

>>> collection = db.test_collection

or (using dictionary style access):

>>> collection = db["test-collection"]

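An important note about collections (and databases) in MongoDB is that they are created lazily: none of the above commands have actually performed any operations on the MongoDB server. Collections and databases are created when the first document is inserted into them.

Documents#

Data in MongoDB is represented (and stored) using JSON-style documents. In PyMongo we use dictionaries to represent documents. As an example, the following dictionary might be used to represent a blog post: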

>>> import datetime
>>> post = {
...     "author": "Mike",
...     "text": "My first blog post!",
...     "tags": ["mongodb", "python", "pymongo"],
...     "date": datetime.datetime.now(tz=datetime.timezone.utc),
... }

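Note that documents can contain native Python types (like datetime.datetime instances), which will be automatically converted to and from the appropriate BSON types.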
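The conversion is not always an exact round trip. A BSON datetime stores milliseconds since the Unix epoch in UTC, so a value read back can differ slightly from the one written. The sketch below approximates what to expect, assuming a client created with default options (in particular, tz_aware left at its default); as_read_back is a hypothetical helper and not part of PyMongo.

```python
import datetime


def as_read_back(dt):
    """Approximate the datetime a default client would return for `dt`."""
    if dt.tzinfo is not None:
        # Aware values are normalized to UTC, then the offset is dropped.
        dt = dt.astimezone(datetime.timezone.utc).replace(tzinfo=None)
    # BSON datetimes carry millisecond precision, so microseconds are truncated.
    return dt.replace(microsecond=dt.microsecond // 1000 * 1000)


local = datetime.timezone(datetime.timedelta(hours=2))
written = datetime.datetime(2009, 11, 12, 11, 14, 5, 123456, tzinfo=local)
print(as_read_back(written))
```

When comparing a stored date with a freshly built one in application code, normalizing both sides this way avoids surprises.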

Inserting a Document#

To insert a document into a collection we can use the insert_one() method:

>>> posts = db.posts
>>> post_id = posts.insert_one(post).inserted_id
>>> post_id
ObjectId('...')

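When a document is inserted, a special key, "_id", is automatically added if the document doesn't already contain an "_id" key. The value of "_id" must be unique across the collection. insert_one() returns an instance of InsertOneResult. For more information on "_id", see the documentation on _id.

After inserting the first document, the posts collection has actually been created on the server. We can verify this by listing all of the collections in our database: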
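PyMongo adds the generated "_id" to the very dictionary you pass in, so inserting the same dict object a second time would reuse that "_id" and collide. A minimal sketch of one way around this, assuming collection is any object with PyMongo's insert_one() interface; insert_copy is a hypothetical name.

```python
def insert_copy(collection, document):
    """Insert a shallow copy so the caller's dict does not gain an "_id" key."""
    to_insert = dict(document)
    result = collection.insert_one(to_insert)
    return result.inserted_id


# Usage against a live server (not executed here):
#   post_id = insert_copy(db.posts, post)
```

The copy is shallow, which is enough here because only the top-level "_id" key is added.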

>>> db.list_collection_names()
['posts']

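Getting a Single Document With `find_one()`#

The most basic type of query that can be performed in MongoDB is find_one(). This method returns a single document matching a query (or None if there are no matches). It is useful when you know there is only one matching document, or are only interested in the first match. Here we use find_one() to get the first document from the posts collection: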

>>> import pprint
>>> pprint.pprint(posts.find_one())
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}

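The result is a dictionary matching the one that we inserted previously.

Note

The returned document contains an "_id", which was automatically added on insert.

find_one() also supports querying on specific elements that the resulting document must match. To limit our results to a document with author "Mike" we do: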

>>> pprint.pprint(posts.find_one({"author": "Mike"}))
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}

If we try with a different author, like "Eliot", we'll get no result:

>>> posts.find_one({"author": "Eliot"})
>>>

Querying By ObjectId#

We can also find a post by its _id, which in our example is an ObjectId:

>>> post_id
ObjectId(...)
>>> pprint.pprint(posts.find_one({"_id": post_id}))
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}

Note that an ObjectId is not the same as its string representation:

>>> post_id_as_str = str(post_id)
>>> posts.find_one({"_id": post_id_as_str})  # No result
>>>

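A common task in web applications is to get an ObjectId from the request URL and find the matching document. In this case it is necessary to convert the ObjectId from a string before passing it to find_one: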

from bson.objectid import ObjectId

# The web framework gets post_id from the URL and passes it as a string
def get(post_id):
    # Convert from string to ObjectId:
    document = client.db.collection.find_one({'_id': ObjectId(post_id)})
    return document
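Strings taken from a URL are untrusted, and ObjectId() raises an error on malformed input. The string form of an ObjectId is 24 hexadecimal characters, so a handler can check the shape first. The sketch below is a dependency-free stand-in with a hypothetical name; bson also offers ObjectId.is_valid() for the same purpose.

```python
import re

# The string form of an ObjectId is exactly 24 hexadecimal characters.
_OBJECT_ID_HEX = re.compile(r"[0-9a-fA-F]{24}")


def looks_like_object_id(value):
    """Return True if `value` is a 24-character hexadecimal string."""
    return isinstance(value, str) and _OBJECT_ID_HEX.fullmatch(value) is not None


print(looks_like_object_id("507f1f77bcf86cd799439011"))
print(looks_like_object_id("not-an-id"))
```

A web handler could return a "not found" response when the check fails instead of letting the conversion raise.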

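See also

When I query for a document by ObjectId in my web application I get no result

Bulk Inserts#

In order to make querying a little more interesting, let's insert a few more documents. In addition to inserting a single document, we can also perform bulk insert operations, by passing a list as the first argument to insert_many(). This will insert each document in the list, sending only a single command to the server: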

>>> new_posts = [
...     {
...         "author": "Mike",
...         "text": "Another post!",
...         "tags": ["bulk", "insert"],
...         "date": datetime.datetime(2009, 11, 12, 11, 14),
...     },
...     {
...         "author": "Eliot",
...         "title": "MongoDB is fun",
...         "text": "and pretty easy too!",
...         "date": datetime.datetime(2009, 11, 10, 10, 45),
...     },
... ]
>>> result = posts.insert_many(new_posts)
>>> result.inserted_ids
[ObjectId('...'), ObjectId('...')]

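There are a couple of interesting things to note about this example:

The result from insert_many() now returns two ObjectId instances, one for each inserted document.

new_posts[1] has a different "shape" than the other posts: there is no "tags" field and we've added a new field, "title". This is what we mean when we say that MongoDB is schema-free.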
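The difference in "shape" can be made concrete by comparing the keys of the two documents. The sketch below uses plain dictionaries with the same fields as new_posts (dates omitted for brevity) and needs no server.

```python
first = {"author": "Mike", "text": "Another post!", "tags": ["bulk", "insert"]}
second = {"author": "Eliot", "title": "MongoDB is fun", "text": "and pretty easy too!"}

# Iterating a dict yields its keys, so set(first) is the set of field names.
only_in_first = sorted(set(first) - set(second))
only_in_second = sorted(set(second) - set(first))
print(only_in_first, only_in_second)  # ['tags'] ['title']
```

Code that reads such a collection has to tolerate missing fields, for example by using doc.get("tags", []) rather than doc["tags"].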

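Querying for More Than One Document#

To get more than a single document as the result of a query we use the find() method. find() returns a Cursor instance, which allows us to iterate over all matching documents. For example, we can iterate over every document in the posts collection: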

>>> for post in posts.find():
...     pprint.pprint(post)
...
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['bulk', 'insert'],
 'text': 'Another post!'}
{'_id': ObjectId('...'),
 'author': 'Eliot',
 'date': datetime.datetime(...),
 'text': 'and pretty easy too!',
 'title': 'MongoDB is fun'}

Just like we did with find_one(), we can pass a document to find() to limit the returned results. Here, we get only those documents whose author is "Mike":

>>> for post in posts.find({"author": "Mike"}):
...     pprint.pprint(post)
...
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['mongodb', 'python', 'pymongo'],
 'text': 'My first blog post!'}
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['bulk', 'insert'],
 'text': 'Another post!'}

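Counting#

If we just want to know how many documents match a query we can perform a count_documents() operation instead of a full query. We can get a count of all of the documents in a collection: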

>>> posts.count_documents({})
3

or just of those documents that match a specific query:

>>> posts.count_documents({"author": "Mike"})
2

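Range Queries#

MongoDB supports many different types of advanced queries. As an example, let's perform a query where we limit results to posts older than a certain date, but also sort the results by author: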

>>> d = datetime.datetime(2009, 11, 12, 12)
>>> for post in posts.find({"date": {"$lt": d}}).sort("author"):
...     pprint.pprint(post)
...
{'_id': ObjectId('...'),
 'author': 'Eliot',
 'date': datetime.datetime(...),
 'text': 'and pretty easy too!',
 'title': 'MongoDB is fun'}
{'_id': ObjectId('...'),
 'author': 'Mike',
 'date': datetime.datetime(...),
 'tags': ['bulk', 'insert'],
 'text': 'Another post!'}

Here we use the special "$lt" operator to do a range query, and also call sort() to sort the results by author.
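For readers who want to see what this query selects, here is an in-memory analogue in plain Python. It mirrors only this simple case: MongoDB's own comparison and ordering rules (for mixed types or missing fields, say) are not reproduced, the helper name older_than is hypothetical, and the first post's date is a stand-in because the tutorial creates it with now().

```python
import datetime

posts_in_memory = [
    # Stand-in date for the first post, chosen to fall after the cutoff.
    {"author": "Mike", "text": "My first blog post!",
     "date": datetime.datetime(2009, 11, 12, 12, 30)},
    {"author": "Mike", "text": "Another post!",
     "date": datetime.datetime(2009, 11, 12, 11, 14)},
    {"author": "Eliot", "text": "and pretty easy too!",
     "date": datetime.datetime(2009, 11, 10, 10, 45)},
]


def older_than(docs, cutoff):
    """In-memory analogue of find({"date": {"$lt": cutoff}}).sort("author")."""
    matches = [doc for doc in docs if "date" in doc and doc["date"] < cutoff]
    return sorted(matches, key=lambda doc: doc["author"])


cutoff = datetime.datetime(2009, 11, 12, 12)
for doc in older_than(posts_in_memory, cutoff):
    print(doc["author"], doc["text"])
```

On a real collection the server does the filtering and sorting, so only the matching documents travel over the network.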

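Indexing#

Adding indexes can help accelerate certain queries and can also add additional functionality to querying and storing documents. In this example, we'll demonstrate how to create a unique index on a key, which rejects documents whose value for that key already exists in the index.

First, we'll need to create the index: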

>>> result = db.profiles.create_index([("user_id", pymongo.ASCENDING)], unique=True)
>>> sorted(list(db.profiles.index_information()))
['_id_', 'user_id_1']

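Notice that we now have two indexes: one is the index on _id that MongoDB creates automatically, and the other is the index on user_id that we just created.

Now let's set up some user profiles: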

>>> user_profiles = [{"user_id": 211, "name": "Luke"}, {"user_id": 212, "name": "Ziltoid"}]
>>> result = db.profiles.insert_many(user_profiles)

The index prevents us from inserting a document whose user_id is already in the collection:

>>> new_profile = {"user_id": 213, "name": "Drew"}
>>> duplicate_profile = {"user_id": 212, "name": "Tommy"}
>>> result = db.profiles.insert_one(new_profile)  # This is fine.
>>> result = db.profiles.insert_one(duplicate_profile)
Traceback (most recent call last):
DuplicateKeyError: E11000 duplicate key error index: test_database.profiles.$user_id_1 dup key: { : 212 }
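Before sending a batch to a collection with a unique index, it can help to check the batch itself for repeated keys. The sketch below is a client-side convenience under that assumption; duplicate_values is a hypothetical helper, it only sees duplicates within the list it is given, and the server-side index remains the authority for documents already stored.

```python
from collections import Counter


def duplicate_values(docs, key):
    """Return the values of `key` that occur more than once in `docs`, sorted."""
    counts = Counter(doc[key] for doc in docs if key in doc)
    return sorted(value for value, n in counts.items() if n > 1)


batch = [
    {"user_id": 211, "name": "Luke"},
    {"user_id": 212, "name": "Ziltoid"},
    {"user_id": 212, "name": "Tommy"},
]
print(duplicate_values(batch, "user_id"))  # [212]
```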

See also

The MongoDB documentation on indexes
