GitPython Tutorial

GitPython provides object model access to your Git repository. This tutorial is composed of multiple sections, most of which explain a real-life use case.

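All code presented here originates from test_docs.py to ensure correctness. Knowing this should also make it easier to run the code for your own testing purposes. All you need is a developer installation of git-python.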

Meet the Repo type

The first step is to create a git.Repo object to represent your repository.

from git import Repo

# rorepo is a Repo instance pointing to the git-python repository.
# For all you know, the first argument to Repo is a path to the repository you
# want to work with.
repo = Repo(self.rorepo.working_tree_dir)
assert not repo.bare

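In the above example, the directory self.rorepo.working_tree_dir equals /Users/mtrier/Development/git-python, which is my working repository and contains the .git directory. You can also initialize GitPython with a bare repository.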

bare_repo = Repo.init(os.path.join(rw_dir, "bare-repo"), bare=True)
assert bare_repo.bare

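A repo object provides high-level access to your data: it allows you to create and delete heads, tags, and remotes, and to access the configuration of the repository.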

repo.config_reader()  # Get a config reader for read-only access.
with repo.config_writer():  # Get a config writer to change configuration.
    pass  # Call release() to be sure changes are written and locks are released.

Query the active branch, query untracked files, or check whether the repository data has been modified.

assert not bare_repo.is_dirty()  # Check the dirty state.
repo.untracked_files  # Retrieve a list of untracked files.
# ['my_untracked_file']

Clone from an existing repository or initialize a new empty one.

cloned_repo = repo.clone(os.path.join(rw_dir, "to/this/path"))
assert cloned_repo.__class__ is Repo  # Clone an existing repository.
assert Repo.init(os.path.join(rw_dir, "path/for/new/repo")).__class__ is Repo

Archive the repository contents to a tar file.

with open(os.path.join(rw_dir, "repo.tar"), "wb") as fp:
    repo.archive(fp)
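
To check what ended up in an archive such as the repo.tar written above, Python's standard tarfile module can list its members. The sketch below builds a tiny stand-in archive in memory so that it runs without a repository. The VERSION member and its contents are made up for illustration, and list_archive_members is a hypothetical helper, not part of GitPython.

```python
import io
import tarfile


def list_archive_members(fileobj):
    """Return the member names stored in a tar archive read from fileobj."""
    with tarfile.open(fileobj=fileobj, mode="r:*") as archive:
        return archive.getnames()


# Build a small stand-in archive in memory (hypothetical contents).
buffer = io.BytesIO()
with tarfile.open(fileobj=buffer, mode="w") as archive:
    payload = b"3.1.0\n"
    info = tarfile.TarInfo(name="VERSION")
    info.size = len(payload)
    archive.addfile(info, io.BytesIO(payload))

buffer.seek(0)  # rewind so the archive can be read back
members = list_archive_members(buffer)
```

With a real archive, the same helper could be called on a file opened in "rb" mode.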

Advanced Repo usage

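Of course, there is much more you can do with this type, and most of the following is explained in greater detail in specific tutorials. Don't worry if you don't understand some of these examples right away, as they may require a thorough understanding of Git's inner workings.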

Query relevant repository paths …

assert os.path.isdir(cloned_repo.working_tree_dir)  # Directory with your work files.
assert cloned_repo.git_dir.startswith(cloned_repo.working_tree_dir)  # Directory containing the git repository.
assert bare_repo.working_tree_dir is None  # Bare repositories have no working tree.

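Heads are branches in Git-speak. References are pointers to a specific commit or to other references. Heads and Tags are kinds of references. GitPython allows you to query them rather intuitively.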

self.assertEqual(
    repo.head.ref,
    repo.heads.master,  # head is a sym-ref pointing to master.
    "It's ok if TC not running from `master`.",
)
self.assertEqual(repo.tags["0.3.5"], repo.tag("refs/tags/0.3.5"))  # You can access tags in various ways too.
self.assertEqual(repo.refs.master, repo.heads["master"])  # .refs provides all refs, i.e. heads...

if "TRAVIS" not in os.environ:
    self.assertEqual(repo.refs["origin/master"], repo.remotes.origin.refs.master)  # ... remotes ...
self.assertEqual(repo.refs["0.3.5"], repo.tags["0.3.5"])  # ... and tags.

You can also create new heads …

new_branch = cloned_repo.create_head("feature")  # Create a new branch ...
assert cloned_repo.active_branch != new_branch  # which wasn't checked out yet ...
self.assertEqual(new_branch.commit, cloned_repo.active_branch.commit)  # pointing to the checked-out commit.
# It's easy to let a branch point to the previous commit, without affecting anything else.
# Each reference provides access to the git object it points to, usually commits.
assert new_branch.set_commit("HEAD~1").commit == cloned_repo.active_branch.commit.parents[0]

… and tags …

past = cloned_repo.create_tag(
    "past",
    ref=new_branch,
    message="This is a tag-object pointing to %s" % new_branch.name,
)
self.assertEqual(past.commit, new_branch.commit)  # The tag points to the specified commit
assert past.tag.message.startswith("This is")  # and its object carries the message provided.

now = cloned_repo.create_tag("now")  # This is a tag-reference. It may not carry meta-data.
assert now.tag is None

You can traverse down to git objects through references and other objects. Some objects, such as commits, carry additional metadata to query.

assert now.commit.message != past.commit.message
# You can read objects directly through binary streams, no working tree required.
assert (now.commit.tree / "VERSION").data_stream.read().decode("ascii").startswith("3")

# You can traverse trees as well to handle all contained files of a particular commit.
file_count = 0
tree_count = 0
tree = past.commit.tree
for item in tree.traverse():
    file_count += item.type == "blob"
    tree_count += item.type == "tree"
assert file_count and tree_count  # We have accumulated all directories and files.
self.assertEqual(len(tree.blobs) + len(tree.trees), len(tree))  # A tree is iterable on its children.

Remotes allow you to handle fetch, pull, and push operations, while reporting progress information to progress delegates.

from git import RemoteProgress

class MyProgressPrinter(RemoteProgress):
    def update(self, op_code, cur_count, max_count=None, message=""):
        print(
            op_code,
            cur_count,
            max_count,
            cur_count / (max_count or 100.0),
            message or "NO MESSAGE",
        )

self.assertEqual(len(cloned_repo.remotes), 1)  # We have been cloned, so should be one remote.
self.assertEqual(len(bare_repo.remotes), 0)  # This one was just initialized.
origin = bare_repo.create_remote("origin", url=cloned_repo.working_tree_dir)
assert origin.exists()
for fetch_info in origin.fetch(progress=MyProgressPrinter()):
    print("Updated %s to %s" % (fetch_info.ref, fetch_info.commit))
# Create a local branch at the latest fetched master. We specify the name
# statically, but you have all information to do it programmatically as well.
bare_master = bare_repo.create_head("master", origin.refs.master)
bare_repo.head.set_reference(bare_master)
assert not bare_repo.delete_remote(origin).exists()
# push and pull behave very similarly.

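The index is also called the stage in Git-speak. It is used to prepare new commits and can be used to hold the results of merge operations. Our index implementation allows you to stream data into the index, which is useful for bare repositories that have no working tree.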

self.assertEqual(new_branch.checkout(), cloned_repo.active_branch)  # Checking out branch adjusts the wtree.
self.assertEqual(new_branch.commit, past.commit)  # Now the past is checked out.

new_file_path = os.path.join(cloned_repo.working_tree_dir, "my-new-file")
open(new_file_path, "wb").close()  # Create new file in working tree.
cloned_repo.index.add([new_file_path])  # Add it to the index.
# Commit the changes to deviate from master's history.
cloned_repo.index.commit("Added a new file in the past - for later merge")

# Prepare a merge.
master = cloned_repo.heads.master  # Right-hand side is ahead of us, in the future.
merge_base = cloned_repo.merge_base(new_branch, master)  # Allows for a three-way merge.
cloned_repo.index.merge_tree(master, base=merge_base)  # Write the merge result into index.
cloned_repo.index.commit(
    "Merged past and now into future ;)",
    parent_commits=(new_branch.commit, master.commit),
)

# Now new_branch is ahead of master, which probably should be checked out and reset softly.
# Note that all these operations didn't touch the working tree, as we managed it ourselves.
# This definitely requires you to know what you are doing! :)
assert os.path.basename(new_file_path) in new_branch.commit.tree  # New file is now in tree.
master.commit = new_branch.commit  # Let master point to most recent commit.
cloned_repo.head.reference = master  # We adjusted just the reference, not the working tree or index.

Submodules represent all aspects of Git's submodules, which allows you to query all of their related information and manipulate them in various ways.

# Create a new submodule and check it out on the spot, setup to track master
# branch of `bare_repo`. As our GitPython repository has submodules already that
# point to GitHub, make sure we don't interact with them.
for sm in cloned_repo.submodules:
    assert not sm.remove().exists()  # after removal, the sm doesn't exist anymore
sm = cloned_repo.create_submodule("mysubrepo", "path/to/subrepo", url=bare_repo.git_dir, branch="master")

# .gitmodules was written and added to the index, which is now being committed.
cloned_repo.index.commit("Added submodule")
assert sm.exists() and sm.module_exists()  # This submodule is definitely available.
sm.remove(module=True, configuration=False)  # Remove the working tree.
assert sm.exists() and not sm.module_exists()  # The submodule itself is still available.

# Update all submodules, non-recursively to save time. This method is very powerful, go have a look.
cloned_repo.submodule_update(recursive=False)
assert sm.module_exists()  # The submodule's working tree was checked out by update.

Examining References

References are the tips of your commit graph, from which you can easily examine the history of your project.

import git

repo = git.Repo.clone_from(self._small_repo_url(), os.path.join(rw_dir, "repo"), branch="master")

heads = repo.heads
master = heads.master  # Lists can be accessed by name for convenience.
master.commit  # the commit pointed to by head called master.
master.rename("new_name")  # Rename heads.
master.rename("master")

Tags are (usually immutable) references to a commit and/or a tag object.

tags = repo.tags
tagref = tags[0]
tagref.tag  # Tags may have tag objects carrying additional information
tagref.commit  # but they always point to commits.
repo.delete_tag(tagref)  # Delete or
repo.create_tag("my_tag")  # create tags using the repo for convenience.

A symbolic reference is a special case of a reference, as it points to another reference instead of a commit.

head = repo.head  # The head points to the active branch/ref.
master = head.reference  # Retrieve the reference the head points to.
master.commit  # From here you use it as any other reference.
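
The idea of a symbolic reference can be illustrated without a repository. The following is a toy model, not GitPython code: the refs dictionary, the placeholder hashes, and the resolve helper are all hypothetical. It only mirrors the "ref: <name>" notation that Git stores in .git/HEAD.

```python
# Toy reference store: a name maps either to a commit hash or, for a
# symbolic reference, to "ref: <other name>".
refs = {
    "HEAD": "ref: refs/heads/master",
    "refs/heads/master": "1" * 40,   # placeholder hash
    "refs/heads/feature": "2" * 40,  # placeholder hash
}


def resolve(name, store):
    """Follow symbolic references until a commit hash is reached."""
    value = store[name]
    while value.startswith("ref: "):
        value = store[value[len("ref: "):]]
    return value


before = resolve("HEAD", refs)

# Repointing the symbolic reference changes what HEAD resolves to,
# without touching any of the branch references themselves.
refs["HEAD"] = "ref: refs/heads/feature"
after = resolve("HEAD", refs)
```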

Access the reflog easily.

log = master.log()
log[0]  # first (i.e. oldest) reflog entry
log[-1]  # last (i.e. most recent) reflog entry

Modifying References

You can easily create and delete reference types or modify where they point to.

new_branch = repo.create_head("new")  # Create a new one.
new_branch.commit = "HEAD~10"  # Set branch to another commit without changing index or working trees.
repo.delete_head(new_branch)  # Delete an existing head - only works if it is not checked out.

Create or delete tags the same way, except that you may not change them afterwards.

new_tag = repo.create_tag("my_new_tag", message="my message")
# You cannot change the commit a tag points to. Tags need to be re-created.
self.assertRaises(AttributeError, setattr, new_tag, "commit", repo.commit("HEAD~1"))
repo.delete_tag(new_tag)

Change the symbolic reference to switch branches cheaply (without adjusting the index or the working tree).

new_branch = repo.create_head("another-branch")
repo.head.reference = new_branch

Understanding Objects

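An object is anything storable in Git's object database. Objects contain information about their type, their uncompressed size, and the actual data. Each object is uniquely identified by a binary SHA-1 hash that is 20 bytes in size, or 40 characters in hexadecimal notation.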

Git knows only four distinct object types: Blobs, Trees, Commits, and Tags.

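In GitPython, all objects can be accessed through their common base, and can be compared and hashed. They are usually not instantiated directly but obtained through references or specialized repository functions.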

hc = repo.head.commit
hct = hc.tree
assert hc != hct
assert hc != repo.tags[0]
assert hc == repo.head.reference.commit

Common fields are …

self.assertEqual(hct.type, "tree")  # Preset string type, being a class attribute.
assert hct.size > 0  # size in bytes
assert len(hct.hexsha) == 40
assert len(hct.binsha) == 20
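
The 20-byte and 40-character lengths follow from how Git names objects: it hashes a small header ("<type> <size>" plus a NUL byte) together with the object data using SHA-1. The sketch below reproduces this with hashlib alone; git_object_id is a hypothetical helper name, and the sample content is arbitrary.

```python
import hashlib


def git_object_id(object_type, data):
    """Hash data the way Git names objects: SHA-1 over '<type> <size>' + NUL + data."""
    header = f"{object_type} {len(data)}".encode("ascii") + b"\x00"
    return hashlib.sha1(header + data)


digest = git_object_id("blob", b"hello world\n")
binsha = digest.digest()     # raw bytes, comparable to an object's binsha
hexsha = digest.hexdigest()  # hexadecimal text, comparable to an object's hexsha
```

The result should match what `git hash-object` prints for a file with the same content.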

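Index objects are objects that can be put into Git's index. These objects are trees, blobs, and submodules, which additionally know their path in the file system as well as their mode.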

self.assertEqual(hct.path, "")  # Root tree has no path.
assert hct.trees[0].path != ""  # The first contained item has one though.
self.assertEqual(hct.mode, 0o40000)  # Trees have the mode of a Linux directory.
self.assertEqual(hct.blobs[0].mode, 0o100644)  # Blobs have specific mode, comparable to a standard Linux fs.
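
Because these modes follow the POSIX layout, the standard stat module can interpret them. The sketch below is only an illustration under that assumption; describe_mode is a hypothetical helper, and the two constants simply repeat the values asserted above.

```python
import stat

TREE_MODE = 0o40000   # mode asserted for trees above
BLOB_MODE = 0o100644  # mode asserted for a regular blob above


def describe_mode(mode):
    """Classify a mode value using the POSIX file-type bits."""
    if stat.S_ISDIR(mode):
        return "directory"
    if stat.S_ISLNK(mode):
        return "symlink"
    if stat.S_ISREG(mode):
        return "executable file" if mode & stat.S_IXUSR else "regular file"
    return "other"


tree_kind = describe_mode(TREE_MODE)
blob_kind = describe_mode(BLOB_MODE)
blob_permissions = oct(stat.S_IMODE(BLOB_MODE))  # permission bits only
```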

Access blob data (or any object data) using streams.

hct.blobs[0].data_stream.read()  # Stream object to read data from.
hct.blobs[0].stream_data(open(os.path.join(rw_dir, "blob_data"), "wb"))  # Write data to a given stream.

The Commit object

Commit objects contain information about a specific commit. Obtain commits using references, as done in Examining References, or as follows.

Obtain commits at the specified revision

repo.commit("master")
repo.commit("v0.8.1")
repo.commit("HEAD~10")

Iterate over 50 commits. If you need paging, you can specify a number of commits to skip.

fifty_first_commits = list(repo.iter_commits("master", max_count=50))
assert len(fifty_first_commits) == 50
# This will return commits 21-30 from the commit list as traversed backwards from master.
ten_commits_past_twenty = list(repo.iter_commits("master", max_count=10, skip=20))
assert len(ten_commits_past_twenty) == 10
assert fifty_first_commits[20:30] == ten_commits_past_twenty
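
The relationship between skip and max_count can be pictured with a plain iterator. This is a conceptual sketch that assumes paging behaves like slicing a stream of commits; the page helper and the commit-N strings are made up for illustration.

```python
from itertools import islice


def page(items, max_count, skip=0):
    """Return at most max_count items after skipping the first skip items."""
    return list(islice(items, skip, skip + max_count))


# Stand-in for a commit history, newest first (hypothetical names).
history = [f"commit-{number}" for number in range(100)]

first_fifty = page(iter(history), max_count=50)
ten_past_twenty = page(iter(history), max_count=10, skip=20)
```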

A commit object carries all sorts of metadata

headcommit = repo.head.commit
assert len(headcommit.hexsha) == 40
assert len(headcommit.parents) > 0
assert headcommit.tree.type == "tree"
assert len(headcommit.author.name) != 0
assert isinstance(headcommit.authored_date, int)
assert len(headcommit.committer.name) != 0
assert isinstance(headcommit.committed_date, int)
assert headcommit.message != ""

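Note: Date and time are represented in a seconds since epoch format. Conversion to a human-readable form can be accomplished with the various time module methods.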

import time

time.asctime(time.gmtime(headcommit.committed_date))
time.strftime("%a, %d %b %Y %H:%M", time.gmtime(headcommit.committed_date))
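
As an alternative to the time module, the datetime module can produce a timezone-aware value from the same seconds-since-epoch integer. The helper name and the sample timestamp below are hypothetical; in practice the input would be a value such as headcommit.committed_date.

```python
from datetime import datetime, timezone


def format_commit_time(seconds_since_epoch):
    """Render a seconds-since-epoch value as an ISO 8601 timestamp in UTC."""
    moment = datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc)
    return moment.isoformat()


example_timestamp = 1_600_000_000  # hypothetical committed_date value
formatted = format_commit_time(example_timestamp)
```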

You can traverse a commit's ancestry by chaining calls to parents

assert headcommit.parents[0].parents[0].parents[0] == repo.commit("master^^^")

The above corresponds to master^^^ or master~3 in Git parlance.
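
Why the two spellings agree can be seen on a toy history. The dictionary below is a made-up commit graph, not data from a real repository, and first_parent_ancestor is a hypothetical helper that follows only the first parent at each step.

```python
# Toy history: each commit name maps to its parents (first parent first).
parents = {
    "d": ["c"],
    "c": ["b", "x"],  # a merge commit with two parents
    "b": ["a"],
    "x": ["a"],
    "a": [],
}


def first_parent_ancestor(commit, generations, history):
    """Walk the given number of steps along first parents, like <rev>~<n>."""
    for _ in range(generations):
        commit = history[commit][0]
    return commit


# Chaining parents[0] three times, like d^^^ ...
chained = parents[parents[parents["d"][0]][0]][0]
# ... reaches the same commit as walking three first-parent steps, like d~3.
tilde = first_parent_ancestor("d", 3, parents)
```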

The Tree object

A tree records pointers to the contents of a directory. Let's say you want the root tree of the latest commit on the master branch

tree = repo.heads.master.commit.tree
assert len(tree.hexsha) == 40

Once you have a tree, you can get its contents

assert len(tree.trees) > 0  # Trees are subdirectories.
assert len(tree.blobs) > 0  # Blobs are files.
assert len(tree.blobs) + len(tree.trees) == len(tree)

It is useful to know that a tree behaves like a list with the ability to query entries by name

self.assertEqual(tree["smmap"], tree / "smmap")  # Access by index and by sub-path.
for entry in tree:  # Intuitive iteration of tree members.
    print(entry)
blob = tree.trees[1].blobs[0]  # Let's get a blob in a sub-tree.
assert blob.name
assert len(blob.path) < len(blob.abspath)
self.assertEqual(tree.trees[1].name + "/" + blob.name, blob.path)  # This is how relative blob path generated.
self.assertEqual(tree[blob.path], blob)  # You can use paths like 'dir/file' in tree.

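There is a convenience method that allows you to get a named sub-object from a tree, with a syntax similar to how paths are written on a POSIX system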

assert tree / "smmap" == tree["smmap"]
assert tree / blob.path == tree[blob.path]

You can also get a commit's root tree directly from the repository

# This example shows the various types of allowed ref-specs.
assert repo.tree() == repo.head.commit.tree
past = repo.commit("HEAD~5")
assert repo.tree(past) == repo.tree(past.hexsha)
self.assertEqual(repo.tree("v0.8.1").type, "tree")  # Yes, you can provide any refspec - works everywhere.

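As trees allow direct access to their immediate child entries only, use the traverse method to obtain an iterator that retrieves entries recursively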

assert len(tree) < len(list(tree.traverse()))
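
The difference between direct children and a recursive traversal can be sketched with nested dictionaries. This is a toy model rather than GitPython's implementation: directories are dicts, files are strings, and the layout and the traverse helper are invented for the example.

```python
# Toy tree: directories are dicts, files are strings (hypothetical layout).
root = {
    "README": "text",
    "lib": {
        "core.py": "code",
        "util": {"helpers.py": "code"},
    },
}


def traverse(tree, prefix=""):
    """Yield the path of every entry below tree, descending into subtrees."""
    for name, entry in tree.items():
        path = prefix + name
        yield path
        if isinstance(entry, dict):
            yield from traverse(entry, path + "/")


direct_children = list(root)        # only the top-level entries
all_entries = list(traverse(root))  # every entry at any depth
```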

Note

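If trees return Submodule objects, they will assume that they exist at the current head's commit. The tree a submodule originated from may be rooted at another commit, though, which the submodule does not know. That is why the caller has to set the submodule's owning or parent commit using the set_parent_commit(my_commit) method.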

The Index object

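The Git index is the stage containing changes to be written with the next commit, or where merges finally have to take place. You may freely access and manipulate this information using the IndexFile object. Modify the index with ease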

index = repo.index
# The index contains all blobs in a flat list.
assert len(list(index.iter_blobs())) == len([o for o in repo.head.commit.tree.traverse() if o.type == "blob"])
# Access blob objects.
for (_path, _stage), _entry in index.entries.items():
    pass
new_file_path = os.path.join(repo.working_tree_dir, "new-file-name")
open(new_file_path, "w").close()
index.add([new_file_path])  # Add a new file to the index.
index.remove(["LICENSE"])  # Remove an existing one.
assert os.path.isfile(os.path.join(repo.working_tree_dir, "LICENSE"))  # Working tree is untouched.

self.assertEqual(index.commit("my commit message").type, "commit")  # Commit changed index.
repo.active_branch.commit = repo.commit("HEAD~1")  # Forget last commit.

from git import Actor

author = Actor("An author", "author@example.com")
committer = Actor("A committer", "committer@example.com")
# Commit with a commit message, author, and committer.
index.commit("my commit message", author=author, committer=committer)

Create new indices from other trees or as the result of a merge. Write that result to a new index file for later inspection.

from git import IndexFile

# Load a tree into a temporary index, which exists just in memory.
IndexFile.from_tree(repo, "HEAD~1")
# Merge two trees three-way into memory...
merge_index = IndexFile.from_tree(repo, "HEAD~10", "HEAD", repo.merge_base("HEAD~10", "HEAD"))
# ...and persist it.
merge_index.write(os.path.join(rw_dir, "merged_index"))
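
The rule behind a three-way merge can be sketched on plain dictionaries that map paths to content identifiers. This is a conceptual illustration only and makes no claim about how git or GitPython merge trees internally; merge_trees and the sample data are hypothetical.

```python
def merge_trees(base, ours, theirs):
    """Three-way merge of {path: content_id} mappings.

    Returns the merged mapping and a list of conflicting paths.
    """
    merged, conflicts = {}, []
    for path in sorted(set(base) | set(ours) | set(theirs)):
        b, o, t = base.get(path), ours.get(path), theirs.get(path)
        if o == t:
            result = o  # both sides agree
        elif o == b:
            result = t  # only their side changed
        elif t == b:
            result = o  # only our side changed
        else:
            conflicts.append(path)  # both sides changed differently
            continue
        if result is not None:  # None means the path was deleted
            merged[path] = result
    return merged, conflicts


base = {"a.txt": "1", "b.txt": "1", "c.txt": "1"}
ours = {"a.txt": "2", "b.txt": "1", "c.txt": "3"}
theirs = {"a.txt": "1", "b.txt": "1", "c.txt": "4", "d.txt": "1"}
merged, conflicts = merge_trees(base, ours, theirs)
```

The common base is what lets the merge tell "changed on one side" apart from "changed on both sides", which is why a merge base is looked up before merging.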

Handling Remotes

Remotes are used as aliases for a foreign repository to ease pushing to and fetching from it

empty_repo = git.Repo.init(os.path.join(rw_dir, "empty"))
origin = empty_repo.create_remote("origin", repo.remotes.origin.url)
assert origin.exists()
assert origin == empty_repo.remotes.origin == empty_repo.remotes["origin"]
origin.fetch()  # Ensure we actually have data. fetch() returns useful information.
# Set up a local tracking branch of a remote branch.
empty_repo.create_head("master", origin.refs.master)  # Create local branch "master" from remote "master".
empty_repo.heads.master.set_tracking_branch(origin.refs.master)  # Set local "master" to track remote "master".
empty_repo.heads.master.checkout()  # Check out local "master" to working tree.
# Three above commands in one:
empty_repo.create_head("master", origin.refs.master).set_tracking_branch(origin.refs.master).checkout()
# Rename remotes.
origin.rename("new_origin")
# Push and pull behave similarly to `git push|pull`.
origin.pull()
origin.push()  # Attempt push, ignore errors.
origin.push().raise_if_error()  # Push and raise error if it fails.
# assert not empty_repo.delete_remote(origin).exists()     # Create and delete remotes.

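You can easily access configuration information for a remote by accessing options as if they were attributes. The modification of remote configuration is more explicit, though.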

assert origin.url == repo.remotes.origin.url
with origin.config_writer as cw:
    cw.set("pushurl", "other_url")

# Please note that in Python 2, writing origin.config_writer.set(...) is totally
# safe. In py3 __del__ calls can be delayed, thus not writing changes in time.

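You can also specify per-call custom environments using a new context manager on the Git command, e.g. for using a specific SSH key. The following example works with git starting at v2.3: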

ssh_cmd = 'ssh -i id_deployment_key'
with repo.git.custom_environment(GIT_SSH_COMMAND=ssh_cmd):
    repo.remotes.origin.fetch()

This one sets a custom script to be executed in place of ssh, and can be used with git prior to v2.3:

ssh_executable = os.path.join(rw_dir, 'my_ssh_executable.sh')
with repo.git.custom_environment(GIT_SSH=ssh_executable):
    repo.remotes.origin.fetch()

Here is an example executable that can be used in place of the ssh_executable above:

#!/bin/sh
ID_RSA=/var/lib/openshift/5562b947ecdd5ce939000038/app-deployments/id_rsa
exec /usr/bin/ssh -o StrictHostKeyChecking=no -i $ID_RSA "$@"

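Please note that the script must be executable (i.e. chmod +x script.sh). StrictHostKeyChecking=no is used to avoid prompts asking to save the host's key to ~/.ssh/known_hosts, which happens in case you run this as a daemon.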
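
If the wrapper script is generated from Python, the executable bit can be set with os.chmod instead of a separate chmod +x call. The following is a sketch under stated assumptions: write_executable is a hypothetical helper, the key path /path/to/id_rsa is a placeholder, and the script is written to a temporary directory so the example can run anywhere.

```python
import os
import stat
import tempfile

# Placeholder script body; the key path is hypothetical.
SCRIPT = """#!/bin/sh
exec /usr/bin/ssh -o StrictHostKeyChecking=no -i /path/to/id_rsa "$@"
"""


def write_executable(path, content):
    """Write content to path and add execute permission, like chmod +x."""
    with open(path, "w") as handle:
        handle.write(content)
    mode = os.stat(path).st_mode
    os.chmod(path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return path


script_dir = tempfile.mkdtemp()
script_path = write_executable(os.path.join(script_dir, "my_ssh_executable.sh"), SCRIPT)
```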

You might also have a look at Git.update_environment(...) in case you want to set up a changed environment more permanently.

Submodule Handling

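Submodules can be conveniently handled using the methods provided by GitPython. As an added benefit, GitPython provides functionality that behaves smarter and is less error-prone than the original C-Git implementation; that is, GitPython tries hard to keep your repository consistent when updating submodules or adjusting the existing configuration.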

repo = self.rorepo
sms = repo.submodules

assert len(sms) == 1
sm = sms[0]
self.assertEqual(sm.name, "gitdb")  # GitPython has gitdb as its one and only (direct) submodule...
self.assertEqual(sm.children()[0].name, "smmap")  # ...which has smmap as its one and only submodule.

# The module is the repository referenced by the submodule.
assert sm.module_exists()  # The module is available, which doesn't have to be the case.
assert sm.module().working_tree_dir.endswith("gitdb")
# The submodule's absolute path is the module's path.
assert sm.abspath == sm.module().working_tree_dir
self.assertEqual(len(sm.hexsha), 40)  # Its sha defines the commit to check out.
assert sm.exists()  # Yes, this submodule is valid and exists.
# Read its configuration conveniently.
assert sm.config_reader().get_value("path") == sm.path
self.assertEqual(len(sm.children()), 1)  # Query the submodule hierarchy.

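In addition to the query functionality, you can move the submodule's repository to a different path <move(...)>, write its configuration <config_writer().set_value(...).release()>, update its working tree <update(...)>, and remove or add them <remove(...), add(...)>.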

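If you obtained your submodule object by traversing a tree object that is not rooted at the head's commit, you have to inform the submodule about its actual commit, from which the data is retrieved, by using the set_parent_commit(...) method.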

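The special RootModule type allows you to treat your superproject (master repository) as the root of a hierarchy of submodules, which allows very convenient submodule handling. Its update(...) method is reimplemented to provide an advanced way of updating submodules as they change their values over time. The update method will track changes and make sure your working tree and submodule checkouts stay consistent, which is very useful in case submodules get deleted or added, to name just two of the handled cases.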

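Additionally, GitPython adds functionality to track a specific branch instead of just a commit. Supported by customized update methods, you are able to automatically update submodules to the latest revision available in the remote repository, as well as to keep track of changes and movements of these submodules. To use it, set the name of the branch you want to track in the submodule.$name.branch option of the .gitmodules file, and use GitPython's update methods on the resulting repository with the to_latest_revision parameter turned on. In the latter case, the SHA of your submodule will be ignored; instead, a local tracking branch will be updated to the respective remote branch automatically, provided there are no local changes. The resulting behavior is much like that of svn::externals, which can be useful at times.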
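
To see which submodules have a branch option configured, the .gitmodules file can be inspected with the standard configparser module. This is a rough sketch, not the way GitPython reads the file: the submodule name, path, and URL are made up, tracked_branches is a hypothetical helper, and the sample omits the indentation that git usually writes.

```python
import configparser

# Hypothetical .gitmodules content; name, path, and URL are made up.
GITMODULES = """
[submodule "mysubrepo"]
path = path/to/subrepo
url = https://example.com/mysubrepo.git
branch = master
"""


def tracked_branches(text):
    """Map each submodule name to its configured branch, if it has one."""
    parser = configparser.ConfigParser(interpolation=None, delimiters=("=",))
    parser.read_string(text)
    prefix = 'submodule "'
    branches = {}
    for section in parser.sections():
        if section.startswith(prefix) and parser.has_option(section, "branch"):
            name = section[len(prefix):-1]  # strip the prefix and closing quote
            branches[name] = parser.get(section, "branch")
    return branches


branches = tracked_branches(GITMODULES)
```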

Obtaining Diff Information

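Diffs can generally be obtained from subclasses of Diffable, as they provide the diff method. This operation yields a DiffIndex, which allows you to easily access diff information about paths.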

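Diffs can be made between the index and trees, the index and the working tree, trees and trees, as well as trees and the working copy. If commits are involved, their tree will be used implicitly.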

hcommit = repo.head.commit
hcommit.diff()  # diff tree against index.
hcommit.diff("HEAD~1")  # diff tree against previous tree.
hcommit.diff(None)  # diff tree against working tree.

index = repo.index
index.diff()  # diff index against itself yielding empty diff.
index.diff(None)  # diff index against working copy.
index.diff("HEAD")  # diff index against current HEAD tree.

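The item returned is a DiffIndex, which is essentially a list of Diff objects. It provides additional filtering to ease finding what you might be looking for.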

# Traverse added Diff objects only
for diff_added in hcommit.diff("HEAD~1").iter_change_type("A"):
    print(diff_added)

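Use the diff framework if you want to implement git-status-like functionality.

  • A diff between the index and the tree of the commit your HEAD points to

      • Use repo.index.diff(repo.head.commit)

  • A diff between the index and the working tree

      • Use repo.index.diff(None)

  • A list of untracked files

      • Use repo.untracked_files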
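
How these three comparisons combine into a status report can be sketched with dictionaries standing in for the HEAD tree, the index, and the working tree. Everything here is a toy model: summarize_status, the paths, and the version strings are hypothetical and do not come from GitPython.

```python
def summarize_status(head_tree, index, working_tree):
    """Compare three {path: content_id} snapshots and group the differences."""
    staged = sorted(
        path for path in set(head_tree) | set(index)
        if head_tree.get(path) != index.get(path)   # index vs. HEAD tree
    )
    unstaged = sorted(
        path for path in index
        if working_tree.get(path) != index[path]    # index vs. working tree
    )
    untracked = sorted(path for path in working_tree if path not in index)
    return {"staged": staged, "unstaged": unstaged, "untracked": untracked}


head_tree = {"README": "v1", "setup.py": "v1"}
index = {"README": "v2", "setup.py": "v1"}
working_tree = {"README": "v2", "setup.py": "v3", "notes.txt": "v1"}
report = summarize_status(head_tree, index, working_tree)
```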

Switching Branches

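To switch between branches similar to git checkout, you effectively need to point your HEAD symbolic reference to the new branch and reset your index and working copy to match. A simple manual way to do it is the following one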

# Reset our working tree 10 commits into the past.
past_branch = repo.create_head("past_branch", "HEAD~10")
repo.head.reference = past_branch
assert not repo.head.is_detached
# Reset the index and working tree to match the pointed-to commit.
repo.head.reset(index=True, working_tree=True)

# To detach your head, you have to point to a commit directly.
repo.head.reference = repo.commit("HEAD~5")
assert repo.head.is_detached
# Now our head points 15 commits into the past, whereas the working tree
# and index are 10 commits in the past.

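The previous approach, however, brutally overwrites the user's changes in the working copy and index, and is less sophisticated than git-checkout. The latter will generally prevent you from destroying your work. Use the safer approach as follows.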

# Check out the branch using git-checkout.
# It will fail as the working tree appears dirty.
self.assertRaises(git.GitCommandError, repo.heads.master.checkout)
repo.heads.past_branch.checkout()

Initializing a repository

In this example, we will initialize an empty repository, add an empty file to the index, and commit the change.

import git

repo_dir = os.path.join(rw_dir, "my-new-repo")
file_name = os.path.join(repo_dir, "new-file")

r = git.Repo.init(repo_dir)
# This function just creates an empty file.
open(file_name, "wb").close()
r.index.add([file_name])
r.index.commit("initial commit")

Please have a look at the individual methods, as they usually support a vast number of arguments to customize their behavior.

Using git directly

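In case you are missing functionality because it has not been wrapped, you may conveniently use the git command directly. It is owned by each repository instance.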

git = repo.git
git.checkout("HEAD", b="my_new_branch")  # Create a new branch.
git.branch("another-new-one")
git.branch("-D", "another-new-one")  # Pass strings for full control over argument order.
git.for_each_ref()  # '-' becomes '_' when calling it.

By default, the return value is a string containing the standard output produced by the command.

Keyword arguments translate to short and long keyword arguments on the command line. The special notation git.command(flag=True) will create a flag without a value, like command --flag.
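
The mapping from keyword arguments to flags can be pictured with a small function. This is a simplified sketch of the idea and not GitPython's actual code: kwargs_to_flags is a hypothetical name, and the exact formatting of values (for example "--max-count=5") is an assumption.

```python
def kwargs_to_flags(**kwargs):
    """Turn keyword arguments into a list of command-line flags (simplified)."""
    flags = []
    for name, value in kwargs.items():
        if value is None or value is False:
            continue  # nothing to emit for unset options
        if len(name) == 1:
            flags.append(f"-{name}")  # short option
            if value is not True:
                flags.append(str(value))
        else:
            option = "--" + name.replace("_", "-")  # long option
            flags.append(option if value is True else f"{option}={value}")
    return flags


example = kwargs_to_flags(b="my_new_branch", flag=True, max_count=5, skip=None)
```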

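If None is found in the arguments, it will be dropped silently. Lists and tuples passed as arguments will be unpacked recursively into individual arguments. Objects are converted to strings using the str(...) function.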
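
The three rules just described (drop None, unpack lists and tuples recursively, convert everything else with str) can be expressed in a few lines. flatten_args is a hypothetical helper written for illustration; it is not taken from GitPython.

```python
def flatten_args(args):
    """Recursively unpack lists and tuples, drop None, and stringify the rest."""
    flat = []
    for arg in args:
        if arg is None:
            continue  # None is dropped silently
        if isinstance(arg, (list, tuple)):
            flat.extend(flatten_args(arg))  # unpack nested sequences
        else:
            flat.append(str(arg))  # everything else becomes a string
    return flat


example = flatten_args(["log", None, ("-n", 3), ["--", ["README", None]]])
```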

Object Databases

git.Repo instances are powered by their object database instance, which is used when extracting any data or when writing new objects.

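The type of the database determines certain performance characteristics, such as the number of objects that can be read per second, the resource usage when reading large data files, and the average memory footprint of your application.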

GitDB

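GitDB is a pure-Python implementation of the Git object database. It is the default database in GitPython 0.3. It uses less memory when handling huge files, but will be 2 to 5 times slower when extracting large quantities of small objects from densely packed repositories: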

from git import GitDB, Repo
repo = Repo("path/to/repo", odbt=GitDB)

GitCmdObjectDB

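The git command database uses persistent git-cat-file instances to read repository information. These operate very fast under all conditions, but consume additional memory for the process itself. When extracting large files, memory usage will be much higher than with GitDB: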

from git import GitCmdObjectDB, Repo
repo = Repo("path/to/repo", odbt=GitCmdObjectDB)

Git Command Debugging and Customization

Using environment variables, you can further adjust the behavior of the git command.

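  • GIT_PYTHON_TRACE

      • If set to non-0, all executed git commands will be shown as they happen.

      • If set to full, the executed git command and its entire output on stdout and stderr will be shown as they happen.

    NOTE: All logging is output using a Python logger, so make sure your program is configured to show INFO-level messages. If this is not the case, try adding the following to your program:

    import logging
    logging.basicConfig(level=logging.INFO)

  • GIT_PYTHON_GIT_EXECUTABLE

      • If set, it should contain the full path to the git executable, e.g. c:\Program Files (x86)\Git\bin\git.exe on Windows or /usr/bin/git on Linux.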
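
Both variables can also be set from Python itself. The sketch below assumes that they have to be in place before GitPython is imported, which is an assumption about when they are read rather than something stated here. configure_git_python_environment is a hypothetical helper, and shutil.which is used only to guess the location of the git executable.

```python
import logging
import os
import shutil


def configure_git_python_environment(trace="full", executable=None):
    """Set the environment variables described above and enable INFO logging."""
    os.environ["GIT_PYTHON_TRACE"] = trace
    if executable is None:
        executable = shutil.which("git")  # None if git is not on PATH
    if executable:
        os.environ["GIT_PYTHON_GIT_EXECUTABLE"] = executable
    logging.basicConfig(level=logging.INFO)  # make INFO-level messages visible
    return executable


git_path = configure_git_python_environment()
# Assumption: import GitPython only after this point so the settings take effect.
```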

And even more …

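There is more functionality in there, such as the ability to archive repositories, get stats and logs, blame, and probably a few other things that were not mentioned here.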

Check the unit tests for an in-depth introduction to how each function is supposed to be used.