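All code presented here originates from the GitPython test suite to ensure correctness. Knowing this also makes it easier to run the code for your own testing purposes; all you need is a developer installation of git-python.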


The first step is to create a git.Repo object to represent your repository.

from git import Repo

# rorepo is a Repo instance pointing to the git-python repository.
# For all you know, the first argument to Repo is a path to the repository
# you want to work with
repo = Repo(self.rorepo.working_tree_dir)
assert not repo.bare

In the above example, the directory self.rorepo.working_tree_dir equals /Users/mtrier/Development/git-python and is my working repository, which contains the .git directory. You can also use a bare repository.

bare_repo = Repo.init(os.path.join(rw_dir, 'bare-repo'), bare=True)
assert bare_repo.bare


repo.config_reader()             # get a config reader for read-only access
with repo.config_writer():       # get a config writer to change configuration
    pass                         # call release() to be sure changes are written and locks are released


assert not bare_repo.is_dirty()  # check the dirty state
repo.untracked_files             # retrieve a list of untracked files
# ['my_untracked_file']


cloned_repo = repo.clone(os.path.join(rw_dir, 'to/this/path'))
assert cloned_repo.__class__ is Repo     # clone an existing repository
assert Repo.init(os.path.join(rw_dir, 'path/for/new/repo')).__class__ is Repo


with open(os.path.join(rw_dir, 'repo.tar'), 'wb') as fp:
    repo.archive(fp)                 # write the repository contents to the tar file



assert os.path.isdir(cloned_repo.working_tree_dir)                   # directory with your work files
assert cloned_repo.git_dir.startswith(cloned_repo.working_tree_dir)  # directory containing the git repository
assert bare_repo.working_tree_dir is None                            # bare repositories have no working tree

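Heads are branches in git-speak. References are pointers to a specific commit or to other references. Heads and Tags are a kind of reference. GitPython allows you to query them rather intuitively.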

self.assertEqual(repo.head.ref, repo.heads.master,  # head is a sym-ref pointing to master
                 "It's ok if TC not running from `master`.")
self.assertEqual(repo.tags['0.3.5'], repo.tag('refs/tags/0.3.5'))   # you can access tags in various ways too
self.assertEqual(repo.refs.master, repo.heads['master'])            # .refs provides all refs, ie heads ...

if 'TRAVIS' not in os.environ:
    self.assertEqual(repo.refs['origin/master'], repo.remotes.origin.refs.master)  # ... remotes ...
self.assertEqual(repo.refs['0.3.5'], repo.tags['0.3.5'])             # ... and tags


new_branch = cloned_repo.create_head('feature')               # create a new branch ...
assert cloned_repo.active_branch != new_branch                # which wasn't checked out yet ...
self.assertEqual(new_branch.commit, cloned_repo.active_branch.commit)  # pointing to the checked-out commit
# It's easy to let a branch point to the previous commit, without affecting anything else
# Each reference provides access to the git object it points to, usually commits
assert new_branch.set_commit('HEAD~1').commit == cloned_repo.active_branch.commit.parents[0]


past = cloned_repo.create_tag('past', ref=new_branch,
                              message="This is a tag-object pointing to %s" % new_branch.name)
self.assertEqual(past.commit, new_branch.commit)        # the tag points to the specified commit
assert past.tag.message.startswith("This is")  # and its object carries the message provided

now = cloned_repo.create_tag('now')            # This is a tag-reference. It may not carry meta-data
assert now.tag is None

You can drill down to git objects through references and other objects. Some objects, like commits, have additional metadata to query.

assert now.commit.message != past.commit.message
# You can read objects directly through binary streams, no working tree required
assert (now.commit.tree / 'VERSION').data_stream.read().decode('ascii').startswith('3')

# You can traverse trees as well to handle all contained files of a particular commit
file_count = 0
tree_count = 0
tree = past.commit.tree
for item in tree.traverse():
    file_count += item.type == 'blob'
    tree_count += item.type == 'tree'
assert file_count and tree_count                        # we have accumulated all directories and files
self.assertEqual(len(tree.blobs) + len(tree.trees), len(tree))   # a tree is iterable on its children

Remotes let you handle fetch, pull and push operations, while reporting progress to progress delegates.

from git import RemoteProgress

class MyProgressPrinter(RemoteProgress):
    def update(self, op_code, cur_count, max_count=None, message=''):
        print(op_code, cur_count, max_count, cur_count / (max_count or 100.0), message or "NO MESSAGE")
# end

self.assertEqual(len(cloned_repo.remotes), 1)                    # we have been cloned, so should be one remote
self.assertEqual(len(bare_repo.remotes), 0)                      # this one was just initialized
origin = bare_repo.create_remote('origin', url=cloned_repo.working_tree_dir)
assert origin.exists()
for fetch_info in origin.fetch(progress=MyProgressPrinter()):
    print("Updated %s to %s" % (fetch_info.ref, fetch_info.commit))
# create a local branch at the latest fetched master. We specify the name statically, but you have all
# information to do it programmatically as well.
bare_master = bare_repo.create_head('master', origin.refs.master)
assert not bare_repo.delete_remote(origin).exists()
# push and pull behave very similarly

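The index is also called the stage in git-speak. It is used to prepare new commits, and can be used to keep the results of merge operations. Our index implementation allows you to stream data into the index, which is useful for bare repositories that have no working tree.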

self.assertEqual(new_branch.checkout(), cloned_repo.active_branch)     # checking out branch adjusts the wtree
self.assertEqual(new_branch.commit, past.commit)                       # Now the past is checked out

new_file_path = os.path.join(cloned_repo.working_tree_dir, 'my-new-file')
open(new_file_path, 'wb').close()                             # create new file in working tree
cloned_repo.index.add([new_file_path])                        # add it to the index
# Commit the changes to deviate master's history
cloned_repo.index.commit("Added a new file in the past - for later merge")

# prepare a merge
master = cloned_repo.heads.master                         # right-hand side is ahead of us, in the future
merge_base = cloned_repo.merge_base(new_branch, master)   # allows for a three-way merge
cloned_repo.index.merge_tree(master, base=merge_base)     # write the merge result into index
cloned_repo.index.commit("Merged past and now into future ;)",
                         parent_commits=(new_branch.commit, master.commit))

# now new_branch is ahead of master, which probably should be checked out and reset softly.
# note that all these operations didn't touch the working tree, as we managed it ourselves.
# This definitely requires you to know what you are doing :) !
assert os.path.basename(new_file_path) in new_branch.commit.tree  # new file is now in tree
master.commit = new_branch.commit            # let master point to most recent commit
cloned_repo.head.reference = master          # we adjusted just the reference, not the working tree or index

Submodules represent all aspects of git submodules, which allows you to query all of their related information and manipulate them in various ways.

# create a new submodule and check it out on the spot, setup to track master branch of `bare_repo`
# As our GitPython repository has submodules already that point to GitHub, make sure we don't
# interact with them
for sm in cloned_repo.submodules:
    assert not sm.remove().exists()                   # after removal, the sm doesn't exist anymore
sm = cloned_repo.create_submodule('mysubrepo', 'path/to/subrepo', url=bare_repo.git_dir, branch='master')

# .gitmodules was written and added to the index, which is now being committed
cloned_repo.index.commit("Added submodule")
assert sm.exists() and sm.module_exists()             # this submodule is definitely available
sm.remove(module=True, configuration=False)           # remove the working tree
assert sm.exists() and not sm.module_exists()         # the submodule itself is still available

# update all submodules, non-recursively to save time, this method is very powerful, go have a look
cloned_repo.submodule_update(recursive=False)
assert sm.module_exists()                             # The submodules working tree was checked out by update


References are the tips of your commit graph, from which you can easily examine the history of your project.

import git
repo = git.Repo.clone_from(self._small_repo_url(), os.path.join(rw_dir, 'repo'), branch='master')

heads = repo.heads
master = heads.master       # lists can be accessed by name for convenience
master.commit               # the commit pointed to by head called master
master.rename('new_name')   # rename heads

Tags are (usually immutable) references to a commit and/or a tag object.

tags = repo.tags
tagref = tags[0]
tagref.tag                  # tags may have tag objects carrying additional information
tagref.commit               # but they always point to commits
repo.delete_tag(tagref)     # delete or
repo.create_tag("my_tag")   # create tags using the repo for convenience

A symbolic reference is a special case of a reference, as it points to another reference instead of a commit.

head = repo.head            # the head points to the active branch/ref
master = head.reference     # retrieve the reference the head points to
master.commit               # from here you use it as any other reference
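To make the idea of a symbolic reference more concrete, here is a minimal sketch that classifies the text git stores in a HEAD file: either `ref: <refname>` for a symbolic reference, or a bare 40-character hex SHA when the head is detached. The helper name `parse_head` is hypothetical and not part of GitPython.

```python
import re


def parse_head(text):
    """Classify the content of a HEAD file (hypothetical helper, not GitPython API)."""
    text = text.strip()
    if text.startswith("ref: "):
        # Symbolic reference: HEAD names another reference.
        return ("symbolic", text[len("ref: "):])
    if re.fullmatch(r"[0-9a-f]{40}", text):
        # Detached head: HEAD names a commit directly.
        return ("detached", text)
    raise ValueError("unrecognized HEAD content: %r" % text)


print(parse_head("ref: refs/heads/master\n"))
```

In GitPython terms, the first case is roughly what `repo.head.reference` resolves, and the second is the situation in which `repo.head.is_detached` is true.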

Access to the reflog is easy.

log = master.log()
log[0]                      # first (i.e. oldest) reflog entry
log[-1]                     # last (i.e. most recent) reflog entry


You can easily create and delete reference types or modify where they point to.

new_branch = repo.create_head('new')     # create a new one
new_branch.commit = 'HEAD~10'            # set branch to another commit without changing index or working trees
repo.delete_head(new_branch)             # delete an existing head - only works if it is not checked out

Create or delete tags the same way, except that you cannot change them afterwards.

new_tag = repo.create_tag('my_new_tag', message='my message')
# You cannot change the commit a tag points to. Tags need to be re-created
self.assertRaises(AttributeError, setattr, new_tag, 'commit', repo.commit('HEAD~1'))

Change the symbolic reference to switch branches cheaply (without adjusting the index or the working tree).

new_branch = repo.create_head('another-branch')
repo.head.reference = new_branch



Git only knows 4 distinct object types: Blobs, Trees, Commits and Tags.


hc = repo.head.commit
hct = hc.tree
hc != hct                           # @NoEffect
hc != repo.tags[0]                  # @NoEffect
hc == repo.head.reference.commit    # @NoEffect


self.assertEqual(hct.type, 'tree')           # preset string type, being a class attribute
assert hct.size > 0                 # size in bytes
assert len(hct.hexsha) == 40
assert len(hct.binsha) == 20
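The lengths 40 and 20 checked above are those of a SHA-1 digest in hexadecimal and in binary form. As a sketch of where such an id comes from, the following computes the id git assigns to a blob using only the standard library. The helper `blob_sha` is hypothetical and not a GitPython function.

```python
import binascii
import hashlib


def blob_sha(data):
    """Return (hexsha, binsha) of a git blob holding `data` (hypothetical helper)."""
    # git hashes a header "<type> <size>\0" followed by the raw content
    header = b"blob " + str(len(data)).encode("ascii") + b"\0"
    digest = hashlib.sha1(header + data)
    return digest.hexdigest(), digest.digest()


hexsha, binsha = blob_sha(b"")
assert len(hexsha) == 40 and len(binsha) == 20
# the hex form is just the binary form written out as hex digits
assert binascii.hexlify(binsha).decode("ascii") == hexsha
print(hexsha)
```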

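Index objects are objects that can be put into git's index. These objects are trees, blobs and submodules, which additionally know about their path in the file system as well as their mode.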

self.assertEqual(hct.path, '')                  # root tree has no path
assert hct.trees[0].path != ''         # the first contained item has one though
self.assertEqual(hct.mode, 0o40000)              # trees have the mode of a linux directory
self.assertEqual(hct.blobs[0].mode, 0o100644)   # blobs have specific mode, comparable to a standard linux fs
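The mode values in the assertions above use the same bit layout as Unix file modes, so the standard `stat` module can decode them. This is a small sketch under that assumption; the constant names `TREE_MODE` and `BLOB_MODE` are illustrative only.

```python
import stat

TREE_MODE = 0o40000    # mode recorded for a tree (directory)
BLOB_MODE = 0o100644   # mode recorded for a regular, non-executable file

assert stat.S_ISDIR(TREE_MODE)               # the type bits say "directory"
assert stat.S_ISREG(BLOB_MODE)               # the type bits say "regular file"
assert stat.S_IMODE(BLOB_MODE) == 0o644      # the permission bits alone
print(stat.filemode(BLOB_MODE))
```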

Access blob data (or any object data) using streams.

hct.blobs[0].data_stream.read()  # stream object to read data from
hct.blobs[0].stream_data(open(os.path.join(rw_dir, 'blob_data'), 'wb'))  # write data to given stream


Commit objects contain information about a specific commit. Obtain commits using references as done in Examining References, or as follows.




fifty_first_commits = list(repo.iter_commits('master', max_count=50))
assert len(fifty_first_commits) == 50
# this will return commits 21-30 from the commit list as traversed backwards master
ten_commits_past_twenty = list(repo.iter_commits('master', max_count=10, skip=20))
assert len(ten_commits_past_twenty) == 10
assert fifty_first_commits[20:30] == ten_commits_past_twenty
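The relationship between `max_count` and `skip` asserted above is the same as slicing a lazily produced sequence. As an illustration only, here is that windowing on a stand-in list instead of a real commit history; the helper `take` is hypothetical.

```python
from itertools import islice


def take(history, max_count, skip=0):
    """Skip `skip` items, then return at most `max_count` items (hypothetical helper)."""
    return list(islice(history, skip, skip + max_count))


history = ["commit-%d" % i for i in range(100)]   # stand-in for a newest-first history
first_fifty = take(history, max_count=50)
ten_past_twenty = take(history, max_count=10, skip=20)
assert first_fifty[20:30] == ten_past_twenty
```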


headcommit = repo.head.commit
assert len(headcommit.hexsha) == 40
assert len(headcommit.parents) > 0
assert headcommit.tree.type == 'tree'
assert len(headcommit.author.name) != 0
assert isinstance(headcommit.authored_date, int)
assert len(headcommit.committer.name) != 0
assert isinstance(headcommit.committed_date, int)
assert headcommit.message != ''

Note: date times are represented in a seconds since epoch format. Conversion to human-readable form can be accomplished with the various time module methods.

import time
time.strftime("%a, %d %b %Y %H:%M", time.gmtime(headcommit.committed_date))
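An alternative sketch uses the `datetime` module, which makes the time zone explicit. The helper name `format_commit_time` is hypothetical; it accepts any integer number of seconds since the epoch, such as a commit's `committed_date`.

```python
from datetime import datetime, timezone


def format_commit_time(seconds_since_epoch):
    """Render seconds since the epoch as an ISO 8601 string in UTC (hypothetical helper)."""
    return datetime.fromtimestamp(seconds_since_epoch, tz=timezone.utc).isoformat()


print(format_commit_time(0))  # → 1970-01-01T00:00:00+00:00
```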

You can traverse a commit's ancestry by chaining calls to parents

assert headcommit.parents[0].parents[0].parents[0] == repo.commit('master^^^')

The above corresponds to master^^^ or master~3 in git parlance.
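The equivalence between chained first parents and the `~n` notation can be sketched on a toy commit chain. The class `ToyCommit` and the helper `nth_ancestor` are hypothetical stand-ins and not GitPython objects.

```python
class ToyCommit:
    """Minimal stand-in for a commit: a name and a tuple of parents."""

    def __init__(self, name, parents=()):
        self.name = name
        self.parents = tuple(parents)


def nth_ancestor(commit, n):
    """Follow the first parent n times, like `rev~n`."""
    for _ in range(n):
        commit = commit.parents[0]
    return commit


root = ToyCommit("c0")
c1 = ToyCommit("c1", [root])
c2 = ToyCommit("c2", [c1])
head = ToyCommit("c3", [c2])

# three chained first-parent steps reach the same commit as ~3
assert head.parents[0].parents[0].parents[0] is nth_ancestor(head, 3)
```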


A tree records pointers to the contents of a directory. Let's say you want the root tree of the latest commit on the master branch

tree = repo.heads.master.commit.tree
assert len(tree.hexsha) == 40


assert len(tree.trees) > 0          # trees are subdirectories
assert len(tree.blobs) > 0          # blobs are files
assert len(tree.blobs) + len(tree.trees) == len(tree)


self.assertEqual(tree['smmap'], tree / 'smmap')          # access by index and by sub-path
for entry in tree:                                         # intuitive iteration of tree members
    print(entry)
blob = tree.trees[1].blobs[0]                              # let's get a blob in a sub-tree
assert len(blob.path) < len(blob.abspath)
self.assertEqual(tree.trees[1].name + '/' + blob.name, blob.path)   # this is how relative blob path generated
self.assertEqual(tree[blob.path], blob)                             # you can use paths like 'dir/file' in tree


assert tree / 'smmap' == tree['smmap']
assert tree / blob.path == tree[blob.path]


# This example shows the various types of allowed ref-specs
assert repo.tree() == repo.head.commit.tree
past = repo.commit('HEAD~5')
assert repo.tree(past) == repo.tree(past.hexsha)
self.assertEqual(repo.tree('v0.8.1').type, 'tree')        # yes, you can provide any refspec - works everywhere


assert len(tree) < len(list(tree.traverse()))
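The assertion above holds because `len(tree)` counts only direct children, while a traversal also descends into sub-trees. A toy version with nested dictionaries shows the same effect; the function `traverse` here is an illustrative stand-in and not the GitPython method.

```python
def traverse(tree):
    """Yield (path, entry) for every entry below `tree`, depth first."""
    for name, entry in tree.items():
        yield name, entry
        if isinstance(entry, dict):
            # a dict plays the role of a sub-tree: descend into it
            for sub_path, sub_entry in traverse(entry):
                yield name + "/" + sub_path, sub_entry


# dicts stand in for trees, strings for blobs
toy_tree = {"README": "blob", "lib": {"a.py": "blob", "util": {"b.py": "blob"}}}
paths = [path for path, _ in traverse(toy_tree)]
assert len(toy_tree) < len(paths)
print(paths)
```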


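If trees return Submodule objects, they will assume that they exist at the current head's commit. The tree a submodule originated from may be rooted at another commit, though, which the submodule does not know. That is why the caller has to set the parent commit explicitly using the set_parent_commit(my_commit) method.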


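The git index is the stage containing changes to be written with the next commit, or where merges finally have to take place. You can access and manipulate this information using the IndexFile object. Modify the index with ease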

index = repo.index
# The index contains all blobs in a flat list
assert len(list(index.iter_blobs())) == len([o for o in repo.head.commit.tree.traverse() if o.type == 'blob'])
# Access blob objects
for (_path, _stage), entry in index.entries.items():
    pass
new_file_path = os.path.join(repo.working_tree_dir, 'new-file-name')
open(new_file_path, 'w').close()
index.add([new_file_path])                                             # add a new file to the index
index.remove(['LICENSE'])                                              # remove an existing one
assert os.path.isfile(os.path.join(repo.working_tree_dir, 'LICENSE'))  # working tree is untouched

self.assertEqual(index.commit("my commit message").type, 'commit')              # commit changed index
repo.active_branch.commit = repo.commit('HEAD~1')                      # forget last commit

from git import Actor
author = Actor("An author", "")
committer = Actor("A committer", "")
# commit by commit message and author and committer
index.commit("my commit message", author=author, committer=committer)


from git import IndexFile
# loads a tree into a temporary index, which exists just in memory
IndexFile.from_tree(repo, 'HEAD~1')
# merge two trees three-way into memory
merge_index = IndexFile.from_tree(repo, 'HEAD~10', 'HEAD', repo.merge_base('HEAD~10', 'HEAD'))
# and persist it
merge_index.write(os.path.join(rw_dir, 'merged_index'))


Remotes are used as aliases for foreign repositories to ease pushing to and fetching from them

empty_repo = git.Repo.init(os.path.join(rw_dir, 'empty'))
origin = empty_repo.create_remote('origin', repo.remotes.origin.url)
assert origin.exists()
assert origin == empty_repo.remotes.origin == empty_repo.remotes['origin']
origin.fetch()                  # assure we actually have data. fetch() returns useful information
# Setup a local tracking branch of a remote branch
empty_repo.create_head('master', origin.refs.master)  # create local branch "master" from remote "master"
empty_repo.heads.master.set_tracking_branch(origin.refs.master)  # set local "master" to track remote "master"
empty_repo.heads.master.checkout()  # checkout local "master" to working tree
# Three above commands in one:
empty_repo.create_head('master', origin.refs.master).set_tracking_branch(origin.refs.master).checkout()
# rename remotes
origin.rename('new_origin')
# push and pull behaves similarly to `git push|pull`
# assert not empty_repo.delete_remote(origin).exists()     # create and delete remotes


assert origin.url == repo.remotes.origin.url
with origin.config_writer as cw:
    cw.set("pushurl", "other_url")

# Please note that in python 2, writing origin.config_writer.set(...) is totally safe.
# In py3 __del__ calls can be delayed, thus not writing changes in time.

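You can also specify per-call custom environments using a new context manager on the Git command, e.g. for using a specific SSH key. The following example works with git starting at v2.3: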

ssh_cmd = 'ssh -i id_deployment_key'
with repo.git.custom_environment(GIT_SSH_COMMAND=ssh_cmd):
    repo.remotes.origin.fetch()

This one sets a custom script to be executed in place of ssh, and can be used with git prior to v2.3:

ssh_executable = os.path.join(rw_dir, '')
with repo.git.custom_environment(GIT_SSH=ssh_executable):
    repo.remotes.origin.fetch()

Here is an example executable that can be used in place of the ssh_executable above:

exec /usr/bin/ssh -o StrictHostKeyChecking=no -i $ID_RSA "$@"

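Please note that the script must be executable (i.e. chmod +x script.sh). StrictHostKeyChecking=no is used to avoid prompts asking to save the host's key to ~/.ssh/known_hosts, which happens in case you run this as a daemon.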
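If you generate such a wrapper from Python, the execute bits can be set with `os.chmod`. The following is a sketch under stated assumptions: the helper `write_ssh_wrapper`, the file name `ssh_wrapper.sh` and the key path are all placeholders chosen for illustration.

```python
import os
import stat
import tempfile


def write_ssh_wrapper(directory, key_path):
    """Write an ssh wrapper script and mark it executable (hypothetical helper)."""
    script_path = os.path.join(directory, "ssh_wrapper.sh")  # placeholder file name
    with open(script_path, "w") as fp:
        fp.write("#!/bin/sh\n")
        fp.write('exec /usr/bin/ssh -o StrictHostKeyChecking=no -i "%s" "$@"\n' % key_path)
    # equivalent of `chmod +x`: add the execute bits to the current mode
    mode = os.stat(script_path).st_mode
    os.chmod(script_path, mode | stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    return script_path


with tempfile.TemporaryDirectory() as tmp_dir:
    wrapper = write_ssh_wrapper(tmp_dir, "/path/to/id_rsa")  # placeholder key path
    assert os.access(wrapper, os.X_OK)
```

The returned path could then be passed as the GIT_SSH value in the custom environment shown earlier.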

You might also have a look at Git.update_environment(...) in case you want to set up a changed environment more permanently.


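Submodules can be conveniently handled using the methods provided by GitPython. As an added benefit, GitPython provides functionality that behaves smarter and is less error-prone than the original C git implementation; that is, GitPython tries hard to keep your repository consistent when updating submodules or adjusting the existing configuration.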

repo = self.rorepo
sms = repo.submodules

assert len(sms) == 1
sm = sms[0]
self.assertEqual(sm.name, 'gitdb')                         # git-python has gitdb as single submodule ...
self.assertEqual(sm.children()[0].name, 'smmap')           # ... which has smmap as single submodule

# The module is the repository referenced by the submodule
assert sm.module_exists()                         # the module is available, which doesn't have to be the case.
assert sm.module().working_tree_dir.endswith('gitdb')
# the submodule's absolute path is the module's path
assert sm.abspath == sm.module().working_tree_dir
self.assertEqual(len(sm.hexsha), 40)                       # Its sha defines the commit to checkout
assert sm.exists()                                # yes, this submodule is valid and exists
# read its configuration conveniently
assert sm.config_reader().get_value('path') == sm.path
self.assertEqual(len(sm.children()), 1)                    # query the submodule hierarchy

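In addition to the query functionality, you can move the submodule's repository to a different path <move(...)>, write its configuration <config_writer().set_value(...).release()>, update its working tree <update(...)>, and remove or add them <remove(...), add(...)>.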

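If you obtained your submodule object by traversing a tree object that is not rooted at the head's commit, you have to inform the submodule about its actual commit using the set_parent_commit(...) method.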

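The special RootModule type allows you to treat your master repository as the root of a hierarchy of submodules, which allows very convenient submodule handling. Its update(...) method is reimplemented to provide an advanced way of updating submodules as they change their values over time. The update method will track changes and make sure your working tree and submodule checkouts stay consistent, which is very useful when submodules get deleted or added, to name just two of the handled cases.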

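Additionally, GitPython adds functionality to track a specific branch instead of just a commit. Supported by customized update methods, you can automatically update submodules to the latest revision available in the remote repository, as well as keep track of changes and movements of these submodules. To use it, set the name of the branch you want to track in the submodule.$name.branch option of the .gitmodules file, and use the GitPython update methods on the resulting repository with the to_latest_revision parameter turned on. In the latter case, the sha of your submodule will be ignored; instead, a local tracking branch will be updated to the respective remote branch automatically, provided there are no local changes. The resulting behaviour is much like that of svn::externals, which can be useful at times.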


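Diffs can generally be obtained from subclasses of Diffable, as they provide the diff method. This operation yields a DiffIndex, allowing you to easily access diff information about paths.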


hcommit = repo.head.commit
hcommit.diff()                  # diff tree against index
hcommit.diff('HEAD~1')          # diff tree against previous tree
hcommit.diff(None)              # diff tree against working tree

index = repo.index
index.diff()                    # diff index against itself yielding empty diff
index.diff(None)                # diff index against working copy
index.diff('HEAD')              # diff index against current HEAD tree


# Traverse added Diff objects only
for diff_added in hcommit.diff('HEAD~1').iter_change_type('A'):
    print(diff_added)


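  • A diff between the index and the tree of the commit your HEAD points to

      • use repo.index.diff(repo.head.commit)

  • A diff between the index and the working tree

      • use repo.index.diff(None)

  • A list of untracked files

      • use repo.untracked_files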


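To switch between branches similar to git checkout, you effectively need to point your HEAD symbolic reference to the new branch and reset your index and working copy to match. A simple manual way to do it is the following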

# Reset our working tree 10 commits into the past
past_branch = repo.create_head('past_branch', 'HEAD~10')
repo.head.reference = past_branch
assert not repo.head.is_detached
# reset the index and working tree to match the pointed-to commit
repo.head.reset(index=True, working_tree=True)

# To detach your head, you have to point to a commit directly
repo.head.reference = repo.commit('HEAD~5')
assert repo.head.is_detached
# now our head points 15 commits into the past, whereas the working tree
# and index are 10 commits in the past

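The previous approach, however, brutally overwrites the user's changes in the working copy and index, and is less sophisticated than git-checkout. The latter will generally prevent you from destroying your work. Use the safer approach as follows.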

# checkout the branch using git-checkout. It will fail as the working tree appears dirty
self.assertRaises(git.GitCommandError, repo.heads.master.checkout)



import git

repo_dir = os.path.join(rw_dir, 'my-new-repo')
file_name = os.path.join(repo_dir, 'new-file')

r = git.Repo.init(repo_dir)
# This function just creates an empty file ...
open(file_name, 'wb').close()
r.index.add([file_name])             # ... which is then staged before committing
r.index.commit("initial commit")



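In case you are missing functionality because it has not been wrapped, you can conveniently use the git command directly. It is owned by each repository instance.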

git = repo.git
git.checkout('HEAD', b="my_new_branch")         # create a new branch
git.branch('-D', 'another-new-one')             # pass strings for full control over argument order
git.for_each_ref()                              # '-' becomes '_' when calling it


Keyword arguments translate to short and long keyword arguments on the command line. The special notion git.command(flag=True) will create a flag without value, like command --flag.

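If None is found in the arguments, it will be dropped silently. Lists and tuples passed as arguments will be unpacked recursively to individual arguments. Objects are converted to strings using the str(...) function.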
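The rules described above can be sketched as a small standalone function. This is an illustration of the described behaviour only, not GitPython's actual implementation; the name `to_cli_args` and details such as the `--key=value` form for long options are assumptions.

```python
def to_cli_args(*args, **kwargs):
    """Flatten positional and keyword arguments into a command-line list (illustrative only)."""
    result = []
    for key, value in kwargs.items():
        # one-letter keys become short options, longer keys become long options
        flag = ("-" if len(key) == 1 else "--") + key.replace("_", "-")
        if value is True:
            result.append(flag)                         # flag without a value
        elif value is not None and value is not False:
            if len(key) == 1:
                result.extend([flag, str(value)])       # short option with value
            else:
                result.append("%s=%s" % (flag, value))  # long option with value

    def flatten(items):
        for item in items:
            if item is None:
                continue                                # None is dropped silently
            if isinstance(item, (list, tuple)):
                yield from flatten(item)                # unpack recursively
            else:
                yield str(item)                         # everything else via str()

    result.extend(flatten(args))
    return result


print(to_cli_args("HEAD", b="my_new_branch"))
```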


git.Repo instances are powered by their object database instance, which is used when extracting any data or when writing new objects.



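GitDB is a pure-Python implementation of the git object database. It is the default database used in GitPython 0.3. It uses less memory when handling huge files, but will be 2 to 5 times slower when extracting large quantities of small objects from densely packed repositories: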

repo = Repo("path/to/repo", odbt=GitDB)


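The git command database uses persistent git-cat-file instances to read repository information. These operate very fast under all conditions, but consume additional memory for the process itself. When extracting large files, memory usage will be much higher than with GitDB: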

repo = Repo("path/to/repo", odbt=GitCmdObjectDB)




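  • If set to non-0, all executed git commands will be shown as they happen.

  • If set to full, the executed git command _and_ its entire output on stdout and stderr will be shown as they happen.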

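NOTE: All logging is output using a Python logger, so make sure your program is configured to show INFO-level messages. If this is not the case, try adding the following to your program: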

import logging
logging.basicConfig(level=logging.INFO)

  • If set, it should contain the full path to the git executable, e.g. c:\Program Files (x86)\Git\bin\git.exe on Windows or /usr/bin/git on Linux.
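If you are unsure what the full path of the git executable is on a given machine, the standard library can look it up on the search path. This is a minimal sketch; the helper name `find_git_executable` is hypothetical, and it returns None when no git is found.

```python
import os
import shutil


def find_git_executable():
    """Return the absolute path of the git executable on PATH, or None (hypothetical helper)."""
    path = shutil.which("git")
    return os.path.abspath(path) if path else None


print(find_git_executable())
```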