Bio.Phylo.TreeConstruction module
Classes and methods for tree construction.
- class Bio.Phylo.TreeConstruction.DistanceMatrix(names, matrix=None)
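Bases: _Matrix
Distance matrix class that can be used for distance-based tree algorithms.
All diagonal elements will be zero no matter what the user provides.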
- __init__(names, matrix=None)
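Initialize the class.
- __setitem__(item, value)
Set Matrix's items to values.
- format_phylip(handle)
Write data in Phylip format to a given file-like object or handle.
The output stream is the input distance matrix format used with Phylip programs (e.g. 'neighbor'). See: http://evolution.genetics.washington.edu/phylip/doc/neighbor.html
- Parameters:
- handle : file or file-like object
A writable text-mode file handle or other object supporting the 'write' method, such as StringIO or sys.stdout.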
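The following is a minimal, standard-library-only sketch of what writing a distance matrix for a PHYLIP program involves: a taxon-count line, then one padded name and one row of distances per taxon, with the lower triangle mirrored across the diagonal. The function name `write_phylip_square`, the 12-character name column, and the four-decimal formatting are assumptions made for illustration; the text written by `format_phylip` may differ in its exact spacing.

```python
import io


def write_phylip_square(names, lower, handle):
    """Write a lower-triangular distance matrix as a square PHYLIP-style matrix.

    Hypothetical helper for illustration only; not part of Biopython.
    """
    n = len(names)
    handle.write(f"{n:5d}\n")
    for i, name in enumerate(names):
        # Mirror across the diagonal: d(i, j) is stored in the longer row.
        row = [lower[i][j] if j <= i else lower[j][i] for j in range(n)]
        values = "  ".join(f"{d:.4f}" for d in row)
        handle.write(f"{name:<12s}{values}\n")


names = ["Alpha", "Beta", "Gamma"]
# Lower-triangular layout: row i holds the distances to taxa 0..i.
lower = [[0.0], [0.230769, 0.0], [0.384615, 0.230769, 0.0]]

buffer = io.StringIO()  # any object with a write() method would do
write_phylip_square(names, lower, buffer)
print(buffer.getvalue())
```

Passing `sys.stdout` instead of the `StringIO` buffer would print the matrix directly, which mirrors the kind of handle the `handle` parameter above accepts.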
- class Bio.Phylo.TreeConstruction.DistanceCalculator(model='identity', skip_letters=None)
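Bases: object
Calculates the distance matrix from a DNA or protein sequence alignment.
This class calculates the distance matrix from a multiple sequence alignment of DNA or protein sequences, and the given name of the substitution model.
Currently only scoring matrices are used.
- Parameters:
- model : str
Name of the model matrix to be used to calculate distance. The attribute dna_models contains the available model names for DNA sequences and protein_models for protein sequences.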
Examples
Loading a small PHYLIP alignment from which to compute distances::
  >>> from Bio.Phylo.TreeConstruction import DistanceCalculator
  >>> from Bio import AlignIO
  >>> aln = AlignIO.read(open('TreeConstruction/msa.phy'), 'phylip')
  >>> print(aln)
  Alignment with 5 rows and 13 columns
  AACGTGGCCACAT Alpha
  AAGGTCGCCACAC Beta
  CAGTTCGCCACAA Gamma
  GAGATTTCCGCCT Delta
  GAGATCTCCGCCC Epsilon
DNA calculator with 'identity' model::
  >>> calculator = DistanceCalculator('identity')
  >>> dm = calculator.get_distance(aln)
  >>> print(dm)
  Alpha   0.000000
  Beta    0.230769    0.000000
  Gamma   0.384615    0.230769    0.000000
  Delta   0.538462    0.538462    0.538462    0.000000
  Epsilon 0.615385    0.384615    0.461538    0.153846    0.000000
      Alpha   Beta    Gamma   Delta   Epsilon
Protein calculator with 'blosum62' model::
  >>> calculator = DistanceCalculator('blosum62')
  >>> dm = calculator.get_distance(aln)
  >>> print(dm)
  Alpha   0.000000
  Beta    0.369048    0.000000
  Gamma   0.493976    0.250000    0.000000
  Delta   0.585366    0.547619    0.566265    0.000000
  Epsilon 0.700000    0.355556    0.488889    0.222222    0.000000
      Alpha   Beta    Gamma   Delta   Epsilon
Same calculation, using the new Alignment object::
  >>> from Bio.Phylo.TreeConstruction import DistanceCalculator
  >>> from Bio import Align
  >>> aln = Align.read('TreeConstruction/msa.phy', 'phylip')
  >>> print(aln)
  Alpha             0 AACGTGGCCACAT 13
  Beta              0 AAGGTCGCCACAC 13
  Gamma             0 CAGTTCGCCACAA 13
  Delta             0 GAGATTTCCGCCT 13
  Epsilon           0 GAGATCTCCGCCC 13
DNA calculator with 'identity' model::
  >>> calculator = DistanceCalculator('identity')
  >>> dm = calculator.get_distance(aln)
  >>> print(dm)
  Alpha   0.000000
  Beta    0.230769    0.000000
  Gamma   0.384615    0.230769    0.000000
  Delta   0.538462    0.538462    0.538462    0.000000
  Epsilon 0.615385    0.384615    0.461538    0.153846    0.000000
      Alpha   Beta    Gamma   Delta   Epsilon
Protein calculator with 'blosum62' model::
  >>> calculator = DistanceCalculator('blosum62')
  >>> dm = calculator.get_distance(aln)
  >>> print(dm)
  Alpha   0.000000
  Beta    0.369048    0.000000
  Gamma   0.493976    0.250000    0.000000
  Delta   0.585366    0.547619    0.566265    0.000000
  Epsilon 0.700000    0.355556    0.488889    0.222222    0.000000
      Alpha   Beta    Gamma   Delta   Epsilon
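To make the 'identity' numbers above concrete, here is a small standard-library sketch that recomputes them by hand as the fraction of alignment columns in which two sequences differ. The function `identity_distance` is a hypothetical helper written for this illustration, and it ignores gap or skip-letter handling; it is not the code `DistanceCalculator` runs.

```python
from itertools import combinations

# The five sequences from the msa.phy example above.
alignment = {
    "Alpha": "AACGTGGCCACAT",
    "Beta": "AAGGTCGCCACAC",
    "Gamma": "CAGTTCGCCACAA",
    "Delta": "GAGATTTCCGCCT",
    "Epsilon": "GAGATCTCCGCCC",
}


def identity_distance(seq1, seq2):
    """Fraction of columns where the two aligned sequences differ."""
    mismatches = sum(a != b for a, b in zip(seq1, seq2))
    return mismatches / len(seq1)


distances = {
    (name1, name2): identity_distance(alignment[name1], alignment[name2])
    for name1, name2 in combinations(alignment, 2)
}

for (name1, name2), d in distances.items():
    print(f"{name1:8s}{name2:8s}{d:.6f}")
```

Alpha and Beta differ in 3 of 13 columns, so their distance is 3/13, which rounds to the 0.230769 shown in the matrix above.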
- dna_models = ['benner22', 'benner6', 'benner74', 'blastn', 'dayhoff', 'feng', 'genetic', 'gonnet1992', 'hoxd70', 'johnson', 'jones', 'levin', 'mclachlan', 'mdm78', 'megablast', 'blastn', 'rao', 'risler', 'schneider', 'str', 'trans']
- protein_models = ['blastp', 'blosum45', 'blosum50', 'blosum62', 'blosum80', 'blosum90', 'pam250', 'pam30', 'pam70']
- models = ['identity', 'benner22', 'benner6', 'benner74', 'blastn', 'dayhoff', 'feng', 'genetic', 'gonnet1992', 'hoxd70', 'johnson', 'jones', 'levin', 'mclachlan', 'mdm78', 'megablast', 'blastn', 'rao', 'risler', 'schneider', 'str', 'trans', 'blastp', 'blosum45', 'blosum50', 'blosum62', 'blosum80', 'blosum90', 'pam250', 'pam30', 'pam70']
- __init__(model='identity', skip_letters=None)
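Initialize with a distance model.
- get_distance(msa)
Return a DistanceMatrix for an Alignment or MultipleSeqAlignment object.
- Parameters:
- msa : Alignment or MultipleSeqAlignment object
DNA or protein multiple sequence alignment.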
- class Bio.Phylo.TreeConstruction.TreeConstructor
Bases: object
Base class for all tree constructors.
- build_tree(msa)
Caller to build the tree from an Alignment or MultipleSeqAlignment object.
This should be implemented in subclasses.
- class Bio.Phylo.TreeConstruction.DistanceTreeConstructor(distance_calculator=None, method='nj')
Bases: TreeConstructor
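Distance-based tree constructor.
- Parameters:
- method : str
Distance tree construction method, 'nj' (default) or 'upgma'.
- distance_calculator : DistanceCalculator
The distance matrix calculator for multiple sequence alignment. It must be provided if build_tree will be called.
Examples
Loading a small PHYLIP alignment from which to compute distances, and then building a UPGMA tree::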
  >>> from Bio.Phylo.TreeConstruction import DistanceTreeConstructor
  >>> from Bio.Phylo.TreeConstruction import DistanceCalculator
  >>> from Bio import AlignIO
  >>> aln = AlignIO.read(open('TreeConstruction/msa.phy'), 'phylip')
  >>> constructor = DistanceTreeConstructor()
  >>> calculator = DistanceCalculator('identity')
  >>> dm = calculator.get_distance(aln)
  >>> upgmatree = constructor.upgma(dm)
  >>> print(upgmatree)
  Tree(rooted=True)
      Clade(branch_length=0, name='Inner4')
          Clade(branch_length=0.18749999999999994, name='Inner1')
              Clade(branch_length=0.07692307692307693, name='Epsilon')
              Clade(branch_length=0.07692307692307693, name='Delta')
          Clade(branch_length=0.11057692307692304, name='Inner3')
              Clade(branch_length=0.038461538461538464, name='Inner2')
                  Clade(branch_length=0.11538461538461536, name='Gamma')
                  Clade(branch_length=0.11538461538461536, name='Beta')
              Clade(branch_length=0.15384615384615383, name='Alpha')
Build a NJ tree::
  >>> njtree = constructor.nj(dm)
  >>> print(njtree)
  Tree(rooted=False)
      Clade(branch_length=0, name='Inner3')
          Clade(branch_length=0.18269230769230765, name='Alpha')
          Clade(branch_length=0.04807692307692307, name='Beta')
          Clade(branch_length=0.04807692307692307, name='Inner2')
              Clade(branch_length=0.27884615384615385, name='Inner1')
                  Clade(branch_length=0.051282051282051266, name='Epsilon')
                  Clade(branch_length=0.10256410256410259, name='Delta')
              Clade(branch_length=0.14423076923076922, name='Gamma')
Same example, using the new Alignment class::
  >>> from Bio.Phylo.TreeConstruction import DistanceTreeConstructor
  >>> from Bio.Phylo.TreeConstruction import DistanceCalculator
  >>> from Bio import Align
  >>> aln = Align.read(open('TreeConstruction/msa.phy'), 'phylip')
  >>> constructor = DistanceTreeConstructor()
  >>> calculator = DistanceCalculator('identity')
  >>> dm = calculator.get_distance(aln)
  >>> upgmatree = constructor.upgma(dm)
  >>> print(upgmatree)
  Tree(rooted=True)
      Clade(branch_length=0, name='Inner4')
          Clade(branch_length=0.18749999999999994, name='Inner1')
              Clade(branch_length=0.07692307692307693, name='Epsilon')
              Clade(branch_length=0.07692307692307693, name='Delta')
          Clade(branch_length=0.11057692307692304, name='Inner3')
              Clade(branch_length=0.038461538461538464, name='Inner2')
                  Clade(branch_length=0.11538461538461536, name='Gamma')
                  Clade(branch_length=0.11538461538461536, name='Beta')
              Clade(branch_length=0.15384615384615383, name='Alpha')
Build a NJ tree::
  >>> njtree = constructor.nj(dm)
  >>> print(njtree)
  Tree(rooted=False)
      Clade(branch_length=0, name='Inner3')
          Clade(branch_length=0.18269230769230765, name='Alpha')
          Clade(branch_length=0.04807692307692307, name='Beta')
          Clade(branch_length=0.04807692307692307, name='Inner2')
              Clade(branch_length=0.27884615384615385, name='Inner1')
                  Clade(branch_length=0.051282051282051266, name='Epsilon')
                  Clade(branch_length=0.10256410256410259, name='Delta')
              Clade(branch_length=0.14423076923076922, name='Gamma')
- methods = ['nj', 'upgma']
- __init__(distance_calculator=None, method='nj')
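Initialize the class.
- build_tree(msa)
Construct and return a Tree, Neighbor Joining or UPGMA.
- upgma(distance_matrix)
Construct and return a UPGMA tree.
Constructs and returns an Unweighted Pair Group Method with Arithmetic mean (UPGMA) tree.
- Parameters:
- distance_matrix : DistanceMatrix
The distance matrix for tree construction.
- nj(distance_matrix)
Construct and return a Neighbor Joining tree.
- Parameters:
- distance_matrix : DistanceMatrix
The distance matrix for tree construction.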
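As an illustration of what a neighbor-joining step does, the sketch below works through the first join by hand for the 'identity' distance matrix from the example above. It uses the textbook Q-criterion and branch-length formulas, which are assumed here rather than taken from the Biopython source; the helper names `dist` and `q` are hypothetical.

```python
names = ["Alpha", "Beta", "Gamma", "Delta", "Epsilon"]
# Lower-triangular 'identity' distances, as printed in the example above.
lower = [
    [0.0],
    [0.230769, 0.0],
    [0.384615, 0.230769, 0.0],
    [0.538462, 0.538462, 0.538462, 0.0],
    [0.615385, 0.384615, 0.461538, 0.153846, 0.0],
]


def dist(i, j):
    """Look up d(i, j) in the lower-triangular storage."""
    return lower[i][j] if i >= j else lower[j][i]


n = len(names)
row_sums = [sum(dist(i, k) for k in range(n)) for i in range(n)]


def q(i, j):
    """Textbook NJ criterion: Q(i, j) = (n - 2) d(i, j) - r_i - r_j."""
    return (n - 2) * dist(i, j) - row_sums[i] - row_sums[j]


# The pair with the smallest Q value is joined first.
pairs = [(i, j) for i in range(n) for j in range(i)]
best_i, best_j = min(pairs, key=lambda pair: q(*pair))

# Branch lengths from the two joined taxa to their new internal node.
d_ij = dist(best_i, best_j)
length_i = d_ij / 2 + (row_sums[best_i] - row_sums[best_j]) / (2 * (n - 2))
length_j = d_ij - length_i
lengths = {names[best_i]: length_i, names[best_j]: length_j}

print(lengths)
```

With these inputs the pair joined first is Delta and Epsilon, and the two branch lengths agree, up to the rounding of the input distances, with the `Delta` and `Epsilon` clades under `Inner1` in the NJ tree printed above. A full implementation would then replace the pair with the new node, update the distances, and repeat.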
- class Bio.Phylo.TreeConstruction.Scorer
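Bases: object
Base class for all tree scoring methods.
- get_score(tree, alignment)
Caller to get the score of a tree for the given alignment.
This should be implemented in subclasses.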
- class Bio.Phylo.TreeConstruction.TreeSearcher
Bases: object
Base class for all tree searching methods.
- search(starting_tree, alignment)
Caller to search for the best tree, given a starting tree.
This should be implemented in subclasses.
- class Bio.Phylo.TreeConstruction.NNITreeSearcher(scorer)
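Bases: TreeSearcher
Tree searching with the Nearest Neighbor Interchange (NNI) algorithm.
- Parameters:
- scorer : ParsimonyScorer
Parsimony scorer used to calculate the parsimony score of different trees during the NNI algorithm.
- __init__(scorer)
Initialize the class.
- search(starting_tree, alignment)
Implement the TreeSearcher.search method.
- Parameters:
- starting_tree : Tree
Starting tree of the NNI method.
- alignment : Alignment or MultipleSeqAlignment object
Multiple sequence alignment used to calculate the parsimony score of different NNI trees.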
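For readers unfamiliar with the move itself, here is a minimal sketch of a single nearest neighbor interchange. Trees are written as nested tuples purely for illustration (Biopython works on Tree and Clade objects instead), and `nni_neighbors` is a hypothetical helper, not a Biopython function.

```python
def nni_neighbors(tree):
    """Return the two NNI rearrangements around one internal edge.

    `tree` has the shape ((a, b), (c, d)), where a, b, c and d are leaf
    names or subtrees. The internal edge separates (a, b) from (c, d);
    each rearrangement swaps one subtree across that edge.
    """
    (a, b), (c, d) = tree
    return [((a, c), (b, d)), ((a, d), (b, c))]


start = (("Delta", "Epsilon"), (("Gamma", "Beta"), "Alpha"))
neighbors = nni_neighbors(start)
for candidate in neighbors:
    print(candidate)
```

A searcher built on this idea would score each candidate (for instance with a parsimony scorer), keep the best one if it improves on the current tree, and repeat over all internal edges until no swap helps.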
- class Bio.Phylo.TreeConstruction.ParsimonyScorer(matrix=None)
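Bases: Scorer
Parsimony scorer with a scoring matrix.
This is a combination of the Fitch algorithm and the Sankoff algorithm. See ParsimonyTreeConstructor for usage.
- Parameters:
- matrix : _Matrix
Scoring matrix used in the parsimony score calculation.
- __init__(matrix=None)
Initialize the class.
- get_score(tree, alignment)
Calculate the parsimony score using the Fitch algorithm.
Calculate and return the parsimony score given a tree and the MSA, using either the Fitch algorithm (without a penalty matrix) or the Sankoff algorithm (with a matrix).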
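The Fitch algorithm mentioned above can be sketched in a few lines of plain Python. The version below is a textbook formulation on a rooted binary tree written as nested tuples, an assumption made for illustration; `fitch_column` and `fitch_score` are hypothetical helpers and are not the code that `get_score` runs. The sequences are the five from the msa.phy example.

```python
alignment = {
    "Alpha": "AACGTGGCCACAT",
    "Beta": "AAGGTCGCCACAC",
    "Gamma": "CAGTTCGCCACAA",
    "Delta": "GAGATTTCCGCCT",
    "Epsilon": "GAGATCTCCGCCC",
}

# A rooted binary topology; leaves are sequence names.
tree = (("Delta", "Epsilon"), (("Gamma", "Beta"), "Alpha"))


def fitch_column(node, column):
    """Return (possible_states, changes) for one alignment column."""
    if isinstance(node, str):
        return {column[node]}, 0
    left_states, left_changes = fitch_column(node[0], column)
    right_states, right_changes = fitch_column(node[1], column)
    shared = left_states & right_states
    if shared:
        # The children agree on at least one state: no extra change.
        return shared, left_changes + right_changes
    # The children disagree: take the union and count one change.
    return left_states | right_states, left_changes + right_changes + 1


def fitch_score(tree, alignment):
    """Sum the minimum number of changes over all alignment columns."""
    length = len(next(iter(alignment.values())))
    total = 0
    for position in range(length):
        column = {name: seq[position] for name, seq in alignment.items()}
        total += fitch_column(tree, column)[1]
    return total


score = fitch_score(tree, alignment)
print(score)
```

Every substitution counts as one change here. The Sankoff variant generalizes this by charging a cost from a scoring matrix for each kind of change, which is the role the `matrix` parameter plays above.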
- class Bio.Phylo.TreeConstruction.ParsimonyTreeConstructor(searcher, starting_tree=None)
Bases: TreeConstructor
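Parsimony tree constructor.
- Parameters:
- searcher : TreeSearcher
Tree searcher used to search for the best parsimony tree.
- starting_tree : Tree
Starting tree provided to the searcher.
Examples
We will load an alignment, and then load various trees which have already been computed from it::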
  >>> from Bio import AlignIO, Phylo
  >>> aln = AlignIO.read(open('TreeConstruction/msa.phy'), 'phylip')
  >>> print(aln)
  Alignment with 5 rows and 13 columns
  AACGTGGCCACAT Alpha
  AAGGTCGCCACAC Beta
  CAGTTCGCCACAA Gamma
  GAGATTTCCGCCT Delta
  GAGATCTCCGCCC Epsilon
Load a starting tree::
  >>> starting_tree = Phylo.read('TreeConstruction/nj.tre', 'newick')
  >>> print(starting_tree)
  Tree(rooted=False, weight=1.0)
      Clade(branch_length=0.0, name='Inner3')
          Clade(branch_length=0.01421, name='Inner2')
              Clade(branch_length=0.23927, name='Inner1')
                  Clade(branch_length=0.08531, name='Epsilon')
                  Clade(branch_length=0.13691, name='Delta')
              Clade(branch_length=0.2923, name='Alpha')
          Clade(branch_length=0.07477, name='Beta')
          Clade(branch_length=0.17523, name='Gamma')
Build the parsimony tree from the starting tree::
  >>> scorer = Phylo.TreeConstruction.ParsimonyScorer()
  >>> searcher = Phylo.TreeConstruction.NNITreeSearcher(scorer)
  >>> constructor = Phylo.TreeConstruction.ParsimonyTreeConstructor(searcher, starting_tree)
  >>> pars_tree = constructor.build_tree(aln)
  >>> print(pars_tree)
  Tree(rooted=True, weight=1.0)
      Clade(branch_length=0.0)
          Clade(branch_length=0.19732999999999998, name='Inner1')
              Clade(branch_length=0.13691, name='Delta')
              Clade(branch_length=0.08531, name='Epsilon')
          Clade(branch_length=0.04194000000000003, name='Inner2')
              Clade(branch_length=0.01421, name='Inner3')
                  Clade(branch_length=0.17523, name='Gamma')
                  Clade(branch_length=0.07477, name='Beta')
              Clade(branch_length=0.2923, name='Alpha')
Same example, using the new Alignment class::
  >>> from Bio import Align, Phylo
  >>> alignment = Align.read(open('TreeConstruction/msa.phy'), 'phylip')
  >>> print(alignment)
  Alpha             0 AACGTGGCCACAT 13
  Beta              0 AAGGTCGCCACAC 13
  Gamma             0 CAGTTCGCCACAA 13
  Delta             0 GAGATTTCCGCCT 13
  Epsilon           0 GAGATCTCCGCCC 13
Load a starting tree::
  >>> starting_tree = Phylo.read('TreeConstruction/nj.tre', 'newick')
  >>> print(starting_tree)
  Tree(rooted=False, weight=1.0)
      Clade(branch_length=0.0, name='Inner3')
          Clade(branch_length=0.01421, name='Inner2')
              Clade(branch_length=0.23927, name='Inner1')
                  Clade(branch_length=0.08531, name='Epsilon')
                  Clade(branch_length=0.13691, name='Delta')
              Clade(branch_length=0.2923, name='Alpha')
          Clade(branch_length=0.07477, name='Beta')
          Clade(branch_length=0.17523, name='Gamma')
Build the parsimony tree from the starting tree::
  >>> scorer = Phylo.TreeConstruction.ParsimonyScorer()
  >>> searcher = Phylo.TreeConstruction.NNITreeSearcher(scorer)
  >>> constructor = Phylo.TreeConstruction.ParsimonyTreeConstructor(searcher, starting_tree)
  >>> pars_tree = constructor.build_tree(alignment)
  >>> print(pars_tree)
  Tree(rooted=True, weight=1.0)
      Clade(branch_length=0.0)
          Clade(branch_length=0.19732999999999998, name='Inner1')
              Clade(branch_length=0.13691, name='Delta')
              Clade(branch_length=0.08531, name='Epsilon')
          Clade(branch_length=0.04194000000000003, name='Inner2')
              Clade(branch_length=0.01421, name='Inner3')
                  Clade(branch_length=0.17523, name='Gamma')
                  Clade(branch_length=0.07477, name='Beta')
              Clade(branch_length=0.2923, name='Alpha')
- __init__(searcher, starting_tree=None)
Initialize the class.
- build_tree(alignment)
Build the tree.
- Parameters:
- alignment : MultipleSeqAlignment
Multiple sequence alignment used to calculate the parsimony tree.