Write your own formatter

As with creating your own lexer, writing a new formatter for Pygments is easy and straightforward.

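A formatter is a class that is initialized with some keyword arguments (the formatter options) and that must provide a format() method. Additionally, a formatter should provide a get_style_defs() method that returns the style definitions from the style in a form usable for the formatter's output format.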

Quickstart

The most basic formatter shipped with Pygments is the NullFormatter. It just sends the value of each token to the output stream:

from pygments.formatter import Formatter

class NullFormatter(Formatter):
    def format(self, tokensource, outfile):
        for ttype, value in tokensource:
            outfile.write(value)

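As you can see, the format() method is passed two parameters: tokensource and outfile. The first is an iterable of (token_type, value) tuples, the second a file-like object with a write() method.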

Because this formatter is so basic, it doesn't override the get_style_defs() method.
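To see the formatter in action, it can be passed to pygments.highlight() like any built-in formatter. The following is a minimal sketch; the choice of PythonLexer and the sample source string are assumptions made for illustration only.

```python
from pygments import highlight
from pygments.formatter import Formatter
from pygments.lexers import PythonLexer


class NullFormatter(Formatter):
    def format(self, tokensource, outfile):
        # Write every token value unchanged; the token type is ignored.
        for ttype, value in tokensource:
            outfile.write(value)


# Sample input chosen for this sketch; it ends with a newline because
# lexers normalise the trailing newline by default.
source = "print('hello')\n"

# With no outfile argument, highlight() returns the formatted text.
result = highlight(source, PythonLexer(), NullFormatter())
print(result, end='')
```

Since the formatter adds no markup, the result here is simply the source text again.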

Styles

Styles aren't instantiated, but their metaclass provides some class functions so that you can access the style definitions easily.

Styles are iterable and yield tuples of the form (ttype, d), where ttype is a token type and d is a dict with the following keys:

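'color': hex color value (e.g. 'ff0000' for red) or None if not defined.
'bold': True if the value should be bold.
'italic': True if the value should be italic.
'underline': True if the value should be underlined.
'bgcolor': hex color value for the background (e.g. 'eeeeee' for light gray) or None if not defined.
'border': hex color value for the border (e.g. '0000aa' for a dark blue) or None for no border.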

Additional keys might appear in the future; formatters should ignore all keys they don't support.
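As a small illustration of iterating over a style, the sketch below loads the 'default' style and keeps only the keys it knows how to handle. The names SUPPORTED, describe, and table are hypothetical helpers introduced for this example.

```python
from pygments.styles import get_style_by_name
from pygments.token import Token

# get_style_by_name() returns the style class itself, not an instance.
style = get_style_by_name('default')

# Keys this hypothetical formatter understands; any others are ignored.
SUPPORTED = ('color', 'bold', 'italic', 'underline')


def describe(d):
    """Return only the supported keys that are actually set."""
    return {key: d[key] for key in SUPPORTED if d.get(key)}


# Iterating over the style class yields (ttype, d) tuples.
table = {ttype: describe(d) for ttype, d in style}

for ttype in (Token.Keyword, Token.Comment):
    print(ttype, table[ttype])
```

Filtering by a fixed tuple of supported keys is one simple way to stay robust if new keys are added later.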

HTML 3.2 Formatter

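For a more complex example, let's implement an HTML 3.2 formatter. We don't use CSS but inline markup (<u>, <font>, etc.). Because this isn't good style, this formatter isn't in the standard library ;-)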

from pygments.formatter import Formatter

class OldHtmlFormatter(Formatter):

    def __init__(self, **options):
        Formatter.__init__(self, **options)

        # create a dict of (start, end) tuples that wrap the
        # value of a token so that we can use it in the format
        # method later
        self.styles = {}

        # we iterate over the style class, which yields
        # (token, style_dict) pairs with the parsed style values.
        for token, style in self.style:
            start = end = ''
            # colors are specified in hex: 'RRGGBB'
            if style['color']:
                start += '<font color="#%s">' % style['color']
                end = '</font>' + end
            if style['bold']:
                start += '<b>'
                end = '</b>' + end
            if style['italic']:
                start += '<i>'
                end = '</i>' + end
            if style['underline']:
                start += '<u>'
                end = '</u>' + end
            self.styles[token] = (start, end)

    def format(self, tokensource, outfile):
        # lastval is a string we use for caching
        # because it's possible that a lexer yields a number
        # of consecutive tokens with the same token type.
        # to minimize the size of the generated html markup we
        # try to join the values of same-type tokens here
        lastval = ''
        lasttype = None

        # wrap the whole output with <pre>
        outfile.write('<pre>')

        for ttype, value in tokensource:
            # if the token type doesn't exist in the stylemap
            # we try it with the parent of the token type
            # eg: parent of Token.Literal.String.Double is
            # Token.Literal.String
            while ttype not in self.styles:
                ttype = ttype.parent
            if ttype == lasttype:
                # the current token type is the same as in the
                # last iteration. cache it
                lastval += value
            else:
                # not the same token as last iteration, but we
                # have some data in the buffer. wrap it with the
                # defined style and write it to the output file
                if lastval:
                    stylebegin, styleend = self.styles[lasttype]
                    outfile.write(stylebegin + lastval + styleend)
                # set lastval/lasttype to current values
                lastval = value
                lasttype = ttype

        # if something is left in the buffer, write it to the
        # output file, then close the opened <pre> tag
        if lastval:
            stylebegin, styleend = self.styles[lasttype]
            outfile.write(stylebegin + lastval + styleend)
        outfile.write('</pre>\n')

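The comments should explain what is going on. Again, this formatter doesn't override the get_style_defs() method. If we had used CSS classes instead of inline HTML markup, we would need to generate the CSS first. That is what the get_style_defs() method is for.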
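The parent lookup in format() relies on token types forming a hierarchy in which every type has a parent attribute. A minimal sketch of that fallback in isolation follows; nearest_known and the known set are hypothetical names used only for this illustration.

```python
from pygments.token import Token


def nearest_known(ttype, known):
    """Walk up the token hierarchy until a type in `known` is found."""
    # Assumes the root Token is in `known`, so the loop always terminates.
    while ttype not in known:
        ttype = ttype.parent
    return ttype


# Pretend only these two token types have a style entry.
known = {Token, Token.Literal.String}

resolved_string = nearest_known(Token.Literal.String.Double, known)
resolved_other = nearest_known(Token.Name.Function, known)
print(resolved_string, resolved_other)
```

Here the double-quoted string type resolves to Token.Literal.String, while a type with no styled ancestor falls back to the root Token.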

Generating Style Definitions

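Some formatters, like the LatexFormatter and the HtmlFormatter, don't output inline markup but reference either macros or CSS classes. Because the definitions of those are not part of the output, the get_style_defs() method exists. It is passed one parameter (whether and how it is used is up to the formatter) and has to return a string or None.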
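As a rough sketch of how such a formatter might look, the hypothetical TinyCssFormatter below writes <span> elements with CSS classes and builds the matching rules in get_style_defs(). Using the short class names from pygments.token.STANDARD_TYPES and treating the argument as a CSS selector prefix are assumptions of this example, not requirements of the API. Unlike the HTML 3.2 listing, it also escapes token values with html.escape.

```python
from html import escape

from pygments import highlight
from pygments.formatter import Formatter
from pygments.lexers import PythonLexer
from pygments.token import STANDARD_TYPES


class TinyCssFormatter(Formatter):
    """Hypothetical formatter that emits CSS classes instead of inline tags."""

    def _css_class(self, ttype):
        # Fall back to the parent until a type with a standard short name
        # is found; the root Token is always in STANDARD_TYPES.
        while ttype not in STANDARD_TYPES:
            ttype = ttype.parent
        return STANDARD_TYPES[ttype]

    def get_style_defs(self, arg=''):
        # In this sketch `arg` is a selector prefix such as '.highlight'.
        lines = []
        for ttype, d in self.style:
            name = STANDARD_TYPES.get(ttype)
            if not name:
                continue
            rules = []
            if d['color']:
                rules.append('color: #%s' % d['color'])
            if d['bold']:
                rules.append('font-weight: bold')
            if d['italic']:
                rules.append('font-style: italic')
            if d['underline']:
                rules.append('text-decoration: underline')
            if rules:
                lines.append('%s .%s { %s }' % (arg, name, '; '.join(rules)))
        return '\n'.join(lines)

    def format(self, tokensource, outfile):
        outfile.write('<pre>')
        for ttype, value in tokensource:
            css_class = self._css_class(ttype)
            text = escape(value)  # protect <, > and & in the source
            if css_class:
                outfile.write('<span class="%s">%s</span>' % (css_class, text))
            else:
                outfile.write(text)
        outfile.write('</pre>\n')


formatter = TinyCssFormatter()
css = formatter.get_style_defs('.highlight')
html_out = highlight("def f(): pass\n", PythonLexer(), formatter)
print(css.splitlines()[0])
print(html_out)
```

The CSS string would typically be written once into a stylesheet, while the formatted output can be produced as often as needed.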
