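textwrap -- Formatting Text Paragraphs

Purpose: Format text by adjusting where line breaks occur in a paragraph.

The textwrap module can be used to format text for output in situations where pretty-printing is desired. It offers programmatic functionality similar to the paragraph wrapping or filling features found in many text editors and word processors.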

Example Data

The examples in this section use the module textwrap_example.py, which contains the string sample_text.

# textwrap_example.py

sample_text = '''
    The textwrap module can be used to format text for output in
    situations where pretty-printing is desired.  It offers
    programmatic functionality similar to the paragraph wrapping
    or filling features found in many text editors.
    '''

Filling Paragraphs

The fill() function takes text as input and returns the formatted text.

# textwrap_fill.py

import textwrap
from textwrap_example import sample_text

print(textwrap.fill(sample_text, width=50))

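The results are less than ideal. The text is now left-justified, but the first line retains its indent, and the leading spaces of each subsequent line are embedded in the middle of the paragraph.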

$ python3 textwrap_fill.py

     The textwrap module can be used to format
text for output in     situations where pretty-
printing is desired.  It offers     programmatic
functionality similar to the paragraph wrapping
or filling features found in many text editors.
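
The same effect can be reproduced without the example module. The following is a minimal sketch using a hypothetical two-line string (not part of the original example data): fill() replaces each newline with a single space but does not collapse the indentation that follows it, so the spaces end up inside the paragraph.

```python
import textwrap

# Hypothetical two-line sample, indented the way it would be inside code.
text = "\n    first line\n    second line\n"

# Each newline becomes one space; the four-space indents are kept as-is.
result = textwrap.fill(text, width=50)
print(repr(result))
```

The leading whitespace of the paragraph survives, and the indent of the second source line appears as a run of spaces between "line" and "second".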

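Removing Existing Indentation

The previous example carried the source indentation into the middle of the output, so the paragraph is not formatted cleanly. Removing the common whitespace prefix from every line of the sample text with dedent() produces better results. It also makes it practical to use docstrings or multiline strings embedded in Python code, stripping away the indentation that comes from the formatting of the code itself. The sample string has an artificial indent level to illustrate this feature.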

# textwrap_dedent.py

import textwrap
from textwrap_example import sample_text

dedented_text = textwrap.dedent(sample_text)
print('Dedented:')
print(dedented_text)

The results are starting to look better.

$ python3 textwrap_dedent.py

Dedented:

The textwrap module can be used to format text for output in
situations where pretty-printing is desired.  It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.

Only the whitespace prefix common to every line is removed. If one line is indented more than the others, its extra whitespace is kept.

For input like this:

␣Line one.
␣␣␣Line two.
␣Line three.

the output becomes:

Line one.
␣␣Line two.
Line three.
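
The input and output above can be checked directly. This short sketch writes the three lines as a plain string, with ordinary spaces in place of the visible space markers:

```python
import textwrap

# One space is common to all three lines; line two has two extra spaces.
text = " Line one.\n   Line two.\n Line three."

dedented = textwrap.dedent(text)
print(dedented)
```

Only the single shared space is removed, so "Line two." keeps its two additional spaces.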

Combining Dedent and Fill

Next, the dedented text can be passed through fill() with a few different width values.

# textwrap_fill_width.py

import textwrap
from textwrap_example import sample_text

# strip() removes the blank first and last lines left by the triple quotes
dedented_text = textwrap.dedent(sample_text).strip()
for width in [45, 60]:
    print('{} Columns:\n'.format(width))
    print(textwrap.fill(dedented_text, width=width))
    print()

This produces output in the specified widths.

$ python3 textwrap_fill_width.py

45 Columns:

The textwrap module can be used to format
text for output in situations where pretty-
printing is desired.  It offers programmatic
functionality similar to the paragraph
wrapping or filling features found in many
text editors.

60 Columns:

The textwrap module can be used to format text for output in
situations where pretty-printing is desired.  It offers
programmatic functionality similar to the paragraph wrapping
or filling features found in many text editors.
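
To confirm that the width is treated as an upper bound on line length, a small check like the following can help. It is only a sketch: longest_line() is a hypothetical helper introduced here, and the text is a shortened copy of the sample paragraph.

```python
import textwrap

text = ("The textwrap module can be used to format text for output in "
        "situations where pretty-printing is desired.")


def longest_line(text, width):
    """Return the length of the longest line produced by fill()."""
    lines = textwrap.fill(text, width=width).splitlines()
    return max(len(line) for line in lines)


for width in [45, 60]:
    print('width={}: longest line is {} characters'.format(
        width, longest_line(text, width)))
```

Because no single word in this text is longer than either width, every wrapped line fits within the limit.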

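Indenting Blocks

Use indent() to add a consistent prefix to every line in a string. This example formats the same sample text as though it were part of an email message being quoted in a reply, using > as the prefix for each line.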

# textwrap_indent.py

import textwrap
from textwrap_example import sample_text

dedented_text = textwrap.dedent(sample_text)
wrapped = textwrap.fill(dedented_text, width=50)
wrapped += '\n\nSecond paragraph after a blank line.'
final = textwrap.indent(wrapped, '> ')

print('Quoted block:\n')
print(final)

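The block of text is split on newlines, the prefix is added to each line that contains text, and then the lines are combined into a new string and returned.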

$ python3 textwrap_indent.py

Quoted block:

>  The textwrap module can be used to format text
> for output in situations where pretty-printing is
> desired.  It offers programmatic functionality
> similar to the paragraph wrapping or filling
> features found in many text editors.

> Second paragraph after a blank line.

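To control which lines receive the prefix, pass a callable as the predicate argument to indent(). The callable is invoked for each line of text in turn, and the prefix is added only to lines for which the return value is true.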

# textwrap_indent_predicate.py

import textwrap
from textwrap_example import sample_text

def should_indent(line):
    print('Indent {!r}?'.format(line))
    return len(line.strip()) % 2 == 0

dedented_text = textwrap.dedent(sample_text)
wrapped = textwrap.fill(dedented_text, width=50)
final = textwrap.indent(wrapped, 'EVEN ', predicate=should_indent)

print('\nQuoted block:\n')
print(final)

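This example adds the prefix EVEN to lines that contain an even number of characters, not counting leading and trailing whitespace.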

$ python3 textwrap_indent_predicate.py

Indent ' The textwrap module can be used to format text\n'?
Indent 'for output in situations where pretty-printing is\n'?
Indent 'desired.  It offers programmatic functionality\n'?
Indent 'similar to the paragraph wrapping or filling\n'?
Indent 'features found in many text editors.'?

Quoted block:

EVEN  The textwrap module can be used to format text
for output in situations where pretty-printing is
EVEN desired.  It offers programmatic functionality
EVEN similar to the paragraph wrapping or filling
EVEN features found in many text editors.
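
The quoted block shown earlier leaves the blank line between paragraphs without a prefix, because indent() skips whitespace-only lines by default. A predicate can change that. The sketch below uses a hypothetical two-paragraph string and a predicate that accepts every line:

```python
import textwrap

# Hypothetical sample with an empty line between two paragraphs.
text = 'first paragraph\n\nsecond paragraph\n'

# Default behavior: the empty line receives no prefix.
default_quoted = textwrap.indent(text, '> ')
print(default_quoted)

# A predicate that always returns True prefixes blank lines as well.
all_quoted = textwrap.indent(text, '> ', predicate=lambda line: True)
print(all_quoted)
```

Whether blank lines should carry the prefix depends on the target format, so this is a choice rather than a correction.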

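Hanging Indents

Just as the width of the output can be set, the indent of the first line can be controlled independently of the subsequent lines.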

# textwrap_hanging_indent.py

import textwrap
from textwrap_example import sample_text

dedented_text = textwrap.dedent(sample_text).strip()
print(textwrap.fill(dedented_text,
                    initial_indent='',
                    subsequent_indent=' ' * 4,
                    width=50,
                    ))

This makes it possible to produce a hanging indent, where the first line is indented less than the other lines.

$ python3 textwrap_hanging_indent.py

The textwrap module can be used to format text for
    output in situations where pretty-printing is
    desired.  It offers programmatic functionality
    similar to the paragraph wrapping or filling
    features found in many text editors.
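
The indent values do not have to be whitespace. As an illustration, and assuming a bullet-list layout that is not part of the original examples, the first line can carry a marker while the following lines are aligned under the text:

```python
import textwrap

item = ("The textwrap module can be used to format text for output in "
        "situations where pretty-printing is desired.")

# '* ' marks the first line; two spaces align the continuation lines.
bullet = textwrap.fill(item,
                       initial_indent='* ',
                       subsequent_indent='  ',
                       width=40)
print(bullet)
```

Both indents count toward the width, so each line, including its prefix, stays within 40 characters.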

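Truncating Long Text

To truncate text to create a summary or preview, use shorten(). All existing whitespace, such as tabs, newlines, and runs of multiple spaces, is normalized to a single space. The text is then truncated to a length less than or equal to the requested length, breaking only at word boundaries so that no partial words appear in the result.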

# textwrap_shorten.py

import textwrap
from textwrap_example import sample_text

dedented_text = textwrap.dedent(sample_text)
original = textwrap.fill(dedented_text, width=50)

print('Original:\n')
print(original)

shortened = textwrap.shorten(original, 100)
shortened_wrapped = textwrap.fill(shortened, width=50)

print('\nShortened:\n')
print(shortened_wrapped)

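If non-whitespace text is removed from the original as part of the truncation, it is replaced with a placeholder value. The default placeholder is [...], and it can be replaced by passing a placeholder argument to shorten().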

$ python3 textwrap_shorten.py

Original:

 The textwrap module can be used to format text
for output in situations where pretty-printing is
desired.  It offers programmatic functionality
similar to the paragraph wrapping or filling
features found in many text editors.

Shortened:

The textwrap module can be used to format text for
output in situations where pretty-printing [...]
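
A shorter sketch of the placeholder behavior, using short strings rather than the sample text:

```python
import textwrap

# Whitespace is collapsed first; the result fits, so nothing is dropped.
fits = textwrap.shorten('Hello  world!', width=12)
print(fits)

# One character too narrow: the last word is replaced by the default placeholder.
truncated = textwrap.shorten('Hello  world!', width=11)
print(truncated)

# A custom placeholder is counted against the width as well.
custom = textwrap.shorten('Hello world', width=10, placeholder='...')
print(custom)
```

The placeholder is part of the returned string, so its length reduces the room left for the original words.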
