Apache POI Word

Last Modified: 2023/11/15

Overview

In the previous article, we covered how to create paragraphs. The headings we added there did not use any styles. In this section, we define a style called "Heading 1" and apply it to the heading text.

The benefits of using styles

Styles can greatly simplify the code. An article may contain many Level 2 headings, all formatted the same way. Without styles, you would have to set the same properties on every one of them, which produces a lot of repetitive code. The solution is simple: define a style called "Level 2 Heading" once, and have every Level 2 heading refer to it.

Style management

In POI Word, a document can have multiple styles, which are managed through an XWPFStyles object. We create a style (an XWPFStyle object) and add it to the XWPFStyles object. Each style has a unique ID, which we later use to refer to it.

Every style has an ID and a name; its other properties depend on its purpose. For example, a Level 1 heading usually has a large font size, bold text, and centered alignment. Now let's define this style:

import org.apache.poi.xwpf.usermodel.*;
import org.openxmlformats.schemas.wordprocessingml.x2006.main.*;

import java.io.FileOutputStream;
import java.io.IOException;
import java.math.BigInteger;

XWPFDocument doc = new XWPFDocument();
XWPFStyles styles = doc.createStyles();
CTStyle ctStyleHeading1 = CTStyle.Factory.newInstance();
// Style ID
ctStyleHeading1.setStyleId("heading1");
CTString styleName = CTString.Factory.newInstance();
// Style name
styleName.setVal("heading 1");
ctStyleHeading1.setName(styleName);

CTRPr rPr = ctStyleHeading1.addNewRPr();
// Bold
rPr.addNewB().setVal(true);
// Font size, in half-points (44 = 22 pt)
rPr.addNewSz().setVal(new BigInteger("44"));
// Character spacing, in twentieths of a point
rPr.addNewSpacing().setVal(new BigInteger("44"));
CTFonts ctFonts = rPr.addNewRFonts();
// Font name
ctFonts.setCs("Calibri");
ctFonts.setAscii("Calibri");
ctFonts.setHAnsi("Calibri");
// Set the outline level (0 = outline level 1)
CTPPrGeneral pPr = ctStyleHeading1.addNewPPr();
pPr.addNewOutlineLvl().setVal(new BigInteger("0"));

// Center the paragraph
pPr.addNewJc().setVal(STJc.CENTER);

XWPFStyle heading1Style = new XWPFStyle(ctStyleHeading1);
heading1Style.setType(STStyleType.PARAGRAPH);
// Register the style with XWPFStyles
styles.addStyle(heading1Style);
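
The numeric values in the style definition above are not in points. In WordprocessingML, the sz value is measured in half-points and the run-level spacing value in twentieths of a point, so 44 means 22 pt for the font size but only 2.2 pt for the character spacing. Below is a minimal sketch of conversion helpers in plain Java; the class WordUnits and its method names are hypothetical and are not part of POI.

```java
import java.math.BigDecimal;
import java.math.BigInteger;
import java.math.RoundingMode;

// Hypothetical helper, not part of POI: converts points into the
// units expected by the sz and spacing run properties.
final class WordUnits {
    private WordUnits() {}

    /** Font size value for sz, which is measured in half-points. */
    static BigInteger halfPoints(double points) {
        return scale(points, 2);
    }

    /** Character spacing value for spacing in rPr, measured in twentieths of a point. */
    static BigInteger twentiethsOfPoint(double points) {
        return scale(points, 20);
    }

    // Multiplies by the unit factor and rounds to the nearest whole unit.
    private static BigInteger scale(double points, int factor) {
        return BigDecimal.valueOf(points)
                .multiply(BigDecimal.valueOf(factor))
                .setScale(0, RoundingMode.HALF_UP)
                .toBigIntegerExact();
    }
}
```

With such a helper, the font size line could read rPr.addNewSz().setVal(WordUnits.halfPoints(22)), which keeps the intended point size visible in the code.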

Admittedly, the code for defining a style is quite verbose. Once the style is defined, however, applying it is very simple: you just refer to the style by its ID.

XWPFParagraph paragraph = doc.createParagraph();
// Refer to the style by its ID
paragraph.setStyle("heading1");
XWPFRun run = paragraph.createRun();
run.setText("This is the title");

One more thing to note: the outline level of a heading matters. The outline level places the paragraph in the document's hierarchical structure, its skeleton, much like an entry in a table of contents.

The following explanation of outline levels is excerpted from WPcontentOverview:

Specifies the outline level associated with the paragraph. It is used to build the table of contents and does not affect the appearance of the text. The single attribute val can have a value of from 0 to 9, where 9 indicates that no outline level applies to the paragraph. So <w:outlineLvl w:val="0"/> indicates that the paragraph is an outline level 1.
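
The mapping described in the excerpt can be sketched as a small helper. This is only an illustration in plain Java, based on the rules quoted above; the class OutlineLevels and its methods are hypothetical and are not part of POI.

```java
import java.math.BigInteger;
import java.util.OptionalInt;

// Hypothetical helper, not part of POI: translates between the val of
// outlineLvl (0 to 9) and the 1-based outline level shown to users.
final class OutlineLevels {
    // A val of 9 means that no outline level applies to the paragraph.
    private static final int NO_LEVEL = 9;

    private OutlineLevels() {}

    /** Returns the 1-based outline level for a val, or empty when val is 9. */
    static OptionalInt toOutlineLevel(int val) {
        if (val < 0 || val > 9) {
            throw new IllegalArgumentException("val must be between 0 and 9: " + val);
        }
        return val == NO_LEVEL ? OptionalInt.empty() : OptionalInt.of(val + 1);
    }

    /** Returns the val to store for a 1-based outline level (1 to 9). */
    static BigInteger toVal(int outlineLevel) {
        if (outlineLevel < 1 || outlineLevel > 9) {
            throw new IllegalArgumentException("outline level must be between 1 and 9: " + outlineLevel);
        }
        return BigInteger.valueOf(outlineLevel - 1);
    }
}
```

For a Level 2 heading style, for instance, pPr.addNewOutlineLvl().setVal(OutlineLevels.toVal(2)) would store the value 1.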

CTxx

In the above implementation, you may have noticed several interfaces starting with "CT," such as CTRPr, CTFonts, CTString, and CTStyle. These interfaces actually correspond to complex types defined in Office Open XML. Creating objects for these interfaces follows a similar pattern, usually done using CTxx.Factory.newInstance().

  • CTStyle.Factory.newInstance()
  • CTString.Factory.newInstance()

These interfaces belong to a relatively low-level API. When POI does not provide a higher-level method, we can still achieve the same effect with them.

For example, the following two code snippets both insert the text "hello" into a paragraph, but the first is clearly much more concise:

XWPFParagraph paragraph = doc.createParagraph();
XWPFRun run = paragraph.createRun();
run.setText("hello");

The second method is more direct. We can call it "rough" because it exposes the underlying XML structure explicitly, creating the r (run) and t (text) elements by hand.

XWPFParagraph p = doc.createParagraph();
XWPFRun run = p.createRun();
// r tag
CTR ctr = run.getCTR();
// t tag
CTText ctText = ctr.addNewT();
// text in t tag
ctText.setStringValue("hello");

If POI already provides a convenient method for an operation, there is no need to take a detour through the low-level API. The many XML tags are hard to memorize, so when you encounter an unfamiliar one, you can look it up in WPcontentOverview.
