Overview

Paper project

With PaperBuilder you create a paper project that contains one or more versions of a manuscript. Each manuscript version is independent and has its own text, figures, tables, analyses, etc. PaperBuilder includes a command-line interface that makes it easy to manage and build paper versions and to execute analyses. Once initialized, the basic folder structure of a paper project is as follows:

root
 ├─ paper.project
 ├─ config.yaml
 ├─ plot_options.yaml
 └─ versions
     ├─ <version>
          ├─ text
          │   └─ *.md
          ├─ analysis
          │   ├─ <analysis>
          │        ├─ spec.yaml
          │        ├─ plot_options.yaml
          │        ├─ *.py, *.ipynb
          │        └─ *.svg
          ├─ figures
          │   └─ *.svg, *.png
          ├─ tables
          │   └─ *.csv
          ├─ variables
          │   └─ *.yaml
          ├─ output
          │   └─ *.pdf, *.docx, ...
          ├─ config.yaml
          ├─ plot_options.yaml
          ├─ build.py
          └─ build.db

Here, only one version <version> with a single analysis <analysis> is shown. At the project root level, paper.project is a file that marks the project folder and contains basic information about the project (e.g. project name, description, creation date). config.yaml is a configuration file with options set for all paper versions in the project (for more information see Configuration options). Similarly, plot_options.yaml is used to define plotting options at the project level, which apply to all analyses in all paper versions (for more information see Configuring plotting options).
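The layout above can be navigated with a few lines of standard-library Python. The sketch below is illustrative only and is not part of PaperBuilder: the helper names `find_project_root`, `list_versions` and `list_analyses` are hypothetical, and the only facts it relies on are that `paper.project` marks the project root and that versions and analyses are subfolders of `versions` and `<version>/analysis`.

```python
from pathlib import Path
from typing import List, Optional


def find_project_root(start: Path) -> Optional[Path]:
    """Walk upward from `start` until a folder containing paper.project is found."""
    for folder in [start, *start.parents]:
        if (folder / "paper.project").is_file():
            return folder
    return None


def list_versions(root: Path) -> List[str]:
    """Return the names of all version folders under <root>/versions."""
    versions_dir = root / "versions"
    if not versions_dir.is_dir():
        return []
    return sorted(p.name for p in versions_dir.iterdir() if p.is_dir())


def list_analyses(root: Path, version: str) -> List[str]:
    """Return the names of all analysis folders of one paper version."""
    analysis_dir = root / "versions" / version / "analysis"
    if not analysis_dir.is_dir():
        return []
    return sorted(p.name for p in analysis_dir.iterdir() if p.is_dir())
```

Calling `find_project_root(Path.cwd())` from anywhere inside a project would return the folder that holds `paper.project`, or `None` outside a project.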

Paper version

Each paper version has a text folder that contains a file with the manuscript’s main text. The text and logical structure of the manuscript are written in pandoc-flavored Markdown. (TODO: <version>/text/manuscript.md (or: <name>-00-<description>.md and <name>.yaml))

The figures and tables folders hold all images and table data (in .csv files) that can be referred to from the Markdown text file. The variables folder holds YAML files with data that is integrated into the text document during a preprocessing step.

The analysis folder contains subfolders with Python code, either as Python modules (*.py) or IPython notebooks (*.ipynb), and figure layouts (*.svg). A spec.yaml file describes how figures, tables and variables are generated from the Python code. The output of all analyses is saved in the corresponding figures, tables and variables folders. For more information, see Creating and executing an analysis.

The config.yaml and plot_options.yaml files set (plotting) options only for this paper version. An analysis folder may also contain a plot_options.yaml file that is only used in the context of that analysis. For more information, see Configuration options and Configuring plotting options.
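One way to picture the three levels of options (project, paper version, analysis) is as a cascade in which the more specific level overrides the more general one. The sketch below assumes that precedence. The text above does not state how PaperBuilder actually resolves conflicts, and the option keys (`font_size`, `figure`, `dpi`, `format`) are hypothetical examples.

```python
from typing import Any, Dict, Optional


def merge_options(*levels: Optional[Dict[str, Any]]) -> Dict[str, Any]:
    """Merge option dicts from most general to most specific.

    Later levels override earlier ones; nested dicts are merged key by key.
    This precedence is an assumption, not documented PaperBuilder behavior.
    """
    merged: Dict[str, Any] = {}
    for level in levels:
        for key, value in (level or {}).items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_options(merged[key], value)
            else:
                merged[key] = value
    return merged


# Hypothetical contents of the three plot_options.yaml files, already parsed.
project_level = {"font_size": 8, "figure": {"dpi": 300, "format": "svg"}}
version_level = {"figure": {"dpi": 600}}
analysis_level = {"font_size": 10}

effective = merge_options(project_level, version_level, analysis_level)
print(effective)
```

With these inputs, the analysis keeps the project-level figure format, picks up the version-level resolution, and uses its own font size.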

Once a paper is built from its sources, the resulting pdf or docx files are saved in the output folder. The build.py and build.db files are required for the build process, but do not need to be touched by the user. See Building a paper and Build workflow.

Build workflow

The figure below shows how the final paper is built from the Markdown text file, figures, tables and variable files. In a first step, Jinja2 is used to replace all references to variables in the text file with the actual values as defined in the variables/*.yaml files. Next, the pandoc conversion tool is executed to convert the preprocessed Markdown text file to pdf or docx output. In the process, literature citations are resolved and both figures and tables are inserted.
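The preprocessing step can be illustrated with a small stand-in. The sketch below does not use Jinja2. It mimics only the `{{ name }}` placeholder syntax with a regular expression so that it runs on the standard library alone, and it assumes that PaperBuilder uses Jinja2's default delimiters. The variable names and the sample sentence are hypothetical, and the dict stands for the parsed contents of a variables/*.yaml file.

```python
import re
from typing import Any, Mapping

# Matches {{ name }} or {{ group.name }} placeholders (Jinja2-style delimiters).
_PLACEHOLDER = re.compile(
    r"\{\{\s*([A-Za-z_][A-Za-z0-9_]*(?:\.[A-Za-z_][A-Za-z0-9_]*)*)\s*\}\}"
)


def lookup(variables: Mapping[str, Any], dotted_name: str) -> Any:
    """Resolve a dotted name such as 'cohort.n' in a nested mapping."""
    value: Any = variables
    for part in dotted_name.split("."):
        value = value[part]
    return value


def substitute_variables(text: str, variables: Mapping[str, Any]) -> str:
    """Replace every placeholder in `text` with its value from `variables`."""
    def replace(match: "re.Match[str]") -> str:
        return str(lookup(variables, match.group(1)))
    return _PLACEHOLDER.sub(replace, text)


# Hypothetical variables, as they might be loaded from a YAML file.
variables = {"cohort": {"n": 42}, "p_value": 0.03}
source = "We included {{ cohort.n }} participants (p = {{ p_value }})."
print(substitute_variables(source, variables))
# prints: We included 42 participants (p = 0.03).
```

Jinja2 itself offers much more (filters, loops, conditionals); the stand-in only shows where the values in the variables folder end up in the text.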

Although image files, table data and variable files can be created manually in the figures, tables and variables folders, often these files are produced by executing the analyses.

The build process is based on the pydoit automation tool. PaperBuilder includes a command-line interface (see Command line interface) that provides an easy way to manage the build process.
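For readers unfamiliar with pydoit, the two build steps could be expressed as tasks roughly as follows. This is a generic sketch of a pydoit task file, not the contents of PaperBuilder's build.py. The file names are hypothetical, the preprocessing function is a placeholder that copies the text unchanged, and citation, figure and table handling are left out of the pandoc call.

```python
from pathlib import Path

# Hypothetical file names inside one paper version.
SOURCE = "text/manuscript.md"
VARIABLES = "variables/results.yaml"
PREPROCESSED = "output/manuscript.preprocessed.md"
OUTPUT = "output/manuscript.pdf"


def preprocess(source: str, target: str) -> None:
    """Placeholder for the variable-substitution step: copies the text as is."""
    Path(target).parent.mkdir(parents=True, exist_ok=True)
    Path(target).write_text(Path(source).read_text(encoding="utf-8"), encoding="utf-8")


def task_preprocess():
    """pydoit task: rerun when the text or the variables change."""
    return {
        "actions": [(preprocess, [SOURCE, PREPROCESSED])],
        "file_dep": [SOURCE, VARIABLES],
        "targets": [PREPROCESSED],
    }


def task_pandoc():
    """pydoit task: convert the preprocessed Markdown with pandoc."""
    return {
        "actions": [f"pandoc {PREPROCESSED} -o {OUTPUT}"],
        "file_dep": [PREPROCESSED],
        "targets": [OUTPUT],
    }
```

Because the target of the first task is a file dependency of the second, pydoit runs the steps in order and skips any step whose inputs have not changed.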

Schematic overview of paper building workflow.