Writing is a time-consuming process; writing high-quality publications requires attention to detail at every step of the way, from the actual prose on paper to its layout in the document to the presentation of figures. In this guide we walk you through 10 aspects of writing a scientific article using LaTeX to format your work and save you time. This is not an introduction to LaTeX.1 We emphasize typing commands at the unix command line in this guide as a way for you to peek under the hood of the LaTeX engine. This will give you (the author!) power over the production of your own academic documents.2
This guide could be even longer than it is. There are many, many fantastic resources on typesetting. Here we have hand-selected 10 topics to help lower the barrier to a more efficient and higher quality paper writing workflow. Specifically we focus on tex, latex, pdflatex, xelatex, lualatex, etc.
To help people practice these commands we have hands-on examples ready in a JupyterLab session, through Binder. Here you can follow along, processing documents in a terminal session. You can start this environment here:
To use LaTeX on your own computer, you will need to install it (we highly recommend following the links therein to TeX Live on each system).
A LaTeX document (or a .tex file) is a plain text document that contains commands that tell the LaTeX processing program how to create a beautiful pdf. These commands can be “markup” like \textbf{this is bold} for bold text or $\alpha + \beta \frac{1}{x^2}$ for math like α + β(1/x²), commands that tell LaTeX about document structure like \section{Introduction}, or even commands to identify a bibliography like \bibliography{refs_example.bib}.
Once you have a plain text document with markup, you then process it
using a set of programs to create a publishable output like a
.pdf
file. This figure shows an example of a LaTeX document
and highlights different parts of the document and their role.
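As a rough illustration of the pieces described above, here is a minimal sketch of a complete document. It is not the guide's example.tex; the file name sketch.tex and the title and author text are placeholders.

```latex
% sketch.tex: a hypothetical minimal document (not the guide's example.tex)
\documentclass{article}   % document class: sets the overall layout

% Preamble: settings and metadata go before \begin{document}
\title{Some Paper}
\author{Some Person}

\begin{document}
\maketitle                % prints the title and author set above

\section{Introduction}    % structural markup
This sentence has \textbf{bold text} and inline math,
$\alpha + \beta \frac{1}{x^2}$.

\end{document}
```

Saving this as sketch.tex and running pdflatex sketch.tex should produce a one-page sketch.pdf.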
Imagine we have a document called example.tex
. After
processing that document via, say, the command
latexmk -pdflatex example.tex
, one can see a pdf file like
the following image:
What does example.tex
look like when compiled to a pdf
document? Can you add a title or author? Can you make some text bold?3 You can
practice by following these steps (and similar ones) in later
sections:
- Open the directory 1_structure in the JupyterLab window that launches when you click on launch binder above or here:
- Click on the Terminal icon in the JupyterLab panels.
- Type ls. If you see a list of directories like adaptive, you need to change directories to latex-guide/1_structure, so you should type (or paste) cd latex-guide/1_structure. Now try ls again. You should see files like example.tex.
- Try compiling with latexmk -pdf example.tex and then looking at the pdf (you can click on it in the right-hand sidebar).
You can also copy the GitHub repository for the EGAP methods guides or just the subdirectory for latex-guide to your own local machine and launch the Terminal to see a Unix command prompt if you are using a Mac or Linux machine. Windows machines also offer a Unix command prompt, but it is a bit more involved to install it.
tex, latex, pdflatex, etc.
Although the most basic program that parses markup is called latex, in current daily use you will mostly find yourself using pdflatex or even xelatex or maybe lualatex.
When Donald Knuth created this approach to making beautiful scientific documents, he started with the tex program; Leslie Lamport then built latex by combining multiple tex commands into fewer and simpler macros. Both originally created documents in dvi or postscript format. Nowadays, pdf files are the best way to make a document that looks the same to all who want to view it on their screens or print it for themselves.
Here is a list of the common programs that one might use to create a pdf file from a latex document:
- tex: a program that typesets TeX directives or macros
- pdftex: a program that generates a PDF (instead of DVI)
- latex: a program that typesets a pile of LaTeX directives and macros
- pdflatex: a program that generates a PDF from LaTeX
- bibtex: a program that takes bibliographic information from a .aux file (created by a run of latex or pdflatex etc.) and generates a bibliography
- biber: a program like bibtex but with more database management capabilities
- xelatex: support for a wide variety of fonts and characters (you can type xelatex example.tex after changing the font to one that is installed on your system)
- lualatex: extends latex so that more programming can be done within it (via Lua) for more complicated document designs and workflows. See here for more on lualatex.
For example, at the command prompt in the Terminal, you might type pdflatex example.tex to create an example.pdf file.
Notice also:
- pdflatex (or xelatex or lualatex) takes several passes (it must be run more than one time) if your document involves citations or other more complex features (like cross-references, tables of contents, etc.). In the example above, if you only run pdflatex once to create example.pdf, the citation will show up as a ? and no bibliography will be printed.
- latexmk or latexrun automate this process of multiple passes by a latex processing program and a bibliography creation program.
The following figure shows how it may require three runs of pdflatex (plus a run of bibtex) to go from an example.tex file to an example.pdf file:
You can replace those multiple lines with a single call to latexmk -pdflatex example.tex.
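To see why more than one pass is needed even without a bibliography, consider this small sketch (the file name passes.tex and the section labels are hypothetical). Cross-references and the table of contents are written to auxiliary files on one pass and only read back on the next.

```latex
% passes.tex: hypothetical file name
% Pass 1 (pdflatex passes.tex): labels and headings are written to
%   passes.aux and passes.toc; references print as ?? and the table
%   of contents is empty.
% Pass 2 (pdflatex passes.tex): the stored values are read back in,
%   so section numbers and the table of contents appear.
\documentclass{article}
\begin{document}
\tableofcontents

\section{Introduction}\label{sec:intro}
Section~\ref{sec:results} reports the results.

\section{Results}\label{sec:results}
We return to the question posed in Section~\ref{sec:intro}.
\end{document}
```

Running latexmk -pdf passes.tex should perform both passes without further input.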
- Use PDF (pdflatex) and PDF figures (or PNG … more on this later) rather than DVI or PS format for sharing generated documents.
- Use latexmk to automate the process of repeatedly using both pdflatex and bibtex (or biber for those using biblatex) to process a file.
See the directory 2_texflavors and the readme.md file therein. Can you change the font and use xelatex to make a pdf, say, trying latexmk -xelatex example.tex? You may need to use the Unix cd (change directory) command to move from 1_structure to 2_texflavors by typing cd ../2_texflavors (this means “go up one level in the directory hierarchy, then change into 2_texflavors”).
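One possible way to change the font for xelatex is the fontspec package. This is a sketch only, and not necessarily what the readme in 2_texflavors does. It assumes the TeX Gyre Pagella fonts that ship with TeX Live are installed, and it loads them by file name.

```latex
% Compile with xelatex (or lualatex); fontspec does not work with pdflatex.
\documentclass{article}
\usepackage{fontspec}
% Assumption: the TeX Gyre Pagella OpenType files are available,
% as they are in a full TeX Live installation.
\setmainfont{texgyrepagella}[
  Extension      = .otf,
  UprightFont    = *-regular,
  BoldFont       = *-bold,
  ItalicFont     = *-italic,
  BoldItalicFont = *-bolditalic]

\begin{document}
This text is set in TeX Gyre Pagella, with \textbf{bold} and \emph{italic}.
\end{document}
```

To use a font installed on your own system instead, replace the \setmainfont line with that font's name.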
A given scientific paper will require many files and often involves
many authors. For example, a single paper will often use several
.tex
files (for different sections), multiple figures (in
.pdf
form), and bibliographies (in .bib
files). Further, each figure might depend on a pipeline of raw data and
code. Organizing these files in a consistent fashion will lead to a
clear process when dealing with revisions later.
For example a main.tex
file might look like this:
\documentclass{article}
\title{My Title}
\begin{document}
\maketitle
\input{abstract}
\input{intro}
\input{results}
% ...
\bibliographystyle{plain} % or the .bst style your journal provides
\bibliography{mybib.bib}
\end{document}
But results.tex might look like this:
\section{Results}
Figure~\ref{fig:vaccine_by_pop} shows that opposition to vaccination peaks at a population of 100,000.
\begin{figure}[!ht]
\centering
\includegraphics[width=.8\textwidth]{vaccine_by_pop.pdf}
\caption{Vaccination opposition by population}\label{fig:vaccine_by_pop}
\end{figure}
The number 100,000 and the figure vaccine_by_pop.pdf might derive from the R file called vaccine_by_pop.R. This R file relies on data that is cleaned by vaccine_data_cleaning.py. The data themselves may also require code to download, clean, and merge with other files.
So how do we organize the data, the files, and the overall workflow? There are many possibilities, but we’re reminded by a slice of the Zen of Python:
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
We provide two specific examples of workflows below, first noting two aspects that will greatly improve your process. The first is to separate your data from your processing and presentation:
- raw data (data1.csv, ..., datan.csv)
- merged and filtered data (data_merged_filtered.db)
- data behind a figure (temp_vs_time.csv)
- plotting script (temp_vs_time.py)
The second aspect, directly related to LaTeX, is to establish a predictable naming convention. For example, each output like a table or figure uses one script with the same name (temp_vs_time.pdf <-> temp_vs_time.py), and the LaTeX label follows the same convention: \label{fig:temp_vs_time}. When editing the document, the path from figure to the associated plotting script and related data is then clear. For a high level discussion of project workflow see 10 Things to Know About Project Workflow.
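On the LaTeX side, the naming convention might look like the fragment below. It is a sketch that assumes the graphicx package is loaded and that figures/temp_vs_time.pdf exists; the caption text is a placeholder.

```latex
% Figure produced by temp_vs_time.py from temp_vs_time.csv.
% File name, script name, and label all share the stem temp_vs_time.
\begin{figure}[!ht]
  \centering
  \includegraphics{figures/temp_vs_time.pdf}
  \caption{Temperature versus time.}\label{fig:temp_vs_time}
\end{figure}
```

A reader who sees \ref{fig:temp_vs_time} in the text then knows which PDF, script, and data file to open.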
Here are two examples of directory structures which have worked for us:
In this example, we use Matt West’s directory structure, where the versions of the paper are kept in their own directories:
paper_topic_name_dir_name | string used for repo, tex, and bib files
+ requirements.txt | number of pages, etc
+ 1_submitted_paper
| +-- paper_topic_name.tex
| +-- refs_topic_name.bib
| +-- journal_class.cls | any files needed for the journal latex style
| +-- figures
| | +-- temp_vs_time.pdf | descriptive names for figures (not fig1.pdf, etc)
| | +-- error_vs_stepsize.pdf
| | `-- ...
| +-- data | data files that generate the figures
| | +-- Makefile | Makefile that will re-generate all figures
| | +-- temp_vs_time.csv | use the same name as the resulting figure
| | +-- plot_temp_vs_time.py | plotting scripts, use names like plot_.py
| | `-- ...
| `-- submitted_paper_topic_name.pdf | actual PDF file submitted
+ 2_reviews
| +-- review_1.pdf | individual reviews
| +-- review_2.pdf
| `-- editor_statement.pdf | instructions and summary from editor
+ 3_response_to_reviews
| +-- response_topic_name.tex
| `-- sent_response_topic_name.pdf | actual PDF file sent to editor
` 4_revised_paper
+-- paper_topic_name_revised.tex
+-- refs_topic_name_revised.bib
+-- journal_class.cls | copy here any other files needed
+-- figures | copy here all the figures again
| +-- temp_vs_time.pdf | edit figures as needed
| +-- error_vs_stepsize.pdf
| `-- ...
+-- data | copy all data again and edit as needed
| `-- ...
`-- submitted_paper_topic_name_revised.pdf | actual PDF submitted
Reference: Matt West @ https://lagrange.mechse.illinois.edu/latex_quick_ref/
An alternative approach uses git branches for different versions, and
a single Makefile
for all tasks (from turning the paper
into a pdf file via LaTeX, to creating figures, etc.). See also the
discussion in Bowers and Voors (2016),
section 3.
paper_topic_name_dir_name | string used for repo, tex, and bib files
+ Makefile | file that tracks file relationships
+-- Data | directory for data and data cleaning, merging work
+ README.md | file with instructions and explanations
+ merge_data.R |
+ orig_data.csv | original data set, not to be changed
+ merge_data.csv |
`-- ... |
+-- Analysis |
+ README.md |
+ linear_simulations.R | file that runs simulations and saves output
+ linear_simulations.rda | output from linear_simulations.R
`-- ... |
+-- Figures |
+ README.md |
+ linear_simulations_N100.R | file creating a figure
+ linear_simulations_N100.pdf | the figure from linear_simulations_N100.R
+ descriptives.R | file creating a table
+ descriptives.tex | the table in LaTeX format
`-- ... |
+-- Paper |
+ README.md |
+ main.tex | the main LaTeX file
+ abstract.tex | the abstract file
`-- ... |
+-- References |
+ big.bib | bibliography file
`-- ... |
Now is better than never.
Feel free to play with the directory 3_workflows
and the
readme.md
file therein.
Often your writing is interleaved with edits and contributions from co-authors. How do you track changes and versions in your LaTeX document?
We strongly recommend git version control via GitHub, whether you are working alone on a document or multiple authors are involved. We do not describe git and GitHub in depth here, but instead offer the following high-level best practices.
What files should you track (in version control)?
- the .tex file!
- the .bib file for your article
- ./figures/*.pdf
- ./data/*.py, ./data/*.R
- ./data/*.csv
What should you not track (in version control)?
- paper_randnoise.pdf
- *.log, *.bbl, *.aux, etc.
- .DS_Store or other garbage from your system
Version control is invaluable as a collaboration tool; however, it does require diligence when working with co-authors on a LaTeX document. We recommend the following recipe:
- [...] latexmk myfile.tex -C, and recompile to verify there are no errors.
Fewer tools allow collaborators to edit plain text documents at the same time. We nearly always rely on asynchronous collaboration, even if we have broken up a task and the whole team is working on it at the same time, even in the same room.
Overleaf is designed for this task. It compiles LaTeX and syncs with GitHub. See also the online versions of LaTeX listed here.
There are other systems for editing plain text at the same time such as Teletype for Atom.
See the directory 4_git
and the readme.md
file therein.
The overarching style of your document is often decided by the journal. With this in mind, it is best to typeset your document with the journal’s style file. For example here is the style file for Political Analysis. The Society for Industrial and Applied Mathematics (SIAM) provides style files directly, whereas others are included with your TeX distribution and available on CTAN. In any case, committing to the expected format and not deviating from it will shorten your time-to-publication by not slowing down the copy editing at the journal. The style files will provide macros for author formats, custom figure environments, and almost certainly the preferred style for the bibliography. In addition, most journals provide a style guide that will detail the expectations on punctuation, hyphens, commas, etc.
See directory 5_style
and readme.md
for an
example.
You already know Hemingway’s famous quote: “the only kind of writing is re-writing”. However, you might not know about linters.
A linter is a program that analyzes your text (sometimes in real-time, as you write it). When your misspelled words are highlighted in your email client, you are seeing the results of a linter alerting you to improve your text. Linters are also used in programming — catching code errors before running the code, by alerting you to unmatched parentheses or missing semi-colons.
Other linters can look for issues with style. Consider the following terrible sentence:
More research is needed to fill the gap created in extant literature in order to impact policy with very important findings.
One linter, write-good, highlights several potential problems:
col 16 error| [write-good] "is needed" may be passive voice [E]
col 71 error| [write-good] "in order to" is wordy or unneeded [E]
col 102 error| [write-good] "very" is a weasel word and can weaken meaning [E]
Of course, linters cannot do it all. We use them because they draw attention to sentences that may need work. Ultimately they (hopefully) help focus our attention on prose: re-writing the sentence without using a passive voice, without using “impact” as a verb (!), and with a stronger justification for research than to just fill a gap in the literature.
There are many fantastic tips and guides to improving your writing, from reading paragraphs and sentences out loud to “edit by ear” (Becker 1986) to guides specific to academic writing: Gopen and Swan (1990) and Becker (1986). Here, we offer a few directions that improve your writing specifically in LaTeX:
- Use a linter (some editors can check your .tex document on-the-fly).
- Mark unfinished passages with % TODO; % marks a line as a comment in the .tex file. You can find all places where you have % TODO in your document using: grep TODO paper_randnoise.tex.
See the directory 6_linting and the readme.md file therein.
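A TODO comment might sit in the source as in the short sketch below. The wording of the notes is hypothetical; the sentence is the example from earlier in this section.

```latex
% TODO: rewrite without the passive voice and the filler phrases
More research is needed to fill the gap created in extant literature
in order to impact policy with very important findings.
% TODO: add a citation for this claim (hypothetical note)
```

Nothing after a % is printed in the pdf, and grep TODO on the file lists both notes.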
You will find that authors have their own macros, their own style in
the .tex
document, and their own preferences when using
LaTeX. Here we offer general principles that can help improve your
overall LaTeX workflow:
\begin{align}
\langle u, v \rangle & = \langle f, v\rangle\\
& = G(v)
\end{align}
\begin{tabular}{lrllr}
\toprule
             & \multicolumn{1}{c}{$n$}
             & \multicolumn{1}{c}{$t$}
             & \multicolumn{1}{c}{$\rho$}
             & \multicolumn{1}{c}{$m$} \\
\midrule
experiment 1 & \num{    19929} & 0.32 & 0.8  & 55 \\
experiment 2 & \num{  7729292} & 0.78 & 0.7  & 85 \\
experiment 3 & \num{888173928} & 1.25 & 0.65 &  2 \\
\bottomrule
\end{tabular}
[...] .tex file
- Rather than typing $\vec{H}(\text{curl},\Omega)$ to produce a bold H(curl,Ω), we might use a macro to create a shortcut command like $\Hcurl$:
\newcommand{\Hcurl}{\vec{H}(\text{curl},\Omega)}
\renewcommand{\vec}[1]{\boldsymbol{#1}}
- [...] .tex source unreadable.
- booktabs: provides clean horizontal lines for tables (avoid vertical lines), providing \toprule and \bottomrule in the example above.
- siunitx: formats large numbers and notation, providing \num in the example above.
- Avoid \begin{align} for everything; instead try specific environments built for your purpose.
  - equation is your base equation environment. Use this unless you have multiple equations.
  - align should be used for multiple equations that require alignment.
  - split is used for a single equation that requires alignment when split.
  - multline is used for a single equation where no alignment is needed.
  - subequations may be used around align to retain a single equation numbering.
See example.tex in 7_dos for examples of use.
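The environments listed above can be compared side by side in a short sketch. This is not the example.tex from 7_dos; the equations and labels are placeholders, and it assumes the amsmath package.

```latex
\documentclass{article}
\usepackage{amsmath}
\begin{document}

% equation: one equation, one number
\begin{equation}\label{eq:Axb}
  A x = b
\end{equation}

% split: one equation (one number) broken at aligned points
\begin{equation}\label{eq:expand}
  \begin{split}
    (a + b)^2 &= (a + b)(a + b) \\
              &= a^2 + 2ab + b^2
  \end{split}
\end{equation}

% multline: one long equation broken across lines, no alignment
\begin{multline}\label{eq:long}
  p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 \\
       + a_4 x^4 + a_5 x^5 + a_6 x^6
\end{multline}

% subequations around align: the aligned lines share one parent
% number and are distinguished by letters
\begin{subequations}\label{eq:weak}
  \begin{align}
    \langle u, v \rangle &= \langle f, v \rangle \\
                         &= G(v)
  \end{align}
\end{subequations}

\end{document}
```

Compiling this and comparing the equation numbers in the pdf is a quick way to see what each environment is for.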
- Figures: \label{fig:easy_figure_name}
\begin{figure}[!ht]
\centering
\includegraphics{example.pdf}
\caption{A caption}\label{fig:example}
\end{figure}
- Equations: \label{eq:useful_equation_name}
\begin{equation}\label{eq:Axb}
A x = b
\end{equation}
- Sections: \label{sec:i_can_remember_this_section_name}.
- Tables: \label{tab:what_a_great_table_name}.
Central to TeX is an algorithm for placing and spacing figures and text so that you don’t have to. Float environments (figure, table, etc.) should be attached to the paragraph of their first reference (more in the next section). Avoid use of \FloatBarrier, \newpage, \vspace, \hspace, etc. to muscle your own spacing.
- [...] .tex document readable.
See the directory 7_dos and the readme.md file therein.
The LaTeX system allows you to (1) insert citations in your text using commands like \cite{ChOlSe_2021_lsrbm}, which can turn into [7], (Chaudhry et al., 2021), [Ch21], or other citation styles within the text itself, and (2) print out your bibliography, formatted according to your journal’s guidelines, using a single command in the LaTeX document like \bibliography{mybib.bib}. Separating formatting from information saves time: hundreds of citations will be printed automatically in the correct format, including only the sources you cited. If you decide that you no longer need a citation, it will be removed from your bibliography automatically. Journals often provide formatting guidelines in .bst files that can be referred to in the \bibliographystyle{} command.
The program bibtex
(or biber
) reads
.aux
files created by latex programs and creates a
.bbl
file which is then read by the LaTeX program to format
everything (above we showed the need to run pdflatex
,
bibtex
, pdflatex
, and pdflatex
in
order to generate citations).
To use bibtex
, you need a plain text file that is a
database with entries formatted in BibTeX format. For example, here is
one entry in the BibTeX file for this essay:
@article{ChOlSe_2021_lsrbm,
author = {Chaudhry, Jehanzeb H. and Olson, Luke N. and Sentz, Peter},
doi = {10.1137/20M1323552},
journal = {SIAM Journal on Scientific Computing},
number = {2},
pages = {A1081--A1107},
title = {A Least-Squares Finite Element Reduced Basis Method},
url = {https://doi.org/10.1137/20M1323552},
volume = {43},
year = {2021}
}
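To try the entry above end to end, one option is the self-contained sketch below. It writes the entry to a hypothetical file sketch_refs.bib using the filecontents environment, cites it once, and uses the standard plain style as a stand-in for a journal's .bst file.

```latex
% Writes sketch_refs.bib next to this file (hypothetical file name).
\begin{filecontents}{sketch_refs.bib}
@article{ChOlSe_2021_lsrbm,
  author  = {Chaudhry, Jehanzeb H. and Olson, Luke N. and Sentz, Peter},
  title   = {A Least-Squares Finite Element Reduced Basis Method},
  journal = {SIAM Journal on Scientific Computing},
  volume  = {43},
  number  = {2},
  pages   = {A1081--A1107},
  year    = {2021},
  doi     = {10.1137/20M1323552}
}
\end{filecontents}

\documentclass{article}
\begin{document}
A least-squares finite element reduced basis method is described
in~\cite{ChOlSe_2021_lsrbm}.

\bibliographystyle{plain}   % stand-in for your journal's .bst style
\bibliography{sketch_refs}  % reads sketch_refs.bib
\end{document}
```

As described earlier, this needs pdflatex, bibtex, pdflatex, and pdflatex in that order, or a single latexmk -pdf call.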
- [...] .bib entry. Grab the full citation online at the citation’s journal and/or Google Scholar (see instructions here for getting BibTeX formatted entries from Google Scholar).
- Use { } instead of “ ”.
- { } also force capitalization: for example title = {All about {Krylov} methods}.
- [...] .bib entries. This can generate warnings.
- [...] .bib file) once. (And you can use tools like Zotero and BibDesk to make managing those collections of bibliographic information easier.)
See the directory 8_citations and the readme.md file therein.
Figures, tables, and math break up the text of a document and convey
information that can make or break the overall flow of your story. In
general, if a figure or table has been created using code, your project
should have a figure or table creation file like
linear_simulations_N100.R
which creates one figure
linear_simulations_N100.pdf
. This figure creation file
might require as input another file with simulation results, and in turn
the simulation results creator file may need data; this dependency may
be described in a readme
or Makefile
. For
example in line 1 of the Makefile below we see
Data/clean_data.csv: Data/clean_data.R Data/raw_data.csv
which means that the file Data/clean_data.csv depends on Data/clean_data.R and Data/raw_data.csv. Line 2 is the command used to create Data/clean_data.csv (in this case, R --file=Data/clean_data.R).
1 Data/clean_data.csv: Data/clean_data.R Data/raw_data.csv
2 R --file=Data/clean_data.R
3
4 Analysis/linear_simulations.rda: Analysis/linear_simulations.R Data/clean_data.csv
5 R --file=Analysis/linear_simulations.R
6
7 Figures/linear_simulations_N100.pdf: Figures/linear_simulations_N100.R Analysis/linear_simulations.rda
8 R --file=Figures/linear_simulations_N100.R
In general figures, tables, and math should appear close to where they are discussed in the text.
Figures are central to the overall feel of your article. Here are a few general tips for working with LaTeX and figures:
- Using \includegraphics to scale a figure will also change the font sizes; you should attempt to generate unscaled figures.
- [...] extrafont.
- [...] rcparams here.
- [...] \includegraphics[]{} command. For example, if we wanted to include a figure but scale it to roughly 1/3 of the width of the text (the area within the left and right margins), we would use:
\includegraphics[width=0.3\textwidth]{myfig.pdf}
Figure~\ref{fig:vaccine_by_pop} shows that opposition to vaccination peaks at a population of 100,000.
\begin{figure}[!ht]
\centering
\includegraphics[width=.8\textwidth]{vaccine_by_pop.pdf}
\caption{Vaccination opposition by population}\label{fig:vaccine_by_pop}
\end{figure}
- Use \begin{figure}[!ht] or \begin{table}[!ht].
  - !: tex will ignore area restrictions.
  - h: place it “here” if it fits in the area.
  - t: otherwise place it at the “top” if it fits; otherwise create a new page.
- [...] xtable package to convert a matrix or data-frame to a LaTeX formatted table.
Math fonts should work with the main font of the article. For examples of good math and text font pairings see the LaTeX Font Catalogue.
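The same placement options apply to tables. The sketch below is a hypothetical complete document that places a small booktabs table right after the paragraph that first refers to it; the caption and label are placeholders, and the numbers are borrowed from the table example earlier in this guide.

```latex
\documentclass{article}
\usepackage{booktabs}
\begin{document}

Table~\ref{tab:experiments} lists the size of each experiment.
% !: relax the usual area restrictions; h: here if it fits;
% t: otherwise at the top of a page
\begin{table}[!ht]
  \centering
  \caption{Number of samples $n$ in each experiment.}\label{tab:experiments}
  \begin{tabular}{lr}
    \toprule
                 & $n$       \\
    \midrule
    experiment 1 & 19929     \\
    experiment 2 & 7729292   \\
    experiment 3 & 888173928 \\
    \bottomrule
  \end{tabular}
\end{table}

\end{document}
```

Because the text uses \ref, the document needs two passes of pdflatex before the table number appears.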
See the directory 9_figures
and the
readme.md
file therein. In particular, you will consider
the following “bad” figure and how to improve it in your LaTeX
document.
\cref{} referencing for all
A LaTeX document is a plain
text file. This means that you can use any text
editor to write a LaTeX document. However, a text editor that (1)
recognizes that \textbf{}
is a LaTeX command or that (2)
keeps track of matching braces and parentheses makes it easier to write
LaTeX markup. To that end, we use neovim (sometimes with the vimr gui) with
vimtex
plugins but we know that there are many other approaches to typing a
plain text document using LaTeX markup.
We wrote this document using pandoc flavored markdown and turned it from plain text into HTML via the following command at the unix command line on our OS X laptops:
pandoc latex-guide.md --to html4 --from markdown+yaml_metadata_block+autolink_bare_uris+tex_math_single_backslash+inline_code_attributes --output latex-guide.html --self-contained --variable bs3=TRUE --standalone --section-divs --template latex-guide-template.html --include-in-header latex-guide-header.html --number-sections --table-of-contents --toc-depth=1 --variable theme=bootstrap --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --citeproc
Alternatively, if you have access to R, you can do the following to turn this markdown document into HTML.
Rscript -e "library(rmarkdown); render('latex-guide.md')"
We suggest the Free online introduction to LaTeX if you are brand new to LaTeX.↩︎
We have decided to write this guide in a very opinionated way. And we emphasize the nitty gritty of technical document creation. If these opinions inspire a reader to write a 10 Things Guide on using Markdown or Google Docs, please do write one! As an open-source document, we are also happy to receive pull requests for improvements to this guide.↩︎
Try out \title{Some Paper} and \author{Some Person} in the preamble and \maketitle right after the \begin{document} line.↩︎