Writing is a time-consuming process; writing high-quality publications requires attention to detail at every step of the way, from the actual prose on paper to its layout in the document to the presentation of figures. In this guide we walk you through 10 aspects of writing a scientific article using LaTeX to format your work and save you time. This is not an introduction to LaTeX.[1] We emphasize typing commands at the Unix command line in this guide as a way for you to peek under the hood of the LaTeX engine. This will give you (the author!) power over the production of your own academic documents.[2]
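This guide could be even longer than it is. There are many, many fantastic resources on typesetting. Here we have hand-selected 10 topics to help lower the barrier to a more efficient and higher-quality paper-writing workflow. Specifically we focus on

1. the structure of a LaTeX document,
2. tex, latex, pdflatex, xelatex, lualatex, etc.,
3. workflows and file organization,
4. version control and collaboration,
5. journal style files,
6. writing, re-writing, and linting,
7. LaTeX dos and don'ts,
8. citations and bibliographies,
9. figures, tables, and math,
10. text editors.

To help people practice these commands we have hands-on examples ready in a JupyterLab session, through Binder. Here you can follow along, processing documents in a terminal session. You can start this environment here: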
To use LaTeX on your own computer, you will need to install it (we highly recommend following the links therein to TeX Live on each system).
A LaTeX document (or a .tex file) is a plain
text document that contains commands that tell the LaTeX processing
program how to create a beautiful pdf. These commands can be “markup”
like \textbf{this is bold} for bold text
or $\alpha + \beta \frac{1}{x^2}$ for math like \(\alpha + \beta \frac{1}{x^2}\) or commands
that tell LaTeX about document structure like
\section{Introduction} or even commands to identify a
bibliography like \bibliography{refs_example.bib}.
Once you have a plain text document with markup, you then process it
using a set of programs to create a publishable output like a
.pdf file. This figure shows an example of a LaTeX document
and highlights different parts of the document and their role.
The Structure of a LaTeX document
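As a rough sketch of the parts described above, a small document might look like the following. This is not the guide's own example.tex; the file name example_sketch.tex, the title, and the author are placeholders chosen for illustration.

```latex
% example_sketch.tex: a hypothetical minimal document (not the guide's example.tex)
\documentclass{article}        % document class: sets the overall layout

% --- preamble: packages and metadata go before \begin{document} ---
\usepackage{amsmath}           % extra math environments
\title{Some Paper}
\author{Some Person}

\begin{document}               % --- the body starts here ---
\maketitle                     % prints the title and author set above

\section{Introduction}         % a structural command
This is \textbf{bold} text and this is math:
$\alpha + \beta \frac{1}{x^2}$.

\end{document}
```

Everything between \documentclass and \begin{document} configures the document; everything after it is content.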
Imagine we have a document called example.tex. After
processing that document via, say, the command
latexmk -pdflatex example.tex, one can see a pdf file like
the following image:
Above see an image of the pdf document associated
with the example.tex file in the 1_structure
subdirectory.
What does example.tex look like when compiled to a pdf
document? Can you add a title or author? Can you make some text bold?[3] You can
practice by following these steps (and similar ones) in later
sections:
1. Open the directory 1_structure in the JupyterLab window that launches when you click on launch binder above.
2. Click on the Terminal icon in the JupyterLab panels.
3. Type ls. If you see a list of directories like adaptive, you need to change directories to latex-guide/1_structure, so you should type (or paste) cd latex-guide/1_structure. Now try ls again. You should see files like example.tex.
4. Try typing latexmk -pdf example.tex and then looking at the pdf (you can click on it in the right-hand sidebar).

You can also copy the GitHub repository for the EGAP methods guides
or just the subdirectory for latex-guide to your own local
machine and launch the Terminal to see a Unix command prompt if you are
using a Mac or Linux machine. Windows machines also offer a
Unix command prompt, but it is a bit more involved to install it.
tex, latex, pdflatex, etc.

Although the most basic program that parses markup is called
latex, in current daily use, you will mostly find yourself
using pdflatex or even xelatex or maybe
lualatex.
When Donald Knuth created this approach to making
beautiful scientific documents, he started with the tex
program but Leslie Lamport built latex by combining
multiple tex commands into fewer and simpler macros. Both
originally created documents in dvi or
postscript format. Nowadays, pdf files are the
best way to make a document that looks the same to all who want to view
it on their screens or print it for themselves.
Here is a list of the common programs that one might use to create a pdf file from a latex document:
- tex: a program that typesets TeX directives or macros
- pdftex: a program that generates a PDF (instead of DVI)
- latex: a program that typesets a pile of LaTeX directives and macros
- pdflatex: a program that generates a PDF from LaTeX
- bibtex: a program that takes bibliographic information from a .aux file (created by a run of latex or pdflatex etc.) and generates a bibliography
- biber: a program like bibtex but with more database management capabilities
- xelatex: support for a wide variety of fonts and characters (you can type xelatex example.tex after changing the font to one that is installed on your system)
- lualatex: extends latex so that more programming can be done within it (via Lua, for more complicated document designs and workflows). See here for more on lualatex.

For example, at the command prompt in the Terminal, you might type
pdflatex example.tex to create an example.pdf
file.
Notice also:
- pdflatex (or xelatex or lualatex) takes several passes (it must be run more than one time) if your document involves citations or other more complex features (like cross-references, tables of contents, etc.). In the example above, if you only run pdflatex once, the citation will show up as a ? and no bibliography will be printed.
- latexmk or latexrun automate this process of multiple passes by a latex processing program and a bibliography creation program.

The following figure shows how it may require three runs of
pdflatex (plus a run of bibtex) to go from an
example.tex file to an example.pdf file:
From LaTeX to PDF commands
You can replace those multiple lines with a single call to
latexmk -pdflatex example.tex.
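To see the need for several passes without a bibliography, here is a small sketch that uses only cross-references and a table of contents. The file name passes_sketch.tex is hypothetical and the section titles are placeholders.

```latex
% passes_sketch.tex: a hypothetical file for watching multiple passes at work
% First run of pdflatex: labels and headings are written to the .aux and .toc files.
% Second run of pdflatex: those files are read back in and the numbers are filled in.
\documentclass{article}
\begin{document}
\tableofcontents   % built from the .toc file written during the previous run

\section{Introduction}\label{sec:intro}
Results appear in Section~\ref{sec:results}. % unresolved until the second run

\section{Results}\label{sec:results}
See Section~\ref{sec:intro} for background.
\end{document}
```

After one run the table of contents is empty and the forward reference is unresolved; a second run (or a single latexmk -pdf passes_sketch.tex) should resolve both.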
- Use PDF output (via pdflatex) and PDF figures (or PNG … more on this later) rather than DVI or PS format for sharing generated documents.
- Use latexmk to automate the process of repeatedly using both pdflatex and bibtex (or biber for those using biblatex) to process a file.

See the directory 2_texflavors and the
readme.md file therein. Can you change the font and use
xelatex to make a pdf, say, trying
latexmk -xelatex example.tex? You may need to use the Unix
cd (change directory) command to move from
1_structure to 2_texflavors by typing
cd ../2_texflavors (this means “go up one level from my
current directory and then into 2_texflavors”).
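One way to change the font under xelatex is the fontspec package, sketched below. The file name fonts_sketch.tex is hypothetical, and the font name TeX Gyre Pagella is an assumption: it only works if a font by that name is installed and visible to XeLaTeX on your system, so substitute any font you know you have.

```latex
% fonts_sketch.tex: compile with  latexmk -xelatex fonts_sketch.tex
\documentclass{article}
\usepackage{fontspec}            % font selection for xelatex and lualatex
\setmainfont{TeX Gyre Pagella}   % assumption: replace with a font installed on your system
\begin{document}
This text is set in the font chosen above, including accented
characters such as é and ñ typed directly in the source.
\end{document}
```

Running pdflatex on this file should fail, because fontspec requires xelatex or lualatex; that is a quick way to confirm which engine you are using.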
A given scientific paper will require many files and often involves
many authors. For example, a single paper will often use several
.tex files (for different sections), multiple figures (in
.pdf form), and bibliographies (in .bib
files). Further, each figure might depend on a pipeline of raw data and
code. Organizing these files in a consistent fashion will lead to a
clear process when dealing with revisions later.
For example a main.tex file might look like this:
\documentclass{article}
\title{My Title}
\begin{document}
\maketitle
\input{abstract}
\input{intro}
\input{results}
...
\bibliography{mybib.bib}
\end{document}

But results.tex might look like this:
\section{Results}
Figure~\ref{fig:vaccine_by_pop} shows that opposition to vaccination peaks at a population of 100,000.
\begin{figure}[!ht]
\centering
\includegraphics[width=.8\textwidth]{vaccine_by_pop.pdf}
\caption{Vaccination opposition by population}\label{fig:vaccine_by_pop}
\end{figure}

The number 100,000 and the figure
vaccine_by_pop.pdf might derive from the R file called
vaccine_by_pop.R. This R file relies on data that is
cleaned by vaccine_data_cleaning.py. The data themselves
may also require code to download, clean, and merge with other
files.
So how do we organize the data, the files, and the overall workflow? There are many possibilities, but we’re reminded by a slice of the Zen of Python:
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
We provide two specific examples of workflows below, first noting two aspects that will greatly improve your process. The first is to separate your data from your processing and presentation:
- raw data (data1.csv, ..., datan.csv)
- merged and filtered data (data_merged_filtered.db)
- the data behind one figure (temp_vs_time.csv)
- the script that plots it (temp_vs_time.py)

The second aspect, directly related to LaTeX, is
to establish a predictable naming convention. For example, each output
like a table or figure uses one script with the same
name (temp_vs_time.pdf <-> temp_vs_time.py), and the
LaTeX label follows this convention:
\label{fig:temp_vs_time}. When editing the document, the
path from figure to the associated plotting script and related data is
then clear. For a high level discussion of project workflow see 10 Things to Know About Project Workflow.
Here are two examples of directory structures which have worked for us:
In this example, we use Matt West’s directory structure, where the versions of the paper are kept in their own directories:
paper_topic_name_dir_name | string used for repo, tex, and bib files
+ requirements.txt | number of pages, etc
+ 1_submitted_paper
| +-- paper_topic_name.tex
| +-- refs_topic_name.bib
| +-- journal_class.cls | any files needed for the journal latex style
| +-- figures
| | +-- temp_vs_time.pdf | descriptive names for figures (not fig1.pdf, etc)
| | +-- error_vs_stepsize.pdf
| | `-- ...
| +-- data | data files that generate the figures
| | +-- Makefile | Makefile that will re-generate all figures
| | +-- temp_vs_time.csv | use the same name as the resulting figure
| | +-- plot_temp_vs_time.py | plotting scripts, use names like plot_.py
| | `-- ...
| `-- submitted_paper_topic_name.pdf | actual PDF file submitted
+ 2_reviews
| +-- review_1.pdf | individual reviews
| +-- review_2.pdf
| `-- editor_statement.pdf | instructions and summary from editor
+ 3_response_to_reviews
| +-- response_topic_name.tex
| `-- sent_response_topic_name.pdf | actual PDF file sent to editor
` 4_revised_paper
+-- paper_topic_name_revised.tex
+-- refs_topic_name_revised.bib
+-- journal_class.cls | copy here any other files needed
+-- figures | copy here all the figures again
| +-- temp_vs_time.pdf | edit figures as needed
| +-- error_vs_stepsize.pdf
| `-- ...
+-- data | copy all data again and edit as needed
| `-- ...
`-- submitted_paper_topic_name_revised.pdf | actual PDF submitted
Reference: Matt West @ https://lagrange.mechse.illinois.edu/latex_quick_ref/
An alternative approach uses git branches for different versions, and
a single Makefile for all tasks (from turning the paper
into a pdf file via LaTeX, to creating figures, etc.). See also the
discussion in Bowers and Voors (2016),
section 3.
paper_topic_name_dir_name | string used for repo, tex, and bib files
+ Makefile | file that tracks file relationships
+-- Data | directory for data and data cleaning, merging work
+ README.md | file with instructions and explanations
+ merge_data.R |
+ orig_data.csv | original data set, not to be changed
+ merge_data.csv |
`-- ... |
+-- Analysis |
+ README.md |
+ linear_simulations.R | file that runs simulations and saves output
+ linear_simulations.rda | output from linear_simulations.R
`-- ... |
+-- Figures |
+ README.md |
+ linear_simulations_N100.R | file creating a figure
+ linear_simulations_N100.pdf | the figure from linear_simulations_N100.R
+ descriptives.R | file creating a table
+ descriptives.tex | the table in LaTeX format
`-- ... |
+-- Paper |
+ README.md |
+ main.tex | the main LaTeX file
+ abstract.tex | the abstract file
`-- ... |
+-- References |
+ big.bib | bibliography file
`-- ... |
Now is better than never.

Feel free to play with the directory 3_workflows and the
readme.md file therein.
Often your writing is interleaved with edits and contributions from co-authors. How do you track changes and versions in your LaTeX document?
We strongly recommend git version control via GitHub, whether you are working alone on a document or with multiple authors. We do not describe git and GitHub in depth here, but instead offer the following high-level best practices.
What files should you track (in version control)?

- Your .tex file!
- The .bib file for your article
- Figures: ./figures/*.pdf
- Scripts: ./data/*.py, ./data/*.R
- Data: ./data/*.csv

What should you not track (in version control)?

- The generated PDF of the paper, e.g. paper_randnoise.pdf
- Auxiliary files: *.log, *.bbl, *.aux, etc.
- .DS_Store or other garbage from your system

Version control is invaluable as a collaboration tool; however, it does require diligence when working with co-authors on a LaTeX document. We recommend the following recipe:

- Before committing, clean the build with latexmk myfile.tex -C, and recompile to verify there are no errors.

Fewer tools allow collaborators to edit plain text documents at the same time. We nearly always rely on asynchronous collaboration, even if we have broken up a task and the whole team is working on it at the same time, even in the same room.
Overleaf is designed for this task. It compiles LaTeX and syncs with GitHub. See also the online versions of LaTeX listed here.
There are other systems for editing plain text at the same time such as Teletype for Atom.
See the directory 4_git and the readme.md
file therein.
The overarching style of your document is often decided by the journal. With this in mind, it is best to typeset your document with the journal’s style file. For example here is the style file for Political Analysis. The Society for Industrial and Applied Mathematics (SIAM) provides style files directly, whereas others are included with your TeX distribution and available on CTAN. In any case, committing to the expected format and not deviating from it will shorten your time-to-publication, because it avoids slowing down copy editing at the journal. The style files will provide macros for author formats, custom figure environments, and almost certainly the preferred style for the bibliography. In addition, most journals provide a style guide that will detail the expectations on punctuation, hyphens, commas, etc.
See directory 5_style and readme.md for an
example.
You already know Hemingway’s famous quote: “the only kind of writing is re-writing”. However, you might not know about linters.
A linter is a program that analyzes your text (sometimes in real-time, as you write it). When your misspelled words are highlighted in your email client, you are seeing the results of a linter alerting you to improve your text. Linters are also used in programming — catching code errors before running the code, by alerting you to unmatched parentheses or missing semi-colons.
Other linters can look for issues with style. Consider the following terrible sentence:
More research is needed to fill the gap created in extant literature in order to impact policy with very important findings.
One linter, write-good, highlights several potential problems:
col 16 error| [write-good] "is needed" may be passive voice [E]
col 71 error| [write-good] "in order to" is wordy or unneeded [E]
col 102 error| [write-good] "very" is a weasel word and can weaken meaning [E]
Of course, linters cannot do it all. We use them because they draw attention to sentences that may need work. Ultimately they (hopefully) help focus our attention on prose: re-writing the sentence without using a passive voice, without using “impact” as a verb (!), and with a stronger justification for research than to just fill a gap in the literature.
There are many fantastic tips and guides to improving your writing, from reading paragraphs and sentences out loud to “edit by ear” (Becker 1986) to guides specific to academic writing: Gopen and Swan (1990) and Becker (1986). Here, we offer a few directions that improve your writing specifically in LaTeX:
- Use a linter in your editor (one that checks your .tex document on-the-fly).
- Mark unfinished passages with % TODO, where % marks a line as a comment in the .tex file. You can find all places where you have % TODO in your document using: grep TODO paper_randnoise.tex.

See the directory 6_linting and the
readme.md file therein.
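A small sketch of the % TODO habit is shown here. The file name todo_sketch.tex and the wording of the notes are made up for illustration.

```latex
% todo_sketch.tex: a hypothetical file showing TODO comments
\documentclass{article}
\begin{document}
\section{Results}
% TODO: add the confidence interval for this estimate
Opposition to vaccination peaks at a population of 100,000.
% TODO: cite the source of the population data
\end{document}
```

Because each note starts with %, none of it appears in the PDF, and grep -n TODO todo_sketch.tex should list both notes with their line numbers.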
You will find that authors have their own macros, their own style in
the .tex document, and their own preferences when using
LaTeX. Here we offer general principles that can help improve your
overall LaTeX workflow:
\begin{align}
  \langle u, v \rangle & = \langle f, v\rangle\\
                       & = G(v)
\end{align}

\begin{tabular}{lrllr}
  \toprule
  & \multicolumn{1}{c}{$n$}
  & \multicolumn{1}{c}{$t$}
  & \multicolumn{1}{c}{$\rho$}
  & \multicolumn{1}{c}{$m$} \\
  \midrule
  experiment 1 & \num{19929}     & 0.32 & 0.8  & 55 \\
  experiment 2 & \num{7729292}   & 0.78 & 0.7  & 85 \\
  experiment 3 & \num{888173928} & 1.25 & 0.65 & 2 \\
  \bottomrule
\end{tabular}

- Use macros to simplify your .tex file. Instead of typing $\vec{H}(\text{curl},\Omega)$ each time to produce \(\vec{H}(\text{curl},\Omega)\), we might use a macro to create a shortcut command like $\Hcurl$:

\newcommand{\Hcurl}{\vec{H}(\text{curl},\Omega)}
\renewcommand{\vec}[1]{\boldsymbol{#1}}

- Do not overdo it: too many macros make the .tex source unreadable.
- booktabs: provides clean horizontal lines for tables (avoid vertical lines), providing \toprule and \bottomrule in the example above.
- siunitx: to format large numbers and notation, providing \num in the example above.
- Do not use \begin{align} for everything; instead try specific environments built for your purpose.
  - equation is your base equation environment. Use this unless you have multiple equations.
  - align should be used for multiple equations that require alignment.
  - split is used for a single equation that requires alignment when split.
  - multline is used for a single equation where no alignment is needed.
  - subequations may be used around align to retain a single equation numbering.

See example.tex in 7_dos for examples of
use.
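The environments listed above can be compared side by side in one small file. This is a sketch with placeholder formulas, not the example.tex from 7_dos; it assumes only the amsmath package.

```latex
\documentclass{article}
\usepackage{amsmath}   % provides align, split, multline, and subequations
\begin{document}

% equation: one numbered equation
\begin{equation}\label{eq:Axb}
  A x = b
\end{equation}

% align: several equations aligned at &, each line numbered
\begin{align}
  \langle u, v \rangle &= \langle f, v \rangle \\
                       &= G(v)
\end{align}

% split: one equation with one number, broken across aligned lines
\begin{equation}
  \begin{split}
    (a + b)^2 &= (a + b)(a + b) \\
              &= a^2 + 2ab + b^2
  \end{split}
\end{equation}

% multline: one long equation with one number and no alignment point
\begin{multline}
  p(x) = a_0 + a_1 x + a_2 x^2 + a_3 x^3 \\
         + a_4 x^4 + a_5 x^5 + a_6 x^6
\end{multline}

% subequations: the align lines share one parent number with letters a, b
\begin{subequations}
  \begin{align}
    x + y &= 1 \\
    x - y &= 3
  \end{align}
\end{subequations}

\end{document}
```

Compiling this and comparing the equation numbers on the right margin is a quick way to see what each environment is for.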
- Figures: \label{fig:easy_figure_name}

\begin{figure}[!ht]
  \centering
  \includegraphics{example.pdf}
  \caption{A caption}\label{fig:example}
\end{figure}

- Equations: \label{eq:useful_equation_name}

\begin{equation}\label{eq:Axb}
  A x = b
\end{equation}

- Sections: \label{sec:i_can_remember_this_section_name}.
- Tables: \label{tab:what_a_great_table_name}.

Central to TeX is an algorithm for placing and spacing figures and
text so that you don’t have to. Float environments (figure, table, etc.)
should be attached to the paragraph of their first reference (more in
the next section). Avoid use of
\FloatBarrier, \newpage, \vspace,
\hspace, etc. to muscle your own spacing.

- Keep your .tex document readable.

See the directory 7_dos and the readme.md
file therein.
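The label prefixes above (fig:, eq:, sec:, tab:) pair well with the cleveref package, whose \cref command adds the word "section", "table", and so on for you. The sketch below assumes cleveref is installed; the label names and table contents are placeholders.

```latex
\documentclass{article}
\usepackage{amsmath}
\usepackage{cleveref}   % load late in the preamble (after hyperref, if you use it)
\begin{document}

\section{Method}\label{sec:method}
\begin{equation}\label{eq:Axb}
  A x = b
\end{equation}

\begin{table}[!ht]
  \centering
  \begin{tabular}{lr}
    run          & $n$   \\
    experiment 1 & 19929 \\
  \end{tabular}
  \caption{Problem sizes}\label{tab:sizes}
\end{table}

% \cref picks the name from what the label points to; \Cref capitalizes it
\Cref{sec:method} solves \cref{eq:Axb}; sizes are in \cref{tab:sizes}.
\end{document}
```

The exact wording that \cref prints depends on the package options you choose, so check the output against your journal's style. Like any cross-reference, it needs two passes to resolve.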
The LaTeX system allows you to (1) insert citations in your text
using commands like \cite{ChOlSe_2021_lsrbm} which can turn
into [7], (Chaudhry et al., 2021),
[Ch21], or other citation styles within the text itself and
also (2) to print out your bibliography, formatted according to your
journal’s guidelines, using a single command in the LaTeX document like
\bibliography{mybib.bib}. Separating formatting from
information saves time: hundreds of citations will be printed
automatically in the correct format if desired including only the
sources you cited. If you decide that you no longer need a citation,
this will be removed from your bibliography automatically. Journals
often provide formatting guidelines in .bst files that can
be referred to in the \bibliographystyle{} command.
The program bibtex (or biber) reads
.aux files created by latex programs and creates a
.bbl file which is then read by the LaTeX program to format
everything (above we showed the need to run pdflatex,
bibtex, pdflatex, and pdflatex in
order to generate citations).
To use bibtex, you need a plain text file that is a
database with entries formatted in BibTeX format. For example, here is
one entry in the BibTeX file for this essay:
@article{ChOlSe_2021_lsrbm,
author = {Chaudhry, Jehanzeb H. and Olson, Luke N. and Sentz, Peter},
doi = {10.1137/20M1323552},
journal = {SIAM Journal on Scientific Computing},
number = {2},
pages = {A1081-A1107},
title = {A Least-Squares Finite Element Reduced Basis Method},
url = {https://doi.org/10.1137/20M1323552},
volume = {43},
year = {2021}
}

- Do not type out a .bib entry by hand. Grab the full citation online at the citation’s journal and/or Google Scholar (see instructions here for getting BibTeX formatted entries from Google Scholar).
- Use { } instead of " ".
- { } also force capitalization: for example title = {All about {Krylov} methods}.
- Avoid duplicate .bib entries. This can generate warnings.
- Enter each reference (in your .bib file) once. (And you can use tools like Zotero and BibDesk to make managing those collections of bibliographic information easier.)

See the directory 8_citations and the
readme.md file therein.
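The pieces of this section can be tried together in one self-contained file, sketched below. It reuses the entry shown above; the file name refs_sketch.bib and the plain bibliography style are assumptions chosen for illustration, and the filecontents environment assumes a LaTeX release from late 2019 or newer.

```latex
\documentclass{article}

% Write a small .bib file next to this document so the sketch is self-contained.
\begin{filecontents}{refs_sketch.bib}
@article{ChOlSe_2021_lsrbm,
  author  = {Chaudhry, Jehanzeb H. and Olson, Luke N. and Sentz, Peter},
  title   = {A Least-Squares Finite Element Reduced Basis Method},
  journal = {SIAM Journal on Scientific Computing},
  volume  = {43},
  number  = {2},
  pages   = {A1081--A1107},
  year    = {2021},
  doi     = {10.1137/20M1323552}
}
\end{filecontents}

\begin{document}
Reduced basis methods are discussed in \cite{ChOlSe_2021_lsrbm}.

\bibliographystyle{plain}     % assumption: swap in your journal's .bst style
\bibliography{refs_sketch}    % the .bib extension can be omitted here
\end{document}
```

Running pdflatex, bibtex, pdflatex, and pdflatex in that order (or latexmk -pdf once) should replace the ? with a numbered citation and print the reference list.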
Figures, tables, and math break up the text of a document and convey
information that can make or break the overall flow of your story. In
general, if a figure or table has been created using code, your project
should have a figure or table creation file like
linear_simulations_N100.R which creates one figure
linear_simulations_N100.pdf. This figure creation file
might require as input another file with simulation results, and in turn
the simulation results creator file may need data; this dependency may
be described in a readme or Makefile. For
example in line 1 of the Makefile below we see
Data/clean_data.csv: Data/clean_data.R Data/raw_data.csv
which means that the file Data/clean_data.csv depends on
Data/clean_data.R and Data/raw_data.csv. And
line 2 is a command used to create Data/clean_data.csv (in
this case, the command is R --file=Data/clean_data.R).

1 Data/clean_data.csv: Data/clean_data.R Data/raw_data.csv
2     R --file=Data/clean_data.R
3
4 Analysis/linear_simulations.rda: Analysis/linear_simulations.R Data/clean_data.csv
5     R --file=Analysis/linear_simulations.R
6
7 Figures/linear_simulations_N100.pdf: Figures/linear_simulations_N100.R Analysis/linear_simulations.rda
8     R --file=Figures/linear_simulations_N100.R

In general figures, tables, and math should appear close to where they are discussed in the text.
Figures are central to the overall feel of your article. Here are a few general tips for working with LaTeX and figures:
- Using \includegraphics to scale a figure will also change the font sizes; you should attempt to generate unscaled figures.
  - In R, see the extrafont package.
  - In Python (matplotlib), set fonts through rcParams.
- Set the displayed size with the options of the \includegraphics[]{} command. For example, if we wanted to include a figure but scale it to 30% (roughly 1/3) of the width of the text (the area within the left and right margins), we would use:

\includegraphics[width=0.3\textwidth]{myfig.pdf}

- Place the figure environment directly after the paragraph that first refers to it:

Figure~\ref{fig:vaccine_by_pop} shows that opposition to vaccination peaks at a population of 100,000.
%
\begin{figure}[!ht]
  \centering
  \includegraphics[width=.8\textwidth]{vaccine_by_pop.pdf}
  \caption{Vaccination opposition by population}\label{fig:vaccine_by_pop}
\end{figure}

- Give floats a placement specifier, as in \begin{figure}[!ht] or \begin{table}[!ht].
  - ! tex will ignore area restrictions.
  - h place it “here” if it fits in the area.
  - t place it at the “top” otherwise and if it fits otherwise create a new page.
- In R, use the xtable package to convert a matrix or data-frame to a LaTeX formatted table.

Math fonts should work with the main font of the article. For examples of good math and text font pairings see the LaTeX Font Catalogue.
See the directory 9_figures and the
readme.md file therein. In particular, you will consider
the following “bad” figure and how to improve it in your LaTeX
document.
A terrible figure
\cref{} referencing for all

A LaTeX document is a plain
text file. This means that you can use any text
editor to write a LaTeX document. However, a text editor that (1)
recognizes that \textbf{} is a LaTeX command or that (2)
keeps track of matching braces and parentheses makes it easier to write
LaTeX markup. To that end, we use neovim (sometimes with the vimr gui) with
vimtex
plugins but we know that there are many other approaches to typing a
plain text document using LaTeX markup.
We wrote this document using pandoc flavored markdown and turned it from plain text into HTML via the following command at the unix command line on our OS X laptops:
pandoc latex-guide.md --to html4 --from markdown+yaml_metadata_block+autolink_bare_uris+tex_math_single_backslash+inline_code_attributes --output latex-guide.html --self-contained --variable bs3=TRUE --standalone --section-divs --template latex-guide-template.html --include-in-header latex-guide-header.html --number-sections --table-of-contents --toc-depth=1 --variable theme=bootstrap --mathjax --variable 'mathjax-url:https://mathjax.rstudio.com/latest/MathJax.js?config=TeX-AMS-MML_HTMLorMML' --citeproc
Alternatively, if you have access to R, you can do the following to turn this markdown document into HTML.
Rscript -e "library(rmarkdown); render('latex-guide.md')"

[1] We suggest the Free online introduction to LaTeX if you are brand new to LaTeX.
[2] We have decided to write this guide in a very opinionated way. And we emphasize the nitty gritty of technical document creation. If these opinions inspire a reader to write a 10 Things Guide on using Markdown or Google Docs, please do write one! As an open-source document, we are also happy to receive pull requests for improvements to this guide.
[3] Try out \title{Some Paper} and \author{Some Person} in the preamble and \maketitle right after the \begin{document} line.