Clueless Fundatma — GeistHaus

LaTeX and arXiv

Sachin Shanbhag Aug 20, 2024 Updated Aug 20, 2024

Posting a LaTeX manuscript on arXiv is straightforward.

1. Compile your document (say, main.tex). It is okay to leave all your figures in a "figs/" subfolder. Unlike some outlets, arXiv does not require you to flatten your directory structure.
2. Apart from main.tex, main.bbl (the bibliography), and the figures, you may delete all other files. You need the bbl file because arXiv does not run bibtex.
3. Zip the folder, and upload it to arXiv.

If you have a supplementary information document (say, si.tex) and you use the "xr" package to cross-reference between main.tex and si.tex, then a few extra steps are required.
arXiv compiles all tex files in the zipped folder in alphabetical order. So if your tex files have different names, make sure the main file sorts before the SI file, just as "main.tex" sorts before "si.tex".

1. Compile main.tex and si.tex several times on your machine, until the inter-document cross-references resolve as desired.
2. Apart from main.tex, si.tex, main.bbl, si.bbl, main.aux, si.aux, and the figures, you may delete all other files. [xr uses main.aux and si.aux.]
3. Rename main.tex to main_ren.tex, main.bbl to main_ren.bbl, si.tex to si_ren.tex, and si.bbl to si_ren.bbl. Do not rename the *.aux files. Do not compile.
4. Zip the folder, and upload it to arXiv.
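
The file shuffling in the steps above can be scripted. What follows is a minimal Python sketch, not a tested arXiv workflow: the function names stage_for_arxiv and zip_folder are made up for illustration, and it assumes the compiled files sit in one folder with figures under "figs/". It copies the needed files into a fresh folder (renaming the .tex and .bbl files, leaving the .aux files alone) and zips the result.

```python
import shutil
import zipfile
from pathlib import Path

SUFFIX = "_ren"  # appended to the .tex/.bbl names, as in the steps above


def stage_for_arxiv(src, dest, stems=("main", "si"), figs="figs"):
    """Copy only the files needed for upload from src into dest.

    Hypothetical helper: .tex and .bbl files are renamed with SUFFIX,
    .aux files keep their names because xr reads them.
    """
    src, dest = Path(src), Path(dest)
    dest.mkdir(parents=True, exist_ok=True)
    for stem in stems:
        for ext in (".tex", ".bbl"):
            shutil.copy2(src / f"{stem}{ext}", dest / f"{stem}{SUFFIX}{ext}")
        shutil.copy2(src / f"{stem}.aux", dest / f"{stem}.aux")
    if (src / figs).is_dir():
        shutil.copytree(src / figs, dest / figs, dirs_exist_ok=True)
    return dest


def zip_folder(folder, zip_path):
    """Zip every file under folder, storing paths relative to folder."""
    folder = Path(folder)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(folder.rglob("*")):
            if path.is_file():
                zf.write(path, path.relative_to(folder).as_posix())
    return Path(zip_path)
```

For example, stage_for_arxiv("paper", "upload") followed by zip_folder("upload", "upload.zip") leaves the working folder untouched, so a mistake costs nothing.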


Large PDFs with Matplotlib

Sachin Shanbhag Feb 22, 2024 Updated Feb 22, 2024

Vector graphics (SVG/PDF) output of scatterplots with thousands of points leads to bloated files, unlike raster formats such as PNG. Scrolling through PDF documents that include such figures is a painful affair.
The reason is fairly obvious: the size of a vector file scales with the number of data points, while the size of a raster file scales with the number of pixels.

There are many potential solutions. The simplest is to rasterize only the large set of scatter points using the rasterized=True flag:
plt.plot(x, y, 'o', alpha=0.1, rasterized=True)

The resulting PDF is much lighter, while the axes, tick labels, and text remain vector graphics.


Merging BibTeX bibliography files

Sachin Shanbhag Sep 8, 2023 Updated Sep 8, 2023

Suppose you want to merge two bib files (f1.bib and f2.bib) that have considerable overlap. One easy solution uses JabRef, as described below.

Suppose the target bibliography file without duplicates is merge.bib.

1. Copy f1.bib to merge.bib [cp f1.bib merge.bib]

2. Open merge.bib with JabRef

3. Then click File > Import into current database and select the other file [f2.bib]

4. You get a dialog box which allows you to manually decide which entries/versions you want to retain. If both f1.bib and f2.bib are of comparable quality, you can select "Deselect all duplicates", which automatically unselects duplicated entries.

5. Hit "OK" and save the modified database [Ctrl-S]
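
For a quick scripted alternative, here is a rough Python sketch. It is far cruder than JabRef: it treats two entries as duplicates only when their citation keys match exactly, keeps the first version it sees, and does not treat @string or @preamble blocks specially. The function names split_entries and merge_bib are illustrative, not part of any library.

```python
import re

# Matches the start of an entry such as "@article{key," at the start of a line.
ENTRY_START = re.compile(r"^@(\w+)\s*\{\s*([^,\s]+)\s*,", re.MULTILINE)


def split_entries(text):
    """Return a list of (citation_key, entry_text) pairs in file order."""
    matches = list(ENTRY_START.finditer(text))
    entries = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        entries.append((m.group(2), text[m.start():end].strip()))
    return entries


def merge_bib(text1, text2):
    """Concatenate two bib files, keeping the first entry seen for each key."""
    seen = set()
    merged = []
    for key, entry in split_entries(text1) + split_entries(text2):
        if key not in seen:
            seen.add(key)
            merged.append(entry)
    return "\n\n".join(merged) + "\n"
```

Reading f1.bib and f2.bib as text and writing merge_bib(text1, text2) to merge.bib gives a first pass. Entries that share content but differ in key still need the manual review that JabRef offers.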


Two useful Matplotlib utilities

Sachin Shanbhag Jul 9, 2023 Updated Jul 9, 2023

1. Latexify_py

latexify is a Python package to compile a fragment of Python source code to a corresponding LaTeX expression.

2. Pylustrator
Pylustrator offers an interactive interface to find the best way to present your data in a figure for publication. Added formatting and styling can be saved as automatically generated code. It can also compose several subfigures into a single multi-panel figure.
See YouTube demo.


LaTeX to Word

Sachin Shanbhag Nov 8, 2022 Updated Nov 8, 2022

Often I have a document in LaTeX, and somebody else needs an editable copy in Word. Here is a list of hacks I have learnt to use:

1. If the document is relatively free of math and figures then the simplest course is often to compile a PDF, and "import" the PDF into MS Word. This works out remarkably well in many cases.

2. The same applies to figures: you can now drop PDF images directly into a Word doc.

3. If you have lots of equations, then it is worthwhile to use pandoc:

pandoc mydoc.tex -o mydoc.docx

More sophisticated options exist for carrying over cross-references and the bibliography. See this as well.

4. Many journals accept PDF figures. If they need TIFF, then you can use Adobe Acrobat online to do this conversion. In my experience, this produces smaller files compared to other automatic converters including ImageMagick.


Recursively Clean LaTeX Debris in all Sub-Folders

Sachin Shanbhag Aug 17, 2022 Updated Aug 17, 2022

Often, I have a big folder like Lectures/ which may have sub-folders based on topics, and each topic might have additional folders. To clean auxiliary LaTeX files in one fell swoop, use:

find ./ \( -iname "*.bbl" -o -iname "*.aux" -o -iname "*.log" -o -iname "*.blg" -o -iname "*.nav" -o -iname "*.snm" -o -iname "*.toc" -o -iname "*.vrb" -o -iname "*.out" -o -iname "*.synctex.gz" -o -iname "_minted*" \) -delete
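
If you would rather preview what will be deleted, a Python version is easy to sketch. The helper below is illustrative (the names find_latex_debris and clean_latex_debris are made up). It covers the file extensions in the find command but, as an assumption for simplicity, leaves _minted* folders alone. It defaults to a dry run.

```python
from pathlib import Path

# Same extensions as the find command above (matched case-insensitively).
DEBRIS_SUFFIXES = (".bbl", ".aux", ".log", ".blg", ".nav", ".snm",
                   ".toc", ".vrb", ".out", ".synctex.gz")


def find_latex_debris(root="."):
    """Return all files under root whose names end in a debris suffix."""
    return sorted(p for p in Path(root).rglob("*")
                  if p.is_file() and p.name.lower().endswith(DEBRIS_SUFFIXES))


def clean_latex_debris(root=".", dry_run=True):
    """List the debris files; delete them only when dry_run is False."""
    targets = find_latex_debris(root)
    if not dry_run:
        for path in targets:
            path.unlink()
    return targets
```

Calling clean_latex_debris("Lectures") prints nothing and deletes nothing; inspect the returned list first, then call it again with dry_run=False.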


RegEx Help

Sachin Shanbhag Jul 18, 2022 Updated Jul 18, 2022

This ML based regex generator is quite handy!

https://www.autoregex.xyz/home


Lectures on Graphical Models

Sachin Shanbhag Jun 22, 2022 Updated Jun 22, 2022

Christopher Bishop has an excellent set (1, 2, and 3) of introductory lectures on "Probabilistic Graphical Models". They are well-motivated and cover topics that include:

directed and undirected graphs
conditional independence
factor graphs
inference using factor graphs and sum/product rules


QuickTip: Extracting pages from PDF on Linux

Sachin Shanbhag Mar 22, 2022 Updated Mar 22, 2022

On a Mac OSX system, the default app Preview allows you to cut and paste pages from a PDF.

On Linux you can use PDFChain to manipulate PDFs. If you simply want to extract a certain range, then qpdf is quite handy.

A CLI solution is to use ghostscript as described here:

gs -sDEVICE=pdfwrite -dNOPAUSE -dBATCH -dSAFER \
   -dFirstPage=1 -dLastPage=15 -sOutputFile=outfile.pdf inpfile.pdf

You can make the interface friendlier by saving a function in your bashrc as described in the article.
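
The same convenience can be had from Python. This is a small sketch, not the bash function from the linked article: the names gs_extract_command and extract_pages are hypothetical, and it assumes the gs executable is on your PATH. It only assembles the command shown above and runs it.

```python
import subprocess


def gs_extract_command(infile, outfile, first, last):
    """Build the ghostscript argument list for extracting pages first..last."""
    if first < 1 or last < first:
        raise ValueError("need 1 <= first <= last")
    return ["gs", "-sDEVICE=pdfwrite", "-dNOPAUSE", "-dBATCH", "-dSAFER",
            f"-dFirstPage={first}", f"-dLastPage={last}",
            f"-sOutputFile={outfile}", infile]


def extract_pages(infile, outfile, first, last):
    """Run ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(gs_extract_command(infile, outfile, first, last), check=True)
```

For instance, extract_pages("inpfile.pdf", "outfile.pdf", 1, 15) reproduces the command above.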


Matplotlib: Lines Connecting Points and Boxes

Sachin Shanbhag Apr 2, 2021 Updated Apr 2, 2021

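This gist has Python functions that help Matplotlib draw lines connecting points, and draw boxes.

def drawBox(xlim, ylim):
    """Return x, y coordinates tracing the closed rectangle xlim x ylim."""
    pts = [[xlim[0], ylim[0]], [xlim[1], ylim[0]], [xlim[1], ylim[1]],
           [xlim[0], ylim[1]], [xlim[0], ylim[0]]]
    x, y = zip(*pts)
    return x, y

def connectPoints(pts):
    """Return x, y coordinates of a polyline through pts = [(x0, y0), ...]."""
    x, y = zip(*pts)
    return x, y

# Usage: plt.plot(*drawBox((0, 1), (0, 2)), 'k-')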


Quicktip: Batch convert LibreOffice documents to PDF

Sachin Shanbhag Mar 30, 2021 Updated Mar 30, 2021

To convert all the DOCX files in the current working directory to PDF:

lowriter --headless --convert-to pdf *.docx

Similarly, to convert ODT files:

lowriter --headless --convert-to pdf *.odt


QuickTip: LaTeX multiline equations with explanations

Sachin Shanbhag Feb 4, 2021 Updated Feb 4, 2021

Sometimes you want to write a sequence of steps, and write the explanation for each step next to it.

abc = xyz
    = uvw    (pythagoras rule)
    = D      (triangle inequality)
    = ABC

It is easy to do this with the amsmath package as detailed in this StackOverflow question.

\usepackage{amsmath}  % in the preamble

\begin{align*}
abc &= xyz \\
    &= uvw && \text{pythagoras rule} \\
    &= D   && \text{triangle inequality} \\
    &= ABC
\end{align*}


Smooth Transition Between Functions

Sachin Shanbhag Dec 7, 2020 Updated Dec 7, 2020

Stitching together two functions is sometimes required as a way to transition from one dependence to another. The following schematic describes the idea pictorially:

Two different approaches are considered in this PDF (or this Jupyter Notebook).
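
To make the idea concrete, here is one common way to stitch two functions together: weight them with a smooth tanh switch. This is only an illustrative sketch and not necessarily either of the approaches in the linked PDF; the names smooth_step and blend, the crossover point x0, and the transition width are assumptions chosen for the example.

```python
import math


def smooth_step(x, x0=0.0, width=1.0):
    """Weight that rises smoothly from 0 (x << x0) to 1 (x >> x0)."""
    return 0.5 * (1.0 + math.tanh((x - x0) / width))


def blend(f, g, x, x0=0.0, width=1.0):
    """Follow f(x) well below x0 and g(x) well above it."""
    w = smooth_step(x, x0, width)
    return (1.0 - w) * f(x) + w * g(x)
```

For example, blend(lambda x: x, lambda x: 1.0, x) follows the line y = x for large negative x and levels off at 1 for large positive x. A smaller width makes the handover more abrupt.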


Trapezoidal rule in log-log space

Sachin Shanbhag Oct 26, 2020 Updated Oct 26, 2020

Consider the problem described in this StackOverFlow post. You have a function with certain smoothness properties that are apparent on a log-log plot. This is often accompanied by a large domain of integration. It seems worthwhile to "integrate in logspace", whatever that means.
This Jupyter notebook probes this question and makes some recommendations.
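
One possible reading of "integrating in logspace" is to assume the function is a power law, y = a x^m, between neighboring grid points (a straight line on a log-log plot) and to integrate each segment exactly. The sketch below implements that reading in plain Python. It is an illustration under that assumption, not a summary of the notebook's recommendations, and the name loglog_trapz is made up. It requires positive, strictly increasing x and positive y.

```python
import math


def loglog_trapz(x, y):
    """Integrate y(x), assuming a power law between neighboring points."""
    total = 0.0
    for x0, x1, y0, y1 in zip(x[:-1], x[1:], y[:-1], y[1:]):
        # Local slope on the log-log plot.
        m = math.log(y1 / y0) / math.log(x1 / x0)
        if abs(m + 1.0) < 1e-12:
            # y = a/x on this segment: the integral is a * ln(x1/x0).
            total += x0 * y0 * math.log(x1 / x0)
        else:
            # Integral of a*x^m from x0 to x1 is (x1*y1 - x0*y0)/(m + 1).
            total += (x1 * y1 - x0 * y0) / (m + 1.0)
    return total
```

With only seven log-spaced points between 1 and 1000, this recovers the integral of x^-2 essentially exactly, because the piecewise power-law assumption holds for that integrand. The ordinary trapezoidal rule on the same coarse grid is far off.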


Quicktip: Reindent Python Scripts

Sachin Shanbhag May 20, 2020 Updated May 20, 2020

Suppose part of a Python file uses spaces for indentation, while another part uses tabs. Python 3 raises a TabError when the two are mixed inconsistently. So the question is how to fix it.

One answer is to use the Python script reindent.py. Stick it in some folder (~/bin/) in the default path and make it executable (chmod +x reindent.py).

The usage is straightforward:

reindent.py -n file.py

modifies the original file in place. The -n flag skips the ".bak" backup that reindent.py otherwise writes.
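
Before reindenting, it can help to see where the tabs are. The following is a small sketch with a made-up function name (find_tab_indented_lines), not part of reindent.py; it reports the lines whose leading whitespace contains a tab.

```python
def find_tab_indented_lines(source):
    """Return 1-based line numbers whose indentation contains a tab."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        # Leading whitespace is whatever lstrip removes from the front.
        indent = line[: len(line) - len(line.lstrip(" \t"))]
        if "\t" in indent:
            hits.append(lineno)
    return hits
```

Pass it the text of a file, for example find_tab_indented_lines(open("file.py").read()). An empty list means the indentation contains no tabs.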


Matplotlib: Saving TIFF and JPG formats

Sachin Shanbhag May 18, 2020 Updated May 18, 2020

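With pillow installed, on my LinuxMint installation:

import matplotlib
matplotlib.use('TkAgg')  # select the backend before importing pyplot
import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 1)
plt.plot(x, x**2)
# the savefig keyword is format=, not fmt=; pil_kwargs is passed on to pillow
plt.savefig('test.tiff', dpi=300, format="tiff", pil_kwargs={"compression": "tiff_lzw"})
plt.savefig('test.jpg', dpi=300)  # JPG output also goes through pillow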


QuickTip: Catching array bounds violations in Fortran 90

Sachin Shanbhag Jan 20, 2020 Updated Jan 20, 2020

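With gfortran, you can check whether array bounds are violated at runtime by compiling with,

gfortran -fbounds-check myProg.f90

In current gfortran releases, -fbounds-check is a deprecated alias for -fcheck=bounds.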


LaTeX: Cross-referencing between Different Documents

Sachin Shanbhag Oct 18, 2019 Updated Oct 18, 2019

Problem: I have a manuscript TeX file (main.tex), and an independent supporting information file (si.tex). I want to cross-reference (using \label and \ref) items across the two files.

For example, I might want to reference figure 1 from si.tex in main.tex.

Solution: As this SO answer suggests, the answer lies in the CTAN package xr.

In main.tex, just include "si.tex" as an external document, and all its labels become visible!

\usepackage{xr}
\externaldocument{si}


Parameter Uncertainty in Numpy Polyfit

Sachin Shanbhag Oct 17, 2019 Updated Oct 17, 2019

Say you want to fit a line to (x,y) data. With polyfit, you can say,
coeff = np.polyfit(x, y, 1)
With numpy 1.7 and greater, you can also request the estimated covariance matrix,
coeff, cov = np.polyfit(x, y, 1, cov=True)
The standard errors of the parameters are the square roots of the diagonal elements,
print(np.sqrt(np.diag(cov)))
This report referenced in the SO page is quite useful!
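
Putting the pieces together, here is a short self-contained sketch. The data are synthetic and purely illustrative: a line with slope 2 and intercept 1, plus Gaussian noise of standard deviation 0.1, all of which are assumptions made up for this example.

```python
import numpy as np

# Synthetic data: y = 2x + 1 plus noise (illustrative values only).
rng = np.random.default_rng(0)
x = np.linspace(0.0, 1.0, 50)
y = 2.0 * x + 1.0 + rng.normal(scale=0.1, size=x.size)

# Fit a line and request the estimated covariance matrix of the coefficients.
coeff, cov = np.polyfit(x, y, 1, cov=True)

# Standard errors are the square roots of the diagonal of the covariance.
stderr = np.sqrt(np.diag(cov))

# polyfit returns the highest power first: slope, then intercept.
for name, c, s in zip(("slope", "intercept"), coeff, stderr):
    print(f"{name} = {c:.3f} +/- {s:.3f}")
```

The fitted slope and intercept should land within a few standard errors of the values used to generate the data.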


Learning Gaussian Processes

Sachin Shanbhag Sep 30, 2019 Updated Sep 30, 2019

I've been studying up on Gaussian process modeling for machine learning.

For someone seeing these concepts for the first time, I would recommend the following sequence based on my experience:

1. A Visual Exploration of Gaussian Processes

It hits the key points of what makes multinormal distributions special (conditionals and marginals are normal too!), and the visuals help build intuition.

1a. Gaussian Processes for Dummies

You might not need this, but I like this essay because it is jargon-free, and focuses on how to get things going. There is python code at the end, which you can play with.

2. Chapter 2 of Gaussian Processes for Machine Learning

This "bible" is astonishingly well-written. If you are familiar with linear algebra and some statistics, this is a breezy read. Plus, all the important formulae and algorithms you see in different articles, are available here in one place!

3. If you like videos, then this YouTube lecture might be worth watching!


QuickTip: Toggling to Previous View in PDF Readers

Sachin Shanbhag Aug 22, 2019 Updated Aug 22, 2019

I use Preview (on my Mac laptop) and Foxit Reader (on my Linux Desktop) to read PDFs.

While reading papers, I often find myself clicking on links to citations. This takes me to the reference section. After looking up the citation, I like to go back to the previous location on the paper (right before clicking on the link).

How to go back to the "previous view" isn't well documented.

In Preview, the shortcuts are "Cmd + [" (back) and "Cmd + ]" (forward).

In Foxit Reader for Linux (v2.4 and above), the shortcuts are "Alt + Left Arrow" (back) and "Alt + Right Arrow" (forward).


QuickTip: Math Font in Matplotlib

Sachin Shanbhag Jul 5, 2019 Updated Jul 5, 2019

Matplotlib (v2 and higher) uses "mathtext" to render math by default. It is quite capable, but I don't like the default font, and prefer the classic "Computer Modern" font.

You can fix this globally by adding the following line to your matplotlibrc file or custom style file (use the command matplotlib.get_configdir() to find the location):

mathtext.fontset : cm

If you want to render all text using LaTeX (this slows down rendering somewhat), then use:

text.usetex : true
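
If you prefer not to touch the configuration files, the same settings can be applied per script through rcParams. A minimal sketch, which affects only the current Python session:

```python
import matplotlib

# Use Computer Modern for mathtext in this session only.
matplotlib.rcParams["mathtext.fontset"] = "cm"

# Uncomment to render all text with LaTeX (needs a working LaTeX install).
# matplotlib.rcParams["text.usetex"] = True
```

Set these before creating figures so that every label picks up the font.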


Snip Math

Sachin Shanbhag Jul 3, 2019 Updated Jul 3, 2019

Mathpix Snip looks like an amazing tool.

You take a screenshot of some math and get it rendered in LaTeX.

The process as illustrated on their website:

It is available for download on all major operating systems.


Zero and Infinity

Sachin Shanbhag Jun 26, 2019 Updated Jun 26, 2019

A triangle has three corners.

A pentagon has five. A decagon has ten.

As the number of corners becomes large, the polygon becomes more "circular".

When the number of corners is infinity, the polygon is a circle - a shape with zero corners!


Links

Sachin Shanbhag Jun 19, 2019 Updated Jun 19, 2019

1. Stay in the Game: There is hope in humanity!

2. Kevin Simler's graphical essay on going critical

3. The force of Gilbert Strang

4. Strogatz's "Beauty of Calculus" Lecture
