The grid lines on graph paper have to be just right. If they’re too dark, they compete for attention with your ink. If they’re too light, what’s the point?

I took Edward Tufte’s concept of the data-ink ratio and quantified what makes the best graph paper. I used “machine learning” to compare the darkness of the grid to the background (quotes because the term machine learning gets used for everything these days, but, hey… what can you do). Technically speaking, I used k-means clustering in R.

Setup

I took 6 gridded notebooks that I’ve accumulated over the years (pictured below).

[Image: the six gridded notebooks, each with its cut-out paper sample on top]

From left-to-right and top-to-bottom, here’s what they are:

  1. Some random unmarked brown notebook. I think it’s Japanese.
  2. Campus S5 by Kokuyo.
  3. Moleskine.
  4. Bienfang Gridded Paper.
  5. Kokuyo filler 5m A5.
  6. Fabriano EcoQua Spiral Notebook.

Equipment

  1. Canon Pixma MG5320 Scanner
  2. Pigma Micron 005 Black Ink Pen
  3. Various gridded notebooks
  4. Scissors
  5. iPhone X (camera)
  6. My Hackintosh with 40″ Display

Procedure

For each notebook, I:

  1. Cut a small square of graph paper.
  2. Drew an ink dot within a grid square by circling the pen around one point.
  3. Photographed this paper section on top of the notebook it came from (see above).
  4. Scanned it in black & white at 300 DPI in PNG format (to avoid compression artifacts), taking care to place the paper in the same position on the scanner each time.
  5. Cropped the scanned image to within the borders of the grid paper, so that only the paper is visible (no white background from the scanner).

[Image: the cropped scans]

Then, I ran all those images through this analysis script I wrote (below).

Validation

The script separates each scan into its 3 components (background, grid, and dot) and colors them separately. Check out the images below to see how it did. (It is critical to validate that your machine learning algorithm is doing what you think it is, or someone can end up dead.)
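The original script is in R and isn't reproduced here, so what follows is only a rough Python sketch of the same idea: k-means with 3 clusters on the grayscale values of a scan. The function name `kmeans_1d`, the evenly spaced starting centres, and the toy pixel values are all assumptions made for illustration. A library k-means would normally try several random starts, which matters when the grid is faint and sits close to the background.

```python
def kmeans_1d(values, k=3, iterations=50):
    """Cluster scalar pixel intensities into k >= 2 groups (Lloyd's algorithm)."""
    lo, hi = min(values), max(values)
    # Assumption: deterministic start, centres spread evenly over the range.
    centres = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    labels = [0] * len(values)
    for _ in range(iterations):
        # Assignment step: each pixel joins its nearest centre.
        labels = [min(range(k), key=lambda c: abs(v - centres[c])) for v in values]
        # Update step: move each centre to the mean of its members.
        new_centres = []
        for c in range(k):
            members = [v for v, lab in zip(values, labels) if lab == c]
            # An empty cluster keeps its old centre.
            new_centres.append(sum(members) / len(members) if members else centres[c])
        if new_centres == centres:
            break  # converged: nothing moved
        centres = new_centres
    return labels, centres


# Hypothetical toy "scan": 0 is black, 255 is white.
pixels = [242, 240, 244, 155, 241, 243, 150, 25, 20, 30, 160, 245]
labels, centres = kmeans_1d(pixels)
print(labels)   # cluster index per pixel
print(centres)  # mean intensity per cluster, darkest first
```

With this starting rule the clusters come out ordered from darkest to lightest, so on the toy data label 0 is the ink dot, 1 is the grid, and 2 is the background.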

[Image: validation output for each sample, with background, grid, and dot colored separately]

The colors chosen for each component don’t matter. All that counts is that the algorithm was able to separate the primary components of the grid paper. The separation looks good to me, aside from the 4th sample from the top. But the grid lines on that one are extremely faint, so the noisiness makes sense.
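For anyone who wants to build a similar validation image, here is a small Python sketch (not the author's R code) that paints each cluster label with an arbitrary color and serialises the result as a plain-text PPM file, which most image viewers can open. The palette, the function names `colorize` and `to_ppm`, and the 4x3 label grid are all made up for the example.

```python
# Hypothetical palette: one arbitrary RGB colour per cluster label.
PALETTE = {0: (255, 0, 0), 1: (0, 0, 255), 2: (255, 255, 255)}


def colorize(labels, width, palette=PALETTE):
    """Turn a flat, row-major list of cluster labels into rows of RGB tuples."""
    if len(labels) % width != 0:
        raise ValueError("label count must be a multiple of the image width")
    return [
        [palette[lab] for lab in labels[i:i + width]]
        for i in range(0, len(labels), width)
    ]


def to_ppm(rows):
    """Serialise RGB rows as a plain-text PPM (P3) image."""
    header = f"P3\n{len(rows[0])} {len(rows)}\n255\n"
    body = "\n".join(
        " ".join(f"{r} {g} {b}" for r, g, b in row) for row in rows
    )
    return header + body + "\n"


# Toy 4x3 label image: a grid line (1) crossing the background (2), one ink pixel (0).
toy_labels = [2, 2, 1, 2,
              2, 0, 1, 2,
              1, 1, 1, 1]
rows = colorize(toy_labels, width=4)
ppm_text = to_ppm(rows)
print(ppm_text)
```

Writing `ppm_text` to a file ending in `.ppm` gives something you can eyeball next to the original scan, which is the whole point of the validation step.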

Analysis

Now that I trust my algorithm knows how to separate each sample (hopefully you do too), we can proceed. In addition to outputting those validation images, the script prints out the average darkness for each sample’s components, plus an “Ink-Grid ratio” stat (Ink dot / Gridlines).
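To make the stat concrete, here is a hedged Python sketch of how an Ink-Grid ratio could be computed from clustered pixels. It assumes "darkness" means 255 minus the grayscale intensity and that the three clusters can be told apart by ordering their mean darkness (lightest is background, darkest is the ink dot). Neither assumption is confirmed by the original R script, and the pixel values and labels are toy data.

```python
def component_darkness(pixels, labels):
    """Mean darkness (assumed to be 255 - intensity) of each cluster, keyed by label."""
    totals, counts = {}, {}
    for value, lab in zip(pixels, labels):
        totals[lab] = totals.get(lab, 0) + (255 - value)
        counts[lab] = counts.get(lab, 0) + 1
    return {lab: totals[lab] / counts[lab] for lab in totals}


def ink_grid_ratio(pixels, labels):
    """Ink dot darkness divided by gridline darkness."""
    darkness = sorted(component_darkness(pixels, labels).values())
    if len(darkness) != 3:
        raise ValueError("expected exactly 3 components: background, grid, dot")
    # Assumption: lightest cluster is background, darkest is the ink dot.
    background, grid, ink = darkness
    return ink / grid


# Hypothetical toy data: intensities (0 = black) and their cluster labels.
toy_pixels = [242, 240, 244, 155, 241, 243, 150, 25, 20, 30, 160, 245]
toy_clusters = [2, 2, 2, 1, 2, 2, 1, 0, 0, 0, 1, 2]
ratio = ink_grid_ratio(toy_pixels, toy_clusters)
print(round(ratio, 2))
```

A higher ratio means fainter gridlines relative to the ink. A lower one means the grid is competing with whatever you write on it.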

[Image: script output listing the average darkness of each component and the Ink-Grid ratio for each sample]

With an Ink-Grid ratio of 11.74, Sample 4 is the outlier. The ratio is sky high because the gridlines on that one are too faint.

Sample 3, the Moleskine, has the darkest gridlines. I find them too dark. Those dark lines give it the lowest Ink-Grid ratio of the bunch, at 4.11.

To my eyes, all the others are decent, but Sample 6 is the best. That’s the Fabriano EcoQua notebook. Sample 5, the Kokuyo “filler”, is a close second, with the added benefit that the pages are perforated, so they tear out easily.

[Image]

Discuss on Hacker News. For more like this, follow @masonicb00m on Twitter.