Challenges with Density Measurement

The other day, I spent a few hours grinding coffee.

In a previous post, I proposed a model of coffee grinding, along with some data from my morning coffee which seemed to support the model. However, it should be said that this isn’t very high quality data—there are very few data points outside the range of grind settings I use on a daily basis, and those data points aren’t repeated.

One reason for this is that you go through a lot of coffee if you’re making repeated measurements of regularly spaced data points.

Level Ground, a roaster from Victoria, Canada, offered to help with this. They generously sent 3 kg of their “donation” coffee for use in this experiment. This coffee is a mix of beans and roasts, which poses some challenges here—insofar as these measurements shed light on the properties of the bean, we might see contributions from several different beans, which could obscure detailed features in the plot. But this would also be the case for blended coffees, so I think it’s well worth exploring.


I made 69 measurements in total: three samples at each of 23 equally spaced grind settings, ranging from 0.5 to 6.0 on my Eureka Mignon Specialita. This corresponds to a burr spacing of roughly 25 to 300 μm, based on measurements discussed in this post.

These measurements were made in two passes. First, I measured at a grind setting of 0.5, 1.0, 1.5, etc. Then I came back and measured at 0.75, 1.25, 1.75, etc. One advantage to measuring in this way is that it will make some error sources more obvious—we expect measurements from the two passes to line up, so if they don’t, something is wrong.
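The two-pass schedule can be written down in a few lines of Python. This is only a sketch of the ordering described above; the function name and its parameters are hypothetical, chosen for illustration.

```python
def two_pass_schedule(start=0.5, stop=6.0, step=0.5):
    """Return (first_pass, second_pass) lists of grind settings.

    The first pass covers start, start + step, ..., stop.
    The second pass fills in the midpoints between neighbouring
    first-pass settings.
    """
    n_first = int(round((stop - start) / step)) + 1
    first = [round(start + i * step, 2) for i in range(n_first)]
    second = [round(s + step / 2, 2) for s in first[:-1]]
    return first, second


first_pass, second_pass = two_pass_schedule()
all_settings = sorted(first_pass + second_pass)
# 12 first-pass settings plus 11 second-pass settings give 23 in total,
# and three samples at each gives 69 measurements.
```

Interleaving the passes like this means every second-pass point is bracketed by two first-pass points, which is what makes a mismatch between the passes easy to spot.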

For each measurement, I used the following protocol:

  1. Weigh about 15 grams of beans into a weighing boat.
  2. Pour the beans into the single-dose hopper.
  3. Grind into the portafilter basket.
  4. With the grinder still running, blow out any remaining coffee using a bellows.
  5. Weigh the ground coffee.
  6. Use a WDT tool, in a spirograph motion, starting from the bottom of the basket and raising the tool until it just levels out the surface.
  7. Tamp the coffee very firmly using an ordinary tamper, so that the grounds no longer compress when additional force is applied.
  8. Check zero on the depth measurement using a flat surface.
  9. Measure depth from the rim of the portafilter basket as described in a previous post.
  10. Clean the portafilter basket, tamper, and caliper using a microfiber cloth.

After changing the grind setting, I would use the bellows to blow out the grinder another time, until no grounds came out of the chute.
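Each run of the protocol produces a weight and a depth from the rim, and the density follows from those two numbers plus the basket geometry. The actual calculation is described in an earlier post and is not reproduced here, so the Python sketch below is an assumption: it treats the basket as a straight-walled cylinder, and the function and parameter names are hypothetical.

```python
import math


def puck_density(mass_g, depth_from_rim_mm, basket_depth_mm, basket_diameter_mm):
    """Density in g/cm^3 of a tamped puck.

    Assumes a cylindrical basket (an approximation; real baskets may taper).
    The puck height is the basket depth minus the measured depth from the rim.
    """
    puck_height_cm = (basket_depth_mm - depth_from_rim_mm) / 10.0
    if puck_height_cm <= 0:
        raise ValueError("measured depth must be less than the basket depth")
    radius_cm = basket_diameter_mm / 20.0  # mm diameter -> cm radius
    volume_cm3 = math.pi * radius_cm ** 2 * puck_height_cm
    return mass_g / volume_cm3
```

The basket depth and diameter would need to be measured for the specific basket in use; no values are assumed here.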


I’ve uploaded the raw depth measurements to GitHub. The following chart shows the calculated density values:

I’ve drawn the first pass in blue and the second pass in orange here to give a visual separation between the two passes. Clearly, something changed from one pass to the other—the big question in my mind is, what?

Sources of error

There are two different kinds of error in this data.

First, there is random error which spreads out the three repetitions at each grind setting. This error is reasonably consistent across the data, with a width of about 0.02 g cm⁻³.

Second, there is what appears to be a systematic error correlated with grind setting. At lower grind settings, we see a difference of about 0.03 g cm⁻³ between the two passes; at a grind setting of about 4.5 there is almost no difference; and we see larger differences again above that.
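Because the two passes were measured at interleaved grind settings, comparing them means estimating what the first pass would have read at each second-pass setting. The Python sketch below is one way to do that, assuming linear interpolation between neighbouring first-pass means. The function names and the dictionary layout are hypothetical, and this is not necessarily how the figures quoted above were read off the chart.

```python
from statistics import mean


def repeat_spread(samples):
    """Width (max minus min) of the repeated densities at one grind setting."""
    return max(samples) - min(samples)


def pass_offsets(first_pass, second_pass):
    """Second-pass minus interpolated first-pass density, per second-pass setting.

    Both arguments map grind setting -> list of repeated density values.
    """
    firsts = sorted(first_pass)
    offsets = {}
    for setting, samples in sorted(second_pass.items()):
        below = [s for s in firsts if s < setting]
        above = [s for s in firsts if s > setting]
        if not below or not above:
            continue  # no bracketing first-pass settings, so skip
        lo, hi = below[-1], above[0]
        frac = (setting - lo) / (hi - lo)
        lo_mean, hi_mean = mean(first_pass[lo]), mean(first_pass[hi])
        expected = lo_mean + frac * (hi_mean - lo_mean)
        offsets[setting] = mean(samples) - expected
    return offsets
```

An offset that stays well outside the repeat spread across many settings points to a systematic difference between passes rather than random scatter.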

Where do these differences come from? A few possibilities come to mind:

  • Error in measuring weight
  • Error in measuring depth
  • Inconsistency in the grind setting
  • Variations in the beans
  • Variations in tamping force

Let’s consider each of these.

Error in measuring weight

In a previous post, I described a peculiar issue with the accuracy of my cheap 0.01 g scale, which effectively reduces the scale’s accuracy to 0.1 g.

If we assume the scale is only accurate to 0.1 g, this represents an error of about 0.7% in weight. In density, this would correspond to an error of about 0.003 g cm⁻³, which is much smaller than the difference between the two passes in the plot above.
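The arithmetic here is a simple relative-error calculation: density is proportional to weight, so a given percentage error in weight produces the same percentage error in density. A minimal Python sketch follows, assuming a 15 g dose (from the protocol) and a typical density of about 0.45 g cm⁻³. That density is an assumed round figure consistent with the 0.003 g cm⁻³ quoted above, not a value taken from the data, and the function name is hypothetical.

```python
def density_error_from_weight(scale_error_g, dose_g, density_g_cm3):
    """Return (relative error, absolute density error in g/cm^3).

    Density scales linearly with mass, so the relative error in weight
    carries straight over to density.
    """
    relative = scale_error_g / dose_g
    return relative, relative * density_g_cm3


# Assumed inputs: 0.1 g scale error, 15 g dose, ~0.45 g/cm^3 density.
relative, absolute = density_error_from_weight(0.1, 15.0, 0.45)
# relative is about 0.0067 (roughly 0.7%); absolute is about 0.003 g/cm^3
```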

We could address this source of error by adding a known weight to ensure that our measurements fall outside the range where the scale is known to be inaccurate. For example, we can add a 0.5 g weight to shift a measurement of 18.00 g up to 18.50 g. Then we can subtract 0.5 g before recording the value.
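The bookkeeping for this offset method is easy to get wrong when recording many values by hand, so it may help to see it as code. This is a small Python sketch, assuming a 0.5 g reference weight as in the example above; the function name is hypothetical.

```python
def weigh_with_offset(scale_reading_g, reference_weight_g=0.5):
    """Recover the sample weight from a reading taken with a known weight added.

    The reference weight sits on the scale together with the sample, so the
    true sample weight is the reading minus the reference weight.
    """
    if scale_reading_g < reference_weight_g:
        raise ValueError("reading is smaller than the reference weight")
    # Round to the scale's 0.01 g display resolution.
    return round(scale_reading_g - reference_weight_g, 2)


sample_g = weigh_with_offset(18.50)  # 18.0 g sample plus the 0.5 g reference
```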

Error in measuring depth

Cheap digital calipers are surprisingly accurate. Typically, the biggest difference between a cheap caliper and a high-end model is fit and finish.

The zero on the calipers was checked before each measurement, so we should have caught any slippage in the depth attachment.

One potential source of error here is related to the hardness of the surface we’re measuring. It may be that the measurement changes depending on how hard we press the depth measurement tool onto the surface.

To quantify this, we can measure the same tamped puck several times. In the small experiment below, I prepared three tamped pucks in the same way, then repeated the depth measurement three times on each. In every trial, two of the three measurements were identical, so only two points are visible for each trial.

The density measurements for each tamped puck repeat within about 0.002 g cm⁻³. It seems likely that this is not a significant contributor to our experimental error.

Interestingly, although the grind setting was not changed between trials, and a spring tamper was used, we see a variation of about 0.01 g cm⁻³ between pucks.

In this case, most of the difference is between the first and second tests, so it’s possible the main contributor here is the grinder still settling after the grind setting was last changed, before the first test. This could be addressed by running an additional dose of beans through the grinder every time the grind setting is changed.

Inconsistency in the grind setting

In a previous post, I talked about modifications I’ve made to the Specialita grinder to improve repeatability. In that post, I measured burr spacing vs. grind setting, and obtained an excellent linear fit. This suggests to me that the repeatability of the grind setting is not a significant source of error.

Variations in the beans

The beans used in this experiment were “donation” beans, which could include several different beans and roasts. These could be unevenly distributed within the bag, so that the second pass was performed on a different mix of beans than the first. This would be difficult to quantify, but we could mitigate the issue by thoroughly mixing the beans before the experiment.

Variations in the tamping force

I recently wrote about an experiment in which I measured the effect of tamping force on the density of the tamped puck. That experiment showed the following density changes:

  • An increase of 0.03 g cm⁻³ when going from a 15 pound to a 30 pound tamping force.
  • An increase of 0.015 g cm⁻³ when going from a 1 second tamp to a 10 second tamp.

These changes are large enough to explain the difference we observed between the two passes. In the current experiment, we were not using a spring-loaded tamper, so it’s quite possible that the tamping force varied significantly between measurements made in the morning (blue points) and measurements made in the afternoon (orange points).

Interestingly, in the current experiment we saw more variation between the passes at lower grind settings versus at higher grind settings. This could be explained by a change in composition—i.e., at low grind settings we would expect more fines, which might have a greater sensitivity to differences in tamping force compared with nominal particles.

I have a future experiment in mind to measure stress vs. strain for ground coffee and for sifted fines and nominal particles, in order to explore this relationship further. In the meantime, though, we can mitigate the effect of changes in tamping force by using a spring loaded tamper for future experiments.


I was surprised by the difference between the two passes in this experiment. However, investigating that difference has led to a number of insights, which I’ve written about in recent posts.

I am planning to repeat this experiment soon, with a few changes:

  • Weights will be adjusted out of the “danger zone” for my scale, then corrected back to the actual value.
  • Some beans will be run through the grinder after changing the grind setting, in order to clear out old grounds.
  • Beans used in the experiment will be mixed beforehand.
  • A spring loaded tamper will be used to provide a consistent tamping force.

These changes should reduce the spread of the measurement points, so we’ll have a better chance of identifying meaningful patterns in the data.
