Update docs
danielcompton committed Dec 5, 2018
1 parent a9a5ce7 commit ecc3ec3
Showing 1 changed file with 21 additions and 5 deletions.
26 changes: 21 additions & 5 deletions README.md
@@ -22,9 +22,25 @@ Require the namespace and add `wrap-file-etag` to your middleware stack:
(etag/wrap-file-etag))
```

### Returning 304 Not Modified responses

This middleware only calculates checksums; it doesn't make any decisions about the status code returned to the client. If the user agent has provided an ETag in an [If-None-Match](https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/If-None-Match) header that matches the one calculated by the server, then you probably want to return a [304 Not Modified](https://httpstatuses.com/304) response. I recommend using the middleware built into Ring, [`wrap-not-modified`](http://ring-clojure.github.io/ring/ring.middleware.not-modified.html).

```clojure
(ns my-app.core
  (:require [co.deps.ring-etag-middleware :as etag]
            [ring.middleware.not-modified :as not-modified]))

(-> handler
    (etag/wrap-file-etag)
    (not-modified/wrap-not-modified))
```

For a more complete example, you can see the [middleware configuration](https://github.com/bhauman/lein-figwheel/blob/v0.5.17/sidecar/src/figwheel_sidecar/components/figwheel_server.clj#L261-L263) that Figwheel uses.
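
To make the decision described above concrete, here is a minimal sketch of an If-None-Match comparison on Ring-style request and response maps. It is not Ring's `wrap-not-modified` implementation: the function names are hypothetical, the response header key `"ETag"` is an assumption, and it ignores details such as multiple ETags, `*`, and weak validators.

```clojure
;; Illustrative only: a simplified If-None-Match check.
;; Function names are hypothetical; this is NOT how Ring implements it.

(defn etag-matches?
  "True when the request's If-None-Match header equals the response's ETag.
  Assumes lowercase request header names and an \"ETag\" response header key."
  [request response]
  (let [if-none-match (get-in request [:headers "if-none-match"])
        etag          (get-in response [:headers "ETag"])]
    (boolean (and if-none-match etag (= if-none-match etag)))))

(defn not-modified-response
  "Turns a matching response into a body-less 304, keeping its headers.
  Non-matching responses are returned unchanged."
  [request response]
  (if (etag-matches? request response)
    (assoc response :status 304 :body nil)
    response))
```

In a real application, prefer `wrap-not-modified`, which handles the cases this sketch leaves out.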

## Caching checksum calculations

-Once a checksum for a file has been calculated once, it is unnecessary to calculate it again. If the files you are serving are immutable, then it would be possible to pre-calculate the checksum once. However if you are working in an environment where the files being served may change (say a ClojureScript compiler output directory), then you cannot store the checksum separately from the file (either in-memory or on-disk), as you don't have a 100% reliable method for detecting when to recalculate the checksum (without running a file watcher, which introduces its own problems).
+Once a checksum for a file has been calculated once, it is unnecessary to calculate it again. If the files you are serving are immutable, then it would be possible to pre-calculate the checksum once and store the checksum in a local atom. However if you are working in an environment where the files being served may change (say a ClojureScript compiler output directory), then you cannot store the checksum separately from the file (either in-memory or on-disk), as you don't have a 100% reliable method for detecting when to recalculate the checksum (without running a file watcher, which introduces its own problems).
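
For the immutable-files case mentioned above, a local atom cache might look like the following sketch. The names `checksum-cache` and `cached-checksum` are hypothetical and not part of ring-etag-middleware; the cache never invalidates, so it is only appropriate when the files truly never change.

```clojure
;; Illustrative only: an in-memory checksum cache for immutable files.
;; `checksum-cache` and `cached-checksum` are hypothetical names.

(defonce checksum-cache (atom {}))

(defn cached-checksum
  "Returns the checksum cached under `k` (for example a file path).
  On a cache miss, calls the zero-argument `compute-fn`, stores the
  result, and returns it. Entries are never invalidated."
  [k compute-fn]
  (or (get @checksum-cache k)
      (let [checksum (compute-fn)]
        (swap! checksum-cache assoc k checksum)
        checksum)))

;; Usage: the second call does not invoke the compute function again.
(cached-checksum "public/app.js" (fn [] "cbf43926"))
(cached-checksum "public/app.js" (fn [] (throw (ex-info "not called" {}))))
```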

Instead, ring-etag-middleware provides a way to store checksums in the [extended attributes](https://en.wikipedia.org/wiki/Extended_file_attributes) of the files being served. If this option is enabled, the middleware will check if the `java.io.File` in the Ring response has a checksum calculated. If so it will return it as the ETag; if not it will calculate the checksum and store it as an extended attribute on the `File`. The JDK doesn't support this on all platforms that have support for extended attributes (notably [macOS](https://bugs.openjdk.java.net/browse/JDK-8030048)), so it is recommended to check for support with the provided `supports-extended-attributes?` function.
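
The read-or-compute-and-store flow described above can be sketched with the JDK's `UserDefinedFileAttributeView`. This is an assumption-laden illustration, not the library's actual code: the attribute name `example.checksum` and all function names here are hypothetical, and the sketch works on `java.nio.file.Path` rather than `java.io.File`.

```clojure
;; Illustrative only: caching a checksum in a file's extended attributes.
;; The attribute name and all function names are hypothetical.
(import '(java.nio ByteBuffer)
        '(java.nio.charset StandardCharsets)
        '(java.nio.file Files LinkOption Path)
        '(java.nio.file.attribute UserDefinedFileAttributeView))

(def attr-name "example.checksum")

(defn- attr-view
  "Returns the user-defined attribute view for `path`, or nil if unavailable."
  ^UserDefinedFileAttributeView [^Path path]
  (Files/getFileAttributeView path UserDefinedFileAttributeView
                              (make-array LinkOption 0)))

(defn xattrs-supported?
  "Asks the file store backing `path` whether it supports user attributes."
  [^Path path]
  (.supportsFileAttributeView (Files/getFileStore path)
                              UserDefinedFileAttributeView))

(defn read-checksum
  "Returns the stored checksum string, or nil if none has been stored."
  [^Path path]
  (when-let [view (attr-view path)]
    (when (some #{attr-name} (.list view))
      (let [buf (ByteBuffer/allocate (.size view attr-name))]
        (.read view attr-name buf)
        (.flip buf)
        (str (.decode StandardCharsets/UTF_8 buf))))))

(defn write-checksum!
  "Stores `checksum` as a UTF-8 extended attribute on `path`."
  [^Path path ^String checksum]
  (.write (attr-view path) attr-name
          (ByteBuffer/wrap (.getBytes checksum StandardCharsets/UTF_8))))

(defn checksum-with-xattr-cache
  "Returns the stored checksum if present; otherwise calls the
  zero-argument `compute-fn`, stores its result on the file, and returns it."
  [^Path path compute-fn]
  (or (read-checksum path)
      (let [checksum (compute-fn)]
        (write-checksum! path checksum)
        checksum)))
```

Because the checksum travels with the file rather than living in a separate store, there is no separate cache to keep in sync; whether a given platform and filesystem accept the write is exactly what a support check is for.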

@@ -39,14 +55,14 @@ Instead, ring-etag-middleware provides a way to store checksums in the [extended
))

(-> handler
-(etag/wrap-file-etag
-{:extended-attributes?
+(etag/wrap-file-etag
+{:extended-attributes?
(etag/supports-extended-attributes? file-path)}))
```

## Checksums or hashes?

-[Checksums](https://en.wikipedia.org/wiki/Checksum) are faster to calculate than [cryptographic hash functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function) like MD5 or SHA1. We don't need any of the cryptographic properties that the hash functions provide for an ETag, so using a checksum is a better choice. Pandect has some [benchmarks](https://github.com/xsc/pandect#benchmark-results) showing the speed differences between checksums and hashes.
+[Checksums](https://en.wikipedia.org/wiki/Checksum) are faster to calculate than [cryptographic hash functions](https://en.wikipedia.org/wiki/Cryptographic_hash_function) like MD5 or SHA1. An ETag doesn't need any of the cryptographic properties that hash functions provide, so using a checksum is a better choice. Pandect has some [benchmarks](https://github.com/xsc/pandect#benchmark-results) showing the speed differences between checksums and hashes.

We use CRC32 over Adler32 because it has a [lower risk](https://www.leviathansecurity.com/blog/analysis-of-adler32) of collisions at the cost of being slightly slower to calculate (10-20%). If you are at all concerned about performance, you should enable storing checksums in file extended attributes.
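
As a rough illustration of the two checksums discussed above, both are available in the JDK under `java.util.zip`. The helper `checksum-hex` is a hypothetical name for this sketch, and it is not necessarily how the middleware formats its ETag values.

```clojure
;; Illustrative only: computing CRC32 and Adler32 with the JDK.
;; `checksum-hex` is a hypothetical helper name.
(import '(java.util.zip Adler32 CRC32 Checksum))

(defn checksum-hex
  "Feeds `data` to a fresh java.util.zip.Checksum instance and returns
  the resulting value as a lowercase hex string (no zero padding)."
  [^Checksum algo ^bytes data]
  (.update algo data 0 (alength data))
  (Long/toHexString (.getValue algo)))

(def sample (.getBytes "123456789" "US-ASCII"))

(checksum-hex (CRC32.) sample)   ; => "cbf43926"
(checksum-hex (Adler32.) sample) ; => "91e01de"
```

Both classes implement the same `Checksum` interface, so swapping one for the other only changes the constructor call.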

@@ -58,4 +74,4 @@ I've written a [blog post](https://danielcompton.net/2018/03/21/how-to-serve-clo

Copyright © 2018 Daniel Compton

-Distributed under the MIT license.
+Distributed under the MIT license.
