Varnishlog: measure your Varnish cache performance

Written by Thijs Feryn's blog - - Aggregated on Friday September 8, 2017
Tags: tech, cache, logs, varnish

Varnishlog is an essential tool to measure the performance of your Varnish cache. Why add a reverse caching proxy like Varnish in front of your application, if you can’t analyze how it performs or behaves?

This blog post is an in-depth tutorial on how to use the varnishlog tool to retrieve Varnish Shared Memory Log information. The goal is to filter the output and turn it into meaningful output that gives insight into the behavior of your Varnish cache.

Follow along for an in-depth tutorial. Use the Table Of Contents if you want to read specific sections of this long blog post.

Varnishlog

Varnishlog is a utility that reads the contents of the in-memory log that Varnish provides. This log is often referred to as the Varnish Shared memory Log (VSL). Because it is stored in memory, it’s not a persistent log.

The varnishlog utility offers you a way to have a look at what’s currently happening within your Varnish instance. By default, the output is grouped by transaction and is tag based. A tag has a value and varnishlog allows you to query and filter the output by tags and values.

Have a look at the man pages for full reference about the usage of the varnishlog binary.

Varnish Shared Memory Log

The varnishlog is just a utility, an interface. The actual logs are generated by the varnishd program. By default the VSL is 81 megabytes in size. However, the size VSL size can be set to a custom value by adding the -l option to your Varnish startup options.

The VSL output contains log lines that belong to transactions. Every log line has a tag and a corresponding value. Here’s an example extract:

* << Request >> 10973258
 - Begin req 10973257 rxreq
 - Timestamp Start: 1501507281.942533 0.000000 0.000000
 - Timestamp Req: 1501507281.942533 0.000000 0.000000
 - ReqStart 127.0.0.1 59753
 - ReqMethod GET
 - ReqURL /
 - ReqProtocol HTTP/1.1
 - ReqHeader Host: blog.feryn.eu

What does this output mean?

This is only an extract. An actual transaction contains a lot more information. We’ll dig deeper throughout this blog post. But you’ll find the full reference for VSL on the Varnish documentation site.

Transactions

Transactions are items of work. Every transaction in the log is identified by a transaction ID (VXID). Log lines that share the same VXID belong to the same transaction.

There are 2 kinds of transactions:

A session refers to the TCP connection that the client makes to the server. When keep-alive is supported, one connection can process multiple requests.

The request can have different incarnations:

Transactions can depend on one another, which means there’s a certain hierarchy:

Basically: transactions initiate other transactions. These dependencies are visualized in the image below:

Image courtesy of Varnish Software

You see the Begin tag for the different transactions in the log output. You’ll notice the different types of transactions, the parent and child VXIDs, and the reason why this is transaction is occurring.

Here’s what happens in the example above:

It may seem weird that a transaction that occurs later in the flow, is displayed first. Because of the dependencies between transactions, there is some kind of encapsulation going on. The image above illustrates this quite nicely: transaction 3 is shown first, because it is the inner most transaction in the flow, encapsulated by its parent transactions.

Transaction grouping

By default, transactions are grouped by their transaction ID (VXID): transactions are displayed in their entirety and separately, including sessions, and the different kinds of requests. This results in independent transactions with limited context about their correlation.

By adding the -g argument to the varnishlog binary, we can specify how we want to group our output. The following grouping modes are available:

Let’s illustrate these grouping modes with an example. The diagram below show the content composition of a website that has a header, a navigation, a footer, and the main content block.

A content composition example which we’ll examine with varnishlog

The header and navigation are composed through 2 ESI requests:

The homepage also loads the /footer resource through an AJAX call. 

We inspect the shared memory log by executing the following command:

varnishlog -i Begin,ReqUrl,Link,BereqURL

This command displays the start of transactions, the request URL, linked transactions, and the backend request URL when a backend call is made.

This results in the following output:

* << BeReq >> 98318
- Begin bereq 98317 fetch
- BereqURL /

* << BeReq >> 98320
- Begin bereq 98319 fetch
- BereqURL /header

* << Request >> 98319
- Begin req 98317 esi
- ReqURL /header
- Link bereq 98320 fetch

* << BeReq >> 98322
- Begin bereq 98321 fetch
- BereqURL /nav

* << Request >> 98321
- Begin req 98317 esi
- ReqURL /nav
- Link bereq 98322 fetch

* << Request >> 98317
- Begin req 98316 rxreq
- ReqURL /
- Link bereq 98318 fetch
- Link req 98319 esi
- Link req 98321 esi

* << BeReq >> 98324
- Begin bereq 98323 fetch
- BereqURL /footer

* << Request >> 98323
- Begin req 98316 rxreq
- ReqURL /footer
- Link bereq 98324 fetch

* << Session >> 98316
- Begin sess 0 HTTP/1
- Link req 98317 rxreq
- Link req 98323 rxreq

As you can see, there is no grouping, no hierarchy, an no indentation. Thanks to the Link tag, we know which requests are triggered by what session, and which backend fetches are triggered by what request. But it’s not really intuitive.

We can also group by session. This would require us to use the following command:

varnishlog -i Begin,ReqUrl,Link,BereqURL -g session

The output would be the following:

* << Session >> 14
- Begin sess 0 HTTP/1
- Link req 65539 rxreq
- Link req 65545 rxreq
** << Request >> 65539
-- Begin req 14 rxreq
-- ReqURL /
-- Link bereq 65540 fetch
-- Link req 65541 esi
-- Link req 65543 esi
** << Request >> 65545
-- Begin req 14 rxreq
-- ReqURL /footer
-- Link bereq 65546 fetch
*** << BeReq >> 65540
--- Begin bereq 65539 fetch
--- BereqURL /
*** << Request >> 65541
--- Begin req 65539 esi
--- ReqURL /header
--- Link bereq 65542 fetch
*** << Request >> 65543
--- Begin req 65539 esi
--- ReqURL /nav
--- Link bereq 65544 fetch
*** << BeReq >> 65546
--- Begin bereq 65545 fetch
--- BereqURL /footer
**** << BeReq >> 65542
---- Begin bereq 65541 fetch
---- BereqURL /header
**** << BeReq >> 65544
---- Begin bereq 65543 fetch
---- BereqURL /nav

In the example above, there’s indentation, and you can clearly see which session contains what requests, and which subrequests are triggered.  This is the actual hierarchy of the example above:

The hierarchy of varnishlog transactions

When you group transactions based on the request using varnishlog -g request, session information will be omitted. When you perform raw grouping using varnishlog -g raw, every log record is a transaction on its own. In this case there is no correlation between log records.

The VCL flow

Shared memory logs only provide meaning when you understand how they represent the VCL flow. Varnish uses the Varnish Configuration Language to extend its default behavior.

Here’s the diagram containing the flow:

 

It is essential that your understand the behavior, but that’s material for a blog post on its own. Basically, you should understand that Varnish receives requests and tries to return the cached response from memory, rather than fetching it from your backend.

Here’s a brief summary of the decision making process:

The differente stages of the flow are identified by the corresponding hook names. These names are useful to know when you come across them in the memory logs, as they provide context.

Varnishlog tags

So far we’ve talked about loglines, about the correlation between log lines and transactions, and the correlation between transactions. In the previous examples, we already hinted towards tags. I showed you some basic tags like Begin, Link, ReqUrl, ReqHeader. Although these tags were useful in the context of transaction identification, the output gives little meaning from a broader perspective.

It’s about time we introduce some more tags and group them according to their meaning. The full reference can be found on the Varnish documentation website.

There are different kinds of tags:

Request tags

Request tags are used to identify the different bits and pieces of an incoming HTTP request.

You’ll find these tags in the log lines below:

- ReqStart 127.0.0.1 56312
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Upgrade-Insecure-Requests: 1
- ReqHeader User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36
- ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
- ReqHeader Accept-Encoding: gzip, deflate, br
- ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6
- ReqHeader X-Forwarded-For: 127.0.0.1

Response tags

Response tags identify log lines that give insight about the HTTP response that is returned to the client by Varnish. Here’s an overview of tags:

- RespProtocol HTTP/1.1
- RespStatus 200
- RespReason OK
- RespHeader Host: localhost
- RespHeader Cache-Control: public, s-maxage=500
- RespHeader Date: Tue, 01 Aug 2017 08:56:44 GMT
- RespHeader ETag: "c5afddc587599a72d467caca23e980bf"
- RespHeader Vary: Accept-Language
- RespHeader Content-Length: 3098
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader X-Varnish: 32770
- RespHeader Age: 10

The example above returns a simple page where no errors occur, hence the 200 status code. It’s clear that this response is an HTTP 1.1 response. The response headers that the backend returned, are included, but Varnish also adds a couple of extra headers:

Backend tags

When a connection to the backend is made, the information about the backend is logged, as shown by the log extracts below:

- BackendOpen 19 boot.default 127.0.0.1 8080 127.0.0.1 62552
- BackendClose 19 boot.default

In this case we’re setting up a connection on the localhost to a webserver that is running on port 8080. The local port on Varnish that is used is port 62552. The backend is identified as “default”, and was loaded at boot time. Later on we’re closing the connection to that named backend.

In one of the next segments, I’ll show you how to monitor backend health using Backend_health tags.

There are other backend tags available, you’ll find them on the Varnish documentation page.

Backend request tags

When an object cannot be served from cache, Varnish makes a backend request. The backend request object contains the data of the original request object, but with potentially updated values. Typical backend request tags are:

 

Again, there’s more, but I won’t cover them all.

Backend response tags

It will come as no surprise that the backend response tags are almost identical to the response tags. A backend response object contains payload and headers that were returned by the backend. The value of a backend response will be stored in a cache object, and later on copied into a response object.

These are the most common backend response tags:

Object tags

Object tags represent cached objects. The value of a cache object will primarily come from a backend response. Here are the object tags, which are again quite similar to the response tags:

VCL behavior tags

When examining logs, the input and output data is significant, but the behavior of the cache is just as important. The VCL related tags are injected at different stages in the logs to indicate which part of the flow has been reached by the transaction. Here are some example tags:

If we perform another varnishlog call on our example website, where we only display the URL, the backend request URL, and the relevant VCL tags, the output could be the following:

* << Request >> 5
- ReqURL /
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call MISS
- VCL_return fetch
- VCL_call DELIVER
- VCL_return deliver
** << BeReq >> 6
-- BereqURL /
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- VCL_call BACKEND_RESPONSE
-- VCL_return deliver

The 2 transactions give a clear indication of what’s going on:

The example above is quite extensive, because the result is a cache miss, which requires backend access. When the object would be in cache, the logs could look as follows:

* << Request >> 6
- ReqURL /
- VCL_call RECV
- ReqURL /
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call HIT
- VCL_return deliver
- VCL_call DELIVER
- VCL_return deliver

This is the decision make process for this request:

There are plenty of other scenarios, but a hit and a miss are the most essential scenarios.

The VCL_call and VCL_return tags are, in my opinion, the most useful tags. They provide meaning and context. Without them, it’s sometimes hard to determine what happened to certain requests or responses. Use them in conjunction with request and backend response information.

Miscellaneous tags

The Varnish Shared Memory Logs also contain some tags that are specifically cache related.

The ExpBan tag

The ExpBan tags is used when the VCL bans an object from cache, a log line with the ExpBan appears containing the VXID of the parent transaction.

When we would ban the /header resource and call the varnishlog -i "ReqUrl,ExpBan" command, the output could look like this:

* << Request >> 32789
- ReqURL /header
- ExpBan 98355 banned lookup

The Hit tag appears when a transaction results in a cache hit. The VXID of the parent transaction is returned. By executing varnishlog -i "ReqUrl,Hit", we’d get the following output:

* << Request >> 98369
 - ReqURL /footer
 - Hit 65597

The HitPass tag

The HitPass tag is used for log lines that are negatively cached in the Hit-For-Pass cache. This is caused when the vcl_backend_response method inspects the backend responds and decides the response cannot be cached, for a variety of reasons. This means the decision not to cache is cached in a special Hit-For-Pass cache. This avoids unnecessary request queueing.

When you encounter a HitPass tag appears with your request tags, this means the request will bypass the rest of the decision making process, and will immediately be sent to the backend. Here’s an example:

** << Request >> 6
-- ReqURL /footer
-- HitPass 3

The log extract above indicates that the response for the /footer resource was negatively cached. For the next 120 seconds, Varnish will bypass the decision whether or not to cache, and serve the data directly from the backend, regardless of other criteria.

The TTL tag

The TTL tag is used to log the Time-To-Live that is assigned to an object that is about to be stored in cache. Here’s the format of the value for this tag:

%s %d %d %d %d [ %d %d %u %u ]
|  |  |  |  |    |  |  |  |
|  |  |  |  |    |  |  |  +- Max-Age from Cache-Control header
|  |  |  |  |    |  |  +---- Expires header
|  |  |  |  |    |  +------- Date header
|  |  |  |  |    +---------- Age (incl Age: header value)
|  |  |  |  +--------------- Reference time for TTL
|  |  |  +------------------ Keep
|  |  +--------------------- Grace
|  +------------------------ TTL
+--------------------------- "RFC" or "VCL"

The TTL value provides a substantial amount of information regarding the decision making process in terms of the Time-To-Live:

If the appropriate Cache-control or Expires headers are provided, an order of precedence is respected:

If the TTL is set based on HTTP response headers, the value for the first column is RFC. If the TTL is set by VCL, the value is VCL.

The effective Time-To-Live is stored in the second column of the log line.

If the VCL has specific grace, or keep values, this is added to the Time-To-Live, and gives it some extra leeway. These values are added as the third and fourth column.

The fifth column contains the current timestamp.

When we’re dealing with a TTL that was set by the VCL, the last four columns aren’t displayed. Here’s such an example:

-- TTL VCL 120 10 0 1501597242

Let’s break it down:

And here’s an example when the backend sets the TTL:

-- RFC 500 10 -1 1501598872 1501598872 1501598872 0 500

Let’s analyze the output:

If you  would add a custom Age header to your HTTP response, you would in fact mess with your Time-To-Live: the amount of seconds that is stated would be deducted from your TTL. In the example below the response contains an Age=3 header which makes the object 3 seconds old from the go.

Instead of a 500 second TTL, the object will only be cached for 497 seconds. This is reflected in the Age column, which is off by 3 seconds:

-- RFC 500 10 -1 1501598872 1501598869 1501598872 0 500

The Begin tag

The Begin talk isn’t really new to us, I mentioned the tag earlier in this post. This tag indicates the beginning of a transaction, staging which kind of transaction it is, to which parent transaction it belongs, followed by a bit more context.

Here are a couple of examples:

- Begin bereq 98317 fetch
- Begin req 98317 esi
- Begin req 98316 rxreq
- Begin sess 0 HTTP/1

As mentioned before: the transaction is either a session, a request, or a backend request. Backend requests could be regular requests, or ESI requests.

The Link tag

Whereas the Begin tag links to its parent VXID, the Link tag expresses links to a child VXID. Again, a couple of examples:

- Link bereq 98312 fetch
- Link req 98311 rxreq
- Link req 98318 esi

The Timestamp tag

At several points in the log, Timestamp tags will occur. They contain timing information. This is the format:

%s: %f %f %f
|   |  |  |
|   |  |  +- Time since last timestamp
|   |  +---- Time since start of work unit
|   +------- Absolute time of event
+----------- Event label

Here’s an example containing timestamps:

* << Request >> 65539
- Timestamp Start: 1501601912.758662 0.000000 0.000000
- Timestamp Req: 1501601912.758662 0.000000 0.000000
- Timestamp Fetch: 1501601912.806733 0.048071 0.048071
- Timestamp Process: 1501601912.806750 0.048088 0.000017
- Timestamp Resp: 1501601912.806787 0.048125 0.000037
** << BeReq >> 65540
-- Timestamp Start: 1501601912.758753 0.000000 0.000000
-- Timestamp Bereq: 1501601912.758952 0.000199 0.000199
-- Timestamp Beresp: 1501601912.806677 0.047924 0.047725
-- Timestamp BerespBody: 1501601912.806749 0.047996 0.000072

What can we learn from this example?

When we dig deeper, and analyze the backend request, we learn the following lessons:

If you just want to know how long the full transaction lasted, the very last occurrence of the Timestamp in the request suffices. The other occurrences are useful if you want to know where a possible delay happened.

Filtering output

At this point, we have figured out how transactions work and which tags can occur in our log lines. When you run the varnishlog command, the unfiltered output tends to get very noise on a system that gets a reasonable amount of traffic.

The Asciicast below shows how it can quickly become a mess:

Let’s face it: most of the time, you don’t need to see all the data. You’ll filter the output, and only display the relevant tags for your use case.

Varnishlog offers 4 basic ways to filter your output:

Include tags

The most common type of filtering I do, is to include the URL of a request and its VCL behavior. The following command would do the trick for you:

varnishlog -i ReqUrl,VCL_call,VCL_return -g session

This could be the output you get:

 * << Session >> 252394
 ** << Request >> 252395
 -- ReqURL /
 -- VCL_call RECV
 -- VCL_return hash
 -- VCL_call HASH
 -- VCL_return lookup
 -- VCL_call HIT
 -- VCL_return deliver
 -- VCL_call DELIVER
 -- VCL_return deliver
 *** << Request >> 252397
 --- ReqURL /header
 --- VCL_call RECV
 --- VCL_return hash
 --- VCL_call HASH
 --- VCL_return lookup
 --- VCL_call HIT
 --- VCL_return deliver
 --- VCL_call DELIVER
 --- VCL_return deliver
 *** << Request >> 252399
 --- ReqURL /nav
 --- VCL_call RECV
 --- VCL_return hash
 --- VCL_call HASH
 --- VCL_return lookup
 --- VCL_call HIT
 --- VCL_return deliver
 --- VCL_call DELIVER
 --- VCL_return deliver
 *** << BeReq >> 252396
 --- VCL_call BACKEND_FETCH
 --- VCL_return fetch
 --- VCL_call BACKEND_RESPONSE
 --- VCL_return deliver
 **** << BeReq >> 252398
 ---- VCL_call BACKEND_FETCH
 ---- VCL_return fetch
 ---- VCL_call BACKEND_RESPONSE
 ---- VCL_return deliver
 **** << BeReq >> 252400
 ---- VCL_call BACKEND_FETCH
 ---- VCL_return fetch
 ---- VCL_call BACKEND_RESPONSE
 ---- VCL_return deliver

 

Including multiple tags can either be done by adding multiple -i parameter, or by passing a comma-separated list of values.

These values can also be globs. We could short the command above using globs:

varnishlog -i "ReqUrl,VCL_*" -g session

Exclude tags

We can do the exact opposite, and exclude tags we don’t care about. Let’s combine tag inclusion and exclusion in the following example:

varnishlog -i "Req*" -x ReqHeader,ReqUnset

In this example we include all tags that start with req, but we want to exclude the very verbose request headers, and potential unset actions for the request. This is our output:

* << Request >> 314125
- ReqStart 127.0.0.1 64585
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqAcct 476 0 476 311 0 311

* << Request >> 314126
- ReqStart 127.0.0.1 64585
- ReqMethod GET
- ReqURL /footer
- ReqProtocol HTTP/1.1
- ReqAcct 370 0 370 309 0 309

Makes sense, right?

Include tags by regular expression match

A header that occurs a lot in our output is the request header. Quite often you don’t want to have all the request headers in your output, just that one will do. The -I option allows you to include log lines by regular expression match.

Image you’re interested the Accept-Language value for a specific request, you can use the following command to get the URL and the language:

varnishlog -I "reqheader:Accept-Language" -i requrl

What we’re doing is including the ReqUrl tag, and including all ReqHeader tags that match Accept-Language.

This is the output:

* << Request >> 374378
- ReqURL /
- ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6

* << Request >> 374379
- ReqURL /footer
- ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6

Exclude tags by regular expression match

Similar deal here, but for exclusion. Image you want to see the HTTP response headers that will be returned to the client, but exclude the custom headers that start with an x- or X-.

The regular output, without the exclusion would be:

* << Request >> 374384
- ReqURL /footer
- RespHeader Host: localhost
- RespHeader X-Powered-By: PHP/7.0.15
- RespHeader Cache-Control: public, s-maxage=500
- RespHeader Date: Wed, 02 Aug 2017 11:27:21 GMT
- RespHeader ETag: "d47ac09f5351f8f4c97c99ef5b3d2ecd"
- RespHeader Vary: Accept-Language
- RespHeader Content-Length: 80
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader X-Varnish: 374384 15367
- RespHeader Age: 334
- RespHeader Via: 1.1 varnish-v4
- RespHeader Connection: keep-alive

Then we add the exclusion:

varnishlog -i RespHeader,ReqUrl -X "RespHeader:(x|X)-"

And this is the result:

* << Request >> 374384
- ReqURL /footer
- RespHeader Host: localhost
- RespHeader Cache-Control: public, s-maxage=500
- RespHeader Date: Wed, 02 Aug 2017 11:27:21 GMT
- RespHeader ETag: "d47ac09f5351f8f4c97c99ef5b3d2ecd"
- RespHeader Vary: Accept-Language
- RespHeader Content-Length: 80
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader Age: 334
- RespHeader Via: 1.1 varnish-v4
- RespHeader Connection: keep-alive

Yet again, we’re combining multiple filtering options to get the ideal output.

All-in-one tag filter example

Let’s finish this section with an all-in-one example that combines the 4 kinds of filters:

varnishlog -i "RespHeader,Req*" -X "RespHeader:(x|X)-" -I "timestamp:Resp" -x reqprotocol,reqacct -g request

This command does the following:

This is the output:

* << Request >> 59383
- ReqStart 127.0.0.1 53195
- ReqMethod GET
- ReqURL /
- ReqHeader Host: localhost
- ReqHeader Connection: keep-alive
- ReqHeader Cache-Control: max-age=0
- ReqHeader Upgrade-Insecure-Requests: 1
- ReqHeader User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_5) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.115 Safari/537.36
- ReqHeader Accept: text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,image/apng,*/*;q=0.8
- ReqHeader Accept-Encoding: gzip, deflate, br
- ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6
- ReqHeader If-None-Match: W/"27f341f8e459dd35f1087c55351cacda"
- ReqHeader X-Forwarded-For: 127.0.0.1
- ReqUnset Accept-Language: nl,en-US;q=0.8,en;q=0.6
- ReqHeader accept-language: nl
- ReqHeader Surrogate-Capability: key=ESI/1.0
- ReqUnset Accept-Encoding: gzip, deflate, br
- ReqHeader Accept-Encoding: gzip
- RespHeader Host: localhost
- RespHeader Cache-Control: public, s-maxage=500
- RespHeader Date: Wed, 02 Aug 2017 11:46:49 GMT
- RespHeader ETag: "27f341f8e459dd35f1087c55351cacda"
- RespHeader Vary: Accept-Language
- RespHeader Content-Length: 3098
- RespHeader Content-Type: text/html; charset=UTF-8
- RespHeader Age: 152
- RespHeader Via: 1.1 varnish-v4
- RespHeader ETag: W/"27f341f8e459dd35f1087c55351cacda"
- RespHeader Connection: keep-alive
- Timestamp Resp: 1501674561.472358 0.000068 0.000020

Powerful stuff!

VSL queries

So far we’ve been filtering the output, and only displaying log lines that match specific tags we’re interested in. The good thing is that we’re reducing the amount of output we’re getting by doing tag-based filtering.

But that still means all transactions are processed, and tags are stripped off. It doesn’t allow us to reduce the amount of transactions we’re getting.

That’s where VSL queries come into play.

The -q option gives you the possibility to query for transactions that match patterns and values. The VSL query language can do this for you.

This is the format for VSL queries:

<record selection criteria> <operator> <operand>

The record selection criteria can be tags, taglists, tag prefixes, or even fields for tags prefixes. You can even decide at which transaction level you wish to query.

This is the format for the record selection criteria:

{level}taglist:record-prefix[field]

Here’s a list of operators you can use for record selection comparison:

VSL queries can be combined using and, or, and not boolean functions

VSL query examples

Let’s throw some examples into the mix, to show you how it works:

varnishlog -i VCL_call,VCL_return -g request -q "ReqURL eq '/'"

This command returns the VCL_call and VCL_return values for the homepage.

This is the output:

* << Request >> 374400
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call HIT
- VCL_return deliver
- VCL_call DELIVER
- VCL_return deliver

We notice the homepage is served from cache, which is good news!

Another common example is displaying all URLs that aren’t stored in cache. They are either stored in the blacklist because they’re not cacheable, or they’re not yet in cache. The following command would display this information:

varnishlog -i ReqUrl -q "VCL_call eq 'MISS' or VCL_call eq 'PASS'"

A final example is a time measurement example:

varnishlog -i ReqUrl -I "Timestamp:Resp" -q "Timestamp:Resp[2] > 1.0"

This command displays the URL, and the response timestamp for all requests that take longer than 1 second. The query looks for the Timestamp tag, and takes the second column for any occurence that has Resp as the prefix. If that value exceeds 1 second, it gets reported.

Note that timestamp values are floats. Integers are invalid for this type of comparison

Miscellaneous command line options

Throught this article, a lot of useful varnishlog command line options were covered. There are other options you’ll probably never use, but I want to mention some of them anyway.

Getting the full memory log

By default we read the tail end of the log, because this contains the latest information. If you add the -d option, varnishlog will start at the beginning of the log. This means you’ll see older log items that are sent to stdout.

This could be useful to measure certain assumptions on a larger sample.

varnishlog -d

Writing the varnishlog output to a file

Varnishlog prints a continuously updated list of log items, or dumps the full log to stdout, depending on your output settings.

But varnishlog has the command line options to redirect the output to a file.

varnishlog -w file

The command above saves the varnishlog output to a file in binary format. This file can be replayed using the following command:

varnishlog -r file

You can also decide to append logs to the file by adding the -a option:

varnishlog -a -w file

And finally you can decide to store the output in plain text by adding the -A option:

varnishlog -A -a -w file

You can daemonize varnishlog to continuously log output to a file. You ‘ll add the -D option and you can save the PID to a file by adding the -P option. You can also add filtering criteria to continuously log items you care about:

varnishlog -i "ReqUrl,VCL_*" -D -a -A -w /var/log/varnish/custom.log -P /var/run/custom_varnishlog.pid

This command will make varnishlog run in the background and store human-readable logs in /var/log/varnish/custom.log and append data. The PID of the daemon is stored in /var/run/custom_varnishlog.pid.

Multiple instances

When you run multiple instances of Varnish on a server, the varnishd process will use the -n parameter to name the instance. Varnishlog can use that same option to identify the Varnish instances who’s shared memory log it should examine.

varnishlog -n myVarnishInstance

What about varnishtop?

All I’ve been talking about is varnishlog. It’s a great tool, but it just spits out logs. Varnishtop on the other hand, uses the Varnish Shared Memory Logs to create a continuously updated list of the most commonly occurring log entries.

It’s basically like the top command on Linux, but for Varnish.

It makes no sense what so ever to run varnishtop without filters. Here’s a command to see the most popular URLs that are processed by Varnish:

varnishtop -i ReqUrl

This is the output:

You can use all the filters and queries that varnishlog and create meaningful output.

Possible scenario: show the most popular browser languages for users that hit the homepage. We’d use the following command:

varnishtop -I reqheader:Accept-Language -q "ReqUrl eq '/'"

And this would be the output:

list length 4

0.86 ReqHeader Accept-Language: en-US
 0.80 ReqHeader Accept-Language: nl-NL,nl;q=0.8,en-US;q=0.6,en;q=0.4
 0.54 ReqHeader Accept-Language: nl,en-US;q=0.8,en;q=0.6
 0.39 ReqHeader Accept-Language: nl-BE

Why didn’t Varnish serve this request from cache?

90% of the times I use varnishlog, it’s to figure out why a page is not cached. The decision whether not to cache happens at 2 stages:

Each of these stages have their own set of criteria. The shared memory logs will help you figure out why Varnish didn’t cache your page.

This is the default behavior of the VCL in Varnish. If you extend the behavior with custom VCL scripting, you can change this rules and still decide to cache requests that were otherwise not cacheable.

The request is not cacheable

When Varnish receives an HTTP request from the client, it will check whether or not it is stateless. By default only stateless request are deemed cacheable.

Varnish will not cache if:

The following command can help us out in our quest to determine why the page is not cached:

varnishlog -i "Req*,VCL*" -x ReqAcct,ReqStart -q "VCL_call eq 'PASS'"

Cookies

Here’s some output for a non-cacheable request:

* << Request >> 12
- ReqMethod GET
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader User-Agent: curl/7.48.0
- ReqHeader Accept: */*
- ReqHeader Cookie: myCookie=bla
- ReqHeader X-Forwarded-For: 127.0.0.1
- VCL_call RECV
- VCL_return pass
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS
- VCL_return fetch
- VCL_call DELIVER
- VCL_return deliver

We’r specifically looking for requests that were passed to the backend. In the output we notice that a cookie was set. That’s why the page cannot be cached.

POST calls

Here’s anther scenario, where there are not cookies, but the request is still passed:

<< Request >> 32779
- ReqMethod POST
- ReqURL /
- ReqProtocol HTTP/1.1
- ReqHeader Host: localhost
- ReqHeader User-Agent: curl/7.48.0
- ReqHeader Accept: */*
- ReqHeader X-Forwarded-For: 127.0.0.1
- VCL_call RECV
- VCL_return pass
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS
- VCL_return fetch
- VCL_call DELIVER
- VCL_return deliver

Although there’s no cookie, the request is still passed to the backend. That’s because the request was sent using the POST method. Post calls aren’t cacheable, because they imply a state change.

Authorization headers

Here’s another example of a passed request without cookies and with a GET request:

* << Request >> 15
 - ReqMethod GET
 - ReqURL /
 - ReqProtocol HTTP/1.1
 - ReqHeader Host: localhost
 - ReqHeader Authorization: Basic dGhpanM6ZmVyeW4=
 - ReqHeader User-Agent: curl/7.48.0
 - ReqHeader Accept: */*
 - ReqHeader X-Forwarded-For: 127.0.0.1
 - VCL_call RECV
 - VCL_return pass
 - VCL_call HASH
 - VCL_return lookup
 - VCL_call PASS
 - VCL_return fetch
 - VCL_call DELIVER
 - VCL_return deliver

Again, the page is not cached. That’s because there’s an Authorization header. Authorization implies state, and is not cached by Varnish.

The response is not cacheable

When a request is considered cacheable, but the object is not found in cache, a backend fetch happens because of the miss. When the vcl_backend_response method receives the response from the backend, it examines the headers for non-cacheable values.

If it turns out that the response isn’t cacheable, Varnish will mark it as Hit-For-Pass for the next 2 minutes.

The following command lists URLs, VCL behavior, and backend response information for transactions that contain a HitPass log line.

varnishlog -i ReqUrl,VCL_*,Beresp*,TTL  -q "HitPass" -g request

Zero TTL

Imagine that this is the output:

* << Request >> 19010384
- ReqURL /my-url
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS
- VCL_return fetch
- VCL_call DELIVER
- VCL_return deliver
** << BeReq >> 19010385
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- BerespProtocol HTTP/1.1
-- BerespStatus 200
-- BerespReason OK
-- BerespHeader Date: Thu, 03 Aug 2017 08:15:22 GMT
-- BerespHeader Server: Apache/2.4.10 (Debian)
-- BerespHeader Last-Modified: Tue, 01 Aug 2017 07:21:00 GMT
-- BerespHeader ETag: "5c0d-555abfd3f422f-gzip"
-- BerespHeader Vary: Accept-Encoding
-- BerespHeader Content-Encoding: gzip
-- BerespHeader Cache-Control: max-age=0
-- BerespHeader Expires: Thu, 03 Aug 2017 08:15:22 GMT
-- BerespHeader Content-Length: 7686
-- BerespHeader Content-Type: application/json
-- TTL RFC 0 10 -1 1501748123 1501748123 1501748122 1501748122 0
-- VCL_call BACKEND_RESPONSE
-- TTL VCL 120 10 0 1501748123
-- VCL_return deliver

What does this mean? The HTTP response for the /my-url resource was not stored in cache, because it did not comply to the criteria that are validated in the vcl_backend_response method.

In our case the Cache-Control: max-age=0 header that was returned by the webserver causes Varnish not to cache the object. It is marked as Hit-For-Pass, and subsequent requests are immediately sent to the backend.

We can retrieve this information from the BerespHeader, but the last field of the first TTL occurrence also show that the max-age or s-maxage value is zero.

Setting cookies

Another reason could be that a cookie was set in the backend. Varnish doesn’t cache responses that set cookies, as they imply state changes.

Here’s some output:

* << Request >> 65551
- ReqURL /set-cookie
- VCL_call RECV
- VCL_return hash
- VCL_call HASH
- VCL_return lookup
- VCL_call PASS
- VCL_return fetch
- VCL_call DELIVER
- VCL_return deliver
** << BeReq >> 65552
-- VCL_call BACKEND_FETCH
-- VCL_return fetch
-- BerespProtocol HTTP/1.1
-- BerespStatus 200
-- BerespReason OK
-- BerespHeader Cache-control: s-maxage=10
-- BerespHeader Set-Cookie: myCookie=bla
-- BerespHeader Content-type: text/html; charset=UTF-8
-- BerespHeader Date: Thu, 03 Aug 2017 08:39:04 GMT
-- TTL RFC 10 10 -1 1501749545 1501749545 1501749544 0 10
-- VCL_call BACKEND_RESPONSE
-- TTL VCL 120 10 0 1501749545
-- VCL_return deliver

The first occurrence of the TTL tag shows that there is a valid max-age or s-maxage value. The value is 10, and this is also reflected by BerespHeader Cache-control: s-maxage=10. However, the Hit-For-Pass still occurs, and that’s because of the cookie that was set. This is reflected in the logs by BerespHeader Set-Cookie: myCookie=bla.

If you set the Time-To-Live to zero in your Cache-Control header, or you add private, no-cache, or no-store expressions to your Cache-Control header, the object will not be cached. This also applies to an Expires header that was set to a time in the past. The same thing happens when you set a cookie. These are the most common reasons why Varnish doesn’t cache objects where the request look like it was cacheable.

Summary

By now, you should realize that the Varnish Shared Memory Logs (VSL) are tremendously useful. You can hook into what’s currently happening on the server. Both the varnishlog and varnishtop binaries are at your service to provide the necessary insight.

Both utilities have filter options and a query language to display the exact information you need.

The Varnish documentation site has a couple of pages that are dedicated to this topic:

If you really want to leverage these tools, you should have a decent knowledge of the internal flow within Varnish, the default VCL behavior, and the VCL methods. Any custom VCL code that your Varnish instances uses, will also influence this behavior.

Read my book

If you want to learn more about Varnish, I suggest you read my book. It’s published by O’Reilly.

In this book I explain:

The post Varnishlog: measure your Varnish cache performance appeared first on Thijs Feryn.


« Build a Facebook Messenger chatbot in … - murze.be

Laravel News - Facebook Releases Yarn Version 1.0 »