Use the Logs: Tableau Server Log Analytics – Part 2

Week 2: Apache


I believe the Apache/httpd log is the most interesting of the Tableau Server log files because of the depth of information it provides. What’s more, with the Unique ID at the end of each entry, you can tie this to other, more verbose, logs (for that link, you’ll need to wait until next week when I talk about VizQL).

Alerts built off of the Apache log

Thankfully, the Apache log uses the Common Log Format and, with that, you have a standard way to parse this text file. Tableau, in a testament to its vision, has added a few bits to the end of this standard. The three most important are:

  • Content-Length
  • %D (this is the best part and it is, according to the docs, ‘The time taken to serve the request, in microseconds.’)
  • Unique_ID

If you want to see the configuration, open the httpd.conf file and look for the bits that talk about ‘mod_log_config’.
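To make those three fields a bit more concrete, here is a minimal Python sketch that pulls them off the end of a log line. The layout it assumes (quoted request, a quoted forwarded-for field, status, bytes, quoted Content-Length, %D, Unique ID) is only inferred from the queries used in this post, and the sample line is made up, so check the LogFormat in your own httpd.conf before relying on it.

```python
import re

# Assumed tail of a Tableau Server Apache access-log line (verify against
# the LogFormat in your httpd.conf):
#   "<request>" "<forwarded-for>" <status> <bytes> "<content-length>" <%D> <unique id>
TAIL = re.compile(
    r'"(?P<request>[^"]*)" "(?P<forwarded_for>[^"]*)" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-) '
    r'"(?P<content_length>[^"]*)" (?P<tts_us>\d+) (?P<unique_id>\S+)\s*$'
)


def parse_tail(line):
    """Return the trailing fields of one log line as a dict, or None."""
    match = TAIL.search(line)
    if match is None:
        return None
    fields = match.groupdict()
    fields["status"] = int(fields["status"])
    fields["tts_us"] = int(fields["tts_us"])  # %D is in microseconds
    return fields


# Hypothetical sample line, for illustration only.
sample = (
    '10.0.0.5 - - 2016-03-01 10:15:00.123 "GET /views/Sales/Overview HTTP/1.1" '
    '"-" 200 5120 "0" 1523000 VtW1aAoAAAEAAB9xAAAAAQ'
)
print(parse_tail(sample))
```

The `unique_id` value is the piece you would carry over when joining against the other server logs.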

Additions to the Apache CLF

Having the time to serve a request gives us the ability to understand Tableau Server performance and, in one query, get the average time it takes to serve a request over the last {n} days:

where(/HTTP\/1\.1" "-" \d{3} \d([0-9]|1[0-9]) "\d([0-9]|1[0-9])" (?P<TTS>\d*)/) calculate(average:TTS)
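If you’d rather sanity-check that number outside your log tool, the same calculation is easy to sketch in Python. This is only an approximation of the query above: the regex is a loosened version of it, the function name `average_tts_seconds` is mine, and the sample lines are invented.

```python
import re
from statistics import mean

# Loosened version of the query's pattern: status, bytes, quoted
# Content-Length, then the time to serve (%D, in microseconds).
TTS = re.compile(r'HTTP/1\.1" "-" \d{3} (?:\d+|-) "[^"]*" (?P<TTS>\d+)')


def average_tts_seconds(lines):
    """Average time to serve, in seconds, over the lines that match."""
    values = [int(m.group("TTS")) for m in map(TTS.search, lines) if m]
    return mean(values) / 1_000_000 if values else None


# Made-up lines for illustration.
lines = [
    '10.0.0.5 - - "GET /a HTTP/1.1" "-" 200 512 "0" 1000000 idA',
    '10.0.0.6 - - "GET /b HTTP/1.1" "-" 200 2048 "0" 3000000 idB',
    'a line that does not match',
]
print(average_tts_seconds(lines))  # 2.0
```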

Daily performance summary

Where’s the Log

From the Tableau Server admin guide:

The Tableau Server log directory is C:\ProgramData\Tableau\Tableau Server\data\tabsvc\logs if you installed Tableau Server on drive C, unless otherwise noted in the table below.

Apache, then, is in the httpd folder. You’ll want to follow the *.log files and (for this one) the error.log as well.

What does it do for me?

Again, according to the docs, ‘Apache logs. Look here for authentication entries. Each request in the Apache log will have a request ID associated with it. This request ID is used throughout the server logs and you can use it to associate log entries with a request.’

But there’s so much more. Some things (but not all) we watch for:

  • HTTP status (200, 404, 500, etc.)
  • IP
  • DateTime
  • Request
  • Request Type (GET/POST, etc.)
  • Request ID (mentioned above)
  • Time to Serve request
  • CSV downloads
  • Login attempts
  • 404/500 (the bad ones)

For example, if you’re using a log analytics tool, you might be able to run a query like this:

where(/HTTP\/1\.1" "-" (?P<status>\d{3})/ AND status!=200) groupby(status) calculate(count) sort(desc)

http statuses over {n} days

With the above query (which we run on Logentries), we get a count of every non-200 http status code and, hopefully, there aren’t many of them.
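For anyone without a log analytics tool handy, a rough Python equivalent of that status query might look like the sketch below. The regex mirrors the one in the query, while the function name `non_200_counts` and the sample lines are assumptions made for the example.

```python
import re
from collections import Counter

# Same anchor as the query: the status follows the request and a "-" field.
STATUS = re.compile(r'HTTP/1\.1" "-" (?P<status>\d{3})')


def non_200_counts(lines):
    """Count non-200 statuses, most frequent first."""
    counts = Counter(
        m.group("status")
        for m in map(STATUS.search, lines)
        if m and m.group("status") != "200"
    )
    return counts.most_common()


# Made-up lines for illustration.
lines = [
    '10.0.0.5 - - "GET /a HTTP/1.1" "-" 200 512 "0" 1000 idA',
    '10.0.0.5 - - "GET /missing HTTP/1.1" "-" 404 0 "0" 900 idB',
    '10.0.0.6 - - "GET /gone HTTP/1.1" "-" 404 0 "0" 800 idC',
    '10.0.0.7 - - "POST /b HTTP/1.1" "-" 500 0 "12" 700 idD',
]
print(non_200_counts(lines))  # [('404', 2), ('500', 1)]
```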

Keep in mind, this is also where alerting and tagging in your log analytics tool come in: if a type of request, say 404s, climbs over a threshold of {n}, you want to know about it. Conversely, you’d also want to be made aware of inactivity (or anomalies) in the logs. For example, if a log is normally pretty chatty, you want to know ASAP if it has gone silent.
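Both kinds of alert, too much of something and nothing at all, boil down to a couple of comparisons. Here is a minimal sketch in Python, assuming the log has already been parsed into (timestamp, status) pairs; the function name `check_alerts`, its parameters, and the thresholds are all hypothetical and not tied to any particular tool.

```python
from datetime import datetime, timedelta


def check_alerts(events, now, window, max_404s, max_silence):
    """events: list of (datetime, status) pairs, oldest first."""
    alerts = []
    # Threshold alert: too many 404s inside the look-back window.
    recent_404s = sum(
        1 for ts, status in events if status == 404 and now - ts <= window
    )
    if recent_404s > max_404s:
        alerts.append(f"{recent_404s} 404s in the last {window}")
    # Inactivity alert: nothing has been logged for longer than max_silence.
    last_seen = events[-1][0] if events else None
    if last_seen is None or now - last_seen > max_silence:
        alerts.append("log has gone silent")
    return alerts


# Made-up events for illustration.
events = [
    (datetime(2016, 3, 1, 11, 55), 404),
    (datetime(2016, 3, 1, 11, 56), 404),
    (datetime(2016, 3, 1, 11, 57), 404),
    (datetime(2016, 3, 1, 11, 58), 200),
]
print(check_alerts(events, datetime(2016, 3, 1, 12, 0),
                   timedelta(minutes=10), 2, timedelta(minutes=5)))
```

Run against the same events half an hour later, the 404s fall out of the window and only the silence alert fires.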

If you’re really feeling up to it, you can leverage regular expression field extraction and make some custom groupings out of the text file.

In this example, let’s say you wanted to do a quick check on the 90th percentile of the time to serve (NOTE: this is in microseconds) just by using a REST API to query your logs. Here’s what you send with the API call:

where(/HTTP\/1\.1" "-" \d{3} \d([0-9]|1[0-9]) "\d([0-9]|1[0-9])" (?P<TTS>\d*)/) calculate(percentile(90):TTS) timeslice(60)
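To see what that percentile is actually measuring, here is a small Python sketch that computes it straight from log lines. It uses the nearest-rank method, which may not be the exact method your log tool uses, it leaves out the timeslice part of the query, and the function names and sample lines are assumptions for the example.

```python
import math
import re

# Loosened version of the query's pattern; TTS is %D, in microseconds.
TTS = re.compile(r'HTTP/1\.1" "-" \d{3} (?:\d+|-) "[^"]*" (?P<TTS>\d+)')


def percentile(values, pct):
    """Nearest-rank percentile; returns None for an empty list."""
    if not values:
        return None
    ordered = sorted(values)
    rank = max(1, math.ceil(pct * len(ordered) / 100))
    return ordered[rank - 1]


def tts_percentile(lines, pct=90):
    """Percentile of the time to serve across the lines that match."""
    values = [int(m.group("TTS")) for m in map(TTS.search, lines) if m]
    return percentile(values, pct)


# Ten made-up lines with times to serve of 0.1s, 0.2s, ... 1.0s.
lines = [
    f'10.0.0.5 - - "GET /v{i} HTTP/1.1" "-" 200 512 "0" {i * 100000} id{i}'
    for i in range(1, 11)
]
print(tts_percentile(lines))  # 900000
```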

You can send these as alerts to users via email and/or Slack as they happen. Or you can make this part of a performance summary (via a Tableau Dashboard) of your Tableau infrastructure. I mentioned a more robust solution a while back. You can read about it here. We automatically parse the Apache log for the 200s and then use an IP lookup tool to geo-locate possible performance issues with our Tableau infrastructure.

Custom data model we have for Server performance

What should I be Alerted on?

Don’t forget: with too many wrong alerts, you miss the right ones.

For the Apache logs, we alert on these bits:

  • Time to serve > 5 seconds
  • Excessive ‘Auto-Refresh’ events
  • Excessive CSV downloads
  • 404s > a rolling average
Same alerts/tags for an Apache log
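Two of the rules in that list, the 5-second limit and the rolling average for 404s, can be sketched in a few lines of Python. This is only an illustration of the comparisons involved, not how any log tool implements them; the function names, the 7-day window, and the sample data are all assumptions.

```python
from statistics import mean

SLOW_US = 5_000_000  # 5 seconds, expressed in microseconds (%D)


def slow_requests(requests, limit_us=SLOW_US):
    """requests: list of (path, tts_us) pairs; keep those over the limit."""
    return [(path, tts) for path, tts in requests if tts > limit_us]


def above_rolling_average(counts, window=7):
    """True if the latest count exceeds the mean of the `window` counts before it."""
    if len(counts) < 2:
        return False
    baseline = counts[-(window + 1):-1]
    return counts[-1] > mean(baseline)


# Made-up data for illustration.
requests = [("/views/a", 1_200_000), ("/views/b", 7_500_000)]
daily_404s = [3, 4, 5, 4, 3, 4, 5, 12]

print(slow_requests(requests))            # [('/views/b', 7500000)]
print(above_rolling_average(daily_404s))  # True
```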

As mentioned above, I believe the Apache log is the most useful log for gauging a lot of critical Tableau events, not the least of which are access to and performance of views/workbooks/data sources.

