Are you accidentally storing private data in plain text?

Frank Rietta — 04/29/2019

Debug logs that chronicle data about errors and other exceptions on a web application are a vital tool for any web company. They enable engineering teams to troubleshoot problems, sometimes even before a customer reports an issue to support, and thus provide excellent service to customers. But the danger of over-logging is real. When sensitive data is logged, it becomes vulnerable to misuse and abuse. In this article, I’ll show you how to prudently minimize the data collected in logs.

Facebook has recently been in the news for a security situation where millions of users' plain-text passwords were compromised by an ill-advised practice of logging credentials in debug logs that were accessible to hundreds of employees. While the media is focused on the company's negligence, it’s important to realize how easy it is to make this same mistake within your own applications. Learning from Facebook’s mistakes can help you avoid the same fate that a Fortune 500 company is still reeling from.

What Data to Filter

It is a poor practice to log sensitive data used to authenticate users or control access. This includes session identifiers, which, when compromised, would allow an attacker to masquerade as an authenticated customer. Users' passwords should never be logged because of the extreme danger presented to your users when this data is disclosed. That danger does not stop at your systems: many people reuse the same email and password across many websites, all of which may be compromised when your debug logs are breached.

For starters, you should always filter out any security-related secrets, including but not limited to:

Plain text passwords
Authentication Tokens
Session Secrets

Other data that may seem benign at first, like email addresses and full names, is legally protected private information under the EU General Data Protection Regulation (GDPR) and the California Consumer Privacy Act (CCPA). Duplicating it to your debug logs changes the scope of your information security program and hinders the efficient use of the logs for engineering purposes. The entire log becomes legally protected data, the misuse of which may be considered a data breach or a violation of data protection law.

Other privacy-centric data should be evaluated case by case. Unless a field is required for providing customer service and your company is ready to take on the legal risks under privacy laws that come with logging it, you should also filter it as part of a data minimization effort:

First Name
Last Name
E-mail
IP addresses
Other Personal Information like:
- Street address
- Phone numbers
- Social Security Numbers (SSN)
- Birth dates
- Message contents
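
Personal data can also slip into free-form log messages, not only into named fields. As a rough illustration, the plain-Ruby sketch below attaches a formatter to the standard library Logger that masks e-mail and IPv4 addresses before a line is written. The `redact` helper and both patterns are assumptions made for this example; the patterns are deliberately simplified and will not catch every real-world address format.

```ruby
require "logger"
require "stringio"

# Simplified patterns for illustration; real addresses are more varied.
EMAIL_PATTERN = /[\w.+-]+@[\w-]+(?:\.[\w-]+)+/
IPV4_PATTERN  = /\b(?:\d{1,3}\.){3}\d{1,3}\b/

# Hypothetical helper: masks e-mail and IPv4 addresses in a message.
def redact(message)
  message.to_s.gsub(EMAIL_PATTERN, "[EMAIL]").gsub(IPV4_PATTERN, "[IP]")
end

# Log to an in-memory buffer so the result is easy to inspect.
output = StringIO.new
logger = Logger.new(output)
logger.formatter = proc do |severity, _time, _progname, msg|
  "#{severity}: #{redact(msg)}\n"
end

logger.warn("Login failed for jane@example.com from 203.0.113.7")
puts output.string # prints "WARN: Login failed for [EMAIL] from [IP]"
```

Pattern-based masking like this is a safety net rather than a guarantee, so it works best alongside the field-level filtering described in the rest of this article.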

Example in Ruby on Rails

At Rietta, we regularly use Rollbar to collect and aggregate debug logs for our deployed Ruby on Rails applications. However, we take extra steps to ensure that sensitive data is not logged inappropriately.

1. Enhanced Parameter Filtering in Ruby on Rails

Ruby on Rails specifies a list of fields that should be masked in logs in a file named config/initializers/filter_parameter_logging.rb. When a field is listed in this file, it is masked in the Rails and ActiveRecord logs. The default file generated with a new Rails application simply lists password as being protected data:

# Be sure to restart your server when you modify this file.

# Configure sensitive parameters which will be filtered from the log file.
Rails.application.config.filter_parameters += [:password]

By default, Rails filters out only parameters whose names match password (the match is partial, so a field like password_confirmation is covered too). If you have fields holding user-chosen passwords under another name, like user_pass, they will be logged in plain text unless you add user_pass to the filter list.
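
The effect of a missing entry can be illustrated with a minimal plain-Ruby sketch. It does not use Rails itself: `mask_params` is a hypothetical helper that imitates the substring-style matching Rails applies to symbol and string entries in the filter list, and the parameter names and values are made up for illustration.

```ruby
# Hypothetical stand-in for Rails parameter filtering (illustration only).
# A key is masked when its name contains any entry in the filter list.
def mask_params(params, filters)
  params.each_with_object({}) do |(key, value), masked|
    sensitive = filters.any? { |f| key.to_s.downcase.include?(f.to_s.downcase) }
    masked[key] = sensitive ? "[FILTERED]" : value
  end
end

params = {
  "password" => "hunter2",
  "password_confirmation" => "hunter2",
  "user_pass" => "hunter2",
  "page" => "1"
}

default_only   = mask_params(params, [:password])
with_user_pass = mask_params(params, [:password, :user_pass])

puts default_only["password_confirmation"] # prints "[FILTERED]"
puts default_only["user_pass"]             # prints "hunter2" (leaked)
puts with_user_pass["user_pass"]           # prints "[FILTERED]"
```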

The logs of a typical Ruby on Rails application will collect plenty of other sensitive data, like names, emails, and authentication tokens. You should audit your codebase and add any fields you do not want collected to this list.
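
One way to start such an audit is to compare the parameter names your application accepts against the current filter list. The sketch below is plain Ruby under stated assumptions: `SENSITIVE_HINTS` and `unfiltered_sensitive_names` are hypothetical names, the hint list is only a coarse starting point that will produce false positives, and in a real application the candidate names would come from your own schema or permitted-parameter lists rather than a hard-coded array.

```ruby
# Hypothetical hints for names that often hold secrets or personal data.
SENSITIVE_HINTS = %w[pass token secret key email name ip phone ssn birth address].freeze

# Returns the candidate names that look sensitive but are not covered by
# any entry in the filter list (using substring-style matching).
def unfiltered_sensitive_names(candidates, filters)
  candidates.select do |name|
    looks_sensitive = SENSITIVE_HINTS.any? { |hint| name.downcase.include?(hint) }
    covered = filters.any? { |f| name.downcase.include?(f.to_s.downcase) }
    looks_sensitive && !covered
  end
end

candidates = %w[password user_pass email first_name api_key page sort]
puts unfiltered_sensitive_names(candidates, [:password]).inspect
# prints ["user_pass", "email", "first_name", "api_key"]
```

Each name the script reports is a prompt for a human decision: either add it to the filter list or confirm that it is safe to log.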

One of our applications specified at least 15 additional fields as protected data that we did not want to log so that our debug logs would not be unnecessarily sensitive. Its config/initializers/filter_parameter_logging.rb looked like this:

# Be sure to restart your server when you modify this file.

# Configure sensitive parameters which will be filtered from the log file.
Rails.application.config.filter_parameters += [
  :password,
  :access_token,
  :email,
  :code,
  :last_name,
  :first_name,
  :name,
  :current_sign_in_ip,
  :last_sign_in_ip,
  :ip_address,
  'X-User-Email',
  'X-User-Token',
  :token,
  :user_token,
  :user_email,
  :api_key,
  :hashed_token,
  :body           # Don't log messages typed by users in logs
]

2. Configure the Rollbar Agent to Filter the Same Parameters as Rails

Once the Ruby on Rails application filter parameters have been updated, we have to additionally configure Rollbar to scrub the same parameters. This is done in config/initializers/rollbar.rb like this:

######################################################
# These lines go inside the Rollbar.configure do |config| ... end block.
# Scrub sensitive data that is configured for not being logged.
# When you add additional filtered fields to `filter_parameter_logging.rb`,
# those will automatically be picked up by this.
config.scrub_headers |= Rails.application.config.filter_parameters
config.scrub_fields |= Rails.application.config.filter_parameters

A final step for GDPR compliance, and soon CCPA compliance, is to stop Rollbar from logging the first name, last name, and e-mail address of users. Without this, we have to disclose to users our use of Rollbar as a data processor and consider any misuse of the Rollbar logs by a developer as a potential data breach.

This is done by editing config/initializers/rollbar.rb like this:

config.person_method = "gdpr_compliant_current_user_for_rollbar"

The method gdpr_compliant_current_user_for_rollbar is a controller method in app/controllers/application_controller.rb that passes the current_user object from the controllers' context into a personal data sanitizer.

def gdpr_compliant_current_user_for_rollbar
  SanitizePersonalData.new(current_user)
end

The sanitizer is defined in app/presenters/sanitize_personal_data.rb like this:

# Provides a consistent wrapper for user objects to mask data in compliance with
# the EU GDPR. Used by Rollbar and security alerts.
class SanitizePersonalData
  def initialize(source)
    @source = source
  end

  def id
    return nil unless @source
    @source.id
  end

  def email
    return nil unless @source
    # Guard against users that have no email on record.
    email_domain = @source.email.to_s.split('@').last
    return nil unless email_domain
    "...@#{email_domain}"
  end

  def username
    return nil unless @source
    "User ID #{id}"
  end

  def to_s
    "#{username} #{email}"
  end

  def name
    username
  end

  def full_name
    username
  end

  def last_name
    nil
  end

  def first_name
    nil
  end

  def as_json(opts = {})
    return nil unless @source
    # ActiveModel's as_json returns string keys, so normalize the keys and
    # mask by string name; symbol lookups would never match.
    to_r = @source.as_json(opts).stringify_keys
    %w[email full_name name last_name first_name].each do |field|
      to_r[field] = public_send(field) if to_r[field]
    end
    to_r
  end

end

The result of this is that individual users' given names and email addresses are excluded from the debug logs that we collect and aggregate on Rollbar for development support purposes. Instead, what we see is that “User ID 1337” with an email “…@example.com” had an issue. Developers are able to troubleshoot the problem without being unnecessarily exposed to customers' protected private data.
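
To see that output concretely, the following sketch exercises a trimmed copy of the sanitizer outside Rails. `FakeUser` is a hypothetical Struct standing in for a real user model, the sample address is made up, and only the id, email, username, and to_s methods of the class above are reproduced.

```ruby
# Hypothetical stand-in for a user record (illustration only).
FakeUser = Struct.new(:id, :email)

# Trimmed copy of the sanitizer, limited to the methods used here.
class SanitizePersonalData
  def initialize(source)
    @source = source
  end

  def id
    @source&.id
  end

  def email
    return nil unless @source
    "...@#{@source.email.split('@').last}"
  end

  def username
    return nil unless @source
    "User ID #{id}"
  end

  def to_s
    "#{username} #{email}"
  end
end

person = SanitizePersonalData.new(FakeUser.new(1337, "jane.doe@example.com"))
puts person.to_s # prints "User ID 1337 ...@example.com"
```

The local part of the address never reaches the log, while the user ID still lets a developer look up the account through the application's normal, access-controlled channels.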

Conclusion

When your debug logs contain plain-text passwords, authentication tokens, and other customer private data, your company is exposed to significant risk when that data is compromised, misused, or abused. Even if the data is normally accessible only to your own employees, it is best to minimize the collection of protected data so that you do not have to treat your aggregate debug logs as protected, security-sensitive data. This way your developers can use the logs for their intended purpose, fixing bugs and providing excellent customer service, without being exposed to customers' private data in production. Finally, I showed a specific example with code on how to easily accomplish this filtering in a standard Ruby on Rails application that additionally logs its debug data to Rollbar. The same pattern will apply with other log aggregators.