Uniqueness Validation Race Condition in Ruby on Rails applications
Are you a practicing Ruby on Rails developer? It doesn’t matter if you are called a junior developer, senior developer, or the janitor. It is surprisingly easy for race conditions to slip into your code and out into production. Some of these can lead to annoying duplicate e-mails in your database or they could lead to serious security issues that impact your company’s bottom line.
As you read on, I’m going to teach you a bit about race conditions, also called hazards in some engineering circles, and give you a practical example of how one can slip into a Rails application if you enforce uniqueness only within the application’s models, with validates :field_name, uniqueness: true, rather than through database constraints.
Before we begin, I do want to remind you about one thing. Preventing race conditions is not something that can simply be built into Ruby on Rails, because automatically detecting race conditions is an NP-hard problem in computer science. That’s why it’s so important that you learn to spot the situations where they may occur, so that you stand a better chance of leaving them out of your next deploy.
The idea for this article came from an apt observation by José Valim about the security implications of underuse of the database for data integrity enforcement:
It is a pity how small applications are vulnerable to similar race conditions due to underuse of the database: http://t.co/iNpGS2DKfD
— José Valim (@josevalim) April 27, 2015
What’s a Race Condition?
Wikipedia has a good, technically accurate definition of race condition:
A race condition or race hazard is the behavior of an electronic, software or other system where the output is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when events do not happen in the order the programmer intended. The term originates with the idea of two signals racing each other to influence the output first.
Race conditions have been known to computer science for decades, but detecting them automatically is computationally infeasible in general, which is why you as a programmer need to know how to spot them yourself. As Steve Carr, Jean Mayo, and Ching-Kuang Shene explain in Race Conditions: A Case Study (2001):
…since detecting race conditions is an NP-hard problem, no software is able to pinpoint exactly those race conditions in a program. Consequently, students frequently have difficulty in finding potential, especially subtle, race conditions in their programs…
If interested, you would be well served by digging deeper into the computer science details, but I will make this easier for you as a practicing Ruby on Rails developer. To that end, please understand three things:
- Race conditions can exist any time two or more processes concurrently access the same shared resource, such as a file, a database table, or a mutable variable accessible by multiple threads in the absence of a mutex
- Web applications are always concurrent, with each request being served independently of the others
- Always assume that two or more requests for an action will arrive simultaneously
  - In the best case this could be a simple ‘double-tap’ situation where a user clicks submit twice in rapid succession
  - In the worst case, it will be a willful manipulation by a malicious adversary who is about to compromise the security of your application
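To make the first bullet concrete, here is a minimal sketch in plain Ruby (no Rails) of a check-then-act step guarded by a Mutex. The OneTimeClaim class and its claim method are hypothetical names invented for this illustration. The point is only that the check and the act happen inside one critical section, so no second thread can slip in between them.

```ruby
# Hypothetical one-time claim guarded by a Mutex (illustrative names only).
class OneTimeClaim
  def initialize
    @claimed = false
    @lock = Mutex.new
  end

  # Returns true for exactly one caller and false for every other caller.
  def claim
    @lock.synchronize do
      return false if @claimed # check
      @claimed = true          # act
      true
    end
  end
end

coupon_claim = OneTimeClaim.new

# Twenty "simultaneous" attempts, like a burst of double-taps.
results = Array.new(20) { Thread.new { coupon_claim.claim } }.map(&:value)
winners = results.count(true)
puts "winners: #{winners}"
```

A Mutex only protects threads inside one Ruby process. Requests served by separate processes or separate servers share nothing but the database, which is why the rest of this article pushes the guarantee down to the database.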
Practical example of a Race Condition in a typical Ruby on Rails application
When two or more requests come in within a few milliseconds of each other, which is to be expected in a web application, the following code has a race condition.
Our example is a simple data model with an Account and an AppliedCoupon model, where Account has_many :applied_coupons and each coupon code can only be applied to a particular account once.
The tests
applied_coupons_spec.rb
require 'rails_helper'

RSpec.describe AppliedCoupon do
  before(:each) do
    @account = Account.find_or_create_by(name: 'Test Account')
  end

  it 'A coupon code must be 10 digits long' do
    a = AppliedCoupon.new(code: 'abc')
    expect(a).to be_invalid
    expect(a.errors.messages[:code]).to eq ["is the wrong length (should be 10 characters)"]
  end

  it 'A 10 digit coupon code is considered valid for this example' do
    a = @account.applied_coupons.new(code: 'abc1234567')
    expect(a).to be_valid
  end

  it 'A coupon can only be applied to an account once' do
    # First instance, A
    a = @account.applied_coupons.new(code: 'abc1234567')

    # Second instance, B
    b = @account.applied_coupons.new(code: 'abc1234567')
    # Both A and B are valid because neither has been saved yet
    expect(a).to be_valid
    expect(b).to be_valid

    # So we save A
    expect do
      a.save!
    end.to_not raise_error

    # And B is no longer valid
    expect do
      b.save!
    end.to raise_error ActiveRecord::RecordInvalid
    expect(b).to be_invalid
    expect(b.errors.messages[:code]).to eq ["has already been taken"]

    # TODO: An exercise for the reader. What would happen if a.save! and b.save! were to be called
    # simultaneously?
  end
end
The migrations
20150504152815_create_accounts.rb
class CreateAccounts < ActiveRecord::Migration
  def change
    create_table :accounts do |t|
      t.string :name
      t.timestamps null: false
    end
  end
end
20150504152917_create_applied_coupons.rb
class CreateAppliedCoupons < ActiveRecord::Migration
  def change
    create_table :applied_coupons do |t|
      t.integer :account_id, null: false
      t.string :code
      t.timestamps null: false
    end
    add_index :applied_coupons, :account_id
    add_foreign_key :applied_coupons, :accounts
  end
end
The models
account.rb
# == Schema Information
#
# Table name: accounts
#
#  id         :integer          not null, primary key
#  name       :string
#  created_at :datetime         not null
#  updated_at :datetime         not null
#
class Account < ActiveRecord::Base
  has_many :applied_coupons, dependent: :destroy
end
applied_coupon.rb
# == Schema Information
#
# Table name: applied_coupons
#
#  id         :integer          not null, primary key
#  account_id :integer          not null
#  code       :string
#  created_at :datetime         not null
#  updated_at :datetime         not null
#
class AppliedCoupon < ActiveRecord::Base
  belongs_to :account
  # A coupon must be applied to an account and since the database will raise an
  # exception, let's have a validation here to give a friendlier error message
  validates :account_id, presence: true
  # Ensure that the code is present, is exactly 10 characters long, and that it
  # has only been applied to a given account once.
  validates :code,
            presence: true,
            length: { is: 10 },
            uniqueness: { scope: :account_id }
end
Now let’s run the test suite to see if we can deploy
From the project directory, I run:
rspec spec/models/applied_coupons_spec.rb
And see the results:
Finished in 0.04268 seconds (files took 1.52 seconds to load)
3 examples, 0 failures
So all should be good! Right? Well, not exactly. Our test suite runs in a single thread, but a deployed copy of this web application will serve many requests concurrently, across multiple threads, multiple processes, or both. Go back and look at the uniqueness: { scope: :account_id } validation in applied_coupon.rb and then the ‘A coupon can only be applied to an account once’ test in applied_coupons_spec.rb.
What happens if a and b are both saved simultaneously? We already saw that both were valid at the two expect(...).to be_valid statements in that test.
Somewhere in the ActiveRecord implementation is code that does something like this bit of pseudocode:
def save
  if valid?
    # Generate SQL code and send it to the database connection
  else
    # Return false or raise an error, sending no SQL code to the database
  end
end
The race condition exists because there is time between the if valid? check and the point where the SQL is sent to the database. The uniqueness validation is itself just a SELECT query that looks for an existing row, so a second, independent process can pass its own if valid? check after the first process has passed, but before the first process has sent its INSERT. The two processes are racing each other through the system to influence the output to the database first. Without a unique constraint, the database itself will happily store two records with the same account_id and code, so two identical coupons can be applied to the same account!
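The unlucky interleaving can be reproduced deterministically outside Rails. What follows is only a sketch under stated assumptions: NaiveTable is a hypothetical in-memory stand-in for a table with no unique index, naive_save mimics the validate-then-write shape of the pseudocode above, and the small Barrier class exists purely to force both threads past the check before either one writes. None of these names come from ActiveRecord.

```ruby
# Minimal barrier: every caller blocks until `parties` callers have arrived.
class Barrier
  def initialize(parties)
    @waiting_for = parties
    @lock = Mutex.new
    @released = ConditionVariable.new
  end

  def wait
    @lock.synchronize do
      @waiting_for -= 1
      if @waiting_for.zero?
        @released.broadcast
      else
        @released.wait(@lock) until @waiting_for.zero?
      end
    end
  end
end

# Hypothetical stand-in for a table WITHOUT a unique index.
class NaiveTable
  attr_reader :rows

  def initialize
    @rows = []
  end

  def exists?(row)
    @rows.include?(row)
  end

  def insert(row)
    @rows << row
  end
end

# Same shape as the save pseudocode: validate first, then write.
def naive_save(table, row, barrier)
  return false if table.exists?(row) # the uniqueness "validation"
  barrier.wait                       # both threads pause here, after the check
  table.insert(row)                  # the "INSERT"
  true
end

table = NaiveTable.new
barrier = Barrier.new(2)
row = { account_id: 1, code: 'abc1234567' }

saved = Array.new(2) { Thread.new { naive_save(table, row, barrier) } }.map(&:value)
puts "saved: #{saved.inspect}, rows stored: #{table.rows.size}"
```

Both calls report success and the table ends up holding the same coupon twice. In production there is no barrier, so the window is only a few milliseconds wide, but an attacker firing requests in parallel can hit it on purpose.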
The migration that fixes this race condition at the database level
20150504163550_add_unique_index_to_applied_coupons.rb
class AddUniqueIndexToAppliedCoupons < ActiveRecord::Migration
  def change
    # Have the database raise an exception anytime any process tries to
    # submit a record that has a code duplicated for any particular account
    add_index :applied_coupons, [:account_id, :code], unique: true
  end
end
And with this, the database will enforce the account data integrity by raising an exception any time a second record would be saved that duplicates a code for any given account. The validation shows a friendly error message in the common case, and the database enforces the rule when two requests race each other.
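When the race does happen, the losing request gets a database exception instead of a validation error. In Rails that exception is ActiveRecord::RecordNotUnique, and the application should rescue it and report the same friendly message. The sketch below shows the idea in plain Ruby under stated assumptions: UniqueTable, DuplicateRow, guarded_save, and Barrier are hypothetical stand-ins written for this illustration, with DuplicateRow playing the role of ActiveRecord::RecordNotUnique.

```ruby
# Stand-in for the exception a unique index raises (hypothetical name).
class DuplicateRow < StandardError; end

# Minimal barrier used only to force both threads past the check together.
class Barrier
  def initialize(parties)
    @waiting_for = parties
    @lock = Mutex.new
    @released = ConditionVariable.new
  end

  def wait
    @lock.synchronize do
      @waiting_for -= 1
      if @waiting_for.zero?
        @released.broadcast
      else
        @released.wait(@lock) until @waiting_for.zero?
      end
    end
  end
end

# Hypothetical stand-in for a table WITH a unique index: the duplicate check
# and the write happen atomically inside insert.
class UniqueTable
  attr_reader :rows

  def initialize
    @rows = []
    @lock = Mutex.new
  end

  def exists?(row)
    @lock.synchronize { @rows.include?(row) }
  end

  def insert(row)
    @lock.synchronize do
      raise DuplicateRow, 'duplicate key' if @rows.include?(row)
      @rows << row
    end
  end
end

# Validate for the friendly message, then rely on the "index" for correctness.
def guarded_save(table, row, barrier)
  return [false, 'has already been taken'] if table.exists?(row)
  barrier.wait # force the race, as in the worst case
  table.insert(row)
  [true, nil]
rescue DuplicateRow
  [false, 'has already been taken']
end

table = UniqueTable.new
barrier = Barrier.new(2)
row = { account_id: 1, code: 'abc1234567' }

results = Array.new(2) { Thread.new { guarded_save(table, row, barrier) } }.map(&:value)
successes = results.count { |ok, _message| ok }
puts "successes: #{successes}, rows stored: #{table.rows.size}"
```

Exactly one request wins, one row is stored, and the loser still sees "has already been taken", this time produced by the rescue rather than by the validation.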
In conclusion
When the database is the source of truth for your application, enforcing data integrity at the database level with constraints is important. As I’ve shown, there are situations where validations will pass even though the end result, once persisted, is invalid. In the worst case, this is a serious security flaw, and even in the best case it will lead to data integrity issues. The easy solution is to add a unique index on the combination of the model’s foreign key plus the coupon code field, so that the database server rejects duplicates itself. Unlike an application-level validation, the database checks a unique index atomically as part of each write, no matter how many processes are writing at once.