Are you a practicing Ruby on Rails developer? It doesn’t matter if you are called a junior developer, senior developer, or the janitor. It is surprisingly easy for race conditions to slip into your code and out into production. Some of these can lead to annoying duplicate e-mails in your database or they could lead to serious security issues that impact your company’s bottom line.
As you read on, I’m going to teach you a bit about race conditions, also called hazards in some engineering circles, and give you a practical example of how one can slip into a Rails application if you were to choose to enforce validation constraints only within an application’s models, with a validates :field_name, uniqueness: true, rather than through database constraints.
Before we begin, I do want to remind you about one thing. Preventing race conditions is not something that can simply be added to Ruby on Rails, because automatically detecting race conditions is an NP-hard problem in computer science. That’s why it’s so important that you understand how to spot the situations where they may occur, so that you stand a better chance of leaving them out of your next deploy.
The idea for this article came from an apt observation by José Valim about the security implications of underuse of the database for data integrity enforcement:
It is a pity how small applications are vulnerable to similar race conditions due to underuse of the database: http://t.co/iNpGS2DKfD— José Valim (@josevalim) April 27, 2015
What’s a Race Condition?
Wikipedia has a good, technically accurate definition of race condition:
A race condition or race hazard is the behavior of an electronic, software or other system where the output is dependent on the sequence or timing of other uncontrollable events. It becomes a bug when events do not happen in the order the programmer intended. The term originates with the idea of two signals racing each other to influence the output first.
Race conditions have been known to computer science for decades, but detecting them automatically is computationally infeasible, which is why you as a programmer need to know how to spot them yourself. As Steve Carr, Jean Mayo, and Ching-Kuang Shene explain in Race Conditions: A Case Study (2001):
…since detecting race conditions is an NP-hard problem, no software is able to pinpoint exactly those race conditions in a program. Consequently, students frequently have difficulty in finding potential, especially subtle, race conditions in their programs…
If interested, you would be well served by digging deeper into the computer science details, but I will make this easier for you as a practicing Ruby on Rails developer. To that end, please understand three things:
- Race conditions exist anytime two or more processes concurrently access the same shared resource, such as a file, a database table, or a mutable variable accessible by multiple threads in the absence of a mutex
- Web applications are always concurrent, with each request being served independently of the others
- Always assume that two or more requests for an action will arrive simultaneously
  - In the best case, this could be a simple ‘double-tap’ situation where a user clicks submit twice in rapid succession
  - In the worst case, it will be willful manipulation by a malicious adversary who is about to compromise the security of your application
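The first point can be made concrete with a small sketch in plain Ruby, with no Rails involved. The method names apply_unguarded and apply_guarded and the "SAVE10" code are hypothetical and exist only for illustration. The two queues act as a barrier that forces both threads to finish their check before either one writes, so the unlucky interleaving happens on every run instead of once in a while.

```ruby
# Hypothetical illustration: two threads doing check-then-act on a shared array.

# Unguarded: each thread checks, then waits at a barrier until the other thread
# has also checked, and only then writes. Both checks see an empty array.
def apply_unguarded(code)
  applied = []
  checked = Queue.new   # threads report "I have finished my check"
  release = Queue.new   # main thread lets the writes proceed

  threads = 2.times.map do
    Thread.new do
      duplicate = applied.include?(code)  # check
      checked << true
      release.pop                         # the gap between check and act
      applied << code unless duplicate    # act
    end
  end

  2.times { checked.pop }      # wait until both threads have checked
  2.times { release << true }  # now let both threads write
  threads.each(&:join)
  applied
end

# Guarded: the check and the write happen inside one Mutex#synchronize block,
# so the second thread cannot check until the first has written.
def apply_guarded(code)
  applied = []
  lock = Mutex.new

  threads = 2.times.map do
    Thread.new do
      lock.synchronize do
        applied << code unless applied.include?(code)
      end
    end
  end

  threads.each(&:join)
  applied
end

p apply_unguarded("SAVE10").length  # 2: the same code was stored twice
p apply_guarded("SAVE10").length    # 1
```

A mutex only helps threads inside one process. Separate web server processes share nothing but the database, which is why the rest of this article looks there for the fix.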
Practical example of a Race Condition in a typical Ruby on Rails application
When two or more requests come in within a few milliseconds of each other, which is to be expected in a web application, the following code has a race condition.
Our example is a simple data model with an Account and an AppliedCoupon model, where Account has_many :applied_coupons and each coupon code can only be applied to a particular account once.
Now let’s run the test suite to see if we can deploy
From the project directory, I run:
And see the results:
So all should be good! Right? Well, not exactly. Our test suite is single-threaded, but a deployed copy of this web application is going to be multi-threaded. Go back and look at the validation on line 21 of applied_coupon.rb and then at the ‘A coupon can only be applied to an account once’ test in the spec. What happens if both records in that test, b included, could be saved simultaneously? We already saw that they were both valid at the expect statements on lines 26 and 27.
Somewhere in the ActiveRecord implementation is code that does something like this bit of pseudocode:
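A minimal sketch of that sequence in plain Ruby, assuming a simple array of hashes stands in for the applied_coupons table. The TABLE constant and the valid?, insert, and save methods are hypothetical stand-ins rather than ActiveRecord internals. The two steps are written separately so that the interleaving of two requests can be replayed by hand.

```ruby
# Hypothetical stand-in for the applied_coupons table (no unique index).
TABLE = []

# Step 1: the uniqueness validation, roughly a SELECT looking for a duplicate.
def valid?(record)
  TABLE.none? do |row|
    row[:account_id] == record[:account_id] && row[:code] == record[:code]
  end
end

# Step 2: the INSERT. Nothing at this level re-checks for duplicates.
def insert(record)
  TABLE << record
end

# What save does, in spirit: validate first, then send the SQL.
def save(record)
  return false unless valid?(record)
  # <-- a second process can run its own valid? check right here
  insert(record)
  true
end

a = { account_id: 1, code: "SAVE10" }
b = { account_id: 1, code: "SAVE10" }

# Replay the unlucky interleaving of two requests by hand:
a_ok = valid?(a)   # request A validates: no duplicate yet
b_ok = valid?(b)   # request B validates before A has inserted
insert(a) if a_ok
insert(b) if b_ok

p TABLE.length  # 2: both rows were stored
p save(b)       # false: run one after the other, the validation does catch it
```

The last two lines show why a single-threaded test suite never notices: the validation works whenever the requests run one after the other.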
The race condition exists because there is time between the if valid? and the send SQL to the database bits. A second, independent process can pass its own if valid? check after the first process has passed it, but before the first process sends its SQL. The two processes are racing each other through the system to influence the output to the database first. Because the database itself will happily store two records with the same account_id and code, there is most certainly a race condition whereby two identical coupons can be applied to the same account!
The migration that fixes this race condition at the database level
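A sketch of such a migration, assuming the table is named applied_coupons and has account_id and code columns, as the model names suggest. The class name is hypothetical, and on Rails 5 and later the superclass needs a version suffix such as ActiveRecord::Migration[5.0].

```ruby
# Hypothetical migration: the class name is illustrative, and the table and
# column names are assumed from the Account / AppliedCoupon models.
class AddUniqueIndexToAppliedCoupons < ActiveRecord::Migration
  def change
    # A composite unique index: the same code may appear on many accounts,
    # but only once per account.
    add_index :applied_coupons, [:account_id, :code], unique: true
  end
end
```

If the table already holds duplicate rows, the index cannot be created until those rows are cleaned up.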
And with this, the database will enforce the account data integrity by raising an exception anytime a second record would be saved that duplicates a code for any given account. The validation shows a friendly error message and the database enforces the rule.
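How the two layers cooperate can be sketched without a database. In the plain-Ruby sketch below, the hypothetical CouponTable class stands in for a table with a unique index: its insert! method checks and writes under one lock and raises a hypothetical DuplicateKey error on a repeat. In a real Rails application the corresponding exception is ActiveRecord::RecordNotUnique, which the application can rescue and turn into the same friendly message the validation produces.

```ruby
require "set"

# Hypothetical error, standing in for the database's unique-violation error.
class DuplicateKey < StandardError; end

# Hypothetical stand-in for a table with a unique index on (account_id, code).
class CouponTable
  def initialize
    @keys = Set.new
    @lock = Mutex.new
  end

  # Application-level validation: a plain read, nothing is reserved afterwards.
  def taken?(account_id, code)
    @keys.include?([account_id, code])
  end

  # The "index": check and write are one atomic step, as in the database.
  def insert!(account_id, code)
    @lock.synchronize do
      raise DuplicateKey, "#{code} already applied" unless @keys.add?([account_id, code])
    end
  end
end

# Two requests validate first, then both try to insert (forced interleaving).
def race(table)
  validated = Queue.new
  release = Queue.new

  threads = 2.times.map do
    Thread.new do
      valid = !table.taken?(1, "SAVE10")  # passes for both requests
      validated << true
      release.pop                         # wait until both have validated
      next :invalid unless valid
      begin
        table.insert!(1, "SAVE10")
        :saved
      rescue DuplicateKey
        :rejected  # this is where the friendly error message would be shown
      end
    end
  end

  2.times { validated.pop }
  2.times { release << true }
  threads.map(&:value).sort
end

p race(CouponTable.new)  # [:rejected, :saved]
```

Both requests pass validation, yet only one row is stored, because the final decision is made at the single place where the write happens.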
When the database is the source of truth for your application, enforcing data integrity at the database level with constraints is important. As I’ve shown, there are situations where validations pass even though the result, once persisted, is invalid. In the worst case this is a serious security flaw, and even in the best case it leads to data integrity issues. The easy solution is to add a unique index on the combination of the model’s foreign key and the coupon code field, so that the database server rejects duplicates itself: it checks the index and writes the row as one atomic operation, no matter how many connections are writing at once.