Karol Galanciak - Ruby on Rails and Ember.js consultant

Keeping Data Integrity in Check: Conditional Unique Indexes for Soft Delete

Soft delete is a common feature in most applications. It may increase the complexity of queries; nevertheless, not deleting anything might be the right default, as the data might prove useful in the future: for restoring records removed by mistake, for deriving conclusions from statistics, and for plenty of other purposes. It may seem like a trivial thing: just add a column like deleted_at and filter out records that have this value present. But what happens when you need to do proper uniqueness validation on both the model layer and the database level? Let’s take a look at what kind of problem can easily be overlooked and how it can be solved with a conditional index.

Case study: daily prices for vacation rentals

Let’s imagine we are developing vacation rental software. Most likely the pricing for each day will depend on some complex set of rules, but we may want a denormalized representation of the base price for each day, to make things more obvious and to have the possibility of sharing this kind of data with other applications, which is quite common in this domain. We may start by adding a DailyPrice model with a reference to a rental, a price value and, of course, the date for which the price is applicable.
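A minimal setup might look like the following sketch; the migration class name, timestamp and column types are assumptions, not taken from the original application:

```ruby
# db/migrate/20161001000000_create_daily_prices.rb
# Sketch of the initial schema: one base price per rental per date.
class CreateDailyPrices < ActiveRecord::Migration[5.0]
  def change
    create_table :daily_prices do |t|
      t.references :rental, null: false
      t.decimal :price, precision: 10, scale: 2, null: false
      t.date :date, null: false
      t.timestamps
    end
  end
end

# app/models/daily_price.rb
class DailyPrice < ActiveRecord::Base
  belongs_to :rental
end
```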

Ensuring uniqueness

Obviously, we don’t want any duplicated daily_prices for any rental, so we need to add a uniqueness validation on rental_id scoped to date:

app/models/daily_price.rb
validates :rental_id, presence: true, uniqueness: { scope: :date }

To ensure the integrity of the data and to be protected against race conditions and potential validation bypassing, we need to add a unique index at the database level:

db/migrate/20161030120000_add_unique_index_for_daily_prices.rb
  add_index :daily_prices, [:rental_id, :date], unique: true

Adding soft delete functionality

We have a nice setup already. But it turns out that when some rules or values influencing the price change, it’s much more convenient to recalculate daily_prices by removing them all and recomputing from scratch than by checking whether the price for each date needs to be recalculated. To be on the safe side, we may decide not to hard-delete these prices, but to soft delete them instead.

To implement this feature we could add a deleted_at column, drop the previous index and add a new one that respects the new column. We should also update the validation in the model:

app/models/daily_price.rb
validates :rental_id, presence: true, uniqueness: { scope: [:date, :deleted_at] }

And the migration part:

db/migrate/20161030120000_add_deleted_at_to_daily_prices.rb
  remove_index :daily_prices, [:rental_id, :date]
  add_column :daily_prices, :deleted_at, :datetime
  add_index :daily_prices, [:rental_id, :date, :deleted_at], unique: true

Everything should be fine with that, right? What could possibly go wrong here?

Adding index the right way

Let’s play with Rails console and check it out:

> time_now = Time.current
 => Sun, 23 Oct 2016 09:44:46 UTC +00:00
> DailyPrice.create!(price: 100, date: Date.current, rental_id: 1, deleted_at: time_now)
  COMMIT
> DailyPrice.create!(price: 100, date: Date.current, rental_id: 1)
  COMMIT
> DailyPrice.create!(price: 100, date: Date.current, rental_id: 1)
  ROLLBACK
ActiveRecord::RecordInvalid: Validation failed: Rental has already been taken

Nah, there can’t be any problem here; it looks like we have the expected behaviour: we couldn’t create a non-soft-deleted daily_price for a given date and rental. But let’s check one more thing:

> price = DailyPrice.new(price: 100, date: Date.current, rental_id: 1)
> price.save(validate: false)
  COMMIT

Whoops! Something looks very wrong here. But how is that possible? The index looks exactly like the validation in our model, yet it didn’t behave the same way when we bypassed the validation.

Let’s consult the PostgreSQL docs. There is something important mentioned about null values:

Null values are not considered equal

That is our problem: ActiveRecord’s validation treats nil as a single comparable value and so catches the duplicates, but PostgreSQL considers each NULL in the index distinct from every other NULL, so the unique index never fires for rows where deleted_at is NULL.
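The asymmetry can be summed up in one line of Ruby versus one line of SQL (the SQL behaviour shown in the comment is the documented PostgreSQL semantics):

```ruby
# Ruby: nil equals nil, so ActiveRecord's uniqueness check treats two
# records with deleted_at: nil as duplicates of each other.
puts nil == nil  # true

# PostgreSQL: NULL = NULL evaluates to NULL (unknown), never true, so a
# unique index sees every NULL deleted_at as a distinct value:
#   SELECT NULL = NULL;  -- returns NULL, not true
```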

Our index should in fact be conditional (a partial index, in PostgreSQL terms) and look the following way:

add_index :daily_prices, [:rental_id, :date], unique: true, where: "deleted_at IS NULL"
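Put together as a migration (the class name and timestamp are made up for illustration), the fix could look like this:

```ruby
# db/migrate/20161101120000_fix_unique_index_on_daily_prices.rb
class FixUniqueIndexOnDailyPrices < ActiveRecord::Migration[5.0]
  def change
    # Drop the index that included deleted_at, since NULLs make it useless
    # for guarding live rows...
    remove_index :daily_prices, [:rental_id, :date, :deleted_at]
    # ...and replace it with a partial unique index that covers only
    # non-soft-deleted rows.
    add_index :daily_prices, [:rental_id, :date],
              unique: true,
              where: "deleted_at IS NULL"
  end
end
```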

We could optionally improve the validation in our model, using the conditions option, to make it match what we have in the database much more closely:

app/models/daily_price.rb
validates :rental_id, presence: true, uniqueness: { scope: :date, conditions: -> { where(deleted_at: nil) } }
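To double-check, we can repeat the earlier experiment; this time, bypassing the validation should be stopped by the database itself. A sketch of the expected console session (the exact error message will vary):

```
> DailyPrice.create!(price: 100, date: Date.current, rental_id: 1)
  COMMIT
> price = DailyPrice.new(price: 100, date: Date.current, rental_id: 1)
> price.save(validate: false)
  ROLLBACK
ActiveRecord::RecordNotUnique: PG::UniqueViolation: ERROR: duplicate key value violates unique constraint
```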

And that’s it! There’s no way we could compromise data integrity now!

Wrapping Up

Keeping data integrity in check is essential in most applications to avoid serious problems, especially when implementing soft delete. Fortunately, simply by adding PostgreSQL conditional (partial) unique indexes we can protect ourselves from such issues.
