Breaking Encrypted Databases: Generic Attacks on Range Queries

Conference: BlackHat USA 2019



New side-channel attacks can break database encryption for numeric data, and countermeasures involve adding dummy records, making dummy queries, or increasing client-side post-processing.
  • Leakage can arise from properties related to values, queries, and responses
  • Consider at which points in the implementation each of these kinds of leakage could arise
  • Mitigating leakage involves restricting the type or granularity of queries, adding dummy records or queries, or relying on trusted hardware on the server
  • Trade-offs must be made in encrypted database solutions, sacrificing efficiency or completeness for security
An adversary can use access pattern leakage and volume leakage to determine the value of every record in a database, bypassing encryption. Countermeasures like adding dummy records or queries can help hide frequency information and smooth out the distribution of queries, but may sacrifice efficiency or completeness of query results.
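
To make the dummy-record countermeasure and its cost concrete, here is a minimal sketch. It assumes a small integer domain and pads every value up to the same number of records, so that the size of a range response depends only on the width of the range. The function names and the plaintext `is_real` flag are hypothetical; in a real deployment the flag would be part of the encrypted payload and only the client could read it.

```python
from collections import Counter

def pad_with_dummies(values, domain_size):
    """Return (value, is_real) records in which every value of the domain
    1..domain_size occurs equally often. Assumes values is non-empty."""
    counts = Counter(values)
    target = max(counts.values())  # pad every value up to the largest count
    records = [(v, True) for v in values]
    for v in range(1, domain_size + 1):
        records += [(v, False)] * (target - counts[v])
    return records

def range_query(records, lo, hi):
    """Server side: return every record, real or dummy, whose value is in [lo, hi]."""
    return [r for r in records if lo <= r[0] <= hi]

def client_filter(response):
    """Client side post-processing: discard the dummy records."""
    return [v for v, is_real in response if is_real]
```

With `values = [1, 1, 1, 2, 4]` and a domain of size 4, the padded table holds 12 records instead of 5, and every query over a range of width 2 returns exactly 6 records regardless of how the real data is distributed. The extra storage, bandwidth, and client-side filtering are the efficiency being traded for hiding frequency information.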


Security researchers and practitioners have proposed many techniques for securely storing and querying outsourced data. I'll start this talk with an overview of common building blocks and the latest commercial and academic solutions, focusing on those that support range queries (e.g., selecting all records where the age attribute is between 18 and 65). These techniques are tailored to specific threat models. For example, if the database server is trusted but not the network, connections can be encrypted with TLS. If the database server is trusted but there is a risk of disk theft, full-disk encryption or page-level encryption of database files and logs (e.g., Transparent Data Encryption) can be enabled. If the database server isn't trusted at all, a system that encrypts all data before uploading it (e.g., via a CipherCloud gateway or CryptDB proxy server) could be employed.

All of these solutions, however, leak some information when a query is processed -- like the set of records matching the query, or the size of this set. This information leaks even to an observer who doesn't have any cryptographic keys. The source of the leakage can vary; it could be network traffic, observed memory accesses, or database logs recovered by forensic analysis.

I'll explain how this leakage can be exploited by an attacker to break the encryption and recover values in the database. These attacks are entirely generic and don't depend on the database implementation. They have connections to graph theory, Golomb rulers, and machine learning. I'll discuss proposed countermeasures, and finish by offering guidelines that practitioners can use when assessing the security claims of the latest and greatest database encryption solutions.
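
As a toy illustration of how access pattern leakage alone can reveal structure, the sketch below simulates an observer who sees only which record IDs each range query returns, then rebuilds the sort order of the records from those sets. It is a simplified example under strong assumptions (all record values are distinct, and every possible range query is eventually observed), not the specific attacks presented in the talk. All function names are hypothetical.

```python
from collections import defaultdict

def observe_access_patterns(values, domain_size):
    """Simulate the leakage: for every range [lo, hi] over 1..domain_size,
    record the set of matching record IDs (never the values themselves)."""
    leaked = set()
    for lo in range(1, domain_size + 1):
        for hi in range(lo, domain_size + 1):
            ids = frozenset(i for i, v in enumerate(values) if lo <= v <= hi)
            if ids:
                leaked.add(ids)
    return leaked

def recover_order(leaked):
    """Recover the records' sort order, up to reflection, from response sets.
    Assumes distinct values and at least two records, so every 2-element
    response is a pair of neighbours in sorted order."""
    neighbours = defaultdict(set)
    for ids in leaked:
        if len(ids) == 2:
            a, b = ids
            neighbours[a].add(b)
            neighbours[b].add(a)
    # The smallest and largest records each have exactly one neighbour.
    start = min(r for r, n in neighbours.items() if len(n) == 1)
    order, prev = [start], None
    while True:
        candidates = [r for r in neighbours[order[-1]] if r != prev]
        if not candidates:
            break
        prev = order[-1]
        order.append(candidates[0])
    return order
```

For example, `recover_order(observe_access_patterns([40, 12, 33, 7, 25], 50))` returns the record IDs in sorted order of their hidden values (here, descending), even though the observer never sees a value or a key. The adjacency structure walked by `recover_order` is a path graph, which hints at why graph-theoretic tools appear in this line of work.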