MongoDB 6.0 introduces a preview feature that pulls off the quasi-magical feat of allowing encrypted data to be used as the target of searches, without ever transmitting the keys to the database.

Credit: Matejmo / Getty Images

Queryable encryption was the main attraction at MongoDB World 2022, for understandable reasons. It introduces a unique capability to reduce the attack surface for confidential data in several use cases. In particular, data remains encrypted at insert, storage, and query. Both queries and their responses are encrypted over the wire and randomized for resistance to frequency analysis.

The outcome is that applications can support use cases that require searching against classified data while never exposing it as plaintext in the data store infrastructure. Data stores that hold private information are a main target of hackers for obvious reasons. MongoDB’s encrypted fields mean that this information is cryptographically secure at all times in the database, but still usable for searching. In fact, the database does not hold the keys for decrypting the data at all. That means that even a complete breach of the DB servers will not expose private information.

Several prominent and sophisticated attack vectors are eliminated. For example:

- Unethical or hacked DB admin account.
- Accessing on-disk files.
- Accessing in-memory data.

This is something like hashing passwords. We hash passwords in the DB for the same reasons, so that it is impossible for a hacker or even the admin of the DB to view the password. The big difference, of course, is that hashing passwords is a one-way affair. You can verify whether the password is correct, but that’s it. There’s no querying such a field and no way to recover the plaintext. Queryable encryption retains the ability to work with the field.

Another interesting characteristic of the system is that fields are encrypted in a randomized fashion, so the same value will produce different ciphertext on different runs.
This means the system is resistant to frequency analysis attacks as well.

The system allows for a rigorous distinction between clients that have view privileges for the search results and those that don’t, by controlling which clients have access to the keys. For example, an application might store confidential information like a credit card number alongside less sensitive information like a username. A non-privileged client that is never provisioned with the cryptographic keys could see the username but would have no way to read the credit card number. A client with access to the keys could see the credit card number and use it in searches, while the number stays encrypted as it is sent, searched, stored, and retrieved.

Tradeoffs of queryable encryption

Of course, all this comes at a cost. Specifically, queries involving encrypted fields require more space and time. (MongoDB guidance is to expect around 2-3 times extra storage for encrypted data, but that is expected to come down in the future.) MongoDB handles queries on the encrypted data by incorporating metadata in the encrypted collections themselves, as well as in separate collections with further metadata. These account for the increase in storage and time requirements when working with those data sets, along with the work of actual encryption and decryption.

Moreover, there is architectural complexity that must be supported: a key management service (KMS), the overhead of coding against it, and the work of encryption and decryption itself.

How queryable encryption works

At the highest level, it looks like Figure 1.

Figure 1. High-level architecture of queryable encryption (credit: Matthew Tyson)

Figure 1 illustrates that the system adds an architectural component: the KMS. The other change to the typical flow of events is that the data and queries are encrypted and decrypted via the MongoDB driver.
The KMS provides the keys for this process.

Automatic and manual encryption

There are two basic modes for queryable encryption: automatic and manual. In automatic mode, the MongoDB driver itself handles encryption and decryption. In manual mode, the application developer does more hands-on work using the keys from the KMS.

Key types: customer master keys (CMK) and data encryption keys (DEK)

In the queryable encryption system there are two types of keys in play: the customer master key (CMK) and the data encryption key (DEK). The DEK is the actual working key for encrypting the data. The CMK is used to encrypt the DEK. This provides extra security. The client application can make use of the DEK (and the data encrypted with it) only by first decrypting the DEK with the CMK.

Therefore, even if the DEK is exposed in its encrypted form, it is useless to an attacker without access to the CMK. The architecture can be arranged such that the client application never holds the CMK itself, as described next with a key management service. The bottom line is that the dual key arrangement is an extra layer of security for your private keys.

Data encryption keys (DEK) are stored in a separate key vault collection, as described below.

Key vaults

Data is encrypted with symmetric secret keys. Those keys belong to the app developer and are never sent to MongoDB. They are stored in a key vault. There are three basic scenarios for managing the keys, described below in ascending order of security.

- Local file key provider
  - Suitable only for development
  - Keys are stored on the local system alongside the app
- KMIP (Key Management Interoperability Protocol) provider
  - Suitable for production, but less secure than using a KMS provider
  - Customer master keys (CMK) are transmitted to the client application
- Full KMS (Key Management Service) provider
  - Suitable for production
  - Supported cloud KMS are AWS, Azure, and GCP
  - On-premises HSM (hardware security module) and KMS are supported
  - Only data encryption keys are transmitted to the client application

Local key provider for development

At development time, the application developer can generate keys (say, with OpenSSL) and store them locally. Those keys are then used for encrypting and decrypting the information sent to and from the MongoDB instance. This is for development only, because keeping the secret keys next to the app is a major vulnerability that negates much of the advantage of queryable encryption.

KMIP provider

There are a number of KMIP implementations (including open source ones) and commercial services. In this scenario, the CMK is stored at the KMIP provider and transmitted to the client app whenever the DEK needs to be encrypted or decrypted for use. If the key vault collection is breached, the data remains safe. This arrangement is described in Figure 2.

Figure 2. KMIP architecture outline (credit: Matthew Tyson)

KMS provider

By using a KMS provider (like AWS, Azure, or GCP), the customer master key is never exposed to the network or the client app. Instead, the KMS provides the service of encrypting the DEK. The DEK itself is sent to the KMS, encrypted, and returned as ciphertext, which is then stored in a special key vault collection in MongoDB. The stored DEK can then be retrieved and decrypted with the KMS in a similar fashion, again preventing exposure of the CMK itself. As with KMIP, if the key vault collection is breached, the data remains safe.

You can see this layout in Figure 3.

Figure 3. KMS architecture outline (credit: Matthew Tyson)

Conclusion

Queryable encryption is a preview feature, and at the moment only equality queries are supported. More query types, like ranges, are on the roadmap. Although it requires extra setup, queryable encryption delivers a critical feature for use cases requiring search against confidential data that cannot be achieved in any other way.
It is a compelling and distinctive capability.