Recovering SQLCipher encrypted data with Frida

Our AppSec team has faced the SQLCipher library during some recent security audits of mobile applications. According to their GitHub README:

SQLCipher extends the SQLite database library to add security enhancements that make it more suitable for encrypted local data storage such as on-the-fly encryption, tamper evidence, and key derivation. Based on SQLite, SQLCipher closely tracks SQLite and periodically integrates stable SQLite release features.

This means that, even in the case of a rooted device, information stored in the database will not be accessible by third parties because it is encrypted, unless you can somehow obtain the encryption key. And this is where the real fun begins, because of course we, as security analysts, want to know what is stored in the database. In this post, we will summarize some of the methods we use to obtain the keys used to encrypt SQLCipher databases in Android applications.

The root of the problem

Storing secrets on a device to which a potential attacker has physical access is always hard. Several techniques exist to make it harder for attackers (or analysts) to obtain those secrets, but none of them is bulletproof in all possible scenarios. This topic has been discussed in SQLCipher’s FAQ, and some useful recommendations have been produced regarding generating and protecting key material. Obfuscation/RE protection tools like ProGuard or DexGuard can also help, since the harder it is to understand what the application is doing, the less chance of success the attacks against it will have.

But, in the end, most of the time it is a matter of time and dedication for a skilled enough attacker to access secrets in physically accessible devices, so the plausible goal is to make the attack not worth the effort. Since our audits are time-constrained, we follow a methodological approach to this problem, and consider whether the application has enough protections to assume its stored secrets are reasonably secure (of course, we adjust these considerations depending on the importance of the data being protected).

Hard-coded keys

This is the gut reaction of some developers, and it is understandable: a key is needed for encrypting databases? Let’s just create a public static final String (or, you know, private if you are into good visibility practices...) so that the encryption functions can use it, and we are set!

In these cases, the encryption will probably be rendered useless with little effort from the attacker. After unpacking the application with tools such as dex2jar or apktool, the resulting .class files can be decompiled (for example, with JD-GUI) to obtain the source code or, alternatively, the .smali files can be analyzed directly. At this point, it is trivial to search for sensitive strings containing passwords or cryptographic keys. Unless the strings are actually encrypted (for example, DexGuard offers string encryption functionality, which is another problem outside the scope of this post), they are obtainable if the attacker has access to the application’s APK file. So, in short, never, never hard-code sensitive data in your application’s source code!
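As a rough illustration of that string search, the sketch below scans decompiled Java source for string constants whose field name hints at a secret. The helper name `findSuspiciousConstants`, the name pattern and the sample source are all hypothetical; a real audit would adapt the pattern to the application and would not rely on field names alone.

```javascript
// Hypothetical helper: list string constants in decompiled Java source whose
// field name hints at a secret. The name pattern is an assumption; tune it per app.
function findSuspiciousConstants(source) {
    var pattern = /static\s+final\s+String\s+(\w*(?:key|pass|secret|token)\w*)\s*=\s*"([^"]*)"/gi;
    var findings = [];
    var match;
    while ((match = pattern.exec(source)) !== null) {
        findings.push({ name: match[1], value: match[2] });
    }
    return findings;
}

// Made-up sample standing in for a decompiled class.
var sample = [
    'public static final String DB_KEY = "s3cr3t-passphrase";',
    'private static final String TAG = "MainActivity";'
].join("\n");

console.log(findSuspiciousConstants(sample));
```

Only the first constant is reported here, because `TAG` does not match the name pattern.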

Dynamically generated keys

A better practice is to dynamically generate the keys, so they are not anywhere in the source code to be found. This protects against static analysis... unless the generation algorithm itself is in the code and no external data is used. So, for example, if your key is generated like the following:

String encryptionKey = getApplicationContext().getPackageName() + Build.DEVICE;

The amount of effort from the analyst required to obtain the key remains almost the same. Sure, the string is not directly hard-coded, but it is still trivial to recreate it given access to the source code. Still, these key generation algorithms can get really complicated, or the code can be heavily obfuscated, making the task of statically obtaining the key pretty frustrating (or directly impossible if the key is obtained from an external source, e.g. from a server via an API call). This is where other approaches are needed.
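To make the point concrete, here is a minimal sketch of how an analyst could rebuild that key off the device. The function name `recreateKey` and both input values are placeholders, not taken from any real application; the package name can be read from the APK, and the device name is typically what `adb shell getprop ro.product.device` reports.

```javascript
// Hypothetical reimplementation of the app's key derivation, run off-device.
function recreateKey(packageName, device) {
    // Mirrors: getApplicationContext().getPackageName() + Build.DEVICE
    return packageName + device;
}

// Both values are placeholders, not taken from a real application.
var key = recreateKey("com.example.app", "generic_x86");
console.log(key);
```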

Frida to the rescue

One of the most useful tools mobile application analysts can have in their arsenal is Frida. From their website, Frida is a:

Dynamic instrumentation toolkit for developers, reverse-engineers, and security researchers.

Dynamic instrumentation is a very powerful technique by which analysts can hook into any part of the code and spy on it, seeing what is going on with the data, or even altering the inputs and outputs of a certain function. Frida makes this ridiculously easy, with a comprehensive API and support for the major programming languages and OSes. It is also not limited to mobile applications (it works on native desktop applications too), so definitely check it out if you have not done so already. For an Android usage example, you can check these snippets.

Now, you may be thinking: but the problem remains the same! If the code is heavily obfuscated, we may not even know which function to hook in the first place! And you would be correct. But, actually, we do not need to hook the key generation process. Since we can intercept function calls and mess with their inputs and outputs, we just need to hook the functions which are using the keys, i.e. receiving them as parameters. And of course these will be SQLCipher’s own functions. Looking at their SQLCipher for Android Application Integration article, it can be seen that a good hooking candidate is probably openOrCreateDatabase, since most apps will be using this function to interact with their encrypted databases, and the password is passed as a parameter! Luckily for us, SQLCipher is open-source, so we can review the Android implementation’s SQLiteDatabase.java class and see the exact signatures of the functions we want to hook (because openOrCreateDatabase has several overloads).

With this knowledge we can create a Frida script (written in JavaScript) to hook the desired functions and just print the password:

console.log("[-] Waiting for Java...");
// Java.perform runs the callback once the current thread is attached to the
// Java VM, which avoids busy-waiting on Java.available.
Java.perform(function () {
    console.log("[+] Java available!");
    var CipheredSQLiteDatabase = Java.use("net.sqlcipher.database.SQLiteDatabase");
    console.log("[+] Hooked: " + CipheredSQLiteDatabase);
    CipheredSQLiteDatabase.openOrCreateDatabase.overload("java.lang.String", "java.lang.String", "net.sqlcipher.database.SQLiteDatabase$CursorFactory", "net.sqlcipher.database.SQLiteDatabaseHook").implementation = function (path, password, factory, hook) {
        // In this overload the password is a java.lang.String, so it can be
        // printed as-is; for the char[] overloads use password.join("").
        console.log("[+] PASSWORD FOUND: " + password);
        // Call the original method so the application keeps working.
        var db = this.openOrCreateDatabase(path, password, factory, hook);
        return db;
    };
    // do this for every overload
});

As you can see, we make the actual call to the original openOrCreateDatabase and return the expected result since we do not want the application to break. With this, each time the application tries to access the encrypted database (or creates it for the first time), our Frida script will print out the password. Cool!
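Repeating the hook by hand for every overload gets tedious, so one option is to loop over the `overloads` list that Frida exposes on a method wrapper. The sketch below is only an illustration under a few assumptions: the helper names `describePassword` and `hookAllOverloads` are ours, the password is assumed to be the second argument of every overload, and only String and char[] passwords are rendered sensibly (a byte[] password would need different handling).

```javascript
// Render a password argument for logging. Assumption: Frida hands a
// java.lang.String over as a JS string and a char[] as an array-like of
// one-character strings.
function describePassword(password) {
    if (password === null || password === undefined) {
        return "<null>";
    }
    if (typeof password === "string") {
        return password;
    }
    return Array.prototype.join.call(password, "");
}

// Install the same logging hook on every overload of a method wrapper.
// Assumption: the password is the second argument in each overload.
function hookAllOverloads(method, log) {
    method.overloads.forEach(function (overload) {
        overload.implementation = function () {
            log("[+] PASSWORD FOUND: " + describePassword(arguments[1]));
            // Forward to the original implementation so the app keeps working.
            return overload.apply(this, arguments);
        };
    });
}

// Only runs inside Frida, where the global Java object exists.
if (typeof Java !== "undefined") {
    Java.perform(function () {
        var Db = Java.use("net.sqlcipher.database.SQLiteDatabase");
        hookAllOverloads(Db.openOrCreateDatabase, console.log);
    });
}
```

Keeping the hooking logic in a plain function that receives the method wrapper also makes it possible to exercise it with a stub object outside Frida.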

The raw key issue

Now, in real-life applications, this method did not always work, specifically because the password was sometimes in raw key format and could not be printed reliably as a string. We tried several ways of encoding the password so it could be used to open the encrypted database, but we must have been doing something wrong because none of them worked. In the end, we decided to take another approach. Frida not only allows us to spy on functions and alter their inputs and outputs. In fact, it can be used to execute any desired Java code at any moment, so let us take advantage of that! We may not be able to print the password, but as seen in the previous snippet we actually have access to the db variable returned by the openOrCreateDatabase function, which is a direct reference to the database. Why not dump its contents into a new, unencrypted database which we can open later without hassle? To do that, we need to be able to create new files and to execute queries in the original database. Luckily we can do both things with Frida, as seen in the following snippet:

// Copy the already-open encrypted database into an unencrypted file next to it.
function dumpDb(File, db, path) {
    var file = File.$new(path + ".plaintext");
    // Remove any previous dump so the export starts from an empty file.
    file.delete();
    db.rawExecSQL("ATTACH DATABASE '" + path + ".plaintext' AS plaintext KEY '';SELECT sqlcipher_export('plaintext');DETACH DATABASE plaintext;");
    console.log("\t[+] Dumped plaintext database at " + path + ".plaintext");
}
console.log("[-] Waiting for Java...");
// Java.perform runs the callback once the current thread is attached to the
// Java VM, which avoids busy-waiting on Java.available.
Java.perform(function () {
    console.log("[+] Java available!");
    var File = Java.use("java.io.File");
    console.log("[+] Hooked: " + File);
    var CipheredSQLiteDatabase = Java.use("net.sqlcipher.database.SQLiteDatabase");
    console.log("[+] Hooked: " + CipheredSQLiteDatabase);
    CipheredSQLiteDatabase.openOrCreateDatabase.overload("java.lang.String", "java.lang.String", "net.sqlcipher.database.SQLiteDatabase$CursorFactory", "net.sqlcipher.database.SQLiteDatabaseHook").implementation = function (path, password, factory, hook) {
        // Open the database with the original method, then export it.
        var db = this.openOrCreateDatabase(path, password, factory, hook);
        dumpDb(File, db, path);
        return db;
    };
    // do this for every overload
});

We made two important changes to the script. First, we hooked Java’s File class so that we were able to create new File instances (this is needed to remove previous databases in the next step). Second, we added the function dumpDb, which creates a new database (named after the original with the .plaintext suffix) and basically dumps everything into it. Every time the application opens the encrypted database, our plaintext version is refreshed, so that we can, at any moment, adb pull it and inspect it as a normal SQLite database. Now, we admit this may not be as convenient as directly obtaining the actual password, but it is a cool way to demonstrate the capabilities of Frida with regard to hooking and code execution.
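The export statement in dumpDb is built by plain string concatenation, which would break if the database path ever contained a single quote. A small sketch of a more defensive way to build it follows; `sqlQuote` and `buildExportSql` are hypothetical helper names and the example path is made up. The returned string is what would be passed to rawExecSQL.

```javascript
// Quote a value as a SQL string literal by doubling embedded single quotes.
function sqlQuote(value) {
    return "'" + String(value).replace(/'/g, "''") + "'";
}

// Build the statements the dump relies on. The empty KEY makes the attached
// database unencrypted, and sqlcipher_export copies the content into it.
function buildExportSql(path) {
    var target = path + ".plaintext";
    return [
        "ATTACH DATABASE " + sqlQuote(target) + " AS plaintext KEY '';",
        "SELECT sqlcipher_export('plaintext');",
        "DETACH DATABASE plaintext;"
    ].join("");
}

// Placeholder path, not taken from a real application.
console.log(buildExportSql("/data/data/com.example.app/databases/app.db"));
```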

Conclusions

In this post, we showed several ways, going from static analysis to dynamic instrumentation, to extract SQLCipher keys from an Android application in order to access encrypted databases. Also, when this was not possible, we showed a way of using Frida to extract the encrypted data in plaintext so it can be analyzed.

With this, apart from showing how cool and useful Frida is, we intended to discourage the inclusion of sensitive information, such as passphrases or encryption keys, in the device storage. Our recommendation to avoid these security issues would be, as stated in SQLCipher’s FAQ, to generate a substantial part of the key material with user-provided data (e.g. a passphrase) which is not stored anywhere on the device. Also, obfuscation, encryption and even hook protection are valid options to hinder the analyst’s job, but ultimately cannot completely protect secrets stored in your application. When dealing with passwords and encryption on client devices you are always going to have a tough battle ahead, but by following good security practices and staying current with the tactics attackers and analysts are using, you will have a good chance of success.

Thanks for reading, and see you in the next post!
