Tuesday, October 22, 2013

Extracting RSAPrivateCrtKey and Certificates from an Android Process

An Android application that I assessed recently had extensive cryptographic controls to protect client-server communication and to secure its local storage. To top that, its source code was completely obfuscated.
Combined, these two factors made the application a great candidate for reversing. In this blog I will detail the portion of work where I dumped X.509 certificates and constructed a RSA private key (RSAPrivateCrtKey) from the Android application memory using Eclipse Memory Analyzer Tool (MAT) and Java code.

Analyzing Android Memory with Eclipse MAT

Eclipse MAT is primarily a Java heap analyzer that has extensive usage beyond its primary purpose of identifying memory leaks. It can be used to identify and dump sensitive information in Android application memory, perform some memory forensics etc… If you are new to Android memory analysis, I recommend that you get intimate with this tool for its obvious benefits. The following articles can help you get started.
Okay, now back to our target application.

Locating the crypto material

As part of reversing process I used dex2jar to decompile the application apk to java files and started analyzing them. While following application logic and reviewing its obfuscated code, I stumbled upon a java file (com.pack.age.name.h.b.java) that contained instance variables of type SSLSocketFactory and X509TrustManager. Clearly, this class was performing important cryptographic operations with respect to client-server communication.
So I pivoted to this class to identify the source of its crypto material and all attempts led me from one rabbit hole to another. I then decided to directly look at application heap with Eclipse MAT. I launched the application and performed some operations to ensure that the application loads the required crypto material and then performed the following steps to create the HPROF file contain application heap dump.
  1. Select the application from the list of running apps
  2. Select the “Show heap updates” option for the target application
  3. Select “Dump HPROF file” for analysis.
  4. Since I had MAT plugin installed, ADT converted the Android memory dump to HPROF format and presented it for analysis. In case you do not have MAT plugin, you will need to convert the generated dump to MAT readable format with hprof-conv utility that comes with ADT.
After opening the heap dump, I clicked on the “Dominator Tree” to view the object graph. Supplying the name of the class which had SSLSocketFactory and X509TrustManager instance variables in the Regex area filtered out most of the unwanted stuff. I then navigated the object tree to identify the X.509 certificates and the RSAPrivateCrtKey is shown below.
Image shows two X.509 certificates and a RSAPrivateCrtKey in program heap

Dumping the certificates

The X.509 certificates were byte arrays of different lengths and extracting the certificates turned out to be quick. I right clicked on the byte array  navigated to Copy  Save Value to File  selected location to save the file and clicked Finish. MAT indicates that the copy functionality allows you to write char[], String, StringBuffer and StringBuilder to a text file but it handsomely handled the byte[] in the current context. Please note the extension of the exported file was set to .der on the windows system. The following screenshots will show you the steps followed and one extracted certificate.
Image shows selecting the “Save Value to File” functionality for the byte[]

Image shows file saved as certificate-1.der 


Image shows the extracted Root CA certificate from the Android application


Extracting the RSAPrivateCrtKey

The second important component was the RSAPrivateCrtKey and extracting it was a little more involved as we will see below. To summarize, the below provided steps were followed to retrieve the RSAPrivateKeyCrtKey:
  1. Locate components that make up the RSAPrivatecrtKeySpec
  2. Copy all the components and store them in file system
  3. Compute positive BigInteger values from these components
  4. Construct RSAPrivatecrtKeySpec from its components
  5. Use the RSAPrivatecrtKeySpec  object to construct RSAPrivatecrtKey
  6. Write the RSAPrivatecrtKey to the file system in PKCS8 format
  7. And optionally:
    1. Convert PKCS8 to PEM using OpenSSL
    2. Extract public key from the PEM file with OpenSSL
Let us now look at the involved details.
The third component from Figure 1 corresponds to an instance of RSAPrivatecrtKeySpec which was the starting point to construct the key. Selecting the com.android.org.bouncycastle.jcajce.provider.asymmetric.rsa.BCRSAPrivateCrtKey entry in the MAT’s Dominator Tree populated the Attributes tab with the information (type, instance name and object reference) pertaining to the several participating BigInteger instances that are required to build this RSAPrivateCrtKeySpec. The following are the participating BigInteger components that make up a RSAPrivateCrtKeySpec:
  1. modulus
  2. publicExponent
  3. privateExponent
  4. primeP
  5. primeQ
  6. primeExponentP
  7. primeExponentQ
  8. crtCoefficient


I used this information to segregate the BigInteger component values to different variables as their values were copied out to the file system (see figure below). For example, the crtCoefficient at @0x410b0080 in the Attributes tab (left) was mapped to an array of 32 integers (right).  The modulus at @0x410afde0 was 64 int’s long which indicated that the key size was 2048 bits. Since MAT does not know how to export BigInteger objects, I used the actual int[]  reference inside the corresponding BigInteger dropdown to copy out the binary content.
That is, I right clicked on the int[] dropdowns under the BigInteger while exporting their content. This process was repeated for all the BigInteger components to 8 local files and the files were named as per the Attribute names. The following two images show the Attributes pane and the corresponding int[] content dump.
Image shows the Atrributes and corresponding BigInteger objects in the heap


Image shows int[64] selected to export the binary representation of the array


The next step after extracting the BigInteger components was to check if I am able to use them to re-construct the RSAPrivateCrtKeySpec. So I decided to perform two basic tests before going forward.
  1. Read individual int values from the file where int[]was dumped and match them against values in the MAT
  2. Check that all BigInteger components are positive numbers
I wrote some Java code to help me test all the binary dumps against these two conditions. The results indicated that first condition was true for all BigInteger components, but the second condition was not met by 3 out of 8 BigInteger components that had negative values as shown below.

Image shows matching integers from the binary dump against MAT (Condition 1)
Image shows the negative value (Condition 2)
I searched around to identify the reason for the negative values and the comments in the OpenJDK code indicated that the negative values can be result of incorrect ASN.1 encoding. So I included the corresponding code to calculate and return 2’s complement for negative BigInteger values before supplying the values to RSAPrivateCrtKeySpec constructor.
The final Java code that reads the binary BigInteger (int[]) components from file system and creates RSAPrivateCrtKey in PKCS8 format is provided below.


import java.io.DataInputStream;
import java.io.EOFException;
import java.io.FileInputStream;
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.IOException;
import java.math.BigInteger;
import java.nio.ByteBuffer;
import java.nio.IntBuffer;
import java.security.KeyFactory;
import java.security.KeyStoreException;
import java.security.NoSuchAlgorithmException;
import java.security.NoSuchProviderException;
import java.security.PrivateKey;
import java.security.Security;
import java.security.spec.InvalidKeySpecException;
import java.security.spec.PKCS8EncodedKeySpec;
import java.security.spec.RSAPrivateCrtKeySpec;
import java.util.ArrayList;

import org.bouncycastle.jce.provider.BouncyCastleProvider;

public class GenerateKey {

 public static BigInteger bitIntFromByteArray(int[] byteArrayParam) {
     byte[] localByteArray = new byte[byteArrayParam.length * 4];
     ByteBuffer byteBuffer = ByteBuffer.wrap(localByteArray);
     IntBuffer intBuffer = byteBuffer.asIntBuffer();
     intBuffer.put(byteArrayParam);
     
     BigInteger bigInteger = new BigInteger(localByteArray);
     if(bigInteger.compareTo(BigInteger.ZERO) < 0)
      bigInteger = new BigInteger(1, bigInteger.toByteArray());
     return bigInteger;
 }
 
 public static BigInteger bigIntegerFromBinaryFile(String filename) throws IOException {
  ArrayList<Integer> intArrayList = new ArrayList<Integer>();
  DataInputStream inputStream = new DataInputStream(new FileInputStream(filename));
  try {
      while (true) 
          intArrayList.add(inputStream.readInt());
  } catch (EOFException ex) {
   
  } finally {
   inputStream.close();
  }
  
  int[] intArray = new int[intArrayList.size()];
  for(int i = 0; i < intArrayList.size(); i++) 
   intArray[i] = intArrayList.get(i);
  return bitIntFromByteArray(intArray);

 }
 
 public static void main(String[] args) throws KeyStoreException, NoSuchProviderException, NoSuchAlgorithmException, InvalidKeySpecException, FileNotFoundException, IOException, ClassNotFoundException {
  Security.addProvider(new BouncyCastleProvider());
  
  BigInteger crtCoefficient = bigIntegerFromBinaryFile("h:\\key-coeffs\\crtCoefficient");
  BigInteger modulus = bigIntegerFromBinaryFile("h:\\key-coeffs\\modulus");
  BigInteger primeExponentP = bigIntegerFromBinaryFile("h:\\key-coeffs\\primeExponentP");
  BigInteger primeExponentQ = bigIntegerFromBinaryFile("h:\\key-coeffs\\primeExponentQ");
  BigInteger primeP = bigIntegerFromBinaryFile("h:\\key-coeffs\\primeP");  
  BigInteger primeQ = bigIntegerFromBinaryFile("h:\\key-coeffs\\primeQ");
  BigInteger privateExponent = bigIntegerFromBinaryFile("h:\\key-coeffs\\privateExponent");
  BigInteger publicExponent = bigIntegerFromBinaryFile("h:\\key-coeffs\\publicExponent");
  
  System.out.println("crtCoefficient\t" + crtCoefficient);
  System.out.println("modulus\t" + modulus);
  System.out.println("primeExponentP\t" + primeExponentP);
  System.out.println("primeExponentQ\t" + primeExponentQ);
  System.out.println("primeP\t" + primeP);
  System.out.println("primeQ\t" + primeQ);
  System.out.println("privateExponent\t" + privateExponent);
  System.out.println("publicExponent\t" + publicExponent);

  
  RSAPrivateCrtKeySpec spec = new RSAPrivateCrtKeySpec(modulus, publicExponent, privateExponent, primeP, primeQ, primeExponentP, primeExponentQ, crtCoefficient);
  KeyFactory factory = KeyFactory.getInstance("RSA", "BC");
  PrivateKey privateKey = factory.generatePrivate(spec);
  System.out.println(privateKey);
  PKCS8EncodedKeySpec pkcs8EncodedKeySpec = new PKCS8EncodedKeySpec(privateKey.getEncoded());
  FileOutputStream fos = new FileOutputStream( "h:\\key-coeffs\\private-pkcs8.der");
  fos.write(pkcs8EncodedKeySpec.getEncoded());
  fos.close();
 } 
}

 


Converting PKCS8 to PEM

The next step of the process was to convert the private key from PKCS8 format to a PEM file and then to generate the public key from the private key with the following OpenSSL commands.
openssl pkcs8 –inform DER –nocrypt –in private-pkcs8.der –out privatePem.pem


openssl rsa –in privatePem.pem –pubout



Image shows OpenSSL converting the PKCS8
Image shows the RSA private key
Image shows OpenSSL extracting public key from the privatePem.pem file

Conclusion


Memory analysis is a powerful technique that can be used to identify and extract sensitive information from application runtime. In some scenarios, the extracted information can also be used to defeat client side security controls.

Wednesday, September 11, 2013

Validating Custom Sanitization in Web Applications with Saner

Introduction
I recently read a paper in which the authors combined static and dynamic source code review techniques to evaluate the effectiveness of custom built data sanitization routines in PHP based web applications. The paper was very interesting and I thought to summarize it for quick consumption.
The authors suggest that static analysis systems are not able to analyze custom sanitization routines and often report security vulnerabilities even when these routines are able to effectively neutralize the malicious characters. These reported vulnerabilities (true or false positives) are typically subjected to manual analysis to identify the effectiveness of the custom code. This process is prone and often leads to inaccurate results with false positives or negatives.
As a part of their research, the authors wrote Saner with the objective to analyze custom sanitization routines to identify XSS and SQL injection vulnerabilities in PHP based web applications. Saner works by combining Static and Dynamic analysis techniques which resulted in low false positive rates and it had the ability to identify the exact attack vectors that could bypass the custom sanitization code. It is based on Pixy; an open source web vulnerability scanner for PHP.
The following figure shows the two phases used by Saner.
Figure 1: Image shows the different stages of analysis performed by Saner


Static Analysis
There are two types of static analysis models, sound and unsound. The sound model flags custom sanitization routines as ineffective and the unsound model assumes that string manipulation operations on tainted input results in untainted output. The sound model can result in large number of false positives and the unsound model may lead to false negatives.
Pixy provides the data flow analysis between sources and sensitive sinks, identifies if any built in sanitization routines are applied to the identified data flow paths. Pixy follows sound analysis model and it flags custom sanitization routines as ineffective and that results in high false positive rates. Additionally, program variables in Pixy can be either tainted or untainted and Pixy cannot capture the set of values each variable can hold.
To address these shortcomings, Pixy was extended to derive an over-approximation of the values that program variables can hold for every point in the program. It was based on finite state automata to describe an arbitrary set of strings and associating taint qualifiers to the automata transitions.  This provided Saner with an ability to track the taint status of different parts of the string.
Saner performs postorder traversal on Pixy’s dependency graphs to derive the automata that describe the possible string values a program node can contain. The node can be a) a string, b) a variable or c) an operation. When a node represents a string literal, it is decorated with an automaton that describes the exact string. The automaton for program variables is calculated based on the successor nodes from the dependency graph.
Saner categorizes operations in two types of groups. The first group has the functions that are precisely modeled, i.e. Saner is uses finite state transducers to compute an automaton to describer all possible output strings from this category of functions. The Saner team developed a number of finite state transducers for custom string manipulation functions and also the functions that are commonly used for input sanitization. This is required to precisely capture the effect of the sanitization routines. The second group is of un-modeled functions where Saner depends on the values passed to the parameters of these functions and computes the automaton based on least upper bound of the taint status of the supplied parameters.
Saner uses Mohri and Sporat’s algorithm to model the functions. The automata used in the Mohri and Sporat’s algorithm are not taint aware. In order to get around the limitation, the algorithm was left unmodified and a clever workaround was used to leverage the existing algorithm to propagate taint information. The workaround replaced static strings with empty ones to ensure that static, untainted strings that contain dangerous meta-characters do not lead to false positives. To compensate for the loss of information from static string removal, an over approximation of possible string values was derived based on various modeled functions and the parameters they accept. This approach allowed removal of false negatives.
Finally, in order to determine if a potentially malicious input makes it to a sensitive sink, an intersection is calculated between the automaton that represents the sink’s input and the automaton that contains the set of undesired characters. For every non-empty intersection, the source-sink pair is flagged as a potential true positive and the information is passed to the dynamic analysis phase.


Figure 2: Image summarizes the static analysis phase





Dynamic Analysis
The static phase is conservative and may generate false positives and that requires developers to manually inspect the code to weed out the reported false positives. The dynamic analysis component attempts to automate this analysis by directly executing the custom sanitization routines on a set of malicious inputs and then analyzing the output to determine if the malicious characters were sanitized or not.
After receiving the source-sink pairs from the static analysis component, the dynamic analysis extracts all the nodes pertinent to the custom data sanitization and abstracts out all the other application details. It then calculates sanitization graph for each source-sink pair and uses that information to construct all possible paths from source to sink.
Based on the type of the sink, a test suite (XSS or SQL injection) is selected for evaluation. For example, if the sink forms a portion of a SQL query, SQL injection test suite will be run on the corresponding data flow paths. The final step of the process invokes the PHP interpreter to evaluate the result of executing each block of code using the corresponding test suite.
The results of each test were then analyzed by an oracle function to check for occurrence of particular substrings and the result was categorized as a true positive or a false positive.


Figure 3: Image summarizes the dynamic analysis phase


Results
Saner identified 13 novel vulnerabilities across five open source PHP applications. The time required to perform analysis was in the order of a few minutes for almost all applications.


Observations
  1. Saner’s dynamic analysis effectiveness is primarily driven by its input test suite which is limited. The whitepaper does not discuss the mutation engines, if any, used for the attack vectors.  An intelligent mutation engine can potentially make the tool more effective. Additionally, the tool was written to identify XSS vectors that rely on < symbol. Including other XSS injection techniques can also increase the detection rate.
  2. The interesting custom validation bypass attacks that Saner identified and discussed in the paper were Cross Site Scripting attacks and the authors did not discuss any identified SQL injection vulnerability.
  3. The dynamic analysis component can also be leveraged to write unit test cases for PHP web applications. I could not find Saner source code and plan to reach out to the authors to check its availability.

Nov 8, 2013 Update
I contacted the authors and it appears that Saner source code was never released and is not traceable.