27 February 2015

Motivation

Nowadays, every application trying to handle SSL Certificates or RSA keys quickly turns into quite a convoluted mess of code and UI, trying to support the plethora of available file formats out there, without any reliable way to destinguish them, with the involved files reaching from simple PEM and DER formatted keys to complex container structures like PKCS#12, code like the one below is not uncommon, leaving the product with unhandy, complex, slow and often counterintuitive code.

    require 'openssl'

    def is_pem?(blob)
      OpenSSL::PKey::RSA.new(blob)  # throws error if unparsable
      return blob[0..9] == "-----BEGIN"
    rescue OpenSSL::PKey::RSAError
      return false
    end

    def is_der?(blob)
      OpenSSL::PKey::RSA.new(blob) # throws error if unparsable
      return true
    rescue OpenSSL::PKey::RSAError
      return false
    end

    def is_pkcs12?(blob, pin)
      OpenSSL::PKCS#12.new(blob, pin)
      return true
    rescue OpenSSL::PKCS#12::PKCS#12Error
      return false
    end

    def identify_openssl_type(blob, pin)
      if pin && is_pkcs12?(blob, pin)
        return :pkcs12
      elsif is_pem?(blob)
        return :pem
      elsif is_der?(blob)
        return :der
      else
        return :unknown
      end
    end

Not only is this example very difficult to handle, it also relies on a lot of assumptions that can get in your way. For example, it assumes that the contents of the DER formated file is a RSA key, yet not a PEM formated file.

Furthermore, a PKCS#12 container can not be successfully identified using this method unless the user provides the correct PIN, which might not even be available at this point in the code flow, yielding an “OpenSSL::PKCS#12::PKCS#12Error: PKCS#12_parse: mac verify failure” which is not something a user will typically understand. PEM files are easy to detect due to their ASCII compatible encoding and the typical —–BEGIN header.

However, Ruby’s OpenSSL library does provide us with a some helpful functions that allow for easier identification of these file formats.

Attempt on finding a solution

The key to finding a better, useful solution is understanding the basic structure of these SSL formats to detect certain patterns that work reliable enough, depending on what you wish to accomplish. While this approach will not lead to a solution that will cover every obscure SSL file that one devious mind can devise, it will work reliably to handle every day SSL file detection and aid in simplifing your code flow.

I recommend reading a bit upon the subject on ASN.1 Data types used by OpenSSL, specifically what Tags are and how they are used, for example in here.

A number of tag numbers in the UNIVERSAL tag class can for example be read upon in the OpenSSL Ruby bindings RDoc.

To explore the general structure of a standard SSL file, Ruby offers a very nifty way via its OpenSSL library:

     require 'openssl'

     def decode_asn1(blob)
       OpenSSL::ASN1.decode(blob)
     end

    irb(main):012:0> asn = decode_asn1(File.read("test_key.der"))
    => #<OpenSSL::ASN1::Sequence ...>

Further inspecting yields that this is a ‘Sequence’, meaning ASN1s way of representing Array-like structures. Using standard #to_a will turn it into a more managable format.

DER keys are composed of integers, as they are a raw representation of the data making up (in this case) an RSA key, which in turn just is a very long integer value (bigint). Note that OpenSSL::BN just represents potentially large numbers which in turn can just be converted into Ruby’s fixnum and bignum datatypes.

    irb(main):072:0> k=OpenSSL::PKey::RSA.new(256)
    => #<OpenSSL::PKey::RSA:0x0000ff9ddfd228>
    irb(main):073:0> OpenSSL::ASN1::decode(k.to_der).to_a[0]
    => #<OpenSSL::ASN1::Integer:0x0000ff95e441a8 @tag=2, ...>
    irb(main):074:0> OpenSSL::ASN1::decode(k.to_der).to_a[0].value
    => #<OpenSSL::BN:0x0000ff9644e440>
    irb(main):075:0> OpenSSL::ASN1::decode(k.to_der).to_a[0].value.to_i
    => 0

This zero is always decoded from all RSA and DSA keys and the 2 in the tagging does denote an integer, according to specifications.

Analysing DER encoded X509 certificates is sadly even less insightful due to the structure involved, containing a lot of information that do not compute to meaningful patterns due to the complicated structure of X509.

A way to have a better look at this, is to automatically unwrap all Sequence data types. To understand the example code below, one should be aware that on parsing, ruby parses most tagged values into ASN1Data values (see the OpenSSL Ruby RDoc for more information on this).

    def to_tag_name(obj)
      return "-" unless obj.respond_to?(:tag)
      return "?" unless obj.tag_class.to_s == "UNIVERSAL"
      return OpenSSL::ASN1::UNIVERSAL_TAG_NAME[obj.tag]
    end

    def to_simple_asn1(obj)
      case obj
      when OpenSSL::ASN1::Sequence
        return :sequence => obj.to_a.map {|o| to_simple_asn1(o)}
      when OpenSSL::ASN1::ASN1Data
        return :data => to_simple_asn1(obj.value), :tag => to_tag_name(obj)
      when OpenSSL::BN
        return :bn => obj.to_i
      when Hash
        return obj.reduce({}) {|acc,(k,v)| acc[k] = to_simple_asn1(v);acc }
      when Array
        return obj.map {|kv| to_simple_asn1(kv)}
      else
        if obj.respond_to?(:tag)
          return :data => obj, :tag => to_tag_name(obj)
        else
          return :data => obj
        end
      end
    end

Call this by passing it an ASN.1 tree created by OpenSSL::ASN1.decode.

This given code snippet allows you get a rough impression of how the internal code structure does look like and which bits can be used to identify the file-format you are trying to detect. For example, with this method you can discover a reliable string ‘pkcs7-data’ for PKCS#12 files that is always present within the header of such files and is tagged in a specific manner. Please note that the ‘pkcs7-data’ merely marks signed/encrypted data according to the PKCS#7 standard, but it can serve as identifying mark for the PKCS#12 file here. Basically interpret this Ruby code snippet as the poor man’s SSL file deconstructor. Below you’ll find an real life example for applying this.

One possible Solution

This is a variation of how the code could look like to tell different OpenSSL formats apart:

    module OSSLSupport

      TAG_INT = 2
      TAG_ID = 6

      def self.is_pkcs12?(data)
        return false unless data.is_a?(OpenSSL::ASN1::Sequence)

        tag, container, _ = data.value
        if tag.tag == TAG_INT && tag.value.to_i == 3 && tag.tag_class == :UNIVERSAL
          identifier, _ = container.value
          if identifier.tag == TAG_ID && identifier.value == "pkcs7-data"
            return true
          end
        end

        return false
      end

      def self.identify(data)
        return unless data.present?
        data.chomp! if data[-1] == "\n"
        return :pem if data.is_a?(String) && data[0..9] == "-----BEGIN"

        data = OpenSSL::ASN1.decode(data)

        if is_pkcs12?(data)
          return :pkcs12
        end

        # if pkcs12 is not detected, assume :der
        return :der
      end
    end

This code uses a mixture of the ideas explained above. It will first get all PEM formats out of the way, as those are easy destinguishable and knowing it’s a PEM file usually is enough to instantiate the right classes. If it’s not a PEM file, the code aims to determine whether it’s a pkcs12 encoded file by looking for an ID tag that is pkcs7-data, in which case it is clear that it’s a pkcs12 encoded file. Otherwise, it is assumed to be simple der encoded certificate or key.

Similar to the code snippet above, detection of other formats can be performed in a similar manner.