There seems to be a lot of confusion about how to implement and work with X.509 certificates, either because of ASN.1 encoding issues, or because vagueness in the relevant standards means people end up guessing at what some of the fields are supposed to look like. For this reason I've put together these guidelines to help in creating software to work with X.509 certificates, PKCS #10 certification requests, CRLs, and other ASN.1-encoded data types.
Perhaps I go overboard sometimes, but when designing ASN.1 I prefer to use the tools that come later in the process to verify my work. Generally, I will use ASN.1 tools to generate C code, then make sure that this C code actually compiles. Sometimes I'll walk through the generated results and note changes in the generated code that result purely from how I write the ASN.1. This allows me to better tailor my work to the needs of developers.
At the end of a project I will typically use a -test switch with the code generator that causes the tool to create a "main" within the generated .c file. The tools I use will generate a test driver containing one test case for each instance of ASN.1 value notation presented in the input ASN.1 modules.
I can use other switches to control the test case specifics at run time so that each test case is processed using various encoding rules (DER, BER, PER, etc.). This helps to assure me that the ASN.1 I have created can produce working code, and it also allows me to capture these test results for inclusion in the specification or in related documentation.
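The round-trip idea behind such a test driver can be illustrated without any particular ASN.1 tool. The following is only a minimal sketch, not the generated driver described above: it assumes a hand-written DER codec for the INTEGER type as a stand-in for generated code, and the names `encode_integer_der`, `decode_integer_der`, and `run_round_trip` are hypothetical.

```python
from typing import List, Tuple


def encode_integer_der(value: int) -> bytes:
    """Encode an INTEGER as DER: tag 0x02, short-form length, minimal content."""
    # Minimal two's-complement size: for negatives, size by the complement.
    magnitude = value if value >= 0 else ~value
    size = magnitude.bit_length() // 8 + 1
    content = value.to_bytes(size, "big", signed=True)
    if len(content) >= 0x80:
        raise ValueError("long-form lengths are not handled in this sketch")
    return bytes([0x02, len(content)]) + content


def decode_integer_der(encoded: bytes) -> int:
    """Decode the output of encode_integer_der, checking tag and length."""
    if len(encoded) < 3 or encoded[0] != 0x02:
        raise ValueError("not a short-form DER INTEGER")
    length = encoded[1]
    if length >= 0x80 or len(encoded) != 2 + length:
        raise ValueError("length octet does not match the content")
    return int.from_bytes(encoded[2:], "big", signed=True)


def run_round_trip(values: List[int]) -> List[Tuple[int, str, bool]]:
    """Encode and decode every test value; report (value, hex, passed)."""
    results = []
    for value in values:
        encoded = encode_integer_der(value)
        results.append((value, encoded.hex(), decode_integer_der(encoded) == value))
    return results


if __name__ == "__main__":
    # Each entry plays the role of one instance of ASN.1 value notation.
    for value, hex_form, passed in run_round_trip([0, 127, 128, 255, -1, -128, -129]):
        print(f"{value:>5}  {hex_form:<10}  {'ok' if passed else 'FAILED'}")
```

A real driver would repeat the same loop once per set of encoding rules; the table it prints is the kind of result that can be captured for documentation.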
There is indeed a lot of complexity in ASN.1. At the root, ASN.1 is a basic TLV (type-length-value) encoding format, similar to what we see in multiple IETF protocols. However, for various reasons, ASN.1 includes a number of encoding choices, each of which is an occasion for programming errors:
* In most TLV applications, the type field is a simple number varying from 0 to 254, with the number 255 reserved for extension. In ASN.1, the type field is structured as a combination of scope (the tag class) and number, and the number itself can be encoded on a variable number of bytes.
* In most TLV applications, the length field is a simple number. In ASN.1, the length field is itself variable length.
* In most TLV applications, structures are delineated by the length field. In ASN.1, structures can be delineated either by the length field or by an "end of structure" mark.
* In most TLV applications, a string is encoded as just a string of bytes. In ASN.1, it can be encoded either that way, or as a sequence of chunks, which conceivably could themselves be encoded as chunks.
* Most applications tolerate some variations in component ordering and deal with optional components, but ASN.1 pushes that to an art form.
* I don't remember exactly how many alphabet sets ASN.1 supports, but it is way more than your average application.
* Most applications encode integer values by reference to classic computer encodings, e.g. signed/unsigned char, short, long, long-long. ASN.1 introduces its own encoding, which is variable length.
* One can argue that SNMP makes a creative use of the "Object Identifier" data type of ASN.1, but one also has to wonder why this data type is specified in the language in the first place.
Then there are MACRO definitions, VALUE specifications, and an even more complex definition of extension capabilities. In short, ASN.1 is vastly more complex than the average TLV encoding. The higher rate of errors is thus not entirely surprising.
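To make the first few points concrete, here is a minimal sketch of reading the type and length fields of a BER-encoded element and decoding the content of an INTEGER. It is an illustration under stated assumptions rather than a complete parser: the function names `read_identifier`, `read_length`, and `decode_integer` are hypothetical, and chunked strings and the matching of "end of structure" marks are left out.

```python
from typing import Optional, Tuple

CLASS_NAMES = ("universal", "application", "context-specific", "private")


def read_identifier(data: bytes, pos: int = 0) -> Tuple[str, bool, int, int]:
    """Return (class name, constructed flag, tag number, next position)."""
    first = data[pos]
    pos += 1
    tag_class = CLASS_NAMES[first >> 6]      # top two bits: the scope
    constructed = bool(first & 0x20)         # bit 6: primitive or constructed
    number = first & 0x1F
    if number == 0x1F:
        # High-tag-number form: base-128 digits, high bit set on all but the last.
        number = 0
        while True:
            octet = data[pos]
            pos += 1
            number = (number << 7) | (octet & 0x7F)
            if not octet & 0x80:
                break
    return tag_class, constructed, number, pos


def read_length(data: bytes, pos: int) -> Tuple[Optional[int], int]:
    """Return (length, next position); length is None for the indefinite form."""
    first = data[pos]
    pos += 1
    if first < 0x80:
        return first, pos                    # short form: one octet
    if first == 0x80:
        return None, pos                     # indefinite: ends at an end-of-contents mark
    if first == 0xFF:
        raise ValueError("reserved length octet")
    count = first & 0x7F                     # long form: count of length octets follows
    if pos + count > len(data):
        raise ValueError("truncated length field")
    return int.from_bytes(data[pos:pos + count], "big"), pos + count


def decode_integer(content: bytes) -> int:
    """Decode INTEGER content octets: variable-length two's complement."""
    if not content:
        raise ValueError("INTEGER content must not be empty")
    return int.from_bytes(content, "big", signed=True)
```

For example, the octets `30 82 01 0A` describe a constructed universal element with tag number 16 (a SEQUENCE) whose length, 266, takes three octets to express, while `9F 81 00` is a context-specific tag whose number, 128, spills over into two extra octets.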
Last updated by Henning Schulzrinne