How to Parse TLVs in JavaScript

One of the nice things about chip cards (ICCs) is that the data that comes out of them is virtually always supplied in a standard format, called BER-TLV. In plain English: Basic Encoding Rules, Tag-Length-Value (a quaint but informative article about it can be found here).

The BER-TLV format is one of the ASN.1 (Abstract Syntax Notation One) encodings defined by ITU-T X.690, part of a very old set of standards dating to the primordial predawn of the Internet.

Chip cards use the TLV scheme to encode card data. At its simplest, the Tag-Length-Value scheme just means that if you have a tag called (say) “5A” and its value is 8 octets represented by (for example) successive hex values “41 11 12 34 56 78 9A BC,” then the TLV encoding will look like 5A084111123456789ABC, where 5A is the tag, 08 is the length, and 4111123456789ABC is the value.
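
To make that concrete, here is a minimal sketch that pulls the example apart. It assumes the simplest case only (a one-byte tag and a one-byte length), and the function name splitSimpleTLV is made up for illustration.

```javascript
// Hypothetical helper: split a TLV string that has a one-byte tag
// and a one-byte length. Real data needs the extra rules covered below.
function splitSimpleTLV( hex ) {
	var tag = hex.slice( 0, 2 );                     // first byte: the tag
	var length = parseInt( hex.slice( 2, 4 ), 16 );  // second byte: length, in bytes
	var value = hex.slice( 4, 4 + length * 2 );      // two hex characters per byte
	return { tag: tag, length: length, value: value };
}

var simple = splitSimpleTLV( "5A084111123456789ABC" );
console.log( simple.tag, simple.length, simple.value );
```

Running it logs the tag 5A, the length 8, and the value 4111123456789ABC.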

EMVCo (the card-brand consortium behind the whole chip-card thing) defines a bunch of standard tags for chip-card transactions. For example, 5A always encodes the PAN (primary account number, or card number), 9F02 encodes the Authorized Amount of a transaction, 5F2D encodes Language Preference, and so on. The complete list of EMVCo-defined tags (and their meanings) can be found at https://www.eftlab.co.uk/index.php/site-map/knowledge-base/145-emv-nfc-tags.

Given that TLVs encode their own length, it should be a snap to parse TLV data, right?

Well, yes. Mostly. Kind of.

If every tag had a simple one-byte identifier (like 5A), it really would be super-duper-easy to parse a TLV stream. But the TLV scheme wouldn’t be very useful if identifiers could only ever take on one of just 256 possible values.

To make tag identifiers extensible, Basic Encoding Rules allow for the possibility of multi-byte tags. The rules say that if the bottom 5 bits of the first tag byte are all set (that is, the byte ANDed with 0x1F equals 0x1F), then more tag-identifier bytes follow. In subsequent bytes, the top bit is set if more bytes follow, whereas the top bit is zero in the final byte. So for example, 5F24 is a legal 2-byte tag identifier, DFEF01 is a legal 3-byte tag, and so on.
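
The multi-byte tag rule can be sketched in a few lines. This is only an illustration of the bit tests just described; readTag is a hypothetical name, and the sketch assumes the input is an uppercase or lowercase hex string that starts at a tag boundary.

```javascript
// Hypothetical helper: return the tag identifier at the start of a hex string,
// using only the BER bit rules (no dictionary of known tags).
function readTag( hex ) {
	var nibbles = 2;
	var first = parseInt( hex.slice( 0, 2 ), 16 );

	if ( ( first & 0x1F ) == 0x1F ) {  // bottom 5 bits all set: more tag bytes follow
		var next;
		do {
			next = parseInt( hex.slice( nibbles, nibbles + 2 ), 16 );
			nibbles += 2;
		} while ( next & 0x80 );  // top bit set: yet another tag byte follows
	}
	return hex.slice( 0, nibbles ).toUpperCase();
}

console.log( readTag( "5A084111123456789ABC" ) );  // one-byte tag
console.log( readTag( "5F2403180131" ) );          // two-byte tag
console.log( readTag( "DFEF0100" ) );              // three-byte tag
```

The three calls log 5A, 5F24, and DFEF01 respectively.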

EMVCo (which incorporates BER-TLV by reference in Book 3, Annex B, of the EMV specifications) also allows for the concept of “wrapper” tags, to enable hierarchical parent-child relationships (or nesting) among TLVs. Under EMV rules, if the sixth bit of a tag’s first byte is set (counting the least significant bit as bit 1, so the mask is 0x20), the tag is said to be “constructed” (I prefer the term compound). Thus, a 3-byte tag FFEE01 could be used to wrap (fictional) TLVs of C10188 and C2025544 as follows: FFEE0107C10188C2025544. The parent tag, FFEE01, has 7 bytes of data, consisting of a 3-byte TLV and a 4-byte TLV. Groups of tags can be nested to any desired depth using this scheme.
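
Testing for a constructed tag is a one-liner once you know the mask. Here is a small sketch; isConstructed is a name I made up, and it assumes the tag is passed as a hex string.

```javascript
// Hypothetical helper: is this tag "constructed" (a wrapper for other TLVs)?
// Bit 6 of the first tag byte corresponds to the mask 0x20.
function isConstructed( tag ) {
	var firstByte = parseInt( tag.slice( 0, 2 ), 16 );
	return ( firstByte & 0x20 ) != 0;
}

console.log( isConstructed( "FFEE01" ) );  // true: FF has the 0x20 bit set
console.log( isConstructed( "70" ) );      // true: 0x70 has the 0x20 bit set
console.log( isConstructed( "5A" ) );      // false
console.log( isConstructed( "9F02" ) );    // false
```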

Note carefully, the Length byte of a TLV can also be multi-byte. Here, the extensibility rule (taken from EMV Book 3 Annex B2) is:

[Image: BER-TLV length-field rule, from EMV Book 3, Annex B2]

A length byte with the top bit set will mean you have to treat the bottom 7 bits as the “length of the Length.” In other words, a Length byte of 0x82 means that there are two bytes of Length info (in the two bytes that follow). In the (fictional) TLV represented by 5F0F8103AABBCC, the tag is 5F0F, the length of the Length is one byte, the actual Length is 3 bytes, and the Value is AABBCC.
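
Here is a minimal sketch of that length rule. The name readLength is hypothetical, the input is assumed to be a hex string that starts at the Length field, and the sketch does not handle a bare 0x80 byte (BER's indefinite form).

```javascript
// Hypothetical helper: decode the Length field at the start of a hex string.
// Returns the length in bytes, plus how many nibbles the Length field occupied.
function readLength( hex ) {
	var first = parseInt( hex.slice( 0, 2 ), 16 );

	if ( !( first & 0x80 ) )  // top bit clear: this byte is the length itself
		return { length: first, nibblesUsed: 2 };

	var lengthOfLength = first & 0x7F;  // bottom 7 bits: number of Length bytes that follow
	var length = parseInt( hex.slice( 2, 2 + lengthOfLength * 2 ), 16 );
	return { length: length, nibblesUsed: 2 + lengthOfLength * 2 };
}

var len = readLength( "8103AABBCC" );  // the part of 5F0F8103AABBCC after the tag
console.log( len.length, len.nibblesUsed );
```

For the example above, this logs a length of 3 and 4 nibbles consumed, which leaves AABBCC as the Value.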

Clear as mud, right?

So, knowing all this, we’re able to create a fully general recursive-descent TLV parser in about 75 lines of JavaScript, as follows.

//  ===============  BER-TLV PARSER  ================

// All known tags (EMVCo and ID TECH):
var _KnownTags = 
{"42":true,"50":true,"52":true,"56":true,"57":true,"61":true,"62":true,"70":true,"71":true,"72":true,"73":true,"77":true,"80":true,
"81":true,"82":true,"83":true,"84":true,"86":true,"87":true,"88":true,"89":true,"90":true,"91":true,"92":true,"93":true,"94":true,
"95":true,"97":true,"98":true,"99":true,"4F":true,"5A":true,"5D":true,"5F20":true,"5F24":true,"5F25":true,"5F28":true,"5F2A":true,
"5F2D":true,"5F30":true,"5F34":true,"5F36":true,"5F3C":true,"5F3D":true,"5F50":true,"5F53":true,"5F54":true,"5F55":true,"5F56":true,
"5F57":true,"6F":true,"8A":true,"8C":true,"8D":true,"8E":true,"8F":true,"9A":true,"9B":true,"9C":true,"9D":true,"9F01":true,"9F02":true,
"9F03":true,"9F04":true,"9F05":true,"9F06":true,"9F07":true,"9F08":true,"9F09":true,"9F0B":true,"9F0D":true,"9F0E":true,"9F0F":true,
"9F10":true,"9F11":true,"9F12":true,"9F13":true,"9F14":true,"9F15":true,"9F16":true,"9F17":true,"9F18":true,"9F19":true,"9F1A":true,
"9F1B":true,"9F1C":true,"9F1D":true,"9F1E":true,"9F1F":true,"9F20":true,"9F21":true,"9F22":true,"9F23":true,"9F25":true,"9F26":true,
"9F27":true,"9F28":true,"9F29":true,"9F2A":true,"9F2B":true,"9F2D":true,"9F2E":true,"9F2F":true,"9F32":true,"9F33":true,"9F34":true,
"9F35":true,"9F36":true,"9F37":true,"9F38":true,"9F39":true,"9F3A":true,"9F3B":true,"9F3C":true,"9F3D":true,"9F40":true,"9F41":true,
"9F42":true,"9F43":true,"9F44":true,"9F45":true,"9F46":true,"9F47":true,"9F48":true,"9F49":true,"9F4A":true,"9F4B":true,"9F4C":true,
"9F4D":true,"9F4E":true,"9F4F":true,"9F50":true,"9F51":true,"9F52":true,"9F53":true,"9F54":true,"9F55":true,"9F56":true,"9F57":true,
"9F58":true,"9F59":true,"9F5A":true,"9F5B":true,"9F5C":true,"9F5D":true,"9F5E":true,"9F5F":true,"9F60":true,"9F61":true,"9F62":true,
"9F63":true,"9F64":true,"9F65":true,"9F66":true,"9F67":true,"9F68":true,"9F69":true,"9F6A":true,"9F6B":true,"9F6C":true,"9F6D":true,
"9F6E":true,"9F6F":true,"9F70":true,"9F71":true,"9F72":true,"9F73":true,"9F74":true,"9F75":true,"9F76":true,"9F77":true,"9F78":true,
"9F79":true,"9F7A":true,"9F7B":true,"9F7C":true,"9F7D":true,"9F7E":true,"9F7F":true,"A5":true,"BF0C":true,"BF50":true,"BF60":true,
"C3":true,"C4":true,"C5":true,"C6":true,"C7":true,"C8":true,"C9":true,"CA":true,"CB":true,"CD":true,"CE":true,"CF":true,"D1":true,
"D2":true,"D3":true,"D5":true,"D6":true,"D7":true,"D8":true,"D9":true,"DA":true,"DB":true,"DC":true,"DD":true,"DF01":true,"DF02":true,
"DF03":true,"DF04":true,"DF05":true,"DF06":true,"DF07":true,"DF08":true,"DF09":true,"DF0B":true,"DF0C":true,"DF0D":true,"DF0E":true,
"DF0F":true,"DF10":true,"DF11":true,"DF12":true,"DF13":true,"DF14":true,"DF15":true,"DF16":true,"DF17":true,"DF18":true,"DF19":true,
"DF1F":true,"DF20":true,"DF21":true,"DF22":true,"DF23":true,"DF24":true,"DF25":true,"DF26":true,"DF27":true,"DF28":true,"DF29":true,
"DF2A":true,"DF2B":true,"DF2C":true,"DF30":true,"DF31":true,"DF32":true,"DF33":true,"DF40":true,"DF41":true,"DF42":true,"DF43":true,
"DF44":true,"DF45":true,"DF46":true,"DF47":true,"DF48":true,"DF49":true,"DF4A":true,"DF4B":true,"DF4C":true,"DF4D":true,"DF4E":true,
"DF4F":true,"DF50":true,"DF51":true,"DF52":true,"DF53":true,"DF54":true,"DF55":true,"DF56":true,"DF57":true,"DF58":true,"DF5A":true,
"DF5B":true,"DF5C":true,"DF5D":true,"DF5E":true,"DF5F":true,"DF60":true,"DF61":true,"DF62":true,"DF63":true,"DF64":true,"DF65":true,
"DF66":true,"DF68":true,"DF69":true,"DF6A":true,"DF6B":true,"DF6C":true,"DF6D":true,"DF6E":true,"DF6F":true,"DF70":true,"DF71":true,
"DF72":true,"DF73":true,"DF74":true,"DF75":true,"DF76":true,"DF77":true,"DF78":true,"DF79":true,"DF7A":true,"DF7B":true,"DF7C":true,
"DF7D":true,"DF7F":true,"DF8101":true,"DF8102":true,"DF8104":true,"DF8105":true,"DF8106":true,"DF8107":true,"DF8108":true,
"DF8109":true,"DF810A":true,"DF810B":true,"DF810C":true,"DF810D":true,"DF810E":true,"DF810F":true,"DF8110":true,"DF8111":true,
"DF8112":true,"DF8113":true,"DF8114":true,"DF8115":true,"DF8116":true,"DF8117":true,"DF8118":true,"DF8119":true,"DF811A":true,
"DF811B":true,"DF811C":true,"DF811D":true,"DF811E":true,"DF811F":true,"DF8120":true,"DF8121":true,"DF8122":true,"DF8123":true,
"DF8124":true,"DF8125":true,"DF8126":true,"DF8127":true,"DF8128":true,"DF8129":true,"DF812A":true,"DF812B":true,"DF812C":true,
"DF812D":true,"DF8130":true,"DF8131":true,"DFDE04":true,"DFEE12":true,"DFEE15":true,"DFEE16":true,"DFEE17":true,"DFEE18":true,
"DFEE19":true,"DFEE1A":true,"DFEE1B":true,"DFEE1E":true,"DFEE1F":true,"DFEE20":true,"DFEE21":true,"DFEE22":true,"DFEE23":true,
"DFEE24":true,"DFEE25":true,"DFEE26":true,"DFEE27":true,"DFEF1E":true,"DFEF1F":true,"DFEF20":true,"DFEF21":true,"DFEF22":true,
"DFEF23":true,"DFEF24":true,"DFEF25":true,"DFEF26":true,"DFEF27":true,"DFEF28":true,"DFEF2C":true,"DFEF2D":true,"DFEF2E":true,
"DFEF2F":true,"DFEF30":true,"DFEF31":true,"DFEF32":true,"DFEF33":true,"DFEF34":true,"DFEF35":true,"DFEF36":true,"DFEF37":true,
"DFEF38":true,"DFEF39":true,"DFEF3A":true,"DFEF3B":true,"DFEF40":true,"DFEF41":true,"DFEF42":true,"DFEF43":true,"DFEF4B":true,
"DFEF4C":true,"DFEF4D":true,"DFEF59":true,"DFEF5A":true,"DFEF5B":true,"DFEF5C":true,"DFEF5D":true,"DFEF5E":true,"DFEF5F":true,
"DFEF60":true,"DFEF61":true,"DFEF62":true,"FF60":true,"FF62":true,"FF63":true,"FF69":true,"FF70":true,"FF71":true,"FF72":true,
"FF73":true,"FF74":true,"FF75":true,"FF76":true,"FF77":true,"FF78":true,"FF79":true,"FF7A":true,"FF7B":true,"FF7C":true,"FF7D":true,
"FF8101":true,"FF8102":true,"FF8103":true,"FF8104":true,"FF8105":true,"FF8106":true,"FFE0":true,"FFE1":true,"FFE2":true,"FFE3":true,
"FFE4":true,"FFE5":true,"FFE6":true,"FFE7":true,"FFE8":true,"FFE9":true,"FFEA":true,"FFEE01":true,"FFEE02":true,"FFEE03":true,
"FFEE04":true,"FFEE05":true,"FFEE06":true,"FFEE07":true,"FFEE08":true,"FFEE0A":true,"FFEE0B":true,"FFEE0C":true,"FFEE10":true,
"FFEE11":true,"FFEE12":true,"FFEE13":true,"FFEE14":true,"FFEE1C":true,"FFEE1D":true,"FFF0":true,"FFF1":true,"FFF2":true,"FFF3":true,
"FFF4":true,"FFF5":true,"FFF6":true,"FFF7":true,"FFF8":true,"FFF9":true,"FFFA":true,"FFFB":true,"FFFC":true,"FFFD":true,"FFFE":true,
"FFFF":true};

// 'data' should look like "95050010203000…" etc.
// In other words: TLVs, serialized, as one big string.
// A TLV object is returned. Use it to look up Values by Tag name.
// TLV['95'] will contain the value of tag 95.
// TLV['9F26'] will contain the value of tag 9F26, etc.

function parseTags( data ) {

	var TLV = {}; // results go here
	
	// inner method
	function readData( amt, tag ) {

			data = data.slice( amt ); // get past tag bytes
			
			// find the Length (the L in TLV)
			var length = data.slice(0,2);  // read two nibbles
			data = data.slice(2); // get past those nibbles
			length = 1 * ("0x" + length);  // cast to Number
			
			if (length & 0x80) {  // high bit set? (EMV Book 3 Annex B2)
				var lengthOfLength = length & 0x7F;  // bottom 7 bits = number of Length bytes
				lengthOfLength *= 2; // convert to nibbles!
				length = data.slice(0, lengthOfLength);
				data = data.slice( lengthOfLength ); // get past actual length
				length = 1 * ("0x" + length);  // cast to Number
			}
			
			length *= 2;  // number of nibbles of data to read

			var V = data.slice(0, length);

			// push the V onto the TLV array
			TLV[tag] = V;

			// if it was a constructed tag (FFEE01, e.g.) recurse:
			// bit 6 (mask 0x20) of the first tag byte marks a constructed tag
			if ( parseInt( tag.slice(0,2), 16 ) & 0x20 ) {
				var tmptlv = parseTags( V );
				for (var t in tmptlv)
					TLV[t] = tmptlv[t];
			}
			data = data.slice( length ); // get past the data
		}
	
	var THE_SUN_SHINES = 1;
	var amtRead, tag, found;

	while( THE_SUN_SHINES ) {

		if ( data == "") break; // loop ends

		// try 1-, 2-, then 3-byte tag identifiers (2, 4, 6 nibbles)
		found = false;
		for ( amtRead = 2; amtRead <= 6; amtRead += 2 ) {
			tag = data.slice(0,amtRead).toUpperCase();
			if ( tag in _KnownTags ) {
				found = true;
				break;
			}
		}

		if ( !found ) {  // no known tag?
			console.log( "Expected a tag, found none. Data:\n" + data );
			data = data.slice( 2 );  // no tag found; just advance 2 chars
			continue;  // and try again from the new position
		}

		readData( amtRead, tag ); // This method shortens data (by amtRead) each time
	}  // while loop
	
	return TLV;	 // return an object in which TLV[ key ] == V
}

 

The tactic we use here is brain-dead simple:

First, make available a big dictionary of tag identifiers, containing all known EMVCo (industry standard) tags, plus all known ID TECH proprietary tags. We call this dictionary _KnownTags, and you can test an identifier like ‘5A’ for existence by seeing if _KnownTags[ '5A' ] returns true.

Next: Parse!

Our parsing algorithm is super simple:

Read two nibbles at a time into a tag variable, and test whether the tag exists in the dictionary. All tags in the dictionary will be one, two, or three bytes long, so if we read 6 nibbles without finding a known tag, just advance the reading frame by 2 nibbles and continue on like nothing happened (after emitting a console message saying “Expected a tag, found none”). If you want to be fussy and throw an exception here, you can, but my philosophy is that (depending, of course, on the circumstances) a parser should by default be fail-soft (fault tolerant), in case you still want to use the rest of the parsed data.

Once a tag is found, use a worker method, in this case an inner function called readData(), to read past the tag, read the Length, and use the Length to read the Value. (Here, we need to be careful to check the top bit of the presumed Length, to see whether we need to follow the length-of-the-Length extensibility rule mentioned earlier.)

Put the Value into a storage object under a lookup key of tag.

At the end, return the storage object.
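
The dictionary lookup is the approach this article uses, but it isn't the only way to find tag boundaries. As an alternative sketch (not the parser described above), the bit rules covered earlier can locate tags and lengths directly, with no dictionary at all. The name parseByRules is made up for illustration, and the sketch assumes well-formed hex input with no padding between TLVs; unlike the fail-soft approach, it simply stops if the data runs out.

```javascript
// Hypothetical dictionary-free parser: finds tag and length boundaries
// from the BER bit rules instead of looking tags up in a table.
function parseByRules( hex ) {
	var out = {};
	var pos = 0;

	while ( pos < hex.length ) {
		// Tag: one byte, plus continuation bytes if the low 5 bits are all set
		var start = pos;
		var b = parseInt( hex.slice( pos, pos + 2 ), 16 );
		pos += 2;
		if ( ( b & 0x1F ) == 0x1F ) {
			do {
				b = parseInt( hex.slice( pos, pos + 2 ), 16 );
				pos += 2;
			} while ( b & 0x80 );  // top bit set: another tag byte follows
		}
		var tag = hex.slice( start, pos ).toUpperCase();

		// Length: short form, or 0x8n followed by n bytes of length
		var length = parseInt( hex.slice( pos, pos + 2 ), 16 );
		pos += 2;
		if ( length & 0x80 ) {
			var n = ( length & 0x7F ) * 2;  // length-of-Length, in nibbles
			length = parseInt( hex.slice( pos, pos + n ), 16 );
			pos += n;
		}
		if ( isNaN( length ) ) break;  // ran out of data or hit non-hex input

		// Value
		var value = hex.slice( pos, pos + length * 2 );
		pos += length * 2;
		out[ tag ] = value;

		// Constructed tag (mask 0x20 on the first byte): parse the children too
		if ( parseInt( tag.slice( 0, 2 ), 16 ) & 0x20 ) {
			var children = parseByRules( value );
			for ( var t in children )
				out[ t ] = children[ t ];
		}
	}
	return out;
}

console.log( parseByRules( "5F0F8103AABBCC" ) );
```

The call at the end returns an object whose 5F0F entry is AABBCC. The trade-off is that a rules-only parser trusts the data completely, whereas the dictionary approach can notice when it has landed on something that isn't a known tag.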

So let’s try a real-world example. Suppose you’ve got an ID TECH Augusta chip-card reader, and you’re using it in keyboard mode to capture Quick Chip data. The data that streams out of the device when you dip a card might look like:

DFEE25020002DFEE26022000DFEE120A62994900000000000074DFEF5D105128CCCCCCCC2877D1801622CCCCCCCC57189F7E8B5A206B4F2CEA931148704EC549EDBAB728643E9197DFEF5B085128CCCCCCCC28775A10B5DECD79E3D200A6DE66A20C18DE80AC5F201A2F434849502054455354204341524420202020202020202020205F24031801315F25031501015F280208405F2A0208405F2D02656E5F3401005F57010050104465626974204D6173746572436172644F07A0000000041010820239008407A00000000410108C219F02069F03069F1A0295055F2A029A039C019F37049F35019F45029F4C089F34038D0C910A8A0295059F37049F4C088E1200000000000000004203440341031E031F039C01009F02060000000000009F03060000000000009F10120110200005620400000000000000000000FF9F13009F20009F2608C837A85C5DFE75739F2701009F34031E03009F360202669F3704BB8050C99F38009F3901079F4D009F4F00950504000000009B02E8008A025A3399009F5B00DFEF4C06002100000000DFEF4D28AA839B4B402083DDEC00614D1703B139A07586453583B4A03AB333FB210FD1CD4F8AC3603D75688E

This is a big block of TLV data that begins with an ID TECH proprietary tag of DFEE25. (You can learn more about what ID TECH’s tags mean by downloading the ID TECH TLV Tag Reference Guide from https://idtechproducts.atlassian.net/wiki/spaces/KB/overview.) Most of the tags in this block, however, are industry-standard EMVCo tags. If we assign the block, as a string, to a JS variable called tagblock, and then load the above parser and run it with parseTags( tagblock ), we’ll get back an object with tags and values, like this:

50: 4465626974204D617374657243617264
57: 9F7E8B5A206B4F2CEA931148704EC549EDBAB728643E9197
82: 3900
84: A0000000041010
95: 0400000000
99:
DFEE25: 0002
DFEE26: 2000
DFEE12: 62994900000000000074
DFEF5D: 5128CCCCCCCC2877D1801622CCCCCCCC
DFEF5B: 5128CCCCCCCC2877
5A: B5DECD79E3D200A6DE66A20C18DE80AC
5F20: 2F43484950205445535420434152442020202020202020202020
5F24: 180131
5F25: 150101
5F28: 0840
5F2A: 0840
5F2D: 656E
5F34: 00
5F57: 00
4F: A0000000041010
8C: 9F02069F03069F1A0295055F2A029A039C019F37049F35019F45029F4C089F3403
8D: 910A8A0295059F37049F4C08
8E: 00000000000000004203440341031E031F03
9C: 00
9F02: 000000000000
9F03: 000000000000
9F10: 0110200005620400000000000000000000FF
9F13:
9F20:
9F26: C837A85C5DFE7573
9F27: 00
9F34: 1E0300
9F36: 0266
9F37: BB8050C9
9F38:
9F39: 07
9F4D:
9F4F:
9B: E800
8A: 5A33
9F5B:
DFEF4C: 002100000000
DFEF4D: AA839B4B402083DDEC00614D1703B139A07586453583B4A03AB333FB210FD1CD4F8AC3603D75688E

 

Some of these tags are empty. Some (like 9F27) contain a Value of 00. Some are encrypted. But basically, you have all the tags you need, right here, to run an EMV transaction.
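
Some of the unencrypted values are just ASCII text in hex form. As a quick sketch (hexToAscii is a made-up helper name, and it assumes the value really is plain ASCII), you can turn them back into readable strings like this:

```javascript
// Hypothetical helper: convert a hex string to the ASCII text it encodes.
function hexToAscii( hex ) {
	var text = "";
	for ( var i = 0; i < hex.length; i += 2 )
		text += String.fromCharCode( parseInt( hex.slice( i, i + 2 ), 16 ) );
	return text;
}

console.log( hexToAscii( "4465626974204D617374657243617264" ) );  // value of tag 50
console.log( hexToAscii( "656E" ) );                              // value of tag 5F2D
```

With the values from the listing above, tag 50 decodes to "Debit MasterCard" and tag 5F2D (Language Preference) decodes to "en".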

Why use JavaScript to do TLV parsing? Well, if I told you the real answer to that, I’d be spoiling the suspense you’re no doubt feeling right now. So I’ll just hint around about what’s coming: ways to use Node.js in the payment-app environment, how to talk to credit-card readers using JavaScript, how to hit back-end test servers using Servlets and AJAX, etc. All of that is coming up soon right here, so bookmark this blog and come back soon!