Rijndael/80186 Version 1.2
--------------------------

This is an 80186/80286 assembly language implementation of the Rijndael
encryption algorithm developed by Joan Daemen and Vincent Rijmen. It is
a decent (I think) implementation of the algorithm, hardcoded to use
128-bit blocks and 256-bit keys. The code could probably still be
'zenned' (to borrow an expression from Michael Abrash), and that process
continues now that I finally have the embedded board which is my target.

This implementation uses 798 bytes of lookup tables: 768 bytes for the
xtime table and the direct and inverse s-boxes (256 bytes each), and 30
bytes for the round constants used in key scheduling. It requires an
additional 240 bytes of RAM for the expanded key schedule, that is, 15
round keys of 16 bytes each (the initial key addition plus 14 rounds).
Stack usage is minimal; none of these routines uses local stack
variables. Complete code size is currently 624 bytes, making total ROM
usage 1422 bytes.

This version fixes a major bug in the original version of the key
scheduling code which I missed (it computed *wrong* keys, and stored
them in the *wrong* place too). It is also somewhat smaller than the
original, since some common snippets of code have been factored into
procedures and more space-efficient instructions are used, as suggested
by Robert G. Durnal (afn21533@afn.org).

Initial benchmarking shows that my code is capable of encrypting data at
a rate of about 17,288 bytes/second on a 40 MHz Am186ES. 60,000
encryptions take about 55.56 seconds, so each encryption takes 9260
cycles (1 cycle = 4 oscillator periods). These figures are somewhat
rough, and better estimates will be provided in the future as I develop
a better means of code profiling.

The assembler I used was NASM 0.98. The code was tested under
linux-8086, using the elksemu program, and also on the AMD Net186
evaluation board. There seems to be no reason why it could not be used
with a 16-bit DOS compiler such as Turbo C++, but as is it would
probably work only with the small or tiny model.
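
For readers who want to see what the xtime and round constant tables
described above contain, here is a minimal sketch in C. It is not the
perl script shipped with this package, and the function names
(xtime, build_xtime_table, build_rcon, check_tables) are made up for
illustration. It assumes the tables follow the standard Rijndael
definitions: xtime is multiplication by {02} in GF(2^8) modulo 0x11B,
and the 30-byte round constant table holds one byte per constant,
starting from 1 and repeatedly applying xtime.

```c
#include <assert.h>
#include <stdint.h>

/* Multiply a GF(2^8) element by x ({02}), reducing modulo the Rijndael
 * polynomial x^8 + x^4 + x^3 + x + 1 (0x11B). */
static uint8_t xtime(uint8_t a)
{
    return (uint8_t)((a << 1) ^ ((a & 0x80) ? 0x1B : 0x00));
}

/* Fill a 256-entry table so that table[i] == xtime(i). */
static void build_xtime_table(uint8_t table[256])
{
    for (int i = 0; i < 256; i++)
        table[i] = xtime((uint8_t)i);
}

/* Fill n round constants: rcon[0] = 1, rcon[i] = xtime(rcon[i - 1]).
 * Assumption: one byte per constant, as in a 30-byte table. */
static void build_rcon(uint8_t *rcon, int n)
{
    uint8_t r = 1;
    for (int i = 0; i < n; i++) {
        rcon[i] = r;
        r = xtime(r);
    }
}

/* Sanity check using well-known values: {57}*{02} = {ae} and
 * {ae}*{02} = {47}; the ninth and tenth round constants are
 * {1b} and {36}. */
static void check_tables(void)
{
    uint8_t xt[256], rcon[30];
    build_xtime_table(xt);
    build_rcon(rcon, 30);
    assert(xt[0x57] == 0xAE);
    assert(xt[0xAE] == 0x47);
    assert(rcon[8] == 0x1B);
    assert(rcon[9] == 0x36);
}
```

Calling check_tables() from a small test program is enough to confirm
the generated values before dumping them as db lines for the assembler.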

To make the code work with memory models other than small or tiny,
modifications will be necessary, such as loading far pointers and
adjusting for the different sizes of stack variables and return
addresses.

Note that, as is, this code will not work on the 8086/8088 because it
uses shift instructions with immediate counts greater than 1. It
shouldn't be too hard to convert each of these into several shift
instructions with a count of 1. This is the only 80186/286 feature I
have used in this code. These shift instructions occur only in the key
expansion code, because of the somewhat complex indexing necessary
there. There has to be a better way than this; the quest for
improvements continues.

Included here is a small perl script used to generate the xtime tables
used by the program.

--
Rafael R. Sevilla                       +63 (2) 4342217
ICSM-F Development Team, UP Diliman     +63 (917) 4458925
PGP Key available at http://home.pacific.net.ph/~dido/dido.pgp