CONNOTECH Experts-conseils Inc.

Core Frogbit Algorithm Explanations

by Thierry Moreau

May 1997 (revised and expanded July 1999)

Table of contents

Introduction
A New Arrangement for Cryptographic Data Integrity
Factual Summary of the Frogbit Algorithm
. . . . Overview
. . . . Run Length Process
. . . . Index Permutation Process
. . . . Key Source Selection
. . . . Initialization Procedure
An Heuristic Development Process
Component Collision Analysis

Introduction

A few different formulations are given elsewhere for the core Frogbit algorithm, but justification of the algorithm deserves a little more narrative explanation. This is the purpose of the present document.

A New Arrangement for Cryptographic Data Integrity

A stream cipher operates on cleartext messages serially, applying XOR encryption to each message bit. It uses a pseudo-random bit generator (PR generator) for which the seed is derived from the cipher secret key. Our definition of data integrity protection uses the bit-serial characteristic of a stream cipher. Let us assume that a bit-serial cleartext message is encrypted into ciphertext by a sender, and that a session key is established for this purpose (we ignore attacks based on different messages that could be encrypted with the same key). The legitimate receiver decrypts the ciphertext and gets the recovered cleartext. An adversary is given the opportunity to modify the ciphertext bits in transit from the sender to the receiver.

Definition.: We define a cipher that offers data integrity protection as one where the modification of a single bit in the ciphertext by the adversary forces the randomization of the remaining portion of the recovered cleartext.

It is easy to turn a cryptographic scheme strictly conforming to this definition into a scheme where any modification of the ciphertext has at most a 2^-L probability of remaining undetected by the legitimate receiver: simply append L bits of redundancy to the cleartext message (e.g. L bits set to zero), and have the legitimate receiver check their presence before accepting the recovered cleartext as genuine. We assume that the adversary neither cuts nor extends the ciphertext bit string, but such a concern may be adressed with other forms of simple message redundancy (e.g. a cyclic redundancy check).

We are now ready to describe the arrangement intended to provide the data integrity protection as defined. This is illustrated in figure 1. This figure is concrete with respect to the double XOR encryption applied serially to each cleartext message bit. This double encryption creates the inner sequence (the intermediate result in the double XOR encryption), with the useful property that the adversary can not alter the cleartext bit without altering the inner sequence bit.

Figure 1

The other parts of the figure 1 show a more generic and abstract view of actual schemes providing data integrity protection. The key store with its read logic represents the supply of random or pseudo-random bit pairs to the double XOR encryption. Secrecy is achieved by hiding the contents of the key store for the adversary. At the abstract level, the key store has no state information, and an explicit address or pointer value is provided independently for each bit pair to be retrieved (e.g. if the key store was implemented by a sequential PR bit generator, the abstract pointer moves by two bit positions at each step). The data integrity protection is provided by a properly designed decision machine. The only part of the arrangement that maintains state information is the decision machine. It has two independent inputs, both being unknown to the adversary: the inner sequence bit, and the key bit used by the second stage of the double XOR encryption. The design goal of the decision machine is to vary the pointer passed to the key store read logic according to past inputs.

The Frogbit algorithm is a practical version of our data integrity arrangement. We developed it with by a trial and error process where adversary attacks were simulated under the most defavourable conditions.

Factual Summary of the Frogbit Algorithm

Overview

The goal of the Frogbit cipher is to make sure that inverting one bit in the ciphertext (by a defrauder) leads to inverting the corresponding bit of the cleartext (as recovered by the legitimate receiver) plus a permanent change in the rest of the key stream, (and thus the rest of the recovered message is randomized for the legitimate receiver). So the data integrity protection may be provided by the Frogbit cipher with some cleartext message redundancy which should allow the receiver party to detect any such message randomization with high probability. This redundancy may take the form of a simple "exclusive or" checksum of sufficient length appended to the message.

With the Frogbit cipher, each cleartext bit m_i is doubly encrypted by two key stream bits, k1_i and k2_i. The advantage of double encryption is not strengthened confidentiality. Instead, the double encryption creates a secret inner sequence bit s_i. It is significant that the secret inner sequence bit s_i and the second secret key stream bit k2_i, are both (pseudo-)random and (pseudo-)independent of each other. Moreover, inversing one bit of the ciphertext changes both the cleartext recovered by the legitimate receiver (according to the encryption equation e_i=k1_i XOR m_i XOR k2_i) and the inner sequence (according to the equation for inner sequence s_i).

The permanent change in the rest of the key stream comes from the variations of subsequent key stream bits k1_i+1 and k2_i+1, key stream bits k1_i+2 and k2_i+2, and so on, by a decision machine having finite state information and two inputs, respectively the secret inner sequence bit s_i and the second secret key stream bit k2_i. Suitable decision machines are such that it is hard to find a collision, that is two message substrings
m_i,m_i+1,...,m_i+k-1 and
m'_i,m'_i+1,...,m'_i+k-1
that start and end with the same state information in the decision machine. The core Frogbit algorithm is one such decision machine. There are three intermixed components in the core Frogbit algorithm:

the run length process,
the index permutation process, and
the key source selection.

Run Length Process

The term run length encoding refers to a coding or compression method based on counting consecutive repeated values in a message. For the Frogbit cipher, the run length encoding is applied to bits s_i of the inner sequence. In the mathematical formulation of the core Frogbit algorithm, the run length is the function r(i). In practice, the run length must be bounded by a finite upper limit, the parameter n. In the mathematical formulation of the core Frogbit algorithm, the bounded run length is the function r'(i). In the mathematical formulation of the core Frogbit algorithm, each bounded run is marked with r'(i)=0, and is the occasion for to "draw" a number, that is the function d(i). The draw function uses the second half of the key stream (k2_i) to produce the number. The preferred embodiment uses the parameter n set to 2.

The Frogbit run length process is only the first part of a finite state machine characterized by its dual input s_i and k2_i. The distinctive feature of the core Frogbit specification is that a change in s_i will trigger, with a very high probability, an arbitrary modification to the following bit pairs <k1_i+1,k2_i+1>, <k1_i+2,k2_i+2>, and so on.

Index Permutation Process

The index permutation process maintains a table of ten permuted indexes, that is the function T(i). When the run length process draws the number d(i), two things occur: a new index d'(i) is established as the pseudo-random sequence selector for the next bit positions, and the index table function T(i) is updated. The function i'(i) is merely a function that "points to" the beginning of a bounded run that finishes at bit position i.

The permutation table P is an important source of diffusion for the index permutation process. The reference permutation table in was selected by a "fair" pseudo-random program that enforced a number of criteria. It limited the frequency of a given number in any column of the table (a limit of three occurrences per column). It forced the permutation of each row to contain exactly two cycles (e.g. in the first row, positions 0, 2, 5, and 7 are permuted among themselves, as are every other positions). It didn't allow any cycle of length one. Additional care was taken to avoid cycles made of the same positions in two different rows.

Key Source Selection

The ten independent pseudo-random sources need only to meet the generic requirements to have ten independent pseudo-random sequences securely derived from a secret key (without knowledge of the secret key, it is computationally infeasible to guess any portion of any sequence, or any correlation between any two sequences). The pseudo-random sequences are independently utilized by the Frogbit cipher, one bit at a time: the utilization (or absence of utilization) of one sequence must not change the state (e.g. current location within the sequence) of any other sequence. Then, for each bit i processed by the Frogbit cipher, the index permutation process provides an index d'(i-1) which selects the pseudo-random sequence to utilize for the pair of bits <k1_i,k2_i>.

The Frogbit cipher accommodates any pseudo-random generator which is acceptable as a stream cipher. Moreover, the Frogbit cipher algorithm might be secure even with bad generators, of which security would be questionable if used in a normal stream cipher arrangement. For instance, the LFSR (Linear Feedback Shift Register) is a type of pseudo-random number generator easily implemented in hardware.

Initialization Procedure

A simple initialization procedure for the Frogbit cipher is to set the seeds for the ten independent pseudo-random generators from the secret key. In fact, secret initialization can go beyond the ten pseudo-random seeds: every single element of the Frogbit state can be derived from a secret key. It is thus possible to use diverse conventions for the secret part of the initial state information.

The Frogbit cipher state comprises the following elements: for the run length processing, 1) the previous bit for the run length processing, that is s_i-1, 2) the current bounded run count, that is r'(i-1), 3) a partial sum, that is the accumulated binary representation k2_i-r'(i-1), k2_i-r'(i-1)+1, ... k2_i-1 (actually, when n=2, this is at most a single bit of information); for the index permutation process, 4) the current permutation table, that is T(i-1-r'(i-1)); for the key source selection, 5) the current key stream number, that is d'(i-1), and finally 6) the state of the ten pseudo-random generators, or the ten key stream positions.

The number of states for item 6) of the Frogbit cipher state can be viewed as 10 times the count for each pseudo-random generator (for a purely periodic generator, the state count is the period of the generator). Even with the simplest pseudo-random generators, the total is a reasonable figure by current standards of secret key cryptography. Alternatively, for the state of the ten independent key streams, we may focus on the advance of each one. This information is relative to the start of message processing. Then, the state consists of the count of bit pairs extracted from each key stream. These two approaches are applicable respectively to the proprietary algorithms and the hash function.

An Heuristic Development Process

The design rationales for the core Frogbit algorithm are better described as an ad-hoc heuristic development process than as the refinement of prior research in the field of cryptographic data integrity. Yet, the original incentive for our work was the sparing published work on practical data integrity algorithms in the context of a stream cipher. From the implementor's perspective, it is somehow annoying to implement a block cipher for data integrity services (using the CBC-MAC construct) when a good stream cipher is used for confidentialy services. A representative application occurs in embedded cryptographic applications if a configuration file is stored outside of a tamper-proof enclosure which encompasses little more than the computing core of the apparatus. In this context, the following requirements can be stated:

the needed protections include both confidentiality and data integrity,
the standardization of algorithms is irrelevant,
the key management scenario is simple (the secret key is randomly generated within the tamper-proof core, and it is stored in a small memory also within this core), and
if a stream cipher is used, any update to the configuration file requires the generation of a fresh random secret key.

The actual heuristic development process started with the vague idea that the run length encoding could provide the projection of a single bit modification to adjacent bits in a stream cipher. Every other feature or component was added by a sequential trial and error process, with some empirical measure of progress towards the Frogbit primary objective. In this random search, we did not follow routes with very bad efficiency characteristics. In addition, we stopped when the algorithm on the drawing board appeared to fulfill the requirements. Admittedly, it is possible that the end-result applies in a sub-optimal way some cryptographic properties for which formal definitions and proofs are yet to be found.

For the record, the important aspects of the Frogbit algorithm were put into place in the following sequence:

the inner sequence, and the utilization of independent and secret bits s_i and k2_i,
the key store organization where no PR bit is ever discarded (we considered other schemes where PRNG output bits are discarded), and finally
the full permutation of the index table.

As we headed towards a better looking algorithm, we had to refine our security measurement approach. The obvious weaknesses of early attempts were easily found with manually entered test data. Then, exhaustive search for collisions was used, and we obviously couldn't avoid making use of the birthday paradox of data integrity algorithms (although we were looking for data integrity of encrypted messages, we attempted to observe collisions in the clear). When the residual collisions of our algorithm were not visible with the "resolution" of direct birthday attacks, we reverted to component collision search. We will discuss the concept of component collision in the next section.

There are a couple of ad-hoc verifications which may be applied to the final Frogbit algorithm, but which can only give negative indications about its validity. The Frogbit state is a structured piece of information, something radically different from the familiar set {0,1}^L (binary strings of length L, to which current data integrity algorithms maps the message space, e.g. L=160 for the SHA). Whatever the set of possible Frogbit state is, its probability distribution is definitely not uniform. We refer to the probability distribution induced by uniformly sampling messages from the message space and taking the corresponding final Frogbit state (assuming a fixed secret key). So, we felt it was necessary to assess the Frogbit state entropy associated with the final Frogbit state probability distribution. Short of a mathematical analysis of our algorithm, we created an ad-hoc data compression program to assess the typical bit length of compressed Frogbit final state. At best, this gives an upper limit on the actual Frogbit state entropy. These experimental results showed bit counts usually between 108 and 160. Given imperfections in our methodology, we can not claim that the actual entropy is over 100 bits, but it shouldn't be too far. This order of magnitude seems consistent with the stated goal for the Frogbit algorithm.

In a sense, another test is the mere observation of the algorithm efficiency in comparaison with other data integrity algorithms. On bit rate comparaison using a 32 bit computer, the Frogbit algorithm is about four times slower than a fast DES implementation (using S-box cooking and full precomputation of the key schedule). The Frogbit algorithm is relatively more performant on small computers (it processes mainly small values). If the Frogbit algorithm was really fast, this single fact would alone justify an attitude of scepticism. Even if a fast secure hash algorithm existed with the data integrity arrangement of figure 1, it is improbable that our heuristic development methodology would have found it.

At the end of the journey, the heuristic development process gave the core Frogbit algorithm, which fits into the data integrity arrangement shown in figure 1. The algorithm and the arrangement are radically different from prior design principles for cryptographic data integrity solutions. For this reason, the notion of modes of operation has to be revisited in the context of the Frogbit algorithm.

Component Collision Analysis

Our strongest validation of the Frogbit algorithm security is component collision search. Let us use the notation G_i for the Frogbit state just before the processing of bit i. A full Frogbit collision is two inner sequence substrings s_c, s_c+1, ..., s_c+k-1 and s'_c, s'_c+1, ..., s'_c+k-1 that meets two requirements:

the two inner sequences start and end with the same pair of states G_c and G_c+k, and
they use identical sequences from the key store for its k2_i components (the k1_i components are embedded in the inner sequence bits, the latter being alterable externally through the ciphertext).

The Frogbit state contains structured data. It is thus possible to isolate one or more component of the Frogbit state and search collisions for this partial state information. From a theoretical perspective, requirement 1) encompasses requirement 2) (e.g. for finite length messages, it is always possible to define a huge key store that encompasses every possible PR sequences, and then the key stream positions in the state G_c would imply the exact PR sequences used in requirement 2). To be applicable in practice, a collision must also meet the following requirement:
the collision remains undetected with the mode of operation used in a particular implementation (e.g. cleartext message redundancy in an application of data integrity and encryption).

Using an analogy with block algorithms, a collision involving partial Frogbit state information is comparable with cryptanalysis of a block algorithm with a reduced number of rounds.

For a collision search, we used the table of permuted indexes as the core component of the Frogbit state. Indeed, since this table has only 10!=~20⁵ states and 20 different transitions at each update operation, so collisions are deemed to occur with small number of transitions. For instance, the permutation rows 14, 9, and 7 lead to the same state of the index table as the permutation rows 3, 17, and 0. Specialized research results in the field of graph theory might contribute to the understanding of the inherent difficulty of finding collisions given the permutation table and other Frogbit rules.

To find collisions when more components of the Frogbit state are taken into account, we programmed a simulation with exhaustive search using a birthday attack strategy. It took about 12 hours of computing time on a Pentium 75MHz processor to find a collision involving all of the Frogbit state, except for the sequences of k2_i. In the theoretical perspective where the Frogbit state implicitly defines the sequences of k2_i, the collision found has identical relative keystream positions, in the final state G_c+k, but the absolute keystream positions in the initial state G_c do not agree. The specific collision occurs from the (randomly chosen) initial state G_c where s_c-1=1, r(c-1)=0, T(i'(c))={4,1,5,2,0,9,7,8,3,6}, and d'(c-1)=7; the two alternative sequences of d(c+i)'s leading to the same final state are %, 9, %, 4, %, 6, %, 9, %, 6 and %, 5, %, 8, %, 6, %, 3, %, 2 (undefined values for d(c+i) are indicated by %). Exactly two bit pairs are extracted from the PR generators 5 to 9, and none from generators 0 to 4.

Coming back to our initial definition of data integrity, if we were able to find a complete collision of the core Frogbit algorithm, we could hope to estimate the very small probability that a random alteration of the ciphertext has a limited impact on the recovered plaintext. Our inalility to find such collisions is the strongest validation of the Frogbit algorithm so far. It is as if the PR generators embedded in the Frogbit algorithm were very important for its validity. The combination of a mode of operation with the core Frogbit algorithm should provide another layer of data integrity protection, but analytical cryptanalysis of a given combined specification (Frogbit plus PR generators plus mode of operation) remains possible in theory.

CONNOTECH Experts-conseils Inc.
9130 Place de Montgolfier
Montréal, Québec, Canada, H2M 2A1
Tél.: +1-514-385-5691 Fax: +1-514-385-5900