Sometimes you want the digest in both readable notation (such as hexadecimal) and raw binary. At other times you want the digest in a notation other than hexadecimal.
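As a minimal sketch of the first case, you can request the raw digest once and derive the hexadecimal form from it, rather than hashing twice. The variable names and the input string here are only placeholders for illustration.

```php
<?php
// Placeholder input; any string works.
$data = 'example input';

// Ask for the raw binary digest once (third argument TRUE)...
$raw = hash('sha1', $data, TRUE);   // 20 bytes for SHA-1

// ...and derive the readable form from it.
$hex = bin2hex($raw);               // 40 hexadecimal characters

// The derived string matches what sha1() returns by default.
var_dump($hex === sha1($data));
```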
The following getDigestNotation() function takes a binary string and returns it in base 2, 4, 8, 16, 32, or 64 notation. It works with sha1(), md5(), hash(), or anything else that can output a raw binary string.
It works similarly to the session.hash_bits_per_character php.ini configuration option.
You can specify which characters to use for each position, or use the default, which matches session.hash_bits_per_character (0-9, a-z, A-Z, "-", ","). The practical range for $bitsPerCharacter is 1 to 6. You may use more, but then you must supply your own base character string ($chars) that is at least pow(2, $bitsPerCharacter) characters long; if $chars is too short, the function lowers $bitsPerCharacter to fit. Even 7 bits per character needs a 128-character $chars, which is more than the 95 printable ASCII characters.
The output's radix relates to the value of $bitsPerCharacter as follows:
1: base-2 (binary)
2: base-4
3: base-8 (octal)
4: base-16 (hexadecimal)
5: base-32
6: base-64
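The table also determines how long the output will be: roughly the number of digest bits divided by $bitsPerCharacter, rounded up. A small sketch, assuming a 20-byte SHA-1 digest; digestNotationLength() is a hypothetical helper made up for this illustration, not part of the function below.

```php
<?php
// Hypothetical helper: how many output characters a digest of
// $byteCount bytes needs at $bitsPerCharacter bits per character.
function digestNotationLength($byteCount, $bitsPerCharacter)
{
    return (int) ceil($byteCount * 8 / $bitsPerCharacter);
}

// Print the radix and output length for each practical setting,
// using the 20 bytes of a SHA-1 digest.
foreach (range(1, 6) as $bits) {
    printf(
        "%d: base-%d, %d characters\n",
        $bits,
        1 << $bits,
        digestNotationLength(20, $bits)
    );
}
```

For SHA-1 this gives 160, 80, 54, 40, 32, and 27 characters for 1 through 6 bits per character.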
<?php
$raw = sha1(uniqid(mt_rand(), TRUE), TRUE);
echo getDigestNotation($raw, 6);

function getDigestNotation($rawDigest, $bitsPerCharacter, $chars = NULL)
{
    if ($chars === NULL || strlen($chars) < 2) {
        $chars = '0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-,';
    }

    if ($bitsPerCharacter < 1) {
        $bitsPerCharacter = 1;
    } elseif (strlen($chars) < pow(2, $bitsPerCharacter)) {
        // $chars is too short for the requested width: fall back to the
        // largest width that $chars can fully cover.
        $bitsPerCharacter = 1;
        while (strlen($chars) >= pow(2, $bitsPerCharacter + 1)) {
            $bitsPerCharacter++;
        }
    }

    $bytes = unpack('C*', $rawDigest);
    $byteCount = count($bytes);

    $out = '';
    $byte = array_shift($bytes);
    $bitsRead = 0; // bits of the current byte already consumed
    for ($i = 0; $i < $byteCount * 8 / $bitsPerCharacter; $i++) {
        if ($bitsRead + $bitsPerCharacter > 8) {
            // Not enough bits left in this byte: keep its unread low bits.
            $oldBits = $byte - (($byte >> (8 - $bitsRead)) << (8 - $bitsRead));
            if (count($bytes) == 0) {
                // Last byte: emit the leftover bits as a final, shorter character.
                $out .= $chars[$oldBits];
                break;
            }
            $oldBitCount = 8 - $bitsRead;
            $byte = array_shift($bytes);
            $bitsRead = 0;
        } else {
            $oldBitCount = 0;
        }

        // Take the next ($bitsPerCharacter - $oldBitCount) bits of the current byte.
        $newBitCount = $bitsPerCharacter - $oldBitCount;
        $bits = $byte >> (8 - ($bitsRead + $newBitCount));
        $bits = $bits - (($bits >> $newBitCount) << $newBitCount);
        $bitsRead += $newBitCount;

        if ($oldBitCount > 0) {
            // Prepend the bits carried over from the previous byte.
            $bits = ($oldBits << $newBitCount) | $bits;
        }
        $out .= $chars[$bits];
    }

    return $out;
}
?>
Lastly, depending on the digest length, there may be fewer bits remaining for the last character than $bitsPerCharacter. In that case the last character encodes only the leftover bits, so it is always drawn from the first few characters of $chars. The same thing happens with PHP's session ID generator, when 5 or 6 is used for session.hash_bits_per_character.
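Whether a shorter last character occurs can be checked from the digest length alone. A rough sketch, using the standard digest sizes of 16 bytes for MD5 and 20 bytes for SHA-1; leftoverBits() is a hypothetical helper introduced only for this illustration.

```php
<?php
// Hypothetical helper: number of bits left over for the final
// character (0 means the digest divides evenly).
function leftoverBits($byteCount, $bitsPerCharacter)
{
    return ($byteCount * 8) % $bitsPerCharacter;
}

$digests = array('md5' => 16, 'sha1' => 20); // digest sizes in bytes

foreach ($digests as $name => $byteCount) {
    foreach (array(5, 6) as $bitsPerCharacter) {
        printf(
            "%s at %d bits/char: %d leftover bit(s)\n",
            $name,
            $bitsPerCharacter,
            leftoverBits($byteCount, $bitsPerCharacter)
        );
    }
}
```

MD5 leaves 3 bits at 5 bits per character and 2 bits at 6, while SHA-1 divides evenly at 5 and leaves 4 bits at 6.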