hashin
Hash File Index Standard
Revision 1
2. Scope
This document specifies a hash-based index format.
It is applicable to individual files, both lossless and lossy, and to zip-archives.
It is not applicable to collections of files or to archive types other than
zip.
Lossless
Lossy
Byte-String
Sequence of bytes.
Flag-Bits
User-Perceived File-Duplicate
Two files which seem very similar or the same to a user but are in fact
different and thus produce different hashes.
hashin
4. Clauses
4.1. Definition
The hashin (Hash File Index Standard) consists of two separate parts:
4.1.1. Byte-String
Part 3: The hash of the content of the source-file as defined by the individual
version.
The tier requirements are rules that define which category hashins shall fall
under, depending on their source-files. They also define how and in which
format source-files shall be hashed.
Tier 2 requirements are aimed at source-files which are at high risk of
generating different hashins for user-perceived file-duplicates. This tier
should be used for casual management of archives.
The versions are the current revisions that define the exact specifications of
the standard. Every new version shall be backwards-compatible with all older
versions, provided that the hash-algorithm stays the same between versions.
4.2.0. Version 0
4.2.0.1. Byte-String
4.2.1. Version 1
4.2.1.1. Byte-String
Part 1: 8 flag-bits. The first 6 define the version and the last two define
the tier requirement list the hashin falls under. The first 6 bits shall
always be 000000. The last two shall read 00 if the Tier 1 Requirements are
met, 01 if the Tier 2 Requirements are met, 10 if the Tier 3 Requirements
are met and 11 if the Tier 4 Requirements are met.
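The flag-bit layout described above can be illustrated with a minimal Python sketch. It assumes that the "first" bits are the most significant bits of the byte, which this clause does not state explicitly. The function names `decode_flags` and `encode_flags` are hypothetical and not part of the standard.

```python
from typing import Tuple


def decode_flags(flag_byte: int) -> Tuple[int, int]:
    """Split the 8 flag-bits into (version_bits, tier).

    Assumption: the first 6 bits are the most significant ones.
    """
    if not 0 <= flag_byte <= 0xFF:
        raise ValueError("flag byte must fit in 8 bits")
    version_bits = flag_byte >> 2      # upper 6 bits, 000000 for Version 1
    tier = (flag_byte & 0b11) + 1      # 00 -> Tier 1, 01 -> Tier 2, 10 -> Tier 3, 11 -> Tier 4
    return version_bits, tier


def encode_flags(tier: int) -> int:
    """Build a Version 1 flag byte (version bits 000000) for a tier 1..4."""
    if tier not in (1, 2, 3, 4):
        raise ValueError("tier must be 1, 2, 3 or 4")
    return tier - 1


print(decode_flags(0b00000010))  # (0, 3)
print(encode_flags(2))           # 1
```

Under this bit-order assumption, a flag byte of 02 hex carries version bits 000000 and tier bits 10, that is, Tier 3.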
2. Document formats
.exe
.sh
.bin
.iso
.pdf
.epub
.csv
.doc
.docx
.odt
.xls
.xlsx
.ods
.flv
.flac
.png
.bmp
.tiff
Special rules:
Any source-image file shall be archived into a zip (as per the specification
given in the Tier 3 requirement list) containing a single file (itself) to
qualify for Tier 1.
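As an illustration of the special rule above, the following Python sketch wraps one image into a zip-archive containing only that file. The exact zip parameters required by the Tier 3 requirement list are not reproduced in this section, so the use of stored (uncompressed) entries is an assumption, and `wrap_in_zip` is a hypothetical helper name.

```python
import io
import zipfile


def wrap_in_zip(name: str, data: bytes) -> bytes:
    """Return the bytes of a zip-archive containing exactly one file.

    Assumption: entries are stored without compression; the standard's
    own zip requirements may differ.
    """
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", compression=zipfile.ZIP_STORED) as archive:
        archive.writestr(name, data)
    return buffer.getvalue()


archive_bytes = wrap_in_zip("picture.png", b"example image bytes")
with zipfile.ZipFile(io.BytesIO(archive_bytes)) as archive:
    print(archive.namelist())  # ['picture.png']
```

Note that `writestr` stamps the entry with the current time, so the archive bytes produced by this sketch are not identical between runs.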
4.2.1.2.2. Tier 2 Requirements
.mp4
.mkv
.mp3
.wav
.jpg
Special rules:
Any source-image file shall be archived into a zip (as per the specification
given in the Tier 3 requirement list) containing a single file (itself) to
qualify for Tier 2.
Tier 3 shall consist of any source-files not applicable to Tier 1 and Tier
2, except images. Archives shall also be included, which shall be in the
zip-format only.
Special rules:
Source-files which are not dependent on each other or fall into Tier 1
or Tier 2 should not be packed together into one archive. Additionally,
files which are dependent on each other but easily traceable (for
example, a small .exe to unpack one or more large .bin files) should be
hashed independently instead of being packed together with one single
hash for all files.
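The tier assignment by file extension can be sketched in Python as follows. The extension sets are copied from the lists as they appear in this section and may be incomplete. The `is_image` parameter and the function name `tier_for` are hypothetical: the caller has to state whether an unlisted file is an image, because Tier 3 excludes images and no further rule for them is given here. Tier 4 and the special rules are not modelled.

```python
import os
from typing import Optional

# Extensions as listed in this section (possibly incomplete).
TIER1_EXTENSIONS = {
    ".exe", ".sh", ".bin", ".iso",
    ".pdf", ".epub", ".csv", ".doc", ".docx", ".odt", ".xls", ".xlsx", ".ods",
    ".flv", ".flac",
    ".png", ".bmp", ".tiff",
}
TIER2_EXTENSIONS = {".mp4", ".mkv", ".mp3", ".wav", ".jpg"}


def tier_for(filename: str, is_image: bool = False) -> Optional[int]:
    """Return 1, 2 or 3 for a file name, or None if no rule is known."""
    extension = os.path.splitext(filename)[1].lower()
    if extension in TIER1_EXTENSIONS:
        return 1
    if extension in TIER2_EXTENSIONS:
        return 2
    if is_image:
        return None  # unlisted images are excluded from Tier 3
    return 3         # everything else, including zip-archives


print(tier_for("report.pdf"))   # 1
print(tier_for("holiday.JPG"))  # 2
print(tier_for("backup.zip"))   # 3
```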
Special rules:
4.2.1.3. Examples
Example 1:
02-89-71-5F-B8-A3-3F-2E-77-1E-66-D8-68-C1-C5-05-91-E8-75-93-F5-B9-
0D-37-0E-43-FA-DA-79-5B-E5-E4-3A-00-00-00-00-00-08
In parts:
89-71-5F-B8-A3-3F-2E-77-1E-66-D8-68-C1-C5-05-91-E8-75-93-F5-B9-0D-
37-0E-43-FA-DA-79-5B-E5-E4-3A hex, the SHA-256 hash.
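The byte-string of Example 1 can be taken apart with a short Python sketch. It assumes that the flag byte comes first and is followed directly by the 32-byte SHA-256 hash, as the example suggests. The remaining bytes are returned without interpretation, because their meaning is not given in this part of the text. `split_hashin` is a hypothetical helper name.

```python
EXAMPLE_1 = (
    "02-89-71-5F-B8-A3-3F-2E-77-1E-66-D8-68-C1-C5-05-91-E8-75-93-F5-B9-"
    "0D-37-0E-43-FA-DA-79-5B-E5-E4-3A-00-00-00-00-00-08"
)


def split_hashin(text: str):
    """Split a dash-separated hex byte-string into (flags, digest, rest)."""
    raw = bytes.fromhex(text.replace("-", ""))
    if len(raw) < 33:
        raise ValueError("expected at least 1 flag byte and a 32-byte hash")
    flags = raw[0]       # the 8 flag-bits
    digest = raw[1:33]   # 32 bytes, the SHA-256 hash
    rest = raw[33:]      # trailing bytes, left uninterpreted here
    return flags, digest, rest


flags, digest, rest = split_hashin(EXAMPLE_1)
print(f"{flags:08b}")   # 00000010
print(len(digest))      # 32
print(rest.hex())       # 000000000008
```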
Third parties may use any version greater than 0 and extend it with their own
attributes (which they shall put exclusively after the intact byte-stream) as
long as they remain compatible with the original specification. They shall set
the last flag-bit to 1 and give their extension a name in the following
name-scheme (asterisks will from here on mark place-holders):
Version number shall be replaced with the version they are deriving from and
to which they are remaining compatible.
Version number shall be replaced with the version number of the extension
developed by the third party.
Third parties specifically may not create standards with names in the following
name-scheme:
Third parties may create standards with names in the following name-
scheme:
Lastly, the rules for extension and derivative name-schemes are intended to
avoid confusion between the specifications made by third parties and those
made by this organization, and to ensure maximal compatibility of derived
works, while still allowing third parties to use the Hashin name and to
continue developing the standard should this organization cease to exist.
5.2. Annex B (normative)