Thursday, May 5, 2005

First Rev design done

Have completed a first paper design of FRET. It will analyse a group of files in three phases, each performing a different task:
  • Phase 1: Parse each file individually and store detected structures in a database.
  • Phase 2: Parse each file, comparing its raw data to the already-detected structures, and use the comparison to identify new structures.
  • Phase 3: Compare all the detected structures for each file, and all the files' raw data, against each other to identify new structures.
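
As a rough illustration of how the three phases might fit together, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the function names (phase1, phase2, phase3), the in-memory dictionary standing in for the database, and especially the toy detector, which simply treats runs of printable ASCII as "structures". Phase 3 is simplified to a raw-data comparison only. It is not the FRET design itself.

```python
import re
from collections import defaultdict

# Hypothetical detector: runs of 4+ printable ASCII bytes stand in for "structures".
ASCII_RUN = re.compile(rb"[\x20-\x7e]{4,}")

def phase1(files):
    """Parse each file on its own and store what was detected (the 'database')."""
    db = defaultdict(list)  # filename -> list of (position, matched bytes)
    for name, data in files.items():
        for m in ASCII_RUN.finditer(data):
            db[name].append((m.start(), m.group()))
    return db

def phase2(files, db):
    """Search each file's raw data for byte patterns already in the database."""
    known = {pattern for found in db.values() for _, pattern in found}
    new = defaultdict(list)
    for name, data in files.items():
        seen = set(db[name])
        for pattern in known:
            pos = data.find(pattern)
            while pos != -1:
                if (pos, pattern) not in seen:
                    new[name].append((pos, pattern))
                pos = data.find(pattern, pos + 1)
    return new

def phase3(files):
    """Compare all files' raw data: (position, length) runs where every file agrees."""
    blobs = list(files.values())
    if not blobs:
        return []
    limit = min(len(b) for b in blobs)
    runs, start = [], None
    for i in range(limit):
        same = all(b[i] == blobs[0][i] for b in blobs)
        if same and start is None:
            start = i
        elif not same and start is not None:
            runs.append((start, i - start))
            start = None
    if start is not None:
        runs.append((start, limit - start))
    return runs

files = {"a": b"\x00HEAD\x00", "b": b"\x01HEADER\x01"}
db = phase1(files)
extra = phase2(files, db)
shared = phase3(files)
```

With the two sample buffers, phase 1 finds "HEAD" in one file and "HEADER" in the other, phase 2 then notices that "HEAD" also occurs inside the second file, and phase 3 reports the four bytes both files have in common at offset 1.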

I've also developed (borrowed, really) the following terminology to describe what's happening. GRAM: a data structure or pattern of bytes detected in a file or buffer. Each GRAM will have a position, length, type, confidence level and parent. The term GRAM is taken from the classical Greek for a letter. I'm using it because it is the root of the words bigram and trigram, which are used in cryptography when performing statistical analysis. FRET, after all, was inspired by coincidence counting, so why not treat an unknown file format like an encrypted file for analysis purposes?
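
To make the terminology concrete, here is a minimal Python sketch of what a GRAM record could look like, along with a bare-bones coincidence count between two buffers. The five fields come from the description above; the concrete types (a string for the type, a float between 0 and 1 for the confidence level), the helper methods and the coincidences function are my own assumptions for illustration, not a settled design.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Gram:
    position: int                     # byte offset within the file or buffer
    length: int                       # number of bytes covered
    type: str                         # assumed: a free-form label such as "ascii"
    confidence: float                 # assumed: 0.0 (guess) to 1.0 (certain)
    parent: Optional["Gram"] = None   # enclosing GRAM, if any

    def end(self) -> int:
        """Offset one past the last byte of this GRAM."""
        return self.position + self.length

    def depth(self) -> int:
        """Number of ancestors between this GRAM and the top level."""
        return 0 if self.parent is None else 1 + self.parent.depth()

def coincidences(a: bytes, b: bytes, shift: int = 0) -> int:
    """Count positions i where a[i + shift] == b[i]."""
    return sum(x == y for x, y in zip(a[shift:], b))

header = Gram(position=0, length=16, type="header", confidence=0.9)
magic = Gram(position=0, length=4, type="magic", confidence=0.5, parent=header)
```

The coincidence count is the same trick used against repeating-key ciphers: slide one buffer against another (or against itself) and a shift that lines up a repeating structure produces far more matching bytes than chance would.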