Digital - Computational
Computational Forensics
Outline
Introduction to Digital Forensic Science
Digital Evidence
Digital Forensic Ontology
Computational Forensics
Machine Learning and Data Mining
Pattern Classification, Search, Clustering
Dimensionality Reduction
Application Examples
Approximate String Search
Feature / Attribute Selection
Distributed Malware Detection
Forensics Lab
Digital Forensics:
A Brief Introduction
NISlab-App to Information Security
Biometrics Forensics
Forensic Readiness
User Authentication
Incident Response
BTA Protocol
Investigation/Analysis
Challenges and Demands in
Forensic Science
Challenges:
Tiny Pieces of Evidence are hidden in a
mostly Chaotic Environment,
Trace Study to reveal Specific Properties,
- Conjectures.
Current Situation
Knowledge and intuition of the human expert play a central role in daily
forensic casework.
Courtroom forensic testimony is often criticized by defense lawyers as
lacking a scientific basis.
Huge amounts of data, tight operational times,
and data linkage pose challenges.
Computational vs.
Computer (Digital) Forensics
Computational Forensics uses computational
sciences to study any type of evidence:
Computer forensics
Crime Scene Investigation
Forensic paleography
Forensic anthropology
Forensic chemistry
Computer Forensics studies digital evidence:
File-system forensics
Live-system forensics
Mobile-device forensics, etc.
Examples of Ongoing Research I
Examples of Ongoing Research II
Reconstruction of shredded and ripped-up documents by
Ukovich et al. 2007, de Smet 2007, and Chanda et al. 2010.
Digital Evidence Sources
Examples of Digital Evidence I
Undeleted (renamed) files, Deleted files
Windows registry, Log files
Print spool files, Browser caches
Temp files (all those .TMP files!)
Swap files
Alternate partitions
Digital Images, Videos, Audio files
Text Documents, Notes, Emails, Chat
Documents (e.g., GPS location, MACtimes)
Registries, Log files
Bomb-making diagrams
Malicious Software (e.g. Viruses, Worms)
Devices
Computers, PDAs, cellular phones,
SIM / Smart Card,
videogame consoles,
Copy machines, printers,
Cameras, electronic pen-tablets
Examples of Digital Evidence II
Wireless telephones
Numbers called
Incoming calls
Landline Telephones & Answering machines
Incoming/outgoing messages
Systems
Operating systems, database systems, networks, middleware
Selected Forensic Tools
EnCase by Guidance Software: Windows suite of forensic tools; quasi-standard.
Forensic Toolkit (FTK) by AccessData: court-validated investigator platform
for forensic analysis, incl. decryption and password-cracking capabilities;
popular alternative to the EnCase suite.
Autopsy & The Sleuth Kit: open source; Autopsy is a graphical interface for
The Sleuth Kit (TSK) command-line tools; both run on UNIX platforms, and on
Windows via Cygwin.
Oxygen Forensic Suite by Oxygen Software: mobile forensic software,
"smart forensics for smart phones".
COFEE by Microsoft: a useful tool for basic forensics.
Forensics Tasks vs Problem Areas
Tasks Accomplished (Examples)
Reveal evidence that puts a person at a keyboard at a specific time.
Recover deleted files,
Discover when files were created, modified, deleted, applications run and installed, websites,
Reassemble fragmented parts of images, and other files.
Problem Areas
Damaged Hardware - device is physically destroyed,
Securely overwritten - tools are used to destroy all the binary data on the disk,
Encrypted devices - unless encryption key can be obtained
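One of the tasks on this slide, discovering when files were modified or accessed, can be illustrated with a minimal Python sketch. It assumes a plain file on a mounted file system; the helper name `file_timestamps` and the throwaway demo file are hypothetical. A real examination would work on a write-protected copy with a dedicated tool, since reading a live file can itself alter its access time.

```python
import os
import tempfile
from datetime import datetime, timezone


def file_timestamps(path):
    """Return the timestamps reported by os.stat() as UTC datetimes."""
    st = os.stat(path)

    def to_utc(seconds):
        return datetime.fromtimestamp(seconds, tz=timezone.utc)

    return {
        "modified": to_utc(st.st_mtime),
        "accessed": to_utc(st.st_atime),
        # On Unix this is the last metadata change, not the creation time.
        "changed": to_utc(st.st_ctime),
    }


# Demonstration on a throwaway file (hypothetical content).
with tempfile.NamedTemporaryFile(delete=False) as handle:
    handle.write(b"evidence")
    demo_path = handle.name

stamps = file_timestamps(demo_path)
os.remove(demo_path)
```

The three values correspond to what the slides call MACtimes; which of them survive depends on the file system and on how the evidence was acquired.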
Forensic-tool Testing
Background / Motivation
Court rulings in Frye v. United States (D.C. Circuit, 1923) and
Daubert v. Merrell Dow Pharmaceuticals, Inc. (US Supreme Court, 1993)
Daubert Criteria
Has the method in question undergone empirical testing?
Has the method been subjected to peer review?
Does the method have any known or potential error rate?
Do standards exist for the control of the technique's operation?
Has the method received general acceptance in the relevant scientific community?
NIST Computer Forensics Tool Testing (CFTT)
http://www.cftt.nist.gov/
Scientific Working Group on Digital Evidence (SWGDE)
2009-01-15 SWGDE Recommendations for Validation Testing Version v1.1
IEEE Standard 829 - Standard for Software Test Documentation:
1983 version superseded by 1998 version.
Objective
Provide a comprehensive overview of the main
topics and concepts
Update the framework ontology for the
domain of digital forensics
Map some of the
existing relations between these concepts
Serve as a seed for further definition
Over time, become a common reference and define a
common vocabulary
Previous Work
Significant contributions
Forensics wiki. http://www.forensicswiki.org/wiki/.
Ashley Brinson, Abigail Robinson, and Marcus Rogers.
A cyber forensics ontology: Creating a new approach to
studying cyber forensics. In The Proceedings of the 6th
Annual Digital Forensic Research Workshop (DFRWS 06),
volume 3, 2006.
David Christopher Harrill and Richard P. Mislan.
A small scale digital device forensics ontology. Small scale
digital device forensics journal, 1, 2007.
Developed Ontology (two layers expanded)
http://www.mindmeister.com/maps/show/48668592
Main Concepts I
Digital Evidence
  Physical Evidence
  Logical
  External
Digital Forensic Methods
  Data Duplication
  Image Analysis
  Audio Analysis
  Document Analysis
  File Analysis
  Network Analysis
  Data Reduction
  Data Recovery
  Data Analysis
Digital Forensic Tools
Counter-Forensics
  Encryption
  Steganography
  Proxies
  Storageless devices
  Secure Deletion
  Data Tampering
Digital Forensic Crime Cases
  Cyber Crime Cases
  Traditional Crime Cases
Main Concepts II
Digital Forensic Process
  Preparation
  Identification
  Approach Strategy
  Preservation
  Collection
  Examination / Analysis
  Presentation
  Returning evidence
Professions
  Law
  Academia
  Military
  Private sector
Legal Aspects
Terminology
Future Work
Mapping / Linking all the relations that
exist across classes
Represent the digital forensics ontology in
machine readable form
Use of the Web Ontology Language (OWL)
World Wide Web Consortium (W3C). OWL Web Ontology Language
Overview. http://www.w3.org/TR/owl-features/.
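As a rough illustration of what a machine-readable form could look like, the sketch below renders a few parent/child pairs from the "Main Concepts II" listing as Turtle-style `rdfs:subClassOf` statements. Treating the slide's grouping as a subclass relation is an assumption made here, and the names `SUBCLASS_OF`, `to_identifier`, and `to_triples` are hypothetical; this is not the authors' ontology encoding.

```python
# Parent/child pairs copied from the "Main Concepts II" listing.
# Reading the grouping as "subclass of" is an assumption of this sketch.
SUBCLASS_OF = [
    ("Preparation", "Digital Forensic Process"),
    ("Identification", "Digital Forensic Process"),
    ("Preservation", "Digital Forensic Process"),
    ("Collection", "Digital Forensic Process"),
    ("Presentation", "Digital Forensic Process"),
    ("Law", "Professions"),
    ("Academia", "Professions"),
]


def to_identifier(label):
    """Turn a human-readable label into a CamelCase identifier."""
    return "".join(word.capitalize() for word in label.split())


def to_triples(pairs):
    """Render each pair as a Turtle-style rdfs:subClassOf statement."""
    return [
        f":{to_identifier(child)} rdfs:subClassOf :{to_identifier(parent)} ."
        for child, parent in pairs
    ]


triples = to_triples(SUBCLASS_OF)
```

A full OWL document would additionally need prefix declarations and class definitions, which are omitted here.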
Computational Forensics:
Admission of Artificial Intelligence
Methodologies in Forensic Sciences
Katrin Franke
Norwegian Information Security Laboratory (NISlab)
Gjøvik University College
www.nislab.no
Requirement of Adapted
Computer Models & Operators
[Figure: Computational Intelligence (imprecision, uncertainty, partial truth), with the labels Brain, Reasoning, and Natural Evolution]
NN: Neural Networks
FL: Fuzzy Logic
EC: Evolutionary Computation
Computational Methods
Signal / Image Processing : one-dimensional signals and two-dimensional
images are transformed for better human or machine processing,
Computer Vision : images are automatically recognized to identify objects,
Computer Graphics / Data Visualization :
two-dimensional images or three-dimensional scenes are synthesized from
multi-dimensional data for better human understanding,
Statistical Pattern Recognition :
abstract measurements are classified as belonging to one or more classes, e.g.,
whether a sample belongs to a known class and with what probability,
Machine Learning : a mathematical model is learnt from examples.
Data Mining : large volumes of data are processed to discover nuggets of
information, e.g., presence of associations, number of clusters, outliers, etc.
Robotics : human movements are replicated by a machine.
Objective
Study and development of computational
methods to
Assist in basic and applied research, e.g. to
establish or prove the scientific basis of a
particular investigative procedure,
Support the forensic examiner in their
daily casework.
Computational Forensics -
Definition
It is understood as the hypothesis-driven investigation of a
specific forensic problem using computers, with the primary
goal of discovery and advancement of forensic knowledge.
CF works towards:
1) In-depth Understanding of a forensic discipline,
2) Evaluation of the basis of a particular scientific method, and
3) Systematic Approach to forensic sciences by applying
techniques of computer science, applied mathematics and
statistics.
It involves Modeling and computer Simulation (Synthesis)
and/or computer-based Analysis and Recognition
Admission of
Computational Forensics
1. Need of Automatization,
Standardization, and Benchmarking
Automatization, Standardization,
and Benchmarking
Increase Efficiency and Effectiveness
Perform Method / Tool Testing regarding their
Strengths/Weaknesses and their Likelihood Ratio
(Error Rate)
Gather, manage and extrapolate data, and
synthesize new data sets on demand.
Establish and implement Standards for data,
work procedures and journal processes
Joint Research & Development:
Forensic and Computer Scientists
Education and training,
Revealing the state of the art in *each* domain
Sources of information on events, activities and financing
opportunities
International forum to peer-review
and exchange, e.g., IWCF workshops
Performance evaluation, benchmarking, proof and
standardization of algorithms
Resources in forms of data sets, software tools, and
specifications e.g. data formats
New Insights on problem description and procedures
Legal Framework ?!
Educational Information
CompFor courses and study programs
Gjøvik University College, NO (Master, PhD)
Uni. of Amsterdam, NL (Master)
TU Kaiserslautern, DE (Master, announced)
Article in Wikipedia *
http://en.wikipedia.org/wiki/Computational_forensics
Brief Tutorial and Overview Article
http://sites.google.com/site/compforgroup/publications
Links to relevant data collections *
http://sites.google.com/site/compforgroup/data
* To be extended!
http://iwcf2012.arsforensica.org
Computational Methods:
Machine Learning and Data Mining
Katrin Franke
Norwegian Information Security Laboratory (NISlab)
Gjøvik University College
www.nislab.no
Machine Learning, Data Mining
and Artificial Intelligence
General ML Approach
Data Collection
Large sample of data of how humans
perform the task
Model Selection
Settle on a parametric statistical model of
the process
Parameter Estimation
Calculate parameter values by inspecting
the data
Search
Using the learned model, find the optimal solution to the given problem
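The four steps above can be sketched in a few lines of Python. The example assumes a one-dimensional Gaussian model per class and invented measurements; the class names and the helpers `estimate`, `log_likelihood`, and `classify` are hypothetical and only illustrate the workflow.

```python
import math
from statistics import mean, pvariance

# 1. Data collection: labelled measurements (invented values).
DATA = {
    "short": [1.0, 1.2, 0.8, 1.1],
    "long": [3.0, 3.3, 2.9, 3.2],
}


# 2. Model selection: one Gaussian per class, chosen here by assumption.
def log_likelihood(x, mu, var):
    """Log density of a Gaussian with mean mu and variance var at x."""
    return -0.5 * math.log(2 * math.pi * var) - (x - mu) ** 2 / (2 * var)


# 3. Parameter estimation: mean and variance computed from the data.
def estimate(samples_by_class):
    return {
        label: (mean(values), pvariance(values))
        for label, values in samples_by_class.items()
    }


# 4. Search: pick the class whose model explains the observation best.
def classify(x, params):
    return max(params, key=lambda label: log_likelihood(x, *params[label]))


PARAMS = estimate(DATA)
```

With these parameters, `classify(1.05, PARAMS)` selects `"short"` and `classify(3.1, PARAMS)` selects `"long"`.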
Example Problem:
Handwritten Digit Recognition
Role of Machine Learning
Principled way of building high performance information
processing systems
Pattern Recognition
Goals:
A supervised / unsupervised classification of
patterns by means of computer technology
small intraclass and large interclass variation
Pattern:
as opposite of a chaos;
it is an entity, vaguely defined, that
could be given a name (Watanabe 1985)
Pattern Classification
[Figure: patterns of classes A, B, and C plus an unknown pattern "?" plotted in feature space; axes: Feature Vector 2, Feature Vector 3]
and Selection by using Training Patterns
Cross-validation by using Test Patterns
Pattern Representation and
Classification Example
[Figure: example patterns (X, C, A, B, B, A) described by Feature Vectors 1 to 3; features: size, number of corners; label: class]
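To make the idea of representing a pattern by a feature vector concrete, here is a minimal nearest-neighbour sketch in Python. The two features, size and number of corners, come from the slide; every numeric value, the table `TRAINING`, and the helper `classify` are hypothetical.

```python
import math

# (size, number of corners) -> class label; all values are hypothetical.
TRAINING = [
    ((14.0, 3.0), "A"),
    ((16.0, 3.0), "A"),
    ((16.0, 4.0), "B"),
    ((20.0, 4.0), "B"),
    ((24.0, 0.0), "C"),
]


def classify(features, training=TRAINING):
    """Return the label of the training pattern closest in Euclidean distance."""
    _, label = min(training, key=lambda item: math.dist(item[0], features))
    return label
```

An unknown pattern X with features `(19.0, 4.0)` is assigned to class B. In practice the features would be scaled first, so that size does not dominate the distance merely because of its larger numeric range.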
Classifier Training, or
How do Computers Learn?
Learning by Example !
Requirements
Representative Sample Data
Appropriate Feature
Encoding
Challenge
Class Discrimination
Avoid Over-Learning (Overfitting)
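Over-learning becomes visible once a classifier is scored on patterns it has not seen during training. The Python sketch below runs k-fold cross-validation on a toy task; the task, both toy models, and all function names are hypothetical and chosen only to make the effect obvious.

```python
def k_fold_indices(n, k):
    """Split range(n) into k contiguous folds of near-equal size."""
    folds, start = [], 0
    for i in range(k):
        size = n // k + (1 if i < n % k else 0)
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate(samples, labels, train, predict, k=5):
    """Average accuracy over held-out folds never used for training."""
    accuracies = []
    for test_idx in k_fold_indices(len(samples), k):
        held_out = set(test_idx)
        train_x = [s for i, s in enumerate(samples) if i not in held_out]
        train_y = [y for i, y in enumerate(labels) if i not in held_out]
        model = train(train_x, train_y)
        hits = sum(predict(model, samples[i]) == labels[i] for i in test_idx)
        accuracies.append(hits / len(test_idx))
    return sum(accuracies) / len(accuracies)


# Toy task: label the integers 0..19 as "even" or "odd".
samples = list(range(20))
labels = ["even" if x % 2 == 0 else "odd" for x in samples]


def memoriser_train(xs, ys):
    """Over-learned model: a lookup table of the training pairs."""
    return dict(zip(xs, ys))


def memoriser_predict(model, x):
    return model.get(x, "even")  # unseen input: fall back to a fixed guess


def rule_train(xs, ys):
    """Model that encodes the underlying rule; nothing to fit."""
    return None


def rule_predict(model, x):
    return "even" if x % 2 == 0 else "odd"


memoriser_score = cross_validate(samples, labels, memoriser_train, memoriser_predict)
rule_score = cross_validate(samples, labels, rule_train, rule_predict)
```

The memoriser is perfect on its own training data but reaches only chance level on the held-out folds, whereas the rule-based model generalises. Real data would normally be shuffled before being split into folds.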
Best-known Approaches for
Pattern Recognition
Template Matching
Syntactical or Structural PR
Statistical PR
Neural Networks
Model for Pattern Recognition
Classification: Test pattern -> Preprocessing -> Feature Measurement -> Classification
Training
Recognition Methods in Numbers
Statistical Pattern Recognition: A Review, A.K. Jain, R.P.W. Duin and J. Mao, 2000, PAMI
* Note that biologically inspired methods come in addition
Challenges in Cybercrime
Model for Pattern Recognition
Classification: Test pattern -> Preprocessing -> Feature Measurement -> Classification
Training: Training pattern -> Preprocessing -> Feature Extraction / Selection -> Learning
Our Research Focus
1. Generalization of several feature selection measures.
2. Optimization to derive globally optimal feature subsets.
CFS and mRMR Feature Selection
M. Hall. Correlation Based Feature Selection for Machine Learning.
Doctoral Dissertation, University of Waikato, Department of Comp.
Science, 1999.
H. Peng, F. Long, and C. Ding. Feature selection based on mutual
information: criteria of max-dependency, max-relevance, and min-
redundancy. IEEE Transactions on PAMI, Vol. 27, No. 8, pp. 1226-1238,
2005.
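For orientation, the two measures are commonly written as follows in the cited works. The notation is an assumption of this note, not taken from the slide: S is a candidate feature subset with k features, r-bar_cf is the mean feature-class correlation, r-bar_ff the mean feature-feature correlation, and I(.;.) denotes mutual information with the class variable c.

```latex
% CFS merit of a subset S containing k features (Hall, 1999)
\mathrm{Merit}_S = \frac{k\,\overline{r_{cf}}}{\sqrt{k + k(k-1)\,\overline{r_{ff}}}}

% mRMR criterion: maximize relevance minus redundancy (Peng et al., 2005)
\max_{S}\;\left[\frac{1}{|S|}\sum_{f_i \in S} I(f_i; c)
  \;-\; \frac{1}{|S|^{2}}\sum_{f_i, f_j \in S} I(f_i; f_j)\right]
```

Both reward subsets whose features are individually informative about the class while penalizing features that duplicate each other.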
Generic Feature Selection (GeFS)
Question: Can the CFS measure and the mRMR measures be
fused and generalized into a generic feature selection measure?
Definition 1: A generic feature selection (GeFS) measure is
defined as follows:
Search Procedures
Exhaustive search
  Globally optimal feature subsets
  Slow with complexity of O(2^n)
  Examples:
    Exhaustive search
    Breadth-first search
    Depth-first search
    Iterative deepening
    Branch-and-Bound
Heuristic search
  Locally optimal feature subsets
  Faster than exhaustive search
  Examples:
    Greedy search
    Gradient search
    Simulated annealing
    Genetic algorithms
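The difference between the two families can be shown on a toy problem. The Python sketch below compares exhaustive enumeration of all non-empty subsets with greedy forward selection; the three features and the score table are invented so that the greedy search ends in a local optimum, and the names `SCORE`, `exhaustive_search`, and `greedy_forward_search` are hypothetical.

```python
from itertools import combinations

FEATURES = ("a", "b", "c")

# Hypothetical subset scores: "a" looks best alone, but {"b", "c"} is best overall.
SCORE = {
    frozenset("a"): 0.6,
    frozenset("b"): 0.5,
    frozenset("c"): 0.5,
    frozenset("ab"): 0.7,
    frozenset("ac"): 0.7,
    frozenset("bc"): 1.0,
    frozenset("abc"): 0.8,
}


def exhaustive_search(features, score):
    """Evaluate all 2^n - 1 non-empty subsets and return the best one."""
    subsets = (
        frozenset(combo)
        for size in range(1, len(features) + 1)
        for combo in combinations(features, size)
    )
    best = max(subsets, key=score)
    return best, score(best)


def greedy_forward_search(features, score):
    """Add the single best feature at each step while the score improves."""
    selected, best = frozenset(), float("-inf")
    while True:
        candidates = [selected | {f} for f in features if f not in selected]
        if not candidates:
            break
        candidate = max(candidates, key=score)
        if score(candidate) <= best:
            break
        selected, best = candidate, score(candidate)
    return selected, best


global_best = exhaustive_search(FEATURES, SCORE.get)
greedy_best = greedy_forward_search(FEATURES, SCORE.get)
```

Here the exhaustive search finds the subset {b, c} with score 1.0, while the greedy search commits to feature a first and finishes at {a, b, c} with score 0.8.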
Problem Transformation
Chang's method for solving PM01FP
  Linearizing the PM01FP problem into a mixed 0-1 linear programming problem (M01LP).
  The number of variables & constraints: n^2
  Branch and Bound algorithm.
Our method for solving PM01FP
  Differently linearizing the PM01FP problem into a mixed 0-1 linear programming problem (M01LP).
  The number of variables & constraints: 4n+1
  Branch and Bound algorithm.
C-T. Chang. On the polynomial mixed 0-1 fractional
programming problems, European Journal of
Operational Research, vol. 131, issue 1,
Experimental Results
Summary
Cross-Computer Malware
Detection in Digital Forensics
Anders Orsten Flaglien, Peter Ekstrand Berg, Lars Arne Sand
Katrin Franke, Andre Arnes
Distributed Malware, such as Botnets
Application Scenario
Applied Method
Data Collection
Examination
  File Metadata Extraction
  Hash Filtering
  Feature Extraction
Analysis (Link Mining)
  Combining
  Pre-processing
  Clustering
  Linking Machines
deLink
A method that can assist human analysis in order to improve the decision
making and further improve the result of digital forensics
Features of interest have to be selected that
best represent the characteristics of the
input data
File metadata represent most of the
features
Content-based features improve
knowledge and represent file content
Special characteristics that
reflect typical malware patterns
Strings from regular expressions
Case supplied metadata should only
be used for tracing its origin, not for
the clustering task
Examination of input data to create a
textual and structured representation
Feature files are created from selected features, extracted from
copied disk images
ARFF file format is used (built using Fiwalk [16] and Python
scripts for additional features)
[Figure: one feature file per machine (Machine 1 ... Machine n); each file lists File Object 1 ... File Object n with Features A, B, C, D, E, F, G, H]
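A feature file in ARFF form is a small header followed by one comma-separated row per file object. The Python sketch below writes such a file as a string; the attribute names and values are hypothetical stand-ins for Features A to H, and `to_arff` is not part of Fiwalk or Weka.

```python
def to_arff(relation, attributes, rows):
    """Build a minimal ARFF document.

    attributes: list of (name, kind) where kind is "numeric" or a list of
    nominal values. Quoting of special characters is not handled here.
    """
    lines = [f"@relation {relation}", ""]
    for name, kind in attributes:
        spec = kind if isinstance(kind, str) else "{" + ",".join(kind) + "}"
        lines.append(f"@attribute {name} {spec}")
    lines += ["", "@data"]
    for row in rows:
        lines.append(",".join(str(value) for value in row))
    return "\n".join(lines) + "\n"


# Hypothetical features for two file objects from one machine.
ATTRIBUTES = [
    ("filesize", "numeric"),
    ("extension", ["exe", "dll", "txt"]),
    ("allocated", ["yes", "no"]),
]
ROWS = [
    (20480, "exe", "yes"),
    (512, "txt", "no"),
]

arff_text = to_arff("machine1", ATTRIBUTES, ROWS)
```

One such file per machine keeps the origin of every file object traceable before the files are combined for clustering.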
Link Mining is performed to identify
correlations, and thus identify malware traces
A dataset is generated from all Feature Files
from the machines (separated with case
metadata)
A descriptive data mining method, using an
unsupervised clustering algorithm
Clustering provides group detection; links
between multiple machines exist for
clusters in which their files are present
The simple K-means algorithm is used for clustering, with Euclidean
distance as the proximity measure
Preprocessing is performed to suit and to optimize the result of the
clustering algorithm
Removing redundant features
Converting features to an appropriate format (nominal and numeric)
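The clustering and linking step can be sketched as follows. This is a plain k-means with Euclidean distance written from scratch in Python, not the Weka implementation used in the work; the initialisation from the first k points, the two-dimensional feature values, and the machine identifiers are all assumptions made for illustration.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain k-means; the first k points serve as initial centroids."""
    centroids = [list(p) for p in points[:k]]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assign every point to its nearest centroid (Euclidean distance).
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assignment, centroids


# (machine id, feature vector) per file object; values are hypothetical.
FILES = [
    ("m1", (1.0, 1.0)),
    ("m2", (9.0, 9.0)),
    ("m2", (1.1, 0.9)),
    ("m3", (0.9, 1.2)),
    ("m1", (9.2, 8.8)),
]

assignment, _ = kmeans([features for _, features in FILES], k=2)

# A cluster links the machines whose files it contains.
linked = {}
for (machine, _), cluster in zip(FILES, assignment):
    linked.setdefault(cluster, set()).add(machine)
```

In this toy run one cluster contains files from m1, m2, and m3, which is the kind of cross-machine link the method looks for.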
Link Mining Evaluation is performed to
measure the results and to find clusters of
interest
Misinterpretation of the link
mining and use of features has to
be considered and can be
identified through link mining
evaluations [24, 33]
Classified data can be compared to
[Figure labels: Unalloc, Dir, Raw]
The practical implementation and involved
processing steps for Machines m in Case n
Multiple experiments were executed to
evaluate deLink's efficiency and effectiveness
Three experiments were
executed in a virtual
environment
Proof-of-concept
Passive self-developed bot
malware
Malware from the wild
Spybot v1.3
Figure 2: Botnet with C&C server and attack website
Most realistic results were obtained from the experiment using Spybot v1.3
A network infrastructure with command & control server, infection site and 5
victim computers was used
Online banking attack scenario, where the computers have been infected and
taken over by an adversary bot master to, e.g., execute DDoS
Hash-based filtering in the experiments
reduced the amount of input data for the
link mining to process
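Hash-based filtering discards file objects whose content matches a reference set of known files, so that only unknown files reach the link mining. A minimal Python sketch follows, assuming SHA-256 and in-memory content; the hash function actually used in the experiments is not stated on the slide, and the file names and contents are invented.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest of the SHA-256 hash of the given content."""
    return hashlib.sha256(data).hexdigest()


def filter_known(files, known_hashes):
    """Keep only (name, content) pairs whose hash is not in the known set."""
    return [
        (name, data)
        for name, data in files
        if sha256_hex(data) not in known_hashes
    ]


# Hypothetical reference set, e.g. hashes of a clean operating system install.
KNOWN = {sha256_hex(b"standard system library")}

FILES = [
    ("kernel32.dll", b"standard system library"),
    ("update.exe", b"unknown payload"),
]

remaining = filter_known(FILES, KNOWN)
```

Only `update.exe` remains after filtering, because the content of the other file matches a known hash.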
Cluster creation in the experiment grouped
the file objects according to their
characteristics
SOM diagrams were used to help estimate the nature of the
input dataset (used to choose the input k for k-means, but could also
be used alone for clustering objects)
Three segments were identified in the dataset from the
Malware from the Wild experiment
Figure 3: SOM diagram
The file objects with common characteristics were clustered across
all machines (k-means with k=3, using the Weka machine
learning tool)
Figure 4: Machine files representation in clusters (axes: Machine ID, Clusters C1, C2, C3)
Timeline visualization of experiment results,
reflecting one cluster, filtered with the
malicious IP address
Expert knowledge about an incident is available, e.g., the approximate
time when an incident occurred, origin, indications of effect and extent.
Figure 5: One cluster with files, having suspected IP, timelined (axes: Machine ID, Access Time)
T0: IE history files accessed, T1: Infection file accessed, T2: Additional infection files accessed
Summary
Concluding Remarks
Computational forensics holds the
potential to greatly benefit all of the
forensic sciences.
For the computer scientist it poses a new
frontier where new problems and challenges
are to be faced.
The potential benefits to society, meaningful
inter-disciplinary research, and
challenging problems should attract high
quality students and researchers to the field.
Further Reading
NAS Report: Strengthening Forensic Science in the United States: A Path Forward
http://www.nap.edu/catalog/12589.html
van der Steen, M., Blom, M.: A roadmap for future forensic research. Technical report, Netherlands Forensic Institute
(NFI), The Hague, The Netherlands (2007)
M. Saks and J. Koehler. The coming paradigm shift in forensic identification science. Science, 309:892-895, 2005.
United States v. Starzecpyzel, 880 F. Supp. 1027 (S.D.N.Y. 1995).
http://en.wikipedia.org/wiki/Daubert_Standard
C. Aitken and F. Taroni. Statistics and the Evaluation of Evidence for Forensic Scientists. Wiley, 2nd edition, 2005.
K. Foster and P. Huber. Judging Science. MIT Press, 1999.
Franke, K., Srihari, S.N. (2008). Computational Forensics: An Overview, in Computational Forensics - IWCF 2008, LNCS
5158, Srihari, S., Franke, K. (Eds.), Springer Verlag, pp. 1-10.
http://sites.google.com/site/compforgroup/
Nguyen, H., Franke, K., Petrovic, S. (2010). Towards a Generic Feature-Selection Measure for Intrusion Detection, In
Proc. International Conference on Pattern Recognition (ICPR), Turkey.
Nguyen, H., Petrovic, S. Franke, K. (2010). A Comparison of Feature-Selection Methods for Intrusion Detection, In
Proceedings of Fifth International Conference on Mathematical Methods, Models, and Architectures for Computer
Networks Security (MMM-ACNS), St.Petersburg, Russia, September 8-11. (accepted for publication)
Nguyen, H., Franke, K., Petrovic, S. (2010). Improving Effectiveness of Intrusion Detection by Correlation Feature
Selection, International Conference on Availability, Reliability and Security (ARES), Krakow, Poland, pp. 17-24.
Flaglien, A.O., Arnes, A., Franke, K. (2010). Cross-Computer Malware Detection in Digital Forensics, Techn. Report,
Gjøvik University College, June 2010.
Challenges
Objective
Applied Method
[Figure: search string of length M (example: ROLEX) compared with fragment frag_i, giving edit distance d_i]
Edit Distance Revisited
Definition: The edit distance between two words (strings) is
the minimal number of edit operations (insertions, deletions,
or substitutions) that must be performed to convert one word
into the other.
[Figure: four example word pairs with edit distances d_i = 1, d_i = 2, d_i = 3, d_i = 2]
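The definition corresponds to the classic dynamic-programming computation, sketched here in Python. This is the unconstrained edit distance with unit cost for each insertion, deletion, and substitution; the function name `edit_distance` and the example words are illustrative.

```python
def edit_distance(a, b):
    """Minimal number of insertions, deletions, and substitutions turning a into b."""
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,  # delete char_a
                    current[j - 1] + 1,  # insert char_b
                    previous[j - 1] + (char_a != char_b),  # substitute or match
                )
            )
        previous = current
    return previous[-1]
```

For example, `edit_distance("ROLEX", "ROLEK")` is 1 (one substitution) and `edit_distance("kitten", "sitting")` is 3.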
Constrained Edit Distance
F - the maximum number of consecutive deletions
G - the maximum number of consecutive insertions
[Figure: edit-distance matrix for the search string ROLEX (length M) against a fragment (length N), with the constraint F marked]
Results I
R = f(N, F)
Dependency of the data-set
reduction R on the length of
the fragments (N) for
different values of the maximum number of
consecutive deletions (F).
Constant restrictiveness threshold = 0
IF (d_i <= (N - M + threshold))
THEN (accept(frag_i))
M - length of the search string
N - length of the fragment
frag_i - the specific fragment
d_i - edit distance of frag_i
Summary
Two-stage search procedure
Constrained edit distance for pre-selection
(1st phase)
Enables rejection of fragments, in which the
detected string is too distorted
Level of tolerance is controlled by means of
the value of the constraints
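The pre-selection phase can be sketched in Python as a filter over fragments. Two simplifications are assumed here: the plain (unconstrained) edit distance stands in for the constrained variant, and the acceptance rule is taken to be d_i <= N - M + threshold, where M is the length of the search string and N the length of the fragment. The fragments and the function names are invented for illustration.

```python
def edit_distance(a, b):
    """Unconstrained edit distance (insertions, deletions, substitutions)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            current.append(
                min(
                    previous[j] + 1,
                    current[j - 1] + 1,
                    previous[j - 1] + (char_a != char_b),
                )
            )
        previous = current
    return previous[-1]


def preselect(search, fragments, threshold=0):
    """Keep fragments whose distance to the search string is within tolerance.

    A fragment of length N that contains the search string (length M)
    undistorted has distance N - M, so the threshold is the number of
    additional errors tolerated.
    """
    m = len(search)
    return [
        fragment
        for fragment in fragments
        if edit_distance(search, fragment) <= len(fragment) - m + threshold
    ]


FRAGMENTS = ["BUY ROLEX NOW", "BUY R0LEX NOW", "HELLO WORLD!!"]
strict = preselect("ROLEX", FRAGMENTS, threshold=0)
tolerant = preselect("ROLEX", FRAGMENTS, threshold=1)
```

With threshold 0 only the undistorted fragment is accepted; raising the threshold to 1 also admits the fragment in which the letter O was replaced by a zero.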