1 Motif extraction problem
Motif extraction is an important problem in genetic information
processing. It extracts common patterns (motifs) from the amino acid
sequences of proteins in the same category. These motifs are conserved
during evolution and characterize the function or structure of the proteins.
2 Motif evaluation by the MDL principle
As the criterion for motif evaluation, the minimum description length (MDL)
principle was adopted. The MDL principle selects the motif with the shortest
description length, defined below.
Description length = Complexity of motif + Classification error rate
The MDL principle enables us to compare a simple motif with exceptions
and a complex motif without exceptions as illustrated below.
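
The trade-off can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the method used in this work: motifs are written as regular expressions, complexity is taken to be the pattern length in characters, and a hypothetical `error_weight` puts the error rate on a scale comparable to the complexity. The sequences and both motifs are invented for the example.

```python
import re

# Invented toy data: sequences inside and outside a protein category.
POSITIVES = ["MACDKGL", "TACEKGP", "WACDKTS", "QRCDKGH"]
NEGATIVES = ["MACDRGL", "TPCEAGP"]


def complexity(motif: str) -> float:
    """Assumed complexity measure: length of the pattern in characters."""
    return float(len(motif))


def error_rate(motif: str, positives, negatives) -> float:
    """Fraction of sequences the motif classifies incorrectly."""
    pattern = re.compile(motif)
    missed = sum(1 for s in positives if not pattern.search(s))
    false_hits = sum(1 for s in negatives if pattern.search(s))
    return (missed + false_hits) / (len(positives) + len(negatives))


def description_length(motif, positives, negatives, error_weight=1.0):
    """Complexity of motif + (weighted) classification error rate.

    error_weight is a hypothetical scaling factor, not taken from the source.
    """
    return complexity(motif) + error_weight * error_rate(
        motif, positives, negatives
    )


def select_motif(candidates, positives, negatives, error_weight=1.0):
    """Return the candidate motif with the shortest description length."""
    return min(
        candidates,
        key=lambda m: description_length(m, positives, negatives, error_weight),
    )
```

Here `CDK` is a simple motif that misses one sequence in the category, while `C[DE]K` is a more complex motif with no exceptions. Which one `select_motif` returns depends on how heavily errors are weighted against complexity, which is the comparison the MDL principle makes possible.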