Abstract 

This system demonstrates the effectiveness of the Parallel Inference Machine (PIM)
for the motif extraction problem, in which common patterns (motifs) are
extracted from a protein database. It uses the minimum description length
principle and genetic algorithms.

Features 

The experimental motif extraction system automatically extracts the patterns
common to a protein category, such as cytochrome c. The system treats a motif
as a stochastic rule so that it can handle exceptions to the classification of
proteins.
  1. The Minimum Description Length (MDL) principle was adopted as the criterion for motif evaluation, to keep motifs from overfitting the sample data.
  2. Genetic Algorithms (GA) were employed as the motif search method, to limit the effects of combinatorial explosion and to reduce search time.
  3. High parallelism on the PIM was achieved by exploiting trial parallelism, divide-and-conquer parallelism, and data parallelism.
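
As a rough illustration of how an MDL criterion can penalize overfitted motifs, the following Python sketch scores a motif as a stochastic rule using a two-part code: bits to describe the motif, plus bits to describe the category labels given whether each sequence matches. It is not the system's actual formulation. The function names, the cost model (one symbol per pattern character over a 21-symbol alphabet, half of log2(n) bits per probability parameter), the regular-expression matching, and the toy sequences are all assumptions made for illustration.

```python
import math
import re


def entropy_bits(k, n):
    """Bits to encode n binary labels, k of them positive, at the empirical rate k/n."""
    if n == 0 or k == 0 or k == n:
        return 0.0
    p = k / n
    return -n * (p * math.log2(p) + (1 - p) * math.log2(1 - p))


def mdl_score(pattern, sequences, labels, alphabet_size=21):
    """Two-part description length in bits (lower is better).

    Assumption: `pattern` uses only amino-acid letters and '.' as a wildcard,
    so each pattern character costs log2(alphabet_size) bits.
    """
    matcher = re.compile(pattern)
    hits = [bool(matcher.search(s)) for s in sequences]
    n_hit = sum(hits)
    n_miss = len(sequences) - n_hit
    pos_hit = sum(1 for h, y in zip(hits, labels) if h and y)
    pos_miss = sum(1 for h, y in zip(hits, labels) if not h and y)

    # Cost of describing the motif itself.
    model_bits = len(pattern) * math.log2(alphabet_size)
    # Cost of the two probabilities of the stochastic rule:
    # P(in category | match) and P(in category | no match).
    param_bits = 2 * 0.5 * math.log2(max(len(sequences), 1))
    # Cost of the labels, encoded separately for matching and
    # non-matching sequences; exceptions to the rule add bits here.
    data_bits = entropy_bits(pos_hit, n_hit) + entropy_bits(pos_miss, n_miss)
    return model_bits + param_bits + data_bits


# Made-up toy data, not taken from any real protein database.
sequences = ["AKCAACHG", "GGCKTCHL", "MCSQCHTV", "AAAAAAAA", "GLKVVDEA", "PPTCHAAA"]
labels = [True, True, True, False, False, False]

short_motif = mdl_score("C..CH", sequences, labels)
overfit_motif = mdl_score("AKCAACHG", sequences, labels)
print(f"C..CH: {short_motif:.2f} bits, AKCAACHG: {overfit_motif:.2f} bits")
```

On this toy data the short motif receives the lower score: the pattern that memorizes one sample sequence pays more bits for its own description and still leaves the other positive examples unexplained. In a search such as a GA, a score of this kind could serve as the fitness to be minimized.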
Figure 1: Configuration of Experimental Motif Extraction System