Sequence motifs can contain degenerate symbols, meaning they match more than one option at a given position.
WGATAR - [A or T] G A T A [A or G]
This is very close to the regular expression you would use to search a sequence.
grep '[AT]GATA[AG]' sequence.fa
Square brackets in regular expression syntax define a character set. A set matches exactly one character, which may be any of the characters listed inside the brackets. Every match of the example above is therefore the same length: 6 characters.
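To see the fixed match length in practice, here is a small sketch that prints only the matched text. The sequence is a made-up toy string, not real data, and the -o option ("only matching") is assumed to be available, as it is in GNU and BSD grep.

```shell
# Toy uppercase sequence, invented for illustration only.
seq='CCAGATAAGGTGATAGCC'

# -o prints each match on its own line instead of the whole input line.
matches=$(printf '%s\n' "$seq" | grep -o '[AT]GATA[AG]')
printf '%s\n' "$matches"
```

Each line printed is one occurrence of the motif, and each is exactly 6 characters long, because every bracketed set consumes a single character.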
The degenerate symbols are from the IUPAC standard:
|Symbol||Stands for||Character set|
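As a rough sketch of how the degenerate symbols map to character sets, the function below rewrites an uppercase IUPAC nucleotide motif as a grep pattern using sed. The name iupac_to_regex is hypothetical and not part of any tool. The mapping follows the standard IUPAC nucleotide codes, and the input is assumed to contain only uppercase DNA symbols.

```shell
# Hypothetical helper: expand uppercase IUPAC nucleotide symbols into
# bracketed character sets. A, C, G and T pass through unchanged, so the
# substitutions cannot interfere with each other.
iupac_to_regex() {
  printf '%s\n' "$1" | sed \
    -e 's/R/[AG]/g'  -e 's/Y/[CT]/g' \
    -e 's/S/[CG]/g'  -e 's/W/[AT]/g' \
    -e 's/K/[GT]/g'  -e 's/M/[AC]/g' \
    -e 's/B/[CGT]/g' -e 's/D/[AGT]/g' \
    -e 's/H/[ACT]/g' -e 's/V/[ACG]/g' \
    -e 's/N/[ACGT]/g'
}

pattern=$(iupac_to_regex WGATAR)
printf '%s\n' "$pattern"   # prints [AT]GATA[AG]
```

The result can be passed straight to grep, for example grep "$(iupac_to_regex WGATAR)" sequence.fa.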
Our sequence output was in lowercase, so to use these patterns we must either type them in lowercase or use the -i flag for “case insensitive”.
grep -i pattern file.fa
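A minimal sketch of the difference the flag makes, using a made-up lowercase FASTA record in place of a real file:

```shell
# Toy lowercase FASTA record, invented for illustration only.
fasta='>example
ccagataagg'

# Without -i the uppercase pattern finds nothing in lowercase sequence.
strict=$(printf '%s\n' "$fasta" | grep '[AT]GATA[AG]')

# With -i the same pattern matches regardless of case.
hits=$(printf '%s\n' "$fasta" | grep -i '[AT]GATA[AG]')

printf 'without -i: "%s"\n' "$strict"
printf 'with -i:    "%s"\n' "$hits"
```

Typing the pattern in lowercase, as in grep '[at]gata[ag]', would find the same line without the flag.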