Every genome contains a large number of uncharacterized proteins that may encode entirely
novel biological systems. Many of these uncharacterized proteins fall into related sequence
families. By applying sequence and structural analysis we hope to provide insight into novel
biology.
We analyze a previously uncharacterized Pfam protein family called DUF4424
[Pfam:PF14415]. The recently solved three-dimensional structure of the protein lpg2210
from Legionella pneumophila provides the first structural information pertaining to this
family. This protein additionally includes the first representative structure of another Pfam
family called the YARHG domain [Pfam:PF13308]. The Pfam family DUF4424 adopts a
19-stranded beta-sandwich fold that shows similarity to the N-terminal domain of
leukotriene A-4 hydrolase. The YARHG domain forms an all-helical domain at the C-terminus. Structure
analysis allows us to recognize distant similarities between the DUF4424 domain and
individual domains of M1 aminopeptidases and tricorn proteases, which form massive
proteasome-like capsids in both archaea and bacteria.
Based on our analyses we hypothesize that the DUF4424 domain may have a role in forming
large, multi-component enzyme complexes. We suggest that the YARHG domain may play a
role in binding a moiety in proximity to peptidoglycan, such as a hydrophobic outer
membrane lipid or lipopolysaccharide.
This article was published online in BMC Bioinformatics and is free to access.