Because some short repeat motifs have been shown to affect bacterial virulence, an algorithm capable of identifying such motifs was developed and used to screen the whole genome sequence available for Haemophilus influenzae. Various di- to hexanucleotide repeats were identified, confirming and extending previous findings on the existence of variable-number-of-tandem-repeat loci (VNTRs). Repeats with units of 7 or 8 nucleotides were not encountered. For all of the 3- to 6-nucleotide repeats in the H. influenzae chromosome, PCR tests capable of detecting allelic polymorphisms were designed. Fourteen of the 18 potential VNTRs were indeed highly polymorphic when different strains were screened. Two of the potential VNTRs appeared to be short and homogeneous in length; another one may be specific to the H. influenzae Rd strain. One of the primer sets generated fingerprint-type DNA banding patterns. The various repeat types also differed in intrinsic stability. The lengths of the VNTRs did not change among separate colonies derived from a single clinical specimen or in strains passaged for several weeks on chocolate agar plates. When several strains from different patients infected during an outbreak of lung disease were analyzed, increased but limited variation was encountered at all VNTR sites analyzed. One of the 5-nucleotide VNTRs proved to be hypervariable. This variability may reflect the molecular basis of a mechanism used by H. influenzae to successfully colonize and infect different human individuals.
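
The abstract does not describe how the repeat-finding algorithm works. As an illustration of the general idea only, and not the authors' method, a naive scan for perfect tandem repeats might look like the following minimal Python sketch. The function names (`is_primitive`, `find_tandem_repeats`), the unit-length range of 2 to 8, and the threshold of three copies are all assumptions made for this example.

```python
def is_primitive(unit):
    """Return True if `unit` is not itself a repetition of a shorter motif.

    For example "ATAT" is "AT" twice, so it is not a primitive 4-mer.
    """
    n = len(unit)
    return not any(
        n % d == 0 and unit == unit[:d] * (n // d) for d in range(1, n)
    )


def find_tandem_repeats(seq, min_unit=2, max_unit=8, min_copies=3):
    """Naively scan `seq` for perfect tandem repeats.

    Returns a sorted list of (start, unit, copies) tuples with 0-based
    start positions. All parameter defaults are illustrative assumptions.
    """
    seq = seq.upper()
    hits = []
    for k in range(min_unit, max_unit + 1):
        i = 0
        # Stop once fewer than k * min_copies bases remain.
        while i + k * min_copies <= len(seq):
            unit = seq[i:i + k]
            copies = 1
            # Count how many times the unit repeats back to back.
            while seq[i + copies * k:i + (copies + 1) * k] == unit:
                copies += 1
            if (
                copies >= min_copies
                and is_primitive(unit)
                and set(unit) <= set("ACGT")
            ):
                hits.append((i, unit, copies))
                i += copies * k  # jump past the repeat tract
            else:
                i += 1
    return sorted(hits)
```

On the toy sequence `"GGC" + "CAAT" * 5 + "TTG" + "AT" * 4 + "CG"`, this sketch returns `[(3, 'CAAT', 5), (26, 'AT', 4)]`. It reports only perfect repeats and is quadratic in the worst case, so a genome-scale screen would likely need a more efficient or mismatch-tolerant approach.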

