GCG programs require that sequence files contain no more than 325,000 nt. The file containing the T. pallidum sequence in FASTA format (gtp4.1.seq) is a single contig of 1,138,011 nt, which exceeds this limit. Coordinates and annotations in later versions may change as a result of additional corrections.
To make it easier to create a GCG database from the T. pallidum sequence, the contig has been subdivided into four GCG-formatted sequence files:
gtp4.1a.seq nt 1-300,000
gtp4.1b.seq nt 300,001-600,000
gtp4.1c.seq nt 600,001-900,000
gtp4.1d.seq nt 900,001-1,138,011
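The coordinates above are consistent with cutting the contig into consecutive 300,000-nt pieces, with the remainder in the last file. The following Python sketch reproduces those ranges and checks each piece against the 325,000-nt GCG limit. The 300,000-nt chunk size is inferred from the listed coordinates, and the function name chunk_ranges is hypothetical; the sketch only computes coordinates and does not write GCG-formatted files.

```python
# Sketch only: reproduces the coordinate ranges listed above.
# CHUNK is an assumption inferred from those ranges, not a documented setting.

GCG_LIMIT = 325_000      # maximum sequence length accepted by GCG programs
CONTIG_LEN = 1_138_011   # length of the gtp4.1.seq contig
CHUNK = 300_000          # assumed piece size


def chunk_ranges(total_len, chunk_size):
    """Return 1-based inclusive (start, end) ranges that cover total_len."""
    ranges = []
    start = 1
    while start <= total_len:
        end = min(start + chunk_size - 1, total_len)
        ranges.append((start, end))
        start = end + 1
    return ranges


for suffix, (start, end) in zip("abcd", chunk_ranges(CONTIG_LEN, CHUNK)):
    size = end - start + 1
    # Every piece must fit within the GCG limit.
    assert size <= GCG_LIMIT
    print(f"gtp4.1{suffix}.seq nt {start:,}-{end:,} ({size:,} nt)")
```
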
Instructions for downloading these four sequence files are provided below.
Use a user-friendly ftp interface to download the four files listed above, or enter the following commands at a Unix prompt:
ftp utmmg.med.uth.tmc.edu
Login: anonymous
Password: type in your email address here
ftp> cd pub
ftp> cd t_pallidum
ftp> bin (this sets the ftp mode to binary)
ftp> mget * (Answer y to each prompt.)
ftp> quit

At any point, you may use the pwd and dir commands to see where you are and what files are there.
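For unattended downloads, the same session can be approximated with Python's standard ftplib module. This is a minimal sketch, assuming the server accepts anonymous logins as described above and that pub/t_pallidum contains only regular files (a subdirectory in the listing would make the transfer fail). The helper names fetch_all and main are hypothetical.

```python
import os
from ftplib import FTP

HOST = "utmmg.med.uth.tmc.edu"   # host from the instructions above
REMOTE_DIR = "pub/t_pallidum"    # same as "cd pub" followed by "cd t_pallidum"


def fetch_all(ftp, dest_dir="."):
    """Download every entry in the current remote directory (like "mget *")."""
    os.makedirs(dest_dir, exist_ok=True)
    saved = []
    for name in ftp.nlst():
        local_path = os.path.join(dest_dir, os.path.basename(name))
        with open(local_path, "wb") as fh:
            # retrbinary always transfers in binary mode (the "bin" step).
            ftp.retrbinary("RETR " + name, fh.write)
        saved.append(local_path)
    return saved


def main(email, dest_dir="."):
    """Log in anonymously, using an email address as the password."""
    with FTP(HOST) as ftp:       # the connection is closed on exit
        ftp.login("anonymous", email)
        ftp.cwd(REMOTE_DIR)
        print(ftp.pwd())         # same as the "pwd" command
        return fetch_all(ftp, dest_dir)
```

To run it, call main with your own email address and, optionally, a destination directory.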
Questions, requests, or comments: treponema@utmmg.med.uth.tmc.edu