|
Our laboratory has developed a relational database, PA14 Transposon Insertion Mutant Database (PATIMDB, see Figure 1), to track,
sort and analyze the thousands of mutants in the library. During construction of the library, sequence data from the stretch of genomic
DNA adjacent to the inserted transposon was entered into the database. PATIMDB stores the results of BLAST alignments of these
sequences with sequences in the PA14 genome (http://ausubellab.mgh.harvard.edu/pa14sequencing/) and with sequences in the Pseudomonas aeruginosa
strain PAO1 genome (see http://www.pseudomonas.com and references 6 and 16).
The PATIMDB system handles process and sample tracking and automates mutant sequence analysis.
It has three main parts: (1) a database repository that stores experimental status information,
sample locations, and mutant characterizations; (2) a data-entry application that tracks samples
and experimental progress and performs automated sequence analysis on each sample to identify the locus
of each mutation; and (3) a data-retrieval web application that provides public access to the data.
The database was implemented using the MySQL RDBMS hosted on a multi-processor Intel system running RedHat Linux.
The data-entry application was written in Java and runs on Windows 2000. This application implements both
process tracking and the automated sequence analysis pipeline. Sequences in the form of ABI files
are imported into the application, and base-calling is performed using PHRED. The resulting raw sequence
is trimmed to select only the high-quality sequence, which is then compared by BLAST to the PA14 genome to
identify the insertion site of the transposon and the identity of the disrupted ORF.
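
The trimming and locus-identification steps described above can be sketched as follows. This is a minimal Python illustration, not the Java code used in PATIMDB. The text does not state the trimming rule, so the sketch assumes that the longest contiguous run of bases with a PHRED quality at or above a cutoff (20 here) is kept. The function names, the cutoff, and the ORF names and coordinates are all hypothetical, and the BLAST step is omitted: the insertion coordinate is simply taken as given.

```python
from typing import List, Optional, Tuple


def longest_high_quality_window(quals: List[int], min_q: int = 20) -> Tuple[int, int]:
    """Return (start, end) of the longest run with quality >= min_q (end exclusive)."""
    best_start, best_end = 0, 0
    run_start: Optional[int] = None
    # Iterate one step past the end so that a run reaching the last base is closed.
    for i in range(len(quals) + 1):
        good = i < len(quals) and quals[i] >= min_q
        if good and run_start is None:
            run_start = i
        elif not good and run_start is not None:
            if i - run_start > best_end - best_start:
                best_start, best_end = run_start, i
            run_start = None
    return best_start, best_end


def trim_read(bases: str, quals: List[int], min_q: int = 20) -> str:
    """Keep only the high-quality stretch of a base-called read."""
    start, end = longest_high_quality_window(quals, min_q)
    return bases[start:end]


def find_disrupted_orf(position: int, orfs: List[Tuple[str, int, int]]) -> Optional[str]:
    """Return the name of the ORF whose (start, end) span contains the insertion site."""
    for name, start, end in orfs:
        if start <= position <= end:
            return name
    return None  # insertion falls in an intergenic region


# Hypothetical example data (not taken from the PA14 genome).
EXAMPLE_ORFS = [("orf_A", 100, 900), ("orf_B", 1200, 2000)]
EXAMPLE_QUALS = [5, 8, 30, 35, 40, 12, 25, 28, 9, 4]

if __name__ == "__main__":
    print(trim_read("ACGTACGTAC", EXAMPLE_QUALS))
    print(find_disrupted_orf(1500, EXAMPLE_ORFS))
```

In a real pipeline the trimmed sequence would be passed to BLAST, and the coordinate of the best hit would replace the hard-coded position used here.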
The data-retrieval system was implemented as a Perl-based CGI web application hosted on a multi-processor Intel
system running RedHat Enterprise Linux with Apache. The data-retrieval web application supports
database queries and the download of data from PATIMDB over the web, for example, a list of mutants
with identified insertion locations. The database can be accessed online.
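
As an illustration of the kind of query mentioned above (a list of mutants with identified insertion locations), the following sketch uses Python with an in-memory SQLite database as a stand-in for the MySQL repository and the Perl CGI front end. The actual PATIMDB schema is not described here, so the table name, column names, and sample rows are hypothetical.

```python
import csv
import io
import sqlite3
from typing import List, Tuple

# Hypothetical rows: (mutant_id, plate, well, status, insertion_position, disrupted_orf)
DEMO_ROWS = [
    ("mutant_0001", "plate_01", "A1", "sequenced", 1500, "orf_B"),
    ("mutant_0002", "plate_01", "A2", "pending", None, None),
    ("mutant_0003", "plate_01", "A3", "sequenced", 450, "orf_A"),
]

CSV_HEADER = ["mutant_id", "plate", "well", "insertion_position", "disrupted_orf"]


def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory database with one assumed 'mutant' table."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        """CREATE TABLE mutant (
               mutant_id TEXT PRIMARY KEY,
               plate TEXT,
               well TEXT,
               status TEXT,
               insertion_position INTEGER,
               disrupted_orf TEXT)"""
    )
    conn.executemany("INSERT INTO mutant VALUES (?, ?, ?, ?, ?, ?)", DEMO_ROWS)
    return conn


def mutants_with_insertion(conn: sqlite3.Connection) -> List[Tuple]:
    """Select mutants whose insertion location has been identified."""
    cur = conn.execute(
        """SELECT mutant_id, plate, well, insertion_position, disrupted_orf
           FROM mutant
           WHERE insertion_position IS NOT NULL
           ORDER BY mutant_id"""
    )
    return cur.fetchall()


def rows_to_csv(rows: List[Tuple]) -> str:
    """Format query results as CSV text suitable for download."""
    buf = io.StringIO()
    writer = csv.writer(buf, lineterminator="\n")
    writer.writerow(CSV_HEADER)
    writer.writerows(rows)
    return buf.getvalue()


if __name__ == "__main__":
    print(rows_to_csv(mutants_with_insertion(build_demo_db())))
```

Filtering on a NULL insertion position is one simple way to separate characterized mutants from those still in progress; a production schema might instead use a dedicated status field.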
Extensive quality assurance testing was performed on the database, the data-entry application, and the
data-retrieval application to ensure that the PATIMDB system accurately relates and processes each
mutant DNA sequence file.
|