Software used in the project
- Installation and configuration of leading genomics software:
- Phred - base-calling programme (Phil Green, University of Washington); see www.phrap.org
- Phrap - assembly programme (Phil Green, UW)
- Consed - assembly editor (David Gordon, UW)
- BLAST - database search engine from NCBI
- Perl scripts - tools created in this project by Markiyan Oliynyk:
- Auto_asm - automated processing and assembly
- Contigs2fasta - prepares consensus sequences for a multiple BLAST search
- Ace_map - assembly mapper
- Web_map - interactive map generator
- Blast Interactive Interface
- BlastAnalyse - extracts the results of a multiple BLAST search
- Prime - primer design programme for large assemblies
- Perl scripts - tools created in this project by Markiyan Samborskyy:
- phd_read_db_1_4.pl - imports Phred data from phd files into the phd.read_phd table (quality information for each read; search for "hits into structure"-like reads)
- ace_read_db_2_7.pl - imports all assembly data from the ace file into the ace.contigs_ace, ace.contigs_ab1 and ace.contigs_RT tables
- ace_cosmid_db_4_1.pl - simple mapping of cosmid ends to contigs; this information is later used for primer design by primer_db_1_1.pl
- primer_db_1_1_2.pl - new primer generator; uses data from the ace and phd databases populated by the ace_cosmid_db_4_1.pl, ace_read_db_2_7.pl and phd_read_db_1_4.pl scripts. It creates the table primers_new_for_xxx (where xxx is the genome assembly version). The information in this table is later used to generate plates of primers for ordering.
- fasta_file_gen_1_0.pl - generates the FASTA file used to create the genome BLAST search database. Each entry includes the assembly version, the contig number and the list of cosmids partially or wholly overlapping that contig.
- gnm_status.pl - updates the status section of the genome web page each time the genome is assembled.
- fasta_file_gen_primers_usable_1_0.pl - generates a FASTA file of the usable designed primers. It takes data about all primers designed for this project from the tables primers.primer_plates_designed and primers.primer_plates_designed_data, and selects the primers that are marked as usable. This file is used by primer_locate and for generating the Saccharopolyspora erythraea finishing primers BLAST database.
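
As an illustration of the kind of per-read quality information that phd_read_db_1_4.pl extracts from phd files, the following is a minimal sketch only. It is written in Python rather than Perl and is not the project's script. It assumes the usual Phred phd layout (a BEGIN_SEQUENCE line carrying the read name, and a BEGIN_DNA ... END_DNA block with one "base quality peak-position" triple per line). The function name parse_phd, the sample read name and the returned field names are hypothetical, and the real columns of the phd.read_phd table are not shown here.

```python
# Minimal sketch: parse one Phred .phd record and summarise its quality.
# Assumption: standard phd layout; names below are illustrative only.

SAMPLE_PHD = """BEGIN_SEQUENCE example_read.ab1

BEGIN_COMMENT
CHROMAT_FILE: example_read.ab1
END_COMMENT

BEGIN_DNA
t 9 6
c 12 18
a 30 31
END_DNA

END_SEQUENCE
"""


def parse_phd(text):
    """Return the read name, base sequence, quality list and mean quality."""
    name = None
    bases = []
    quals = []
    in_dna = False
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("BEGIN_SEQUENCE"):
            parts = line.split(None, 1)
            name = parts[1] if len(parts) > 1 else None
        elif line == "BEGIN_DNA":
            in_dna = True
        elif line == "END_DNA":
            in_dna = False
        elif in_dna and line:
            # Each DNA line is: base, quality value, peak position.
            fields = line.split()
            bases.append(fields[0])
            quals.append(int(fields[1]))
    mean_quality = sum(quals) / len(quals) if quals else 0.0
    return {
        "name": name,
        "sequence": "".join(bases),
        "qualities": quals,
        "mean_quality": mean_quality,
    }


if __name__ == "__main__":
    print(parse_phd(SAMPLE_PHD))
```

A record parsed this way could then be written to a database table with one row per read; how the project's script actually maps these values to columns is not described above.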
Data for the project is generated using one ABI Prism 3730xl capillary sequencer.
The sequencing data is processed and analysed on an updated Jseq computer cluster
consisting of two AMD and two Intel machines:
GENBASE (Intel Pentium 4 3.4 GHz LGA, 3 GB RAM),
- genome assembly, MySQL database server, web server,
GENWORKWIN (Intel Pentium 4 3.2 GHz LGA, 1 GB RAM),
- Windows 2000 SP4, database monitoring by means of MS Access 2000,
John (AMD Athlon MP 1700+ Dual),
JBLSEQDAT (AMD Athlon 700 MHz),
- firewall.
The two AMD machines and one of the Intel machines run the Mandriva Linux operating system.