Quick start


A working example is distributed with ColinearScan. It shows how to use ColinearScan to detect colinearity between two chromosomes (chromosome 2 of Arabidopsis and chromosome 5 of indica rice). The workflow below follows the example step by step. A script named run_example.sh is provided to run the whole pipeline.

BLAST

BLAST is used to detect homologs between the two chromosomes, using all anchors (genes, in the example) of one chromosome as subjects and those of the other as queries. The BLAST output for chromosome 2 of Arabidopsis against chromosome 5 of indica is included in the example as ath_chr2_indica_chr5.blast.

Parse BLAST output and get pairs

The BLAST results are parsed to extract homologous pairs of anchors (genes, in this example) that satisfy a given rule. In this example the rule is score >= 100.

cat ath_chr2_indica_chr5.blast | get_pairs.pl --score 100 > ath_chr2_indica_chr5.pairs
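
The filtering rule can be illustrated with a minimal Python sketch. It is not the code of get_pairs.pl. It assumes the BLAST output is in the standard 12-column tabular format with the bit score in the last column, and it assumes that only the best-scoring hit is kept for each query/subject pair. The function name pairs_from_blast and the gene names in the sample are made up for illustration.

```python
def pairs_from_blast(lines, min_score=100.0):
    """Return (query, subject, score) for hits with score >= min_score.

    Assumption: 12-column tab-separated BLAST output, bit score in column 12.
    Assumption: keep only the best score for each (query, subject) pair.
    """
    best = {}
    for line in lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query, subject = fields[0], fields[1]
        score = float(fields[11])
        if score < min_score:
            continue  # the rule used in the example: score >= 100
        key = (query, subject)
        if score > best.get(key, 0.0):
            best[key] = score
    # dicts keep insertion order, so pairs come out in first-seen order
    return [(q, s, sc) for (q, s), sc in best.items()]


# Hypothetical hits; the gene names are placeholders, not real identifiers.
sample = [
    "geneA1\tgeneB1\t85.0\t200\t30\t0\t1\t200\t1\t200\t1e-50\t310.0",
    "geneA1\tgeneB1\t80.0\t100\t20\t0\t1\t100\t1\t100\t1e-20\t150.0",
    "geneA2\tgeneB7\t40.0\t60\t36\t0\t1\t60\t1\t60\t0.001\t45.5",
    "geneA3\tgeneB2\t90.0\t150\t15\t0\t1\t150\t1\t150\t1e-60\t100.0",
]
pairs = pairs_from_blast(sample)
for query, subject, score in pairs:
    print(query, subject, score)
```

In this sample, the hit scoring 45.5 is dropped, and the two hits between geneA1 and geneB1 collapse into one pair with the higher score.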

Mask highly repeated anchors

Highly repeated anchors, which are mostly generated by successive single-gene duplication events, make colinear segments harder to detect. A very simple program is distributed to mask them: an anchor with more pairs than the indicated number is considered highly repeated and is removed from the pair file. Because repeat masking depends strongly on the characteristics of the specific genome data, users may need to build a more sophisticated algorithm.

cat ath_chr2_indica_chr5.pairs | repeat_mask.pl -n 5 > ath_chr2_indica_chr5.purged
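
The masking rule described above can be sketched in a few lines of Python. This is an illustration of the idea, not the implementation of repeat_mask.pl. It assumes each pair is a (query anchor, subject anchor) tuple and that the anchors of the two chromosomes are counted separately. The function name mask_repeats and the gene names are hypothetical.

```python
from collections import Counter


def mask_repeats(pairs, max_pairs=5):
    """Drop every pair that involves an anchor with more than max_pairs pairs."""
    query_counts = Counter(q for q, _ in pairs)
    subject_counts = Counter(s for _, s in pairs)
    return [
        (q, s)
        for q, s in pairs
        if query_counts[q] <= max_pairs and subject_counts[s] <= max_pairs
    ]


# Hypothetical pairs: geneA1 hits three different anchors, geneA2 hits one.
pairs = [
    ("geneA1", "geneB1"),
    ("geneA1", "geneB2"),
    ("geneA1", "geneB3"),
    ("geneA2", "geneB4"),
]
kept = mask_repeats(pairs, max_pairs=2)
print(kept)
```

With max_pairs=2, geneA1 has three pairs and is masked, so only the geneA2 pair remains. With the value 5 used in the example command, all four pairs would be kept.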

Estimate maximum gap length

Maximum gap length (mg) is the most important parameter of the colinearity detection algorithm. A program named max_gap.pl is provided to estimate mg values from the pair files (with highly repeated anchors masked). Note that chromosome lengths are needed to estimate mg values, so max_gap.pl requires length file(s) (see File Specification for more information).

max_gap.pl --lenfile ath_chrs.lens --lenfile indica_chrs.lens --suffix purged

Detect blocks from pair files

Once the mg values are estimated, we can use them to scan for colinearity between the chromosomes.

block_scan.pl --mg 321000 --mg 507000 --lenfile ath_chrs.lens --lenfile indica_chrs.lens --suffix purged

The results are written to block files (see File Specification for more information), one for each pair file used to generate them.
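
To give an intuition for why the maximum gap matters, the following Python sketch groups anchor pairs into blocks with a simple greedy, gap-limited chaining. It is deliberately simplified and is not the algorithm used by block_scan.pl, which this page does not describe. It assumes each pair is given as a (position on chromosome 1, position on chromosome 2) tuple. The function name chain_pairs, the min_anchors parameter, and the toy coordinates are all hypothetical.

```python
def chain_pairs(pairs, mg1, mg2, min_anchors=3):
    """Greedy gap-limited chaining (illustration only).

    A pair extends the current chain when it lies within mg1 of the previous
    pair on chromosome 1 and within mg2 of it on chromosome 2; otherwise a
    new chain starts. Chains with fewer than min_anchors pairs are dropped.
    """
    chains, current = [], []
    for pos1, pos2 in sorted(pairs):
        if current:
            last1, last2 = current[-1]
            if pos1 - last1 > mg1 or abs(pos2 - last2) > mg2:
                chains.append(current)
                current = []
        current.append((pos1, pos2))
    if current:
        chains.append(current)
    return [chain for chain in chains if len(chain) >= min_anchors]


# Toy coordinates: two dense runs of pairs separated by an isolated pair.
pairs = [
    (100, 1000), (300, 1250), (450, 1400),
    (5000, 200),
    (9000, 9100), (9200, 9300), (9350, 9500),
]
blocks = chain_pairs(pairs, mg1=500, mg2=500)
print(len(blocks))
print(chain_pairs(pairs, mg1=100, mg2=100))
```

With a gap of 500 on both chromosomes, the toy data yields two blocks of three pairs each. With a gap of 100, every chain breaks after a single pair and no block survives, which shows how an mg value that is too small fragments colinear segments.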




Copyright © 2006 Center of Bioinformatics, Peking University. All rights reserved.