Select folding in .ct file

ltchris
Joined: 02/25/2013

Hi,

I am very new to UNAFold, and to in silico RNA folding in general, so I hope my question is a trivial one.
To learn how to use UNAFold, I am following this document: http://www.springerprotocols.com/Abstract/doi/10.1007/978-1-60327-429-6_1

For the moment I am just trying to fold an RNA sequence, using hybrid-ss and hybrid-ss-min.

My problem is that the generated .ct file contains a bunch of foldings, and I want to display just one of them with sir_graph, ideally the "most representative" one (e.g. the one with the lowest free energy).
So how can I select a single folding from a .ct file?

(I should mention that I am running Windows 7, so I have the same problem as the one described here: http://mfold.rna.albany.edu/?q=node/109#comment-form, even though my PATH environment variable is set to the right UNAFold folder. I am therefore running hybrid-ss and sir_graph directly in the shell, because I can't use your UNAFold.pl script.)

Thanks in advance for your help

zukerm
Joined: 11/12/2010
The UNAFold.pl script contains a function that splits a .ct file into separate files, each containing a single secondary structure. That function is written in Perl, so here is a simple awk script that does the same job, appended below.

Usage is: awk -f split_ct.awk file-to-split prefix-for-split-files

For example, if "fubar.ct" contains 23 secondary structures, then the command

awk -f split_ct.awk fubar.ct junk

would create "junk_1.ct", "junk_2.ct", ... "junk_23.ct".

Each of these files would contain a single secondary structure.

---------- Begin awk file split_ct.awk---------------------------


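# Split a multi-structure ct file into individual files, one structure each.
# Usage: awk -f split_ct.awk file-to-split prefix-for-split-files

BEGIN   {
        count = 0; i = 0; n = 0;
        # ARGC-- also stops awk from reading the prefix (ARGV[2]) as an input file.
        if ( ARGC-- != 3 ) {
          print "usage:";
          print "awk -f split_ct.awk file-to-split prefix-for-split-files";
          exit 1;
        }
        CurrentFile = ARGV[2];
        }
{
        i++;
        # Each structure starts with a header line whose first field is the
        # number of base lines (n) that follow it.
        if ( i == (n+1) || n == 0 ) {
          n = $1; i = 0;
          if ( CurrentFile != ARGV[2] ) close(CurrentFile);
          count++;
          CurrentFile = ARGV[2] "_" count ".ct";
          print "CurrentFile=", CurrentFile;
        }
        print $0 > CurrentFile;
}

END {   if ( count ) close(CurrentFile); }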

----------- END of awk script ---------------------------------
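
For readers who do not have awk at hand (for instance on Windows), the same splitting logic can be sketched in Python. This is only an illustration and not part of UNAFold: the function name split_ct is made up here, and the sketch assumes, as the awk script does, that the first field of each header line is the number of base lines that follow it. Blank lines are skipped, which the awk script does not do.

```python
from pathlib import Path


def split_ct(ct_path, prefix):
    """Split a multi-structure .ct file into prefix_1.ct, prefix_2.ct, ...

    Hypothetical helper: assumes every structure starts with a header line
    whose first whitespace-separated field is the number of base lines.
    Returns the list of files written, in order.
    """
    # Drop blank lines so a trailing newline does not start a bogus structure.
    lines = [ln for ln in Path(ct_path).read_text().splitlines() if ln.strip()]
    written = []
    pos = 0
    while pos < len(lines):
        n = int(lines[pos].split()[0])      # number of base lines after the header
        block = lines[pos:pos + n + 1]      # header plus its n base lines
        out = Path(f"{prefix}_{len(written) + 1}.ct")
        out.write_text("\n".join(block) + "\n")
        written.append(out)
        pos += n + 1
    return written
```

Calling split_ct("fubar.ct", "junk") would write junk_1.ct, junk_2.ct, and so on, mirroring the awk example above.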

zukerm
Joined: 11/12/2010
I can't help you with the "Windows problem". I have an awk script that splits a ct file into individual ct files that contain one structure each. sir_graph can be used with those files. If you run the UNAFold.pl script, then the large ct output file is split into multiple files automatically.
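
As for picking the folding with the lowest free energy, a rough Python sketch is given here. It rests on assumptions that should be checked against your own .ct file: that each header line begins with the sequence length and carries the energy in the form "dG = value". The function name lowest_energy_structure is hypothetical and not part of UNAFold.

```python
import re
from pathlib import Path

# Assumed header form: "<length>  dG = <value>  <name>"; verify on your own file.
DG_PATTERN = re.compile(r"dG\s*=\s*(-?\d+(?:\.\d+)?)")


def lowest_energy_structure(ct_path):
    """Return (dG, lines) for the structure with the lowest dG, or None.

    Hypothetical helper: structures whose header has no recognizable
    "dG = value" field are ignored.
    """
    lines = [ln for ln in Path(ct_path).read_text().splitlines() if ln.strip()]
    best = None
    pos = 0
    while pos < len(lines):
        header = lines[pos]
        n = int(header.split()[0])          # number of base lines after the header
        match = DG_PATTERN.search(header)
        if match:
            dg = float(match.group(1))
            if best is None or dg < best[0]:
                best = (dg, lines[pos:pos + n + 1])
        pos += n + 1
    return best
```

The returned lines can be written to their own .ct file, which can then be displayed with sir_graph like any other single-structure file.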