ct file does not include length

jwhipple
Joined: 05/16/2013

When running UNAFold.pl on a list of RNAs, I sometimes encounter the following error message:

Input File: somesequences.fasta_761.ct
The first record of ct file does not include length
The first record is: 0 dG = 999.999 chrI:14553292:14553313

Abnormal exit from sir_graph.
Exit status 43 from sir_graph

While the sequence is quite small, in this case 21 nt, UNAFold.pl sees the length as zero and exits from sir_graph. Because my list is not ranked by length, this error prevents UNAFold.pl from generating structures for the remaining sequences in the file. In the past, I've extended the sequence by a few nucleotides, which fixed the problem, but this is something I'd like to avoid doing.

Thanks in advance for your help,

Joe W

deev21
Joined: 11/25/2014
I'm having the same issue

I'm having the same issue with my code and I'm not able to fix it. I tried finding a solution online, but nothing seems to work for me.
Here is the website I'm working on at the moment: devennreview.in

Any help would be much appreciated.

Kind Regards,
Deev

zukerm
Joined: 11/12/2010
UNAFold.pl and bad ct files

This is my second and brief reply. In the UNAFold.pl.in "pre-script", find every occurrence of " die " and replace with " print ". I just verified that it works. Then issue "make" to generate the new script and "make install" to copy it (and everything else) into the appropriate "bin" directory. Note: You might have to erase the existing UNAFold.pl script to get "make" to generate a new version. Again, this is our fault for not designing a better makefile precursor.

zukerm
Joined: 11/12/2010
I have not viewed the

I have not viewed the sequences that give you a problem, but I am almost certain that these RNA or DNA sequences have no secondary structure. The result is a nonsense output file (.ct file) consisting of a single record (line). The length in that record is zero and the "free energy" is infinite, which is written as either 999.999 or "inf" depending on the architecture of your computer.
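As a rough illustration of how such a file could be detected before it reaches sir_graph, here is a minimal Perl sketch. It assumes only what the error message above shows, namely that the first whitespace-separated field of the first record is the sequence length. The function names ct_header_length and ct_file_is_usable are hypothetical and are not part of the UNAFold package.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not part of UNAFold): return the length stated in
# the first record of a ct file, or 0 if the record is missing or does
# not start with an integer.
sub ct_header_length {
    my ($header) = @_;
    return 0 unless defined $header;
    # Assumption: the first field of the header line is the sequence length.
    my ($length) = $header =~ /^\s*(\d+)\b/;
    return defined $length ? $length + 0 : 0;
}

# Hypothetical helper: true if the file can be opened and its first
# record reports a length greater than zero.
sub ct_file_is_usable {
    my ($path) = @_;
    open my $fh, '<', $path or return 0;
    my $header = <$fh>;
    close $fh;
    return ct_header_length($header) > 0 ? 1 : 0;
}

# Report on every ct file named on the command line.
for my $path (@ARGV) {
    my $verdict = ct_file_is_usable($path) ? 'ok' : 'no usable structure, skip';
    print "$path: $verdict\n";
}
```

Run on the file from the error message, this would classify somesequences.fasta_761.ct as one to skip, because its first record begins with 0.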

I am personally embarrassed that our software contains such poor error handling in this instance. What should happen in these cases is that a normal ct file should be created with no base pairs. The sir_graph structure drawing program would then not generate an error.
The structure prediction programs in the UNAFold package (hybrid, hybrid_ss, hybrid_min, hybrid_ss_min) should be modified to output a proper ct file with no base pairs if no structure exists. Failing that, the UNAFold.pl Perl script should be modified so that invalid ct files are ignored. I have a quick fix for you at the moment. Edit the UNAFold.pl.in file as directed below. Then issue "make" and then "make install". The first command will generate UNAFold.pl from UNAFold.pl.in. The second command will copy the script to $PREFIX/bin, where $PREFIX is whatever directory you selected when you configured the package (default is /usr/local).

Modifications to UNAFold.pl(.in)

Open the file in a text editor. Search for lines that begin with:
system($sirgraph, @flags, -ss => "${prefix}_$fold")
On each such line, replace
== 0 or die ($? == -1 ? $! : 'Exit status ' . ($? >> 8) . " from $sirgraph\n");
with a simple semicolon to end the line. In other words, if sir_graph fails, the script will continue to run. You may wish to replace everything after "or" with a print statement informing the user that sir_graph failed. That is probably a good idea.

You will probably have to do the same with other lines that use system commands. These are system('pstopdf' ... and system('ct2rnaml' .... In other words, get rid of the die on any line that causes trouble. That is the best I can do at the moment. I can probably devise an updated script that doesn't "die" when it encounters a problem with a single output file.
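One way to keep the diagnostic message while removing the fatal exit is to wrap the system call in a small helper that warns instead of dying. The following is only a sketch. The name run_or_warn is hypothetical and does not exist in UNAFold.pl, and the commented call assumes the variables $sirgraph, @flags, $prefix and $fold are defined as in the script.

```perl
use strict;
use warnings;

# Hypothetical helper (not in UNAFold.pl): run an external command and
# return 1 on success. On failure, print a warning to STDERR and return 0
# so that the caller can move on to the next file.
sub run_or_warn {
    my ($program, @args) = @_;
    my $status = system($program, @args);
    return 1 if $status == 0;
    # Same message as the original "or die" clause: $? is -1 when the
    # program could not be started, otherwise the exit code is in the
    # high byte of $?.
    my $reason = $? == -1 ? $! : 'Exit status ' . ($? >> 8);
    warn "$reason from $program; continuing\n";
    return 0;
}

# Intended use inside UNAFold.pl, in place of the "== 0 or die ..." form:
#   run_or_warn($sirgraph, @flags, -ss => "${prefix}_$fold");
```

The same wrapper could in principle be used for the pstopdf and ct2rnaml calls, so that a single bad ct file produces a warning but does not stop the run.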