Dear Dr. Zuker,

First of all, I wish to thank you and your group members for developing the DNA hybridization/folding prediction tools and organizing them as a package. I am using UNAFold to predict the free energy changes associated with the hybridization reaction between two single-stranded DNA molecules (length < 100 nt). A given pair of single-stranded DNA molecules can form multiple distinct hybrid structures. I wish to determine the probability distribution of the distinct hybrid structures (and their associated free energy values) that can form between a pair of DNA sequences. Currently I am having difficulty understanding the results from stochastic tracebacks in the ‘hybrid’ command.

For the pair of sequences contained in the following two files:
Contents of file S1.seq: AGTGCGTGCA;
Contents of file S2.seq: AGGGCGCGAA;

I performed hybridization predictions (T = 37 °C, DNA, default options for all other parameters) using:
(1) hybrid-min, to predict the minimum free energy duplex structure and its free energy value
(2) hybrid with the stochastic tracebacks option, to obtain a distribution of different duplex structures together with their relative frequencies and free energy values

I expected the structure predicted by hybrid-min to be the one that occurs with the highest frequency in the stochastic traceback results. The most frequent hybrid structure in the stochastic traceback results does share the same duplex structure as the one predicted by hybrid-min, but its free energy value differs significantly. When I examined the elements that contribute to the reported energies, I found that the “Exterior” elements account for this difference. I do not understand why the free energy values from hybrid-min and from the stochastic tracebacks differ. It would be very helpful if you could provide some details on how the stochastic traceback option works.
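
To make my expectation concrete, here is a minimal Python sketch of the reasoning behind it. It assumes that stochastic tracebacks sample each structure in proportion to its Boltzmann weight, which is my assumption about how the option works and not something I have confirmed. The free energies are hypothetical placeholder values, not UNAFold output, and the function name is my own.

```python
import math

R_KCAL = 1.9872e-3  # gas constant in kcal/(mol*K)


def boltzmann_probabilities(energies_kcal, temp_celsius=37.0):
    """Return Boltzmann probabilities for structures with the given free energies.

    Assumption: each structure is sampled with probability proportional
    to exp(-dG / RT), where dG is its free energy in kcal/mol.
    """
    rt = R_KCAL * (temp_celsius + 273.15)
    # Shift by the minimum energy for numerical stability; ratios are unchanged.
    e_min = min(energies_kcal)
    weights = [math.exp(-(e - e_min) / rt) for e in energies_kcal]
    z = sum(weights)  # partition function over the listed structures only
    return [w / z for w in weights]


# Hypothetical free energies (kcal/mol) for three distinct duplex structures.
energies = [-3.0, -2.0, -1.0]
probs = boltzmann_probabilities(energies)

for e, p in zip(energies, probs):
    print(f"dG = {e:5.1f} kcal/mol  expected sampling frequency = {p:.3f}")
```

Under this assumption, the structure with the lowest free energy always has the highest expected sampling frequency, which is why I expected the hybrid-min structure and the most frequent traceback structure to agree in both structure and energy.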

