banana |
This program calculates the magnitude of local bending and macroscopic curvature at each point along an arbitrary B-DNA sequence, using any desired bending model that specifies values of twist, roll and tilt as a function of sequence.
The data, based on the nucleosome positioning data of Satchwell et al. 1986 (J. Mol. Biol. 191, 659-675), correctly predicts experimental A-tract curvature as measured by gel retardation and cyclization kinetics, and successfully predicts curvature in regions containing phased GGGCCC sequences. (This is the model 'a' described in the Goodsell & Dickerson paper.)
This model - showing local bending at mixed sequence DNA, strong bends at the sequence GGC, and straight, rigid A-tracts - is the only one of the six models investigated in the Goodsell & Dickerson paper that is consistent with both solution data from gel retardation and cyclization kinetics and structural data from x-ray crystallography.
The consensus sequence for DNA bending is 5 As and 5 non-As alternating. "N" is an ambiguity code for any base, and "B" is the ambiguity code for "not A" so "BANANA" is itself a bent sequence - hence the name of this program.
The program outputs both a graphical display and a text file of the results.
Two possible models of sequence-dependent bending in free DNA have been proposed in the past. Nearest neighbor models propose that large-scale measurable curvature may arise by the accumulation of many small local deformations in helical twist, roll, tilt and slide at individual steps between base pairs. Junction models, on the other hand, propose that bending occurs at the interface between two different structural variants of the B-DNA double helix. Note that in both of these models, sequences which are anisotropically bendable - for instance, sequences with steps that preferentially bend only to compress the major groove - will lead to an average structure which is similar to a sequence with a rigid, intrinsic bend. The Goodsell & Dickerson paper does not distinguish between these two possibilities.
B-DNA has the special property of having its base pairs very nearly perpendicular to the overall helix axis. Hence the normal vector to each base pair can be taken as representing the local helix at that point, and curvature and bending can be studied simply by observing the behaviour of the normal vectors from one base to another along the helix. This is both easy to calculate and simple to interpret. This program displays the magnitude of bending and curvature at each point along the sequence. It is not intended as a substitute for more elaborate three-dimensional trajectory calculations, but only to express bending tendencies as a function of sequence. The power of this simple approach is in its ease of screening for regions of a given DNA sequence where phased local bends add constructively to form an overall curve.
For purposes of clarity the terms bending and curvature will be used in a restricted sense here. Bending of DNA describes the tendency for successive base pairs to be non-parallel in an additive manner over several base pair steps. Bending most commonly is produced by a rolling of adjacent base pairs over one another about their long axis, although in principle, tilting of base pairs about their short axis could make a contribution. In contrast, curvature of DNA represents the tendency of the helix axis to follow a non-linear pathway over an appreciable length, in a manner that contributes to macroscopic behaviour such as gel retardation or ease of cyclization into DNA minicircles. The distinction between local bending and macroscopic curvature is illustrated (poorly) in the following figure (see figure 1 of the Goodsell & Dickerson paper for a better view).
                   bend    bend    bend
                     -       -       -
    uncurved        / \     / \     / \
               -----/   \-/   \-/   \-----

                    bend   bend
                bend           bend
                    /-------\
                   /         \
    curved        |bend       |bend
                  |           |
                  |           |
An x-ray crystal structure analysis cannot show curvature, but can and often does show local bending. On the other hand, gel electrophoresis and cyclization kinetics can detect macroscopic curvature, but not bending. A complete knowledge of local bending would permit the precise calculation of curvature, but a knowledge of macroscopic curvature alone does not allow one to specify precisely the local bending elements that produce it. This is one of the scale paradoxes that have plagued the DNA conformation field for a decade or more. There is more than a passing resemblance to a familiar problem of classical statistical mechanics: A complete knowledge of instantaneous positions and velocities of all molecules of a gas allows one to calculate bulk properties such as temperature, pressure and volume. But the most detailed knowledge of bulk properties cannot lead one to precise molecular positions. Many molecular arrangements can produce identical bulk properties, and in the present case, many bending combinations can produce identical macroscopic curvature.
The program begins by applying the indicated twist, roll and tilt at each step along the sequence, and calculating the resulting base pair normal vector. The first base pair is aligned normal to the z axis, with a twist value of 0.0 degrees. The specified twist is applied to the second base pair, and roll and tilt values are used to calculate its normal vector relative to the first. If either roll or tilt is non-zero, the new normal vector will be angled away from the z axis, producing the first 'bend'. The process is continued along the sequence, applying the appropriate twist, roll and tilt to each new base pair relative to its predecessor. The result is a list of normal vectors for all base pairs in the sequence.
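The accumulation of normal vectors described above can be sketched in a few lines of Python. This is an illustration only, not the program's source: the function names are hypothetical, and the rotation convention (twist about the local z axis, roll about the local y axis, tilt about the local x axis, composed in that order) is an assumption, since the text does not specify one.

```python
import math

def rotation(axis, degrees):
    """Return a 3x3 rotation matrix (nested lists) about the x, y or z axis."""
    c = math.cos(math.radians(degrees))
    s = math.sin(math.radians(degrees))
    if axis == "x":
        return [[1, 0, 0], [0, c, -s], [0, s, c]]
    if axis == "y":
        return [[c, 0, s], [0, 1, 0], [-s, 0, c]]
    return [[c, -s, 0], [s, c, 0], [0, 0, 1]]

def matmul(a, b):
    """Multiply two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def base_pair_normals(steps):
    """steps: (twist, roll, tilt) in degrees for each base-pair step.

    Returns one normal vector per base pair (len(steps) + 1 in total).
    The first base pair has its normal along z.
    """
    frame = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]   # local axes as columns
    normals = [(0.0, 0.0, 1.0)]
    for twist, roll, tilt in steps:
        # Assumed convention: twist (z), then roll (y), then tilt (x),
        # each expressed in the frame of the preceding base pair.
        step = matmul(rotation("z", twist),
                      matmul(rotation("y", roll), rotation("x", tilt)))
        frame = matmul(frame, step)
        normals.append(tuple(frame[i][2] for i in range(3)))  # local z axis
    return normals
```

With zero roll and tilt every normal stays on the z axis whatever the twist; a single step with a roll of 10 degrees tips the second normal 10 degrees away from the first.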
Local bends are then calculated from the normal vectors. The bend for base N is calculated across a window from N-1 to N+1.
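As a minimal sketch of this step, the bend at base N can be taken as the angle between the normals at N-1 and N+1. The function names are hypothetical, and leaving the two end positions at 0.0 (where no full window exists) is an assumption.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    # Clamp to [-1, 1] to guard against floating-point overshoot.
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def local_bends(normals):
    """Bend at base N = angle between the normals at N-1 and N+1."""
    bends = [0.0] * len(normals)   # ends stay 0.0 (assumption)
    for n in range(1, len(normals) - 1):
        bends[n] = angle_between(normals[n - 1], normals[n + 1])
    return bends
```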
Curvature is calculated in two steps. Base pair normals are first averaged over a 10-base-pair window to filter out the local writhing of the helix. The normals of the nine base pairs from N-4 to N+4, and the two base pairs N-5 and N+5 at half weight, are averaged and assigned to base pair N. Curvature then is calculated from these averaged normal vector values, using a bracket value, nc, with a value of 15. That is, the curvature at base pair N is the angle between averaged normal vectors at base pairs N-nc and N+nc.
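The two-step curvature calculation might look like the following sketch. The weights and the bracket value nc = 15 come from the description above; the function names, and the choice to leave positions too close to either end at 0.0, are assumptions made for illustration.

```python
import math

# Weights for base pairs N-5 .. N+5: half weight at both ends, full weight between.
WEIGHTS = [0.5] + [1.0] * 9 + [0.5]

def averaged_normals(normals):
    """Weighted average of the normals over N-5 .. N+5, assigned to N."""
    avg = [None] * len(normals)
    total = sum(WEIGHTS)
    for n in range(5, len(normals) - 5):
        window = normals[n - 5:n + 6]
        avg[n] = tuple(
            sum(w * v[k] for w, v in zip(WEIGHTS, window)) / total
            for k in range(3)
        )
    return avg

def angle_between(u, v):
    """Angle in degrees between two 3-vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(max(-1.0, min(1.0, dot / norm))))

def curvature(normals, nc=15):
    """Curvature at N = angle between averaged normals at N-nc and N+nc."""
    avg = averaged_normals(normals)
    curve = [0.0] * len(normals)   # positions without a full bracket stay 0.0
    for n in range(nc + 5, len(normals) - nc - 5):
        curve[n] = angle_between(avg[n - nc], avg[n + nc])
    return curve
```

For normals that rotate steadily by 1 degree per base pair within a plane, the symmetric averaging leaves each direction unchanged, so the sketch reports a curvature of 30 degrees (2 x nc) at interior positions.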
    % banana -graph data
    Bending and curvature plot in B-DNA
    Input sequence: tembl:rnu68037
    Created banana.dat
       Mandatory qualifiers (* if not always prompted):
      [-sequence]          sequence   Sequence USA
    *  -graph              graph      Graph type

       Optional qualifiers:
       -anglesfile         datafile   angles file
       -residuesperline    integer    Number of residues to be displayed on each line
       -outfile            outfile    Output file name

       Advanced qualifiers:
       -data               boolean    Output as data

       General qualifiers:
       -help               boolean    Report command line options. More information on associated and general qualifiers can be found with -help -verbose
Qualifier | Description | Allowed values | Default |
---|---|---|---|
Mandatory qualifiers | | | |
[-sequence] (Parameter 1) | Sequence USA | Readable sequence | Required |
-graph | Graph type | EMBOSS has a list of known devices, including postscript, ps, hpgl, hp7470, hp7580, meta, colourps, cps, xwindows, x11, tektronics, tekt, tek4107t, tek, none, null, text, data, xterm, png | EMBOSS_GRAPHICS value, or x11 |
Optional qualifiers | | | |
-anglesfile | angles file | Data file | Eangles_tri.dat |
-residuesperline | Number of residues to be displayed on each line | Any integer value | 50 |
-outfile | Output file name | Output file | banana.profile |
Advanced qualifiers | | | |
-data | Output as data | Boolean value Yes/No | No |
    ID   RNU68037   standard; RNA; ROD; 1218 BP.
    XX
    AC   U68037;
    XX
    SV   U68037.1
    XX
    DT   23-SEP-1996 (Rel. 49, Created)
    DT   04-MAR-2000 (Rel. 63, Last updated, Version 2)
    XX
    DE   Rattus norvegicus EP1 prostanoid receptor mRNA, complete cds.
    XX
    KW   .
    XX
    OS   Rattus norvegicus (Norway rat)
    OC   Eukaryota; Metazoa; Chordata; Craniata; Vertebrata; Euteleostomi; Mammalia;
    OC   Eutheria; Rodentia; Sciurognathi; Muridae; Murinae; Rattus.
    XX
    RN   [1]
    RP   1-1218
    RA   Abramovitz M., Boie Y.;
    RT   "Cloning of the rat EP1 prostanoid receptor";
    RL   Unpublished.
    XX
    RN   [2]
    RP   1-1218
    RA   Abramovitz M., Boie Y.;
    RT   ;
    RL   Submitted (26-AUG-1996) to the EMBL/GenBank/DDBJ databases.
    RL   Biochemistry & Molecular Biology, Merck Frosst Center for Therapeutic
    RL   Research, P. O. Box 1005, Pointe Claire - Dorval, Quebec H9R 4P8, Canada
    XX
    DR   SWISS-PROT; P70597; PE21_RAT.
    XX
    FH   Key             Location/Qualifiers
    FH
    FT   source          1..1218
    FT                   /db_xref="taxon:10116"
    FT                   /organism="Rattus norvegicus"
    FT                   /strain="Sprague-Dawley"
    FT   CDS             1..1218
    FT                   /codon_start=1
    FT                   /db_xref="SWISS-PROT:P70597"
    FT                   /note="family 1 G-protein coupled receptor"
    FT                   /product="EP1 prostanoid receptor"
    FT                   /protein_id="AAB07735.1"
    FT                   /translation="MSPYGLNLSLVDEATTCVTPRVPNTSVVLPTGGNGTSPALPIFSM
    FT                   TLGAVSNVLALALLAQVAGRLRRRRSTATFLLFVASLLAIDLAGHVIPGALVLRLYTAG
    FT                   RAPAGGACHFLGGCMVFFGLCPLLLGCGMAVERCVGVTQPLIHAARVSVARARLALALL
    FT                   AAMALAVALLPLVHVGHYELQYPGTWCFISLGPPGGWRQALLAGLFAGLGLAALLAALV
    FT                   CNTLSGLALLRARWRRRRSRRFRENAGPDDRRRWGSRGLRLASASSASSITSTTAALRS
    FT                   SRGGGSARRVHAHDVEMVGQLVGIMVVSCICWSPLLVLVVLAIGGWNSNSLQRPLFLAV
    FT                   RLASWNQILDPWVYILLRQAMLRQLLRLLPLRVSAKGGPTELSLTKSAWEASSLRSSRH
    FT                   SGFSHL"
    XX
    SQ   Sequence 1218 BP; 162 A; 397 C; 387 G; 272 T; 0 other;
         atgagcccct acgggcttaa cctgagccta gtggatgagg caacaacgtg tgtaacaccc        60
         agggtcccca atacatctgt ggtgctgcca acaggcggta acggcacatc accagcgctg       120
         cctatcttct ccatgacgct gggtgctgtg tccaacgtgc tggcgctggc gctgctggcc       180
         caggttgcag gcagactgcg gcgccgccgc tcgactgcca ccttcctgtt gttcgtcgcc       240
         agcctgcttg ccatcgacct agcaggccat gtgatcccgg gcgccttggt gcttcgcctg       300
         tatactgcag gacgtgcgcc cgctggcggg gcctgtcatt tcctgggcgg ctgtatggtc       360
         ttctttggcc tgtgcccact tttgcttggc tgtggcatgg ccgtggagcg ctgcgtgggt       420
         gtcacgcagc cgctgatcca cgcggcgcgc gtgtccgtag cccgcgcacg cctggcacta       480
         gccctgctgg ccgccatggc tttggcagtg gcgctgctgc cactagtgca cgtgggtcac       540
         tacgagctac agtaccctgg cacttggtgt ttcattagcc ttgggcctcc tggaggttgg       600
         cgccaggcgt tgcttgcggg cctcttcgcc ggccttggcc tggctgcgct ccttgccgca       660
         ctagtgtgta atacgctcag cggcctggcg ctccttcgtg cccgctggag gcggcgtcgc       720
         tctcgacgtt tccgagagaa cgcaggtccc gatgatcgcc ggcgctgggg gtcccgtgga       780
         ctccgcttgg cctccgcctc gtctgcgtca tccatcactt caaccacagc tgccctccgc       840
         agctctcggg gaggcggctc cgcgcgcagg gttcacgcac acgacgtgga aatggtgggc       900
         cagctcgtgg gcatcatggt ggtgtcgtgc atctgctgga gccccctgct ggtattggtg       960
         gtgttggcca tcgggggctg gaactctaac tccctgcagc ggccgctctt tctggctgta      1020
         cgcctcgcgt cgtggaacca gatcctggac ccatgggtgt acatcctgct gcgccaggct      1080
         atgctgcgcc aacttcttcg cctcctaccc ctgagggtta gtgccaaggg tggtccaacg      1140
         gagctgagcc taaccaagag tgcctgggag gccagttcac tgcgtagctc ccggcacagt      1200
         ggcttcagcc acttgtga                                                    1218
    //
The graphical display shows the sequence together with the local bending (solid line) and macroscopic curvature (dotted line).
    Base   Bend   Curve
    a      0.0    0.0
    t      19.7   0.0
    g      17.7   0.0
    a      21.1   0.0
    g      28.5   0.0
    c      26.2   0.0
    c      19.7   0.0
    c      18.7   0.0
    c      12.5   0.0
    t      9.7    0.0
    a      14.9   0.0
    c      16.5   0.0
    g      17.5   0.0
    g      26.2   0.0
    g      28.5   0.0
    c      20.7   0.0
    t      11.7   0.0
    t      6.4    0.0
    a      9.3    0.0
    a      14.9   0.0
    c      17.7   20.0
    c      15.7   19.2
    t      15.7   18.5
    g      17.7   17.9
    a      21.1   17.1
    g      28.5   15.9
    c      25.2   14.6
    c      12.5   13.3
    t      7.2    11.9
    a      13.2   10.8
    g      20.1   10.1
    t      19.5   9.6
    g      15.1   9.2
    g      14.9   9.1
    a      19.5   9.5
    t      19.7   10.2
    g      17.7   10.8
    a      17.7   11.0
    g      25.2   11.2
    g      26.2   11.3
    c      15.3   11.5
    a      11.4   11.7
    a      14.5   12.0
    c      13.9   12.2
    a      11.4   12.3
    a      14.9   12.5
    c      17.7   12.8
    g      19.5   13.3
    t      19.1   13.5

    [Part of this file has been deleted for brevity]

    g      15.1   15.2
    a      17.7   15.5
    g      25.2   15.8
    g      32.5   16.0
    c      25.2   15.8
    c      15.7   15.0
    a      16.3   14.2
    g      15.5   13.5
    t      10.8   12.8
    t      13.7   12.3
    c      19.5   12.1
    a      20.1   12.1
    c      16.3   12.1
    t      16.7   11.9
    g      22.1   11.4
    c      21.1   11.1
    g      14.9   10.7
    t      9.7    10.3
    a      16.1   9.8
    g      24.5   9.4
    c      21.1   8.9
    t      15.1   8.4
    c      16.1   7.7
    c      17.5   7.3
    c      15.3   6.9
    g      24.0   6.4
    g      26.2   5.8
    c      20.5   5.4
    a      19.1   5.1
    c      15.3   26.0
    a      16.3   0.0
    g      20.1   0.0
    t      19.5   0.0
    g      25.2   0.0
    g      28.5   0.0
    c      20.7   0.0
    t      13.3   0.0
    t      13.7   0.0
    c      15.7   0.0
    a      19.1   0.0
    g      28.5   0.0
    c      25.2   0.0
    c      19.5   0.0
    a      20.1   0.0
    c      17.9   0.0
    t      13.9   0.0
    t      13.9   0.0
    g      19.1   0.0
    t      19.5   0.0
    g      0.0    0.0
    a      0.0    0.0
The data file consists of three columns separated by blanks or tab characters.
The first column is the sequence.
The second column is the local bending.
The third is the curvature.
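A three-column file of this kind is straightforward to read back for further analysis. The sketch below is one possible reader, not part of EMBOSS: the function name is hypothetical, and it assumes that any line without exactly three whitespace-separated fields, or whose second and third fields are not numbers (such as a header line), can be skipped.

```python
def parse_banana_profile(text):
    """Parse banana data output into (base, bend, curve) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()          # splits on blanks or tabs
        if len(fields) != 3:
            continue                   # blank lines, notes, etc.
        base, bend, curve = fields
        try:
            rows.append((base, float(bend), float(curve)))
        except ValueError:
            continue                   # e.g. the "Base Bend Curve" header
    return rows

sample = "Base\tBend\tCurve\na\t0.0\t0.0\nt\t19.7\t0.0\nc\t17.7\t20.0\n"
rows = parse_banana_profile(sample)
peak = max(rows, key=lambda row: row[1])   # row with the largest local bend
```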
The description of this bending model is as follows:
The roll-tilt-twist parameters of this model are derived purely from experimental observations of sequence location preferences of base trimers in small circles of DNA, without reference to solution techniques that measure curvature per se. For this reason, they may be the most objective and unbiased parameters of all. Satchwell, Drew and Travers studied the positioning of DNA sequences wrapped around nucleosome cores, and in closed circles of double-helical DNA of comparable size. From the sequence data they calculated a fractional preference of each base pair triplet for a position 'facing out', or with the major groove on the concave side of the curved helix. The sequence GGC, for example, has a 45% preference for locations on a bent double helix in which its major groove faces inward and is compressed by the curvature (tending towards positive roll), whereas sequence AAA has a 36% preference for the opposite orientation, with major groove facing outward and with minor groove facing inward and compressed (tending toward negative roll). These fractional variances have been converted into roll angles in the following manner: Because x-ray crystal structure analysis uniformly indicates that AA steps are unbent, a zero roll is assigned to the AAA triplet; an arbitrary maximum roll of 10 degrees is assigned to GGC, and all other triplets are scaled in a linear manner. Where % is the percent-out figure, then:
Roll = 10 degrees * (% + 36)/(45 + 36)
Changing the maximum roll value will scale the entire profile up or down proportionately, but will not change the shape of the profile. Peaks will remain peaks, and valleys, valleys. The absolute magnitude of all the roll values is less important than their relative magnitude, or the order of roll preference. Twist angles were set to zero. Because these values correspond to base trimers, the values of roll, tilt and twist were applied to the first two bases for the calculation.
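The linear scaling can be written out as a small function. This is a sketch with hypothetical names. It assumes the percent-out figure is signed, so that AAA's 36% preference for the opposite orientation enters as -36, which is what the formula implies by assigning AAA a roll of zero.

```python
def roll_from_percent_out(percent, max_roll=10.0, low=-36.0, high=45.0):
    """Linear map from a signed percent-out figure to a roll angle in degrees.

    low  : percent-out of AAA (assigned zero roll)
    high : percent-out of GGC (assigned max_roll)
    """
    return max_roll * (percent - low) / (high - low)

aaa_roll = roll_from_percent_out(-36.0)   # 0.0 degrees
ggc_roll = roll_from_percent_out(45.0)    # 10.0 degrees
```

Doubling `max_roll` doubles every roll value, which is why the shape of the profile is unchanged by the choice of maximum.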
Program name | Description |
---|---|
btwisted | Calculates the twisting in a B-DNA sequence |
chaos | Create a chaos game representation plot for a sequence |
compseq | Counts the composition of dimer/trimer/etc words in a sequence |
dan | Calculates DNA RNA/DNA melting temperature |
freak | Residue/base frequency table or plot |
isochore | Plots isochores in large DNA sequences |
sirna | Finds siRNA duplexes in mRNA |
wordcount | Counts words of a specified size in a DNA sequence |
The original program ('BEND') is described in the Goodsell & Dickerson paper. Created 1999/06/09. Last Updated 1999/06/14.