PRISM (Phase Refinement through Iterative Skeletonization of Maps)

At some point you expressed interest in our PRISM density skeletonization
software. We now have a new version of the code available for UNIX
computers. This code is a complete package for solvent flattening,
skeletonization, and NCS averaging.

PRISM: Last Update November 11 1997   v 2.18a
*******************************************************************************
2.18a  fixed small bug in mtz write header
2.18   fudged to use new mtz format
2.17   now only needs the crystallographic ASU of structure factor data;
       this fixes several significant problems with the freeR set
2.16a  another mtz bug fixed
2.16   several bug fixes
2.15   several serious bugs have been fixed (including MTZ bug)
       now supports CCP4 MTZ, XPLOR and ASCII structure factor files
       also, more info is included in the README file on running prism
       for different applications. This file is appended here.

Refinement of NCS operators is coming
*******************************************************************************

The code is available through our anonymous ftp server:
util.ucsf.edu (128.218.69.126).
    userid    anonymous
    password  your address
then
    cd /pub
the file prism.tar or prism.tar.Z (compressed using the Unix utility compress)
contains source code and a makefile for SGI/UNIX
also pick up:
    prism.asc   a sample of test data
    prism.com   will run the test
    prism.log   a copy of our output
    README      this file

Alternatively, the skeletonization code is also available as a stand-alone
program that has been made compatible with the current release of the CCP4
crystallographic package by Kim Hendrick at the MRC.
That code is available from him using anonymous ftp:
    connect to al.mrc-lmb.cam.ac.uk
    userid    anonymous
    password  your address
    cd pub
the file is prism.tar.Z
in subdir source there is a file named compile that compiles the program
on unix/sgi

*************************README***********************************************
This file contains general and current status information for the SGI
implementation.

The goal of this program is to use various density modification schemes,
including solvent flattening, NCS averaging, and skeletonization, to improve
phases obtained either from molecular replacement or experimental sources.
In addition, the program is set up to handle a fixed partial model. This
model could be the known part from molecular replacement or a partially
complete model from experimental density fitting.

This program is also set up to be useful as a general testbed for exploring
new density modification schemes. More of this will be coming in future
releases.

***Formats supported***
STRUCTURE FACTORS:
The current version supports three different structure factor formats:
    MTZ     (current version of CCP4 binary structure factor format)
    XPLOR   ascii structure factor format
    ASCII   a simplified ASCII labeled-column format (LCF) that is
            particularly convenient for importing/exporting structure
            factors to other packages.
This ASCII-LCF format consists of 2 header lines:
    cell dimensions
    column names (do not include H K L in column names)
followed by the data in free format organized as
    IH IK IL data1 data2 data3....
Note: a complete set of P1 structure factors is required.
currently H+-, K+-, L+ sorted H slowest, K, then L are used

convertsf.f  will convert data from one format to another (XPLOR, LCF_ASCII)
sortlcf.f    will sort data appropriately, H slowest, K then L (XPLOR, LCF_ASCII)

MAPS:
Both CCP4 binary format and XPLOR ascii format maps are supported.
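
As an illustration of the ASCII-LCF layout described above, here is a
minimal Python sketch (not part of the PRISM distribution) that writes and
reads such a file. The function names write_lcf and read_lcf, the use of
single spaces as separators, and the six-value cell line are assumptions
made for the example; the README only states that there are two header
lines followed by free-format rows sorted with H slowest, then K, then L.

```python
from typing import List, Sequence, Tuple

# One reflection: integer indices H, K, L plus one value per named column.
Reflection = Tuple[int, int, int, List[float]]


def write_lcf(cell: Sequence[float],
              columns: Sequence[str],
              reflections: Sequence[Reflection]) -> str:
    """Return ASCII-LCF text: cell line, column-name line, sorted data rows.

    Hypothetical helper; the exact number formatting is an assumption
    ("%g" style, which keeps about 6 significant figures).
    """
    lines = [" ".join(f"{v:g}" for v in cell),
             " ".join(columns)]          # H K L are NOT listed as column names
    # Sort with H varying slowest, then K, then L fastest.
    for h, k, l, values in sorted(reflections, key=lambda r: (r[0], r[1], r[2])):
        if len(values) != len(columns):
            raise ValueError("each row needs one value per column name")
        fields = [str(h), str(k), str(l)] + [f"{v:g}" for v in values]
        lines.append(" ".join(fields))
    return "\n".join(lines) + "\n"


def read_lcf(text: str) -> Tuple[List[float], List[str], List[Reflection]]:
    """Parse ASCII-LCF text back into (cell, column names, reflections)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    cell = [float(v) for v in lines[0].split()]
    columns = lines[1].split()
    reflections: List[Reflection] = []
    for ln in lines[2:]:
        fields = ln.split()              # free format: any whitespace separates
        h, k, l = (int(x) for x in fields[:3])
        reflections.append((h, k, l, [float(x) for x in fields[3:]]))
    return cell, columns, reflections
```

For real data, convertsf.f and sortlcf.f remain the supported route; the
sketch is only meant to make the header and row layout concrete.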

***Dimensions*****************
The program dynamically allocates space, so no dimension changes should need
to be made. NOTE: all data are stored in core; no clever disk I/O buffering
is being done.
******************************************************************************
Eventually, there will be a manual, but in the interim please feel free to
email me at agard@msg.ucsf.edu

TO RUN***********************************************************************
prism input_sf_file output_sf_file [switches, continued over as many lines
                                    as needed; to continue, end each line
                                    with a "\"]

Typically, we direct the output into a log file. Since the log file is
updated throughout the refinement, it is useful to check it and see what is
going on. Key data can be obtained using:
    grep PRISM logfile.log
    grep largest logfile.log
the first shows the current cycle number, skeleton parameters, and 3
R-factors (crystallographic, ncs, free R)
the second shows the percent of nodes in the largest graph in the
skeletonization; this provides info on connectivity

SWITCHES*********************************************************************
Although there are many switches listed, only a subset are generally used
for any particular application. Of particular note is the ability to write
out diagnostic maps to monitor the progress, check on masks, etc. We have
found this feature particularly useful when setting up. Also, it is possible
to write out "smoothed" maps, using the same type of non-linear filtering
that is used for generating solvent masks. This is quite useful for
following the chain in low-resolution or noisy maps and for making masks for
NCS averaging.

NOTE on OUTPUT MAPS: the maps written out with -mapi are the starting maps
for each cycle and are complete. By contrast, if you are doing partial
structure refinement with a fixed "known" part, that segment will be blanked
out of the final output map (-mapo). However, the structure factors Fc, PHIC
correspond to the total structure factor.
So either use the structure factor output and go into your favorite FFT
program, do a round in prism without the -partial flag, or do a single round
and use the -mapi flag.

GENERAL
  -cell=v1-v6         cell parameters (required for XPLOR format data!!)
  -spg=n              space group # (not required with MTZ data)
  -nxyz=n1-3          sampling size
  -res=v1:v2:v3:v4    min,max resolution (v3,v4 are optional)
                      v3= max resolution for data storage (def= res_max)
                      v4= extend phasing to this resolution
  -nzones=n           # zones for scaling
  -ncycle=n           # of refinement cycles
  -data=  xplor       XPLOR structure factor format
          ascii       ascii-LCF (Labeled Column Format) (default)
          mtz         MTZ format from CCP4 system
  -labelin="string"   string to translate (map) struct fact column names
                      eg if file column names were FCALC PHASE then:
                      -labelin="FC=FCALC PHI=PHASE"
  -labelout="string"  string to translate (map) struct fact column names
                      eg if file column names were FCALC PHASE then:
                      -labelout="FC=FCALC PHI=PHASE"
                      by default labelout = labelin
  -title="string"     title for output data (used only for MTZ format)
FIRST CYCLE
  -start='FO/FC/FOFC' source of starting structure amplitudes
                      default = "FO"
  -noscale            do not scale Fc on first cycle
  -fom                weight data using figure of merit - look for FOM
KNOWN part
  -partial            look for partial structure factors FC2 PHIC2 on input
                      FC,PHIc should be total structure factor
  -partial=FC         use Fc,PHIc as start for partial
  -mask=coordname     calcs mask from atomic coordinates default= PDB
  -diamask            coords are in Diamond format
  -postmask           mask known region after skeleton (default=before)
  -maskrad=v1:v2      v1,v2 = inner,outer radii for masking
SOLVENT FLATTENING
  -solv=V1:V2:V3      for solvent flattening V1= %solvent
                      V2= calc new mask every V2 cycles,
                      V3= smoothing radius
                      default = [50:1:8]
  -nonneg             multiply neg density values inside mask by .1
SKELETONIZATION
  -skel               select skeletonization
  -alternate          alternate skeletonization with 2 cycles NCS-solv flat
  -minden=v:m         minimum density for skeletonization in sigma >mean
  -maxden=v:m         maximum density for skeletonization in sigma >mean
  -epcut=v:m
  -mingraph=n:m       delete graphs with < n nodes
  -temp=v:m           "B-factor" used to build density from skeleton
  -refine             refine skeletonization parameters
                      when turned on, does a simple grid search optimization
                      with nsteps per variable
                      select parameters to refine using "m=1"
                      eg. -minden=1.2:1
  -refine=n           specifies number of steps for each variable default=5
  -pdbout=fname       output final skel as PDB file
NCS AVERAGING
  -ncsdat=fname       file name for NCS info (format: line 1 # ncs operators
                      followed by sets of 3 lines with 4 values
                      a11,a12,a13,t1)
  -ncsmask=fname      filename for mask defining 1 molecule
  -ncs                average according to NCS operators, NCS mask
  -ncs=average        average according to NCS operators, NCS mask
  -ncs=monitor        monitor agreement without averaging
  -ncs=difference     monitor difference (no averaging)
  -ncsref             refine ncs parameters (n= 4,m=5)
  -ncsref=n:m         refine ncs parameters, n=# ncs refine cycles
NEW MAP GENERATION
  -sim                use sim weighting
  -2fofc              use 2Fo-Fc or 2wFo-Fc weighting
  -usefc              use Fc for unobserved reflections (else set=0)
  -philim=v1-v3       v1,v2=scale,Bfactor for phase update, v3=max dphi
                      (def=1,0.,180)
  -combine            merge phases with those on input (A,B,C,D)
  -combine=v1-v3      as -combine, with:
                      v1=wt for calc phase def=1.
                      v2=fom threshold to use observed phase def=1.
                      v3=multiplier for input A,B,C,D def =1.
                      will output fom as FOM
STATS
  -freer=v            % reflections to set aside for free R factor
  -phiref             calc phase error to known phase
DIAGNOSTIC OUTPUT MAPS
  written each cycle; all file names end in .map_N (N = cycle#)
  -map= CCP4/XPLOR    map format (CCP4 = default)
  -rootname=          root name for map filenames (default = prism)
  -mapi   "root"_in       starting map this cycle
  -mapf   "root"_flt      smoothed map for solvent mask
  -mape   "root"_env      envelope generated by -solv from smoothed map
  -maps   "root"_skel     skeleton
  -mapn   "root"_ncmask   ncs mask
  -mapk   "root"_kmask    known molecule mask (calc only once)
  -maph   "root"_hamask   known molecule mask for heavy atoms
  -mapd   "root"_hadif    heavy atom difference map
  -mapo   "root"_out      map from last cycle
MAP SMOOTHING
  -smooth             done at end on last map; this can be extremely useful
                      to help find NCS masks and for looking at overall
                      structure/map quality
                      smoothed map calc with radius=2.8
                      map is output as "root"_smo (see diagnostic maps)
  -smooth=v           calc smoothed map using radius=v (2.8A good value)

STRUCTURE FACTORS COLUMNS***************************************************
structure factor columns expected on input ([]=optional):
  FO FC PHI [FC2 PHIC2] [PHIREF] [FREE] [A B C D] [FOM]
where:
  FO          = observed data
  FC          = amplitude of total structure factor (see -partial=FC)
  PHI         = starting phase
  [FC2 PHIC2] = partial structure factors of known part (see -partial)
  [PHIREF]    = "true" phase for testing (see -phiref)
  [FREE]      = user selected reflections (data>0) for freeR calcs
  [A B C D]   = Hendrickson-Lattman phase probability coefficients used
                for phase combination
  [FOM]       = figure of merit

structure factor columns on output (as input unless noted):
  FO FC PHI FW [FC2 PHIC2] [PHIREF] [FPAR PPAR] [FOM] [A-D]
  FW          = final weighted total amplitude (see -2fofc, -sim)
  [FPAR PPAR] = partial structure factor of skeleton part (-partial)
  [FOM]       = Sim weight (see -sim) or FOM (see -combine)

NOTE: column names in the input file can be mapped to the required names at
runtime using the -labelin option (above); output column names will also be mapped
to preserve user names.

EXAMPLES***********************************************************************
sample command files for a number of standard applications
comments shown to the right are for example only; no comments should be in a
real com file. also, you can have many switches per line; order is arbitrary

It is only necessary to map labels if different names are used; for clarity,
unnecessary mappings are used in the following examples (eg. FO=FO)

Molecular Replacement or partial structure*********************************
at start, input requires a file with Fobs, Fcalc, Phicalc (Fc,Phic from model)
this does solvent flattening and skeletonization

(/whatever/prism/prism sf_in.asc sf_out.asc \
  -data=ascii -spg=1 \              data format,spg
  -nxyz=42:52:100 \                 map sampling intervals (4x resolution)
  -res=100:2.5 -nzones=15 \         res limits, zones for scaling
  -start=FOFC \                     use both Fo & Fc at start (w2Fo-Fc)
  -ncycle=5 \                       # refinement cycles
  -sim -2fofc \                     use sim weighting, 2fo-fc for new maps
  -usefc \                          use fcalc for missing reflections
  -phiref \                         compare phases to ref set (testing only)
  -skel \                           do skeletonization
  -minden=1.2 -maxden=2.6 -epcut=2 -mingraph=15 -temp=6 \     skel params
  -solv=50 \                        do solvent flattening with 50% solv
  -partial=FC \                     partial structure, at start assume
                                    partial data in FC,PHIC
  -mask=prism.pdb \                 coords for making partial struct mask
  -map=CCP4 -rootname=prism -mapo \ output map format,filename and out last
  -labelin="FO=FO FC=F2 PHI=PHIC2 PHIREF=PHNAT " \            label mapping
) > prism.log                       log file

NOTE: for subsequent cycles, change -partial=FC to -partial

Skeleton Parameter refinement*********************************
a simple grid search has been set up to make parameter refinement more
convenient.
To use it, the free R-factor switch must be on:
    -free=5     sets aside 5% of reflections for free R
                (listed as -freer=v under STATS)
add:
    -refine or -refine=n    (n=#steps per parameter, default = 5)
and select which parameters to refine with a :1
    -minden=1.2:1 -maxden=2.6:1 -epcut=2 -mingraph=15 -temp=6:1 \
will optimize minden, maxden and temp, the most important

NonCrystallographic Symmetry*********************************
NCS can be easily incorporated either with or without skeletonization.
Proper NCS averaging requires that a user-defined mask be used to specify
the "standard" molecule (the NCS operators then specify the other copies).
At the start, it is sometimes useful to simply average the entire asu;
while this will produce junk in many places (normally removed by the mask),
there should be properly averaged density somewhere. This plus map
smoothing can help determine that first mask.

In addition to NCS averaging, some other options are provided that allow
just monitoring the NCS correlations without actually averaging, looking at
the difference between the average and the current copy (looking for NCS
operator errors) and, in the next release, refining the NCS operators

  -ncsdat=fname       file name for NCS info (format: line 1 # ncs operators
                      followed by sets of 3 lines with 4 values
                      a11,a12,a13,t1)
  -ncsmask=fname      filename for mask defining 1 molecule
  -ncs                average according to NCS operators, NCS mask
  -ncs=average        average according to NCS operators, NCS mask
  -ncs=monitor        monitor agreement without averaging
  -ncs=difference     monitor difference (no averaging)
  -ncs=refine         (next release)

SIR/MIR (Fom weighting)*********************************
at start, input requires a file with Fobs, PHI, FOM
(you can skip the fom, but this is not recommended)

(/whatever/prism/prism sf_in.asc sf_out.asc \
  -data=ascii -spg=1 \              data format,spg
  -nxyz=42:52:100 \                 map sampling intervals (4x resolution)
  -res=100:2.5 -nzones=15 \         res limits, zones for scaling
  -start=FO -fom \                  use only Fo, PHI at start, use fom wts
  -ncycle=5 \                       # refinement cycles
  -sim -2fofc \                     use sim weighting, 2fo-fc for new maps
  -usefc \                          use fcalc for missing reflections
  -skel \                           do skeletonization
  -minden=1.2 -maxden=2.6 -epcut=2 -mingraph=15 -temp=6 \     skel params
  -solv=50 \                        do solvent flattening with 50% solv
  -map=CCP4 -rootname=prism -mapo \ output map format,filename and out last
  -labelin="FO=FO PHI=PHIB FOM=FOM" \                         label mapping
) > prism.log                       log file

SIR/MIR (Phase combination)*********************************
at start, input requires a file with Fobs, PHI, A B C D [FOM optional]
will do phase combination with experimental data at each cycle
as above with addition of -combine or -combine=
  -combine            merge phases with those on input (A,B,C,D)
  -combine=v1-v3      as -combine, with:
                      v1=wt for calc phase def=1.
                      v2=fom threshold to use observed phase def=1.
                      v3=multiplier for input A,B,C,D def =1.
                      will output fom as FOM
  -labelin="FO=FO PHI=PHIB A= B= C= D= " \                    label mapping
******************************************************************************

Good luck!!!

Sincerely
David A. Agard