If you are using GSM for the first time, try running the gsm script as a test.

EXPERIMENT


INSTRUCTIONS


Sample output files

You will find these files among others in g_000 if the model runs successfully.

Use the space below to upload your files related to the questions on the discussion page

fcstout.ft00_acem

Discussions

khideki

sfc0

khideki 28 July 2006 01:09:41

A user emailed me today about a problem with sfc0.x.

---
+ rm -rf sfc0.x
+ ln -fs /usr/local/G-RSM/gsm/bin/sfc0.x sfc0.x
+ ./sfc0.x
PGFIO-F-209/unformatted read/unit=19/'OLD' specified for file which does
not exist.
In source file naopen.f, at line number 40
---

sfc0 reads many files from the library, all as unit 19. I suspect sfc0 can't find one of its input files. Please look at sfc0.out to see which file is causing the error. (I posted my sfc0.out on the Wiki for your reference.)
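A Fortran OPEN with STATUS='OLD' fails when the named file does not exist, and a symbolic link whose target is gone counts as non-existent. As a minimal sketch of how one might check a list of candidate input files before running sfc0.x, the helper below reports paths that are missing, dangling, or unreadable. The function name and the idea of passing the paths on the command line are assumptions for illustration; the actual list of files sfc0 opens has to be taken from the sfc0 script or from a working sfc0.out.

```python
import os
import sys


def find_unreadable_inputs(paths):
    """Return (path, reason) pairs for inputs that an OPEN(..., STATUS='OLD') could not read."""
    problems = []
    for p in paths:
        if os.path.islink(p) and not os.path.exists(p):
            # The link itself exists but its target does not.
            problems.append((p, "dangling symlink"))
        elif not os.path.exists(p):
            problems.append((p, "missing"))
        elif not os.access(p, os.R_OK):
            problems.append((p, "not readable"))
    return problems


if __name__ == "__main__":
    # Hypothetical usage: pass the candidate input files as arguments.
    for path, reason in find_unreadable_inputs(sys.argv[1:]):
        print(f"{path}: {reason}")
```

Run it from the run directory with the input file names as arguments; any line it prints is a candidate for the file that unit 19 fails to open.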

alessandrocem
alessandrocem 01 August 2006 14:56:39

Thank you for your suggestions, but I still can't manage to run the 'gsm' script. It always aborts when executing sfc0.x.
Unlike last time, I downloaded and compiled using the command 'inst gsm_latest lo', which I think works correctly. I have version 5.1 of the PG Fortran compiler.

All my sfc0.out file says is this:

PGFIO-F-209/unformatted read/unit=19/'OLD' specified for file which does not exist.
In source file naopen.f, at line number 40

It seems the program isn't able to open even the first file; otherwise I would have at least part of an sfc0.out similar to the one you posted.
I browsed through the source files (starting from sfc0.f), but I wasn't able to find which subroutine calls 'naopen.f' or which input file it's referring to.

Thank you in advance for other ideas or suggestions.
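One way to locate the caller is to search the source tree for CALL statements that name the routine. The sketch below does this case-insensitively and skips fixed-form comment lines. It assumes the routine defined in naopen.f is itself named naopen and is invoked with CALL; if it is a function or an ENTRY point, the pattern would need adjusting. The function name find_calls is hypothetical.

```python
import os
import re


def find_calls(root, routine):
    """Return (path, line_number, text) for each 'call <routine>' under root."""
    call_re = re.compile(r"\bcall\s+" + re.escape(routine) + r"\b", re.IGNORECASE)
    hits = []
    for dirpath, _dirs, files in os.walk(root):
        for fname in sorted(files):
            lower = fname.lower()
            if not lower.endswith((".f", ".f90")):
                continue
            fixed_form = lower.endswith(".f")
            path = os.path.join(dirpath, fname)
            with open(path, errors="replace") as fh:
                for lineno, line in enumerate(fh, start=1):
                    # Fixed form: c, C, * or ! in column 1 marks a comment.
                    if fixed_form and line[:1] in ("c", "C", "*", "!"):
                        continue
                    # Free form: a line whose first non-blank character is ! is a comment.
                    if not fixed_form and line.lstrip().startswith("!"):
                        continue
                    if call_re.search(line):
                        hits.append((path, lineno, line.strip()))
    return hits
```

Calling find_calls on the top of the source tree with "naopen" lists every call site; the arguments at each site should show which unit and file name are being passed.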

alessandrocem

fcst.x - 64 bit machines

alessandrocem 11 August 2006 12:37:38

I thought it was wiser to open a new discussion thread, especially because sfc0.x now works correctly.
I printed the 'status' array and, as you told me to do, changed in the routine nainit.f the status value from which isize is defined; the executable sfc.x now works and terminates correctly.

But now I have several problems with fcst.x.
First of all, on 64-bit machines the "gm" libraries are linked and required. They are NOT easy to obtain, and since we have Ethernet and not Myrinet, in our opinion they should be unnecessary.
Anyway, we did obtain and install these libraries, but the execution of fcst.x stalls almost immediately. I don't know whether these libraries have anything to do with it: I have tried both linking them and modifying the makefiles so that they are not linked, but the program stalls in exactly the same way.
Here is the file fcstout.ft00:


running /usr/local/G-RSM/runs/g_000/fcst.x on 8 LINUX ch_p4 processors
Created /usr/local/G-RSM/runs/g_000/PI2609
0getcon 62 28created april 92
0begin setsig - getting sigs from unit 11
1800. 0.92 0.80E+16 0.60E+16
reduce grid is on with 1 digit accuracy.
archv data from da,mo,yr= 0 3 66
...last date/time and current itim
0 1 1 90 8760.0 1
-------llyr,klowb = 6 4
co2 concentration is 3.4799999999999995E-004
rdsig lab 000b2 p sigma surface file n= 11
rdsig unit,fhour,idate= 11 0.0000000000000000 0
3 9 1990
number of tracers input = 1
number of cloud input = 0
rdsig gz z00= 329.2745436306341
rdsig q
rdsig te
rdsig di ze
rdsig rq
n1,itread,fhour after tread 11 0 0.0
input t=t0 full values
0div vort temp mixratio ln(ps) 0.85440144E-05 0.29136744E-04 0.25247229E+03 0.28508945E-02 0.45882381E+01
0.1238397114E-04 0.2204099817E-04 0.2888138897E+03 0.1161741630E-01
0.1215744913E-04 0.2349267774E-04 0.2880251284E+03 0.1121752612E-01
0.1162732756E-04 0.2504732324E-04 0.2869776967E+03 0.1074251095E-01
0.1057871503E-04 0.2595711210E-04 0.2856848077E+03 0.9859597806E-02
0.9075835834E-05 0.2584168528E-04 0.2841827167E+03 0.8569951030E-02
0.7587511711E-05 0.2476755789E-04 0.2824618701E+03 0.7492116218E-02
0.6626171057E-05 0.2371425427E-04 0.2806813982E+03 0.6400565791E-02
0.6512874369E-05 0.2335317677E-04 0.2784283916E+03 0.5462819446E-02
0.6794808790E-05 0.2363278408E-04 0.2756867530E+03 0.4596046369E-02
0.6892657240E-05 0.2406497907E-04 0.2724327121E+03 0.3725222207E-02
0.6853515540E-05 0.2515131087E-04 0.2683346805E+03 0.2877821560E-02
0.7187910596E-05 0.2779897989E-04 0.2635980815E+03 0.2234286054E-02
0.7409846957E-05 0.3168895088E-04 0.2577439552E+03 0.1609530220E-02
0.7991824166E-05 0.3664828580E-04 0.2510893702E+03 0.1090838298E-02
0.8888281333E-05 0.4163566053E-04 0.2437689619E+03 0.7402229396E-03
0.9607470158E-05 0.4404003882E-04 0.2357555520E+03 0.4783731284E-03
0.1015297215E-04 0.4241966109E-04 0.2284474071E+03 0.2531392236E-03
0.1089892464E-04 0.3724121506E-04 0.2220597300E+03 0.1047865826E-03
0.1067267332E-04 0.3102312609E-04 0.2162262618E+03 0.3456758205E-04
0.1079113490E-04 0.2533836083E-04 0.2114581829E+03 0.1170774166E-04
0.1000120317E-04 0.2114722876E-04 0.2071529513E+03 0.4986736203E-05
0.9230246229E-05 0.1705449550E-04 0.2068910773E+03 0.3137862543E-05
0.8838494866E-05 0.1482399686E-04 0.2099786732E+03 0.2386806010E-05
0.8880659046E-05 0.1520549559E-04 0.2141365663E+03 0.2457863029E-05
0.9136429626E-05 0.1767674144E-04 0.2183607314E+03 0.2530153449E-05
0.9169535142E-05 0.2180435179E-04 0.2236630677E+03 0.2478900717E-05
0.1008651591E-04 0.1991421991E-04 0.2324701195E+03 0.2214743552E-05
0.3835812659E-05 0.2009827500E-04 0.2539653894E+03 0.2019388695E-05
0div vort temp mixratio ln(ps) 0.85440144E-05 0.29136744E-04 0.25247229E+03 0.28508945E-02 0.45882381E+01
0.1238397114E-04 0.2204099817E-04 0.2888138897E+03 0.1161741630E-01
0.1215744913E-04 0.2349267774E-04 0.2880251284E+03 0.1121752612E-01
0.1162732756E-04 0.2504732324E-04 0.2869776967E+03 0.1074251095E-01
0.1057871503E-04 0.2595711210E-04 0.2856848077E+03 0.9859597806E-02
0.9075835834E-05 0.2584168528E-04 0.2841827167E+03 0.8569951030E-02
0.7587511711E-05 0.2476755789E-04 0.2824618701E+03 0.7492116218E-02
0.6626171057E-05 0.2371425427E-04 0.2806813982E+03 0.6400565791E-02
0.6512874369E-05 0.2335317677E-04 0.2784283916E+03 0.5462819446E-02
0.6794808790E-05 0.2363278408E-04 0.2756867530E+03 0.4596046369E-02
0.6892657240E-05 0.2406497907E-04 0.2724327121E+03 0.3725222207E-02
0.6853515540E-05 0.2515131087E-04 0.2683346805E+03 0.2877821560E-02
0.7187910596E-05 0.2779897989E-04 0.2635980815E+03 0.2234286054E-02
0.7409846957E-05 0.3168895088E-04 0.2577439552E+03 0.1609530220E-02
0.7991824166E-05 0.3664828580E-04 0.2510893702E+03 0.1090838298E-02
0.8888281333E-05 0.4163566053E-04 0.2437689619E+03 0.7402229396E-03
0.9607470158E-05 0.4404003882E-04 0.2357555520E+03 0.4783731284E-03
0.1015297215E-04 0.4241966109E-04 0.2284474071E+03 0.2531392236E-03
0.1089892464E-04 0.3724121506E-04 0.2220597300E+03 0.1047865826E-03
0.1067267332E-04 0.3102312609E-04 0.2162262618E+03 0.3456758205E-04
0.1079113490E-04 0.2533836083E-04 0.2114581829E+03 0.1170774166E-04
0.1000120317E-04 0.2114722876E-04 0.2071529513E+03 0.4986736203E-05
0.9230246229E-05 0.1705449550E-04 0.2068910773E+03 0.3137862543E-05
0.8838494866E-05 0.1482399686E-04 0.2099786732E+03 0.2386806010E-05
0.8880659046E-05 0.1520549559E-04 0.2141365663E+03 0.2457863029E-05
0.9136429626E-05 0.1767674144E-04 0.2183607314E+03 0.2530153449E-05
0.9169535142E-05 0.2180435179E-04 0.2236630677E+03 0.2478900717E-05
0.1008651591E-04 0.1991421991E-04 0.2324701195E+03 0.2214743552E-05
0.3835812659E-05 0.2009827500E-04 0.2539653894E+03 0.2019388695E-05
initial solhr = 0.0000000000000000
fixio field read in from unit= 11
fh, idate= 0.0 0 3 9 1990
fixio completed.
forward step: kdt in gsmstep= 1
* nnday of year = 68
0from heatl3 jdnmc etc 2447959 0.50 0.00 68.00 68.00
0 forecast date 9 mar. 1990 at 0 hrs 0.00 mins
julian day 2447959 plus 0.500000
radius vector 0.9928091
right ascension of sun 23.2742673 hrs, or 23 hrs 16 mins 27.4 secs
declination of the sun -4.6811514 degs, or -4 degs 40 mins 52.1 secs
equation of time -10.7192070 mins, or -643.15 secs, or-0.046899 radians
solar constant 2.0254750
1 ozoneKilled by signal 2.



I am running on 8 processors. On the frontend machine of the cluster the program doesn't abort but remains idle ("Killed by signal 2" comes from my killing the process manually), while it aborts on all the machines to which it is distributed.

As I said before, I do not know whether the problem comes from a mistake in distributing the calculations across the processors or, more simply, from a difficulty in reading the ozone or aerosol files.


Thank you again for your time and your prompt answers.

alessandrocem
alessandrocem 18 August 2006 14:01:18

First of all, thanks to all of you for your responses. Sadly, I am not working full time on this, so some days can pass without a reply from me.

Anyway, I tried each one of your suggestions but I am still stuck at the same point.

To answer Hideki: I tried running this experiment on only one processor, and the model runs correctly.

I tried deleting the "-lgm" option, but nothing changes whether or not the libraries are linked.
I also exported the variable P4_GLOBMEMSIZE on every machine of the cluster, but there was no improvement.
I also "unlimited the stacksize" on each machine of the cluster.

To answer the last suggestion: the ozone files are in a directory common to all the machines and are accessible from every machine of the cluster.

I copied the file fcstout.ft00 in the previous message.
On every machine other than the frontend of the cluster (that is, where the calculations are actually performed) I get a SEGMENTATION FAULT error.

I did have to change FCSTENV in the "fcst" runscript, adding the "-nolocal" flag to the "mpirun -np 8" command, which excludes the frontend from being used for calculations. I hope this isn't what causes the problem; it shouldn't be.
At this point I really have no clue what the problem could be.

Thank you again to all of you, also for any other suggestion you might have.

Alessandro
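The stack size and P4_GLOBMEMSIZE settings mentioned above may not be the ones actually seen by processes that mpirun starts on remote nodes, since those processes can inherit their limits and environment from a non-interactive remote shell rather than from the shell where the settings were made. A minimal sketch for checking this, assuming Python is available on the compute nodes, is a small script that prints what each process sees. The function name node_settings is hypothetical.

```python
import os
import resource
import socket


def node_settings(environ=None):
    """Report the stack limit and P4_GLOBMEMSIZE as seen by this process."""
    env = os.environ if environ is None else environ
    soft, _hard = resource.getrlimit(resource.RLIMIT_STACK)
    return {
        "host": socket.gethostname(),
        # RLIM_INFINITY is what "unlimited" looks like from inside a process.
        "stack_soft": "unlimited" if soft == resource.RLIM_INFINITY else soft,
        # None means the variable is not set in this process's environment.
        "P4_GLOBMEMSIZE": env.get("P4_GLOBMEMSIZE"),
    }


if __name__ == "__main__":
    print(node_settings())
```

Starting this script on each compute node in the same way fcst.x is started there shows whether the node reports an unlimited stack and the expected P4_GLOBMEMSIZE value. If any node reports otherwise, the settings are not reaching the forecast processes.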

johnlow

fcst.x segmentation fault

johnlow 12 April 2007 15:03:35

Hi,
I am trying the gsm script, but it failed at fcst.x. I am running on a 32-bit Intel dual-core machine with Fedora Core 6, and compiled with the Intel Fortran compiler and cc. The output from fcstout.ft00 is as follows. I would appreciate your help and advice. Regards.


0getcon 62 28created april 92
0begin setsig - getting sigs from unit 11
1800. 0.92 0.80E+16 0.60E+16
reduce grid is on with 1 digit accuracy.
archv data from da,mo,yr= 0 3 66
...last date/time and current itim
0 1 1 90 8760.0
1
-------llyr,klowb = 6 4
co2 concentration is 3.480000000000000E-004
rdsig lab 000b2 p sigma surface file n= 11
rdsig unit,fhour,idate= 11 0.000000000000000E+000 0
3 9 1990
number of tracers input = 1
number of cloud input = 0
n1,itread,fhour after tread 11 0 0.0
input t=t0 full values
0div vort temp mixratio ln(ps) 0.85433449E-05 0.29136147E-04 0.25247260E+03
0.28511797E-02 0.45882404E+01
0.1238524904E-04 0.2204041752E-04 0.2888142375E+03 0.1161842702E-01
0.1216014694E-04 0.2349131558E-04 0.2880252874E+03 0.1121855204E-01
0.1162882514E-04 0.2504642516E-04 0.2869777775E+03 0.1074386814E-01
0.1057842618E-04 0.2595736506E-04 0.2856850120E+03 0.9860991165E-02
0.9071456234E-05 0.2584183053E-04 0.2841829998E+03 0.8571089111E-02
0.7584678192E-05 0.2476777288E-04 0.2824623884E+03 0.7493099371E-02
0.6624593239E-05 0.2371392185E-04 0.2806819558E+03 0.6401504353E-02
0.6511718762E-05 0.2335245139E-04 0.2784286568E+03 0.5463595603E-02
0.6795465802E-05 0.2363319585E-04 0.2756866631E+03 0.4596512529E-02
0.6891973529E-05 0.2406624433E-04 0.2724325588E+03 0.3725400049E-02
0.6849890626E-05 0.2515147890E-04 0.2683344775E+03 0.2877888144E-02
0.7185201987E-05 0.2779832201E-04 0.2635982520E+03 0.2234355825E-02
0.7407612453E-05 0.3168850001E-04 0.2577445061E+03 0.1609612271E-02
0.7990313274E-05 0.3664754626E-04 0.2510897977E+03 0.1090859768E-02
0.8888666820E-05 0.4163310143E-04 0.2437696524E+03 0.7402548507E-03
0.9609657588E-05 0.4403760768E-04 0.2357566434E+03 0.4783807291E-03
0.1015248058E-04 0.4241946845E-04 0.2284484912E+03 0.2531504865E-03
0.1089895001E-04 0.3724219046E-04 0.2220602996E+03 0.1047980466E-03
0.1067449852E-04 0.3102092170E-04 0.2162265650E+03 0.3457000963E-04
0.1079128936E-04 0.2533640529E-04 0.2114585878E+03 0.1170811216E-04
0.1000359232E-04 0.2114556943E-04 0.2071529704E+03 0.4986693503E-05
0.9232133900E-05 0.1705398366E-04 0.2068907417E+03 0.3137979874E-05
0.8836808377E-05 0.1482297606E-04 0.2099783791E+03 0.2386893199E-05
0.8879986184E-05 0.1520431595E-04 0.2141362776E+03 0.2457842365E-05
0.9136027261E-05 0.1767598861E-04 0.2183605743E+03 0.2530151862E-05
0.9168999859E-05 0.2180582122E-04 0.2236629969E+03 0.2478906186E-05
0.1008650672E-04 0.1991282916E-04 0.2324700107E+03 0.2214746368E-05
0.3835984676E-05 0.2009913486E-04 0.2539651569E+03 0.2019391175E-05
0div vort temp mixratio ln(ps) 0.85433449E-05 0.29136147E-04 0.25247260E+03
0.28511797E-02 0.45882404E+01
0.1238524904E-04 0.2204041752E-04 0.2888142375E+03 0.1161842702E-01
0.1216014694E-04 0.2349131558E-04 0.2880252874E+03 0.1121855204E-01
0.1162882514E-04 0.2504642516E-04 0.2869777775E+03 0.1074386814E-01
0.1057842618E-04 0.2595736506E-04 0.2856850120E+03 0.9860991165E-02
0.9071456234E-05 0.2584183053E-04 0.2841829998E+03 0.8571089111E-02
0.7584678192E-05 0.2476777288E-04 0.2824623884E+03 0.7493099371E-02
0.6624593239E-05 0.2371392185E-04 0.2806819558E+03 0.6401504353E-02
0.6511718762E-05 0.2335245139E-04 0.2784286568E+03 0.5463595603E-02
0.6795465802E-05 0.2363319585E-04 0.2756866631E+03 0.4596512529E-02
0.6891973529E-05 0.2406624433E-04 0.2724325588E+03 0.3725400049E-02
0.6849890626E-05 0.2515147890E-04 0.2683344775E+03 0.2877888144E-02
0.7185201987E-05 0.2779832201E-04 0.2635982520E+03 0.2234355825E-02
0.7407612453E-05 0.3168850001E-04 0.2577445061E+03 0.1609612271E-02
0.7990313274E-05 0.3664754626E-04 0.2510897977E+03 0.1090859768E-02
0.8888666820E-05 0.4163310143E-04 0.2437696524E+03 0.7402548507E-03
0.9609657588E-05 0.4403760768E-04 0.2357566434E+03 0.4783807291E-03
0.1015248058E-04 0.4241946845E-04 0.2284484912E+03 0.2531504865E-03
0.1089895001E-04 0.3724219046E-04 0.2220602996E+03 0.1047980466E-03
0.1067449852E-04 0.3102092170E-04 0.2162265650E+03 0.3457000963E-04
0.1079128936E-04 0.2533640529E-04 0.2114585878E+03 0.1170811216E-04
0.1000359232E-04 0.2114556943E-04 0.2071529704E+03 0.4986693503E-05
0.9232133900E-05 0.1705398366E-04 0.2068907417E+03 0.3137979874E-05
0.8836808377E-05 0.1482297606E-04 0.2099783791E+03 0.2386893199E-05
0.8879986184E-05 0.1520431595E-04 0.2141362776E+03 0.2457842365E-05
0.9136027261E-05 0.1767598861E-04 0.2183605743E+03 0.2530151862E-05
0.9168999859E-05 0.2180582122E-04 0.2236629969E+03 0.2478906186E-05
0.1008650672E-04 0.1991282916E-04 0.2324700107E+03 0.2214746368E-05
0.3835984676E-05 0.2009913486E-04 0.2539651569E+03 0.2019391175E-05
initial solhr = 0.000000000000000E+000
fixio field read in from unit= 11
fh, idate= 0.0 0 3 9 1990
fixrdrec completed for ts
fixrdrec completed for smc
fixrdrec completed for sno
fixrdrec completed for stc
fixrdrec completed for tg3
fixrdrec completed for z0
fixrdrec completed for cv
fixrdrec completed for cvb
fixrdrec completed for cvt
fixrdrec completed for alb
fixrdrec completed for sli
fixrdrec completed for vegcov
fixrdrec completed for canop
fixrdrec completed for f10m
fixrdrec completed for vegtyp
fixrdrec completed for soiltyp
fixrdrec completed for albf
fixrdrec completed for ustar
fixrdrec completed for fm
fixrdrec completed for fh
fixrdrec completed for prcp
fixrdrec completed for srflag
fixrdrec completed for snodph
fixrdrec completed for slc
fixrdrec completed for shdmin
fixrdrec completed for shdmax
fixrdrec completed for slope
fixrdrec completed for snoalb
fixio completed.
forward step: kdt in gsmstep= 1
* nnday of year = 68
0from heatl3 jdnmc etc 2447959 0.50 0.00 68.00 68.00
0 forecast date 9 mar. 1990 at 0 hrs 0.00 mins
julian day 2447959 plus 0.500000
radius vector 0.9928091
right ascension of sun 23.2742673 hrs, or 23 hrs 16 mins 27.4 secs
declination of the sun -4.6811514 degs, or -4 degs 40 mins 52.1 secs
equation of time -10.7192070 mins, or -643.15 secs, or-0.046899 radian
s
solar constant 2.0254750
1 ozone climatology for month,day= 3 9
aerosol, o2, co2, h2o, o3 = 1 0 0 1 1
global checked speed maxima for all layers
spdmx(01:10)= 32. 32. 36. 39. 40. 41. 39. 36. 35. 38.
spdmx(11:20)= 45. 52. 56. 63. 73. 86. 94. 97. 83. 66.
spdmx(21:30)= 54. 55. 55. 62. 72. 72. 69. 57.
horizontal diffusion parameters
effective 76.452 microhertz at wavenumber 62
maximum wavenumber for zero diffusion 34
order of diffusion 2
forrtl: severe (174): SIGSEGV, segmentation fault occurred
Image PC Routine Line Source
fcst.x 080D6714 Unknown Unknown Unknown
fcst.x 0805A764 Unknown Unknown Unknown
fcst.x 0804A4CC Unknown Unknown Unknown
fcst.x 080499D3 Unknown Unknown Unknown
fcst.x 08049976 Unknown Unknown Unknown
libc.so.6 006CDF2C Unknown Unknown Unknown
fcst.x 080498B1 Unknown Unknown Unknown

khideki
khideki 12 April 2007 18:34:52

Hi - it is a bit difficult to figure out what is wrong from this limited information. I suspect a problem with rand() in setras.f. See the previous thread by alessandrocem and lonestar%20at%20TACC.html. If that is not causing the error, I suggest recompiling the code with the DBG option on. See Debug.html.
Please let me know when you find out more. Hideki