Thursday, November 10, 2011

running WRF with NARR--an overview


updated slides from our WRF club/WRF party this week are attached as a movie (!!)
Comment here with questions or additional info you find...
because ain't no party like a W R F party 'cause a W R F party don't stop...

(don't want to squint-read or read enlarged blurry text? want to be able to click on the links on the slides? a pdf of slides is also on nitrate: /archive/shared/WRF_club.pdf )

Friday, October 28, 2011

How to compile CMAQ in parallel

Phil and I recently put together a parallel compilation of CMAQ on nitrate, our new Linux box. Direct comparison with a serial compilation on the same machine, with the same input data, shows that CMAQ scales reasonably well on a small number of cores: we observed a 7x speedup running on all 8 cores compared to a single core. However, some key modifications need to be made to the installation scripts for a successful compilation. These are listed, to the best of my knowledge, below. Please note that this is NOT meant to be a comprehensive guide on how to install CMAQ, nor can I guarantee that I've caught all the necessary changes, since I'm writing this up about a week after the actual install. That said, it may serve as a useful resource for someone who already knows how to compile CMAQ in serial and wants to get it running successfully in parallel.

1. Before installing
Some libraries that work well for serial compilations don't play nicely with the parallel compilation. To save yourself grief later, link these libraries into the CMAQ libraries location (a rough sketch of this step appears after the list):

libnetcdf.a - Make sure this is compiled without DAP support (DAP lets netCDF read data from remote OPeNDAP servers; we don't use it, and it breaks parallel compilations). There is a flag that can be passed to the configure script when installing netcdf that turns it off.

libioapi.a - Surprisingly, a standard version of IOAPI will work just fine with a parallel compilation. IOAPI has a bunch of parallel IO options that you can set when compiling, but CMAQ doesn't use them. CMAQ (at least 4.7.0) is only parallelized for processing, not for file IO, so just use whatever library you used for the serial compilation. Of course, make sure you've properly included the fixed_src folder as you'll need the contents throughout.

libmpich.a - This doesn't have an explicit folder the way IOAPI and netCDF do in the CMAQ installation, but you'll need it for a parallel installation. If it isn't on your system, download and install it.
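
As a rough illustration of this whole step, the commands end up looking something like the following. All of the paths, the --disable-dap flag spelling, and the library subdirectory names are assumptions based on typical installs, so check ./configure --help and your own CMAQ library layout before copying any of it.

# build netCDF without DAP support (prefix and flag name may differ by version)
./configure --prefix=/usr/local/netcdf --disable-dap
make
make install

# link the three libraries into the CMAQ library area (directory names illustrative)
cd $M3LIB
ln -s /usr/local/netcdf/lib/libnetcdf.a netCDF/libnetcdf.a
ln -s /usr/local/ioapi/lib/libioapi.a   ioapi_3/libioapi.a
ln -s /usr/local/mpich/lib/libmpich.a   mpich/libmpich.a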

2. pario
Installing the parallel IO library (pario) is not necessary for a serial installation, but it is necessary to build CMAQ in parallel. Install it as you would any other component (bcon, icon, etc.) by modifying the library paths and the compiler path/flags.

3. stenex
The stencil exchange library has both parallel and serial components. You can get away with just installing sef90_noop for a serial build (built from bldit.se_noop), but for parallel you'll also need to run bldit.se to generate se_snl. It may be possible to skip the installation of sef90_noop if you want to run strictly in parallel, but I haven't tried. In any case, the only difference between the two builds is that se_snl needs the mpich header file location.

4. Other components
To the best of my knowledge, the installation for m3bld, jproc, icon, and bcon is all unchanged from serial installation. Build these as you normally would.

5. cctm
This is probably where the largest number of changes needs to occur. Let's break it down into two categories: building and running.

Building cctm:

Make the following changes to the bldit.cctm script (a sketch of the edited lines follows the list):
  • Uncomment the line reading "set ParOpt".
  • Set the location of MPICH in the MPICH variable. Note that this is the top-level directory, and should have include, bin, and lib directories underneath it.
  • Change FC from whatever compiler you were using before to mpif90 (provided it is installed on your system). mpif90 is a wrapper compiler that adds in extra flags as needed for compiling parallel programs. Note that it may not be available for MPI implementations other than MPICH.
  • Add a flag to F_FLAGS reading -f90=oldCompiler, where "oldCompiler" is the compiler you were using before. This makes sure mpif90 wraps the correct compiler.
  • Find the line where the script sets the COMP variable. Comment it out and replace it with
    set COMP = "intel"
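
For reference, the edited bldit.cctm lines end up looking something like this (the compiler name, flags, and MPICH path are illustrative, not necessarily what's in our script):

set ParOpt                                 # uncommented to request a parallel build
set MPICH   = /usr/local/mpich             # top-level dir containing include, lib, bin
set FC      = mpif90                       # MPICH's wrapper compiler
set F_FLAGS = "-fixed -132 -O3 -f90=ifort" # your existing flags plus -f90=<old compiler>
#set COMP = ...                            # original COMP line commented out
set COMP    = "intel"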

Running cctm:

Make the following changes to the run.cctm script (a sketch of the edited lines follows the list):
  • Change the variables NPCOL_NPROW and NPROCS to reflect the number of processors you would like to use and their organization. There should be an example commented out in the file already. Note that the two values for NPCOL_NPROW should multiply to give NPROCS.
  • At the very bottom of the file, comment out the line "time $BASE/$EXEC".
  • Uncomment the four lines beginning "set MPIRUN", "set TASKMAP", "cat $TASKMAP", "time $MPIRUN".
  • Change the location of MPIRUN to reflect the actual path to the executable on your system (at the command line, run "which mpirun" to find the executable if you don't know where it is).
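
Put together, the edited portion of run.cctm looks roughly like this (processor layout, paths, and the exact mpirun flags are examples only):

setenv NPCOL_NPROW "4 2"                   # 4 columns x 2 rows of subdomains
set NPROCS  = 8                            # must equal NPCOL x NPROW

#time $BASE/$EXEC                          # serial launch, now commented out
set MPIRUN  = /usr/local/mpich/bin/mpirun  # use `which mpirun` to find yours
set TASKMAP = $BASE/machines8
cat $TASKMAP
time $MPIRUN -machinefile $TASKMAP -np $NPROCS $BASE/$EXEC
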
Make the following change BEFORE RUNNING:

  • There should be a file in the cctm directory labeled "machines8". Open up this file, erase the contents (they are meaningless), and enter "sysname:num" on each line, where sysname is the name of the system you're working on and num is a number that starts at 1 and increases by 1 on each line. Keep adding lines until you've reached the maximum number of processors. E.g., for nitrate, the machines8 file looks something like
    nitrate:1
    nitrate:2
    nitrate:3
    nitrate:4
    nitrate:5
    nitrate:6
    nitrate:7
    nitrate:8
You should now be ready to run CMAQ in parallel!


Thursday, September 8, 2011

How to find my stuff. Chapter 1.

Please let me know if there are any permissions issues or if this info seems incomplete. I tried to make everything public but may have missed things.

The output from SMOKE (the emissions input to CMAQ) is all on mercury in:
/Volumes/archive/luedke/data/emis/
There is a separate subfolder for each scenario I did (baseline, no HDDV, no onroad, no PTIPM)

The output from CMAQ is on mercury in:
/Volumes/archive/luedke/data/cctm/
Again there's a subfolder for each scenario.

On SOX I have all the data and scripts I used to make plots, and the plots themselves. They are in folders generally pertaining to the type of plot:

/Users/luedke/8hr_max : getting the average monthly 8hr max ozone

/Users/luedke/extreme_plotting : plotting extreme events. some scripts are to plot the frequency and location of these events, and others are for percentage contribution only during those events. what constitutes an "extreme event" can be defined by you.

/Users/luedke/making_averages : here is where I made averages of emissions from the SMOKE output to compare with NEI attributes listed on the EPA website

/Users/luedke/nitrate_plotting : nitrate pm2.5

/Users/luedke/no2_plotting : NO2 plots across the whole CONUS

/Users/luedke/nox_plotting_in_the_east : NO and NO2 across just east US

/Users/luedke/ozone_plotting : monthly average ozone. not as useful as 8hr max so this didn't make it to my thesis.

/Users/luedke/pm25_plotting : total PM2.5, calculated by just adding all the PM2.5 species together (see previous blog post about getting PM2.5 from CMAQ. this is the result of the simplest method from the CMAS presentation in 2010)

/Users/luedke/sulfate_plotting : sulfate pm2.5

On NOX I have all the SMOKE goodies. In the future our group will hopefully be using an updated version of SMOKE and this may not be useful (I used 2.4), but here it is:

/home/luedke/assigns : where my assigns files are. some sectors needed different scripts which was dumb but that's why there are some extras

/home/luedke/go : where my runscripts are for each sector and for merging them

/home/luedke/intermed/2002ac : intermediate data from each sector script. the results of merging in the right combination were moved to mercury.

thank you and good day.

Wednesday, August 31, 2011

Notes on data, scripts, and externals

I am attaching several Word documents that will give you all a little more information on my scripts, data, and externals.

In general, all my externals can be found in room 287 on one of the bookshelves. The first document, linked below, describes what is on each external. You might have to do a little "digging" to find exactly what you want, but most of it is explained in the documentation:

http://www.scribd.com/doc/63671067/Externals?secret_password=ttur6t1wu39esfd00at

The next two documents are not so detailed. I made a quick "fact" sheet - called DATA - of where important data can be found; this will help instead of "digging" around the externals sheet. Mostly it will help for Caitlin's old runs (SMOKE, WRF, CMAQ) and my new CMAQ runs:

http://www.scribd.com/doc/63671344/Data?secret_password=2b3o23k974ub33mvi4xb

The last just states where you can find my early NCL scripts, my newer ones for my MS thesis, ioapi scripts, and where I ran CMAQ and CHEMMECH.

http://www.scribd.com/doc/63671304/Script-Notes?secret_password=2e1lrlfyupye3dm1t2xk


If anyone needs to contact me with questions, you can reach me at jamorton74@gmail.com
Thanks everyone and it has been my pleasure to work with you all!

J

Tuesday, August 9, 2011

Getting PM2.5 from CMAQ

PM2.5 is a big deal. But it is quite a beast to model, and CMAQ does so in a speciated way. This means that CMAQ keeps track of all the different types of PM (from nitrate, sulfate, whatever) and keeps them separate.

That sounds cool, because it would let you look at a specific species if, for example, all you actually care about is sulfate. Makes sense.

But! It is way less cool for you if you really only care about PM2.5 as a total, or if you want to use the total as a frame of reference for a specific kind. The EPA NAAQS regulate PM2.5 as a total, so you really should care about PM2.5 as a whole too.

The reason it is less cool for you is because there is no "total PM2.5" output from CMAQ at all. You have to make it yourself. This sounds easy but can be VERY misleading, so I want to try and help you do this the right way. I did it the wrong way and it caused me more work so let's avoid that. If this confuses you even more I apologize...

There are conflicting options for calculating PM2.5. The way I did it was according to a presentation given by CMAS in 2010. That powerpoint is here: http://arset.gsfc.nasa.gov/umbc/files/Session1Day3/CMAQ-Introduction-for-ARSET.ppt

On slide 10, it says that PM2.5 is equal to the sum of a bunch of different species. So that's what I did. Is that right?

I once thought it was. Then things changed.

Instead, you can follow the approach in this document here:
www.epa.gov/CAIR/pdfs/CMAQ_Evaluation.pdf

This says on page 3 that you did the right thing when you picked your species to count in the mix, but you really should be scaling two of them by 1.167. Is that right?

Possibly? Maybe?

Next we check "Evaluation of the community multiscale air quality (CMAQ) model version 4.5: Sensitivities impacting model performance; Part II—particulate matter" by K. Wyat Appel. Note that this is for version 4.5 and you shouldn't be using that version, but in any case it says that not only should you be doing that scaling bit, but you should also be looking at sodium and chlorine species.

This is also in "A multi-pollutant, risk-based approach to air quality management: Case study for Detroit" by Karen Wesson. Is that right?

Could be?

Sometimes in life and in modeling, you have to make your own answers.

Monday, August 8, 2011

Mercury Paper Figure Documentation

I am attaching my Word document describing the figures in the mercury paper that Tracey and Caitlin wrote. Each page describes a figure and what data was used to create it. Most of the figures were made using Excel, but some were created with NCL.

http://www.scribd.com/full/61868877?access_key=key-1zae00432g0xpzpcp93a

If figures were made in Excel, the pathway is bolded in purple.
If figures were made with NCL, the pathway to the script is bolded in black. The data for the NCL plots are bolded in blue.

I will be posting more documentation on other figures I have made for my thesis work.

Jami

Thursday, July 28, 2011

CMAQ Output: ACONC vs. CONC

At last week's group meeting, I discussed the difference between 2 of CMAQ's output files: ACONC and CONC.

CONC = instantaneous pollutant concentration at each output timestep (each hour)

ACONC = average pollutant concentration for each model hour (average throughout hour)

The CONC file automatically includes all pollutants at all vertical model layers, whereas the pollutants and vertical layers in the ACONC are set by the user in the CCTM run script. In general, use ACONC for comparison with ground-based measurements and CONC for comparison with satellite data.
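
For example, in a 4.7-era CCTM run script the ACONC contents are controlled by a couple of lines like the ones below (the species list here is just a placeholder; use whatever you actually need):

setenv AVG_CONC_SPCS "O3 NO NO2 CO SO2 ASO4J ASO4I"   # species written to ACONC
setenv ACONC_BLEV_ELEV " 1 1"                          # bottom and top layers to average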

Overall, I found that ACONC and CONC concentrations compare very well over both the CONUS and GL domains for secondary (formed in atmosphere) pollutants. A much bigger difference is seen between ACONC and CONC for primary (directly emitted) pollutants. This difference is expected - subhourly variations in wind speed and other meteorological factors will affect the local concentrations of primary pollutants emitted from point sources (e.g. SO2, CO) much more than the concentrations of the more well-mixed secondary pollutants (e.g. O3, ASO4).

Nothing earth-shattering, but might be helpful to someone running CMAQ for the first time.

Steve

Tuesday, May 24, 2011

Requested Projections for Satellite Regridding

So, as most of you know, I'm currently working on some scripts that will regrid and reproject satellite data into a usable form. I'm doing this by writing a python framework, and I'm building it with the intention that it should be easy enough to add functionality for a given projection/regridding algorithm once I'm done.

That said, if down the line someone needs a projection I didn't initially program in, they'll have to either A) wait for me to code it in for them or B) figure out enough of how the program works to code in a projection themselves. Neither of those options would take too terribly long, but I figured I'd put as many useful projections in the program as I can now, so that later no one has to wait at all.

The only problem is, I don't really know what projections people want! For now, all I've programmed is the Lambert conformal conic, so any and all suggestions are welcome! They don't take long to program, and it's actually kind of fun, so don't be shy. Just drop me a comment below with any projection you'd like to see and I'll put it in there for you.

Saturday, May 21, 2011

Scripts and File Description document

I have finally documented all the NCL scripts and data files I have created/compiled over the last few years. The Word document is currently stored on 'sox' at /Users/plachinski/Data_Management_SDP.docx and will probably be updated a few more times. All of the NCL scripts described in this document are in my personal directory on 'rainforest', and all of my externals are also currently hooked up to 'rainforest'.

Feel free to utilize anything I have created. Hopefully this will save everyone some time with data analysis.

Steve

Tuesday, April 19, 2011

Nested Runs in CMAQ

So the default CMAQ CCTM scripts have a section that says:

#> remove existing output files?
set DISP = delete
#set DISP = update
# set DISP = keep

If you choose "delete" there, it clears out your $OUTDIR before running. This can be cool if you are fixing something that you did wrong the first time (which applies to 99% of the time for me), but can cause problems if you want to do a run that is more than 1 day (which should be 100% of the time). What will happen if you try to use the output from your last run as an initial condition is that the output will be deleted before it's accessed as an initial condition. This is really annoying and embarrassing. It also means that...

you pretty much don't want to use the default CMAQ CCTM scripts basically ever.

Instead you want to use a custom-made one that has some if-statements and for-loops. This allows you to have a spin-up period, redirect the output from the day before as the initial condition for the current day, and so on.
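
Just to show the shape of the thing, a stripped-down loop might look like the sketch below. The variable names (INIT_GASC_1, the CGRID file name, $ICpath, $OUTDIR) are placeholders in the style of the 4.7-era scripts; the real runscript mentioned below also sets the aerosol/non-reactive/tracer IC variables and handles dates properly.

#!/bin/csh -f
set JDATE = 2002182             # first day to run (YYYYDDD)
set NDAYS = 5

set N = 1
while ($N <= $NDAYS)
   if ($N == 1) then
      # spin-up / first day: initial conditions come straight from ICON output
      setenv INIT_GASC_1 $ICpath/ICON_d01_2002182
   else
      # every later day: yesterday's CGRID output is today's initial condition
      setenv INIT_GASC_1 $OUTDIR/CCTM_CGRID_$YESTERDAY
   endif

   # ... set STDATE/STTIME and launch the CCTM executable here ...

   set YESTERDAY = $JDATE
   @ JDATE = $JDATE + 1         # naive increment; breaks at year boundaries
   @ N = $N + 1
end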

If you're like me, you're like "that sounds easy enough but really tedious." In which case your first plan of attack is to:

don't waste your time and just copy off someone else's

I put a copy on mercury in /Users/Luedke/cmaq/cctm/copy.this.cctm.runscript
I did my best to comment in what you need to change and stuff so it finally makes sense.
Have fun.

Monday, April 11, 2011

Remote disk image installation

With most of the software we work with, the only way to install it is by building it from source code. As everyone has at some point experienced, building from source can be difficult and frustrating.

Fortunately, some software comes as "Disk images" (you'll know because the files end in .dmg). These "disk images" are the classic, point and click kind of installers that do all the work for you. The only problem is it takes a few obscure commands to get them to work over command line, and it's not always possible/convenient to go sit down at the machine itself and install it from the GUI. Here's a step-by-step on how to install disk images over the command line.

NOTE: curly braces denote that you should substitute whatever is appropriate for you; the braces themselves should not actually be typed. E.g., if I read
ls {myHomeDirectory}
I would actually type
ls /Users/oberman

1) Download the file and put it in your home directory (don't worry, we won't be installing it here.)

2) Mount the disk image. The command for this is:
hdiutil attach {filename.dmg}

3) cd to the /Volumes directory. You should see your disk image as one of the volumes (don't worry if it doesn't have the exact same name as the filename)

4) cd into your disk image's volume. Note that if the name contains spaces, using tab completion avoids having to monkey around with escape sequences.

5) You should see at least one file that ends with .pkg, .mpkg, or some variant. This is the file we actually want to install. To do so, we use the following command:
sudo installer -verbose -pkg {packageFile.pkg} -target {/install/location/} >& {/log/filename}
If you're unsure what to put for /install/location, a good bet is almost always /usr/local/. Note that most programs build into subdirectories of the target directory (e.g. /target/location/bin, /target/location/lib, etc...).

6) Check the logfile (whatever you set /log/filename to be) for any error messages.

7) If you don't find any error messages, cd back to /Volumes

8) Dismount the installation volume. The command for this is:
hdiutil detach {/VolumeName}


And you've successfully installed your program! You can now dispose of the original file you downloaded if you wish, but I find I like having them filed away in case I need to reinstall. Hopefully this saves some searching next time you need to install software!

Compiling WRF for CMAQ

So, you want meteorology files for your CMAQ run, but for some reason, you need to recompile WRF. Wouldn't it be nice if the standard way of compiling WRF would just automatically create perfect meteorology files that could be plugged straight into MCIP? Well, it doesn't.

Here's what you are going to have to do to the WRF source code (before compiling!) in order to use WRF output as input to MCIP and, hence, CMAQ, SMOKE, or any other Models-3 product:

step 1)
Download the wrf tarball into a clean directory. If you don't have the source code, you can find it here: http://www.mmm.ucar.edu/wrf/users/download/get_source.html

step 2)
Extract the code from the tarball:
gunzip WRFV3.3.TAR.gz

tar -xf WRFV3.3.TAR

(note, assuming you will also need to pre-process your meteorological input files, you'll probably want to download the WPS code at the same time. WPS code is available at the same site as the WRF code - make sure to get the WPS version that matches the version of WRF you will be using.)

step 3)
The default WRF compilation does not write to output certain variables that are required by MCIP/CMAQ. You have to edit the registry file before you compile to get these variables written out.

cd WRFV3/Registry    # move to the "Registry" directory in the WRF file structure

vi Registry.EM       # open "Registry.EM" in your text editor of choice

Now find the "ZNT" variable entry, and add an "h" to the eighth column. This will tell WRF to write out the roughness length to your output files. When you are done, your ZNT registry line should look like this:
state real ZNT ij misc 1 - i3rh "ZNT" "TIME-VARYING ROUGHNESS LENGTH"
You should repeat this activity for the following variables (a quick grep to locate their entries is sketched after the list):
fractional land use (LANDUSEF)
aerodynamic resistance (RA)
stomatal resistance (RS)
vegetation fraction in the Pleim-Xiu LSM (VEGF_PX)
roughness length (ZNT)
and inverse Monin-Obukhov length (RMOL)
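
If you'd rather not scroll through the whole registry by hand, a grep along these lines will show the entry for each variable (this assumes each one's quoted output name appears on its registry line, as in the ZNT example above):

cd WRFV3/Registry
grep -nE '"(LANDUSEF|RA|RS|VEGF_PX|ZNT|RMOL)"' Registry.EM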

step 4)
Once you are done with the registry, it is time to compile. You should use wrf's configure script to automatically set your compile options and flags. When doing so, be sure to choose the ifort and icc options, and, if you want to enable parallel processing, choose the "dm" (distributed memory) option. Note: on our linux systems, do NOT choose an "sm" (shared memory) option.

cd ../

./configure

>choose ifort/icc/dm run

>choose basic nesting

step 5) compile:

./compile em_real >& compile.log
That's it! You should now have a WRF compilation that will produce met files ready for use in MCIP.

Friday, April 8, 2011

Running CMAQ with mis-matched emissions & met years

It's possible (and common) to run CMAQ with mis-matched emissions and meteorology years (eg. because the NEI is only released every 3 years). If emissions are processed in SMOKE with the same meteorology to be used in CMAQ, then you're all set. However, if you're using emissions already processed in SMOKE with a different year (eg. Steve's 2003 CONUS files) than the meteorology year (eg. 2005) then you need to modify the emissions files to match the timesteps in the meteorology.

To do this, DO NOT use M3EDHDR - it only changes the SDATE in the file header, not the embedded timestep array, and CCTM will not be able to read the resulting files. Instead, use M3TSHIFT (also an m3tool included with IOAPI).
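
The invocation is along these lines (file names are made up, and the exact prompts vary a bit by IOAPI version; the sample script below shows the actual responses used here):

setenv INFILE  emis_2003conus_jul.ncf     # emissions file carrying 2003 dates
setenv OUTFILE emis_2005met_jul.ncf       # shifted copy to feed CCTM
m3tshift                                  # then answer the prompts: input file, the
                                          # date/time to shift from and to, time step,
                                          # duration, and output file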

A sample script for changing files for Dec/Jan and Jun/Jul, from 2003 to 2005 can be found here on SOx: /Users/bickford/data/conv_tools/m3tools_scripts/m3tshift_2003conus.csh



Saturday, April 2, 2011

"badly formed number" error in csh scripts on linux

I have several .csh scripts that scroll through months and days to process daily files (eg. for MCIP). When running one such script this week on nitrate, I ran into a "badly formed number" error on days 08 and 09 -- and only on those days. Turns out, this is why:

"If your script uses comparisons of numbers that begin with a 0, CSH will interpret it as an octal number. if the number contains 8 or 9, it will fail because 8 and 9 do not exist in octal. To get around this problem, you should switch to a different shell (like bash) or use a more robust scripting language, such as Perl."
from: http://www.purdue.edu/eas/info_tech/faq/faq_linux.php#csh


My in-the-moment solution was to hardcode separate scripts for those days, but there are several possible workarounds, including switching shells or scripting languages.
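
If you'd rather patch the csh script than switch shells, one workaround (a sketch, not what I actually did) is to strip the leading zero before the value ever hits a numeric comparison:

foreach day (01 02 03 04 05 06 07 08 09 10)
   # strip leading zeros so csh doesn't try to read 08/09 as octal
   set daynum = `echo $day | sed 's/^0*//'`
   if ($daynum > 7) echo "processing day $day"
end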

Wednesday, March 23, 2011

Editing SMOKE input data text files

Editing SMOKE input data text files can be a long process that sucks and is prone to mistakes. Or it can be a medium-length process that doesn't suck and is way less prone to mistakes.

Here's how to go with that second method:

1. sftp/scp a copy of the file you wanna edit to your desktop. (Ex: get ptinv_ptipm_cap2002v2_02apr2007_v4_orl.txt)

2. import that sucker into Excel:

2a. Say it's a text file and find it on your computer. Then say it's delimited, and start the import at line 16 or wherever the data actually starts after the header. There's a little preview window so you don't mess it up.

2b. Tell it that the delimiter is a comma, and that text has no identifier ({none} instead of "). Again, there's a preview window; make sure the columns are in the right places and there are still quotes around the text. Then just click Finish.

3. You can do all sorts of Excel stuff with the data then. For example I sorted by the longitude column and then deleted all the point sources east of SAGE (-89.415 deg long.)

4. Then go File>>Save As and choose .csv as your file format and save it someplace.

5. Then find that file on your computer and Open With TextEdit.

6. It'll open, but will have a million (actually 3) quotation marks everyplace there should be just one. So go Edit>>Find, and do a Find/Replace of 3 quotes (""") with 1 quote ("). It takes a second and then looks way better.
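
(If you'd rather skip the TextEdit step, a one-liner like this does the same triple-quote replacement from the command line; file names are just examples:)

sed 's/"""/"/g' myfile.csv > myfile_fixed.csv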

7. Go to the place on NOX or whatever server and create a new file by going vi "newfile.txt"

8. Press i to go into insert mode. Then do a Select All and Copy on the window showing the .csv file on your computer, and Paste into your vi window. It takes a little while but eventually it all gets there.

9. Then if you like you can copy the header from the original file at the top of this file too, so that it looks identical to your original (other than the data you changed).

So now you can do all sorts of stuff like zero-out certain pollutants, or certain sources. Or multiply emissions by a certain factor. Whatever Excel will let you do, really.

Monday, March 7, 2011

Note about running WRF for CMAQ

Looks like for WRF meteorology grids to be compatible with emissions file grids in CMAQ, the WRF grid needs to have an odd number of grid cells in both the x and y directions, such that the projection centerpoint (eg. 40N 97W) is the middle of the center grid cell, not the vertex of four center cells.
Check the lat-lons (XLAT_M, XLONG_M) from the WRF-produced geo_em.d01.nc file against the projection centerpoint or known emissions-grid lat-lons to verify that the grids match.
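
A quick way to eyeball those values (assuming ncdump is on your path; in the geo_em files the variables are typically named XLAT_M and XLONG_M):

ncdump -v XLAT_M,XLONG_M geo_em.d01.nc | less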

Tuesday, February 22, 2011

Make SSH/scp simpler

Want to be able to login remotely by simply typing username@computername, or scp without the whole server extension (sage.wisc.edu)?

It's pretty simple to do. On a Mac, go to Preferences>Network Preferences>Advanced>DNS tab

Under search domains, put "sage.wisc.edu". Now whenever you ssh, scp, etc., the terminal will automatically look for the machine on the sage.wisc.edu domain so you don't have to type it in. If the computer is on another domain (eg. aos.wisc.edu), you'll still have to type in the full name.

Monday, February 21, 2011

Geospatial Methods website

Hi everyone,

I mentioned this a couple of weeks ago at the group meeting and figured I would share it on here. It's a useful website that contains some code/libraries for geospatial methods. It's largely focused on satellite data, but there are a lot of useful tidbits here that cut across everyone's work.

http://geospatialmethods.org/

Monday, January 24, 2011

A Blog New World

Do you ever think, "Oh, I wish I remembered what Tracey or Keith or Erica said the other day in group meeting"?  I have a number of times.  At Claus's suggestion, the Holloway Research Group at Sage now has a blog.  The idea is, if we share useful ideas and/or questions here, they will be accessible to everyone in the group.  The hope is that eventually we'll have a nice searchable tips-and-tricks style repository of Holloway group knowledge.  If we all jump on board, I think it could be immensely helpful for everyone.

May it be a space where information and advice freely flows from our minds to our monitors.  

A couple of administrative notes:
  • We (the Holloway group) should all have the capability to author posts; let me know if you didn't receive this email
  • The blog is open as of right now, but that can change if so desired
  • I don't think I can make anyone else an admin, but if there's something you'd like changed, I'm happy to accommodate