GRIB2 Migration

January 31st, 2008 by mike

A few days ago I started having a problem with the previous GFS forecast data source I was using for Seasonality forecasts. Just as a bit of background, Seasonality will download forecasts for U.S. locations from the NWS’s NDFD service. For locations outside the U.S., I have to get the data from somewhere else, and the best data source I’ve been able to find is 0.5 degree GFS model output, also available from the NWS. Because the GFS model data is in GRIB format and very large, I first download it to a “forecast” server that I host, and throw it all into a database. Then Seasonality will contact my forecast server to get the forecast for its configured locations.

With the old GFS data source offline, I had to find a new one. I came across a production server at NCEP that hosted 0.5 degree GFS model output, but the data was in GRIB2 format, which presented a problem because up until this point, all of my forecast data was in GRIB (or GRIB1) format.

Fortunately, there were very few modifications because of the format change, but there are some adjustments that needed to be made. I am using the wgrib and wgrib2 tools to convert the GRIB data to ASCII format first, so it’s easier to parse when importing it into the database. With wgrib, I would use a command like this to convert the data to ASCII:

wgrib -s <grib_file> | egrep \":<variable>:<level>\" | wgrib -i -grib <grib_file> -text -o <txt_file>

The authors of wgrib2 changed things a bit, and actually made the command a little bit easier:

wgrib2 -s <grib_file> | egrep \":<variable>:<level>\" | wgrib2 -i <grib_file> -text <txt_file>

The first two commands in the string don’t change at all, but the third one does. The -grib and -o flags are no longer needed. wgrib2 assumes that the input file is in GRIB2 format, and if you specify -text, the next argument is assumed to be the output file. The problem was that my output data wasn’t right. For some reason, my forecasts were just all screwed up, even though the ASCII files themselves looked okay. I ended up finding some documentation on the -text flag, and it seems that the default output order has been changed from the original North->South to South->North, so all of my data was being inverted. This was somewhat problematic, since I don’t really want to return a forecast for the southern hemisphere when someone asks for a location in the northern hemisphere. The fix is fairly simple, just pass -order raw to the wgrib2 command…

wgrib2 -s <grib_file> | egrep \":<variable>:<level>\" | wgrib2 -i <grib_file> -order raw -text <txt_file>

In general, it looks like most of the NWS datasets will only be available in GRIB2 in the near future, so I hope this information saves other wgrib users some time when doing their own GRIB -> GRIB2 migrations.

Populated Places

November 2nd, 2007 by mike

When developing weather software, one of the most important pieces of data you will work with is a location list. A weather application is useless if the user can’t find a city and download weather data for it. There are several different methods users might want to be able to search for a weather location. The methods Seasonality supports include zip codes (U.S. only, for now), city, state, country, and ICAO. Actually, ICAO was just a convenience method I added, so when a user double-clicks on an ICAO weather station in the satellite imagery, the add location dialog box will pop up with a listing of locations that use that ICAO.

So where do you find a good location listing online? The best resource I’ve found is the GeoNames database. They have slightly over 80,000 cities, each having a population over 1,000 people in a simple CSV file you can download from their website. Each city has a state (if applicable), a country, latitude/longitude, elevation, population, time zone, and some other fields as well. Our basic needs are only the city name and it’s latitude/longitude, but some of these other fields come in handy when trying to calculate the local time, deciding which cities to mark on the map (higher population cities should take precedence), etc.

It’s not much use to have this location data unless you can correlate it with your weather data sources. You’ll definitely need a relational database to keep track of everything. I prefer PostgreSQL for my master location database. There are several aspects of data correlation that you’ll need to think about here. Everything from matching every location with the closest weather station ICAO, to getting zip code listings and matching each one with an appropriate city. It’s a lot of work (you’ll learn a lot of SQL along the way) but eventually you’ll have a giant web of location data that you can pull from.

Now that you have this master location database, you’ll want to trim it down to something you can actually ship with your application. For example, Seasonality’s master location database is over 4.5GB in size, which would obviously be too large to include in an application archive. A small download size is desirable, so only include the data you absolutely need. Also, PostgreSQL is a bit of overkill to be running in the background for a database this size, so convert your dataset to something different, like SQLite. Selecting a subset of data and converting it from PostgreSQL to SQLite all takes place in a custom script I wrote, so each time I update a location (and trust me, with so many locations there will be errors and corrections) all I need to do is re-run the export script and I have a fresh location database to include in the next version of Seasonality.

Precipitation Intensity

August 31st, 2007 by mike

Today I revisited the code on the Seasonality forecast server that decides on precipitation intensity. I.E., how much rain does it take to have a drizzle, light rain, moderate rain, or heavy rain? This is forecasted in 12 hour blocks, and I take the sum of precipitation over that time frame to get a total precipitation for that block of time.

Previously the intensity chart looked something like this:

  • 0 - 3mm: Drizzle
  • 4 - 7mm: Light Rain
  • 8 - 25mm: Moderate Rain
  • > 25mm: Heavy Rain

What I found was the GFS model output would often place a millimeter or two of precipitation at random locations and this would trigger a “drizzle” forecast far too often. I decided that I needed to revisit the precipitation intensity chart.

I ended up finding this page at the MetOffice (United Kingdom). They have about twice as many intensity categories, from Very slight to Downpour, so I tried to adapt them to my intensity names (which match those of the NDFD forecasts put out by the U.S. National Weather Service). I ended up using these values:

  • 3 - 6mm: Drizzle
  • 7 - 12mm: Light Rain
  • 13 - 25mm: Moderate Rain
  • > 25mm: Heavy Rain

We’ll try these out for awhile to see how accurate they are. If you’re a Seasonality user, these changes were made all on the forecast server, so no software update is necessary.

90 Meter Elevation Data

August 17th, 2007 by mike

To generate an international forecast for Seasonality, I take in a GeoPoint and find the 4 closest surrounding data points from the GFS Forecast model. The GFS data is 0.5 degree resolution, so if I had a location at point 50.2° latitude, 30.8° longitude, I would use the following data points when generating the forecast: (50°, 30.5°), (50°, 31°), (50.5°, 30.5°), and (50.5°, 31°). The problem is that these surrounding points could be fairly distant from the weather location being queried, so depending on the terrain a forecast can vary widely. However, the forecast could be made more accurate if the elevations of each GeoPoint is taken into account.

So where does the 90 meter data come from? Well, the SRTM mission a few years back captured this level of detail for the area from -60° to 60° latitude, which includes almost all the land masses. Outside that range, a lower resolution dataset is used to fill in the gaps. A data point every 90 meters doesn’t sound like much; afterall that’s elevation points about a football field away from each other. I imagine in some terain where you have a lot of quick elevation changes, such as canyons or cliffs, this wouldn’t be enough. However, do not underestimate the amount of data here. The compressed download is around 1.2 gigabytes, and it expands to a ~7 gigabyte data file. The entire dataset is 86400×43200 data points, or 3,732,480,000 GeoPoints. Multiply that by 2 bytes per data point and you have a 7 gigabyte file.

Fortunately, this file is easy to parse. There is no header, just the raw data. The map projection is a simple equirectangular projection, giving equal distance across latitude and longitude. The location starts at (90°, -180°) and continues across an entire 86400 point row before moving South to the next row. The final GeoPoint is (-90°, 180°). Each data point is a signed “short” integer (2 bytes long). Perl’s unpack function works wonders here to get a short value from the binary data.This data is going to be perfect for taking elevation into account while forecasting. It’s resolution is over 14000x (!) more detailed than my 0.5 degree GFS dataset, giving 120 GeoPoints between each GFS GeoPoint in each dimension. This should be plenty of data to work with. Maybe I’ll talk about how I use elevation to make more accurate forecasts in a future article.

Welcome!

August 16th, 2007 by mike

Welcome to the weather developer blog.  I wanted to start up this site both as a resource for developers working on weather software, and as a site for weather geeks who wish to know more about what goes onbehind the scenes.

Getting the ball rolling on Seasonality 2.0 development, in the coming months I will have several topics to post about.  The first includes some details on determining elevation for any location on the globe using SRTM data.  I’ll also be talking about the process of making more accurate forecasts from grid data in the near future.

This site is meant to be a collection of information.  Readers are encouraged to submit their own articles to share with other readers.  To submit an article, email “submissions” at this domain.  Frequent authors will be considered for login access, so the posting of more frequent articles can be made easier.

Again, welcome to the site.  I hope this is as entertaining and educational for you to read as it is enjoyable for me to write.