GSODR: Global Surface Summary of the Day (GSOD) Weather Data from R


Introduction to GSODR

The Global Surface Summary of the Day (GSOD) data (https://data.noaa.gov/dataset/dataset/global-surface-summary-of-the-day-gsod/) provided by the US National Centers for Environmental Information (NCEI) are a valuable source of weather data with global coverage. However, the data files are cumbersome and difficult to work with. GSODR aims to make it easy to find, transfer and format the data you need for use in analysis, and provides five main functions to facilitate this.

When data are reformatted with either get_GSOD() or reformat_GSOD(), all units are converted to the International System of Units (SI), e.g., inches to millimetres and Fahrenheit to Celsius. The output can be returned to the R session as a tibble, saved as a comma-separated values (CSV) file, or saved as a spatial GeoPackage (GPKG) file, a format supported by most major GIS software. The output summarises each year by station and also includes vapour pressure and relative humidity elements calculated from existing data in GSOD.
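The two conversions named above can be sketched in a few lines of base R. This is only an illustration of the arithmetic; the helper names fahrenheit_to_celsius() and inches_to_mm() are hypothetical and are not functions exported by GSODR.

```r
# Hypothetical helpers illustrating the unit conversions; these are
# assumptions for this sketch, not part of GSODR's API.
fahrenheit_to_celsius <- function(temp_f) {
  (temp_f - 32) * 5 / 9
}

inches_to_mm <- function(depth_in) {
  depth_in * 25.4
}

# Freezing and boiling points of water, and one inch of precipitation
fahrenheit_to_celsius(c(32, 212))
inches_to_mm(1)
```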

Additional elements are calculated by this R package from the original data and included in the final data. These include actual vapour pressure (ea), saturation vapour pressure (es) and relative humidity.
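As a rough illustration of how such elements can be derived from mean temperature and dew point, the sketch below uses a Tetens-type approximation for saturation vapour pressure. The formula, its coefficients and the helper names are assumptions made for this example and may differ from what the package uses internally.

```r
# Saturation vapour pressure (kPa) at temperature `temp_c` (degrees C),
# using a Tetens-type approximation. The coefficients are an assumption
# for this sketch, not confirmed to match GSODR's internals.
sat_vapour_pressure <- function(temp_c) {
  0.61078 * exp((17.2694 * temp_c) / (temp_c + 237.3))
}

# Hypothetical helper: derive EA, ES and RH from temperature and dew point
derive_humidity <- function(temp_c, dewp_c) {
  es <- sat_vapour_pressure(temp_c)  # saturation vapour pressure
  ea <- sat_vapour_pressure(dewp_c)  # actual vapour pressure, from dew point
  data.frame(EA = ea, ES = es, RH = ea / es * 100)
}

derive_humidity(temp_c = c(25, 10), dewp_c = c(15, 10))
```

When the dew point equals the air temperature, the two vapour pressures coincide and relative humidity is 100%.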

It is recommended that you have a good Internet connection to download the data files as they can be quite large and slow to download.

For more information see the description of the data provided by NCEI, http://www7.ncdc.noaa.gov/CDO/GSOD_DESC.txt.

Quick Start Install

Stable Version

A stable version of GSODR is available from CRAN.

install.packages("GSODR")

Development Version

A development version is available from GitHub. It may have new features or bug fixes before the CRAN version does, but it also may not work properly. To install it, first install the devtools package, available from CRAN. We strive to keep the master branch on GitHub functional and working properly.

#install.packages("devtools")
devtools::install_github("ropensci/GSODR", build_vignettes = TRUE)

Using GSODR

GSODR’s main function, get_GSOD(), downloads and cleans GSOD data from the NCEI server. Following is an example of its capabilities; for more detailed examples, please see the GSODR introduction vignette.

Example - Download weather station data for Toowoomba, Queensland for 2010

library(GSODR)
## 
## GSOD is distributed free by the U.S. NCEI with the
## following conditions.
## 'The following data and products may have conditions placed
## their international commercial use. They can be used within
## the U.S. or for non-commercial international activities
## without restriction. The non-U.S. data cannot be
## redistributed for commercial purposes. Re-distribution of
## these data by others must provide this same notification.
## WMO Resolution 40. NOAA Policy'
## 
## GSODR does not redistribute any weather data itself. It 
## only provides an interface for R users to download these
## data, but it does redistribute station metadata in the
## package.
tbar <- get_GSOD(years = 2010, station = "955510-99999")

tbar
## # A tibble: 365 x 48
##      USAF  WBAN        STNID          STN_NAME  CTRY STATE  CALL    LAT
##     <chr> <chr>        <chr>             <chr> <chr> <chr> <chr>  <dbl>
##  1 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  2 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  3 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  4 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  5 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  6 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  7 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  8 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
##  9 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
## 10 955510 99999 955510-99999 TOOWOOMBA AIRPORT    AS  <NA>  <NA> -27.55
## # ... with 355 more rows, and 40 more variables: LON <dbl>, ELEV_M <dbl>,
## #   ELEV_M_SRTM_90m <dbl>, BEGIN <dbl>, END <dbl>, YEARMODA <date>,
## #   YEAR <chr>, MONTH <chr>, DAY <chr>, YDAY <dbl>, TEMP <dbl>,
## #   TEMP_CNT <int>, DEWP <dbl>, DEWP_CNT <int>, SLP <dbl>, SLP_CNT <int>,
## #   STP <dbl>, STP_CNT <int>, VISIB <dbl>, VISIB_CNT <int>, WDSP <dbl>,
## #   WDSP_CNT <int>, MXSPD <dbl>, GUST <dbl>, MAX <dbl>, MAX_FLAG <chr>,
## #   MIN <dbl>, MIN_FLAG <chr>, PRCP <dbl>, PRCP_FLAG <chr>, SNDP <dbl>,
## #   I_FOG <int>, I_RAIN_DRIZZLE <int>, I_SNOW_ICE <int>, I_HAIL <int>,
## #   I_THUNDER <int>, I_TORNADO_FUNNEL <int>, EA <dbl>, ES <dbl>, RH <dbl>
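Once the data are in the session, ordinary data frame tools apply. The sketch below computes monthly mean temperature and total precipitation with base R. To stay runnable without a download, it uses a small synthetic data frame: the values are invented, and only the column names (YEARMODA, MONTH, TEMP, PRCP) follow the output shown above.

```r
# Synthetic stand-in for the tibble returned by get_GSOD(); the values
# are invented and only the column names follow the output above.
weather <- data.frame(
  YEARMODA = as.Date(c("2010-01-01", "2010-01-02", "2010-02-01", "2010-02-02")),
  MONTH = c("01", "01", "02", "02"),
  TEMP = c(22.0, 24.0, 21.0, 23.0),
  PRCP = c(0.0, 5.2, 1.0, 0.0)
)

# Mean temperature and total precipitation per month, joined on MONTH
monthly <- merge(
  aggregate(TEMP ~ MONTH, data = weather, FUN = mean),
  aggregate(PRCP ~ MONTH, data = weather, FUN = sum)
)
monthly
```

The same calls should work on a real result such as tbar, since a tibble is also a data frame.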

Other Sources of Weather Data in R

There are several other sources of weather data and ways of retrieving them through R. Several are also rOpenSci projects.

rnoaa, from rOpenSci, offers tools for interacting with and downloading weather data from the United States National Oceanic and Atmospheric Administration (NOAA) but lacks support for GSOD data.

bomrang, from rOpenSci, provides functions to interface with Australian Government Bureau of Meteorology (BoM) data, fetching data and returning a tidy data frame of précis forecasts, current weather data from stations, agriculture bulletin data, BoM 0900 or 1500 weather bulletins, or a raster stack object of satellite imagery from GeoTIFF files. Data (c) Australian Government Bureau of Meteorology Creative Commons (CC) Attribution 3.0 licence or Public Access Licence (PAL) as appropriate. See http://www.bom.gov.au/other/copyright.shtml for further details.

riem, from rOpenSci, retrieves weather data from Automated Surface Observing System (ASOS) stations (airports) around the world via the Iowa Environmental Mesonet website.

clifro, from rOpenSci, is an R interface to CliFlo, a web portal to the New Zealand National Climate Database that provides public access (via subscription) to around 6,500 climate stations (see https://cliflo.niwa.co.nz/ for more information). With clifro, collating and manipulating data from CliFlo and importing them into R for further analysis, exploration and visualisation is straightforward and coherent. The user needs an Internet connection and, if data are sought from stations other than the public Reefton electronic weather station, a current CliFlo subscription (free).

weatherData provides a selection of functions to fetch weather data from Weather Underground and return it as a clean data frame.

Other Sources for Fetching GSOD Weather Data

GSODTools by Florian Detsch is an R package that offers functionality similar to that of GSODR, but can also graph the data and work with them for time series analysis.

The ULMO library offers an interface to retrieve GSOD data using Python.

Notes

Other Data Sources

Elevation Values

Hole-filled 90 m SRTM digital elevation model (DEM) data (Jarvis et al. 2008) were used to identify and correct or remove elevation errors in data for station locations between -60° and 60° latitude. This also applies to cases where elevation was missing from the reported values. Where a station reported an elevation and the DEM has no value, the station-reported value is used. For stations beyond -60° and 60° latitude, the values are station-reported in every instance. See https://github.com/ropensci/GSODR/blob/master/data-raw/fetch_isd-history.md for more detail on the correction methods.
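The selection rule in this paragraph can be sketched as a small vectorised function. This is a simplification, not the package's actual correction code: it reduces "identify and correct" to "prefer the SRTM value when one exists", treats the ±60° bounds as inclusive, and uses a hypothetical helper name, choose_elevation(). The linked document describes the real method.

```r
# Hypothetical helper illustrating the rule described above; a
# simplified assumption, not GSODR's actual correction method.
choose_elevation <- function(lat, elev_station, elev_srtm) {
  # SRTM values are only considered between -60 and 60 degrees latitude
  in_srtm_range <- lat >= -60 & lat <= 60
  use_srtm <- in_srtm_range & !is.na(elev_srtm)
  # Otherwise fall back to the station-reported elevation
  ifelse(use_srtm, elev_srtm, elev_station)
}

# Station 1: elevation missing, SRTM available
# Station 2: SRTM missing, station value kept
# Station 3: beyond 60 degrees latitude, station value kept
choose_elevation(
  lat = c(-27.55, 10, 75),
  elev_station = c(NA, 120, 30),
  elev_srtm = c(642, NA, 28)
)
```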

WMO Resolution 40. NOAA Policy

Users of these data should take into account the following (from the NCEI website):

“The following data and products may have conditions placed on their international commercial use. They can be used within the U.S. or for non-commercial international activities without restriction. The non-U.S. data cannot be redistributed for commercial purposes. Re-distribution of these data by others must provide this same notification.” WMO Resolution 40. NOAA Policy

Meta

References

Jarvis, A., Reuter, H. I., Nelson, A., Guevara, E. (2008) Hole-filled SRTM for the globe Version 4, available from the CGIAR-CSI SRTM 90m Database (http://srtm.csi.cgiar.org)
