Data quality control


Data quality control followed the procedures accepted by the Ocean Climate Laboratory of NODC (Ingleby and Huddleston, 2007; Johnson et al., 2009).

The same general approaches to data quality were applied in creating this Atlas and the 2006 and 2008 Climatic Atlases of the Sea of Azov (Матишов и др., 2005; 2009; Matishov et al., 2009; 2010; Moiseev et al., 2012).

Quality control procedures include:

  • Automated quality control of the data, performed by computer software developed for this purpose;
  • Subjective analysis performed by a specialist.

The procedure is iterative, since corrections are themselves subject to further automated and subjective checks.

Automated quality control searches the data for possible errors and evaluates them. The result is a report listing the registered errors and warnings.
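
As an illustration only, the sketch below shows how such a report entry might be represented in Python; the field names, severity levels, and example values are assumptions made for the sketch, not the actual report format of the Atlas software.

```python
from dataclasses import dataclass

@dataclass
class QCMessage:
    """One entry of an automated quality-control report (illustrative)."""
    cruise_id: str    # cruise the record belongs to (hypothetical identifier)
    station_id: str   # station being checked (hypothetical identifier)
    check: str        # name of the check that was triggered
    severity: str     # "error" or "warning"
    details: str      # human-readable description

# Hypothetical report produced by the automated checks:
report = [
    QCMessage("cruise-001", "station-012", "vessel_speed", "error",
              "implied speed between stations exceeds the vessel's maximum"),
    QCMessage("cruise-001", "station-017", "missing_data", "warning",
              "no salinity values recorded for this station"),
]
```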

All checks may be divided into several groups:

  • Control of data formats;
  • Check of spatial and temporal location;
  • Check of the vertical structure of profiles;
  • Check of measurement values;
  • Search for duplicates.

The first group of checks covers dates and times (e.g., station dates against the cruise duration, and the order in which stations were occupied). Station and cruise records are also checked for missing data.
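
A minimal sketch of these first-group checks, assuming each station carries a timestamp and each cruise a start and end date; the function and variable names are illustrative, not part of the Atlas software.

```python
from datetime import datetime

def check_station_times(cruise_start, cruise_end, station_times):
    """Flag stations whose timestamps fall outside the cruise period
    or break the chronological order of the cruise."""
    warnings = []
    for i, t in enumerate(station_times):
        if not (cruise_start <= t <= cruise_end):
            warnings.append(f"station {i}: time {t} outside cruise period")
        if i > 0 and t < station_times[i - 1]:
            warnings.append(f"station {i}: out of chronological order")
    return warnings

# Hypothetical usage: the second station is dated after the cruise ended.
print(check_station_times(datetime(2003, 7, 1), datetime(2003, 7, 20),
                          [datetime(2003, 7, 2), datetime(2003, 7, 25)]))
```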

The second group of quality control verifies the accuracy of coordinates and time stamps (date and time):

  • Consistency of the time interval between two consecutively visited stations with the vessel’s maximum speed (see the sketch after this list);
  • Station coordinates that fall on land;
  • Station depth in relation to known bathymetry;
  • The vessel’s route, to detect sudden, sharp changes of direction (e.g. zigzags).
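
A minimal sketch of the vessel-speed check, assuming stations are given as (latitude, longitude, datetime) tuples; the maximum speed of 15 knots is an illustrative assumption, not a value from the Atlas.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi, dlam = radians(lat2 - lat1), radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def implied_speed_ok(station1, station2, max_speed_knots=15.0):
    """Check that the speed implied by two consecutively visited stations
    (lat, lon, time) does not exceed the vessel's maximum speed."""
    dist_km = haversine_km(station1[0], station1[1], station2[0], station2[1])
    hours = (station2[2] - station1[2]).total_seconds() / 3600.0
    if hours <= 0:
        return False  # a zero or negative time interval is itself an error
    speed_knots = (dist_km / 1.852) / hours  # km -> nautical miles -> knots
    return speed_knots <= max_speed_knots
```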

The third group of quality control verifies:

  • Duplicated data;
  • Negative data values;
  • Validity of bathymetry data points;
  • Consistency of profile depths with the station depth (where a station depth is available);
  • Validity of vertical gradients of hydrological and hydrochemical parameters (see the sketch after this list).
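
A minimal sketch of two of these third-group checks on a single profile: levels must deepen monotonically and not exceed the station depth, and vertical gradients must stay within an allowable limit. The gradient threshold is an illustrative assumption; in practice it depends on the parameter and region.

```python
def check_profile(depths, values, station_depth=None, max_gradient=1.0):
    """Flag suspect levels in one vertical profile.
    depths and values are parallel lists ordered from the surface down;
    max_gradient is the allowable change per metre (illustrative value)."""
    warnings = []
    for i in range(1, len(depths)):
        dz = depths[i] - depths[i - 1]
        if dz <= 0:
            warnings.append(f"level {i}: depth does not increase")
            continue
        gradient = abs(values[i] - values[i - 1]) / dz
        if gradient > max_gradient:
            warnings.append(f"level {i}: gradient {gradient:.2f} per metre is too large")
    if station_depth is not None and depths and depths[-1] > station_depth:
        warnings.append("deepest level exceeds the station depth")
    return warnings
```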

The fourth group of quality control verifies that measured values fall within the ranges known to be allowable for the given water area, season, and time of day. Water temperature and salinity measurements are checked for density inversions and for temperatures below the freezing point. The significance of measured values is also verified with respect to the instruments used. Related parameters are compared where appropriate; for example, water salinity is checked against chlorinity.
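
A minimal sketch of two such checks: temperature is compared with the freezing point implied by salinity (the standard seawater freezing-point formula at atmospheric pressure), and salinity is compared with chlorinity through the conventional relation S = 1.80655 · Cl. The tolerance value is an illustrative assumption.

```python
def freezing_point(salinity):
    """Freezing point of seawater at atmospheric pressure (standard formula)."""
    return (-0.0575 * salinity
            + 1.710523e-3 * salinity ** 1.5
            - 2.154996e-4 * salinity ** 2)

def check_temperature_salinity(temperature, salinity, chlorinity=None,
                               tolerance=0.05):
    """Flag temperatures below the freezing point and salinity values
    inconsistent with chlorinity (tolerance is an illustrative value)."""
    warnings = []
    if temperature < freezing_point(salinity):
        warnings.append("temperature below the freezing point for this salinity")
    if chlorinity is not None:
        expected = 1.80655 * chlorinity
        if abs(salinity - expected) > tolerance:
            warnings.append("salinity inconsistent with chlorinity")
    return warnings
```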

In addition to verifying the correctness of the data, there is the problem of duplicate information. With a large number of information sources, the probability of receiving the same data more than once increases considerably, so the search for and elimination of duplicates is an important task. It is complicated by the fact that some of the data arrive not in their original form but after processing whose characteristics are, as a rule, unknown.
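
A minimal sketch of a duplicate check: two stations are treated as probable duplicates when their positions, times, and measured values are close, even if one of the profiles was rounded or otherwise processed after the original observation. All tolerances and field names are illustrative assumptions.

```python
def probable_duplicates(station1, station2,
                        max_deg=0.01, max_hours=1.0, max_value_diff=0.1):
    """station1 and station2 are dicts with 'lat', 'lon', 'time', 'values'."""
    if abs(station1['lat'] - station2['lat']) > max_deg:
        return False
    if abs(station1['lon'] - station2['lon']) > max_deg:
        return False
    if abs((station1['time'] - station2['time']).total_seconds()) / 3600.0 > max_hours:
        return False
    common = min(len(station1['values']), len(station2['values']))
    if common == 0:
        return False
    diffs = [abs(a - b) for a, b in zip(station1['values'][:common],
                                        station2['values'][:common])]
    return max(diffs) <= max_value_diff
```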

For visual control of spatial location, all stations are plotted on a chart in a geographic information system (ArcGIS Desktop 9.x).
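
The Atlas work used ArcGIS Desktop for this step; purely as a tool-agnostic illustration, the sketch below plots station positions with matplotlib so that their spatial layout can be inspected visually. It is not the procedure actually used.

```python
import matplotlib.pyplot as plt

def plot_stations(lons, lats, labels=None):
    """Plot station positions for visual inspection of their spatial layout."""
    fig, ax = plt.subplots()
    ax.scatter(lons, lats, marker='.', color='black')
    if labels:
        for lon, lat, label in zip(lons, lats, labels):
            ax.annotate(label, (lon, lat), fontsize=7)
    ax.set_xlabel('Longitude, °E')
    ax.set_ylabel('Latitude, °N')
    ax.set_title('Station positions')
    plt.show()
```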

At the final stage of data processing, the data are imported into the database with the help of programs developed for this purpose (Matishov et al., 2009; 2010). All oceanographic data used for the development of atlases within the International Ocean Atlas and Information Series are available without restriction via the Internet portal supported by the NOAA National Oceanographic Data Center.