
4 Data

4.1 Environmental variables

4.1.1 EEZ

Table 1 lists the candidate environmental variables for the EEZ classification. The rationale for these candidate variables is fully discussed in the draft design report (Snelder et al. 2001) and the development of each variable is discussed in more detail by Hadfield et al. (2002). Many of these variables vary over time (e.g. solar radiation, orbital velocity). The variables therefore represent long-term average values and were designed to discriminate differences in mean characteristics between locations (not between times). The following section briefly discusses the relevance of these variables and the development of the spatial coverages describing them. Figure 1 maps the eight variables that were eventually chosen to define the EEZ classification (see Hadfield et al. 2002 for maps of all candidate variables).

Table 1: Candidate environmental variables derived for the EEZ classification

View candidate environmental variables derived for the EEZ classification (large table)

Figure 1: Maps of the environmental variables derived for the EEZ classification (Depth, Mean orbital velocity, Wintertime SST, Annual mean solar radiation, Annual Amplitude SST, Spatial gradient SST, Tidal currents, Slope)

A map of the EEZ is shown for each variable. The range for each variable is depicted by using colour to express the value. Variables and their data range were as follows: Depth (0 - 10,000 m), Orbital Velocity (0 - 2478 mm/s), Wintertime SST (0.005 - 40 °C), Annual Mean Surface Irradiance (90 - 250 W/m2), Annual Amplitude SST (0.000017 - 5 °C), Spatial Gradient SST (0.000003 - 0.2 °C), Depth Averaged Maximum Tidal Current (0 - 3.2 m/s) and Slope Value (0 - 0.999851). Generally speaking, the EEZ varies significantly in depth and slope because of a number of continental shelves; orbital velocity is higher close to the coastline; tidal currents are influenced by the coastline and by changes in seabed slope; and water temperature rises towards the north and/or in shallower waters.


Depth was chosen because it is correlated with many physical drivers of biological distributions. Light, temperature, pressure and salinity all vary with depth, although mostly in a non-linear fashion. Depth also mediates the supply of organic matter from the surface to the seabed. Depth was estimated for each cell in the 1 km classification grid from a NIWA bathymetric layer interpolated from a large quantity of depth data of variable quality and resolution.

Annual mean surface solar radiation is an important factor controlling rates of primary production. The pattern of solar radiation variation over the EEZ is essentially one of latitudinal variation, which is modified by cloud cover. The clear-sky solar irradiance was calculated from the instantaneous solar elevation using the method of Davies et al. (1975), with allowances for atmospheric water vapour and dust appropriate for clean oceanic air at 40°S (water vapour content = 1.6 cm, dust transmission coefficient = 0.95). Daily mean solar irradiance was then calculated by numerical integration of clear-sky solar irradiance for noon-time solar elevation calculated for the mid-date of each month, combined with monthly-mean cloud cover data from the International Satellite Cloud Climatology Project D2 dataset of global cloud parameters monthly means from July 1983 through December 1995 (Rossow and Schiffer 1999).

Winter surface solar radiation was derived in order to discriminate between locations that have similar mean annual solar radiation, yet have differences in maximum summer or minimum winter solar radiation and, therefore, have different productivity. Winter surface solar radiation was calculated as for annual mean surface solar radiation for the shortest day of the year (day 172, late June) and combined with cloud cover data for June.

Sea surface temperature (SST) was expressed by four variables formulated to capture specific oceanographic processes, both physical and chemical, that affect biological pattern (Snelder et al. 2001). The variable layers based on SST are all calculated from an SST climatology dataset derived from the NIWA SST archive. The procedures for collecting satellite radiometer data, detecting cloud and retrieving SST are described by Uddstrom and Oien (1999). The climatology was prepared by compositing data for each of the 96 months in the years 1993 to 2000 on a grid with approximately 9 km resolution. The climatologies were later interpolated onto the 1 km2 classification grid. This interpolation was considered reasonable because of the relatively smooth and slowly changing character of most of the SST variables.

Wintertime SST was chosen as a proxy for water mass, which is related to differences in both temperature and chemical characteristics of the water, including nutrient availability. Wintertime SST was evaluated by spatial smoothing of temperature at the time of typically lowest SST (day 250, early September).

The annual amplitude of SST was chosen to reflect differences in stratification and wind mixing that together produce a mixed layer across the classified area. Annual amplitude of SST was evaluated from the spatially smoothed annual harmonic.

The spatial gradient of annual mean SST is used to recognise fronts in oceanic water masses that are expected to correlate with variation in primary productivity. Spatial gradient of annual mean SST was produced by smoothing annual mean SST and then evaluating the magnitude of the spatial gradient (in °C km-1) for each grid cell by centred differencing.

The summertime SST anomaly is expected to define anomalies in temperature that are due to hydrodynamic forcing, such as upwelling and vigorous mixing due to eddies. Areas with high summer SST anomaly are expected to correlate with high primary productivity. Summer SST anomaly was derived from SST measured in late February (day 50), the time of year when SST is typically highest, by band-pass filtering at scales between 20 and 450 km.
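The centred-differencing step used for the spatial gradient of annual mean SST can be illustrated with a minimal sketch. It assumes a regular grid with a known cell size in kilometres and omits the prior smoothing step; the function name and the synthetic SST field are hypothetical and are not taken from the original processing.

```python
import numpy as np

def sst_gradient_magnitude(sst, cell_km=1.0):
    """Magnitude of the spatial SST gradient (deg C per km) on a regular grid."""
    # np.gradient uses centred differences in the interior of the grid
    # and one-sided differences along the edges.
    d_dy, d_dx = np.gradient(sst, cell_km)
    return np.hypot(d_dx, d_dy)

# Synthetic field (assumption): SST increases by 0.01 deg C per km along x
# and by 0.02 deg C per km along y.
y, x = np.mgrid[0:50, 0:60]
sst = 15.0 + 0.01 * x + 0.02 * y
grad = sst_gradient_magnitude(sst, cell_km=1.0)
```

For this linear test field the gradient magnitude is the same in every cell, which makes it a convenient check before applying the function to a real smoothed climatology.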

Mean orbital velocity and extreme orbital velocity describe the variation in velocity at the sea bed that is induced by swell waves. This velocity plays an important role in structuring benthic communities by inducing bed stress and re-suspension of bed material. Both average and extreme (represented here by the 95th percentile) orbital velocities were considered to be potentially important. The mean orbital velocity represents the variation in mean wave energy whereas extreme orbital velocity discriminates locations on the basis of rare high magnitude wave events. The EEZ scale orbital velocity variables were based on a wave climatology derived from a 20-year hindcast (1979-1998) of swell wave conditions in the New Zealand region (Gorman and Laing 2000). The wave climatology was used to interpolate the mean and 95th percentile values of significant wave height and mean values of wave peak period onto the 1 km bathymetry grid. The wave height, period and depth were used to estimate mean and 95th percentile bed orbital velocities. Bed orbital velocities were assumed to be zero where depth was greater than 200 m. No accounting was made for refraction or sheltering by land inside the 50 m isobath, resulting in some unreasonably high values in sheltered coastal environments.
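The report does not state which formula was used to convert wave height, period and depth into bed orbital velocity. As an illustration only, the sketch below assumes linear (Airy) wave theory, in which the near-bed orbital velocity amplitude is pi * H / (T * sinh(k * h)) and the wavenumber k satisfies the dispersion relation omega^2 = g * k * tanh(k * h). The function names are hypothetical; the 200 m cutoff follows the text above.

```python
import numpy as np

G = 9.81  # gravitational acceleration, m/s^2

def wavenumber(period_s, depth_m, iterations=20):
    """Solve omega^2 = g k tanh(k h) for k by Newton's method."""
    omega2 = (2.0 * np.pi / period_s) ** 2
    alpha = omega2 * depth_m / G
    k = alpha / np.sqrt(np.tanh(alpha)) / depth_m  # Eckart-style first guess
    for _ in range(iterations):
        kh = k * depth_m
        f = G * k * np.tanh(kh) - omega2
        df = G * (np.tanh(kh) + kh / np.cosh(kh) ** 2)
        k = k - f / df
    return k

def bed_orbital_velocity(height_m, period_s, depth_m, cutoff_m=200.0):
    """Near-bed orbital velocity amplitude (m/s) under linear wave theory."""
    height_m, period_s, depth_m = np.broadcast_arrays(
        np.asarray(height_m, dtype=float),
        np.asarray(period_s, dtype=float),
        np.asarray(depth_m, dtype=float),
    )
    # Clip depth so the calculation stays finite; deep cells are zeroed below.
    h = np.clip(depth_m, 1e-3, cutoff_m)
    k = wavenumber(period_s, h)
    u = np.pi * height_m / (period_s * np.sinh(k * h))
    # Assumed zero wherever the water is deeper than the cutoff.
    return np.where(depth_m > cutoff_m, 0.0, u)

# Example (assumed values): a 2 m, 10 s swell over three depths.
u_bed = bed_orbital_velocity(2.0, 10.0, np.array([20.0, 100.0, 500.0]))
```

Because the sketch ignores refraction and sheltering, it shares the limitation noted above: values in sheltered shallow water would be overestimated.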

Tidal current can be important in structuring benthic communities and also affects mixing properties of the water column. Variation in tidal currents was described using the modelled maximum depth-averaged tidal currents (m s-1). The tidal current layer was derived using the model described by Walters et al. (2001).

Seabed relief was developed into four layers from analysis of the 1 km bathymetry grid. These were (1) curvature, (2) profile, (3) plan, and (4) slope. Each of these variables was computed for each grid cell by analysis of the surrounding cells in the bathymetry grid (Hadfield et al. 2002).

Sediment type is a factor that determines the composition of benthic communities. Variation in sediment types was derived from the New Zealand Region Sediments chart (scale of 1:6,000,000) (Mitchell et al. 1989). The chart was digitised and converted to a grid showing 23 categories based on the dominant and subdominant sediment type. These sediment types were also converted to effective particle size and averaged and ranked to give the continuous variable rank sediment size, a variable suitable for correlation analyses. Although this variable showed some relationships with biological datasets (see Image et al. 2003), it was eventually discarded because of difficulties with including this categorical variable in the classification procedure (see section 7.1.1).

Freshwater input was recognized as an important variable, particularly in coastal waters (Snelder et al. 2001). Although a freshwater fraction layer for the EEZ would probably best be developed from remotely sensed data, such a product was not available and a placeholder for the freshwater fraction variable was used instead. This was based on a simple GIS-based routine that modelled the mixing and dispersal of freshwater inputs from rivers into the coastal environment (Hadfield et al. 2002). Although this variable showed some relationships with biological datasets (see Image et al. 2003), it was eventually discarded because of concerns about its accuracy.

4.1.2 Regional scale - Hauraki

Table 2 lists the candidate environmental variables for the Hauraki Gulf classification. Maps of the eight variables eventually chosen to define the Hauraki Gulf classification are shown in Figure 2. The rationale for the selection of these candidate variables is fully discussed in the draft design report (Snelder et al. 2001) and the development of each variable is more fully discussed in Hadfield et al. (2002).

Table 2: Candidate environmental variables derived for the Hauraki Gulf classification

View candidate environmental variables derived for the Hauraki Gulf classification (large table)

Figure 2: Maps of the environmental variables derived for the Hauraki Gulf classification (Depth, SST annual phase, SST semi-annual amplitude, SST monthly standard deviation, Mean orbital velocity, Tidal currents, Freshwater fraction, Slope)

A map of the Hauraki Gulf is shown for each variable. The range for each variable is depicted by using colour to express the value. Variables and their data range were as follows: Depth (0.311050 - -180.442001 m), SST Annual Phase (31 - 62.9 days), SST Semi-Annual Amplitude (0.15 - 0.6 °C), SST Within Month Standard Deviation Value (0.4 - 0.9), Orbital Velocity (0 - 32.5 cm/s), Maximum Tidal Current (0.007 - 4.3 m/s), Freshwater Fraction Value (0 - 0.999851) and Slope Value (0 - 4.858312). Generally speaking, water depth decreases as you move into the Firth of Thames. Water temperatures on average become warmer in the Firth of Thames but this is associated with higher temperature variability. Freshwater fractions are concentrated at the bottom of the Firth of Thames and Waitemata Harbour (and generally increase at river mouths). Tidal currents are particularly strong around Cape Colville and within Waitemata Harbour. Changes in orbital velocity and slope are associated with proximity to the coastline.


In general, similar variables were developed for the Hauraki Gulf and EEZ classifications. However, there were some differences in choice of variables due to differences in scale. For example, solar radiation was not included as a candidate variable because at the scale of the Hauraki Gulf, solar radiation is effectively spatially invariant. The SST statistics were also different to those used for the EEZ classification. In addition, the need for increased spatial resolution for the Hauraki Gulf layers, compared to the EEZ, meant that higher resolution data and more detailed modelling were used to generate the Hauraki Gulf variables. The following section provides a brief discussion of the relevance and development of spatial coverages describing these variables.

Depth for each cell in the 200 m classification grid was interpolated from a large quantity of depth data of variable quality and resolution.

Five SST variables were formulated to capture specific oceanographic processes (Snelder et al. 2001). The variables that were based on SST were calculated from the same NIWA SST Archive as the EEZ classification variables but the grid spacing for the composited monthly data was reduced to 2 km. The data were later interpolated onto the 200 m classification grid. SST mean was used to capture the contrast between cool inner-gulf water and warmer East Auckland current water offshore. SST annual amplitude is related to the depth of the mixed layer, large amplitudes corresponding to deeper mixed layers. Annual amplitude decreases inshore because the mixed layer depth is limited by the depth of the water. SST annual phase is also related to mixed layer depth. Deeper mixed layers take longer to warm and cool seasonally so the phase of the annual cycle lags. It was considered that this variable may also have some direct effect on biota, in that, where the annual phase lag is large, the time of maximum irradiance may not coincide with the time of maximum temperature. SST semi-annual amplitude was chosen because the semi-annual harmonic causes the seasonal cycle to be distorted. The physical processes controlling this quantity are not well understood. One process that should be significant offshore is the seasonal variation in the mixed layer depth, which tends to allow a sharp SST maximum in summer but a broad SST minimum in winter. The SST within-month standard deviation was used as a measure of variability in SST. This quantity was expected to be large where strong eddy activity occurs in regions of strong spatial gradients. It may also be large in regions of large, variable freshwater influence.
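The SST mean, annual amplitude, annual phase and semi-annual amplitude can all be read from a harmonic fit to a monthly time series. The sketch below is one possible way to do this for a single grid cell, using ordinary least squares. It assumes that phase is expressed as the day of year on which the annual harmonic peaks, which may differ from the convention used for the classification layers; the function name and the synthetic series are hypothetical.

```python
import numpy as np

YEAR_DAYS = 365.25

def fit_seasonal_harmonics(day, sst):
    """Least-squares fit of a mean plus annual and semi-annual harmonics."""
    day = np.asarray(day, dtype=float)
    sst = np.asarray(sst, dtype=float)
    w = 2.0 * np.pi / YEAR_DAYS
    design = np.column_stack([
        np.ones_like(day),
        np.cos(w * day), np.sin(w * day),          # annual harmonic
        np.cos(2 * w * day), np.sin(2 * w * day),  # semi-annual harmonic
    ])
    coef, *_ = np.linalg.lstsq(design, sst, rcond=None)
    mean, a1, b1, a2, b2 = coef
    return {
        "mean": mean,
        "annual_amplitude": np.hypot(a1, b1),
        # Day of year at which the annual harmonic reaches its maximum.
        "annual_phase_days": (np.arctan2(b1, a1) / w) % YEAR_DAYS,
        "semiannual_amplitude": np.hypot(a2, b2),
    }

# Synthetic 96-month series (assumption): mean 17 deg C, annual amplitude
# 3 deg C peaking on day 45, semi-annual amplitude 0.4 deg C.
day = 15.0 + (YEAR_DAYS / 12.0) * np.arange(96)
w = 2.0 * np.pi / YEAR_DAYS
sst = 17.0 + 3.0 * np.cos(w * (day - 45.0)) + 0.4 * np.cos(2 * w * (day - 100.0))
stats = fit_seasonal_harmonics(day, sst)
```

Applying the same fit independently to every grid cell would give one map per statistic. The within-month standard deviation is not covered by this sketch because it requires the individual observations within each month.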

For the Hauraki Gulf classification an existing hydrodynamic model was used to estimate freshwater fraction and tidal current across the Hauraki Gulf. Tidal currents and freshwater dispersion in the Hauraki Gulf were simulated using the three-dimensional model MIKE 3, with one (depth-averaged) layer and a 750 x 750 m cell size. Tides were forced at the open boundaries using the M2 tidal component (i.e. no spring-neap variation). The model output the resulting depth-averaged maximum tidal current at each node in the model grid. The estimated mean freshwater inflows for the significant river systems draining into the Gulf were added to the model and it was run for a two-month period to allow the freshwater to disperse and modelled values to stabilise. The equilibrium freshwater fraction at each node in the grid was used to represent the mean freshwater fraction. The values at each model node were interpolated onto the 200 m grid.

Mean orbital velocity and extreme orbital velocity were derived from a simulation of the Hauraki Gulf using the SWAN shallow-water wave model (Booij et al. 1999; Ris et al. 1999). The model was driven using a boundary swell derived from NIWA's 20-year hindcast of wave conditions in the New Zealand region, and associated ECMWF (European Centre for Medium-Range Weather Forecasts) winds. The spatial grid had a 750-metre resolution covering the Hauraki Gulf.

The Hauraki Gulf sediment layer was derived by digitising the Hauraki Coastal Sediment Series chart (scale 1:200,000) (DSIR 1992). The dominant and subdominant sediment codes from the chart were used to categorise each grid cell. These sediment types were also converted to an ordinal variable representing effective particle size, which was suitable for correlation analyses. Although sediment showed some relationships with biological datasets (see Hewitt and Snelder 2003), it was eventually discarded because of difficulties with including categorical variables in the classification procedure (see section 8.1.1).

Four layers representing seabed relief were developed from analysis of the 200-metre bathymetry grid. These were (1) curvature, (2) profile, (3) plan and (4) slope. Each of these variables was computed for each grid cell by analysing the surrounding cells in the bathymetry grid (Hadfield et al. 2002).

4.2 Biological data

4.2.1 EEZ biological data sets

Biological data used in the EEZ validation were drawn from four sources as follows. Research trawlers have collected a large dataset describing the distributions of mainly demersal fish species since 1961 (Figure 3). These data, hereinafter called the 'fish dataset', are fully described by Francis et al. (2002), who used them to describe demersal fish assemblages in New Zealand waters. The dataset contained 19,232 stations and 123 species after removal of stations that fell outside the scope of the environmental variable grids and rare species that did not occur in more than 1% of the trawls. Because sampling efficiency varies due to differences in nets (types and sizes) and vessels (towing power), this dataset was amenable to presence/absence analysis only.
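The rare-species screening described above can be sketched as a filter on a station-by-species presence/absence matrix. This is an illustration under stated assumptions: the matrix layout (stations in rows, species in columns), the function name and the toy data are hypothetical, and only the 1% threshold comes from the text.

```python
import numpy as np

def drop_rare_species(presence, min_fraction=0.01):
    """Keep species that occur in more than min_fraction of the stations."""
    presence = np.asarray(presence, dtype=bool)
    occurrence = presence.mean(axis=0)  # fraction of stations per species
    keep = occurrence > min_fraction
    return presence[:, keep], keep

# Toy matrix (assumption): 200 stations by 4 species.
presence = np.zeros((200, 4), dtype=bool)
presence[:50, 0] = True  # common species: 25% of stations
presence[:1, 1] = True   # rare species: 0.5% of stations
presence[:3, 2] = True   # 1.5% of stations, just above the threshold
# species 3 never occurs
filtered, keep = drop_rare_species(presence)
```

Only the first and third species survive the filter in this toy case, so the filtered matrix has two columns.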

Figure 3: Ministry of Fisheries research demersal fish trawl survey stations within the New Zealand Exclusive Economic Zone

A map of the EEZ shows the 19,232 stations included in the Fish Dataset. The points are concentrated on or over the continental shelf with few points in the deeper parts of the EEZ.

Benthic species data (presence/absence) were available from three continental shelf surveys, jointly called hereafter the 'shelf dataset'. In order to reduce any likely error associated with species level identifications, data were analysed at the taxonomic level of family. Analysis of data at the family level can be sufficient to identify natural spatial pattern in marine macrofauna assemblages (see Olsgard and Somerfield 2000). This dataset comprised 274 stations and 145 species.

Additional benthic data were obtained from NIWA's AllSeaBio database. Limitations with these data (see Image et al. 2003) restricted their use to species belonging mainly to the echinoderm orders Asteroidea and Ophiuroidea. These two orders were selected for their commonality and broad geographic/depth distribution. In addition, their taxonomic identification within the database was reliable due to recent attention (McKnight 2000; Clark and McKnight 2000; Clark and McKnight 2001).

Ocean colour data derived from the Sea-viewing Wide-Field-of-view Sensor (SeaWiFS) were used to estimate the mean chlorophyll concentration. Light data from the ocean surface in six visible wavebands collected between September 1997 and July 2001 were composited at a variety of spatial and temporal scales, partly to help overcome problems with cloud cover. This product was used in an empirical algorithm to retrieve the concentration of chlorophyll-a at a spatial resolution of about 9 km (Figure 4). The coverage of estimated long-term mean chlorophyll was randomly subsampled at approximately 9600 points and this 'chlorophyll dataset' was used for the validation, testing and tuning analyses. Because chlorophyll estimates in coastal waters are unreliable due to suspended solids in the water column, we only used data from water that was deeper than 30 m.

Figure 4: Mean annual sea surface chlorophyll concentrations within the New Zealand Exclusive Economic Zone derived from remotely-sensed (satellite) ocean colour data collected between September 1997 and July 2001

A map of the EEZ shows chlorophyll concentration data (0.02 - 3.00 mg m-3) by using colour to denote chlorophyll concentration. Chlorophyll concentrations tend to be higher in proximity to coastlines and above the continental shelf.

4.2.2 Hauraki biological data

Biological data used in the Hauraki Gulf validation are fully described by Fenwick and Flanagan (2002) and were drawn from four sources as follows. A large plankton dataset, hereafter called the pelagic dataset, was amenable to analysis of abundance/concentration. The pelagic dataset included chlorophyll concentration and abundance data for five types of large zooplankton (brachyuran and decapod shrimp larvae, Sagitta sp., medusae and enteropneust), a number of types of microzooplankton, and fish larvae and eggs. These data were collected from 54 stations (see Figure 5) at approximately 10 and 30 m depths from the months of November, December and January in 1985-87; September, October, December, January and February in 1996-98; and throughout 1999-2001. The validation analysis was restricted to the chlorophyll, large zooplankton and microzooplankton components of this dataset and to specific sampling occasions. In the subsequent tuning phase of the work, all biological components were amalgamated in a single community analysis and all sampling occasions were combined into a single average abundance/concentration for each station.

Figure 5: Location of stations for the Hauraki plankton dataset

A map of the Hauraki Gulf shows the position of the 54 stations used in the collection of data for the Pelagic Dataset.

Benthic datasets from within the Gulf, including the AllSeaBio database and data collected by other investigations comprising epifauna and infauna data, were collated and combined (see Fenwick and Flanagan 2002). Infaunal data that were sampled by coring, hereafter called the core dataset, were available for 216 stations. All but 39 of the core dataset stations were in the Firth of Thames; there were none in the middle deep areas or in the vicinity of Great Barrier Island. Infaunal data from grab sampling were available for 121 stations, hereafter called the grab dataset. All but 31 stations from the grab dataset were in the Firth of Thames; there were none in the middle deep areas or in any harbours or estuaries (see Figure 6). Not all of these sites were in the area covered by the environmental variables layers. In addition, a stratified (by location) random selection of the Firth of Thames samples was used to prevent the data from this area biasing the analyses.

Figure 6: Locations of benthic macrofauna sites. Blue circles are core sites, red stars are grab sites, black crosses are sites from the AllSeaBio dataset and purple triangles are sites that could not be used because they fell outside the data grid.

A map of the Hauraki Gulf shows the position of core sites, grab sites, AllSeaBio dataset sites and sites that could not be used in the collection of data for the benthic macrofauna data.

Demersal fish data, hereafter called the fish dataset, were as used in Kendrick and Francis (2002), except for 107 points outside the Hauraki grid and 86 points for which no environmental data were available (see Figure 7). These data were collected between 1982 and 1997 in spring and autumn using the same net type and ship (Kaharoa).

Figure 7: Locations of fish data stations in the Hauraki Gulf

A map of the Hauraki Gulf shows the points from which Fish dataset data were collected.

4.2.3 Limitations of biological datasets

Before any analyses were performed, we examined the environmental distribution of the sampling stations for the biological datasets relative to the total environmental variation described by the EEZ and Hauraki Gulf environmental variable layers. The representation of the environmental space by biological data was summarized in a frequency plot for each environmental variable that is overlaid with the corresponding frequency of biological sites. Examples of these plots are shown in Figures 8 and 9. The graphs show that large parts of the range of many of the environmental variables are not sampled by the biological datasets. This restricted our ability to validate the environmental variables and to test the effect of classification decisions such as transforming and/or weighting variables. In addition, these data were also used to test the classification. The lack of data over much of the environmental domain limited our ability to fully test the classification and to describe the biological characteristics of many environmental classes.
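The comparison behind these frequency plots can be sketched numerically: bin the values of one environmental variable over the whole grid, bin the values at the biological stations using the same bin edges, and flag bins that occur in the grid but contain no stations. The function name, bin count and synthetic depth values below are hypothetical and are not taken from the original analysis.

```python
import numpy as np

def coverage_histograms(grid_values, station_values, bins=20):
    """Relative frequencies of grid cells and stations on shared bins."""
    grid_values = np.asarray(grid_values, dtype=float)
    station_values = np.asarray(station_values, dtype=float)
    # Bin edges are taken from the grid so both histograms are comparable.
    edges = np.histogram_bin_edges(grid_values, bins=bins)
    grid_freq, _ = np.histogram(grid_values, bins=edges)
    station_freq, _ = np.histogram(station_values, bins=edges)
    grid_rel = grid_freq / grid_freq.sum()
    station_rel = station_freq / max(station_freq.sum(), 1)
    # Parts of the environmental range present in the grid but never sampled.
    unsampled = (grid_freq > 0) & (station_freq == 0)
    return edges, grid_rel, station_rel, unsampled

# Synthetic example (assumption): grid depths span 0-3000 m, while the
# stations only sample depths shallower than 1200 m.
grid_depth = np.linspace(0.0, 3000.0, 3001)
station_depth = np.linspace(10.0, 1190.0, 120)
edges, grid_rel, station_rel, unsampled = coverage_histograms(
    grid_depth, station_depth, bins=10
)
```

In this toy case the six deepest bins are flagged as unsampled, which mirrors the kind of gap in environmental coverage that the figures are intended to reveal.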

Figure 8: Representation of the environmental 'space' for the EEZ by the fish dataset.

Each plot shows the frequency distribution for each environmental variable (solid blue), which is overlaid by the corresponding frequency of fish stations (green line).

The figure shows a plot for each of the following environmental variables: Depth, Slope, Extreme Orbital Velocity, Mean Orbital Velocity, Annual Mean Solar Radiation, Winter Solar Radiation, Annual Amplitude of Sea Surface Temperature, Summertime Sea Surface Temperature Anomaly, Spatial Gradient Annual Mean Sea Surface Temperature, Wintertime Sea Surface Temperature and Tidal Current. Plots show the distribution of the environmental variable values overlaid with the frequency of fish stations sampled at a particular value of the variable. Plots describe how well the biological data coincided with the range of each variable.

Figure 9: Representation of the environmental 'space' for the Hauraki Gulf by the fish dataset.

Each plot shows the frequency distribution for each environmental variable (solid blue), which is overlaid by the corresponding frequency of biological stations (green line).

The figure shows a plot for each of the following environmental variables: Depth, Mean Freshwater Fraction, 95th Percentile Peak Bed Orbital Velocity, Mean Peak Bed Orbital Velocity, Slope, SST Annual Amplitude, SST Annual Phase, SST Mean, SST Within-Month Standard Deviation, SST Semi-Annual Amplitude and Depth Averaged Maximum Tidal Current. Plots show the distribution of the environmental variable values overlaid with the frequency of fish stations sampled at a particular value. Plots describe how well the biological data coincided with the range of each variable.