Appendix 2: Developing the Guidelines

Two basic approaches

Microbiological water-quality guidelines for recreational areas may be developed from two main strands of enquiry into health effects: epidemiological studies or quantitative risk assessment. In the former the focus is on direct measurement of health effects while in the latter the focus is first on pathogen concentrations, with health effects then being inferred using known dose-response relationships. Both types of study have been used to develop these guidelines: epidemiological for the marine waters and quantitative risk assessment for the freshwaters.

The main reason for choosing to base marine water guidelines on the epidemiological approach is that there is a wealth of results from such studies (Prüss 1998), including some for New Zealand (McBride, Salmond et al 1998). On the other hand very few such studies are available for freshwaters, and those that are available are nearly all confined to lakes, not rivers. However, we are now in possession of a large amount of data for pathogens and indicators in freshwaters and this has facilitated a quantitative risk assessment approach. Accordingly, the guidelines for freshwaters are based on this approach.

It is important to note that the wealth of results from international studies now available points overwhelmingly to an association between illness risk to recreational water users and the concentration of suitable faecal indicators (as reviewed by Prüss 1998). These results show that careful studies are needed to reveal the relationship, particularly because many of the illnesses concerned are mild and no records are kept of their occurrence (i.e. they are not ‘notifiable’). [More severe illnesses (e.g. typhoid) do occur among swimmers at grossly polluted beaches (e.g. in Egypt, El-Sharkawi & Hassan 1979; Cabelli 1983a).] Furthermore, these illnesses include both gastrointestinal and respiratory categories (when sought, respiratory illness effects have often been found; e.g. Fattal et al 1986; Corbett et al 1993; Fleisher, Kay, Salmon et al 1996; McBride, Salmond et al 1998). [Ear, nose, throat and skin symptoms are also found, often being attributed to bather-to-bather transmission, rather than to micro-organisms of faecal origin.] While some studies fail to detect an association (e.g. New Jersey Department of Health 1989), this appears to be caused either by a lack of sufficient statistical power or by the lack of an ‘exposure gradient’, i.e. a sufficient range of contamination in the waters studied (as was the case in the New Jersey study).

Epidemiological studies

In these studies one aims to discover the illness record of a number of water users who used a recreational site on a particular day – a day on which water-quality samples were also taken. This calls for an intensive effort in interviewing beach users on the sampling day, and following them up some days later to obtain a record of health effects; i.e. a record of self-diagnosis is obtained (medical records are not available for examination because most swimming-associated illnesses are mild and not notifiable). Differences in health effects between swimmers and non-swimmers are then sought, to get an estimate of any swimming-associated, pollution-related effects.

The group of people interviewed may be those who have decided of their own volition to attend the beach, without knowing that a study was in progress, in which case it is an uncontrolled prospective study. On the other hand people may be recruited into the study and taken to a particular beach where they may swim, in which case it is a controlled cohort study. Most epidemiological studies have been of the former kind, but more recent efforts have used the controlled approach.

Results from controlled cohort studies have recently been endorsed by the World Health Organization and are being incorporated into their international guidance (WHO 2003). Accordingly, these New Zealand guidelines are also based on that approach, marking a distinct change from previous editions of the New Zealand guidelines (1992, 1998, 1999).

Nevertheless, an understanding of the history of these studies may be helpful, and is given in the following sections.

US studies

Concerns about health risks to bathers in contaminated water in America in the 1940s led to the US Public Health Service conducting a series of uncontrolled prospective follow-up epidemiological studies at river, lake and coastal sites from 1948 to 1950 (Stevenson 1953). This was a large study that reported two statistically significant associations between swimmers’ health risk and water quality, measured as concentration of total coliforms. These two findings were for beaches on Lake Michigan, at Chicago, and for the Ohio River at Dayton, Kentucky (none were found for the two marine sites, in New York City).

The study design consisted of three major elements:

  • approaching people at the beach to see whether they would agree to being questioned a few days later about their health
  • making water-quality measurements at the beach on the same day as beach-goers are approached
  • within a few days, questioning those who agreed to the follow-up as to any subsequent illness, as well as a host of other possibly confounding factors (such as other swimming, foods eaten, animal contact and household sickness).

Elements of the design of studies used to develop guidelines have been questioned over the years (e.g. Cabelli et al 1975; Moore 1975). This has included the following objections:

  • ‘swimming’ did not necessarily include head immersion
  • total coliforms are not very effective indexes of faecal pollution
  • non-swimmers were not at the beach.

Such flaws were addressed in major further studies carried out in the US in the late 1970s and early 1980s for both marine waters (Cabelli 1983a) and freshwaters (Dufour 1984), with the motivation of providing public policy agencies with a relationship between “swimming-associated, pollution-related illness risk” and typical concentrations of faecal indicators (Cabelli et al 1975). The idea was that those agencies could use an ‘acceptable’ health risk to derive limits for faecal indicator concentrations. These researchers never advocated any particular acceptable values for these risks. Motivation for the other studies was similar. Some workers have actually recommended guidelines or standards (Grabow et al 1989; Wyer et al 1999).

Other marine water epidemiological studies have been carried out in the USA, at New Jersey beaches (New Jersey Department of Health 1989) and at Santa Monica Bay, near stormwater outfalls (Haile et al 1996). The latter study did discover health effects related to proximity to the outfalls.

UK studies

Meanwhile, in the UK a view had held sway since the late 1950s that swimmers’ health risk had no relationship to degree of faecal contamination, unless beaches were “aesthetically revolting” (MRC 1959) or “aesthetically very unsatisfactory” (PHLS 1959). That view was challenged increasingly over the years as being untenable (e.g. Kay & McDonald 1986a; 1986b): it was derived from a retrospective case control study for two severe and notifiable illnesses, and generalised to all illnesses (many of which are not notifiable, and so lack any substantial data). Accordingly, some prospective uncontrolled studies have also been carried out in the UK (Pike 1994). [This report reviews two sets of UK studies: controlled studies at four beaches (subsequently reported by Kay et al 1994 and by Fleisher, Kay, Salmon et al 1996), and uncontrolled studies at another eight beaches (partly reported by Balarajan et al 1989 and Alexander et al 1992). A full open-literature paper on the eight-beach study has not been sighted.]

Another UK group has proposed that such studies are better carried out using a recruited, controlled cohort approach (Jones et al 1991). This cohort is randomly split into swimming and non-swimming parts. All eat the same foods. The swimmers are directed where to swim and for how long (immersing the head three times). Many water-quality measurements are made at the assigned swimming points. The follow-up consists of both self-reporting and clinical examinations.

The findings of these careful studies (endorsed by Telford 1996) have reported that faecal streptococci were related to adverse gastrointestinal health effects (threshold 32 per 100 mL, Kay et al 1994) and to acute febrile respiratory illness (threshold 60 per 100 mL, Fleisher, Kay, Salmon et al 1996). (This definition of respiratory illness requires an accompanying fever, and so is more stringent than that used in other studies, such as the one carried out in New Zealand by McBride, Salmond et al 1998.)

Note that the analysis of these UK studies postulates the existence of a threshold effect (a value of water quality below which there is no illness risk to swimmers whatsoever and above which there is). Analyses of other studies (e.g. Cabelli 1983a; Dufour 1984) have used continuous relationships between water quality and swimmers’ health. This is more consistent with the mixture of ages and health status of usual beach-goers: while an individual may have a particular threshold, it is unlikely that a whole population would share the same value. For that reason the UK analysis notes that the threshold should not be considered as an “absolute” value, noting that it may be set to a lower value were the study to have included a larger number of people (Wyer et al 1999).

Other relevant UK marine water studies are Brown et al 1987; Balarajan et al 1989 (see also its discussion by Hall & Rodrigues 1992); Alexander et al 1992 and Fewtrell et al 1994. Freshwater studies for canoeists and rafters have been reported by Fewtrell et al (1992, 1994) and by Lee et al (1997).

New Zealand studies

Prospective epidemiological studies were carried out at seven New Zealand beaches in the 1995/96 bathing season (McBride, Salmond et al 1998). This was particularly driven by concerns in the Auckland region about possible health effects at marine beaches.

Prior to statistical analysis each beach was placed into one of three categories: (i) impacted by human wastes, (ii) impacted by animal wastes, or (iii) pristine.

An association between enterococci concentration and respiratory illness symptoms among those entering the water was identified. [Some statistically significant associations with faecal coliform and E. coli concentrations were also noted. However, their strength was lower because they tended not to rise through the quartiles of that indicator’s concentrations (whereas enterococci did so rise) and their relative risks were also lower than for enterococci.] This included “paddlers”, who entered the water but did not immerse their heads (e.g. tending small children). Relative risks in the highest enterococci quartile were rather high: 4.5 for the paddlers and 3.3 for long-duration swimmers. The unexpectedly limited range of beach contamination during the survey precluded the possibility of developing a detailed statistical model of health risk versus indicator density, as had been hoped. [The enterococci quartile cut-offs were 1.5, 3.75 and 13 enterococci per 100 mL.] No substantial differences in illness risks were found between the two types of impacted beach, but the health risks for both were separable from that at pristine beaches.

Studies in other countries

Other relevant epidemiological studies have been conducted in the following countries (an asterisk * denotes a retrospective study):

  • Australia: Corbett et al 1993; Harrington et al 1993
  • Canada: EHD 1980; Seyfried et al 1985a and 1985b; Lightfoot 1989
  • Egypt: Cabelli 1983a; El-Sharkawi & Hassan 1979*
  • France: Foulon et al 1983; Ferley et al 1989*
  • Holland: Medema et al 1995, 1997
  • Hong Kong: Holmes 1989; Cheung et al 1990; Kueh et al 1995
  • Israel: Fattal et al 1986, 1987, 1991
  • New Zealand: Bandaranayake et al 1993; McBride, Salmond et al 1998
  • Spain: Mujeriego et al 1982*; Mariño et al 1995b
  • South Africa: Von Schirnding et al 1992; 1993

Comparing controlled and uncontrolled studies

Most of the studies listed above are of the uncontrolled type, so it is appropriate to consider which of the two main approaches to epidemiological studies is the better.

The first consideration is to note that there is no optimal way of conducting epidemiological surveys – each approach has its drawbacks (Lacey and Pike 1989). Clearly some are better than others, and flaws in older studies have been identified and remedied in later studies, such that a reasonably consistent body of evidence has now been gathered (Prüss 1998). In spite of the uncertainties involved in epidemiological studies with low attack rates, such studies are often still considered to be the best line of approach (cf. a risk calculation approach), where feasible (Ware 1990). This view persists even though funds that would be spent on extensive interviewing in the epidemiological approach could be spent on a wider range of indicators and pathogens in a risk-calculation approach.

The controlled cohort prospective approach offers the most accurate methodology, minimising bias and providing for a balanced matching of swimmers and non-swimmers (Fleisher 1990b; Fleisher, Jones, Kay & Morano 1993). However, its use does require pre-publicity, which may cause enhanced self-reporting rates (Wheeler & Alexander 1992). Also, the cohort used (healthy adult volunteers) is not typical of the usual beach-going population (which includes many ages and variable health status), and the type of swimming activity may not be typical either – especially for high wave-energy New Zealand beaches where boogie boarding and body surfing are so popular. (This is important, because close-to-shore waters are often the more polluted.)

The uncontrolled cohort prospective approach offers the advantages of minimal pre-publicity and of using the actual population using the beach. It suffers from difficulties in assigning water quality to particular swimmers (according to where they swam) and in having a somewhat unbalanced set of swimmers and non-swimmers.

The view taken in these guidelines is that, given the endorsement of the WHO (2003), the results of the carefully conducted controlled cohort UK studies will be used as a basis for marine beach grading. Note that the illness rates reported by these studies, both for swimmers and for non-swimmers, tend to be higher than for uncontrolled studies.

The quantitative risk assessment approach

Some early work leading to the setting of water-quality microbiological standards and guidelines was based on a risk calculation approach. For example, Streeter (1951) calculated an individual’s risk of contracting typhoid fever or “diarrhoea-enteritis” assuming a concentration of 1000 total coliforms per 100 mL. For 90 consecutive daily exposures to this concentration the calculated risks were 1:950 and 1:50 for the two illnesses, respectively. Also, Furfari (1968) reports how shellfish standards have been calculated. [By requiring that no more than 50% of 1 mL samples were positive for coliforms, that being equivalent to an MPN of about 70 per 100 mL.]
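The equivalence quoted in the bracketed note above can be checked with a short calculation. The sketch below assumes that organisms are randomly (Poisson) distributed in the water, which is the usual basis of most-probable-number estimates; the function name is illustrative only and is not taken from Furfari (1968).

```python
import math

def concentration_from_positive_fraction(fraction_positive, sample_volume_ml):
    """Most probable concentration (per 100 mL), assuming Poisson-distributed organisms.

    P(sample is positive) = 1 - exp(-c * v), so c = -ln(1 - P) / v.
    """
    per_ml = -math.log(1.0 - fraction_positive) / sample_volume_ml
    return per_ml * 100.0

# 50% of 1 mL samples positive for coliforms
mpn_per_100ml = concentration_from_positive_fraction(0.5, 1.0)
print(round(mpn_per_100ml, 1))  # 69.3, i.e. about 70 per 100 mL
```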

Recent developments in this approach have noted a number of shortcomings (see especially Haas et al 1999). In particular:

  • water users experience a range of concentrations of pathogens and indicators from one day to the next, and even within a day
  • they also have variable rates of ingestion or inhalation of water, and for varying times
  • dose-response relationships (between illness risk and indicator density) have been lacking.

With the advent of powerful computer technology these issues can now be addressed relatively easily, using ‘Monte Carlo’ mathematical modelling. This is known as the Quantitative Risk Assessment (QRA) approach. Historical data is used to assign statistical distributions to the ingestion/inhalation rates, duration of exposure, and the concentration of pathogens in the water. Then a random sample is taken from each distribution to calculate the dose, which is then turned into infection or illness probabilities, or into cases, using a dose-response curve. This sampling is done many times over to simulate a large population being exposed to beach water that may, on some occasions, be contaminated.

When this is done for a population of people at a given beach (not dispersed over many beaches) the end result is that on a majority of occasions there are no cases of infection, but on a few occasions (when the contamination is unusually high and recreational water contact actually occurs) a number of infections, and hence illness, could occur.
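The Monte Carlo procedure described above can be sketched in a few lines of Python. This is a minimal illustration only: the lognormal concentration, the uniform ingestion rate and duration, the exponential dose-response form and every parameter value are placeholder assumptions, not the distributions used in the New Zealand study.

```python
import math
import random

def simulate_occasions(n_occasions=1000, people_per_occasion=100, seed=1):
    """Return the simulated number of infections on each occasion.

    Every distribution and parameter below is an illustrative placeholder,
    not a value taken from the guidelines or from McBride et al (2002).
    """
    rng = random.Random(seed)
    r = 0.02  # hypothetical parameter of an exponential dose-response curve
    infections_per_occasion = []
    for _ in range(n_occasions):
        # Pathogen concentration on this occasion (organisms per litre):
        # lognormal, so most occasions are low and a few are much higher.
        conc_per_litre = rng.lognormvariate(0.0, 2.0)
        infections = 0
        for _ in range(people_per_occasion):
            # Each person has their own ingestion rate and exposure duration.
            ingestion_ml_per_hour = rng.uniform(10.0, 100.0)
            duration_hours = rng.uniform(0.25, 2.0)
            dose = conc_per_litre * (ingestion_ml_per_hour / 1000.0) * duration_hours
            # Dose-response curve turns the dose into an infection probability.
            p_infection = 1.0 - math.exp(-r * dose)
            if rng.random() < p_infection:
                infections += 1
        infections_per_occasion.append(infections)
    return infections_per_occasion

results = simulate_occasions()
occasions_without_infection = sum(1 for n in results if n == 0)
print(occasions_without_infection, "of", len(results), "occasions had no infections")
print("largest number of infections on one occasion:", max(results))
```

Drawing one concentration per occasion and applying it to everyone at the beach on that occasion is what allows infections to cluster on a few heavily contaminated occasions, rather than being spread evenly over all occasions.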

The greatest weakness of this approach is the paucity of dose-response information. However, a surprising amount is now available – as reviewed by Teunis et al (1996) for gastrointestinal pathogens; McBride et al (2002) also include a review of material for adenovirus respiratory pathogens, only some of which is covered by Haas et al (1999).

The New Zealand study

The QRA approach has been reported in some detail for New Zealand freshwater recreational waters, as a consequence of a large national study at 22 river and three lake sites, in which five indicators and six pathogens were sampled fortnightly for 15 months in the period 1998–2000 (McBride et al 2002). The main reason for adopting the QRA approach was that epidemiological studies – either controlled or uncontrolled – were held to be unfeasible. While McBride et al (1996) concluded that a controlled cohort study would in fact be feasible, two subsequent considerations ruled it out. First was the difficulty in recruiting suitable cohorts within proper ethical requirements, and second was the paucity of available data on pathogens and indicators in New Zealand freshwaters. Given that there were many indications that freshwaters were rather more contaminated than marine beaches, it seemed prudent to attempt to plug this gap and to use the results in a QRA approach.

That study produced a wealth of information on the distribution of pathogens and indicators. In particular it found that Campylobacter was present in 60% of all samples, and that human enterovirus and/or human adenovirus were present in 54% of samples. Concentrations and occurrences of Salmonellae, Giardia cysts and Cryptosporidium oocysts were low. Catchments impacted predominantly by birds were the most contaminated, followed by those in which the dominant impact was from dairying or sheep farming. The degree of contamination was strongly related to the turbidity of the water.

In essence the study confirmed the continued use of E. coli as a faecal health-risk indicator, at least so far as Campylobacter is concerned. [Campylobacteriosis illness forms more than half of all reported notifiable illnesses in New Zealand, in recent times being around 300 cases per 100,000 population per annum (see the New Zealand Public Health Reports, published by ESR).] Unfortunately correlations between this indicator and the two virus groups examined (human enterovirus and human adenovirus) were very poor, as was the case for all other indicators examined – somatic coliphage, FRNA phage and C. perfringens spores. Also, correlation between the enterovirus and adenovirus groups was poor (enterovirus has also been used as a general virus indicator).

Accordingly, in the health-risk modelling particular attention was paid to Campylobacter infection as well as enterovirus and adenovirus infections. [The endpoint of this analysis was taken as infection, not illness, principally because once infection rates are controlled to low levels, so too will be the illness rate. Also there are some severe practical difficulties in determining the probability of illness (given infection has occurred) as a function of dose (Teunis et al 1999).] Of the two virus groups, dose-response information suggests strongly that adenovirus is much the more infective, with risk profiles for a given beach being very similar to those for Campylobacter infections.

Deriving the guidelines

The following sections describe the basis of the guidelines for marine water and then for freshwater, both for the grading of beaches and for their ongoing surveillance. In each case a subsection describes the changes from previous guidelines.

General considerations

Since 1999 the WHO has favoured using 95 percentiles of microbiological concentrations for grading beaches via a Microbiological Assessment Category (first proposed in the Annapolis Protocol, WHO 1999). That approach is adopted here; i.e. these 2003 guidelines incorporate a risk-based approach to monitoring recreational waters, in addition to single samples as used in previous guidelines. The purpose of incorporating risk assessment is to overcome the constraints of these previous guidelines, as discussed in Section C.

Taking risk into consideration when assessing a site is achieved via a grading process, combining historical microbiological results and sanitary inspection information to give an overall Suitability for Recreation Grade. This grade provides an assessment of the condition of a site at any given time, while single samples are used to identify any immediate health risk.

The WHO provides no guidance for surveillance values; their derivation is explained below.

See Note G(x) for discussion on the Annapolis Protocol.

Marine waters

Beach grading

Results of the UK controlled epidemiological trials have been used to develop a four-category scale, as shown in Table H1. The essential results are those for gastrointestinal illness (as reported by Kay et al 1994; Wyer et al 1999), accompanied by the results for respiratory illness reported by Fleisher, Kay, Salmon et al (1996). In essence 95 percentile enterococci values have been identified relating to cut-off gastrointestinal risks of 1%, 5%, and 10%. The associated respiratory illness risk cut-offs are 0.3%, 1.9% and 3.9%. [Note that these are risks of acute febrile respiratory illness (AFRI). In the New Zealand study (McBride, Salmond et al 1998), and in other studies, a less strict definition of respiratory illness has been used – essentially not requiring fever as an accompanying symptom. This was a deliberate choice in the New Zealand study, given our incidence of asthma. Accordingly, the risks of respiratory illness under that broader definition would be higher than these AFRI results indicate.] Readers are directed to these publications for details of the modelling approach used to derive these values.

Beach surveillance

Neither the WHO (2003) nor the authors of the UK studies on which the WHO guidance is based give any guidance for deriving surveillance values. Accordingly, we have used results from previous uncontrolled epidemiological studies, in particular those of Cabelli (1983a), also used in previous versions of the guidelines. While this could be argued to be somewhat dislocated, it has the advantage of maintaining good continuity with past practice.

These values are obtained by assuming that the distribution of enterococci is lognormal, the standard deviation of the logarithms of enterococci concentration is 0.7 (a reasonable average of available data) and the enterococci concentration is at the previous limit of a median of 35 per 100 mL (corresponding, under Cabelli’s model, to a swimming-associated risk of 19 per 1000 bathing events). Then the alert (amber) and action (red) limits are taken as the 80% and 90% upper one-sided tolerance limits for that distribution. [In the one-sided case, tolerance limits and confidence limits are operationally identical; this is not so in a two-sided case.] These figures may be simply calculated as 136 and 276 enterococci per 100 mL. [The 80 percentile and 90 percentile abscissae of the unit normal distribution are 0.8416 and 1.2816 respectively. So the one-sided 80% upper tolerance limit is $35 \times 10^{0.8416 \times 0.7} = 135.9$. Similarly the 90% upper tolerance limit is $35 \times 10^{1.2816 \times 0.7} = 276.2$.] Having regard to the uncertainty in estimating the standard deviation (of the logarithms) it seems appropriate to round these figures to 140 and 280 enterococci per 100 mL.
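The calculation in the bracketed note can be reproduced as follows. This is a minimal sketch using the lognormal assumption, the median of 35 per 100 mL and the standard deviation of 0.7 (in base-10 logarithms) stated above; the function and variable names are illustrative.

```python
from statistics import NormalDist

MEDIAN = 35.0    # enterococci per 100 mL (previous median limit)
SD_LOG10 = 0.7   # standard deviation of log10 concentrations

def upper_tolerance_limit(median, sd_log10, probability):
    """One-sided upper limit of a lognormal distribution with the given median."""
    z = NormalDist().inv_cdf(probability)  # unit normal abscissa
    return median * 10 ** (z * sd_log10)

alert = upper_tolerance_limit(MEDIAN, SD_LOG10, 0.80)
action = upper_tolerance_limit(MEDIAN, SD_LOG10, 0.90)
print(round(alert, 1), round(action, 1))  # 135.9 276.2
```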

It is important to note that while this calculation is based on the assumption that waters are marginal for compliance with the median value, this does not mean that the alert and action limits will only be exceeded once the health risk rises above 19 per 1000 swimming events. In fact the alert and action limits begin to be exceeded when the true median concentration is some way below 35 per 100 mL, i.e. when the risk is some way below 19 per 1000 events. The unfortunate fact is that we can only associate particular risks with average or median values of enterococci; it is impossible to associate risks with any particular enterococci value. All that can be said is that if the alert and action levels are not exceeded then the illness risks are some way below 19 per 1000 recreational events.
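The point that single samples can exceed the limits even when the true median is well below 35 per 100 mL can be illustrated with the same lognormal assumption (standard deviation of 0.7 in base-10 logarithms). The trial medians of 10 and 20 per 100 mL are arbitrary values chosen for illustration.

```python
import math
from statistics import NormalDist

def exceedance_probability(true_median, limit, sd_log10=0.7):
    """Probability that a single sample exceeds `limit` for a lognormal distribution."""
    z = (math.log10(limit) - math.log10(true_median)) / sd_log10
    return 1.0 - NormalDist().cdf(z)

# Chance of a single sample exceeding the rounded alert (140) and action (280) limits
for median in (10, 20, 35):
    p_alert = exceedance_probability(median, 140)
    p_action = exceedance_probability(median, 280)
    print(median, round(p_alert, 3), round(p_action, 3))
```

The exceedance probabilities fall as the true median falls, but they do not reach zero, which is why an occasional exceedance cannot by itself be tied to a particular illness risk.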

Changes from previous guidelines

The previous guidelines were based on Cabelli’s uncontrolled study, as implemented by the USEPA. That is, a median enterococci concentration of 35 per 100 mL, corresponding to a swimming-associated illness risk of 19 per 1000 swimmers. Also, the alert and action limits have been rounded from 136 and 276 enterococci per 100 mL to 140 and 280 enterococci per 100 mL.

Background to the changes

It is of interest to note that the previous guidelines were not based on an explicit adoption of an ‘acceptable illness risk’ of 19 per 1000 bathers. What in fact happened is that the USEPA first proposed that criteria be based on a maximum acceptable illness risk of six per 1000 bathers (in the Federal Register 1984), corresponding to a geometric mean (or median) concentration of three enterococci per 100 mL. Submissions on the proposal noted this limit was so low as to be impractical (i.e. unattainable) in near-shore coastal environments. The counter-argument was made that the previous limit (geometric mean 200 faecal coliforms per 100 mL) had appeared to work satisfactorily, so why not use corresponding limits of enterococci? This argument was accepted by the USEPA, which used ratios of faecal coliforms versus enterococci to establish correspondences between faecal coliforms and the new preferred enterococci indicator (Favero 1985; USEPA 1986a), i.e. 200 faecal coliforms per 100 mL corresponded to 35 enterococci per 100 mL in coastal waters. Using Cabelli’s relationship, this corresponds to an illness risk of 19 per 1000 bathing events.
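The correspondence between enterococci concentrations and illness risks quoted above can be checked numerically. The regression coefficients below are not given in this appendix; they are the values commonly quoted for the marine relationship in the USEPA 1986 criteria and should be treated as an assumption to be checked against that source.

```python
import math

def cabelli_marine_risk_per_1000(enterococci_per_100ml):
    """Swimming-associated illness risk per 1000 bathers (assumed USEPA 1986 marine regression)."""
    return 0.20 + 12.17 * math.log10(enterococci_per_100ml)

print(round(cabelli_marine_risk_per_1000(3)))   # 6
print(round(cabelli_marine_risk_per_1000(35)))  # 19
```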

That is, the ‘acceptable’ illness risks were not chosen a priori [It would be very odd if they had been, since 19 is not a number that the public might generally adopt as being ‘acceptable’.] but were calculated, once it was decided that risks corresponding to the previous criteria (200 faecal coliforms per 100 mL, as a geometric mean) should be adopted.

In contrast, these new guidelines are based on an explicit choice of acceptable risks.

Freshwaters

Beach grading

For consistency with the marine grading system, the same basic structure is used for the Microbiological Assessment Category, i.e. a four-category scale. However, the risk cut-offs have been set at lower values: 0.1%, 1% and 5%. There are two reasons for adopting these lower values. First, they are based on Campylobacter infection only: we simply lack credible information to develop risk figures for other illnesses. However, as this particular infection is important in the New Zealand setting these risks have been taken as a suitably precautionary approach. Second, the upper level (5%) represents a doubling of the background infection rate and this has been viewed as a tolerable upper limit.

These risks have been calculated in the New Zealand QRA study (McBride, Till, Ryan et al 2002) for Campylobacter infection, results for which are given in Table H2.

The derivation of that table’s figures relied on the moderate correlations found in the New Zealand study between Campylobacter and E. coli concentrations. The values of calculated Campylobacter concentrations corresponding to the risk cut-offs were obtained. These corresponded to Campylobacter percentiles of 55%, 70% and 80–85%. [Using the results for all beaches and all times in Table A3.7.3 of McBride et al (2002).] The corresponding percentiles of E. coli were then read from its distribution, being 131, 261 and about 550 E. coli per 100 mL. These values were then rounded to the values in Table H2 (viz, 130, 260, 550 E. coli respectively).
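The percentile-matching step described above can be sketched as follows. The E. coli values are synthetic and purely illustrative (they are not data from McBride et al 2002), and 82.5 is simply taken as the midpoint of the 80–85% range; the sketch shows the mechanics only and will not reproduce the published cut-offs.

```python
def percentile(values, pct):
    """Empirical percentile with linear interpolation between order statistics."""
    ordered = sorted(values)
    if not ordered:
        raise ValueError("no data")
    rank = (pct / 100.0) * (len(ordered) - 1)
    lower = int(rank)
    upper = min(lower + 1, len(ordered) - 1)
    frac = rank - lower
    return ordered[lower] + frac * (ordered[upper] - ordered[lower])

# Synthetic E. coli results (per 100 mL), for illustration only
ecoli = [12, 25, 40, 58, 75, 96, 110, 135, 160, 210, 270, 340, 480, 620, 900, 1500]

# Campylobacter percentiles at which the three risk cut-offs were reached
campylobacter_percentiles = {"0.1% risk": 55, "1% risk": 70, "5% risk": 82.5}

# Read the E. coli value at the same percentile of its own distribution
cutoffs = {label: percentile(ecoli, p) for label, p in campylobacter_percentiles.items()}
for label, value in cutoffs.items():
    print(label, round(value))
```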

Beach surveillance

In the absence of better information, the alert and action levels have been taken as the second and third E. coli cut-offs for beach grading.
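The way a single surveillance sample might be compared with the alert and action levels can be sketched as follows, using the rounded marine values (140 and 280 enterococci per 100 mL) and the freshwater values (260 and 550 E. coli per 100 mL) given in this appendix. Treating a result as an exceedance only when it is strictly greater than the level, and the wording of the returned labels, are assumptions made for illustration.

```python
# Alert and action levels per 100 mL, as given in this appendix
LIMITS = {
    "marine": {"indicator": "enterococci", "alert": 140, "action": 280},
    "freshwater": {"indicator": "E. coli", "alert": 260, "action": 550},
}

def surveillance_status(water_type, count_per_100ml):
    """Classify a single sample against the alert and action levels."""
    limits = LIMITS[water_type]
    if count_per_100ml > limits["action"]:
        return "action (red)"
    if count_per_100ml > limits["alert"]:
        return "alert (amber)"
    return "below alert level"

print(surveillance_status("marine", 150))      # alert (amber)
print(surveillance_status("freshwater", 600))  # action (red)
```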

Changes from previous guidelines

The previous guidelines were based on the Dufour (1984) study, as implemented by the USEPA. That is, a median E. coli concentration of 126 per 100 mL, corresponding to a swimming-associated illness risk of 8 per 1000 bathers. Also, the alert and action limits have been changed from 273 and 410 E. coli per 100 mL to 260 and 550 E. coli per 100 mL.

Background to the changes

Once again, the previous guidelines were not based on an explicit adoption of an ‘acceptable illness risk’ of 8 per 1000 bathers. The reasoning was entirely similar to that given above for marine waters, in which 200 faecal coliforms per 100 mL was found to be equivalent to 126 E. coli per 100 mL. That is, the ‘acceptable’ illness risks were not chosen a priori but were calculated, once it was decided that risks corresponding to the previous criteria (200 faecal coliforms per 100 mL, as a geometric mean) should be adopted.

In contrast, these new guidelines are based on an explicit choice of acceptable risks.

Implementation issues

Change in sampling depth

The provisional guidelines (Department of Health 1992) required sampling at adult chest depth, the reason being that this was the sampling depth used by the two studies on which the guidelines were based (Cabelli 1983a; Dufour 1984). This has now been changed to sampling at 0.5 m. This was partly driven by a concern expressed by some New Zealand sampling teams about the safety of sampling in New Zealand’s high-energy coastal waters. But it was also done in the light of other studies. Controlled-cohort studies in the UK have shown very clearly that swimming-associated illness risks are related to water quality measured at and only at the swimming location (Fleisher, Jones, Kay, Stanwell-Smith et al 1993; Kay et al 1994).

The 1998 guidelines advocated sampling at 0.5 m. This took account of children, who use shallower water and may be more susceptible to illness than adults, and of the fact that adults are also exposed to shallower water. The technical justification for the change in sampling depth is as follows.

In the New Zealand study, bathers were exposed to a variety of depths. Water samples were taken at both chest and knee depths. Using a median of 35/100 mL and taking into consideration both gastrointestinal and respiratory illnesses (not considered in studies on which the 1992 provisional guidelines were based), the maximum level of risk to swimmers at a depth of 0.5 m (previously applied to chest depth) remained at 19/1000. Therefore the risk level of 19/1000 relates to the number (35/100 mL) of indicator bacteria measured at any given depth.

However, concentrations of indicator bacteria at 0.5 m are nearly always found to be higher than at chest depth (and considerably higher than in water beyond breaking waves). In fact the New Zealand study found that, on average, the values found at chest depth were about 50% lower than those found at 0.5 m.

Single category for bathing areas

The provisional guidelines followed the USEPA (1986b) in using four separate categories for maximum limits for indicator bacteria. These four categories correspond to four levels of beach usage (infrequent use, light use, moderate use and designated bathing beaches). This was based on the notion of minimising community risk rather than of minimising individual risk, a factor often criticised (Fleisher 1991). The adjustment of the limits to one category of use is based on the principle that the level of risk at a beach is independent of its popularity.

Difference between faecal streptococci and enterococci

The faecal streptococcus group consists of a number of species of the genus Streptococcus, such as S. faecalis, S. faecium, S. avium, S. bovis, S. equinus and S. gallinarum. They have all been isolated from the faeces of warm-blooded animals, and S. avium and S. gallinarum occur in poultry. S. bovis and S. equinus are residents of the bovine and equine intestinal tracts and, although detectable in their faeces, do not survive well outside the animal host and die off rapidly once exposed to aquatic environments. The faecal streptococci have been used in many of the European, UK and Australian studies of water pollution.

The enterococcus group is a subgroup of the faecal streptococci that includes S. faecalis, S. faecium, S. gallinarum and S. avium. Procedures for isolating and identifying the enterococcus group from aquatic environments have been well validated, and the group is a valuable indicator of the extent of faecal contamination of recreational marine waters. Studies at marine and freshwater bathing beaches have indicated that swimming-associated gastroenteritis and respiratory illness can be related directly to the quality of the bathing water, and that enterococci are the most efficient bacterial indicator for marine water quality (Cabelli, Dufour et al 1983; Dufour 1984; McBride, Salmond et al 1998). The enterococci have been used predominantly in American studies of water pollution, but also in Europe. Studies in New Zealand using enterococci as the indicator of choice form the basis of the current recreational water-quality guidelines for the marine environment.

Urban and rural run-off

Rural and urban run-off can contain both human and animal faeces. The catchment type will reflect the likely proportions of each. Animals can carry pathogens that may be passed on to humans (zoonoses) such as Giardia, Salmonella, Campylobacter, Cryptosporidium (Donnison and Ross 1999) and verotoxic E. coli (E. coli O157). In times of high rainfall the pathogens present in animal faeces can be transferred into waterways via stormwater drains or overland flow.

The results of some studies (Calderon et al 1991, as interpreted by McBride 1993; McBride, Salmond et al 1998) indicate that illness risks posed by animal versus human faecal material should be considered to be equivalent, although the first Hong Kong study results are less clear (Cheung et al 1990).

For urban catchments, at least, indicators at beaches could be elevated by inflow (wrong connections) and infiltration of stormwater into the sewer system, leading to sewer overflows which then contaminate stormwater conduits (streams, channels, direct pipes, etc). These contamination events are probably dependent on rain intensity and may be able to be quantified. Also, animal faecal matter, soil and vegetative indicator inputs and other undefined sources can contaminate stormwater itself (without sewer overflow). Directly leaking sewers (termed ‘exfiltration’) can occur in any weather, contributing significantly to infiltration when it does rain or when groundwater levels rise.

Stormwater from unreticulated (but not necessarily ‘rural’) areas may contain faecal indicators from direct surface run-off containing animal or bird faecal matter and vegetative inputs, direct stock access to waterways, direct stormwater delivery to the coastal marine area, and indirectly via streams, all of which are believed to constitute a risk to human health.

Only one study (Haile et al 1996) has been conducted in waters impacted by direct urban run-off (storm drains). The rates of illnesses presented were similar to those of other studies conducted in waters contaminated with domestic sewage. However, Ferguson et al (1996) found that increased levels of faecal coliforms, faecal streptococci, Clostridium perfringens spores, Giardia and Cryptosporidium occurred in an urban estuary after rainfall. Gibson et al (1998) found that combined sewer overflows contributed increased Cryptosporidium and Giardia in both dry and (particularly) wet weather. Data from a study by Grohmann et al (1993) suggests that stormwater was a source of virus contamination in river and coastal water systems. It is therefore very likely that increased indicator levels identified at a beach following rainfall are indicating increased pathogen levels, and increased risk.

During the epidemiological study in New Zealand (see below), insufficient questionnaires were filled out immediately following rainfall events because people were not swimming at those times. However, the microbiological analyses were still conducted and the data shows that indicator levels go up at times of rainfall as a result of subsequent run-off.

The original claim from Cabelli was that rural point sources probably do not pose as great a risk as sources of human wastes. The advent of findings of Giardia, Cryptosporidium and Campylobacter, and the reinterpretation of the Connecticut rural swimming-pond study (McBride 1993), confounds this. Furthermore, the 1995 New Zealand marine beaches study (McBride, Salmond et al 1998) showed no statistically significant difference between beaches with urban versus rural impacts; indeed, the illness risks actually measured for those two impacts were very similar.

Illness:indicator relationships

Many more studies have been conducted for marine waters than for freshwaters. For marine waters the indicator of choice has usually been faecal streptococci (especially in the UK, Europe and Australia) and its subset, enterococci (especially in the US and New Zealand). For freshwaters, E. coli is usually the indicator of choice.

The essential idea behind the use of bacterial faecal indicators is that they may best represent overall pathogenicity of the water, even though they may be well correlated to some pathogens but poorly correlated to others (e.g. protozoa and viruses). This lack of correlation has been shown in many studies (e.g. Elliott and Colwell 1985; Grabow et al 1989; Ashbolt et al 1993; Ferguson et al 1996). It must always be remembered that only a portion of the pathogens are ever measured, and some (e.g. Norwalk-like viruses) cannot be enumerated routinely.

Both the controlled and uncontrolled cohort approaches have identified significant (in the statistical and social senses of that word) increasing relationships between the risk of illness to swimmers in water containing faecal residues and the concentration of one or more bacterial faecal indicators. While many of the illness-causing organisms are not bacterial (e.g. viruses and protozoan cysts) and may not be well correlated to the bacterial indicator(s), the general form of this relationship is found among many studies (as reviewed recently by Prüss 1998).

Symptoms are reported to increase with increased exposure to water and to aerosols (spray from breaking waves). Studies have increasingly found significant relationships between respiratory illness risk and a bacterial indicator (Balarajan et al 1989; Cheung et al 1990; Fewtrell et al 1992; Corbett et al 1993; Fleisher, Kay, Salmon et al 1996; McBride, Salmond et al 1998), as well as gastrointestinal illness. Skin rash, and eye and ear complaints seem to be related more to the presence of other bathers (bather-to-bather transmission occurs) than to degree of faecal contamination.

Children may be more susceptible to illness than adults, although they do tend to swim in waters that are shallower and are hence more polluted.

Interpreting human health risk

Investigations on the possible relationship between water contamination and human health risk (e.g. of swimmers) have to account for various uncertainties. These arise because we only have a small number of samples from which to characterise the degree of contamination of the water the swimmers use, and we have health status information from only a tiny part of the whole population of swimmers. [Health information is also obtained from some non-swimmers, who are used as controls; it is the difference in illness risk between swimmers and non-swimmers that we are interested in.] If we took the measured relationship to be the true relationship, these uncertainties would be ignored and our relationship would have a high chance of being incorrect. Statistical methods allow us to account for these uncertainties.

It is traditional to use ‘null hypothesis tests’ to account for uncertainties. These test the hypothesis that there is no association between the indicator concentration and the swimmers’ risk of illness. [That is, one posits a ‘null’ hypothesis – that the correlation is exactly zero.] If the association (measured by a correlation coefficient or by a regression coefficient) indicated by the data is in some sense strong enough [The computed p-value would be less than half of the a priori significance level (usually denoted by α). The p-value is the probability of getting a correlation at least as extreme as has been obtained, if the null hypothesis were true.], one concludes that the null hypothesis should be rejected and that a ‘statistically significant’ result has been found. In this case, the true association is estimated by that found in the sample data. The level of uncertainty in this estimate is indicated by an interval (usually the 95% confidence interval) within which we might usually expect the underlying true value to lie, 95% of the time.
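The test and interval described above can be sketched in a few lines of Python. This is an illustration only, not the analysis used in any of the cited studies: the data are invented, the function name correlation_test is hypothetical, and the p-value and confidence interval rely on Fisher's z-transformation, which is a large-sample approximation. The p-value computed here is two-sided and would conventionally be compared directly with the chosen significance level.

```python
import math
from statistics import NormalDist


def correlation_test(x, y, alpha=0.05):
    """Return Pearson r, an approximate two-sided p-value for the null
    hypothesis of zero correlation, and a (1 - alpha) confidence interval.
    Uses Fisher's z-transformation (a large-sample approximation)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    r = sxy / math.sqrt(sxx * syy)

    z = math.atanh(r)                # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)      # approximate standard error of z
    norm = NormalDist()
    p_value = 2.0 * (1.0 - norm.cdf(abs(z) / se))

    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    lower = math.tanh(z - z_crit * se)
    upper = math.tanh(z + z_crit * se)
    return r, p_value, (lower, upper)


# Invented data: log10 indicator concentration and illness rate per 1000 swimmers.
log_indicator = [0.5, 1.0, 1.2, 1.5, 1.8, 2.0, 2.3, 2.6]
illness_per_1000 = [4, 9, 7, 12, 15, 14, 21, 24]

r, p_value, ci = correlation_test(log_indicator, illness_per_1000)
print(f"r = {r:.3f}, p = {p_value:.2g}, 95% CI = ({ci[0]:.3f}, {ci[1]:.3f})")
```

With these invented data the interval excludes zero, so the null hypothesis of no association would be rejected. The width of the interval indicates how uncertain the estimated association remains.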

On the other hand, failing to reject the hypothesis simply means that the observed data is not inconsistent with the null hypothesis. It does not mean that we can regard the null hypothesis as being true, i.e. that there is no association at all. There may well be an association, but our sample data is held to be insufficient to reliably infer either this or its magnitude. This can be either because there is too much variability in the data, or that insufficient data was collected given its degree of variability. Increasing the size of the sample data set may identify an association – unless the variability continues.

A cautionary note: the form of words used by some authors to interpret a negative result for a null hypothesis test can be ambiguous (McBride, Loftis et al 1993). For example, in reporting on a study of bathers’ illness risks in a freshwater pond contaminated by animal faecal material (Calderon et al 1991), a null hypothesis test (only just) failed to reach ‘statistical significance’. This led the authors to state that there was “no association” between swimmers’ illness risk and animal faecal contamination.

This conclusion had quite dramatic consequences because it was thought to support the idea that bathers’ illness risks from exposure to animal faecal residues are much lower than for exposure to human faecal residues, and perhaps did not even exist (as is implied by the phrase “no association”). But if more swimmers had been included it is entirely plausible that the test result would have been statistically significant (McBride, Loftis et al 1993). [Technically, the ‘power of the test’ increases with the amount of data used, making it more likely that a null hypothesis will be rejected.]
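The dependence of power on sample size can be illustrated with a short sketch. It assumes a hypothetical true correlation of 0.2 and uses the Fisher z approximation for a two-sided test of zero correlation. The function name approx_power and all the numbers are illustrative and are not taken from the Calderon et al (1991) study.

```python
import math
from statistics import NormalDist


def approx_power(rho, n, alpha=0.05):
    """Approximate power of a two-sided test of zero correlation when the
    true correlation is rho and n observations are available
    (Fisher z approximation)."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    shift = abs(math.atanh(rho)) * math.sqrt(n - 3)
    # Probability that the test statistic falls in either rejection region.
    return norm.cdf(shift - z_crit) + norm.cdf(-shift - z_crit)


# Hypothetical true correlation and a range of sample sizes.
sample_sizes = [50, 100, 200, 400]
powers = [approx_power(0.2, n) for n in sample_sizes]

for n, power in zip(sample_sizes, powers):
    print(f"n = {n:3d}: power = {power:.2f}")
```

Under these assumptions the same underlying association is more likely than not to be missed at the smallest sample size, and is detected with high probability at the largest.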

In fact the measured relationship was quite large, but the data’s variability and limited size meant that the associated null hypothesis test failed to attain statistical significance. In other words, failure to attain statistical significance does not necessarily imply that the relationship tested lacks practical significance. The actual results found can be used, with results from other studies, in some kind of meta-analysis. [Meta-analysis refers to pooling data from a number of studies to reanalyse the relationship.]
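One common way of pooling results is a fixed-effect, inverse-variance weighted average of the study estimates. The sketch below uses that technique for illustration only. The source does not say which meta-analysis method is intended, the function name pool_fixed_effect is hypothetical, and the three log relative risks and standard errors are invented.

```python
import math


def pool_fixed_effect(estimates, std_errors):
    """Fixed-effect (inverse-variance) pooled estimate and its standard error."""
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Invented results from three studies: log relative risk and standard error.
log_rr = [0.40, 0.35, 0.45]
se = [0.25, 0.22, 0.30]

pooled, pooled_se = pool_fixed_effect(log_rr, se)

# z statistics: each study alone, and the pooled result.
z_individual = [e / s for e, s in zip(log_rr, se)]
z_pooled = pooled / pooled_se

print("individual z:", [round(z, 2) for z in z_individual])
print(f"pooled log RR = {pooled:.3f}, SE = {pooled_se:.3f}, z = {z_pooled:.2f}")
```

In this invented example no single study reaches the conventional 1.96 threshold, yet the pooled estimate does. This illustrates how results that individually fail to attain statistical significance can still contribute to a clear overall finding.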
