MnModel Survey Bias

Map of negative Phase 3 survey points and archaeological sites

MnModel is the first archaeological predictive model to make survey bias explicit in the final results so that the model's value for any given place can be assessed.

As we evaluated the Phase 3 Site Probability Models, it became apparent that they predicted surveyed places almost as well as they predicted site locations. This implied a high degree of survey bias and reduced our confidence in the interpretation of the predictive models. It raised the question of whether areas were categorized as low probability because no sites were there or because no surveys had been conducted there. This led to the development of the Phase 3 Survey Probability Model, which might be thought of as a model of survey bias. This model has no precedent.

Reasons for Survey Bias

Archaeological field survey is labor intensive, time-consuming, and, ultimately, expensive. Naturally, archaeologists are more interested in finding archaeological features and artifacts than in spending many hours surveying and finding nothing at all. Consequently, they tend to focus their survey efforts on places where they expect sites to be, usually places near water. Even so-called 'probabilistic' surveys are usually stratified so that more locations are surveyed near water than away from water. Although the goal of the 1995 and 1996 MnModel field surveys was to 'provide data on site location and non-site locations based on random sampling,' the archaeologists' stratified survey design gave 'rarer landforms... priority over more common ones to insure they be represented in the sample strategy.' A truly random survey, such as the 1997 MnModel field survey, would result in a sample representing each landform in proportion to its occurrence in the landscape.
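The difference between a random and a stratified design can be illustrated with simple arithmetic. The sketch below is not taken from the MnModel survey designs: the landform names, their shares of the landscape, and the equal-effort stratification are all hypothetical, chosen only to show how a stratified design can over-represent a rare landform relative to its occurrence.

```python
# Hypothetical landform shares of the landscape, in percent (not MnModel data).
landform_share_pct = {"upland plain": 60, "valley slope": 30, "river terrace": 10}


def proportional_allocation(total_points, share_pct):
    """Expected points per landform under simple random sampling:
    each landform receives points in proportion to its share of the landscape."""
    return {name: total_points * pct / 100 for name, pct in share_pct.items()}


def equal_allocation(total_points, share_pct):
    """Points per landform when every stratum gets the same effort.
    This is only one possible stratification, assumed here for illustration."""
    return {name: total_points / len(share_pct) for name in share_pct}


random_design = proportional_allocation(300, landform_share_pct)
stratified_design = equal_allocation(300, landform_share_pct)

for name in landform_share_pct:
    print(name, random_design[name], stratified_design[name])
```

Under these assumed numbers, the rare landform receives far more survey points in the stratified design than its share of the landscape would give it in a random one.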

Detecting Survey Bias

In Phases 1 and 2 of MnModel, we used 'negative survey points' as non-site locations. These were points mapped in sections that were surveyed, at least in part, but where no sites were found. When we compared the distribution of surveyed locations across the early models' high, medium, and low probability areas, it became apparent that survey distributions reflected archaeologists' notions of where sites were most likely to be found. Consequently, MnModel's predictive models are biased by the archaeologists' own intuitive models of probable site location. The map (above) shows Phase 3 negative survey points in red and archaeological sites in black. The relationship of both to water features is apparent.
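One way to make such a comparison concrete is to divide each probability class's share of negative survey points by its share of the total area. The sketch below assumes this ratio as the measure; it is an illustration, not the procedure documented for MnModel, and the point counts and areas are invented for the example.

```python
def survey_concentration(points_by_class, area_by_class):
    """Ratio of each class's share of survey points to its share of area.

    A ratio near 1 means surveys fall in that class in proportion to its area.
    A ratio well above 1 means the class was surveyed more heavily than its
    area alone would suggest.
    """
    total_points = sum(points_by_class.values())
    total_area = sum(area_by_class.values())
    ratios = {}
    for cls in points_by_class:
        point_share = points_by_class[cls] / total_points
        area_share = area_by_class[cls] / total_area
        ratios[cls] = point_share / area_share
    return ratios


# Hypothetical inputs (not MnModel data): negative survey points and
# area in square kilometers for each probability class.
negative_points = {"high": 600, "medium": 300, "low": 100}
area_km2 = {"high": 200, "medium": 300, "low": 500}

ratios = survey_concentration(negative_points, area_km2)
for cls, ratio in ratios.items():
    print(f"{cls}: {ratio:.2f}")
```

With these assumed inputs, the high probability class holds a much larger share of survey points than of area, which is the pattern that would suggest surveys were concentrated where sites were expected.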

Modeling Survey Bias

To account for this bias, survey locations were modeled using the same methods and environmental variables that were used to model site locations. The resulting Phase 3 Survey Probability Model indicates which parts of the landscape have been adequately surveyed. When we mapped many more surveys for Phase 4, we were able to develop an even more precise model of which landscapes have been adequately surveyed.

Incorporating Our Understanding of Bias into the Predictive Models

In Phases 3 and 4, the Site Probability and Survey Probability models were combined to create the Survey Implementation Model. This model qualifies the values of site probability by reference to the values from the survey probability model. In the Phase 4 Survey Implementation Model, the following categories are used:

High Site Potential/Well Surveyed. There is a high degree of confidence in these areas, since these are the kinds of environments that have been well surveyed in Minnesota and where sites have been found.
High Site Potential/Poorly Surveyed. There is less certainty about site potential in these environments because they have not been as well surveyed.
Low Site Potential/Well Surveyed. We feel confident that sites are less likely to be found here since these types of environments have been well surveyed. That does not mean, however, that sites are absent. About five to ten percent of sites are not predicted by our models, and some of these will be in these areas.
Unknown Site Potential/Poorly Surveyed. In areas classified as unknown, both survey probability and site probability are low. Archaeologists, using their expert systems models, have assumed that sites will not occur here. Yet sites are occasionally found in the 'unknown' areas. We need more surveys to help us understand the environmental factors associated with these sites.
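The four categories above amount to a two-by-two lookup on site potential and survey coverage. The following is a minimal sketch of that lookup. The function name and the boolean inputs are hypothetical, and the sketch assumes each location has already been classed as high or low site potential and as well or poorly surveyed; how those classes are derived from the underlying probability values is not described here.

```python
def survey_implementation_category(high_site_potential, well_surveyed):
    """Return the Survey Implementation Model category for one location.

    Both arguments are booleans and are assumed to come from earlier
    classification steps that this sketch does not cover.
    """
    if well_surveyed:
        if high_site_potential:
            return "High Site Potential/Well Surveyed"
        return "Low Site Potential/Well Surveyed"
    if high_site_potential:
        return "High Site Potential/Poorly Surveyed"
    # Low site probability combined with low survey probability is reported
    # as unknown rather than low, because the low value may reflect a lack
    # of surveys rather than a lack of sites.
    return "Unknown Site Potential/Poorly Surveyed"


for site_flag in (True, False):
    for survey_flag in (True, False):
        print(site_flag, survey_flag,
              survey_implementation_category(site_flag, survey_flag))
```

The key design point is the last branch: a poorly surveyed location never receives a confident "low" label.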