A recently completed archaeological predictive model (APM) of the state of Pennsylvania provided an unprecedented opportunity to explore the current status of APM methods and extend them using methods drawn from related scientific fields, medicine, and statistical computing. Through this process many different types of models were created and tested for validity, predictive performance, and adherence to archaeological theory. One result of this project is a comprehensive view of the problems that beset existing APM methodologies, solutions to some of these problems, and the nature of the challenges we will face going forward with new techniques. Most, if not all, of the findings of this project are applicable to the eastern deciduous United States, and much of the methodological scope is useful to APMs in any geography. This paper will discuss the primary lessons learned from this project with regard to archaeological data, modeling methods, and theory, and will touch on best practices for future APM efforts.
2. FHWA Statement
“The contents of the report reflect the views of the author(s) who are
responsible for the facts and accuracy of the data presented within. The
contents do not necessarily reflect the official view or policies of the
Department or FHWA at the time of publication.”
Report available at: www.penndotcrm.org
3. “Remember that all models are wrong; the practical question is how wrong do they have to be to not be useful.”
~ George E. P. Box, 1987
4. Organization of talk
• Introduction to PA Model
• Data lessons
• Methodological lessons
• Policy lessons
• Concluding observations
10. DATA Lessons Learned
• Unique characteristics of archaeological data
• Representation of archaeological data
• Archaeological site prevalence
• Covariates and correlation
• Dealing with uncertainty
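The covariate-and-correlation concern above can be screened for directly before modeling. A minimal sketch in Python (stdlib only; the covariate names and values are illustrative placeholders, not the PA model's actual variables):

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient between two covariate columns."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Hypothetical covariates sampled at surveyed cells
covariates = {
    "elevation":     [310, 295, 402, 380, 290, 415],
    "slope":         [2.1, 1.8, 6.5, 5.9, 1.6, 7.0],
    "dist_to_water": [120, 450, 80, 200, 500, 60],
}

# Flag covariate pairs whose |r| exceeds a screening threshold
names = list(covariates)
for i in range(len(names)):
    for j in range(i + 1, len(names)):
        r = pearson(covariates[names[i]], covariates[names[j]])
        if abs(r) > 0.7:
            print(f"{names[i]} vs {names[j]}: r = {r:.2f}")
```

Highly correlated covariates inflate each other's apparent importance and destabilize coefficient estimates, so flagged pairs are candidates for dropping or combining before model fitting.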
11. Characteristics of Archaeological Data
Population Generating Process:
• Highly dynamic & complex
• Non-mechanistic
• Culture and agency
• Dynamic environment
• Changing parameters
• Subjectively defined expression
• Censored through taphonomy
Sample Generating Process:
• Non-systematic
• Subjective & inconsistent
• Extensive measurement error
• Imperfect detectability
• Non-representative of population
• Spatially biased
• Oversimplification
19. Methodological Lessons Learned
• Define your objectives and assumptions
• Reproducibility
• Create a model building system
• ArcGIS is only part of the answer
• Understand your algorithms
• Test and validate all results
22. Model Building System
● Variable creation and analysis
● Tune model hyperparameters
● Algorithm selection
● Test error with Cross-Validation
● Assess performance
● Model selection
● Mosaic and aggregate
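The "test error with cross-validation" step in the system above can be sketched in Python. This is a stdlib-only illustration: the threshold "algorithm", covariate values, and fold count are placeholder assumptions, not the PA model's actual method.

```python
import random

def k_fold_indices(n, k, seed=42):
    """Shuffle row indices and split them into k roughly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(fit, predict, X, y, k=5):
    """Mean held-out accuracy over k folds (the CV test-error step)."""
    scores = []
    for test_idx in k_fold_indices(len(y), k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        preds = [predict(model, X[i]) for i in test_idx]
        hits = sum(p == y[i] for p, i in zip(preds, test_idx))
        scores.append(hits / len(test_idx))
    return sum(scores) / k

# Stand-in "algorithm": predict site-present when a favorability score
# exceeds a trained threshold (a placeholder, not the PA model's method).
def fit_threshold(X, y):
    present = [x[0] for x, lab in zip(X, y) if lab == 1]
    absent = [x[0] for x, lab in zip(X, y) if lab == 0]
    return (sum(present) / len(present) + sum(absent) / len(absent)) / 2

def predict_threshold(thresh, x):
    return 1 if x[0] > thresh else 0

# Toy data: [favorability score] per cell, 1 = site present (hypothetical)
X = [[8], [9], [7], [8.5], [2], [1], [2.5], [1.5], [9.5], [0.5]]
y = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]

print("CV accuracy:", cross_validate(fit_threshold, predict_threshold, X, y))
```

The same loop generalizes to the other steps in the list: wrap it in an outer loop over candidate algorithms and hyperparameter settings, and select the model whose held-out error is lowest.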
26. Policy Lessons Learned
• Model purpose dictates policy applications
• Implementation requires explicit assumptions
• Error rates and uncertainty must be known
• Scale of data is critical in scale of use
• Methods to visualize uncertainty
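The "error rates must be known" point can be made concrete with a validation-sample tally. A minimal sketch, assuming hypothetical validation data (the labels and predictions below are illustrative, not project results):

```python
def error_rates(y_true, y_pred):
    """Sensitivity, specificity, and false-negative rate from predictions.

    For an APM used in policy, a false negative (a real site mapped as
    low-probability) is usually the costliest error.
    """
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    return {
        "sensitivity": tp / (tp + fn),          # sites correctly flagged
        "specificity": tn / (tn + fp),          # non-sites correctly cleared
        "false_negative_rate": fn / (tp + fn),  # sites the model would miss
    }

# Hypothetical validation cells: 1 = known site, paired with model output
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
rates = error_rates(y_true, y_pred)
print(rates)
```

Reporting these rates alongside the model surface lets policy users weigh the cost of missed sites against the cost of surveying low-probability areas.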
27. How it all works...
PURPOSE → ASSUMPTIONS → METHODS → ALGORITHMS / MODELS → INTERPRETATION → POLICY
32. Purpose
Assess all aspects of a model relative to its purpose
Policy and implementation are based on model purpose
33. Not all doom and gloom!
• Face modeling issues head-on
• Model for our unique data
• Standardize our approaches
• Formalize our theory
• Compare our results