Data Distributions in Regulatory Science

Regulatory compliance data in the human services are generally very skewed: the majority of facilities being assessed or inspected fall very close to the 100% compliance level. An equally large number of facilities are in substantial regulatory compliance (98% to 99% compliance levels). Far fewer facilities are at a mid or low level of regulatory compliance (97% or lower). A score of 97% may not sound mediocre or low, but keep in mind that we are addressing basic health and safety rules, not quality standards. Having several health and safety rules out of compliance is a big deal when it comes to risk assessment. It could be argued that a state licensing agency is not upholding its gatekeeper function if it allows programs to operate with such regulatory non-compliance.
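The three compliance levels described above can be sketched in a few lines of Python. This is only an illustration: the scores are made-up values, not real licensing data, and the names compliance_tier and skewness are hypothetical helpers. The cutoffs (100%, 98% to 99%, 97% or lower) come from the paragraph above.

```python
from collections import Counter
from statistics import fmean, pstdev


def compliance_tier(score_pct):
    """Map a compliance percentage to one of the three levels in the text."""
    if score_pct >= 100:
        return "full"
    if score_pct >= 98:
        return "substantial"
    return "mid_or_low"


def skewness(values):
    """Population skewness: the mean of the cubed standardized deviations."""
    mean = fmean(values)
    spread = pstdev(values)
    return fmean([((v - mean) / spread) ** 3 for v in values])


# Hypothetical compliance scores, invented for illustration only.
scores = [100, 100, 100, 100, 99, 99, 98, 98, 97, 95, 100, 99]

tier_counts = Counter(compliance_tier(s) for s in scores)
score_skew = skewness(scores)

print(dict(tier_counts))
print("skewness is negative:", score_skew < 0)
```

A negative skewness value means the scores pile up at the top of the scale with a thin tail of lower scores, which is the shape the paragraph describes.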

Why is the regulatory compliance data distribution important from a statistical point of view? Generally, when we are dealing with social science data, the data are normally distributed or pretty close to it. That is a hallmark of a well-designed assessment tool, for example. So when such data are compared to other normally distributed data, there is a good chance that some form of linear relationship will be found. In many cases it will not reach statistical significance, but it will be linear regardless.

When one of the variables has a very skewed distribution, as regulatory compliance data do, and it is compared with a normally distributed data set such as scores from a program quality tool (ERS or CLASS), the result is generally a non-linear relationship with a marked ceiling or plateau effect. In other words, the relationship is more curvilinear than linear. From a practical standpoint, this creates a selection problem: it becomes impossible to identify the best programs among those that have full regulatory compliance. This can create a public policy nightmare, in that programs in substantial but not full regulatory compliance are as good as, or in some cases of higher quality than, programs in full regulatory compliance. The interesting question is whether combining a normally distributed variable with a skewed one always produces this non-linear result.
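One way to look for the plateau described above is to compare average quality scores across the compliance levels. The sketch below uses invented (compliance, quality) pairs on a hypothetical 1 to 7 quality scale; it is not real ERS or CLASS data, and the helper names are assumptions made for illustration.

```python
from statistics import fmean


def compliance_tier(score_pct):
    """Map a compliance percentage to full, substantial, or mid/low."""
    if score_pct >= 100:
        return "full"
    if score_pct >= 98:
        return "substantial"
    return "mid_or_low"


def mean_quality_by_tier(pairs):
    """Average the quality scores of the programs in each compliance tier."""
    grouped = {}
    for compliance_pct, quality in pairs:
        grouped.setdefault(compliance_tier(compliance_pct), []).append(quality)
    return {tier: fmean(values) for tier, values in grouped.items()}


# Hypothetical (compliance %, quality score) pairs, invented for illustration.
programs = [
    (100, 4.5), (100, 4.0), (100, 5.0),
    (99, 4.5), (98, 5.0), (99, 4.6),
    (97, 3.5), (95, 3.0), (90, 2.5),
]

tier_means = mean_quality_by_tier(programs)
for tier in ("mid_or_low", "substantial", "full"):
    print(tier, round(tier_means[tier], 2))
```

In this made-up data set, quality rises from the mid/low level to the substantial level and then flattens, so the full compliance group does not score higher than the substantial group. With real data, the same comparison would show whether a plateau is actually present.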

And lastly, will two variables that both have skewed distributions produce a more random result than either of the two conditions above?

About Dr. Fiene

Dr. Rick Fiene has spent his professional career improving the quality of child care in various states, nationally, and internationally. He has researched and published extensively on the key components of improving child care quality through an early childhood program quality indicator model of training, technical assistance, quality rating & improvement systems, professional development, mentoring, licensing, risk assessment, differential program monitoring, and accreditation. Dr. Fiene is a retired professor of human development & psychology (Penn State University), where he was department head and director of the Capital Area Early Childhood Research and Training Institute.