Enter the Data Quality Campaign (DQC), whose goal is "to ensure that every citizen is prepared for the knowledge economy." In their most recent document, Pivotal Role of Policymakers as Leaders of P–20/Workforce Data Governance, the DQC wrote, "Achieving this goal requires unprecedented alignment of policies and practices across the early childhood; elementary, secondary, and postsecondary education; and workforce sectors (P–20W). Consequently, many policy questions require data from multiple agencies to answer."
See, they need data from all these agencies in order to answer policy questions about education. But they have a problem. Though states have independent databases that track the information policymakers claim they need (we'll get back to that in a minute), agencies run into "challenges" accessing this information due to turf, time, technical issues, and trust.
Challenge 1 Turf - Data is power and money. One does not casually hand that over to another agency just because the other agency has claimed a need for it. Those who currently manage the data "silos" need assurance that they will not lose control or have another entity assigned oversight of what they do. This is a reasonable concern, since education data collection, which started in the states, has had rules and restrictions placed on it by the states that cannot and should not be violated. DQC's response is to "define clear and distinct roles and responsibilities aligned to commonly established goals. This creates and fosters a culture of shared responsibility..."
Challenge 2 Time - There are only so many hours in a day, and only so much money to pay people to manage all this data. And since all that money comes from taxpayers, regardless of whether it goes to a government employee or a government-contracted company, there need to be assurances in place that the time and money are well spent on data management.
Challenge 3 Technical Issues - Each agency defines its own data standards, protocols, and procedures for data use, making sharing data difficult and inefficient. Here is where DQC can really shine, because their goal is to make all these databases talk to each other so sharing data across them is - they use the word efficient, but let's call it - easy. These inefficiencies and mismatches may be the last thing protecting your privacy, and DQC is working like bunnies to strip that away.
Challenge 4 Trust - "Agencies are concerned about how their data might be used once the data are linked, matched, and shared." How about parents? Mightn't they be concerned about how this data will be used once matched and shared? Throughout this entire document, the people who really "own" this data, the children and those who speak for them, their parents, are never mentioned.
Maybe I came too late to the discussion. When was it discussed that the government had a right to collect and use personal data on every single American? That seems to already have been agreed upon by unelected bureaucrats who don't answer to parents. Here are the Board members of DQC.
Tom Luce (Chair), Chairman, National Math and Science Initiative
John Bailey, Director, Dutko Worldwide
Tammi Chun, Policy Analyst, Office of the Governor, State of Hawaii
Kathy Cox, CEO, U.S. Education Delivery Institute
Kati Haycock, President, The Education Trust
Bruce Hoyt, Former Board Member, Denver Public Schools Board of Education
Sharon Robinson, President and CEO, American Association of Colleges for Teacher Education
Bob Swiggum, Chief Information Officer, Georgia Department of Education
Gene Wilhoit, Executive Director, Council of Chief State School Officers
Their process looks like this:
- Link Systems to allow for efficient matching of data that have been deemed necessary for specified purposes.
- Match Data to create datasets with connected records on the same individuals from two or more databases.
- Share information to provide participating agencies and institutions knowledge that was unavailable prior to the data matching.
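To make the "Match Data" step concrete, here is a minimal sketch of how records on the same individual can be connected across two databases. It is only an illustration under stated assumptions, not DQC's method: the records, the field names (`student_id`, `case_id`, and so on), and the helper functions `match_key` and `match_records` are all hypothetical and invented for this example.

```python
# Hypothetical records from two separate agency databases (all data invented).
education_records = [
    {"student_id": "S001", "name": "Ana Diaz", "birth_date": "2005-03-14", "grade": 11},
    {"student_id": "S002", "name": "Ben Okoro", "birth_date": "2006-07-02", "grade": 10},
]
workforce_records = [
    {"case_id": "W17", "name": "ana  diaz", "birth_date": "2005-03-14", "employed": False},
    {"case_id": "W18", "name": "Cy Tran", "birth_date": "2004-11-30", "employed": True},
]


def match_key(record):
    """Build a linking key from fields both databases happen to share."""
    # Lowercase the name and collapse repeated whitespace so that small
    # formatting differences between agencies do not block a match.
    normalized_name = " ".join(record["name"].lower().split())
    return (normalized_name, record["birth_date"])


def match_records(left, right):
    """Return merged records for individuals who appear in both datasets."""
    index = {match_key(r): r for r in right}
    matched = []
    for record in left:
        other = index.get(match_key(record))
        if other is not None:
            # Fields from `left` win when both sides define the same field.
            matched.append({**other, **record})
    return matched


linked = match_records(education_records, workforce_records)
for row in linked:
    print(row)
```

Notice that the two databases in this sketch share no ID number at all. A name and a birth date are enough to join a school record to an unemployment record, and the combined record says more than either database did on its own.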
The problem is more in the Field of Dreams area: if you build it, they will come. If you begin to create a completely integrated stream of personal data (which everyone always describes as lacking individually identifiable data, right?), with guidelines on how to set up new databases that can link to it and job descriptions that include making sure your data is compatible with the integrated system, you begin to create something so powerful that its governance should not be in the hands of any single individual or agency. Try preventing that from happening.
Most people only look at the privacy issues in terms of the individual databases. So what if someone knows my kid's student ID? Who cares if I'm part of the public record as someone who receives unemployment payments? But with groups like DQC working to connect all this data and develop policy on it, who knows what kinds of policies could be developed because of someone's interpretation of that data? Maybe someone decides a policy needs to be established that requires an automatic visit by Child Protective Services for every child whose parent has become unemployed, because past data showed a statistical potential for neglect when a parent loses a job.
The bigger issue is that government agencies will be self-directed by data to address problems that the public has not asked to be addressed. Our elected representatives could, in essence, be replaced by databases. Whatever efficiencies or solutions might be gained by creating such a system should be weighed heavily against the possibility of such systems being abused by someone you don't agree with. In addition, there should always be concern about such data being compromised, maybe even by entities outside the U.S. One of the key elements in the P-20 system is that it be accessible. That means, by definition, outside entities need to have a way in. There is no such thing as a completely secure system that needs broad access, and any honest IT person will confirm that. So how much data do we want to put in such a system? Has anyone asked us?