Data is all around us and connected in so many ways. In schools, or any institution, there are multiple points at which data can be entered into a system or systems so that it can be used and shared “easily”.
Proper data entry is key to making use of that data; otherwise you end up with “garbage in, garbage out”.
You can’t run a donor report by constituent type if those types aren’t entered properly. You can’t run an inquiry list by city or town if you’ve entered the full name of a town in one record and an abbreviated version in another record.
A clear set of rules and style guidelines for entering data needs to be established. Creating a document that can be used in all domains within the school would help.
I’m undertaking a project to do just that! In meetings with members of our various offices we are going to discuss the types of data they are entering into their systems and the ways in which that information is recorded.
What follows is a rough outline of some of those areas and the rules or guidelines that would apply to the data entered. What I am hoping you will do is read through the list and add your comments and additions at the end. My promise is to share with you what I have come up with so you can use it if you choose.
Data Types/Format Concerns:
- Dates
- Phone Numbers
- Abbreviations
  - Cities
    - NYC vs New York City
    - SF vs San Fran vs San Francisco
  - States
  - Counties
  - Countries
  - Schools
    - MSU vs Montclair State vs Montclair State University
  - Business Names
- Prefixes & Suffixes (& usage)
- Relationships
  - Explicit coding
    - Father or Mother vs Parent
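To make the abbreviation concern concrete, here is a minimal sketch of a lookup that maps the variants people type to one agreed spelling. The table contents, the choice of the full name as the standard, and the function name `canonical_name` are all assumptions for illustration; your own guide would decide which form wins.

```python
# Hypothetical lookup table: variants seen in data entry -> the one agreed spelling.
# Which spelling is "standard" is an assumption here; the guide should decide that.
CANONICAL_NAMES = {
    "nyc": "New York City",
    "new york city": "New York City",
    "sf": "San Francisco",
    "san fran": "San Francisco",
    "san francisco": "San Francisco",
    "msu": "Montclair State University",
    "montclair state": "Montclair State University",
    "montclair state university": "Montclair State University",
}


def canonical_name(raw: str) -> str:
    """Return the agreed spelling for a known variant, else the tidied input."""
    cleaned = " ".join(raw.split())  # trim and collapse stray whitespace
    return CANONICAL_NAMES.get(cleaned.lower(), cleaned)
```

With something like this in place, a list run by city would group “NYC” and “New York City” together, while values the table does not know are passed through unchanged for a person to review.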
Entry Rules
- Use fields as defined
- One field per piece of data
- Accuracy trumps speed
  - “Garbage in… garbage out”
- Record an individual’s data in that individual’s record
  - Do not track a spouse’s or child(ren)’s data in another spouse’s or parent’s record
- Track as much detail as possible – completeness of data
  - Employment/Education Data
    - example:
      - Company = Mountainside Hospital
      - Field = Health Care
      - Position = Physician
      - Title = Director of Pediatrics
- Does the record exist?
  - Check before entering data or creating a new record
- When there is a question about where, how, or whether it is possible to track a piece of data, ask.
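The “does the record exist?” rule can be sketched in a few lines. This is only an illustration under stated assumptions: matching on first name, last name and birth date is my own choice of key, and `person_key` and `find_or_create` are hypothetical names, not features of any particular system.

```python
def person_key(first_name, last_name, birth_date):
    """Build a comparison key that ignores letter case and stray spaces."""
    return (
        " ".join(first_name.split()).lower(),
        " ".join(last_name.split()).lower(),
        birth_date.strip(),
    )


def find_or_create(records, first_name, last_name, birth_date):
    """Return (record, created); reuse an existing record when the key matches."""
    key = person_key(first_name, last_name, birth_date)
    if key in records:
        return records[key], False  # record already exists, do not duplicate it
    record = {
        "first_name": first_name.strip(),
        "last_name": last_name.strip(),
        "birth_date": birth_date.strip(),
    }
    records[key] = record
    return record, True
```

A real system would need a more careful match (nicknames, maiden names, typos), so a check like this should flag likely duplicates for a person to confirm rather than merge anything silently.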
Procedures & Ownership
- Domain Control
  - People responsible – “ownership” of the data
- Data Collection
  - Paper
  - Phone
  - Online
- Data flow
  - in & out, forward & back
  - Inter-office & Inter-departmental
  - Managed vs manual
This is a rough draft… just the starting point for developing this guide. Please take a moment to add your thoughts, comments, questions and concerns, and I will be sure to share with you all that we’ve uncovered and learned so that you too may benefit.
I think this is a good start on a complicated issue! I’m so glad you are sharing this. On the technical side, something to spend a lot of time with is relationships and how your various databases identify relationships. You mention it is important to keep separate records for separate individuals. That is wise but how that is done from db to db can be tricky.
Another technical thing to think about is how user-defined fields can be used. It is best, as you suggest, to use the defined fields. However, user-defined fields can be powerful: if you are connecting databases, for example, they can make it easier to create relationships in another db.
OK…on the soft side, I think it is important to be careful when dealing with the people that manage the databases. Few if any of the people that touch data in schools were trained to work in databases and yet they may spend much of their day doing this. You can be certain they have developed some of their own systems that traditional logic could not predict. Your staff may be very sensitive (and maybe even a little defensive) about these issues. Provide those folks plenty of room and time to share those with you so you understand their thinking. Somewhere in the story will be a legitimate business case that you will need to consider when tying your db’s together.
Alex Inman
Sidwell Friends School
Educational Collaborators, LLC
Thanks for such a thoughtful response! To provide a little background on our situation at my school:
1. We are using similar systems throughout the school which all leverage FileMaker. We’re moving away from an older FileMaker system to a new one designed by The Proof Group.
2. Because we are on FileMaker and the database systems are open, we can define new fields for users easily and do so often (our new system has a great tagging feature too…).
3. We are moving the systems to a centralized CRM (contact relationship management system) where all people are held in a central core with their biographical, demographic and relationship data. In each of the other systems (Admissions, Academics, Advancement & Auxiliary), each person has a role, where appropriate, and key data is stored in those systems based on that domain.
4. The people and personalities are one of the biggest topics of conversation, and the people responsible for the data can provide a great deal of insight for sure. As long as their own systems for tracking do not impact function and the sharing of information across systems, they can have at it.
Again… thanks for the comment!
Hi! I’m curious to know if you ever completed this guide. I would love to read it if you’ve made it public.
I am in the process of writing another post on the topic and will be including the document with that post.
I don’t think the one field to one piece of data rule can be overstated. As I work with older databases, I find that City, State, and Zip have been lumped together and therefore sorting by state and zip is impossible. The hard part is, of course, defining “one piece of data”. Is phone number one piece of data (ten digits) or three separate fields (area code, exchange, and last four digits)?
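For anyone stuck with a lumped field like that, here is a rough sketch of pulling it apart. It assumes US-style values such as “Montclair, NJ 07042”; the pattern and the name `split_city_state_zip` are my own illustration, and anything that does not fit comes back as `None` so a person can fix it by hand.

```python
import re

# Assumes US-style values such as "Montclair, NJ 07042" or "Upper Montclair NJ 07043-1234".
CITY_STATE_ZIP = re.compile(
    r"^\s*(?P<city>.+?)[,\s]+(?P<state>[A-Za-z]{2})[,\s]+(?P<zip>\d{5}(?:-\d{4})?)\s*$"
)


def split_city_state_zip(combined: str):
    """Split a lumped 'City, ST 12345' value into three fields, or return None."""
    match = CITY_STATE_ZIP.match(combined)
    if match is None:
        return None  # leave unparseable values for manual review
    return {
        "city": match.group("city").strip(),
        "state": match.group("state").upper(),
        "zip": match.group("zip"),
    }
```

Once the three values live in their own fields, sorting by state or ZIP works again.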
Good point… I think that you can get very granular with what you track in a system and need to keep that in mind to protect people’s sanity.
Your example of a phone number illustrates that well: a single field works as long as you enter it consistently (555-555-5555 vs (555) 555-5555) or enforce the format through programmed data validation. An issue I have seen is multiple email addresses in one field rather than in their own distinct fields. This needs to be handled from a structural perspective, so long as you have the ability to add fields or relational structures.
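As a sketch of what that programmed validation could look like, the snippet below reduces any US-style entry to one format and splits a crowded email field into separate values. It assumes ten-digit US numbers only (international numbers would need their own rule), and the function names are hypothetical.

```python
import re


def normalize_us_phone(raw: str) -> str:
    """Reduce a US phone entry to the single format 555-555-5555."""
    digits = re.sub(r"\D", "", raw)  # keep digits only
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]  # drop a leading US country code
    if len(digits) != 10:
        raise ValueError(f"expected a 10-digit US number, got {raw!r}")
    return f"{digits[0:3]}-{digits[3:6]}-{digits[6:]}"


def split_emails(field: str) -> list:
    """Split one field holding several addresses into separate values."""
    parts = re.split(r"[;,\s]+", field.strip())
    return [part for part in parts if part]
```

Each address returned by `split_emails` could then go into its own field or a related table, whichever the system allows.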
Thanks again for the comment.
Here’s one resource that might be helpful:
http://www.yale.edu/ppdev/Guides/hr/StandardsDataEntry.pdf
Brilliant… thanks!
This is a great idea and a challenging one, yet it is important for improving the quality, effectiveness and accuracy of the use of databases and learning analytics in your school. I can visualize what your school will go through, as we have started the new school year with a new SIS, eliminating five older databases. It has taken quite some time to coordinate data and set new processes, and we are continually tweaking the solution, but it is working!
I have to say that even though you set standards or parameters for key data points, the key to success is putting the right team of people in place to enter the data and to manage the processes that build and maintain the system once it is live. Most people appear comfortable with changes to data structures but become less comfortable with changes in data processes. The key factor in successfully changing data models is the people, as it normally requires a change in habits and data handling, something most school personnel have little background or training in.
I also think visualizing the importance of ‘data’ is key, and understanding the significance of unique data identifiers and QA or data validation is a must in any data scheme. Let me illustrate with a couple of examples. I have seen an assistant fill in blank data fields by copying data (a phone number) to more than one person. In some db systems this may cause problems, depending upon how you structure your key fields and use the data for external purposes. In another case, a data validation assistant was comparing data sets, but both lists had been generated from the same original database. The better each person can visualize the importance of data and its impact on their role, the higher the overall probability of success and the fewer passes you need to make on specific data.
If you have multiple sysadmins, try to limit access to only one person per sector. This builds ownership but also greater responsibility, ensuring a higher degree of accuracy of the data element.
It is a challenge, but it is a good way to review your data structure and data workflow.
Vincent Jansen
Lower Canada College
I love the point about ownership as it is one I think that people can often struggle with!
Two things come to mind. One is email addresses, which I don’t think you listed above. Think about how many email addresses you’ll actually accept or ask for. Years ago, people couldn’t easily access work email from home or home email from work, but now many people can access all email from just about anywhere, including their phone. Although I lean in the direction of only collecting one email address, I would definitely not allow more than two (one work and one home) if you can’t make the case for only collecting one. Regarding collecting only one address, I say let your constituents decide what email address they want to use, as opposed to the school (or some offices in the school) collecting multiple email addresses and then someone later trying to figure out which addresses to use without annoying people by sending to too many.
In our globally mobile world, keep international phone numbers and addresses in mind. Even if you don’t allow students to enroll unless their parents live in the USA (i.e. living with a guardian in the US), you may need to track foreign addresses during the admissions process or for alumni.
Global addresses are surely a concern that schools need to keep in mind, along with multiple addresses for different types of constituencies.
Whether the address is physical or virtual, I would have to say that you need to be able to capture and track as many as you can manage. You must always be mindful of your constituents across multiple domains. The address used by Admissions may be the same one used in Academics, but different than what you want to use for Advancement. Donors may list one address for public contact, but have an entirely different one that they want to use for questions of donations and support. A family could have multiples as well for contacting each parent individually.
What you can track will be largely based on how your system(s) is designed and what flexibility you have to make changes.
Thanks for the comment!
Pingback: Social Media, audience and old friends. | edSocialMedia
Whoa… nice find, Vasil!
As someone who has been crossing out “Father” and writing in “Mother” on official forms for the last 10 years, I would love to see the parent/guardian fields be less normative than the typical “Father/Mother” option. (Then again, my favorite form ever had three gender options, “male,” “female,” and “working on it,” so I recognize that I’m waaaayy outside the mainstream on this one.) :-D
In addition to Peter’s great “one field, one piece of data” rule of thumb, I would add the “and you’ll always need more fields than you think you will” corollary. You may need more relational fields to handle things like “custodial parent,” for example, and maybe even sadder stuff like “restraining order in place.” Thinking of the special cases can actually be quite helpful.
Finally, in my experience there is often a disconnect between what the folks who initially input data (admissions?) need and what the folks who will be using the data several years later (departmental secretaries?) will need down the line. Expected year of graduation? Freshman year advisory section? Good to make sure that anyone who is running reports off a database now is consulted at some point.
Hope that helps!
Looking at all the points of data that are needed across the whole school (or institution) is extremely important, along with the point at which that data is collected. Recently we had someone ask for a report on data that we weren’t even tracking, and the forms that we were using to gather that “type” of information weren’t even asking the right questions.
Part of what we are going to use this document for is to lead discussions around these topics and issues.
Having spent 10 months cleaning up our SIS this topic is near and dear to my heart! We decided to go with the USPS address regulations with regard to address abbreviations, etc – that way there was no one arguing for their favorite abbreviation style. Only one person has permission to enter data in our SIS and one person has permission to enter data in our Advancement database.
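For what it’s worth, the abbreviation side of that standard is easy to sketch. The table below is only a small subset of the USPS street suffix abbreviations, and `standardize_street` is a hypothetical helper that looks at the last word of the line only, so lines ending in a unit number would need more work.

```python
# A small, illustrative subset of the USPS street suffix abbreviations.
USPS_SUFFIXES = {
    "STREET": "ST",
    "AVENUE": "AVE",
    "ROAD": "RD",
    "DRIVE": "DR",
    "LANE": "LN",
    "TRAIL": "TRL",
    "COURT": "CT",
    "BOULEVARD": "BLVD",
}


def standardize_street(line: str) -> str:
    """Uppercase a street line and abbreviate its final suffix word."""
    words = line.upper().replace(".", "").split()
    # Only the last word is treated as the suffix, so "Court Street" stays "COURT ST".
    if words and words[-1] in USPS_SUFFIXES:
        words[-1] = USPS_SUFFIXES[words[-1]]
    return " ".join(words)
```

Run over an export, this makes “12 Pine Trail” and “12 Pine Trl.” come out identical, which is the point of agreeing on one standard.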
When it comes to data cleanup, exporting the data to Excel and using the Excel find and replace feature really helps speed things up – then you just reimport the scrubbed data.
The biggest challenge I found was not the technical aspect of data management but the political aspect. I work in IT, so I am used to having to consider how different departments are impacted by changes in an enterprise system, but many individual departments think only in terms of their own wants or needs. Planning a strategy for navigating the political waters is equally important.
I like the idea of using the USPS regulations as a standard. Anywhere a defined standard can be applied, I think it is a very good idea. We actually tried using ZIP codes to assign towns but found that town post offices often have “sub stations”, and while people in those particular areas identify themselves as being in that area (Montclair vs Upper Montclair), the postal code translation puts them all in Montclair.
The point about the people is coming through loud and clear with regard to any set of rules or guidelines that are developed.
When it comes to our systems we often say “It’s all about the labels”, but when it comes to the data “It’s all about the people.”
Perfect timing! I have read with interest all of the posts. However, the other day I came across directions for using a DB and the rule was each person had his/her own record. So, no more Mr & Mrs (mother and father). Have any of you seen this and if so, how on earth is it managed? (a gift from Mr. and Mrs. Brown for instance). Thanks for any additional insights.
Helen Ruisi
Naples, FL
I have seen this and our system is moving all records to this format. There are different ways you can manage this, but the general idea is that each person is their own entity with attributes (DOB, email, first & last name, etc…) and then a family is its own entity with members attached to it. This model can be applied to admissions with families and in advancement with gifts by creating a giving group (like a family).
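A bare-bones sketch of that idea, with made-up class and field names rather than anything from our actual system, might look like this:

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Person:
    """One record per individual, holding only that person's own attributes."""
    first_name: str
    last_name: str
    email: str = ""


@dataclass
class GivingGroup:
    """A family or giving group is its own entity with people attached to it."""
    label: str  # e.g. the name printed on a joint acknowledgment
    members: List[Person] = field(default_factory=list)


@dataclass
class Gift:
    """A gift is recorded against the group, not against one spouse's record."""
    amount: float
    donor: GivingGroup


def credited_people(gift: Gift) -> List[str]:
    """List the individuals who share credit for a gift."""
    return [f"{p.first_name} {p.last_name}" for p in gift.donor.members]
```

So a gift from Mr. and Mrs. Brown is one `Gift` tied to a “Mr. and Mrs. Brown” group, while each of them still has an individual record.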
I have to say that while going with USPS standards might be a good idea for the student side of things, on the development/advancement side I prefer not using abbreviations. When large mail files are sent to a mail house, the abbreviations are going to happen automatically, but when much of the printing (acknowledgment letters, invitations, etc.) happens in-house, we want our envelopes to look as formal as possible.
Very thoughtful discussion. Thanks!
We also find that, where possible, it can be helpful to create drop-down menus for certain data fields (I am thinking of “Occupation”). This limits the way the data will be entered which facilitates searching and also allows for more useful reports.
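In code terms, a drop-down is just a fixed list of allowed values. Here is a small sketch of the same constraint applied to typed or imported data; the options shown and the name `parse_occupation` are invented for illustration.

```python
from enum import Enum


class Occupation(Enum):
    """Hypothetical drop-down options; a real list would come from the school."""
    EDUCATION = "Education"
    HEALTH_CARE = "Health Care"
    FINANCE = "Finance"
    OTHER = "Other"


def parse_occupation(raw: str) -> Occupation:
    """Accept only values from the list, ignoring case and extra spaces."""
    cleaned = " ".join(raw.split()).lower()
    for option in Occupation:
        if option.value.lower() == cleaned:
            return option
    raise ValueError(f"{raw!r} is not one of the allowed occupations")
```

Anything outside the list is rejected at entry time instead of turning up later as a one-off value in a report.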
One other thought, perhaps trivial, but it can be problematic. Make sure the data entry system has a clear way to move from one data field to the next. In some cases, for example, the tab or carriage return character is entered as part of the data, and in other cases it moves you to the next field. Multi-line data (in an address field, for example) affects how the layouts are set up.
Christopher York
The Spence School
Pingback: Defining Data Domains for Entry, Ownership and Support | williamstites.net
I have an issue with a company (Seventh Avenue). For the past two months I have been receiving statements within two weeks of each other. I also have been making payments on time every month. I contacted the company and asked them what was going on with my account, and I was informed that there was a data entry error: someone keyed one account with my full address and one with an abbreviation (Trail – Trl). A rep informed me that both accounts would be combined into one, but I continue to receive notices threatening to report me to credit bureaus. How can I be at fault when Seventh Avenue’s staff created the initial problem? What type of argument can I present to this company to defend my account and hold them responsible for this error?