Best Practices
Survey Length and Data Collection
As a rule of thumb, online forms (especially surveys) are best kept to a page or two. While you could make them much longer, users tend to get overwhelmed when they have to scroll through page after page of questions. Very long forms also take longer to complete. If you are distracted mid-form and do not save your work, your session could time out (after 30 minutes of inactivity) and all unsaved data would be lost. So, if you think you’re starting to get carried away with a long series of questions, it may be preferable to break your questions up into multiple forms.
Group variables so that they follow the data entry workflow, and use field types that minimize switching between keyboard and mouse. For example, you can select a dropdown option by typing the first character of its label, allowing you to “tab and type” through the data entry fields, while radio buttons require using the mouse to select an option. Keep forms fairly short to minimize the risk of data loss (shorter forms mean saving more often) and to make data entry errors easier to identify.
Use categorical response (dropdown, radio button, checkbox) field types when possible to reduce risk of data entry error. If these fields are not feasible, use text fields with validation (date, phone, email, integer, number) whenever possible to reduce the use of free-text fields. When using a text field with validation types of number or integer, define range minimum/maximum as much as possible to allow REDCap to perform basic data validation/quality control.
HIPAA Compliance and PHI
Export only when necessary. One of REDCap’s greatest strengths is the security of your data. Take precautions when exporting data, and export only if you need to run reports or analyses outside of REDCap. Grant export rights only to the users who really need them.
There are 18 pieces of information that are considered identifiers (also called protected health information, or PHI) for the purposes of HIPAA compliance. When you indicate a variable as an Identifier, you have the option to “de-identify” your data on data exports. In the Data Export Tool, the identifier variables appear in red and there are de-identification options you can select prior to exporting the data.
The 18 identifiers are as follows:
1. Name
2. Fax number
3. Phone number
4. E-mail address
5. Account numbers
6. Social Security number
7. Medical Record number
8. Health Plan number
9. Certificate/license numbers
10. URL
11. IP address
12. Vehicle identifiers
13. Device ID
14. Biometric ID
15. Full face/identifying photo
16. Other unique identifying number, characteristic, or code
17. Postal address (geographic subdivisions smaller than state)
18. Date precision beyond year
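As a rough illustration of what de-identification means for a file that has already been exported, the Python sketch below drops flagged identifier columns from CSV rows. It is not REDCap's Data Export Tool; the function name `deidentify_rows`, the sample data, and the field names are all hypothetical.

```python
import csv
import io


def deidentify_rows(rows, identifier_fields):
    """Return copies of the rows with the flagged identifier fields removed."""
    flagged = set(identifier_fields)
    return [{k: v for k, v in row.items() if k not in flagged} for row in rows]


# Hypothetical export: "name" and "phone" were marked as identifiers.
export_csv = (
    "study_id,name,phone,weight_kg\n"
    "1,Jane Doe,555-0100,61.5\n"
    "2,John Roe,555-0101,80.2\n"
)
rows = list(csv.DictReader(io.StringIO(export_csv)))
clean = deidentify_rows(rows, identifier_fields=["name", "phone"])
print(clean[0])
```

Whenever possible, prefer the de-identification options built into the export itself, so that identifiers never leave REDCap in the first place.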
Send-It is a secure data transfer application that allows you to upload a file (up to 32 MB in size) and then allow multiple recipients to download the file in a secure manner. Each recipient will receive an email containing a unique download URL, along with a second follow-up email with the password (for greater security) for downloading the file. The file will be stored securely and then later removed from the server after the specified expiration date. Send-It is the perfect solution for anyone wanting to send files that are too large for email attachments or that contain sensitive data.
Data Dictionary
An example data dictionary has been included under the "Additional Resources" section of this wiki page. This file shows entries for each required column, as well as examples of calculated fields and branching logic.
- The first variable on the first form should be the record identifier (e.g. Participant ID) because it will be used by REDCap as a key variable linking forms for a particular record. The default variable name is "Study_ID". Demographics is normally the first form, but this is not required. All new projects are provided with a sample Demographics form, but you are free to modify/replace this.
- Put variables collected together on the same form to improve data entry workflow. Putting demographics together and labs together on separate forms makes data entry more reliable.
- Include Field Notes describing units, formats, etc. whenever appropriate. Do not assume the data entry person knows the expected units or formats.
Note: Although you will be working in Excel, the file you work with is actually a ".csv" (comma-separated values) file. When saving your data dictionary, be sure to select the .csv format so that it can be uploaded to REDCap.
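If you would rather generate the file with a script than save it from Excel, a minimal Python sketch could look like the following. The column headings are taken from the column descriptions later on this page and are an assumption; the header row REDCap expects should be copied from the example data dictionary. The function name `write_data_dictionary` and the sample fields are hypothetical.

```python
import csv

# Assumed headings, based on the column descriptions on this page.
COLUMNS = [
    "Variable/Field Name", "Form Name", "Section Header", "Field Type",
    "Field Label", "Choices, Calculations OR Slider Labels", "Field Notes",
    "Text Validation Type OR Show Slider Number", "Text Validation Min",
    "Text Validation Max", "Identifiers", "Branching Logic",
    "Required Field", "Custom Alignment", "Question Number",
]


def write_data_dictionary(path, fields):
    """Write one row per field; columns not supplied are left blank."""
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=COLUMNS, restval="")
        writer.writeheader()
        writer.writerows(fields)


# Hypothetical two-field dictionary for a demographics form.
example_fields = [
    {"Variable/Field Name": "study_id", "Form Name": "demographics",
     "Field Type": "text", "Field Label": "Study ID"},
    {"Variable/Field Name": "dob", "Form Name": "demographics",
     "Field Type": "text", "Field Label": "Date of birth",
     "Text Validation Type OR Show Slider Number": "date"},
]
```

Calling `write_data_dictionary("data_dictionary.csv", example_fields)` would produce a file with a header row followed by one row per field.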
This section describes the function of each column in the data dictionary spreadsheet, and whether it is required or optional.
Column A - Variable/Field Name (Required)
- Variable/Field names specify the variable name that will be used in reporting, data export, and data analysis. They are not displayed on the data entry form.
- Variable names may contain letters, numbers, and underscores, but no spaces or special characters.
- Variable names cannot start with a number.
- Variable names must be unique, and cannot be repeated within a database, even in different forms.
- Variable names should be brief; they do not need to be descriptive as the description will be included in the Variable Label. A common example is the use of "dob" as a variable name, with the corresponding variable label of "Date of birth."
- Remember that if you change a variable name in one place, you must change it everywhere it is used, e.g. calculations, branching logic.
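The naming rules above can be checked before upload with a short script. The Python sketch below is only an illustration: the function name `check_variable_names` is hypothetical, and it assumes that uniqueness means an exact, case-sensitive match.

```python
import re

# Letters, numbers, and underscores only; the first character is not a number.
NAME_PATTERN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def check_variable_names(names):
    """Return a list of human-readable problems found in the variable names."""
    problems = []
    seen = set()
    for name in names:
        if not NAME_PATTERN.fullmatch(name):
            problems.append(f"{name!r}: invalid characters or starts with a number")
        if name in seen:
            problems.append(f"{name!r}: duplicate variable name")
        seen.add(name)
    return problems


for problem in check_variable_names(["dob", "1st_visit", "dob", "date of birth"]):
    print(problem)
```

An empty result means that every name passed the rules listed in this section.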
Column B - Form Name (Required)
- Forms are groupings of variables within the database. It's a good idea to divide your variables into several fairly short forms for ease of data entry, and more opportunities to save data at the end of each form.
- Form names must be all lowercase in the Excel spreadsheet, but will be displayed in REDCap with initial capitals. If your form name contains more than one word, connect the words with an underscore, such as "form_name". The underscore will appear as a space in REDCap.
- All variables in a form must be in adjacent rows in the data dictionary. For example, you cannot have a variable in row 6 be in the "demographics_form", a variable in row 7 be in the "first_visit" form, and then a variable in row 8 back in the "demographics_form". Variables will appear on the form in the order they appear in the data dictionary.
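The form name rules can be sketched in Python as follows. Both helper names (`display_name`, `find_split_forms`) are hypothetical, and the display conversion is only an approximation of how REDCap shows form names, based on the description above.

```python
def display_name(form_name):
    """Approximate the displayed name: underscores become spaces, initial capitals."""
    return " ".join(word.capitalize() for word in form_name.split("_"))


def find_split_forms(form_column):
    """Return form names whose rows are not adjacent in the data dictionary."""
    finished = set()   # forms whose block of rows has already ended
    split = []
    previous = None
    for form in form_column:
        if form != previous:
            if form in finished and form not in split:
                split.append(form)
            if previous is not None:
                finished.add(previous)
            previous = form
    return split


# The invalid ordering described above: demographics_form is interrupted.
form_column = ["demographics_form", "first_visit", "demographics_form"]
print(display_name("demographics_form"))
print(find_split_forms(form_column))
```

Any form name returned by `find_split_forms` needs its rows moved back together before the data dictionary is uploaded.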
Column C - Section Header (Optional)
Section Headers are used to visually separate items within a form, primarily to aid data entry. If you are entering data directly into REDCap while interviewing a study participant, you may also want to use Section Headers to display interview script between questions, e.g. to introduce a new topic.
Column D - Field Type (Required)
- Specifying the field type determines what types of responses are allowed, and how they will be displayed. Field types include: dropdown, radio button, checkboxes, text box, note box, calculated field, file upload, and section header.
- Categorical field types (dropdown, radio buttons, checkboxes) must also have response options (choices) defined in Column F. Terms used in Column D to define these field types are: dropdown, radio, checkboxes.
- Text field types (text box or note box) should have validation (Column H) whenever possible. If the validation is "integer" or "number", you should also include the allowable minimum and maximum values (Columns I & J). Text variables cannot also have choices listed in Column F. Terms used in Column D to define these field types are: text, notes.
- Calculated variables display the result of a calculation based on responses to previous variables. Data cannot be entered in calculated fields. The term used to define this field type in Column D is: calc.
- The file upload field type allows you to attach a document (e.g. consent form) to the record. The maximum file size for any document is 50 MB. The term used to define this field type in Column D is: file.
- A section header is a field type used to visually separate fields in a form. Text to be included in a section header should be entered in Column C. The term used to define this field type in Column D is: text.
Column E - Field Label (Required)
- A Field Label (or variable label) is a word or phrase that is more descriptive than the variable/field name. It is displayed on the form (instead of the variable name) because it provides more information to the reader.
Column F - Choices, Calculations OR Slider Labels (Required)
- All categorical field types (yes/no, dropdown, radio buttons, checkboxes) must specify response options associating numerical values with labels. For example, Yes=1, No=0.
- All calculated fields must specify the calculation here. Examples of calculation syntax can be found in the REDCap Help Section and in the demonstration data dictionary.
- The slider field allows you to label three anchor points: left, middle, and right. An example might be "Strongly Disagree", "Neutral", and "Strongly Agree."
Column G - Field Notes (Optional)
- Field notes are used to provide information to assist in data entry. Examples are specifying the expected format of a validated field (e.g. phone number), or units (e.g. kg vs. lb).
Column H - Text Validation Type OR Show Slider Number (Optional)
- Format validation types for text fields are: date, time, integer, number, zip code, phone, and email. An error message is displayed if an entry does not match expected format.
- For slider fields, specify whether to display or hide the value (1-100) selected on the slider.
Columns I & J - Text Validation Min/Max (Optional)
- For text validation types of number or integer, minimum and maximum acceptable values may also be specified. An error message, including the acceptable range, is displayed if the entry is out of range.
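To make the range check concrete, here is a minimal Python sketch of the kind of validation described above. It illustrates the idea only and is not REDCap's implementation; the function name `validate_number` and the wording of its messages are hypothetical.

```python
import math


def validate_number(raw, minimum=None, maximum=None, integer_only=False):
    """Return an error message for an invalid entry, or None if it passes."""
    kind = "an integer" if integer_only else "a number"
    try:
        value = int(raw) if integer_only else float(raw)
    except ValueError:
        return f"{raw!r} is not {kind}"
    if not math.isfinite(value):
        # Reject values such as "nan" or "inf", which float() would accept.
        return f"{raw!r} is not {kind}"
    too_low = minimum is not None and value < minimum
    too_high = maximum is not None and value > maximum
    if too_low or too_high:
        return f"{value} is outside the acceptable range ({minimum} to {maximum})"
    return None


# Hypothetical age field limited to whole numbers from 0 to 120.
print(validate_number("42", minimum=0, maximum=120, integer_only=True))
print(validate_number("130", minimum=0, maximum=120, integer_only=True))
```

Leaving `minimum` or `maximum` unset skips that side of the check, which is why defining both limits wherever possible catches more data entry errors.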
Column K - Identifiers (Required for certain variables)
Column L - Branching Logic (Optional)
- Branching logic can be applied to a field to specify whether or not it will be displayed, depending on values in previous fields. For example, a question about pregnancy can be designated to be displayed only if the subject is female. Syntax for branching logic is described in the REDCap Help Section and in the demonstration data dictionary.
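The idea behind branching logic can be sketched in Python as a set of display rules evaluated against the values already entered. This is a conceptual illustration only and does not use REDCap's branching logic syntax; the field names `sex` and `pregnant`, the value "female", and the function `visible_fields` are hypothetical.

```python
# Fields in the order they appear on the form.
FIELDS = ["sex", "pregnant"]

# Each rule takes the record so far and returns True if the field is shown.
# Fields without a rule are always shown.
RULES = {
    "pregnant": lambda record: record.get("sex") == "female",
}


def visible_fields(field_order, rules, record):
    """Return the fields that should be displayed for this record."""
    always_show = lambda _record: True
    return [name for name in field_order if rules.get(name, always_show)(record)]


print(visible_fields(FIELDS, RULES, {"sex": "female"}))
print(visible_fields(FIELDS, RULES, {"sex": "male"}))
```

Because a rule depends on values from earlier fields, the fields it refers to need to come before the field it controls.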
Column M - Required Field (Optional)
- A field can be designated as "required" so that it must be completed before moving on to the next field. An error message is displayed if the field is left blank.
Column N - Custom Alignment (Optional)
- The location of text boxes or categorical responses (dropdown, radio, checkbox) can be specified as Right/Vertical, Left/Vertical, Right/Horizontal, Left/Horizontal. The default setting, if not specified, is Right/Vertical.
Column O - Question Number (surveys only) (Optional)
- REDCap can be set to auto-number questions on a survey. However, if you want a custom numbering scheme, you can specify each question number here.