Data Formatting and Cleaning

Naming variables

Variable names belong in the first row in your Excel spreadsheet. If using SPSS, variable names should be entered as headers in the Variable View tab in SPSS. Variable names should be brief yet descriptive. Variable names can contain capital letters (A ~ Z) or lowercase letters (a ~ z), and underscores ( _ ). Variable names cannot begin with a number.

Naming cases

The first column should consist of a participant identifier. Each case should have their own row. This is known as "wide" format.

Coding variables

Scale variables (e.g. score on a test) should already be numerical entries. For categorical variables (responses consisting of text), such as (gender, job title, etc.) assign a number for each response. For instance, nurses = 1, LPN = 2, technician = 3. If using Excel, keep track of your codes in a separate document, apart from the spreadsheet you plan to input into the statistical program. If using SPSS, codes can entered for each variable in the Values column in the Variable View tab.

Importing data

By default, Excel will save your file in a .xlsx format. You must save your cleaned data as a .csv (Comma delimited) file. From the menu, click File > Save As > then select CSV (Comma delimited) from the "save as type" drop-down menu.

To input data from Excel into SPSS, click File > Open > Data. In the dialog box, navigate to the folder you saved the .csv file in, then click "Files of type" and choose either "CSV (*.csv)", "Text (*.txt, *.dat, *.csv, *.tab), or All Files (*.*) to display the .csv file you need. If the data has been formatted correctly according to the steps outlined here, you can simply click "Ok" to open your data. At this point, I'd suggest clicking "File > Save As > and save your newly imported file as a (*.sav), an SPSS data file. If you make changes to the data, save it as a new file with a different name, in case you make a mistake and need to revert to the original data.