When working with large data files, such as the NES, GSS, Eurobarometer, and the U.S. Supreme Court Database, there are several ways to write your SAS program that make it more efficient and so cut down on the computer resources and time it takes to run.
Use KEEP or DROP statements in your DATA step to reduce the size of the data set, so that SAS carries along only the variables you actually need.
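For example, a minimal sketch of both approaches might look like the following. The data set `survey` and all of its variables and values are made up for illustration and stand in for a large survey file. The first step uses KEEP= as a data set option on the SET statement, which is a closely related form rather than the KEEP statement itself; the second uses a DROP statement.

```sas
/* Tiny made-up data set standing in for a large survey file (hypothetical values). */
data survey;
    input year sex $ partyid agegroup income;
    datalines;
87 F 1 2 30000
87 M 3 1 45000
88 F 2 3 52000
;
run;

/* KEEP= as a data set option on SET: only the listed variables are read in. */
data small;
    set survey (keep=year sex partyid);
run;

/* DROP statement: every variable except income is written to the output data set. */
data no_income;
    set survey;
    drop income;
run;
```

Restricting the variables as the data are read, rather than afterwards, is usually where most of the savings come from.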
With the WHERE and BY statements, you can accomplish in one step what might otherwise take several.
In the first example, you create a data set called new that contains only those observations from your data set whose value for the variable year is 87. The second and third examples demonstrate that you can also use WHERE and BY statements within procedures. Example 2 performs frequency counts for the variables sex and partyid only on the subset of the data that meets the condition year=87. In the third, PROC FREQ performs a separate set of frequency counts on those variables for each value of the variable agegroup. Note that BY processing expects the data to be sorted (or indexed) by agegroup first.
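The three examples described above might look like the following sketch. The input data set `survey` and its values are hypothetical; the names new, year, sex, partyid, and agegroup come from the description. The PROC SORT step is an added assumption so that the BY statement has sorted data to work with.

```sas
/* Made-up input data (hypothetical values). */
data survey;
    input year sex $ partyid agegroup;
    datalines;
87 F 1 2
87 M 3 1
88 F 2 3
87 F 2 1
;
run;

/* Example 1: keep only the observations where year is 87. */
data new;
    set survey;
    where year = 87;
run;

/* Example 2: frequency counts on the year = 87 subset only. */
proc freq data=survey;
    where year = 87;
    tables sex partyid;
run;

/* Example 3: one set of frequency counts per value of agegroup. */
/* BY-group processing needs the data sorted by the BY variable. */
proc sort data=survey out=survey_sorted;
    by agegroup;
run;

proc freq data=survey_sorted;
    by agegroup;
    tables sex partyid;
run;
```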
When testing programs for errors, reduce the number of observations SAS reads with the OBS= option. You can set it either in your DATA step or on the data set named in the PROC statement itself.
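A minimal sketch of both placements, again using a small made-up data set named `survey` (a hypothetical name) and an arbitrary limit of 2 observations:

```sas
/* Made-up input data (hypothetical values). */
data survey;
    input year sex $ partyid;
    datalines;
87 F 1
87 M 3
88 F 2
88 M 1
;
run;

/* In the DATA step: read only the first 2 observations. */
data test;
    set survey (obs=2);
run;

/* In the PROC statement: run the procedure on the first 2 observations only. */
proc freq data=survey (obs=2);
    tables sex;
run;
```

Once the program runs cleanly, removing the OBS= option restores the full data set.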
Each time you use an IF statement, SAS must check every observation of your data to see if it meets the condition you've set, then perform the operation you specify. In the social sciences, researchers usually use IF statements to create new variables or recode old ones. Assume, for instance, that you have a variable, ideology, that is measured on a five-point scale and you want to collapse it into a three-point scale (liberals, moderates, and conservatives). The inefficient way to do it uses a separate IF statement for each condition you need in order to recode.
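The inefficient version might be sketched as follows. The coding is an assumption made for illustration: ideology values 1 and 2 are treated as liberal, 3 as moderate, and 4 and 5 as conservative. The data set name `ideo` and the new variable `ideo3` are hypothetical.

```sas
/* Made-up data: one observation per point on the five-point scale. */
data ideo;
    input ideology @@;
    datalines;
1 2 3 4 5
;
run;

/* Separate IF statements: all three conditions are tested for every observation. */
data recode_slow;
    set ideo;
    if ideology in (1, 2) then ideo3 = 1;   /* liberal (assumed coding) */
    if ideology = 3 then ideo3 = 2;         /* moderate */
    if ideology in (4, 5) then ideo3 = 3;   /* conservative (assumed coding) */
run;
```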
In this case, for each IF ... THEN statement, SAS looks through every observation in your data, checking to see if it meets the condition set out in the IF portion of the statement. If it does, SAS performs the operation described in the THEN clause; if not, it moves on to the next observation. By using "ELSE IF" instead of "IF", however, you can cut down on the time it takes SAS to perform the operation.
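A sketch of the same recode written with ELSE IF, under the same assumed coding (1 and 2 liberal, 3 moderate, 4 and 5 conservative) and the same hypothetical names `ideo` and `ideo3`:

```sas
/* Made-up data: one observation per point on the five-point scale. */
data ideo;
    input ideology @@;
    datalines;
1 2 3 4 5
;
run;

/* IF / ELSE IF chain: once a condition is met, the remaining tests are skipped. */
data recode_fast;
    set ideo;
    if ideology in (1, 2) then ideo3 = 1;        /* liberal (assumed coding) */
    else if ideology = 3 then ideo3 = 2;         /* moderate */
    else if ideology in (4, 5) then ideo3 = 3;   /* conservative (assumed coding) */
run;
```

Putting the most common category first in the chain tends to help further, since more observations exit after a single test.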
SAS still checks every observation against the first IF statement, but in the statements that follow it checks only those observations that did not meet an earlier condition. As you can imagine, this can make a big difference in run time. Finally, use system files.