![]() |
Don't Be a SAS® Dinosaur:
Modernize Your SAS Programs
Frequency Reports: Number of Levels of a Variable |
You need to determine the number of distinct levels for several variables in a SAS data set.
If missing values are present, you wish to distinguish missing values (including regular and special missing values for numeric variables) from nonmissing values when the counts are generated.
|
The original approach is to collapse each variable down to unique values, then count the number of collapsed items. The counts for individual variables subsequently are concatenated to create a single result set. Download this program
data Georgia;
set sashelp.zipcode(where=(StateName="Georgia"));
if msa=0 then msa=.;
run;
proc freq data=georgia;
tables CountyNm / out=Counties noprint;
tables AreaCode / out=AreaCodes noprint;
tables ZIP_Class / out=ZIPclasses noprint;
tables MSA / out=MSAs noprint;
run;
data counts;
set Counties(in=county_level)
AreaCodes(in=areacode_level)
ZIPclasses(in=zipclass_level)
MSAs(in=msa_level)
end=no_more;
if county_level then CountyCount+1;
else if areacode_level then AreaCodeCount+1;
else if zipclass_level then ZIPclassCount+1;
else if msa_level then MSACount+1;
if no_more then output;
keep CountyCount AreaCodeCount ZIPclassCount MSACount;
run;
proc print data=counts noobs;
title1 "Counts of Variable Levels";
run;
SAS Log
6798 data Georgia;
6799 set sashelp.zipcode(where=(StateName="Georgia"));
6800 if msa=0 then msa=.;
6801 run;
NOTE: There were 972 observations read from the data set SASHELP.ZIPCODE.
WHERE StateName='Georgia';
NOTE: The data set WORK.GEORGIA has 972 observations and 18 variables.
6802 proc freq data=georgia;
6803 tables CountyNm / out=Counties noprint;
6804 tables AreaCode / out=AreaCodes noprint;
6805 tables ZIP_Class / out=ZIPclasses noprint;
6806 tables MSA / out=MSAs noprint;
6807 run;
NOTE: There were 972 observations read from the data set WORK.GEORGIA.
NOTE: The data set WORK.COUNTIES has 159 observations and 3 variables.
NOTE: The data set WORK.AREACODES has 6 observations and 3 variables.
NOTE: The data set WORK.ZIPCLASSES has 3 observations and 3 variables.
NOTE: The data set WORK.MSAS has 9 observations and 3 variables.
6808 data counts;
6809 set Counties(in=county_level)
6810 AreaCodes(in=areacode_level)
6811 ZIPclasses(in=zipclass_level)
6812 MSAs(in=msa_level)
6813 end=no_more;
6814 if county_level then CountyCount+1;
6815 else if areacode_level then AreaCodeCount+1;
6816 else if zipclass_level then ZIPclassCount+1;
6817 else if msa_level then MSACount+1;
6818 if no_more then output;
6819 keep CountyCount AreaCodeCount ZIPclassCount MSACount;
6820 run;
NOTE: There were 159 observations read from the data set WORK.COUNTIES.
NOTE: There were 6 observations read from the data set WORK.AREACODES.
NOTE: There were 3 observations read from the data set WORK.ZIPCLASSES.
NOTE: There were 9 observations read from the data set WORK.MSAS.
NOTE: The data set WORK.COUNTS has 1 observations and 4 variables.
6821 proc print data=counts noobs;
6822 title1 "Counts of Variable Levels";
6823 run;
NOTE: There were 1 observations read from the data set WORK.COUNTS.
SAS Listing Output
Counts of Variable Levels
Area
County Code ZIPclass
Count Count Count MSACount
159 6 3 9
|
![]() |
|
An alternate approach is to apply the NLEVELS option in PROC FREQ, using the ODS OUTPUT statement to capture the counts into a SAS data set. Download this programdata Georgia; set sashelp.zipcode(where=(StateName="Georgia")); if msa=0 then msa=.; run; ods output NLevels=CountsNLevels; proc freq data=georgia nlevels; tables CountyNm AreaCode ZIP_Class MSA / noprint missing; label ZIP_Class=" " MSA=" " CountyNm=" " AreaCode=" "; title1 "Counts of Variable Levels"; run; proc print data=CountsNLevels; title1 "Counts of Variable Levels"; run; SAS Log
6835 data Georgia;
6836 set sashelp.zipcode(where=(StateName="Georgia"));
6837 if msa=0 then msa=.;
6838 run;
NOTE: There were 972 observations read from the data set SASHELP.ZIPCODE.
WHERE StateName='Georgia';
NOTE: The data set WORK.GEORGIA has 972 observations and 18 variables.
6839 ods output NLevels=CountsNLevels;
6840 proc freq data=georgia nlevels;
6841 tables CountyNm AreaCode ZIP_Class MSA
6842 / noprint missing;
6843 label ZIP_Class=" " MSA=" " CountyNm=" " AreaCode=" ";
6844 title1 "Counts of Variable Levels";
6845 run;
NOTE: The data set WORK.COUNTSNLEVELS has 4 observations and 4 variables.
NOTE: There were 972 observations read from the data set WORK.GEORGIA.
6846 proc print data=CountsNLevels;
6847 title1 "Counts of Variable Levels";
6848 run;
NOTE: There were 4 observations read from the data set WORK.COUNTSNLEVELS.
SAS Listing Output
Counts of Variable Levels
The FREQ Procedure
Number of Variable Levels
Missing Nonmissing
Variable Levels Levels Levels
-----------------------------------------------
COUNTYNM 159 0 159
AREACODE 6 0 6
ZIP_CLASS 3 1 2
MSA 9 1 8
========================================================================================
Counts of Variable Levels
NNon
NMiss Miss
Obs TableVar NLevels Levels Levels
1 COUNTYNM 159 0 159
2 AREACODE 6 0 6
3 ZIP_CLASS 3 1 2
4 MSA 9 1 8
Advantages
of the
alternate
approach:
|
|
Back to SAS Dinosaur home page Printable copy of this page (without sample output)
SAS and all other SAS Institute Inc. product or service names are registered trademarks
or trademarks of SAS Institute Inc. in the USA and other countries.
|