![]() |
Don't Be a SAS® Dinosaur:
Modernize Your SAS Programs
Multiple Input Data Sets: Identifying the Source Data Set |
You need to combine a group of data sets whose names are similar.
The combined data set must contain a variable that indicates which input data set contributed that observation.
|
The identification of the input data set is performed by checking the IN= variables defined for each input data set in the SET statement. Download this program
data combined;
set sashelp.prdsale(in=in_prdsale)
sashelp.prdsal2(in=in_prdsal2)
sashelp.prdsal3(in=in_prdsal3);
length Source $ 32;
if in_prdsale then source="PRDSALE";
else if in_prdsal2 then source="PRDSAL2";
else if in_prdsal3 then source="PRDSAL3";
run;
proc freq data=combined;
tables Source;
title1 "Source Data Sets for Combined Data";
run;
SAS Log6991 data combined; 6992 set sashelp.prdsale(in=in_prdsale) 6993 sashelp.prdsal2(in=in_prdsal2) 6994 sashelp.prdsal3(in=in_prdsal3); 6995 length Source $ 32; 6996 if in_prdsale then source="PRDSALE"; 6997 else if in_prdsal2 then source="PRDSAL2"; 6998 else if in_prdsal3 then source="PRDSAL3"; 6999 run; NOTE: There were 1440 observations read from the data set SASHELP.PRDSALE. NOTE: There were 23040 observations read from the data set SASHELP.PRDSAL2. NOTE: There were 11520 observations read from the data set SASHELP.PRDSAL3. NOTE: The data set WORK.COMBINED has 36000 observations and 15 variables. 7000 proc freq data=combined; 7001 tables Source; 7002 title1 "Source Data Sets for Combined Data"; 7003 run; NOTE: There were 36000 observations read from the data set WORK.COMBINED. SAS Listing Output
Source Data Sets for Combined Data
The FREQ Procedure
Cumulative Cumulative
Source Frequency Percent Frequency Percent
------------------------------------------------------------
PRDSAL2 23040 64.00 23040 64.00
PRDSAL3 11520 32.00 34560 96.00
PRDSALE 1440 4.00 36000 100.00
|
![]() |
|
An alternate approach is to use the INDSNAME= option in the SET statement, available starting in SAS 9.2. The variable defined by the INDSNAME= option contains the complete input data set including the libref, for a maximum length of 41 characters. This technique is particularly valuable when combined with the data set list syntax introduced in SAS 9.2. Download this programdata combined; set sashelp.prdsal: indsname=DataSetName; length Source $ 32; Source=scan(DataSetName,2); run; proc freq data=combined; tables Source; title1 "Source Data Sets for Combined Data"; run; SAS Log7015 data combined; 7016 set sashelp.prdsal: indsname=DataSetName; 7017 length Source $ 32; 7018 Source=scan(DataSetName,2); 7019 run; NOTE: There were 23040 observations read from the data set SASHELP.PRDSAL2. NOTE: There were 11520 observations read from the data set SASHELP.PRDSAL3. NOTE: There were 1440 observations read from the data set SASHELP.PRDSALE. NOTE: The data set WORK.COMBINED has 36000 observations and 15 variables. 7020 proc freq data=combined; 7021 tables Source; 7022 title1 "Source Data Sets for Combined Data"; 7023 run; NOTE: There were 36000 observations read from the data set WORK.COMBINED. SAS Listing OutputSame as for the old way Advantages
of the
alternate
approach:
|
|
Back to SAS Dinosaur home page Printable copy of this page (without sample output)
SAS and all other SAS Institute Inc. product or service names are registered trademarks
or trademarks of SAS Institute Inc. in the USA and other countries.
|