Warren Repole

Don't Be a SAS® Dinosaur: Modernize Your SAS Programs
by Warren Repole

Arrays: Array Element Indices


Scenario:

You want to process a set of variables using a DATA step array.

The array elements correspond to a series of consecutive years.


The old way: Array Dimensions from 1 to N

The original approach defines the array by its number of elements alone, so the subscripts run from 1 to N.

Accessing an array element by year then requires an arithmetic expression that converts the year into a subscript (here, Year-1994).

proc print data=AllYears;
  title1 "Annual Data in Separate Variables";
run;
data details(drop=Sales1995-Sales1998);
  set AllYears;
  array SalesArray {4} Sales1995-Sales1998 ;
  do Year=1995 to 1998;
    ArrayIndex=Year-1994;
    Value=SalesArray{ArrayIndex};
    format Value dollar12.2;
    output;
  end;
run;
proc print data=details;
  title1 "Results of Array Processing";
run;

SAS Log

1114       proc print data=AllYears;
1115         title1 "Annual Data in Separate Variables";
1116       run;
 
NOTE: There were 4 observations read from the data set WORK.ALLYEARS.
 
1117       data details(drop=Sales1995-Sales1998);
1118         set AllYears;
1119         array SalesArray {4} Sales1995-Sales1998 ;
1120         do Year=1995 to 1998;
1121           ArrayIndex=Year-1994;
1122           Value=SalesArray{ArrayIndex};
1123           format Value dollar12.2;
1124           output;
1125         end;
1126       run;
 
NOTE: There were 4 observations read from the data set WORK.ALLYEARS.
NOTE: The data set WORK.DETAILS has 16 observations and 4 variables.
 
1127       proc print data=details;
1128         title1 "Results of Array Processing";
1129       run;
 
NOTE: There were 16 observations read from the data set WORK.DETAILS.
 
 

SAS Listing Output

Annual Data in Separate Variables
 
Obs   STATE                       Sales1995      Sales1996      Sales1997      Sales1998
 
 1    British Columbia           $42,788.80     $46,502.40     $53,486.00     $58,128.00
 2    Ontario                    $34,665.60     $45,286.40     $43,332.00     $56,608.00
 3    Quebec                     $44,324.00     $44,454.40     $55,405.00     $55,568.00
 4    Saskatchewan               $40,048.00     $49,543.20     $50,060.00     $61,929.00
========================================================================================
Results of Array Processing
 
                                         Array
Obs    STATE                     Year    Index           Value
 
  1    British Columbia          1995      1        $42,788.80
  2    British Columbia          1996      2        $46,502.40
  3    British Columbia          1997      3        $53,486.00
  4    British Columbia          1998      4        $58,128.00
  5    Ontario                   1995      1        $34,665.60
  6    Ontario                   1996      2        $45,286.40
  7    Ontario                   1997      3        $43,332.00
  8    Ontario                   1998      4        $56,608.00
  9    Quebec                    1995      1        $44,324.00
 10    Quebec                    1996      2        $44,454.40
 11    Quebec                    1997      3        $55,405.00
 12    Quebec                    1998      4        $55,568.00
 13    Saskatchewan              1995      1        $40,048.00
 14    Saskatchewan              1996      2        $49,543.20
 15    Saskatchewan              1997      3        $50,060.00
 16    Saskatchewan              1998      4        $61,929.00

The new way: Array Dimensions from M to N
(available in SAS Version 6)

An alternate approach is to define the array bounds as the actual range of years, using the lower:upper form of the dimension (here, 1995:1998).

The year itself then serves as the subscript, so array elements can be accessed by year directly.

proc print data=AllYears;
  title1 "Annual Data in Separate Variables";
run;
data details(drop=Sales1995-Sales1998);
  set AllYears;
  array SalesArray {1995:1998} Sales1995-Sales1998 ;
  do Year=lbound(SalesArray) to hbound(SalesArray);
    Value=SalesArray{Year};
    format Value dollar12.2;
    output;
  end;
run;
proc print data=details;
  title1 "Results of Array Processing";
run;

SAS Log

1263       proc print data=AllYears;
1264         title1 "Annual Data in Separate Variables";
1265       run;
 
NOTE: There were 4 observations read from the data set WORK.ALLYEARS.
 
1266       data details(drop=Sales1995-Sales1998);
1267         set AllYears;
1268         array SalesArray {1995:1998} Sales1995-Sales1998 ;
1269         do Year=lbound(SalesArray) to hbound(SalesArray);
1270           Value=SalesArray{Year};
1271           format Value dollar12.2;
1272           output;
1273         end;
1274       run;
 
NOTE: There were 4 observations read from the data set WORK.ALLYEARS.
NOTE: The data set WORK.DETAILS has 16 observations and 3 variables.
 
1275       proc print data=details;
1276         title1 "Results of Array Processing";
1277       run;
 
NOTE: There were 16 observations read from the data set WORK.DETAILS.
 

SAS Listing Output

Annual Data in Separate Variables
 
Obs   STATE                       Sales1995      Sales1996      Sales1997      Sales1998
 
 1    British Columbia           $42,788.80     $46,502.40     $53,486.00     $58,128.00
 2    Ontario                    $34,665.60     $45,286.40     $43,332.00     $56,608.00
 3    Quebec                     $44,324.00     $44,454.40     $55,405.00     $55,568.00
 4    Saskatchewan               $40,048.00     $49,543.20     $50,060.00     $61,929.00
========================================================================================
Results of Array Processing
 
Obs    STATE                     Year           Value
 
  1    British Columbia          1995      $42,788.80
  2    British Columbia          1996      $46,502.40
  3    British Columbia          1997      $53,486.00
  4    British Columbia          1998      $58,128.00
  5    Ontario                   1995      $34,665.60
  6    Ontario                   1996      $45,286.40
  7    Ontario                   1997      $43,332.00
  8    Ontario                   1998      $56,608.00
  9    Quebec                    1995      $44,324.00
 10    Quebec                    1996      $44,454.40
 11    Quebec                    1997      $55,405.00
 12    Quebec                    1998      $55,568.00
 13    Saskatchewan              1995      $40,048.00
 14    Saskatchewan              1996      $49,543.20
 15    Saskatchewan              1997      $50,060.00
 16    Saskatchewan              1998      $61,929.00

Advantages of the alternate approach:

  • No calculations are required to reference the correct array element corresponding to a year.
  • The LBOUND and HBOUND functions return the lower and upper bounds of the array, so a loop index variable can traverse the array without the range being hardcoded in the DO statement.
  • If the array were to be loaded from a table, no calculations would be required to store the value corresponding to a year in the correct array element.
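
The last point, loading an array from a table, might look like the following minimal sketch. The lookup data set YearRates, its variables Year and Rate, the array name RateByYear, and the data values are all hypothetical and serve only as an illustration; they are not part of the original example.

```sas
/* Hypothetical lookup table: one row per year (illustrative values only) */
data YearRates;
  input Year Rate;
  datalines;
1995 1.10
1996 1.07
1997 1.04
1998 1.00
;
run;

data _null_;
  /* Temporary array subscripted directly by year */
  array RateByYear {1995:1998} _temporary_;

  /* Load the table: the Year value is the subscript, with no offset arithmetic */
  do until (LastRow);
    set YearRates end=LastRow;
    /* Skip any year that falls outside the declared bounds */
    if lbound(RateByYear) <= Year <= hbound(RateByYear) then
      RateByYear{Year}=Rate;
  end;

  /* Write the loaded elements to the log */
  do Year=lbound(RateByYear) to hbound(RateByYear);
    Rate=RateByYear{Year};
    put Year= Rate=;
  end;
  stop;
run;
```

The bounds check before the assignment is a design choice in this sketch: a year outside 1995 to 1998 would otherwise cause an array subscript out of range error.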

Disadvantages of the alternate approach:

  • The lower and upper bounds of the array must be specified explicitly. They can be hardcoded or resolved from macro variables.
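
Resolving the bounds from macro variables could be sketched as follows. The macro variable names FirstYear and LastYear and the data set name AllYearsSample are assumptions made for this illustration; the single observation reuses the British Columbia values from the listing output shown earlier.

```sas
/* Hypothetical macro variables holding the array bounds */
%let FirstYear=1995;
%let LastYear=1998;

/* One observation copied from the sample listing, so the sketch runs on its own */
data AllYearsSample;
  length STATE $ 16;
  STATE="British Columbia";
  Sales1995=42788.80;
  Sales1996=46502.40;
  Sales1997=53486.00;
  Sales1998=58128.00;
run;

data details(drop=Sales&FirstYear-Sales&LastYear);
  set AllYearsSample;
  /* Bounds and variable list both resolve from the macro variables */
  array SalesArray {&FirstYear:&LastYear} Sales&FirstYear-Sales&LastYear;
  do Year=lbound(SalesArray) to hbound(SalesArray);
    Value=SalesArray{Year};
    output;
  end;
  format Value dollar12.2;
run;

proc print data=details;
  title1 "Results of Array Processing";
run;
```

With this arrangement, changing the range of years means editing only the two %LET statements, provided the corresponding Sales variables exist in the input data set.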

Additional documentation for this technique can be found in SAS® 9.2 Language Reference: Dictionary. Cary, NC: SAS Institute Inc.

Visit http://support.sas.com/documentation/onlinedoc/sas9doc.html for SAS 9 documentation.

The URL for this page is http://www.repole.com/dinosaur/arraybound.html

To view the test data used in this example, go to http://www.repole.com/dinosaur/summary-data.html


SAS and all other SAS Institute Inc. product or service names are registered trademarks or trademarks of SAS Institute Inc. in the USA and other countries.

® indicates USA registration.