![]() |
Don't Be a SAS® Dinosaur:
Modernize Your SAS Programs
Raw Data: Manipulate Input Data Lines |
A raw data file contains anomalies that hinder your ability to convert the data properly into a SAS data set.
While it is preferable to make corrections to the raw data file itself, you do not have write access to the file.
|
The original approach is to alter the main INPUT statement or modify the values after the main INPUT statement. Download this program
data readEMdash1;
infile "c:\temp\dinosaur\EMdash.dat";
input @1 number1 number2
@1 char1 $ char2 $ ;
if number2=. then do;
char2fixed=translate(char2,"-","97"x,);
number2fixed=input(char2fixed,8.);
end;
else number2fixed=number2;
format char: $hex16.;
run;
proc print data=readEMdash1;
var Number1 Number2 Char2 Char2Fixed Number2Fixed;
title1 "Data containing EM dash character instead of hyphen";
run;
SAS Log
4736 data readEMdash1;
4737 infile "c:\temp\dinosaur\EMdash.dat";
4738 input @1 number1 number2
4739 @1 char1 $ char2 $ ;
4740 if number2=. then do;
4741 char2fixed=translate(char2,"-","97"x,);
4742 number2fixed=input(char2fixed,8.);
4743 end;
4744 else number2fixed=number2;
4745 format char: $hex16.;
4746 run;
NOTE: The infile "c:\temp\dinosaur\EMdash.dat" is:
Filename=c:\temp\dinosaur\EMdash.dat,
RECFM=V,LRECL=256,File Size (bytes)=257,
Last Modified=14May2008:20:54:36,
Create Time=05Aug2007:22:03:48
NOTE: Invalid data for number2 in line 5 14-25.
RULE: ----+----1----+----2----+----3----+----4----+----5----+----6----+----7----+----8--
5 711.35999966 -45.68000031 25
number1=711.35999966 number2=. char1=3731312E33353939 char2=9734352E36383030
char2fixed=2D34352E36383030 number2fixed=-45.68 _ERROR_=1 _N_=5
NOTE: Invalid data for number2 in line 7 14-25.
7 715.19999981 -6.719999313 25
number1=715.19999981 number2=. char1=3731352E31393939 char2=97362E3731393939
char2fixed=2D362E3731393939 number2fixed=-6.71999 _ERROR_=1 _N_=7
NOTE: Invalid data for number2 in line 8 13-23.
8 713.5899992 -1.61000061 23
number1=713.5899992 number2=. char1=3731332E35383939 char2=97312E3631303030
char2fixed=2D312E3631303030 number2fixed=-1.61 _ERROR_=1 _N_=8
NOTE: Invalid data for number2 in line 9 14-25.
9 708.34999943 -5.239999771 25
number1=708.34999943 number2=. char1=3730382E33343939 char2=97352E3233393939
char2fixed=2D352E3233393939 number2fixed=-5.23999 _ERROR_=1 _N_=9
NOTE: 10 records were read from the infile "c:\temp\dinosaur\EMdash.dat".
The minimum record length was 14.
The maximum record length was 25.
NOTE: The data set WORK.READEMDASH1 has 10 observations and 6 variables.
4747 proc print data=readEMdash1;
4748 var Number1 Number2 Char2 Char2Fixed Number2Fixed;
4749 title1 "Data containing EM dash character instead of hyphen";
4750 run;
NOTE: There were 10 observations read from the data set WORK.READEMDASH1.
SAS Listing OutputData containing EM dash character instead of hyphen Obs number1 number2 char2 char2fixed number2fixed 1 740.200 . 2020202020202020 2020202020202020 . 2 747.380 7.1800 372E313739393939 2020202020202020 7.1800 3 750.400 3.0200 332E303230303030 2020202020202020 3.0200 4 757.040 6.6400 362E363430303030 2020202020202020 6.6400 5 711.360 . 9734352E36383030 2D34352E36383030 -45.6800 6 721.920 10.5600 31302E3535393939 2020202020202020 10.5600 7 715.200 . 97362E3731393939 2D362E3731393939 -6.7200 8 713.590 . 97312E3631303030 2D312E3631303030 -1.6100 9 708.350 . 97352E3233393939 2D352E3233393939 -5.2400 10 721.810 13.4600 31332E3436303030 2020202020202020 13.4600 |
![]() |
|
An alternate approach is to alter the input buffer prior to the main INPUT statement. The DATA step variable _INFILE_ contains the current contents of the input buffer. Download this programdata readEMdash2; infile "c:\temp\dinosaur\EMdash.dat"; input @; _infile_=translate(_infile_,"-","97"x,); input @1 number1 number2; run; proc print data=readEMdash2; var Number1 Number2; title1 "Data containing EM dash character instead of hyphen"; run; SAS Log
4796 data readEMdash2;
4797 infile "c:\temp\dinosaur\EMdash.dat";
4798 input @;
4799 _infile_=translate(_infile_,"-","97"x,);
4800 input @1 number1 number2;
4801 run;
NOTE: The infile "c:\temp\dinosaur\EMdash.dat" is:
Filename=c:\temp\dinosaur\EMdash.dat,
RECFM=V,LRECL=256,File Size (bytes)=257,
Last Modified=14May2008:20:54:37,
Create Time=05Aug2007:22:03:48
NOTE: 10 records were read from the infile "c:\temp\dinosaur\EMdash.dat".
The minimum record length was 14.
The maximum record length was 25.
NOTE: The data set WORK.READEMDASH2 has 10 observations and 2 variables.
4802 proc print data=readEMdash2;
4803 var Number1 Number2;
4804 title1 "Data containing EM dash character instead of hyphen";
4805 run;
NOTE: There were 10 observations read from the data set WORK.READEMDASH2.
SAS Listing OutputData containing EM dash character instead of hyphen Obs number1 number2 1 740.200 . 2 747.380 7.1800 3 750.400 3.0200 4 757.040 6.6400 5 711.360 -45.6800 6 721.920 10.5600 7 715.200 -6.7200 8 713.590 -1.6100 9 708.350 -5.2400 10 721.810 13.4600 Advantages
of the
alternate
approach:
|
|
Back to SAS Dinosaur home page Printable copy of this page (without sample output)
SAS and all other SAS Institute Inc. product or service names are registered trademarks
or trademarks of SAS Institute Inc. in the USA and other countries.
|