Question : You have defined a DATA step that reads a raw data file. Each record contains variables, and you will read these 5 variables from each record. At the beginning of the execution phase, you know that the automatic variable _N_ is 1 and _ERROR_ is 0. What are the values of the other 5 variables?
Correct Answer : The 5 variables are initialized to missing values.
Explanation: At the beginning of the execution phase, the file has not yet been read. Hence, the variable values are initialized to missing until a record is read.
Execution phase: the data portion of the data set is created in this phase, in the following steps.
o _N_ variable: Each time the DATA step iterates, _N_ is incremented by 1.
o PDV: Any newly created variable is set to missing in the program data vector (PDV).
o SAS reads a record into the input buffer and then into the PDV (raw data), or reads an observation directly into the PDV (SAS data set).
o The remaining statements in the DATA step are executed, and the PDV is updated accordingly.
o At the end of the DATA step, the entire observation is written to the SAS data set. At the start of the next iteration, only variables that were read with a SET, MERGE, MODIFY, or UPDATE statement are not reset to missing in the PDV.
o The steps above are repeated for each subsequent observation until all observations are processed.
o As soon as the end of the raw data file is reached, the DATA step terminates.
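A minimal sketch of this initialization, assuming inline data and two hypothetical variables (name and score) rather than the five in the question. The first PUT statement runs before any record is read, so on the first iteration the log should show both variables as missing while _N_ is 1 and _ERROR_ is 0.

```sas
/* Hypothetical example: name and score are illustrative variables */
data work.demo;
   /* Runs before INPUT, so the PDV still holds the initial values */
   put 'Before INPUT: ' _all_;
   input name $ score;
   /* Runs after INPUT, so the PDV now holds the values from the record */
   put 'After INPUT:  ' _all_;
   datalines;
Ann 90
Bob 85
;
run;
```

The "Before INPUT" line is written once more than there are records, because the DATA step starts a final iteration and stops when INPUT reaches the end of the data.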
Question : After a SAS program is submitted, the following is written to the SAS log:
What changes should be made to the KEEP statement to correct the errors in the log?
1. keep product sales;
2. keep product, sales;
3. [...]
4. keep = (product sales);
Ans : 1
Exp : In the KEEP statement, variable names are separated by blanks, with no commas, equal sign, or parentheses.
Question : Which SAS statement correctly uses formatted input to read the values in this order: Item (first field), UnitCost (second field), Quantity (third field)?
1. input @1 Item $9. +1 UnitCost comma6. @18 Quantity 3.;
2. input Item $9. @11 UnitCost comma6. @18 Quantity 3.;
3. [...] @18 Quantity 3.;
4. all of the above
Ans : 4
Exp : The default location of the column pointer control is column 1, so a column pointer control is optional for reading the first field. You can use the @n or +n pointer controls to specify the beginning column of the other fields. You can use the $w. informat to read the values for Item, the COMMAw.d informat for UnitCost, and the w.d informat for Quantity.
Question :
Which raw data file requires the PAD option in the INFILE statement in order to correctly read the data using either column input or formatted input?
1. a
2. b
3. c
4. d
Ans : 1
Exp : Use the PAD option in the INFILE statement to read variable-length records that contain fixed-field data. The PAD option pads each record with blanks so that all data lines have the same length.
Column input is useful for reading standard values only.
Column input enables you to read standard data values that are aligned in columns in the data records. Specify the variable name, followed by a dollar sign ($) if it is a character variable, and specify the columns in which the data values are located in each record:

data scores;
   infile datalines truncover;
   input name $ 1-12 score2 17-20 score1 27-30;
   datalines;
Riley           1132       987
Henderson       1015      1102
;

Note: Use the TRUNCOVER option in the INFILE statement to ensure that SAS handles data values of varying lengths appropriately.

To use column input, data values must be:
o in the same field on all the input lines
o in standard numeric or character form

Note: You cannot use an informat with column input.

Features of column input include the following:
o Character values can contain embedded blanks.
o Character values can be from 1 to 32,767 characters long.
o Placeholders, such as a single period (.), are not required for missing data.
o Input values can be read in any order, regardless of their position in the record.
o Values or parts of values can be reread.
o Both leading and trailing blanks within the field are ignored.
o Values do not need to be separated by blanks or other delimiters.
Formatted input combines the flexibility of using informats with many of the features of column input. By using formatted input, you can read nonstandard data for which SAS requires additional instructions. Formatted input is typically used with pointer controls that enable you to control the position of the input pointer in the input buffer when you read data. The INPUT statement in the following DATA step uses formatted input and pointer controls. Note that $12. and COMMA5. are informats and +4 and +6 are column pointer controls.
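The DATA step referred to above is not reproduced in the text. A sketch consistent with the description, assuming the data set name, variable names, and data values shown here (all illustrative): $12. reads columns 1-12, +4 moves the pointer to column 17, COMMA5. reads five columns, and +6 skips to column 28 for the second score.

```sas
/* Illustrative sketch: names and values are assumptions */
data work.scores;
   /* $12. and comma5. are informats; +4 and +6 are column pointer controls */
   input name $12. +4 score1 comma5. +6 score2 comma5.;
   datalines;
Riley           1,132      1,187
Henderson       1,015      1,102
;
run;
```

The COMMA5. informat strips the embedded comma, so score1 and score2 are stored as ordinary numeric values.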
Question : The raw data file referenced by the fileref Employee contains data that is
1. arranged in fixed fields
2. free-format
3. [...]
4. arranged in columns
Ans : 2
Exp : The raw data file contains data that is free-format, meaning that the data is not arranged in columns or fixed fields.
Free-Format Data
External files can contain raw data that is free-format; that is, the data is not arranged in fixed fields. The fields can be separated by blanks, or by some other delimiter, such as commas.
Using List Input
Free-format data can easily be read with list input because you do not need to specify column locations of the data. You simply list the variable names in the same order as the corresponding raw data fields. You must distinguish character variables from numeric variables by using the dollar ($) sign.
When characters other than blanks are used to separate the data values, you can specify the field delimiter by using the DLM= option in the INFILE statement.
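As a small sketch of the DLM= option, assuming inline comma-separated data and hypothetical variable names (name, score):

```sas
/* Hypothetical example: comma-delimited values read with list input */
data work.players;
   /* dlm=',' tells SAS that commas, not blanks, separate the fields */
   infile datalines dlm=',';
   input name $ score;
   datalines;
Riley,1132
Chen,1015
;
run;
```

The names are kept to eight characters or fewer because simple list input gives character variables a default length of 8.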
You can also specify a range of variables in the INPUT statement when the variable values in the raw data file are sequential and are separated by blanks (or by some other delimiter). This is especially useful if your data contains similar variables, such as the answers to a questionnaire.
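A sketch of a variable range in list input, assuming a hypothetical questionnaire with an id and five answers per record:

```sas
/* Hypothetical example: q1-q5 expands to q1 q2 q3 q4 q5 */
data work.survey;
   infile datalines;
   input id q1-q5;
   datalines;
1001 4 5 3 2 4
1002 3 3 4 5 1
;
run;
```

Writing q1-q5 is equivalent to listing the five variables individually, which keeps the INPUT statement short when there are many similar fields.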
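In its simplest form, list input places several limitations on the types of data that can be read: fields must be separated by at least one blank (or other delimiter), character values cannot contain embedded blanks and have a default length of 8, data must be in standard numeric or character form, and missing values must be represented by a placeholder such as a period.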
Reading Missing Values
If your data contains missing values at the end of a record, you can use the INFILE statement with the MISSOVER option to prevent SAS from going to the next record to find the missing values.
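A sketch of MISSOVER, assuming inline blank-delimited data in which some records are short (the variable names are hypothetical):

```sas
/* Hypothetical example: records 2 and 3 have fewer than three readings */
data work.readings;
   /* missover: do not move to the next record to fill t2 or t3 */
   infile datalines missover;
   input id t1 t2 t3;
   datalines;
1 98.6 99.1 98.9
2 97.8 98.2
3 99.0
;
run;
```

With MISSOVER this should produce three observations, with the unread variables set to missing. Without it, SAS would continue onto the following record to find the remaining values.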
If your data contains missing values at the beginning or in the middle of a record, you might be able to use the DSD option in the INFILE statement to correctly read the raw data. The DSD option sets the default delimiter to a comma and treats two consecutive delimiters as a missing value.
If the data uses multiple delimiters or a single delimiter other than a comma, you can use both the DSD option and the DLM= option in the INFILE statement.
The DSD option can also be used to read raw data when there is a missing value at the beginning of a record, as long as a delimiter precedes the first value in the record.
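A sketch of the DSD option, assuming inline comma-separated data with a missing value in the middle of the first record and at the beginning of the second (variable names are hypothetical):

```sas
/* Hypothetical example: consecutive commas and a leading comma mark missing values */
data work.quarters;
   /* dsd: comma is the default delimiter; two delimiters in a row mean a missing value */
   infile datalines dsd;
   input region $ q1 q2 q3;
   datalines;
East,20,,50
,15,25,35
;
run;
```

In the first record q2 should be missing, and in the second record region should be blank because a delimiter precedes the first value.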
Question : Which input style should be used to read the values in the raw data file that is referenced by the fileref Employee?
1. column
2. formatted
3. list
4. mixed
Ans : 3
Exp : List input should be used to read data that is free-format because you do not need to specify the column locations of the data.
Question : Which SAS program was used to create the raw data file hadoopexam from the SAS data set Work.Scores?
1. data _null_; set work.scores; file 'c:\data\hadoopexam' dlm=','; put name highscore team; run;
2. data _null_; set work.scores; file 'c:\data\hadoopexam' dlm=' '; put name highscore team; run;
3. data _null_; set work.scores; file 'c:\data\hadoopexam' dsd; put name highscore team; run;
4. data _null_; set work.scores; file 'c:\data\hadoopexam'; put name highscore team; run;
Ans : 3
Exp : You can use the DSD option in the FILE statement to specify that data values containing commas should be enclosed in quotation marks. The DSD option uses a comma as the delimiter by default.
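A self-contained sketch of writing delimited output with the DSD option in the FILE statement. The data values and the temporary fileref rawout are assumptions for illustration; the question itself writes to 'c:\data\hadoopexam'.

```sas
/* Hypothetical data: one team value contains a comma */
data work.scores;
   length name $ 10 team $ 16;
   name = 'Joe';  highscore = 1132; team = 'Red, Blue'; output;
   name = 'Tina'; highscore = 1015; team = 'Green';     output;
run;

/* Temporary file used only for this illustration */
filename rawout temp;

data _null_;
   set work.scores;
   /* dsd: comma-delimited output; values containing a comma are quoted */
   file rawout dsd;
   put name highscore team;
run;
```

The first record written should have the team value enclosed in quotation marks, because it contains the delimiter.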
SAS does not properly recognize empty values for delimited data unless you use the dsd option. You need to use the dsd option on the infile statement if two consecutive delimiters are used to indicate missing values (e.g., two consecutive commas, two consecutive tabs). Below, we read the exact same file again, except that we use the dsd option.
DATA cars2;
  length make $ 20;
  INFILE 'readdsd.txt' DELIMITER=',' DSD;
  INPUT make mpg weight price;
RUN;

PROC PRINT DATA=cars2;
RUN;
Question :
Which SAS statement reads the raw data values in order and assigns them to the variables shown below? Variables: FirstName (character), LastName (character), Age (numeric), School (character), Class (numeric)
1. input FirstName $ LastName $ Age School $ Class;
2. input FirstName LastName Age School Class;
3. [...] School $ 17-19 Class 21;
4. input FirstName 1-4 LastName 6-12 Age 14-15 School 17-19 Class 21;
Ans : 1
Exp : Because the data is free-format, list input is used to read the values. With list input, you simply name each variable and identify its type.
Question :
Which SAS statement should be used to read the raw data file that is referenced by the fileref Hadoopexamsale?
1. infile hadoopexamsale;
2. infile hadoopexamsale ':';
3. [...]
4. infile hadoopexamsale dlm=':';
Ans : 4
Explanation: The INFILE statement identifies the location of the external data file. The DLM= option specifies the colon (:) as the delimiter that separates each field.

Infile options
For more complicated file layouts, refer to the infile options described below.
DLM= The dlm= option can be used to specify the delimiter that separates the variables in your raw data file. For example, dlm=',' indicates a comma is the delimiter (e.g., a comma separated file, .csv file). Or, dlm='09'x indicates that tabs are used to separate your variables (e.g., a tab separated file).
DSD The dsd option has 2 functions. First, it recognizes two consecutive delimiters as a missing value. For example, if your file contained the line 20,30,,50 SAS will treat this as 20 30 50 but with the dsd option SAS will treat it as 20 30 . 50 , which is probably what you intended. Second, it allows you to include the delimiter within quoted strings. For example, you would want to use the dsd option if you had a comma separated file and your data included values like "George Bush, Jr.". With the dsd option, SAS will recognize that the comma in "George Bush, Jr." is part of the name, and not a separator indicating a new variable.
FIRSTOBS= This option tells SAS on what line you want it to start reading your raw data file. If the first record(s) contains header information such as variable names, then set firstobs=n where n is the record number where the data actually begin. For example, if you are reading a comma separated file or a tab separated file that has the variable names on the first line, then use firstobs=2 to tell SAS to begin reading at the second line (so it will ignore the first line with the names of the variables).
MISSOVER This option prevents SAS from going to a new input line if it does not find values for all of the variables in the current line of data. For example, you may be reading a space delimited file that is supposed to have 10 values per line, but one of the lines has only 9 values. Without the missover option, SAS will look for the 10th value on the next line of data. If your data is supposed to have only one observation for each line of raw data, then this could cause errors throughout the rest of your data file. If you have a raw data file that has one record per line, this option is a prudent way to keep such errors from cascading through the rest of your data file.
OBS= Indicates which line in your raw data file should be treated as the last record to be read by SAS. This is a good option to use for testing your program. For example, you might use obs=100 to just read in the first 100 lines of data while you are testing your program. When you want to read the entire file, you can remove the obs= option entirely.
A typical infile statement for reading a comma delimited file that contains the variable names in the first line of data would be:
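The statement itself is not reproduced in the text. A sketch that combines the options described above, assuming a small comma-separated file with a header row; the temporary fileref demo, the variable names, and the data values are all hypothetical.

```sas
/* Hypothetical setup: write a small CSV with a header row to a temporary file */
filename demo temp;

data _null_;
   file demo;
   put 'make,mpg,weight,price';
   put 'AMC,22,2930,4099';
   put 'Buick,,3350,4816';
run;

/* Read it back: comma-delimited, empty fields kept as missing, header line skipped */
data work.cars;
   infile demo dlm=',' dsd missover firstobs=2;
   length make $ 20;
   input make $ mpg weight price;
run;
```

With a real file, the fileref would be replaced by the quoted path to that file. Adding obs= to the INFILE statement limits how many lines are read while testing.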
Question : Which of the following raw data files can be read by using the MISSOVER option in the INFILE statement? Spaces for missing values are highlighted with gray blocks.
1. 2. 3. 4.
Ans : 1
Exp : You can use the MISSOVER option in the INFILE statement to read the missing values at the end of a record. The MISSOVER option prevents SAS from moving to the next record if values are missing in the current record.
Question : Which SAS program correctly reads the data in the raw data file that is referenced by the fileref Hadoopexam?
1. data perm.contest; infile hadoopexam; input FirstName $ LastName $ Age School $ Class; run;
2. data perm.contest; infile hadoopexam; length LastName $ 11; input FirstName $ lastname $ Age School $ Class; run;
3. data perm.contest; infile hadoopexam; input FirstName $ lastname $ Age School $ Class; length LastName $ 11; run;
4. data perm.contest; infile hadoopexam; input FirstName $ LastName $ 11. Age School $ Class; run;
Ans : 2
Exp : The LENGTH statement extends the length of the character variable LastName so that it is large enough to accommodate the data. Variable attributes such as length are defined the first time a variable is named in a DATA step. The LENGTH statement should precede the INPUT statement so that the correct length is defined. In general, the length of a variable depends on
o whether the variable is numeric or character
o how the variable was created
o whether a LENGTH or ATTRIB statement is present.
Subject to the rules for assigning lengths, lengths that are assigned with the LENGTH statement can be changed in the ATTRIB statement and vice versa.
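A sketch of the same idea with inline data, assuming hypothetical records in which one last name is longer than the default 8 characters:

```sas
/* Hypothetical data: 'Williamson' needs more than the default length of 8 */
data work.contest;
   infile datalines;
   /* LENGTH comes before INPUT so LastName is defined as 11 characters */
   length LastName $ 11;
   input FirstName $ LastName $ Age School $ Class;
   datalines;
Lisa Williamson 14 LHS 9
Matt Fernandez 15 MHS 10
;
run;
```

If the LENGTH statement followed the INPUT statement, LastName would already have been defined with a length of 8 and the longer values would be truncated.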
Question :
Which type of input should be used to read the values in the raw data file that is referenced by the fileref University?
Ans : 4 Exp : Notice that the values for School contain embedded blanks, and the values for Enrolled are nonstandard numeric values. Modified list input can be used to read the values that contain embedded blanks and nonstandard values.
Question : Which SAS statement correctly reads the values for Flavor and Quantity? Make sure the length of each variable can accommodate the values shown.
1. input Flavor & $9. Quantity : comma.;
2. input Flavor & $14. Quantity : comma.;
3. [...]
4. input Flavor $14. Quantity : comma.;
Ans : 2
Exp : The INPUT statement uses list input with format modifiers and informats to read the values for each variable. The ampersand (&) modifier enables you to read character values that contain single embedded blanks. The colon (:) modifier enables you to read nonstandard data values and character values that are longer than eight characters, but which contain no embedded blanks.
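A sketch of modified list input using the statement from the correct answer. The data values are assumptions for illustration; note that each Flavor value is followed by at least two blanks, which is how the ampersand modifier finds the end of the value.

```sas
/* Hypothetical data: two blanks separate Flavor from Quantity */
data work.flavors;
   infile datalines;
   /* & allows single embedded blanks; : applies the COMMA informat to a delimited field */
   input Flavor & $14. Quantity : comma.;
   datalines;
Vanilla Bean  1,250
Chocolate  2,100
Mint Chip  980
;
run;
```

Quantity is stored as a plain numeric value with the comma removed.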
Question : Which SAS statement correctly reads the raw data values in order and assigns them to these corresponding variables: Year (numeric), School (character), Enrolled (numeric)? 1. input Year School & $27. Enrolled : comma.;
2. input Year 1-4 School & $27. Enrolled : comma.;
4. all of the above
Ans : 4
Exp : The values for Year can be read with column, formatted, or list input. However, the values for School and Enrolled are free-format data that contain embedded blanks or nonstandard values. Therefore, these last two variables must be read with modified list input.
Question : You have written a SAS program that reads a raw data file containing records. You know that there could be 5 records with errors when the DATA step runs. What will be the value of the automatic variable _ERROR_ once the DATA step has encountered these errors?
Correct Answer : _ERROR_ is 1.
Explanation: _ERROR_ is a flag, not a counter. It does not matter how many errors occur in your DATA step: as soon as an error is found, the automatic variable _ERROR_ is set to 1. If there is no error, then it remains 0.
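A sketch that makes the flag visible in the log, assuming inline data in which the second record has an invalid numeric value (variable names are hypothetical). SAS resets _ERROR_ to 0 at the start of each iteration, so the value written by PUT describes the current record only.

```sas
/* Hypothetical data: 'abc' is invalid for the numeric variable amount */
data work.check;
   infile datalines;
   input id amount;
   /* Writes the iteration counter and the error flag to the log */
   put _N_= _ERROR_=;
   datalines;
1 100
2 abc
3 300
;
run;
```

The log should show _ERROR_=1 only for the second record, along with an invalid data note, and the value never rises above 1.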