104 Cards in this Set
- Front
- Back
SAS Environment |
- Program Editor - Enhanced Editor - Log - Output - Results - Explorer - Table Editor |
|
Editors |
Program Editor: - Not as easy to use as enhanced editor Enhanced Editor: - Enhances use because text is color coded - Code is written here - Editor must be selected in order to run code |
|
Log |
- Shows the results of execution - Notes appear in blue - Warning messages appear in green - Error messages appear in red - Submitted code appears in black - Used for debugging |
|
Output |
- Displays listing reports
- In SAS 9.3, results are not sent to the Output window by default (the default is an HTML report in the Results Viewer) - Shows output in pages |
|
Results |
- Default window for output
- Displays html report - Shows one long page - Used for navigating previously run results - Has a bookmark for all of your previously executed code |
|
Explorer |
- Navigation tool
- Allows you to navigate between SAS libraries and SAS objects - You can go into folders and look at different data sets - Data sets will show up in table editor window |
|
Table Editor |
- Used to create new data sets
- Also a view table window - Can be used to open an existing data set - Can make modifications to the variable names, data, etc. |
|
SAS File Types |
- .sas
- .log - .lst |
|
SAS Statement |
- Keywords - Semicolon at the end of entire statement |
|
Submitting SAS Programs |
- Easiest with F8 - Can also submit with the running man icon or the menu - If a portion of the code is selected, only that portion is submitted - With no selection, all the code is submitted |
|
DATA Steps
|
- We are reading in the data set and manipulating the data set - Creating new variables, keeping or dropping variables, calculating things, etc. |
|
PROC Steps |
- Utility options - Creating a report, or a subset of data - Any analysis is done here |
|
SAS Libraries |
- Essentially a nickname (libref) for a location on your disk drive - Use a LIBNAME statement to create a library and point it to a folder path - Work is a temporary library - Two-level file name: library.dataset |
|
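A minimal sketch of the library card above. The libref `mydata` and the folder `C:\sasdata` are hypothetical (the folder must already exist); `sashelp.class` is the sample data set that ships with SAS.

```sas
/* Hypothetical libref and folder: adjust the path to a folder that exists */
libname mydata 'C:\sasdata';

/* Permanent copy: stored in the folder behind the libref */
data mydata.class_copy;
   set sashelp.class;
run;

/* Temporary copy: the Work library is cleared when the session ends */
data work.class_copy;
   set sashelp.class;
run;
```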
SAS Data sets |
Two Portions:
Descriptor Portion: - Holds all the general info about your data set: name, creation date, whether it is sorted, variable attributes, etc. Data Portion: - Contains all of the data - Print it with PROC PRINT |
|
PROC Contents |
- Displays the descriptor portion of the data set
|
|
SAS Variables
|
- Numeric: the default variable type; missing values are displayed as periods - Character: must be specified with a $ sign; missing values are displayed as blanks |
|
Titles |
- Used to enhance look of report
- Up to 10 lines each are available for titles and for footnotes - Set a specific line with a TITLEn statement (title1 through title10) - If you change one title line, all higher-numbered title lines are erased - Default title: The SAS System - Clear all titles with the null statement: title; - Can change color, height, justification, font, bold, italics, etc. - The title text shows up everywhere, but these enhancements only show up in the Results Viewer |
|
Footnotes |
- Used to enhance look of report - Show up at the bottom - Up to 10 lines each are available for titles and for footnotes - Default footnote: nothing - Clear all footnotes with the null statement: footnote; - Can change color, height, justification, font, bold, italics, etc. - The footnote text shows up everywhere, but these enhancements only show up in the Results Viewer, not in the Output window |
|
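A short sketch of how title and footnote lines behave, assuming the `sashelp.class` sample data set; the title and footnote text is made up for illustration.

```sas
title1 'Class Roster';
title2 'All Students';
footnote1 'Source: sashelp.class';

proc print data=sashelp.class noobs;
run;

/* Redefining title1 keeps the footnote but erases title2,
   because every higher-numbered title line is cancelled */
title1 'Class Roster (revised)';

proc print data=sashelp.class noobs;
run;

/* Null statements clear all titles and all footnotes */
title;
footnote;
```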
SAS System Options |
- Can change things in output window
- Can change page size, page numbers, date, timestamp, etc. - Set with an OPTIONS statement - OPTIONS statements are additive: everything you specified in an earlier OPTIONS statement stays in effect until you change it, and changing one option does not reset the others - Global statement: can be placed outside or inside a PROC PRINT step |
|
WHERE statement |
- Used to select observations - Can be used with most SAS procedures - Ex: proc print data=data1.stresstest noobs; where MaxHR>=170; run; - General form: WHERE where-expression; |
|
Comparison Operators |
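- Equal to: = or EQ - Not equal to: ^=, ~=, NE (<> also means not equal in a WHERE expression, but in a DATA step expression <> is the MAX operator) - Greater than: > - Less than: < - Greater than or equal to: >= - Less than or equal to: <= - In: where state in ('NC','TX') (character values are case sensitive) |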
|
Logical Operators |
- And: & or AND - Or: | or OR - Not: ^ or ~ or NOT |
|
Special Operators |
- LIKE: % replaces any number of characters, _ replaces one character: where code like 'E_U%'; - BETWEEN ... AND - CONTAINS (or ?) - IS MISSING |
|
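The comparison, logical, and special operators on the cards above can be tried out in WHERE statements. This is only a sketch against the `sashelp.class` sample data set (variables Name, Sex, Age, Height, Weight); the cutoff values are arbitrary.

```sas
/* LIKE with % wildcard, combined with BETWEEN ... AND */
proc print data=sashelp.class noobs;
   where Name like 'J%' and Age between 12 and 14;
run;

/* IN list (character comparison is case sensitive) and a comparison operator */
proc print data=sashelp.class noobs;
   where Sex in ('F') and Height >= 60;
run;

/* CONTAINS (same as ?) together with NOT and IS MISSING */
proc print data=sashelp.class noobs;
   where Name contains 'ar' and not (Weight is missing);
run;
```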
Column Totals |
- Can calculate column totals for numeric variables using sum statement |
|
By grouping |
- Can be used in proc sort to sort the data set - Can be used in proc print to group variable values and print them together - If used in proc print, it must be used on a data set already sorted by that variable - Using sum and by together gives us subtotals for each by group - Ex: proc sort data=data1.admit out=work.admit; by ActLevel; run; proc print data=work.admit; by ActLevel; sum Fee; run; |
|
Page breaks |
- PAGEBY puts each BY group on a separate page - You cannot use PAGEBY on its own; it must be used together with a BY statement and thus on a sorted data set - Ex: proc print data=work.admit; by ActLevel; pageby ActLevel; sum Fee; run; |
|
ID variables |
- ID statement overwrites the observation column and replaces it with whatever variable you specify - If you specify a by variable with the same id variable, you will get the by variable and its value on the upper left corner for each grouping (changes look of report) - Ex: proc print data=work.empdata; by JobCode; id JobCode; sum Salary; run; |
|
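The BY, PAGEBY, ID, and SUM cards fit together in one report. The sketch below assumes the `sashelp.class` sample data set in place of the course data sets.

```sas
/* PROC PRINT with BY requires data sorted by the BY variable */
proc sort data=sashelp.class out=work.class;
   by Sex;
run;

proc print data=work.class;
   by Sex;       /* one section per value of Sex */
   pageby Sex;   /* start each BY group on a new page */
   id Sex;       /* replaces the Obs column; same variable as BY */
   sum Weight;   /* subtotal per BY group plus a grand total */
run;
```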
Column Width |
- Changes column width in output window - If you want width to be consistent across all columns use width=uniform - Ex: proc print data=data1.empdata width=uniform; run; |
|
Number of Observations - N |
- The N option in PROC PRINT displays the number of observations in your report
- Can also specify descriptive text to print with the count (N='text') |
|
PROC Sort |
- Sorts data set
- proc sort data=libref.dataset; - Replaces the original data set by default - You have to specify an OUT= data set for SAS to create a new data set and not replace the old one - You have to have a BY statement - By default sorts in ascending order - The DESCENDING keyword in front of a variable name sorts that variable in descending order - Ex: proc sort data=data1.empdata out=work.jobsal; by Salary; run; |
|
SAS Syntax rules |
- Not case sensitive except in case of strings - You can make everything upper or lower case - SAS is free format (statement can span multiple lines) except in the case of datalines |
|
Data and Program Errors |
Data Errors: - Occur when, for example, character values are read into numeric variables Program Errors: - Occur when part of the code is incorrect - Execution of the program is halted |
|
Sources of SAS Data |
- Data Entry - Existing SAS Datasets - Import - Datalines - Infile |
|
Data Entry |
- Opening table editor and typing in the data |
|
Existing SAS Datasets |
- Use the SET statement within the DATA step to read them in - Ex: data work.bonus; set data1.fltattnd; run; |
|
Import |
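- Uses the Import Wizard (or the PROC IMPORT code it generates) to bring in data from an external program - Ex: PROC IMPORT DATAFILE='X:\PStat130\data1\DallasLA.xls' OUT=WORK.tdfwlax DBMS=XLS REPLACE; SHEET='DFWLAX'; GETNAMES=YES; RUN; - Notice there is no ; until after REPLACE: DATAFILE=, OUT=, DBMS=, and REPLACE are all part of the PROC IMPORT statement |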
|
Datalines |
- With raw data SAS doesn't know what to do, so you have to give it more info
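- Uses data step for raw data - Used for small data with a known structure and format - Ex: data work.sample; input firstname $ gender $ age; datalines; John Male 22 Jane Female 19 ; run; - INPUT specifies the variables - In the program, each data record goes on its own line and the closing semicolon goes on a line by itself |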
|
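Laid out on separate lines, the DATALINES example from the card above looks like the sketch below; the PROC PRINT step is added only to show the result.

```sas
data work.sample;
   input firstname $ gender $ age;   /* $ marks character variables */
   datalines;
John Male 22
Jane Female 19
;
run;

proc print data=work.sample;
run;
```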
Infile |
- Using data step
- Used for raw data (pointing SAS to a raw data file) - Ex: data work.sample; infile 'D:\UCSB\sample.txt'; input name $ gender $ age; run; |
|
Input Statement |
- Used in conjunction with reading in raw data
- You need to tell SAS the variable name, variable type, and attributes - 3 types: list, column, formatted |
|
List input |
- Data is possibly free format
- All data needs to be standard numeric or character input - Every single variable has to be read in sequentially - Variable name followed by $ for character, no $ for numeric - Character values get a default length of 8, so longer values are truncated - Each value is separated by a space, the delimiter - Ex: data work.students; input Name $ Gender $ Age; datalines; David Male 19 Amelia Female 23 Ravi Male 17 Ashley Female 20 Jim Male 26 ; run; |
|
Column Input |
- Data is not free format - Data is within fixed columns - Tell SAS which columns correspond to which variables - With column input you can choose to read in a subset of variables and read the columns in whatever order you want - Must be standard numeric or character - Lets you read in character values longer than 8 - Ex: data work.students; input Name $ 1-6 Gender $ 9-14 Age 18-20; datalines; David Male 19 Amelia Female 23 Ravi Male 17 Ashley Female 20 Jim Male 26 ; run; - In the actual data lines each value must sit in the stated columns |
|
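Column input only works when the values really sit in the stated columns, which the flattened card text cannot show. A sketch with the same data spaced out so that Name falls in columns 1-6, Gender in 9-14, and Age in 18-20:

```sas
data work.students;
   /* Column ranges follow each variable name */
   input Name $ 1-6 Gender $ 9-14 Age 18-20;
   datalines;
David   Male     19
Amelia  Female   23
Ravi    Male     17
Ashley  Female   20
Jim     Male     26
;
run;
```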
Formatted Input |
- Used for nonstandard numeric or character data - Allows you to read in data with symbols, etc. - An informat converts these values into standard character or numeric values (a date is stored as a number) - To print the value in a readable form again, apply a format - Ex: data students; input Name $ Gender $ Age Enroll mmddyy8.; datalines; David Male 19 06/18/10 Amelia Female 23 08/02/10 Ravi Male 17 07/22/10 Ashley Female . 09/14/10 Jim Male 26 08/26/10 ; run; - The date is not in standard numeric form, so the informat mmddyy8. is used |
|
Relative and Absolute Pointer Control |
Column: - Relative: use + sign - Absolute: use @ symbol Line: - Absolute: #n - Relative: / |
|
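A hedged sketch of the column and line pointer controls listed above. The data layout is an assumption chosen for illustration (Name in columns 1-6, Gender in 9-14, Age in 18-19).

```sas
/* Column pointers: @n jumps to an absolute column, +n moves right n columns */
data work.students;
   input @1 Name $6.     /* read columns 1-6 */
         @9 Gender $6.   /* jump to column 9, read columns 9-14 */
         +3 Age 2.;      /* pointer is at 15; skip 3 columns, read 18-19 */
   datalines;
David   Male     19
Amelia  Female   23
;
run;

/* Line pointer: / moves to the next raw data line within one observation */
data work.twolines;
   input Name $ / Age;
   datalines;
David
19
Amelia
23
;
run;
```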
Write a program that permanently changes the variable name EmpID to EID in the data set work.empdata. Do not use a DATA step. |
proc datasets library=work; modify empdata; rename EmpID=EID; run; quit; |
|
Write a program that creates a SAS data set named work.oscars from the worksheet oscars in the Excel file entertainment.xls. This file is located in the folder 'C:\Desktop\data'. Make sure to replace a similarly named file if one exists. Place the title “2014 Oscar Winners” on the top of each page. |
title '2014 Oscar Winners'; proc import out=work.oscars datafile='C:\Desktop\data\entertainment.xls' dbms=xls replace; sheet='oscars'; run; |
|
What does the following SAS code output? proc print; run; |
The last successfully created data set |
|
Which of the following is NOT a valid SAS data set name? |
4thquarter (a SAS name must begin with a letter or an underscore, not a digit) |
|
PROC DATASETS |
- Can be used to permanently modify attributes (name of variable (rename), labels, formats, etc.) |
|
Drop and Keep statements |
- Statements in a DATA step that control which variables are written to the output data set(s)
|
|
Drop= and Keep= |
- Data set options applied to one specific data set, written in parentheses after its name (on an input or an output data set)
|
|
Creating variables |
- State name of variable = .....
- Ex: Tax = salary * .05; |
|
Arithmetic Operators |
- Multiplication: * - Division: / - Addition: + - Subtraction: - - Exponent: ** - Negative: - |
|
SAS Functions |
- Month(SAS-date) |
|
Selecting observations |
-Delete
-Where -If |
|
Datetime values |
-A SAS date value is stored as the number of days between January 1, 1960 and the date; a SAS datetime value is stored as the number of seconds between midnight on January 1, 1960 and the date and time
|
|
Suppressing Obs column |
-Place the NOOBS option in the PROC PRINT statement |
|
Format |
- Goes within PROC PRINT for a temporary format or within a DATA step for a permanent one - format variable formatw.d; where w is total width and d is decimal places - Ex: proc print data=data1.empdata split=' '; format Salary dollar11.2; run; |
|
User defined formats |
Step 1 (Create format): proc format; value $codefmt 'FLTAT'='Flight Attendant' 'PILOT'='Pilot'; run; Step 2 (Apply format): proc print data=data1.empdata; format JobCode $codefmt.; run; |
|
Set |
-Used to append (concatenate) data sets -Whatever order the sets appear in the SET statement is the order they appear in the new data set -Variable names and data types should be the same in all data sets -A variable that exists in only one data set gets missing values for the observations from the other data sets - Ex: data work.qtr1; set work.jan work.feb work.mar; run; This combines the three data sets in the order of jan, feb, mar. |
|
Merge-by
|
- Combines two data sets that share at least one common variable and each have other unique variables - Set A has m records and k unique variables - Set B has n records and j unique variables - The combined set has k+j+1 variables (if there is one common variable) - It has max(m,n) records only when the BY values line up one-to-one; a BY value found in just one set still produces its own record, with missing values for the other set's variables -Records from each set with the same value of the common BY variable are linked and output as one record -If you omit the BY statement, the first record from each data set is output together as one record, then the second, and so on, without being linked by a common variable -Both data sets must be sorted by the BY variable before merging |
|
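A small self-contained sketch of MERGE with BY. The data set names echo the midterm/final example used elsewhere in this set, but the names and scores below are made up. Each input has three records; because Bob appears only in midterm and Dee only in final, the merged data set has four records, with missing values where a score is absent.

```sas
data work.midterm;
   input Name $ Midterm;
   datalines;
Ann 88
Bob 75
Cal 91
;
run;

data work.final;
   input Name $ Final;
   datalines;
Ann 90
Cal 85
Dee 78
;
run;

/* Both inputs must be sorted by the BY variable */
proc sort data=work.midterm; by Name; run;
proc sort data=work.final;   by Name; run;

data work.allscores;
   merge work.midterm work.final;
   by Name;
run;

proc print data=work.allscores;
run;
```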
PROC MEANS
|
-Calculate and display simple summary statistics -Summarizes numeric variables -Count, mean, Standard deviation, min, max -BY and CLASS statements can be used to create summaries for sub-groups -Determine which statistics wanted using options in proc means line -OUTPUT creates output data set containing summary stats |
|
PROC FREQ
|
-Analyzes every variable in the data set by default -Displays each distinct data value -Calculates the number of observations in which each value appears (and the corresponding percentage) -Indicates missing values for each variable -Use the TABLES statement to select variables and options -For a two-way table use variable1*variable2, where variable1 is the row and variable2 is the column |
|
PROC TABULATE
|
-Calculate and display multi-dimensional tables with summary statistics -Able to group in up to three dimensions -Generates frequencies by default for classification variables -Use multiple TABLE statements to create multiple tables -Classification variables can be either character or numeric, but analysis variables need to be numeric -Prints the sum of an analysis variable by default, but can be set to mean, median, etc. -The ALL keyword adds an overall total (sum by default); use all*mean for an overall mean |
|
PROC REPORT
|
-Create listing and summary reports -By default creates a listing report -Use a COLUMN statement instead of a VAR statement -Can specify a format using 'format= ' and a label -Character variables are used as display variables, numeric variables as analysis variables -Can add enhancements like HEADLINE or HEADSKIP -Subtotals and grand totals using BREAK and RBREAK statements |
|
PROC GCHART
|
- HBAR, VBAR, or PIE -Can graph character or numeric variables -Displays frequency by default. If you want other than freq must apply analysis variable (sumvar=) and then specify type=(mean or sum) -Use explode to pop out piece of pie chart |
|
SYMBOLn:
|
-Used within GPLOT to define plotting symbols, draw lines through data points, specify symbol and line color |
|
PROC GPLOT
|
-Used to produce scatterplots and graphs -Use symbol and label to edit plot |
|
ODS: Output Delivery System |
-ODS HTML statement opens, closes, and manages the HTML destination -Works with output from PROC PRINT, MEANS, or FREQ, and with graphs if a GOPTIONS statement is used - Steps are: open the HTML destination for output, generate output, close the destination -Can be used to convert a SAS data set into other file types |
|
OUTPUT Statement
|
-If included in data step SAS writes a record immediately -You can create multiple records from one observation, write to multiple output data sets, combine info from multiple obs into a single record when used with RETAIN |
|
RETAIN Statement
|
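-Keeps a variable's value from one iteration of the DATA step to the next instead of resetting it to missing, so you can use the value from a previous iteration |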
|
First./Last. BY-variables
|
-When you read sorted data into a DATA step and use a BY statement, two temporary variables are created for each BY variable: First.BY-variable is 1 on the first record in the BY group and Last.BY-variable is 1 on the last record in the BY group (otherwise 0) |
|
Accumulating totals for BY groups
|
-Set the accumulator variable to zero at the start of each BY group -Increment it with a sum statement -Output only the last observation of each BY group |
|
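The three steps on the card above can be sketched as follows, assuming the `sashelp.class` sample data set; `TotalWeight` is a made-up accumulator name.

```sas
proc sort data=sashelp.class out=work.class;
   by Sex;
run;

data work.weight_by_sex(keep=Sex TotalWeight);
   set work.class;
   by Sex;                                 /* creates first.Sex and last.Sex */
   if first.Sex then TotalWeight = 0;      /* reset at the start of each group */
   TotalWeight + Weight;                   /* sum statement: retained across rows */
   if last.Sex then output;                /* one record per BY group */
run;

proc print data=work.weight_by_sex;
run;
```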
DROP=, KEEP=
|
data army(keep=Code Airport); set data2.military(drop=City State Country); if Type eq 'Army' then output; run; |
|
SUM Statement |
Variable + Expression; -Creates the variable on the left side if it doesn't exist -Initializes the variable to zero before the first iteration of the Data step -Automatically retains variable |
|
DO loops |
DO-END - Executes statements as a unit, usually as part of IF-THEN/ELSE Iterative DO - Executes a group of statements repetitively based on the value of an index variable DO WHILE - Executes a group of statements as long as the condition stays true; the condition is checked before each loop iteration DO UNTIL - Executes a group of statements until the condition is true; the condition is checked after each loop iteration, so the loop runs at least once |
|
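A brief sketch contrasting the loop types described above. All data set names, variable names, and numbers are invented for illustration.

```sas
/* Iterative DO: one output record per value of the index variable */
data work.squares;
   do i = 1 to 5;
      square = i**2;
      output;
   end;
run;

/* DO WHILE: condition is tested before each pass */
data work.savings;
   balance = 1000;
   years = 0;
   do while (balance < 2000);
      balance = balance * 1.05;
      years = years + 1;
   end;
run;

/* DO UNTIL: condition is tested after each pass, so the body runs
   once here even though n already satisfies the condition */
data work.until_demo;
   n = 10;
   do until (n >= 10);
      n = n + 1;
   end;
run;
```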
Create Custom Reports (DATA _NULL_) |
-Uses a DATA step but does not create a data set -Instead writes output to a file specified in a FILE statement, using PUT statements that control the exact location and format of the output |
|
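A minimal DATA _NULL_ sketch, assuming the `sashelp.class` sample data set. It uses `file print` to send the report to the procedure output destination; a quoted file path could be used instead to write to an external file.

```sas
data _null_;
   set sashelp.class;
   file print;                             /* write to the output destination */
   if _n_ = 1 then
      put @1 'Name' @12 'Age' @18 'Weight';  /* header line, written once */
   put @1 Name $10. @12 Age 3. @18 Weight 6.1;
run;
```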
COLON Modifier |
-Use to read each value only as far as the next delimiter -Allows you to use informats with list input so you can handle nonstandard data values |
|
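A sketch of the colon modifier with inline data. The variable names and values are hypothetical; the point is that the salary values have different widths, so a fixed-width formatted read would not line up, while `: comma10.` reads each value up to the next blank.

```sas
data work.pay;
   input Name $ Hired : mmddyy8. Salary : comma10.;
   format Hired date9. Salary dollar10.;   /* formats only affect display */
   datalines;
David 06/18/10 $1,250
Amelia 08/02/10 $12,500
;
run;

proc print data=work.pay;
run;
```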
INFILE Statement Options |
-Handle non-blank delimiters with DLM='delimiter': infile 'students.txt' dlm=','; -Missing data at end of row: by default SAS loads the next record to finish the observation -The MISSOVER option stops that from happening and SAS sets the remaining variables to missing -The DSD option sets the default delimiter to a comma, treats consecutive delimiters as missing values, and removes quotation marks around values |
|
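The DSD and MISSOVER options can be tried without an external file by pointing INFILE at DATALINES. The records below are made up: the second has two consecutive commas (missing Gender) and the third ends early (missing Age).

```sas
data work.students;
   infile datalines dsd missover;   /* comma-delimited; do not read the next line */
   input Name $ Gender $ Age;
   datalines;
David,Male,19
Amelia,,23
Ravi,Male
;
run;

proc print data=work.students;
run;
```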
Single trailing @ modifier |
-Tells SAS to hold the current input line for further processing -Holds until there is an input statement with no @ or the bottom of the data step |
|
Double trailing @@ modifier |
-Holds the raw data record in the input buffer across iterations of the DATA step until SAS reads past the end of the line, so one line can supply several observations |
|
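A sketch of the double trailing @@ with invented data: the first line holds three name/score pairs and the second holds one, so the step creates four observations.

```sas
data work.scores;
   input Name $ Score @@;   /* keep the line for the next iteration */
   datalines;
Ann 88 Bob 75 Cal 91
Dee 78
;
run;

proc print data=work.scores;
run;
```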
Variable Lists |
-Numbered Range lists - variables start with same name and end with number (var week1-week52) -Name Range lists - variables appear in consecutive order in the set (var mon--sun) -Name Prefix lists- begin with specified character string (of SALES:) -Special SAS name lists-_NUMERIC_ _CHARACTER_ _ALL_ |
|
Index(string,target); |
-Returns the position of the first occurrence of the target within the string INDEX('SMITH-JOHN','-') = 6 -Returns 0 if the target isn't in the string |
|
SUBSTR(string,start <,length>); |
-Extracts a portion of the character variable SUBSTR('PSTAT130',6,3) = '130' SUBSTR('PSTAT130',6) = '130' |
|
SCAN(string, n <, delimiters>); |
-Parses a character string into a set of "words" using a delimiter SCAN('Smith, John', 1) = 'Smith' SCAN('Smith, John', 2) = 'John' |
|
|| |
-Joins two or more strings together 'John' || 'Smith' = 'JohnSmith' |
|
TRIM() |
-Removes any trailing blank spaces TRIM ('JOHN ') = 'JOHN' |
|
ROUND() |
-Rounds to the nearest integer (up or down) -ROUND(12.12) = 12 |
|
CEIL() |
-Rounds up only -CEIL(4.4) = 5 |
|
FLOOR() |
-Rounds down only -FLOOR(3.6) = 3 |
|
INT() |
-Removes any decimals from a number INT(4.8)=4 |
|
INPUT(source,informat): |
-Uses a SAS informat to convert a character string into a number: input(CVar4,mmddyy6.); |
|
PUT(source,format): |
-Converts number into character string AreaCode=805 Put(Areacode,3.) = '805' |
|
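The function cards above can be checked in one DATA _NULL_ step that writes the results to the log. This is a sketch using the literal values from the cards; the variable names are arbitrary.

```sas
data _null_;
   pos   = index('SMITH-JOHN', '-');        /* 6 */
   part  = substr('PSTAT130', 6, 3);        /* '130' */
   word1 = scan('Smith, John', 1);          /* 'Smith' */
   word2 = scan('Smith, John', 2);          /* 'John' */
   full  = trim('JOHN   ') || 'SMITH';      /* 'JOHNSMITH' */
   r     = round(12.12);                    /* 12 */
   c     = ceil(4.4);                       /* 5 */
   f     = floor(3.6);                      /* 3 */
   i     = int(4.8);                        /* 4 */
   d     = input('061810', mmddyy6.);       /* character to SAS date value */
   area  = put(805, 3.);                    /* numeric to character '805' */
   put pos= part= word1= word2= full=;
   put r= c= f= i= area=;
   put d= date9.;                           /* show the date with a format */
run;
```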
SET code |
Data work.qtr1; set work.jan work.feb work.mar; run; This combines the three data sets in the order of jan, feb, mar. |
|
MERGE-BY Code |
data allscores; merge midterm final; by name; run; |
|
RENAME= |
-When appending data sets use to create common variable names -When merging data sets use to create unique variable names data allsections; set morning afternoon(RENAME=(testscore=score)); run; |
|
PROC MEANS Code |
proc means data=data1.admit n mean stddev maxdec=2; var age height weight; run; |
|
PROC FREQ Code |
PROC FREQ data=sas-data-set; TABLES variable-list / options; run; |
|
PROC TABULATE code |
PROC TABULATE DATA=sas-data-set <options>; CLASS class-variables; VAR analysis-variables; TABLE <<page-expression,> row-expression,> column-expression </ options>; run; Every variable used in the TABLE statement needs to be listed in either CLASS or VAR (not both) |
|
PROC REPORT code |
proc report data=data1.crew nowd; column JobCode Location Salary; define JobCode / order width=8 'Job Code'; define Location / 'Home Base'; define Salary / format=dollar10.; run; -The ORDER keyword identifies the variable used to order the report |
|
PROC GCHART Code |
proc gchart data=data1.crew; vbar JobCode / sumvar=Salary type=mean; run; quit; Displays the average salary for each JobCode |
|
PROC GPLOT Code |
proc gplot data=data1.admit; plot weight*height / regeqn; symbol v=dot i=rlCLM95; run; quit; regeqn gives the regression equation; rlCLM95 gives the regression line with 95% confidence limits for the mean |
|
ODS Code |
ods html file='Salary.xls';
proc print data=data1.empdata label noobs; label Salary='Annual Salary'; title1 'Salary Report'; run; ods html close; Writes an HTML file with an .xls extension that Excel can open |
|
Do Code |
DATA odd; do i= 1 to 100 by 2; output; end; run; |
|
Colon Modifier Code |
data airplanes; infile 'airdata.txt'; input ID $ InService : date9. PassCap CargoCap; run; |