STAT 425 UCLA Reading Raw Data and Manipulating Data Questions

6 questions by using SAS

Part I-Reading SAS data sets

1.    Reading a Space-Delimited Raw Data File

a.    Write a DATA step to create a new data set named work.qtrdonation. Read the space-delimited raw data file, which can be named as follows:

Windows

“&pathdonation.dat”

UNIX

“&path/donation.dat”

z/OS (OS/390)

“&path..rawdata(donation)”

Partial Raw Data File

120265 . . . 25

120267 15 15 15 15

120269 20 20 20 20

120270 20 10 5 .

120271 20 20 20 20

b.    Read the following fields from the raw data file:

Name

Type

Length

IDNum

Character

6

Qtr1

Numeric

8

Qtr2

Numeric

8

Qtr3

Numeric

8

Qtr4

Numeric

8

c.    Write a PROC PRINT step to create the report below. The results contain 124 observations.

Partial PROC PRINT Output

ObsIDNum     Qtr1    Qtr2    Qtr3    Qtr4

1    120265      .       .       .      25

                    2    120267     15      15      15      15

                    3    120269     20      20      20      20

                    4    120270     20      10       5      .

                    5    120271     20      20      20      20

2.    Reading a Delimited Raw Data File with Nonstandard Data Values

a.    Write a DATA step to create a temporary data set, prices. Read the delimited raw data file named as follows:

Windows

“&pathpricing.dat”

UNIX

“&path/ pricing.dat”

z/OS (OS/390)

“&path..rawdata(pricing)”

All data fields are numeric.

Partial Raw Data File

210200100009*09JUN2011*31DEC9999*$15.50*$34.70

210200100017*24JAN2011*31DEC9999*$17.80*22.80

210200200023*04JUL2011*31DEC9999*$8.25*$19.80

210200600067*27OCT2011*31DEC9999*$28.90*47.00

210200600085*28AUG2011*31DEC9999*$17.85*$39.40

b.    Generate the report below. The results should contain 16 observations.

Partial PROC PRINT Output

2011 Pricing

                                                                 Sales

Obs ProductID   StartDate       EndDate        Cost       Price

1    210200100009    06/09/2011    12/31/9999       15.50       34.70

 2    210200100017    01/24/2011    12/31/9999       17.80       22.80

3    210200200023    07/04/2011    12/31/9999        8.25       19.80

 4    210200600067    10/27/2011    12/31/9999       28.90       47.00

5    210200600085    08/28/2011    12/31/9999       17.85       39.40

3.    Reading a Delimited File with Missing Values

a.    Write a DATA step to create a temporary data set, prices. Use the asterisk-delimited raw data file, which can be named as follows:

Windows

“&pathprices.dat”

UNIX

“&path/prices.dat”

z/OS (OS/390)

“&path..rawdata(prices)”

Partial Raw Data File

210200100009*09JUN2007*31DEC9999*$15.50*$34.70

210200100017*24JAN2007*31DEC9999*$17.80

210200200023*04JUL2007*31DEC9999*$8.25*$19.80

210200600067*27OCT2007*31DEC9999*$28.90

210200600085*28AUG2007*31DEC9999*$17.85*$39.40

There might be missing data at the end of some records. Read the following fields from the raw data file:

Name

Type

Length

ProductID

Numeric

8

StartDate

Numeric

8

EndDate

Numeric

8

UnitCostPrice

Numeric

8

UnitSalesPrice

Numeric

8

b.    Define labels and formats in the DATA step to create a data set that generates the following output when they are used in the PROC PRINT step. The results should contain 259 observations.

Partial PROC PRINT Output

2007 Prices

                                                                    Sales

                         Start of       End of      Cost Price    Price per

Obs     Product ID     Date Range    Date Range     per Unit       Unit

   1    210200100009    06/09/2007    12/31/9999        15.50        34.70

   2    210200100017    01/24/2007    12/31/9999        17.80          .

   3    210200200023    07/04/2007    12/31/9999         8.25        19.80

   4    210200600067    10/27/2007    12/31/9999        28.90          .

5    210200600085    08/28/2007    12/31/9999        17.85        39.40

Part II-Manipulating data

4.    Creating New Variables

a.    Write a DATA step that reads orion.customer to create work.birthday.

b.    In the DATA step, create three new variables: Bday2012, BdayDOW2012, and Age2012.

Bday2012 is the combination of the month of Birth_Date, the day of Birth_Date, and the constant of 2012 in the MDY function.

BdayDOW2012 is the day of the week of Bday2012.

Age2012 is the age of the customer in 2012. Subtract Birth_Date from Bday2012 and divide the result by 365.25.

c.    Include only the following variables in the new data set: Customer_Name, Birth_Date, Bday2012, BdayDOW2012, and Age2012.

d.    Format Bday2012 to appear in the form 01Jan2012. Age2012 should be formatted to appear
with no decimal places.

e.    Write a PROC PRINT step to create the report below. The results should contain 77 observations.

Partial PROC PRINT Output 

Birth_               Bday

ObsCustomer_Name Date     Bday2012    DOW2012    Age2012

1    James Kvarniq 27JUN1978    27JUN2012       4          34

2    Sandrina Stephano       09JUL1983    09JUL2012       2          29

3    Cornelia Krahl 27FEB1978    27FEB2012       2          34

4    Karen Ballinger       18OCT1988    18OCT2012       5          24

5    Elke Wallstab16AUG1978    16AUG2012       5          34 

5.    Creating Multiple Variables in Conditional Processing

a.    Write a DATA step that reads orion.customer_dim to create work.season.

b.    Create two new variables: Promo and Promo2.

The value of Promo is based on the quarter in which the customer was born.

• If the customer was born in the first quarter, then Promo is equal to Winter.

• If the customer was born in the second quarter, then Promo is equal to Spring.

• If the customer was born in the third quarter, then Promo is equal to Summer.

• If the customer was born in the fourth quarter, then Promo is equal to Fall.

The value of Promo2 is based on the customer’s age.

• For young adults, whose age is between 18 and 25, set Promo2 equal to YA.

• For seniors, aged 65 or older, set Promo2 equal to Senior.

Promo2 should have a missing value for all other customers.

c.    The new data set should include only Customer_FirstName, Customer_LastName,Customer_BirthDate, Customer_Age, Promo, and Promo2.

d.    Create the report below. The results should include 77 observations.

Partial PROC PRINT Output

      Customer_      Customer_      Customer_              Customer_

Obs    FirstName      LastNameBirthDate    Promo        Age       Promo2 

 1    James          Kvarniq        27JUN1978    Spring        33

 2    Sandrina       Stephano       09JUL1983    Summer        28

 3    Cornelia       Krahl          27FEB1978    Winter        33

 4    Karen          Ballinger      18OCT1988    Fall          23       YA

 5    Elke           Wallstab       16AUG1978    Summer        33

 6    David          Black          12APR1973    Spring        38

 7    Markus         Sepke          21JUL1992    Summer        19       YA

 8    Ulrich         Heyde          16JAN1943    Winter        68       Senior

6.    Creating Variables Unconditionally and Conditionally

a.    Write a DATA step that reads orion.orders to create work.ordertype.

b.    Create a new variable, DayOfWeek, that is equal to the weekday of Order_Date.

c.    Create the new variable Type, which is equal to the following:

Retail Sale if Order_Type is equal to 1

Catalog Sale if Order_Type is equal to 2

Internet Sale if Order_Type is equal to 3.

d.    Create the new variable SaleAds, which is equal to the following:

Mail if Order_Type is equal to 2

Email if Order_Type is equal to 3.

e.    Do not include Order_Type, Employee_ID, and Customer_ID in the new data set.

f.    Create the report below. The results should contain 490 observations.

Partial PROC PRINT Output

                                                                          Day

                        Order_    Delivery_                     Sale       Of

ObsOrder_ID         Date      Date           Type          Ads      Week

1      1230058123    11JAN2007    11JAN2007    Retail Sale                 5

2      1230080101    15JAN2007    19JAN2007    Catalog Sale     Mail       2

3      1230106883    20JAN2007    22JAN2007    Catalog Sale     Mail       7

4      1230147441    28JAN2007    28JAN2007    Retail Sale                 1

5      1230315085    27FEB2007    27FEB2007    Retail Sale                 3

Solution:

15% off for this assignment.

Our Prices Start at $11.99. As Our First Client, Use Coupon Code GET15 to claim 15% Discount This Month!!

Why US?

100% Confidentiality

Information about customers is confidential and never disclosed to third parties.

Timely Delivery

No missed deadlines – 97% of assignments are completed in time.

Original Writing

We complete all papers from scratch. You can get a plagiarism report.

Money Back

If you are convinced that our writer has not followed your requirements, feel free to ask for a refund.

× How can I help you?