NYCHA Resident Data Book Summary

Statistics Project Instructions:

Project instruction: Choose a dataset from the NYC Open Data website (https://opendata(dot)cityofnewyork(dot)us/data/) and write a report on a hypothesis test using the dataset. Your report should introduce the dataset and state your objective. The hypothesis test should be complete (with all four steps). The report should be written and submitted as a Word document. You may use Excel for your work (include screenshots of the Excel output). The project is due Dec 4, 11:59 pm EST on Blackboard. Attached is a template project you can refer to when writing your own report (Note: you may need to clean and filter the selected dataset before it can be used for analysis).

Project template

Title: Comparison of the average number of employees between the finance and retail businesses in New York City

The dataset I chose is "NYC Business Acceleration Businesses Served and Jobs Created". The dataset lists the number of businesses that NYC Business Acceleration has assisted in opening and how many jobs were created by those businesses. Each row in the original dataset represents one business. My study goal is to analyze how two different business sectors compare in their average number of jobs created.

First, I selected the business sectors "Finance and Insurance" and "Retail Trade". I then deleted the businesses that did not report the "number of employees". I rearranged the selected data into two columns: one column (named "Finance and Insurance") listing the number of employees that are in the finance and insurance businesses (sample size is 17); and the other column (named "Retail Trade") listing the number of employees that are in the retail trade businesses (sample size is 125). Below is part of the filtered dataset.

Finance and Insurance	Retail Trade
12	25
12	25
14	3
14	5
7	1
14	1
14	350
15	125
14	5
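The filtering step described above (keep two sectors, drop businesses with no reported employee count, and split the counts into one list per sector) can be sketched in Python. This is only an illustration: the column names "Sector" and "Jobs Created", the helper name split_by_sector, and the few rows in RAW are assumptions made for the example, not the real dataset export.

```python
import csv
import io

# Hypothetical extract: the column names "Sector" and "Jobs Created" and the
# rows below are illustrative assumptions, not the real dataset export.
RAW = """Sector,Jobs Created
Finance and Insurance,12
Retail Trade,25
Retail Trade,
Finance and Insurance,14
Accommodation and Food Services,40
Retail Trade,3
"""


def split_by_sector(text, sectors):
    """Return {sector: [employee counts]} for the chosen sectors.

    Rows from other sectors and rows with a blank employee count are skipped.
    """
    groups = {sector: [] for sector in sectors}
    for row in csv.DictReader(io.StringIO(text)):
        sector = row["Sector"]
        jobs = row["Jobs Created"].strip()
        if sector in groups and jobs:
            groups[sector].append(int(jobs))
    return groups


groups = split_by_sector(RAW, ["Finance and Insurance", "Retail Trade"])
for name, values in groups.items():
    # Print the sector, its sample size, and its employee counts.
    print(name, len(values), values)
```

With the real export, the two resulting lists would correspond to the two columns of the filtered table.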

Since the two columns list different companies from different sectors, I assume that they are independent samples. For businesses that are in Finance and Insurance, the sample size is relatively small (<30), so I assume the number of employees in all Finance and Insurance businesses in NYC is normally distributed. Finally, since I only have sample data, the population standard deviation of the number of employees in each sector is unknown. Under these assumptions, I set up the hypotheses as H0: μ1 = μ2 versus Ha: μ1 ≠ μ2, where μ1 denotes the average number of employees in the Finance and Insurance businesses while μ2 denotes the average number of employees in the Retail Trade businesses.

Next, I set the significance level (Type I error limit) at α = 0.05.

Then, using Excel (Data---Data Analysis---t-Test: Two-Sample Assuming Unequal Variances) and keeping the alpha at 0.05, I obtained the following output:

t-Test: Two-Sample Assuming Unequal Variances

	Variable 1	Variable 2
Mean	13.88235	64.296
Variance	4.110294	10715.24
Observations	17	125
Hypothesized Mean Difference	0
df	125
t Stat	-5.43739
P(T<=t) one-tail	1.36E-07
t Critical one-tail	1.657135
P(T<=t) two-tail	2.73E-07
t Critical two-tail	1.979124

The t statistic is -5.44. Since this is a two-tailed test, the p-value equals 0.000000273, which is much less than α = 0.05. Therefore, we reject the null hypothesis and conclude that the Finance and Insurance and Retail Trade businesses have different average numbers of employees in NYC. This is probably because a few Retail Trade businesses have extremely large numbers of employees, which inflates the Retail Trade mean. It would be worth investigating what these businesses are.
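As a cross-check on the Excel output, the t statistic and degrees of freedom for an unequal-variances (Welch) test can be recomputed from the summary statistics alone. The sketch below assumes the means, variances, and sample sizes reported in the table; the function name welch_t is made up for this example, and the p-value is not computed because that would need a t-distribution routine outside the standard library.

```python
import math


def welch_t(mean1, var1, n1, mean2, var2, n2):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    se1 = var1 / n1  # squared standard error contribution of sample 1
    se2 = var2 / n2  # squared standard error contribution of sample 2
    t = (mean1 - mean2) / math.sqrt(se1 + se2)
    df = (se1 + se2) ** 2 / (se1 ** 2 / (n1 - 1) + se2 ** 2 / (n2 - 1))
    return t, df


# Summary statistics copied from the Excel output
# (Variable 1 = Finance and Insurance, Variable 2 = Retail Trade).
t_stat, df = welch_t(13.88235, 4.110294, 17, 64.296, 10715.24, 125)
print(round(t_stat, 5), round(df))
```

The result should agree with the Excel values of roughly t = -5.44 and df = 125 (Excel reports the degrees of freedom rounded to a whole number).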


Statistics Project Sample Content Preview:

Name Course Instructor Date
Title: Relationship between average total gross income and the percentage of heads of household (HOH) aged 62 years and over
The dataset is "NYCHA Resident Data Book Summary", published on NYC Open Data using data from the New York City Housing Authority (NYCHA). It contains resident demographic data together with housing and development data. The variables chosen are "All Average Total Gross Income" and "Total HOH 62 Years and Over as Percent of Families" (sample size is 33) (NYC Open Data).
I found the dataset by browsing the datasets by agency, selecting the New York City Housing Authority (NYCHA), and then opening the NYCHA Resident Data Book Summary. Then, I filtered the dataset to include two columns: "All Average Total Gross Income" and "Total HOH 62 Years and Over as Percent of Families". Below is the dataset.
Hypothesis
* H0: There is no linear relationship between average total gross income and the percentage of heads of household (HOH) aged 62 years and over (regression slope = 0)
* Ha: There is a linear relationship between average total gross income and the percentage of heads of household (HOH) aged 62 years and over (regression slope ≠ 0)
The level of significance is set at 5%.
Then, using Excel (ANOVA and regression) and keeping the alpha at 0.05, Excel shows the following output:
ANOVA

	df	SS	MS	F	Significance F
Regression	1	272126583.1	2.72E+8	271.013627	7.08723E-17
Residual	31	31127305.4	1004107
Total	32	303253889

	Coefficients	Standard Error	t Stat	P-value	Lower 95%
Intercept	15822.77416
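The arithmetic behind the ANOVA table can be checked with a few lines of Python. This sketch assumes the regression sum of squares is 272,126,583.1, which is the value consistent with Total SS minus Residual SS and with the reported mean square of 2.72E+8. The function name anova_summary is made up for this example, and R squared is an extra derived quantity that does not appear in the preview.

```python
def anova_summary(ss_reg, ss_res, df_reg, df_res):
    """Mean squares, F statistic, and R squared from ANOVA sums of squares."""
    ms_reg = ss_reg / df_reg            # regression mean square
    ms_res = ss_res / df_res            # residual mean square
    f_stat = ms_reg / ms_res            # F = MS regression / MS residual
    r2 = ss_reg / (ss_reg + ss_res)     # share of variation explained
    return ms_reg, ms_res, f_stat, r2


# Sums of squares and degrees of freedom taken from the ANOVA output
# (regression SS read as 272126583.1; see the assumption stated above).
ms_reg, ms_res, f_stat, r2 = anova_summary(272126583.1, 31127305.4, 1, 31)
print(round(ms_res), round(f_stat, 3), round(r2, 3))
```

The F statistic computed this way should agree with the reported value of about 271.01. Because the Significance F of 7.08723E-17 is far below 0.05, the null hypothesis of no linear relationship would be rejected at the 5% level.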

Updated on January 26, 2024
