Frequency Distribution for Selected Variables
I used Python and the pandas library to run a frequency distribution analysis on selected variables from the NESARC dataset. Below is the output of the Python code:
Total rows in NESARC dataset: 43093
Total columns in NESARC dataset: 3010
Counts for duration of alcohol abuse S2AQ20
1 8960
-1 8266
2 4467
3 2780
10 2729
5 2463
4 1855
20 1618
99 1450
15 997
6 833
30 765
7 621
8 585
25 493
40 416
12 407
9 250
50 229
18 202
11 200
14 190
13 171
17 154
22 145
16 143
35 137
21 132
19 117
23 106
24 103
28 99
27 77
26 72
60 68
45 67
32 58
37 47
34 44
33 43
36 41
31 38
38 36
29 35
47 32
48 29
42 28
55 28
44 25
43 23
46 22
41 20
39 18
52 17
53 16
58 14
57 12
54 12
49 12
56 11
65 11
70 10
51 9
59 6
68 5
75 4
61 4
63 3
71 3
62 2
69 2
67 1
80 1
64 1
77 1
72 1
66 1
Name: S2AQ20, dtype: int64
Percentages for duration of alcohol abuse S2AQ20
1 0.207922
-1 0.191818
2 0.103660
3 0.064512
10 0.063328
5 0.057155
4 0.043046
20 0.037547
99 0.033648
15 0.023136
6 0.019330
30 0.017752
7 0.014411
8 0.013575
25 0.011440
40 0.009654
12 0.009445
9 0.005801
50 0.005314
18 0.004688
11 0.004641
14 0.004409
13 0.003968
17 0.003574
22 0.003365
16 0.003318
35 0.003179
21 0.003063
19 0.002715
23 0.002460
24 0.002390
28 0.002297
27 0.001787
26 0.001671
60 0.001578
45 0.001555
32 0.001346
37 0.001091
34 0.001021
33 0.000998
36 0.000951
31 0.000882
38 0.000835
29 0.000812
47 0.000743
48 0.000673
42 0.000650
55 0.000650
44 0.000580
43 0.000534
46 0.000511
41 0.000464
39 0.000418
52 0.000394
53 0.000371
58 0.000325
57 0.000278
54 0.000278
49 0.000278
56 0.000255
65 0.000255
70 0.000232
51 0.000209
59 0.000139
68 0.000116
75 0.000093
61 0.000093
63 0.000070
71 0.000070
62 0.000046
69 0.000046
67 0.000023
80 0.000023
64 0.000023
77 0.000023
72 0.000023
66 0.000023
Name: S2AQ20, dtype: float64
Counts for how often abused alcohol: S2AQ21A
-1 8266
10 5215
6 3309
8 1367
5 4600
3 4040
4 4277
1 4167
9 2759
7 2502
2 2095
99 496
Name: S2AQ21A, dtype: int64
Percentages for how often abused alcohol: S2AQ21A
-1 0.191818
10 0.121017
6 0.076787
8 0.031722
5 0.106746
3 0.093751
4 0.099250
1 0.096698
9 0.064024
7 0.058060
2 0.048616
99 0.011510
Name: S2AQ21A, dtype: float64
Counts for how often drank 5+ drinks: S2AQ22
-1 8266
11 20698
9 968
8 532
4 1856
5 1908
1 2090
10 1330
6 1208
3 1764
7 1018
2 957
99 498
Name: S2AQ22, dtype: int64
Percentages for how often drank 5+ drinks: S2AQ22
-1 0.191818
11 0.480310
9 0.022463
8 0.012345
4 0.043070
5 0.044276
1 0.048500
10 0.030863
6 0.028032
3 0.040935
7 0.023623
2 0.022208
99 0.011556
Name: S2AQ22, dtype: float64
Counts for type of alcohol abused: S2AQ23
-1 8266
2 12351
4 6248
1 1802
3 3681
9 10745
Name: S2AQ23, dtype: int64
Percentages for type of alcohol abused: S2AQ23
-1 0.191818
2 0.286613
4 0.144989
1 0.041817
3 0.085420
9 0.249344
Name: S2AQ23, dtype: float64
Counts of people having hardening of arteries in last 12 months: S13Q6A1
2 40917
1 911
9 1265
Name: S13Q6A1, dtype: int64
Percentages of people having hardening of arteries in last 12 months: S13Q6A1
2 0.949505
1 0.021140
9 0.029355
Name: S13Q6A1, dtype: float64
Counts of people having high blood pressure in last 12 months: S13Q6A2
2 32828
1 9136
9 1129
Name: S13Q6A2, dtype: int64
Percentages of people having high blood pressure in last 12 months: S13Q6A2
2 0.761794
1 0.212007
9 0.026199
Name: S13Q6A2, dtype: float64
Counts of people having heart attack in last 12 months: S13Q6A7
2 41557
1 470
9 1066
Name: S13Q6A7, dtype: int64
Percentages of people having heart attack in last 12 months: S13Q6A7
2 0.964356
1 0.010907
9 0.024737
Name: S13Q6A7, dtype: float64
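The script that produced the output above is not shown here. As a minimal sketch of how counts and proportions like these could be obtained with pandas, the example below uses a tiny made-up DataFrame in place of the NESARC file. The sample values and the helper name `frequency_table` are assumptions for illustration, not the original code.

```python
import pandas as pd

# Tiny made-up sample standing in for the NESARC data; the values are
# illustrative only and are not taken from the survey.
data = pd.DataFrame({"S13Q6A2": [2, 2, 1, 2, 9, 1, 2, 2]})


def frequency_table(df, column):
    """Return (counts, proportions) for one column, most frequent value first."""
    counts = df[column].value_counts(sort=True)
    # normalize=True divides each count by the number of non-missing rows.
    proportions = df[column].value_counts(sort=True, normalize=True)
    return counts, proportions


counts, proportions = frequency_table(data, "S13Q6A2")

print("Total rows:", len(data))
print("Total columns:", len(data.columns))
print("Counts for S13Q6A2")
print(counts)
print("Proportions for S13Q6A2")
print(proportions)
```

Note that `normalize=True` returns proportions between 0 and 1, which is why the "Percentages" tables above show values such as 0.212007 rather than 21.20.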
Inferences from the frequency distribution:
The following inferences can be drawn from the frequency distribution of the selected variables:
- Hardening of arteries: Around 2.11% of the surveyed people had hardening of the arteries in the last 12 months, around 94.95% did not, and around 2.94% were not sure.
- High blood pressure: Around 21.20% had high blood pressure in the last 12 months, around 76.18% did not, and around 2.62% were not sure.
- Heart attack: Around 1.09% had a heart attack in the last 12 months, around 96.44% did not, and around 2.47% were not sure.
- Major type of alcohol drunk during the period of alcohol abuse: Around 4.18% drank coolers, around 28.66% drank beer, around 8.54% drank wine, and around 14.50% drank liquor. The rest (codes 9 and -1, about 44.12% combined) were either not sure or did not drink.
- Duration of alcohol abuse: Around 20.79% abused alcohol for 1 year, around 10.37% for 2 years, and so on up to 80 years, as shown in the frequency distribution output above. -1 or 99 indicates invalid, not applicable, or unknown entries.
- How often drank any alcohol during the period of abuse: 9.67% drank every day, 4.86% nearly every day, 9.38% 3 to 4 times a week, 9.93% 2 times a week, 10.67% once a week, 7.68% 2 to 3 times a month, 5.81% once a month, 3.17% 7 to 11 times a year, 6.40% 3 to 6 times a year, and 12.10% once or twice a year. -1 or 99 indicates invalid or unknown entries.
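The percentages quoted in the list above are the proportions from the output multiplied by 100 and rounded to two decimal places, with all 43093 respondents as the denominator. A short sketch of that arithmetic, using the S13Q6A1 counts copied from the output; the labels follow the interpretation used in the list (1 = yes, 2 = no, 9 = not sure):

```python
# Counts for S13Q6A1 (hardening of arteries) copied from the output above.
counts = {1: 911, 2: 40917, 9: 1265}

# Labels follow the interpretation used in the list above.
labels = {1: "yes", 2: "no", 9: "not sure"}

# The denominator is every respondent, including the "not sure" answers.
total = sum(counts.values())

# Convert each count to a percentage rounded to two decimal places.
percentages = {
    labels[code]: round(100 * n / total, 2) for code, n in counts.items()
}

print("Total respondents:", total)
for label, pct in percentages.items():
    print(f"{label}: {pct:.2f}%")
```

Because the denominator includes the invalid and unknown codes, the shares would be somewhat higher if those rows were excluded before computing the percentages.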