Dataset statistics
Number of variables | 40 |
---|---|
Number of observations | 98052 |
Missing cells | 0 |
Missing cells (%) | 0.0% |
Duplicate rows | 0 |
Duplicate rows (%) | 0.0% |
Total size in memory | 11.6 MiB |
Average record size in memory | 124.0 B |
Variable types
Numeric | 10 |
---|---|
Boolean | 20 |
Categorical | 10 |
number_emergency is highly skewed (γ1 = 22.71023391) | Skewed |
df_index has unique values | Unique |
num_procedures has 44574 (45.5%) zeros | Zeros |
number_outpatient has 81679 (83.3%) zeros | Zeros |
number_emergency has 86845 (88.6%) zeros | Zeros |
number_inpatient has 64633 (65.9%) zeros | Zeros |
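The zero-count alerts above follow directly from the reported counts and the 98052 observations. A minimal sketch in plain Python that recomputes the percentages; the names `ZERO_COUNTS` and `zero_share` are illustrative and not part of pandas-profiling:

```python
# Zero counts copied from the alerts above; N_OBS is the reported number of observations.
N_OBS = 98052
ZERO_COUNTS = {
    "num_procedures": 44574,
    "number_outpatient": 81679,
    "number_emergency": 86845,
    "number_inpatient": 64633,
}


def zero_share(count, n_obs=N_OBS):
    """Percentage of observations equal to zero, rounded to one decimal place."""
    return round(100 * count / n_obs, 1)


zero_shares = {name: zero_share(count) for name, count in ZERO_COUNTS.items()}
for name, share in zero_shares.items():
    print(f"{name}: {share}% zeros")
```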
Reproduction
Analysis started | 2021-05-05 21:26:31.501435 |
---|---|
Analysis finished | 2021-05-05 21:27:10.019817 |
Duration | 38.52 seconds |
Software version | pandas-profiling v2.11.0 |
Download configuration | config.yaml |
df_index
Real number (ℝ≥0)
Distinct | 98052 |
---|---|
Distinct (%) | 100.0% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 51115.77261 |
---|---|
Minimum | 1 |
Maximum | 101765 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 5180.55 |
Q1 | 25574.75 |
median | 51369.5 |
Q3 | 76379.25 |
95-th percentile | 96683.45 |
Maximum | 101765 |
Range | 101764 |
Interquartile range (IQR) | 50804.5 |
Descriptive statistics
Standard deviation | 29307.32802 |
---|---|
Coefficient of variation (CV) | 0.5733519523 |
Kurtosis | -1.191416496 |
Mean | 51115.77261 |
Median Absolute Deviation (MAD) | 25399.5 |
Skewness | -0.01479878364 |
Sum | 5012003736 |
Variance | 858919475.5 |
Monotonicity | Strictly increasing |
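The report does not say how the monotonicity label is derived. One plausible check compares each value with its successor, as sketched below; the function name and every label other than "Strictly increasing" and "Not monotonic" are assumptions, not taken from the report.

```python
def monotonicity(values):
    """Classify a sequence by comparing each value with its successor."""
    pairs = list(zip(values, values[1:]))
    if all(a < b for a, b in pairs):
        return "Strictly increasing"
    if all(a > b for a, b in pairs):
        return "Strictly decreasing"
    if all(a <= b for a, b in pairs):
        return "Increasing"
    if all(a >= b for a, b in pairs):
        return "Decreasing"
    return "Not monotonic"


# Smallest and largest values listed for this variable in the report.
print(monotonicity([1, 2, 3, 4, 5, 101761, 101762, 101763, 101764, 101765]))
# A sequence that changes direction.
print(monotonicity([3, 2, 1, 4]))
```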
Value | Count | Frequency (%) |
2047 | 1 | < 0.1% |
80562 | 1 | < 0.1% |
29339 | 1 | < 0.1% |
19100 | 1 | < 0.1% |
17053 | 1 | < 0.1% |
23198 | 1 | < 0.1% |
21151 | 1 | < 0.1% |
101028 | 1 | < 0.1% |
98981 | 1 | < 0.1% |
76464 | 1 | < 0.1% |
Other values (98042) | 98042 | > 99.9% |
Value | Count | Frequency (%) |
1 | 1 | < 0.1% |
2 | 1 | < 0.1% |
3 | 1 | < 0.1% |
4 | 1 | < 0.1% |
5 | 1 | < 0.1% |
Value | Count | Frequency (%) |
101765 | 1 | < 0.1% |
101764 | 1 | < 0.1% |
101763 | 1 | < 0.1% |
101762 | 1 | < 0.1% |
101761 | 1 | < 0.1% |
time_in_hospital
Real number (ℝ≥0)
Distinct | 14 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 4.42201077 |
---|---|
Minimum | 1 |
Maximum | 14 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 1 |
Q1 | 2 |
median | 4 |
Q3 | 6 |
95-th percentile | 11 |
Maximum | 14 |
Range | 13 |
Interquartile range (IQR) | 4 |
Descriptive statistics
Standard deviation | 2.993069775 |
---|---|
Coefficient of variation (CV) | 0.6768571881 |
Kurtosis | 0.8179424536 |
Mean | 4.42201077 |
Median Absolute Deviation (MAD) | 2 |
Skewness | 1.123566649 |
Sum | 433587 |
Variance | 8.958466679 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
3 | 17049 | 17.4% |
2 | 16441 | 16.8% |
1 | 13489 | 13.8% |
4 | 13434 | 13.7% |
5 | 9699 | 9.9% |
6 | 7320 | 7.5% |
7 | 5694 | 5.8% |
8 | 4276 | 4.4% |
9 | 2928 | 3.0% |
10 | 2287 | 2.3% |
Other values (4) | 5435 | 5.5% |
Value | Count | Frequency (%) |
1 | 13489 | 13.8% |
2 | 16441 | 16.8% |
3 | 17049 | 17.4% |
4 | 13434 | 13.7% |
5 | 9699 | 9.9% |
Value | Count | Frequency (%) |
14 | 1017 | 1.0% |
13 | 1185 | 1.2% |
12 | 1424 | 1.5% |
11 | 1809 | 1.8% |
10 | 2287 | 2.3% |
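Because time_in_hospital has only 14 distinct values, its full value-count table can be pieced together from the tables above (the ten most common values plus the four largest), and the summary statistics can be recomputed from it. The sketch below uses plain Python and assumes the sample variance (n - 1 denominator) and linear-interpolation quantiles, which are the pandas defaults; all helper names are illustrative.

```python
from math import sqrt

# Value counts for time_in_hospital, copied from the tables above.
COUNTS = {
    1: 13489, 2: 16441, 3: 17049, 4: 13434, 5: 9699, 6: 7320, 7: 5694,
    8: 4276, 9: 2928, 10: 2287, 11: 1809, 12: 1424, 13: 1185, 14: 1017,
}

n = sum(COUNTS.values())                         # number of observations
total = sum(v * c for v, c in COUNTS.items())    # "Sum" in the report
mean = total / n
# Sample variance (assumed n - 1 denominator).
variance = sum(c * (v - mean) ** 2 for v, c in COUNTS.items()) / (n - 1)
std = sqrt(variance)
cv = std / mean                                  # coefficient of variation


def value_at(counts, index):
    """Value at a zero-based position in the sorted, expanded data."""
    seen = 0
    for value in sorted(counts):
        seen += counts[value]
        if index < seen:
            return value
    raise IndexError(index)


def quantile(counts, q):
    """Quantile with linear interpolation between neighbouring order statistics."""
    size = sum(counts.values())
    position = q * (size - 1)
    low = int(position)
    high = min(low + 1, size - 1)
    fraction = position - low
    a, b = value_at(counts, low), value_at(counts, high)
    return a + fraction * (b - a)


q1, median, q3 = (quantile(COUNTS, q) for q in (0.25, 0.5, 0.75))
print(n, total, round(mean, 4), round(variance, 4), round(cv, 4))
print(q1, median, q3, q3 - q1)
```

The interquartile range is simply Q3 minus Q1, and the range is the maximum minus the minimum.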
num_lab_procedures
Real number (ℝ≥0)
Distinct | 118 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 43.14846204 |
---|---|
Minimum | 1 |
Maximum | 132 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 4 |
Q1 | 31 |
median | 44 |
Q3 | 57 |
95-th percentile | 73 |
Maximum | 132 |
Range | 131 |
Interquartile range (IQR) | 26 |
Descriptive statistics
Standard deviation | 19.71175698 |
---|---|
Coefficient of variation (CV) | 0.4568356797 |
Kurtosis | -0.2451397605 |
Mean | 43.14846204 |
Median Absolute Deviation (MAD) | 13 |
Skewness | -0.2355321992 |
Sum | 4230793 |
Variance | 388.5533634 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
1 | 3096 | 3.2% |
43 | 2724 | 2.8% |
44 | 2414 | 2.5% |
45 | 2306 | 2.4% |
38 | 2131 | 2.2% |
46 | 2120 | 2.2% |
40 | 2113 | 2.2% |
41 | 2046 | 2.1% |
42 | 2031 | 2.1% |
47 | 2028 | 2.1% |
Other values (108) | 75043 | 76.5% |
Value | Count | Frequency (%) |
1 | 3096 | 3.2% |
2 | 1062 | 1.1% |
3 | 647 | 0.7% |
4 | 364 | 0.4% |
5 | 276 | 0.3% |
Value | Count | Frequency (%) |
132 | 1 | < 0.1% |
129 | 1 | < 0.1% |
126 | 1 | < 0.1% |
121 | 1 | < 0.1% |
120 | 1 | < 0.1% |
num_procedures
Real number (ℝ≥0)
Distinct | 7 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 1.350711867 |
---|---|
Minimum | 0 |
Maximum | 6 |
Zeros | 44574 |
Zeros (%) | 45.5% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 1 |
Q3 | 2 |
95-th percentile | 5 |
Maximum | 6 |
Range | 6 |
Interquartile range (IQR) | 2 |
Descriptive statistics
Standard deviation | 1.708474845 |
---|---|
Coefficient of variation (CV) | 1.264869945 |
Kurtosis | 0.8238736795 |
Mean | 1.350711867 |
Median Absolute Deviation (MAD) | 1 |
Skewness | 1.303967313 |
Sum | 132440 |
Variance | 2.918886297 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 44574 | 45.5% |
1 | 20029 | 20.4% |
2 | 12383 | 12.6% |
3 | 9210 | 9.4% |
6 | 4811 | 4.9% |
4 | 4076 | 4.2% |
5 | 2969 | 3.0% |
Value | Count | Frequency (%) |
0 | 44574 | 45.5% |
1 | 20029 | 20.4% |
2 | 12383 | 12.6% |
3 | 9210 | 9.4% |
4 | 4076 | 4.2% |
Value | Count | Frequency (%) |
6 | 4811 | 4.9% |
5 | 2969 | 3.0% |
4 | 4076 | 4.2% |
3 | 9210 | 9.4% |
2 | 12383 | 12.6% |
num_medications
Real number (ℝ≥0)
Distinct | 75 |
---|---|
Distinct (%) | 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 16.11958961 |
---|---|
Minimum | 1 |
Maximum | 81 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 1 |
---|---|
5-th percentile | 6 |
Q1 | 11 |
median | 15 |
Q3 | 20 |
95-th percentile | 31 |
Maximum | 81 |
Range | 80 |
Interquartile range (IQR) | 9 |
Descriptive statistics
Standard deviation | 8.108495519 |
---|---|
Coefficient of variation (CV) | 0.5030212132 |
Kurtosis | 3.493545221 |
Mean | 16.11958961 |
Median Absolute Deviation (MAD) | 5 |
Skewness | 1.332717291 |
Sum | 1580558 |
Variance | 65.74769959 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
13 | 5885 | 6.0% |
12 | 5816 | 5.9% |
15 | 5621 | 5.7% |
11 | 5592 | 5.7% |
14 | 5520 | 5.6% |
16 | 5271 | 5.4% |
10 | 5167 | 5.3% |
17 | 4783 | 4.9% |
9 | 4711 | 4.8% |
18 | 4399 | 4.5% |
Other values (65) | 45287 | 46.2% |
Value | Count | Frequency (%) |
1 | 236 | 0.2% |
2 | 397 | 0.4% |
3 | 785 | 0.8% |
4 | 1269 | 1.3% |
5 | 1835 | 1.9% |
Value | Count | Frequency (%) |
81 | 1 | < 0.1% |
79 | 1 | < 0.1% |
75 | 2 | < 0.1% |
74 | 1 | < 0.1% |
72 | 3 | < 0.1% |
number_outpatient
Real number (ℝ≥0)
Distinct | 39 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.3763819198 |
---|---|
Minimum | 0 |
Maximum | 42 |
Zeros | 81679 |
Zeros (%) | 83.3% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 2 |
Maximum | 42 |
Range | 42 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 1.283365421 |
---|---|
Coefficient of variation (CV) | 3.409742482 |
Kurtosis | 145.589922 |
Mean | 0.3763819198 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 8.78166345 |
Sum | 36905 |
Variance | 1.647026805 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 81679 | 83.3% |
1 | 8340 | 8.5% |
2 | 3514 | 3.6% |
3 | 2005 | 2.0% |
4 | 1078 | 1.1% |
5 | 521 | 0.5% |
6 | 297 | 0.3% |
7 | 153 | 0.2% |
8 | 98 | 0.1% |
9 | 83 | 0.1% |
Other values (29) | 284 | 0.3% |
Value | Count | Frequency (%) |
0 | 81679 | 83.3% |
1 | 8340 | 8.5% |
2 | 3514 | 3.6% |
3 | 2005 | 2.0% |
4 | 1078 | 1.1% |
Value | Count | Frequency (%) |
42 | 1 | < 0.1% |
40 | 1 | < 0.1% |
39 | 1 | < 0.1% |
38 | 1 | < 0.1% |
37 | 1 | < 0.1% |
number_emergency
Real number (ℝ≥0)
Distinct | 33 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.2024639987 |
---|---|
Minimum | 0 |
Maximum | 76 |
Zeros | 86845 |
Zeros (%) | 88.6% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 0 |
95-th percentile | 1 |
Maximum | 76 |
Range | 76 |
Interquartile range (IQR) | 0 |
Descriptive statistics
Standard deviation | 0.9428968764 |
---|---|
Coefficient of variation (CV) | 4.657108832 |
Kurtosis | 1171.626491 |
Mean | 0.2024639987 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 22.71023391 |
Sum | 19852 |
Variance | 0.8890545196 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 86845 | 88.6% |
1 | 7550 | 7.7% |
2 | 2011 | 2.1% |
3 | 716 | 0.7% |
4 | 372 | 0.4% |
5 | 190 | 0.2% |
6 | 93 | 0.1% |
7 | 72 | 0.1% |
8 | 50 | 0.1% |
10 | 34 | < 0.1% |
Other values (23) | 119 | 0.1% |
Value | Count | Frequency (%) |
0 | 86845 | 88.6% |
1 | 7550 | 7.7% |
2 | 2011 | 2.1% |
3 | 716 | 0.7% |
4 | 372 | 0.4% |
Value | Count | Frequency (%) |
76 | 1 | < 0.1% |
64 | 1 | < 0.1% |
63 | 1 | < 0.1% |
54 | 1 | < 0.1% |
46 | 1 | < 0.1% |
number_inpatient
Real number (ℝ≥0)
Distinct | 20 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 0.646871048 |
---|---|
Minimum | 0 |
Maximum | 21 |
Zeros | 64633 |
Zeros (%) | 65.9% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 0 |
---|---|
5-th percentile | 0 |
Q1 | 0 |
median | 0 |
Q3 | 1 |
95-th percentile | 3 |
Maximum | 21 |
Range | 21 |
Interquartile range (IQR) | 1 |
Descriptive statistics
Standard deviation | 1.271025294 |
---|---|
Coefficient of variation (CV) | 1.964882024 |
Kurtosis | 19.94313813 |
Mean | 0.646871048 |
Median Absolute Deviation (MAD) | 0 |
Skewness | 3.554811324 |
Sum | 63427 |
Variance | 1.615505299 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
0 | 64633 | 65.9% |
1 | 19067 | 19.4% |
2 | 7421 | 7.6% |
3 | 3346 | 3.4% |
4 | 1597 | 1.6% |
5 | 802 | 0.8% |
6 | 474 | 0.5% |
7 | 266 | 0.3% |
8 | 147 | 0.1% |
9 | 111 | 0.1% |
Other values (10) | 188 | 0.2% |
Value | Count | Frequency (%) |
0 | 64633 | 65.9% |
1 | 19067 | 19.4% |
2 | 7421 | 7.6% |
3 | 3346 | 3.4% |
4 | 1597 | 1.6% |
Value | Count | Frequency (%) |
21 | 1 | < 0.1% |
19 | 2 | < 0.1% |
18 | 1 | < 0.1% |
16 | 5 | < 0.1% |
15 | 8 | < 0.1% |
number_diagnoses
Real number (ℝ≥0)
Distinct | 14 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 7.512095623 |
---|---|
Minimum | 3 |
Maximum | 16 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 3 |
---|---|
5-th percentile | 4 |
Q1 | 6 |
median | 8 |
Q3 | 9 |
95-th percentile | 9 |
Maximum | 16 |
Range | 13 |
Interquartile range (IQR) | 3 |
Descriptive statistics
Standard deviation | 1.832471842 |
---|---|
Coefficient of variation (CV) | 0.2439361709 |
Kurtosis | -0.3450219608 |
Mean | 7.512095623 |
Median Absolute Deviation (MAD) | 1 |
Skewness | -0.8175309479 |
Sum | 736576 |
Variance | 3.357953051 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
9 | 48687 | 49.7% |
5 | 10592 | 10.8% |
8 | 10388 | 10.6% |
7 | 10179 | 10.4% |
6 | 9988 | 10.2% |
4 | 5360 | 5.5% |
3 | 2751 | 2.8% |
16 | 40 | < 0.1% |
13 | 16 | < 0.1% |
10 | 16 | < 0.1% |
Other values (4) | 35 | < 0.1% |
Value | Count | Frequency (%) |
3 | 2751 | 2.8% |
4 | 5360 | 5.5% |
5 | 10592 | 10.8% |
6 | 9988 | 10.2% |
7 | 10179 | 10.4% |
Value | Count | Frequency (%) |
16 | 40 | < 0.1% |
15 | 8 | < 0.1% |
14 | 7 | < 0.1% |
13 | 16 | < 0.1% |
12 | 9 | < 0.1% |
change
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 52774 | 53.8% |
True | 45278 | 46.2% |
diabetesMed
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
True | 75350 | 76.8% |
False | 22702 | 23.2% |
isFemale
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
True | 52833 | 53.9% |
False | 45219 | 46.1% |
race_AfricanAmerican
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 79171 | 80.7% |
1 | 18881 | 19.3% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 79171 | |
1 | 18881 | 19.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 79171 | |
1 | 18881 | 19.3% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 79171 | |
1 | 18881 | 19.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 79171 | |
1 | 18881 | 19.3% |
race_Asian
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 97427 | 99.4% |
1 | 625 | 0.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 97427 | |
1 | 625 | 0.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 97427 | |
1 | 625 | 0.6% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 97427 | |
1 | 625 | 0.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 97427 | |
1 | 625 | 0.6% |
race_Caucasian
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
1 | 75079 | 76.6% |
0 | 22973 | 23.4% |
Most occurring characters
Value | Count | Frequency (%) |
1 | 75079 | |
0 | 22973 | 23.4% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
1 | 75079 | |
0 | 22973 | 23.4% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
1 | 75079 | |
0 | 22973 | 23.4% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
1 | 75079 | |
0 | 22973 | 23.4% |
race_Hispanic
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 96068 | 98.0% |
1 | 1984 | 2.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 96068 | |
1 | 1984 | 2.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 96068 | |
1 | 1984 | 2.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 96068 | |
1 | 1984 | 2.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 96068 | |
1 | 1984 | 2.0% |
race_Other
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 96569 | 98.5% |
1 | 1483 | 1.5% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 96569 | |
1 | 1483 | 1.5% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 96569 | |
1 | 1483 | 1.5% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 96569 | |
1 | 1483 | 1.5% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 96569 | |
1 | 1483 | 1.5% |
diabetes_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 62196 | 63.4% |
True | 35856 | 36.6% |
other_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 75866 | 77.4% |
True | 22186 | 22.6% |
circulatory_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
True | 57838 | 59.0% |
False | 40214 | 41.0% |
neoplasms_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 63068 | 64.3% |
True | 34984 | 35.7% |
respiratory_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 71896 | 73.3% |
True | 26156 | 26.7% |
injury_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 88463 | 90.2% |
True | 9589 | 9.8% |
musculoskeletal_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 90803 | 92.6% |
True | 7249 | 7.4% |
digestive_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 83449 | 85.1% |
True | 14603 | 14.9% |
genitourinary_diagnosis
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 80684 | 82.3% |
True | 17368 | 17.7% |
take_metformin
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 78807 | 80.4% |
True | 19245 | 19.6% |
take_repaglinide
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 96529 | 98.4% |
True | 1523 | 1.6% |
take_glimepiride
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 93065 | 94.9% |
True | 4987 | 5.1% |
take_glipizide
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 85769 | 87.5% |
True | 12283 | 12.5% |
take_glyburide
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 87791 | 89.5% |
True | 10261 | 10.5% |
take_pioglitazone
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 90955 | 92.8% |
True | 7097 | 7.2% |
take_rosiglitazone
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
False | 91886 | 93.7% |
True | 6166 | 6.3% |
take_insulin
Boolean
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 95.9 KiB |
Value | Count | Frequency (%) |
True | 52110 | 53.1% |
False | 45942 | 46.9% |
readmitted
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 86986 | 88.7% |
1 | 11066 | 11.3% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 86986 | |
1 | 11066 | 11.3% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 86986 | |
1 | 11066 | 11.3% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 86986 | |
1 | 11066 | 11.3% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 86986 | |
1 | 11066 | 11.3% |
A1C_Abnorm
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 86713 | 88.4% |
1 | 11339 | 11.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 86713 | |
1 | 11339 | 11.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 86713 | |
1 | 11339 | 11.6% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 86713 | |
1 | 11339 | 11.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 86713 | |
1 | 11339 | 11.6% |
A1C_Norm
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 93198 | 95.0% |
1 | 4854 | 5.0% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 93198 | |
1 | 4854 | 5.0% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 93198 | |
1 | 4854 | 5.0% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 93198 | |
1 | 4854 | 5.0% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 93198 | |
1 | 4854 | 5.0% |
glu_serum_Abnorm
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 95376 | 97.3% |
1 | 2676 | 2.7% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 95376 | |
1 | 2676 | 2.7% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 95376 | |
1 | 2676 | 2.7% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 95376 | |
1 | 2676 | 2.7% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 95376 | |
1 | 2676 | 2.7% |
glu_serum_Norm
Categorical
Distinct | 2 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Memory size | 766.2 KiB |
Value | Count | Frequency (%) |
0 | 95520 | 97.4% |
1 | 2532 | 2.6% |
Most occurring characters
Value | Count | Frequency (%) |
0 | 95520 | |
1 | 2532 | 2.6% |
Most occurring categories
Value | Count | Frequency (%) |
Decimal Number | 98052 |
Most frequent character per category
Value | Count | Frequency (%) |
0 | 95520 | |
1 | 2532 | 2.6% |
Most occurring scripts
Value | Count | Frequency (%) |
Common | 98052 |
Most frequent character per script
Value | Count | Frequency (%) |
0 | 95520 | |
1 | 2532 | 2.6% |
Most occurring blocks
Value | Count | Frequency (%) |
ASCII | 98052 |
Most frequent character per block
Value | Count | Frequency (%) |
0 | 95520 | |
1 | 2532 | 2.6% |
decade
Real number (ℝ≥0)
Distinct | 10 |
---|---|
Distinct (%) | < 0.1% |
Missing | 0 |
Missing (%) | 0.0% |
Infinite | 0 |
Infinite (%) | 0.0% |
Mean | 71.26024966 |
---|---|
Minimum | 10 |
Maximum | 100 |
Zeros | 0 |
Zeros (%) | 0.0% |
Memory size | 766.2 KiB |
Quantile statistics
Minimum | 10 |
---|---|
5-th percentile | 40 |
Q1 | 60 |
median | 70 |
Q3 | 80 |
95-th percentile | 90 |
Maximum | 100 |
Range | 90 |
Interquartile range (IQR) | 20 |
Descriptive statistics
Standard deviation | 15.59080518 |
---|---|
Coefficient of variation (CV) | 0.2187868448 |
Kurtosis | 0.1117738185 |
Mean | 71.26024966 |
Median Absolute Deviation (MAD) | 10 |
Skewness | -0.5692494878 |
Sum | 6987210 |
Variance | 243.0732063 |
Monotonicity | Not monotonic |
Value | Count | Frequency (%) |
80 | 25305 | 25.8% |
70 | 21809 | 22.2% |
90 | 16702 | 17.0% |
60 | 16697 | 17.0% |
50 | 9265 | 9.4% |
40 | 3548 | 3.6% |
100 | 2717 | 2.8% |
30 | 1478 | 1.5% |
20 | 466 | 0.5% |
10 | 65 | 0.1% |
Value | Count | Frequency (%) |
10 | 65 | 0.1% |
20 | 466 | 0.5% |
30 | 1478 | 1.5% |
40 | 3548 | 3.6% |
50 | 9265 | 9.4% |
Value | Count | Frequency (%) |
100 | 2717 | 2.8% |
90 | 16702 | 17.0% |
80 | 25305 | 25.8% |
70 | 21809 | 22.2% |
60 | 16697 | 17.0% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
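The calculation described above can be sketched in a few lines of plain Python. The function name and the sample data are illustrative only; the shared 1/(n - 1) factors of the covariance and the standard deviations cancel, so they are omitted.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


xs = [1, 2, 3, 4, 5]
ys = [2, 4, 5, 4, 5]
r = pearson_r(xs, ys)

# Shifting and (positively) rescaling one variable leaves r unchanged.
r_rescaled = pearson_r([10 * x + 3 for x in xs], ys)
print(round(r, 4), round(r_rescaled, 4))
```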
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
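A sketch of this rank-based calculation in plain Python follows. Assigning tied values the average of their ranks is an assumption (it is the usual convention), and all function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    co_deviation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return co_deviation / (spread_x * spread_y)


def spearman_rho(xs, ys):
    """Pearson's r applied to the rank variables of xs and ys."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


xs = [1, 2, 3, 4, 5]
cubes = [x ** 3 for x in xs]  # monotonic but nonlinear in xs
print(round(spearman_rho(xs, cubes), 4), round(pearson_r(xs, cubes), 4))
```

On the cubic example the ranks of both variables coincide, so ρ reaches its maximum while Pearson's r stays below it.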
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
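The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch in plain Python, with an illustrative function name; tied pairs are counted as neither concordant nor discordant. Note that libraries such as SciPy default to the tie-adjusted tau-b variant, so their results can differ when the data contain ties.

```python
from itertools import combinations


def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(xs, ys), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
    total_pairs = len(xs) * (len(xs) - 1) // 2
    return (concordant - discordant) / total_pairs


# Six pairs in total: five concordant, one discordant (the swapped middle values).
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 4))
```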
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. There is extensive documentation available here.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been proven to be biased, even for large samples. We use a bias-corrected measure proposed by Bergsma in 2013, which can be found here.
First rows
(row) | df_index | time_in_hospital | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | number_diagnoses | change | diabetesMed | isFemale | race_AfricanAmerican | race_Asian | race_Caucasian | race_Hispanic | race_Other | diabetes_diagnosis | other_diagnosis | circulatory_diagnosis | neoplasms_diagnosis | respiratory_diagnosis | injury_diagnosis | musculoskeletal_diagnosis | digestive_diagnosis | genitourinary_diagnosis | take_metformin | take_repaglinide | take_glimepiride | take_glipizide | take_glyburide | take_pioglitazone | take_rosiglitazone | take_insulin | readmitted | A1C_Abnorm | A1C_Norm | glu_serum_Abnorm | glu_serum_Norm | decade |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 1 | 3 | 59 | 0 | 18 | 0 | 0 | 0 | 9 | True | True | True | 0 | 0 | 1 | 0 | 0 | True | False | False | True | False | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 20 |
1 | 2 | 2 | 11 | 5 | 13 | 2 | 0 | 1 | 6 | False | True | True | 1 | 0 | 0 | 0 | 0 | True | True | False | False | False | False | False | False | False | False | False | False | True | False | False | False | False | 0 | 0 | 0 | 0 | 0 | 30 |
2 | 3 | 2 | 44 | 1 | 16 | 0 | 0 | 0 | 7 | True | True | False | 0 | 0 | 1 | 0 | 0 | True | True | True | False | False | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 40 |
3 | 4 | 1 | 51 | 0 | 8 | 0 | 0 | 0 | 5 | True | True | False | 0 | 0 | 1 | 0 | 0 | True | False | False | True | False | False | False | False | False | False | False | False | True | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 50 |
4 | 5 | 3 | 31 | 6 | 16 | 0 | 0 | 0 | 9 | False | True | False | 0 | 0 | 1 | 0 | 0 | True | False | True | False | False | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 60 |
5 | 6 | 4 | 70 | 1 | 21 | 0 | 0 | 0 | 7 | True | True | False | 0 | 0 | 1 | 0 | 0 | False | False | True | False | False | False | False | False | False | True | False | True | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 70 |
6 | 7 | 5 | 73 | 0 | 12 | 0 | 0 | 0 | 8 | False | True | False | 0 | 0 | 1 | 0 | 0 | True | False | True | False | True | False | False | False | False | False | False | False | False | True | False | False | False | 0 | 0 | 0 | 0 | 0 | 80 |
7 | 8 | 13 | 68 | 2 | 28 | 0 | 0 | 0 | 8 | True | True | True | 0 | 0 | 1 | 0 | 0 | False | True | True | False | False | False | False | False | False | False | False | False | True | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 90 |
8 | 9 | 12 | 33 | 3 | 18 | 0 | 0 | 0 | 8 | True | True | True | 0 | 0 | 1 | 0 | 0 | False | False | True | True | True | False | False | False | False | False | False | False | False | False | False | True | True | 0 | 0 | 0 | 0 | 0 | 100 |
9 | 10 | 9 | 47 | 2 | 17 | 0 | 0 | 0 | 9 | False | True | True | 1 | 0 | 0 | 0 | 0 | True | False | True | False | False | True | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 50 |
Last rows
(row) | df_index | time_in_hospital | num_lab_procedures | num_procedures | num_medications | number_outpatient | number_emergency | number_inpatient | number_diagnoses | change | diabetesMed | isFemale | race_AfricanAmerican | race_Asian | race_Caucasian | race_Hispanic | race_Other | diabetes_diagnosis | other_diagnosis | circulatory_diagnosis | neoplasms_diagnosis | respiratory_diagnosis | injury_diagnosis | musculoskeletal_diagnosis | digestive_diagnosis | genitourinary_diagnosis | take_metformin | take_repaglinide | take_glimepiride | take_glipizide | take_glyburide | take_pioglitazone | take_rosiglitazone | take_insulin | readmitted | A1C_Abnorm | A1C_Norm | glu_serum_Abnorm | glu_serum_Norm | decade |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
98042 | 101756 | 2 | 46 | 6 | 17 | 1 | 1 | 1 | 9 | False | True | True | 0 | 0 | 0 | 0 | 1 | False | False | True | False | False | True | False | False | True | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 70 |
98043 | 101757 | 5 | 21 | 1 | 16 | 0 | 0 | 1 | 9 | False | True | True | 0 | 0 | 1 | 0 | 0 | False | False | False | False | True | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 80 |
98044 | 101758 | 5 | 76 | 1 | 22 | 0 | 1 | 0 | 9 | True | True | True | 0 | 0 | 1 | 0 | 0 | False | True | False | False | False | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 90 |
98045 | 101759 | 1 | 1 | 0 | 15 | 3 | 0 | 0 | 7 | True | True | False | 0 | 0 | 1 | 0 | 0 | True | False | True | True | False | False | False | False | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 90 |
98046 | 101760 | 6 | 45 | 1 | 25 | 3 | 1 | 2 | 9 | True | True | True | 1 | 0 | 0 | 0 | 0 | False | True | True | False | False | False | False | False | False | False | False | False | False | False | False | True | True | 0 | 0 | 0 | 0 | 0 | 70 |
98047 | 101761 | 3 | 51 | 0 | 16 | 0 | 0 | 0 | 9 | True | True | False | 1 | 0 | 0 | 0 | 0 | True | True | True | False | False | False | False | False | False | True | False | False | False | False | False | False | True | 0 | 1 | 0 | 0 | 0 | 80 |
98048 | 101762 | 5 | 33 | 3 | 18 | 0 | 0 | 1 | 9 | False | True | True | 1 | 0 | 0 | 0 | 0 | False | False | False | True | False | False | False | True | False | False | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 90 |
98049 | 101763 | 1 | 53 | 0 | 9 | 1 | 0 | 0 | 13 | True | True | False | 0 | 0 | 1 | 0 | 0 | False | True | False | False | False | False | False | False | True | True | False | False | False | False | False | False | True | 0 | 0 | 0 | 0 | 0 | 80 |
98050 | 101764 | 10 | 45 | 2 | 21 | 0 | 0 | 1 | 9 | True | True | True | 0 | 0 | 1 | 0 | 0 | False | True | False | False | False | True | False | False | False | False | False | False | True | False | True | False | True | 0 | 0 | 0 | 0 | 0 | 90 |
98051 | 101765 | 6 | 13 | 3 | 3 | 0 | 0 | 0 | 9 | False | False | False | 0 | 0 | 1 | 0 | 0 | False | False | False | False | False | False | False | True | False | False | False | False | False | False | False | False | False | 0 | 0 | 0 | 0 | 0 | 80 |