Overview

Dataset statistics

Number of variables: 46
Number of observations: 98053
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 34.4 MiB
Average record size in memory: 368.0 B

Variable types

Numeric: 12
Categorical: 30
Boolean: 4

Warnings

examide has constant value "False" (Constant)
citoglipton has constant value "False" (Constant)
metformin-rosiglitazone has constant value "False" (Constant)
diag_1 has a high cardinality: 713 distinct values (High cardinality)
diag_2 has a high cardinality: 740 distinct values (High cardinality)
diag_3 has a high cardinality: 786 distinct values (High cardinality)
tolbutamide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
insulin is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
A1Cresult is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
diabetesMed is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
max_glu_serum is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
glyburide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
metformin is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
metformin-pioglitazone is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
metformin-rosiglitazone is highly correlated with tolbutamide and 29 other fields (High correlation)
race is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
miglitol is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
gender is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
chlorpropamide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
repaglinide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
examide is highly correlated with tolbutamide and 29 other fields (High correlation)
acetohexamide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
age is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
troglitazone is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
change is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
glipizide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
readmitted is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
acarbose is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
glimepiride-pioglitazone is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
pioglitazone is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
glimepiride is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
citoglipton is highly correlated with tolbutamide and 29 other fields (High correlation)
glipizide-metformin is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
glyburide-metformin is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
tolazamide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
rosiglitazone is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
nateglinide is highly correlated with metformin-rosiglitazone and 2 other fields (High correlation)
number_emergency is highly skewed (γ1 = 22.71034016) (Skewed)
df_index has unique values (Unique)
num_procedures has 44574 (45.5%) zeros (Zeros)
number_outpatient has 81680 (83.3%) zeros (Zeros)
number_emergency has 86846 (88.6%) zeros (Zeros)
number_inpatient has 64634 (65.9%) zeros (Zeros)
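
The Constant warnings above flag columns that hold a single distinct value and so cannot help distinguish one record from another. The following is a minimal sketch of how such columns might be detected and dropped. It is not the profiler's own code: the helpers `constant_columns` and `drop_columns` are hypothetical, and the toy records are made up (only the column names are borrowed from the report).

```python
def constant_columns(rows):
    """Return the names of columns that hold exactly one distinct value."""
    seen = {}
    for row in rows:
        for name, value in row.items():
            seen.setdefault(name, set()).add(value)
    return [name for name, values in seen.items() if len(values) == 1]


def drop_columns(rows, names):
    """Return copies of the rows without the given columns."""
    names = set(names)
    return [{k: v for k, v in row.items() if k not in names} for row in rows]


# Toy records, made up for illustration; only the column names come from the report.
rows = [
    {"examide": False, "citoglipton": False, "insulin": "No"},
    {"examide": False, "citoglipton": False, "insulin": "Up"},
    {"examide": False, "citoglipton": False, "insulin": "Steady"},
]

flagged = constant_columns(rows)       # columns with one distinct value
cleaned = drop_columns(rows, flagged)  # records without those columns
```

On a real table the same idea applies column by column; whether to drop a flagged column remains a judgment call for the analyst.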

Reproduction

Analysis started: 2021-05-05 21:22:20.387978
Analysis finished: 2021-05-05 21:23:16.876767
Duration: 56.49 seconds
Software version: pandas-profiling v2.11.0
Download configuration: config.yaml

Variables

df_index
Real number (ℝ≥0)

UNIQUE

Distinct: 98053
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 51115.56242
Minimum: 1
Maximum: 101765
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 5180.6
Q1: 25575
Median: 51369
Q3: 76379
95-th percentile: 96683.4
Maximum: 101765
Range: 101764
Interquartile range (IQR): 50804

Descriptive statistics

Standard deviation: 29307.25248
Coefficient of variation (CV): 0.573352832
Kurtosis: -1.191414224
Mean: 51115.56242
Median Absolute Deviation (MAD): 25399
Skewness: -0.01478077774
Sum: 5012034242
Variance: 858915047.7
Monotonicity: Strictly increasing
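
Several of the statistics above are simple functions of the others. As a quick consistency check, the sketch below recomputes the coefficient of variation, interquartile range, range, and variance from the reported df_index values. The variable names are illustrative, and the inputs are the rounded figures printed in the report, so agreement is only expected up to rounding.

```python
import math

# Values copied from the df_index statistics in this report.
mean = 51115.56242
std = 29307.25248
q1, q3 = 25575, 76379
minimum, maximum = 1, 101765

cv = std / mean                  # coefficient of variation = std / mean
iqr = q3 - q1                    # interquartile range = Q3 - Q1
value_range = maximum - minimum  # range = max - min
variance = std ** 2              # variance = squared standard deviation

print(round(cv, 6), iqr, value_range, math.floor(variance))
```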
Histogram with fixed size bins (bins=50)
Common values
Value                 Count  Frequency (%)
2047                  1      < 0.1%
80562                 1      < 0.1%
29339                 1      < 0.1%
19100                 1      < 0.1%
17053                 1      < 0.1%
23198                 1      < 0.1%
21151                 1      < 0.1%
101028                1      < 0.1%
98981                 1      < 0.1%
76464                 1      < 0.1%
Other values (98043)  98043  > 99.9%

Minimum 5 values
Value   Count  Frequency (%)
1       1      < 0.1%
2       1      < 0.1%
3       1      < 0.1%
4       1      < 0.1%
5       1      < 0.1%

Maximum 5 values
Value   Count  Frequency (%)
101765  1      < 0.1%
101764  1      < 0.1%
101763  1      < 0.1%
101762  1      < 0.1%
101761  1      < 0.1%

race
Categorical

HIGH CORRELATION

Distinct: 5
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
Caucasian: 75079
AfricanAmerican: 18881
Hispanic: 1984
Other: 1484
Asian: 625

Length

Max length: 15
Median length: 9
Mean length: 10.0490857
Min length: 5

Characters and Unicode

Total characters: 985343
Distinct characters: 17
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: Caucasian
2nd row: AfricanAmerican
3rd row: Caucasian
4th row: Caucasian
5th row: Caucasian
Value            Count  Frequency (%)
Caucasian        75079  76.6%
AfricanAmerican  18881  19.3%
Hispanic         1984   2.0%
Other            1484   1.5%
Asian            625    0.6%
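
The length and character totals reported for race follow directly from the category counts. The sketch below, with illustrative variable names, recomputes them from the frequency table as a sanity check.

```python
# Category counts copied from the race frequency table in this report.
counts = {
    "Caucasian": 75079,
    "AfricanAmerican": 18881,
    "Hispanic": 1984,
    "Other": 1484,
    "Asian": 625,
}

n_rows = sum(counts.values())
# Each label contributes len(label) characters once per occurrence.
total_characters = sum(len(label) * count for label, count in counts.items())
mean_length = total_characters / n_rows
max_length = max(len(label) for label in counts)
min_length = min(len(label) for label in counts)

print(n_rows, total_characters, round(mean_length, 7), max_length, min_length)
```

The results agree with the reported number of observations, total characters, mean length, and max and min lengths.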
Histogram of lengths of the category
Value            Count  Frequency (%)
caucasian        75079  76.6%
africanamerican  18881  19.3%
hispanic         1984   2.0%
other            1484   1.5%
asian            625    0.6%

Most occurring characters

ValueCountFrequency (%)
a265608
27.0%
i117434
11.9%
n115450
11.7%
c114825
11.7%
s77688
 
7.9%
C75079
 
7.6%
u75079
 
7.6%
r39246
 
4.0%
A38387
 
3.9%
e20365
 
2.1%
Other values (7)46182
 
4.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter868409
88.1%
Uppercase Letter116934
 
11.9%

Most frequent character per category

ValueCountFrequency (%)
a265608
30.6%
i117434
13.5%
n115450
13.3%
c114825
13.2%
s77688
 
8.9%
u75079
 
8.6%
r39246
 
4.5%
e20365
 
2.3%
f18881
 
2.2%
m18881
 
2.2%
Other values (3)4952
 
0.6%
ValueCountFrequency (%)
C75079
64.2%
A38387
32.8%
H1984
 
1.7%
O1484
 
1.3%

Most occurring scripts

ValueCountFrequency (%)
Latin985343
100.0%

Most frequent character per script

ValueCountFrequency (%)
a265608
27.0%
i117434
11.9%
n115450
11.7%
c114825
11.7%
s77688
 
7.9%
C75079
 
7.6%
u75079
 
7.6%
r39246
 
4.0%
A38387
 
3.9%
e20365
 
2.1%
Other values (7)46182
 
4.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII985343
100.0%

Most frequent character per block

ValueCountFrequency (%)
a265608
27.0%
i117434
11.9%
n115450
11.7%
c114825
11.7%
s77688
 
7.9%
C75079
 
7.6%
u75079
 
7.6%
r39246
 
4.0%
A38387
 
3.9%
e20365
 
2.1%
Other values (7)46182
 
4.7%

gender
Categorical

HIGH CORRELATION

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
Female: 52833
Male: 45219
Unknown/Invalid: 1

Length

Max length: 15
Median length: 6
Mean length: 5.077753868
Min length: 4

Characters and Unicode

Total characters: 497889
Distinct characters: 16
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st row: Female
2nd row: Female
3rd row: Male
4th row: Male
5th row: Male
Value            Count  Frequency (%)
Female           52833  53.9%
Male             45219  46.1%
Unknown/Invalid  1      < 0.1%
Histogram of lengths of the category
Value            Count  Frequency (%)
female           52833  53.9%
male             45219  46.1%
unknown/invalid  1      < 0.1%

Most occurring characters

ValueCountFrequency (%)
e150885
30.3%
a98053
19.7%
l98053
19.7%
F52833
 
10.6%
m52833
 
10.6%
M45219
 
9.1%
n4
 
< 0.1%
U1
 
< 0.1%
k1
 
< 0.1%
o1
 
< 0.1%
Other values (6)6
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter399834
80.3%
Uppercase Letter98054
 
19.7%
Other Punctuation1
 
< 0.1%

Most frequent character per category

ValueCountFrequency (%)
e150885
37.7%
a98053
24.5%
l98053
24.5%
m52833
 
13.2%
n4
 
< 0.1%
k1
 
< 0.1%
o1
 
< 0.1%
w1
 
< 0.1%
v1
 
< 0.1%
i1
 
< 0.1%
ValueCountFrequency (%)
F52833
53.9%
M45219
46.1%
U1
 
< 0.1%
I1
 
< 0.1%
ValueCountFrequency (%)
/1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin497888
> 99.9%
Common1
 
< 0.1%

Most frequent character per script

ValueCountFrequency (%)
e150885
30.3%
a98053
19.7%
l98053
19.7%
F52833
 
10.6%
m52833
 
10.6%
M45219
 
9.1%
n4
 
< 0.1%
U1
 
< 0.1%
k1
 
< 0.1%
o1
 
< 0.1%
Other values (5)5
 
< 0.1%
ValueCountFrequency (%)
/1
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII497889
100.0%

Most frequent character per block

ValueCountFrequency (%)
e150885
30.3%
a98053
19.7%
l98053
19.7%
F52833
 
10.6%
m52833
 
10.6%
M45219
 
9.1%
n4
 
< 0.1%
U1
 
< 0.1%
k1
 
< 0.1%
o1
 
< 0.1%
Other values (6)6
 
< 0.1%

age
Categorical

HIGH CORRELATION

Distinct: 10
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
[70-80): 25306
[60-70): 21809
[80-90): 16702
[50-60): 16697
[40-50): 9265
Other values (5): 8274

Length

Max length: 8
Median length: 7
Mean length: 7.027046597
Min length: 6

Characters and Unicode

Total characters: 689023
Distinct characters: 13
Distinct categories: 4
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: [10-20)
2nd row: [20-30)
3rd row: [30-40)
4th row: [40-50)
5th row: [50-60)
Value     Count  Frequency (%)
[70-80)   25306  25.8%
[60-70)   21809  22.2%
[80-90)   16702  17.0%
[50-60)   16697  17.0%
[40-50)   9265   9.4%
[30-40)   3548   3.6%
[90-100)  2717   2.8%
[20-30)   1478   1.5%
[10-20)   466    0.5%
[0-10)    65     0.1%
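
The age values are half-open interval labels such as [70-80), stored as text, which is why the profiler treats the column as categorical. If numeric bounds are wanted, for instance for an ordinal encoding, the labels can be parsed. The sketch below is one possible approach; `parse_age_band` and `band_midpoint` are hypothetical helpers, not part of the report or of pandas-profiling.

```python
import re

# Matches labels of the form "[low-high)" with integer bounds.
_BAND = re.compile(r"^\[(\d+)-(\d+)\)$")


def parse_age_band(label):
    """Return (low, high) for a label like '[70-80)'."""
    match = _BAND.match(label)
    if match is None:
        raise ValueError(f"unrecognised age band: {label!r}")
    return int(match.group(1)), int(match.group(2))


def band_midpoint(label):
    """Return the midpoint of the band as a rough numeric stand-in."""
    low, high = parse_age_band(label)
    return (low + high) / 2


# Labels in the order of the frequency table (sorted by count).
labels = ["[70-80)", "[60-70)", "[80-90)", "[50-60)", "[40-50)",
          "[30-40)", "[90-100)", "[20-30)", "[10-20)", "[0-10)"]

# Re-order the bands by their numeric lower bound.
ordered = sorted(labels, key=parse_age_band)
```

Using the midpoint as a numeric value is an assumption about how the bands might be used downstream; the report itself says nothing about the distribution of ages within a band.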
Histogram of lengths of the category
Value   Count  Frequency (%)
70-80   25306  25.8%
60-70   21809  22.2%
80-90   16702  17.0%
50-60   16697  17.0%
40-50   9265   9.4%
30-40   3548   3.6%
90-100  2717   2.8%
20-30   1478   1.5%
10-20   466    0.5%
0-10    65     0.1%

Most occurring characters

ValueCountFrequency (%)
0198823
28.9%
[98053
14.2%
-98053
14.2%
)98053
14.2%
747115
 
6.8%
842008
 
6.1%
638506
 
5.6%
525962
 
3.8%
919419
 
2.8%
412813
 
1.9%
Other values (3)10218
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number394864
57.3%
Open Punctuation98053
 
14.2%
Dash Punctuation98053
 
14.2%
Close Punctuation98053
 
14.2%

Most frequent character per category

ValueCountFrequency (%)
0198823
50.4%
747115
 
11.9%
842008
 
10.6%
638506
 
9.8%
525962
 
6.6%
919419
 
4.9%
412813
 
3.2%
35026
 
1.3%
13248
 
0.8%
21944
 
0.5%
ValueCountFrequency (%)
[98053
100.0%
ValueCountFrequency (%)
-98053
100.0%
ValueCountFrequency (%)
)98053
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common689023
100.0%

Most frequent character per script

ValueCountFrequency (%)
0198823
28.9%
[98053
14.2%
-98053
14.2%
)98053
14.2%
747115
 
6.8%
842008
 
6.1%
638506
 
5.6%
525962
 
3.8%
919419
 
2.8%
412813
 
1.9%
Other values (3)10218
 
1.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII689023
100.0%

Most frequent character per block

ValueCountFrequency (%)
0198823
28.9%
[98053
14.2%
-98053
14.2%
)98053
14.2%
747115
 
6.8%
842008
 
6.1%
638506
 
5.6%
525962
 
3.8%
919419
 
2.8%
412813
 
1.9%
Other values (3)10218
 
1.5%

admission_type_id
Real number (ℝ≥0)

Distinct: 8
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.025812571
Minimum: 1
Maximum: 8
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 3
95-th percentile: 6
Maximum: 8
Range: 7
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.450117239
Coefficient of variation (CV): 0.7158200418
Kurtosis: 1.912034312
Mean: 2.025812571
Median Absolute Deviation (MAD): 0
Skewness: 1.587030686
Sum: 198637
Variance: 2.102840007
Monotonicity: Not monotonic
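
A Median Absolute Deviation of 0 can look surprising next to a non-zero standard deviation. It follows from the definition: MAD is the median of the absolute deviations from the median, so whenever more than half of the values equal the median, more than half of the deviations are 0. That is the case here, since 53.2% of admission_type_id values equal the median of 1. A minimal sketch with made-up toy data follows (`median_absolute_deviation` is an illustrative helper, not the profiler's implementation).

```python
from statistics import median


def median_absolute_deviation(values):
    """Median of the absolute deviations from the median."""
    values = list(values)
    center = median(values)
    return median(abs(x - center) for x in values)


# Toy data: three of five values equal the median, so the MAD collapses to 0.
mostly_ones = [1, 1, 1, 2, 3]
# Toy data with no dominant value, for contrast.
spread_out = [1, 2, 3, 4, 5]

mad_mostly_ones = median_absolute_deviation(mostly_ones)
mad_spread_out = median_absolute_deviation(spread_out)
```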
Histogram with fixed size bins (bins=8)
Common values
Value  Count  Frequency (%)
1      52178  53.2%
3      18194  18.6%
2      17543  17.9%
6      5135   5.2%
5      4661   4.8%
8      312    0.3%
7      20     < 0.1%
4      10     < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
1      52178  53.2%
2      17543  17.9%
3      18194  18.6%
4      10     < 0.1%
5      4661   4.8%

Maximum 5 values
Value  Count  Frequency (%)
8      312    0.3%
7      20     < 0.1%
6      5135   5.2%
5      4661   4.8%
4      10     < 0.1%

discharge_disposition_id
Real number (ℝ≥0)

Distinct: 26
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 3.753368076
Minimum: 1
Maximum: 28
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 1
Q3: 4
95-th percentile: 18
Maximum: 28
Range: 27
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 5.309391793
Coefficient of variation (CV): 1.414567313
Kurtosis: 5.838277851
Mean: 3.753368076
Median Absolute Deviation (MAD): 0
Skewness: 2.534770768
Sum: 368029
Variance: 28.18964121
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=26)
Common values
Value              Count  Frequency (%)
1                  57610  58.8%
3                  13564  13.8%
6                  12626  12.9%
18                 3624   3.7%
2                  2049   2.1%
22                 1970   2.0%
11                 1606   1.6%
5                  1127   1.1%
25                 941    1.0%
4                  756    0.8%
Other values (16)  2180   2.2%

Minimum 5 values
Value  Count  Frequency (%)
1      57610  58.8%
2      2049   2.1%
3      13564  13.8%
4      756    0.8%
5      1127   1.1%

Maximum 5 values
Value  Count  Frequency (%)
28     137    0.1%
27     5      < 0.1%
25     941    1.0%
24     48     < 0.1%
23     400    0.4%

admission_source_id
Real number (ℝ≥0)

Distinct: 17
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 5.776692197
Minimum: 1
Maximum: 25
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
Median: 7
Q3: 7
95-th percentile: 17
Maximum: 25
Range: 24
Interquartile range (IQR): 6

Descriptive statistics

Standard deviation: 4.071639573
Coefficient of variation (CV): 0.7048392807
Kurtosis: 1.733779297
Mean: 5.776692197
Median Absolute Deviation (MAD): 0
Skewness: 1.027278586
Sum: 566422
Variance: 16.57824881
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=17)
Common values
Value             Count  Frequency (%)
7                 55951  57.1%
1                 28356  28.9%
17                6602   6.7%
4                 2945   3.0%
6                 1893   1.9%
2                 1031   1.1%
5                 846    0.9%
3                 179    0.2%
20                160    0.2%
9                 49     < 0.1%
Other values (7)  41     < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
1      28356  28.9%
2      1031   1.1%
3      179    0.2%
4      2945   3.0%
5      846    0.9%

Maximum 5 values
Value  Count  Frequency (%)
25     2      < 0.1%
22     12     < 0.1%
20     160    0.2%
17     6602   6.7%
14     2      < 0.1%

time_in_hospital
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 4.42197587
Minimum: 1
Maximum: 14
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 2
Median: 4
Q3: 6
95-th percentile: 11
Maximum: 14
Range: 13
Interquartile range (IQR): 4

Descriptive statistics

Standard deviation: 2.993074463
Coefficient of variation (CV): 0.6768635902
Kurtosis: 0.817949426
Mean: 4.42197587
Median Absolute Deviation (MAD): 2
Skewness: 1.123569649
Sum: 433588
Variance: 8.958494742
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Common values
Value             Count  Frequency (%)
3                 17049  17.4%
2                 16441  16.8%
1                 13490  13.8%
4                 13434  13.7%
5                 9699   9.9%
6                 7320   7.5%
7                 5694   5.8%
8                 4276   4.4%
9                 2928   3.0%
10                2287   2.3%
Other values (4)  5435   5.5%

Minimum 5 values
Value  Count  Frequency (%)
1      13490  13.8%
2      16441  16.8%
3      17049  17.4%
4      13434  13.7%
5      9699   9.9%

Maximum 5 values
Value  Count  Frequency (%)
14     1017   1.0%
13     1185   1.2%
12     1424   1.5%
11     1809   1.8%
10     2287   2.3%

num_lab_procedures
Real number (ℝ≥0)

Distinct: 118
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 43.14807298
Minimum: 1
Maximum: 132
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 4
Q1: 31
Median: 44
Q3: 57
95-th percentile: 73
Maximum: 132
Range: 131
Interquartile range (IQR): 26

Descriptive statistics

Standard deviation: 19.71203294
Coefficient of variation (CV): 0.4568461945
Kurtosis: -0.2451976515
Mean: 43.14807298
Median Absolute Deviation (MAD): 13
Skewness: -0.2355346172
Sum: 4230798
Variance: 388.5642427
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value               Count  Frequency (%)
1                   3096   3.2%
43                  2724   2.8%
44                  2414   2.5%
45                  2306   2.4%
38                  2131   2.2%
46                  2120   2.2%
40                  2113   2.2%
41                  2046   2.1%
42                  2031   2.1%
47                  2028   2.1%
Other values (108)  75044  76.5%

Minimum 5 values
Value  Count  Frequency (%)
1      3096   3.2%
2      1062   1.1%
3      647    0.7%
4      364    0.4%
5      277    0.3%

Maximum 5 values
Value  Count  Frequency (%)
132    1      < 0.1%
129    1      < 0.1%
126    1      < 0.1%
121    1      < 0.1%
120    1      < 0.1%

num_procedures
Real number (ℝ≥0)

ZEROS

Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.350749085
Minimum: 0
Maximum: 6
Zeros: 44574
Zeros (%): 45.5%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 1
Q3: 2
95-th percentile: 5
Maximum: 6
Range: 6
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 1.708505881
Coefficient of variation (CV): 1.264858071
Kurtosis: 0.8236555027
Mean: 1.350749085
Median Absolute Deviation (MAD): 1
Skewness: 1.303916989
Sum: 132445
Variance: 2.918992346
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=7)
Common values
Value  Count  Frequency (%)
0      44574  45.5%
1      20029  20.4%
2      12383  12.6%
3      9210   9.4%
6      4811   4.9%
4      4076   4.2%
5      2970   3.0%

Minimum 5 values
Value  Count  Frequency (%)
0      44574  45.5%
1      20029  20.4%
2      12383  12.6%
3      9210   9.4%
4      4076   4.2%

Maximum 5 values
Value  Count  Frequency (%)
6      4811   4.9%
5      2970   3.0%
4      4076   4.2%
3      9210   9.4%
2      12383  12.6%

num_medications
Real number (ℝ≥0)

Distinct: 75
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 16.11964958
Minimum: 1
Maximum: 81
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 1
5-th percentile: 6
Q1: 11
Median: 15
Q3: 20
95-th percentile: 31
Maximum: 81
Range: 80
Interquartile range (IQR): 9

Descriptive statistics

Standard deviation: 8.108475918
Coefficient of variation (CV): 0.5030181257
Kurtosis: 3.493505174
Mean: 16.11964958
Median Absolute Deviation (MAD): 5
Skewness: 1.332695065
Sum: 1580580
Variance: 65.74738171
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
Common values
Value              Count  Frequency (%)
13                 5885   6.0%
12                 5816   5.9%
15                 5621   5.7%
11                 5592   5.7%
14                 5520   5.6%
16                 5271   5.4%
10                 5167   5.3%
17                 4783   4.9%
9                  4711   4.8%
18                 4399   4.5%
Other values (65)  45288  46.2%

Minimum 5 values
Value  Count  Frequency (%)
1      236    0.2%
2      397    0.4%
3      785    0.8%
4      1269   1.3%
5      1835   1.9%

Maximum 5 values
Value  Count  Frequency (%)
81     1      < 0.1%
79     1      < 0.1%
75     2      < 0.1%
74     1      < 0.1%
72     3      < 0.1%

number_outpatient
Real number (ℝ≥0)

ZEROS

Distinct: 39
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.3763780812
Minimum: 0
Maximum: 42
Zeros: 81680
Zeros (%): 83.3%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 2
Maximum: 42
Range: 42
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 1.28335944
Coefficient of variation (CV): 3.409761365
Kurtosis: 145.5912818
Mean: 0.3763780812
Median Absolute Deviation (MAD): 0
Skewness: 8.78170539
Sum: 36905
Variance: 1.647011452
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=39)
Common values
Value              Count  Frequency (%)
0                  81680  83.3%
1                  8340   8.5%
2                  3514   3.6%
3                  2005   2.0%
4                  1078   1.1%
5                  521    0.5%
6                  297    0.3%
7                  153    0.2%
8                  98     0.1%
9                  83     0.1%
Other values (29)  284    0.3%

Minimum 5 values
Value  Count  Frequency (%)
0      81680  83.3%
1      8340   8.5%
2      3514   3.6%
3      2005   2.0%
4      1078   1.1%

Maximum 5 values
Value  Count  Frequency (%)
42     1      < 0.1%
40     1      < 0.1%
39     1      < 0.1%
38     1      < 0.1%
37     1      < 0.1%

number_emergency
Real number (ℝ≥0)

SKEWED
ZEROS

Distinct: 33
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.2024619339
Minimum: 0
Maximum: 76
Zeros: 86846
Zeros (%): 88.6%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 0
95-th percentile: 1
Maximum: 76
Range: 76
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.94289229
Coefficient of variation (CV): 4.657133675
Kurtosis: 1171.637565
Mean: 0.2024619339
Median Absolute Deviation (MAD): 0
Skewness: 22.71034016
Sum: 19852
Variance: 0.8890458704
Monotonicity: Not monotonic
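
The very large skewness reported for number_emergency is typical of a count that is zero for most records and occasionally very large. The sketch below shows how a sample skewness can be computed. It assumes the bias-adjusted Fisher-Pearson estimator (the one pandas' `Series.skew` is understood to use); the report does not state which estimator it applies, so treat that as an assumption. The toy data are made up.

```python
import math


def sample_skewness(values):
    """Bias-adjusted Fisher-Pearson sample skewness (assumed estimator)."""
    n = len(values)
    if n < 3:
        raise ValueError("need at least three values")
    mean = sum(values) / n
    m2 = sum((x - mean) ** 2 for x in values) / n  # second central moment
    m3 = sum((x - mean) ** 3 for x in values) / n  # third central moment
    if m2 == 0:
        return 0.0  # constant data has no asymmetry to measure
    g1 = m3 / m2 ** 1.5
    return math.sqrt(n * (n - 1)) / (n - 2) * g1


# Toy data: mostly zeros with one large value, loosely mimicking a sparse count.
zero_heavy = [0] * 9 + [10]
# Symmetric toy data, for contrast.
symmetric = [1, 2, 3, 4, 5]

skew_zero_heavy = sample_skewness(zero_heavy)
skew_symmetric = sample_skewness(symmetric)
```

The zero-heavy list yields a clearly positive skewness while the symmetric list yields 0, which mirrors the pattern the Skewed and Zeros warnings point to.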
Histogram with fixed size bins (bins=33)
Common values
Value              Count  Frequency (%)
0                  86846  88.6%
1                  7550   7.7%
2                  2011   2.1%
3                  716    0.7%
4                  372    0.4%
5                  190    0.2%
6                  93     0.1%
7                  72     0.1%
8                  50     0.1%
10                 34     < 0.1%
Other values (23)  119    0.1%

Minimum 5 values
Value  Count  Frequency (%)
0      86846  88.6%
1      7550   7.7%
2      2011   2.1%
3      716    0.7%
4      372    0.4%

Maximum 5 values
Value  Count  Frequency (%)
76     1      < 0.1%
64     1      < 0.1%
63     1      < 0.1%
54     1      < 0.1%
46     1      < 0.1%

number_inpatient
Real number (ℝ≥0)

ZEROS

Distinct: 20
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.6468644509
Minimum: 0
Maximum: 21
Zeros: 64634
Zeros (%): 65.9%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
Median: 0
Q3: 1
95-th percentile: 3
Maximum: 21
Range: 21
Interquartile range (IQR): 1

Descriptive statistics

Standard deviation: 1.271020492
Coefficient of variation (CV): 1.964894639
Kurtosis: 19.94332538
Mean: 0.6468644509
Median Absolute Deviation (MAD): 0
Skewness: 3.554829592
Sum: 63427
Variance: 1.61549309
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=20)
Common values
Value              Count  Frequency (%)
0                  64634  65.9%
1                  19067  19.4%
2                  7421   7.6%
3                  3346   3.4%
4                  1597   1.6%
5                  802    0.8%
6                  474    0.5%
7                  266    0.3%
8                  147    0.1%
9                  111    0.1%
Other values (10)  188    0.2%

Minimum 5 values
Value  Count  Frequency (%)
0      64634  65.9%
1      19067  19.4%
2      7421   7.6%
3      3346   3.4%
4      1597   1.6%

Maximum 5 values
Value  Count  Frequency (%)
21     1      < 0.1%
19     2      < 0.1%
18     1      < 0.1%
16     5      < 0.1%
15     8      < 0.1%

diag_1
Categorical

HIGH CARDINALITY

Distinct: 713
Distinct (%): 0.7%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
428: 6730
414: 6374
786: 3900
410: 3514
486: 3412
Other values (708): 74123

Length

Max length: 6
Median length: 3
Mean length: 3.162167399
Min length: 1

Characters and Unicode

Total characters: 310060
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 87
Unique (%): 0.1%

Sample

1st row: 276
2nd row: 648
3rd row: 8
4th row: 197
5th row: 414
Value               Count  Frequency (%)
428                 6730   6.9%
414                 6374   6.5%
786                 3900   4.0%
410                 3514   3.6%
486                 3412   3.5%
427                 2701   2.8%
491                 2210   2.3%
715                 2073   2.1%
434                 1983   2.0%
780                 1976   2.0%
Other values (703)  63180  64.4%
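
With 713 distinct codes, diag_1 is flagged as high cardinality, and the table above already summarises it by listing the most frequent codes and lumping the remainder into "Other values". The same idea can be applied to the data itself before modelling. The sketch below is one possible approach and not something the report prescribes; `lump_rare` is a hypothetical helper and the toy codes are made up, loosely modelled on values seen in the report.

```python
from collections import Counter


def lump_rare(values, top_k, other_label="Other"):
    """Keep the top_k most frequent values and replace the rest with other_label."""
    keep = {value for value, _ in Counter(values).most_common(top_k)}
    return [value if value in keep else other_label for value in values]


# Toy codes, made up for illustration.
codes = ["428", "428", "414", "414", "786", "250.01", "V27"]

lumped = lump_rare(codes, top_k=2)
```

The choice of `top_k` is a modelling decision; a threshold on the cumulative share of rows would be an equally reasonable alternative.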
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
453841
17.4%
237924
12.2%
836767
11.9%
535509
11.5%
727739
8.9%
126791
8.6%
023459
7.6%
622453
7.2%
919352
 
6.2%
316850
 
5.4%
Other values (3)9375
 
3.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number300685
97.0%
Other Punctuation7774
 
2.5%
Uppercase Letter1601
 
0.5%

Most frequent character per category

ValueCountFrequency (%)
453841
17.9%
237924
12.6%
836767
12.2%
535509
11.8%
727739
9.2%
126791
8.9%
023459
7.8%
622453
7.5%
919352
 
6.4%
316850
 
5.6%
ValueCountFrequency (%)
V1600
99.9%
E1
 
0.1%
ValueCountFrequency (%)
.7774
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common308459
99.5%
Latin1601
 
0.5%

Most frequent character per script

ValueCountFrequency (%)
453841
17.5%
237924
12.3%
836767
11.9%
535509
11.5%
727739
9.0%
126791
8.7%
023459
7.6%
622453
7.3%
919352
 
6.3%
316850
 
5.5%
ValueCountFrequency (%)
V1600
99.9%
E1
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII310060
100.0%

Most frequent character per block

ValueCountFrequency (%)
453841
17.4%
237924
12.2%
836767
11.9%
535509
11.5%
727739
8.9%
126791
8.6%
023459
7.6%
622453
7.2%
919352
 
6.2%
316850
 
5.4%
Other values (3)9375
 
3.0%

diag_2
Categorical

HIGH CARDINALITY

Distinct: 740
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
428: 6517
276: 6513
250: 5412
427: 4919
401: 3613
Other values (735): 71079

Length

Max length: 6
Median length: 3
Mean length: 3.172274178
Min length: 1

Characters and Unicode

Total characters: 311051
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 120
Unique (%): 0.1%

Sample

1st row: 250.01
2nd row: 250
3rd row: 250.43
4th row: 157
5th row: 411
Value               Count  Frequency (%)
428                 6517   6.6%
276                 6513   6.6%
250                 5412   5.5%
427                 4919   5.0%
401                 3613   3.7%
496                 3233   3.3%
599                 3225   3.3%
403                 2781   2.8%
414                 2574   2.6%
411                 2496   2.5%
Other values (730)  56770  57.9%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
449919
16.0%
247802
15.4%
536582
11.8%
032414
10.4%
827942
9.0%
727749
8.9%
125358
8.2%
921289
6.8%
619412
 
6.2%
313683
 
4.4%
Other values (3)8901
 
2.9%

Most occurring categories

ValueCountFrequency (%)
Decimal Number302150
97.1%
Other Punctuation6450
 
2.1%
Uppercase Letter2451
 
0.8%

Most frequent character per category

ValueCountFrequency (%)
449919
16.5%
247802
15.8%
536582
12.1%
032414
10.7%
827942
9.2%
727749
9.2%
125358
8.4%
921289
7.0%
619412
 
6.4%
313683
 
4.5%
ValueCountFrequency (%)
V1735
70.8%
E716
29.2%
ValueCountFrequency (%)
.6450
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common308600
99.2%
Latin2451
 
0.8%

Most frequent character per script

ValueCountFrequency (%)
449919
16.2%
247802
15.5%
536582
11.9%
032414
10.5%
827942
9.1%
727749
9.0%
125358
8.2%
921289
6.9%
619412
 
6.3%
313683
 
4.4%
ValueCountFrequency (%)
V1735
70.8%
E716
29.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII311051
100.0%

Most frequent character per block

ValueCountFrequency (%)
449919
16.0%
247802
15.4%
536582
11.8%
032414
10.4%
827942
9.0%
727749
8.9%
125358
8.2%
921289
6.8%
619412
 
6.2%
313683
 
4.4%
Other values (3)8901
 
2.9%

diag_3
Categorical

HIGH CARDINALITY

Distinct: 786
Distinct (%): 0.8%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB
250: 11208
401: 8090
276: 5097
428: 4491
427: 3865
Other values (781): 65302

Length

Max length: 6
Median length: 3
Mean length: 3.142188408
Min length: 1

Characters and Unicode

Total characters: 308101
Distinct characters: 13
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 126
Unique (%): 0.1%

Sample

1st row: 255
2nd row: V27
3rd row: 403
4th row: 250
5th row: 250
Value               Count  Frequency (%)
250                 11208  11.4%
401                 8090   8.3%
276                 5097   5.2%
428                 4491   4.6%
427                 3865   3.9%
414                 3567   3.6%
496                 2552   2.6%
403                 2322   2.4%
585                 1949   2.0%
272                 1910   1.9%
Other values (776)  53002  54.1%
Histogram of lengths of the category

Most occurring characters

ValueCountFrequency (%)
250082
16.3%
448157
15.6%
540276
13.1%
038773
12.6%
725936
8.4%
124108
7.8%
823281
7.6%
916938
 
5.5%
616112
 
5.2%
313976
 
4.5%
Other values (3)10462
 
3.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number297639
96.6%
Other Punctuation5488
 
1.8%
Uppercase Letter4974
 
1.6%

Most frequent character per category

ValueCountFrequency (%)
250082
16.8%
448157
16.2%
540276
13.5%
038773
13.0%
725936
8.7%
124108
8.1%
823281
7.8%
916938
 
5.7%
616112
 
5.4%
313976
 
4.7%
ValueCountFrequency (%)
V3757
75.5%
E1217
 
24.5%
ValueCountFrequency (%)
.5488
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common303127
98.4%
Latin4974
 
1.6%

Most frequent character per script

ValueCountFrequency (%)
250082
16.5%
448157
15.9%
540276
13.3%
038773
12.8%
725936
8.6%
124108
8.0%
823281
7.7%
916938
 
5.6%
616112
 
5.3%
313976
 
4.6%
ValueCountFrequency (%)
V3757
75.5%
E1217
 
24.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII308101
100.0%

Most frequent character per block

ValueCountFrequency (%)
250082
16.3%
448157
15.6%
540276
13.1%
038773
12.6%
725936
8.4%
124108
7.8%
823281
7.6%
916938
 
5.5%
616112
 
5.2%
313976
 
4.5%
Other values (3)10462
 
3.4%

number_diagnoses
Real number (ℝ≥0)

Distinct: 14
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 7.512059804
Minimum: 3
Maximum: 16
Zeros: 0
Zeros (%): 0.0%
Memory size: 766.2 KiB

Quantile statistics

Minimum: 3
5-th percentile: 4
Q1: 6
Median: 8
Q3: 9
95-th percentile: 9
Maximum: 16
Range: 13
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 1.832496822
Coefficient of variation (CV): 0.2439406593
Kurtosis: -0.3451201144
Mean: 7.512059804
Median Absolute Deviation (MAD): 1
Skewness: -0.8175023377
Sum: 736580
Variance: 3.358044602
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=14)
Common values
Value             Count  Frequency (%)
9                 48687  49.7%
5                 10592  10.8%
8                 10388  10.6%
7                 10179  10.4%
6                 9988   10.2%
4                 5361   5.5%
3                 2751   2.8%
16                40     < 0.1%
13                16     < 0.1%
10                16     < 0.1%
Other values (4)  35     < 0.1%

Minimum 5 values
Value  Count  Frequency (%)
3      2751   2.8%
4      5361   5.5%
5      10592  10.8%
6      9988   10.2%
7      10179  10.4%

Maximum 5 values
Value  Count  Frequency (%)
16     40     < 0.1%
15     8      < 0.1%
14     7      < 0.1%
13     16     < 0.1%
12     9      < 0.1%

max_glu_serum
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

None: 92845
Norm: 2532
>200: 1449
>300: 1227

Length

Max length4
Median length4
Mean length4
Min length4

Characters and Unicode

Total characters: 392212
Distinct characters: 10
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone
ValueCountFrequency (%)
None92845
94.7%
Norm2532
 
2.6%
>2001449
 
1.5%
>3001227
 
1.3%
Histogram of lengths of the category
ValueCountFrequency (%)
none92845
94.7%
norm2532
 
2.6%
2001449
 
1.5%
3001227
 
1.3%

Most occurring characters

ValueCountFrequency (%)
N95377
24.3%
o95377
24.3%
n92845
23.7%
e92845
23.7%
05352
 
1.4%
>2676
 
0.7%
r2532
 
0.6%
m2532
 
0.6%
21449
 
0.4%
31227
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter286131
73.0%
Uppercase Letter95377
 
24.3%
Decimal Number8028
 
2.0%
Math Symbol2676
 
0.7%
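
The category totals for max_glu_serum can be reproduced from the four category counts with Python's standard `unicodedata` module. This is a hedged sketch of the idea rather than the profiler's implementation; `value_counts` is a hand-copied dictionary introduced here for illustration.

```python
import unicodedata
from collections import Counter

# Category -> row count for max_glu_serum, copied from the report.
value_counts = {"None": 92845, "Norm": 2532, ">200": 1449, ">300": 1227}

category_counts = Counter()
for text, rows in value_counts.items():
    for ch in text:
        # unicodedata.category returns a two-letter general category code:
        # "Lu" uppercase letter, "Ll" lowercase letter,
        # "Nd" decimal number, "Sm" math symbol.
        category_counts[unicodedata.category(ch)] += rows

total_chars = sum(category_counts.values())
for code, count in category_counts.most_common():
    print(code, count, f"{100 * count / total_chars:.1f}%")
```

Each character is weighted by the number of rows in which its value occurs, so the totals describe the whole column rather than the four distinct strings.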

Most frequent character per category

ValueCountFrequency (%)
o95377
33.3%
n92845
32.4%
e92845
32.4%
r2532
 
0.9%
m2532
 
0.9%
ValueCountFrequency (%)
05352
66.7%
21449
 
18.0%
31227
 
15.3%
ValueCountFrequency (%)
N95377
100.0%
ValueCountFrequency (%)
>2676
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin381508
97.3%
Common10704
 
2.7%

Most frequent character per script

ValueCountFrequency (%)
N95377
25.0%
o95377
25.0%
n92845
24.3%
e92845
24.3%
r2532
 
0.7%
m2532
 
0.7%
ValueCountFrequency (%)
05352
50.0%
>2676
25.0%
21449
 
13.5%
31227
 
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII392212
100.0%

Most frequent character per block

ValueCountFrequency (%)
N95377
24.3%
o95377
24.3%
n92845
23.7%
e92845
23.7%
05352
 
1.4%
>2676
 
0.7%
r2532
 
0.6%
m2532
 
0.6%
21449
 
0.4%
31227
 
0.3%

A1Cresult
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

None: 81860
>8: 7631
Norm: 4854
>7: 3708

Length

Max length4
Median length4
Mean length3.768716918
Min length2
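
These length statistics follow directly from the A1Cresult category counts listed at the top of this section. A minimal sketch, assuming the counts are copied by hand into a dictionary (the name `value_counts` is illustrative and not part of the report):

```python
import statistics

# Category -> row count for A1Cresult, copied from the report.
value_counts = {"None": 81860, ">8": 7631, "Norm": 4854, ">7": 3708}

rows = sum(value_counts.values())
total_chars = sum(len(text) * n for text, n in value_counts.items())
mean_length = total_chars / rows

# Expand to one length per row to obtain the median and the extremes.
lengths = [len(text) for text, n in value_counts.items() for _ in range(n)]
median_length = statistics.median(lengths)

print(min(lengths), median_length, max(lengths), mean_length)
```

The mean falls below 4 only because the two-character categories ">7" and ">8" make up about 11.6% of the rows.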

Characters and Unicode

Total characters: 369534
Distinct characters: 9
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNone
2nd rowNone
3rd rowNone
4th rowNone
5th rowNone
ValueCountFrequency (%)
None81860
83.5%
>87631
 
7.8%
Norm4854
 
5.0%
>73708
 
3.8%
Histogram of lengths of the category
ValueCountFrequency (%)
none81860
83.5%
87631
 
7.8%
norm4854
 
5.0%
73708
 
3.8%

Most occurring characters

ValueCountFrequency (%)
N86714
23.5%
o86714
23.5%
n81860
22.2%
e81860
22.2%
>11339
 
3.1%
87631
 
2.1%
r4854
 
1.3%
m4854
 
1.3%
73708
 
1.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter260142
70.4%
Uppercase Letter86714
 
23.5%
Math Symbol11339
 
3.1%
Decimal Number11339
 
3.1%

Most frequent character per category

ValueCountFrequency (%)
o86714
33.3%
n81860
31.5%
e81860
31.5%
r4854
 
1.9%
m4854
 
1.9%
ValueCountFrequency (%)
87631
67.3%
73708
32.7%
ValueCountFrequency (%)
N86714
100.0%
ValueCountFrequency (%)
>11339
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin346856
93.9%
Common22678
 
6.1%

Most frequent character per script

ValueCountFrequency (%)
N86714
25.0%
o86714
25.0%
n81860
23.6%
e81860
23.6%
r4854
 
1.4%
m4854
 
1.4%
ValueCountFrequency (%)
>11339
50.0%
87631
33.6%
73708
 
16.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII369534
100.0%

Most frequent character per block

ValueCountFrequency (%)
N86714
23.5%
o86714
23.5%
n81860
22.2%
e81860
22.2%
>11339
 
3.1%
87631
 
2.1%
r4854
 
1.3%
m4854
 
1.3%
73708
 
1.0%

metformin
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 78808
Steady: 17677
Up: 1017
Down: 551

Length

Max length6
Median length2
Mean length2.73235903
Min length2

Characters and Unicode

Total characters: 267916
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No78808
80.4%
Steady17677
 
18.0%
Up1017
 
1.0%
Down551
 
0.6%
Histogram of lengths of the category
ValueCountFrequency (%)
no78808
80.4%
steady17677
 
18.0%
up1017
 
1.0%
down551
 
0.6%

Most occurring characters

ValueCountFrequency (%)
o79359
29.6%
N78808
29.4%
S17677
 
6.6%
t17677
 
6.6%
e17677
 
6.6%
a17677
 
6.6%
d17677
 
6.6%
y17677
 
6.6%
U1017
 
0.4%
p1017
 
0.4%
Other values (3)1653
 
0.6%
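
The per-character counts for metformin can be derived from the four category counts: each character of a value is counted once per row holding that value. A short sketch under that assumption, using a hand-copied `value_counts` dictionary (an illustrative name, not something the report provides):

```python
from collections import Counter

# Category -> row count for metformin, copied from the report.
value_counts = {"No": 78808, "Steady": 17677, "Up": 1017, "Down": 551}

char_counts = Counter()
for text, rows in value_counts.items():
    for ch in text:
        char_counts[ch] += rows

total_chars = sum(char_counts.values())
top10 = char_counts.most_common(10)
# Whatever is not in the ten most frequent characters ends up in "Other values".
other = total_chars - sum(count for _, count in top10)

for ch, count in top10:
    print(ch, count, f"{100 * count / total_chars:.1f}%")
print("Other values", len(char_counts) - len(top10), other)
```

The lowercase "o" is slightly more frequent than "N" because it occurs in both "No" and "Down".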

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter169863
63.4%
Uppercase Letter98053
36.6%

Most frequent character per category

ValueCountFrequency (%)
o79359
46.7%
t17677
 
10.4%
e17677
 
10.4%
a17677
 
10.4%
d17677
 
10.4%
y17677
 
10.4%
p1017
 
0.6%
w551
 
0.3%
n551
 
0.3%
ValueCountFrequency (%)
N78808
80.4%
S17677
 
18.0%
U1017
 
1.0%
D551
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin267916
100.0%

Most frequent character per script

ValueCountFrequency (%)
o79359
29.6%
N78808
29.4%
S17677
 
6.6%
t17677
 
6.6%
e17677
 
6.6%
a17677
 
6.6%
d17677
 
6.6%
y17677
 
6.6%
U1017
 
0.4%
p1017
 
0.4%
Other values (3)1653
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII267916
100.0%

Most frequent character per block

ValueCountFrequency (%)
o79359
29.6%
N78808
29.4%
S17677
 
6.6%
t17677
 
6.6%
e17677
 
6.6%
a17677
 
6.6%
d17677
 
6.6%
y17677
 
6.6%
U1017
 
0.4%
p1017
 
0.4%
Other values (3)1653
 
0.6%

repaglinide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 96530
Steady: 1371
Up: 107
Down: 45

Length

Max length6
Median length2
Mean length2.056846807
Min length2

Characters and Unicode

Total characters: 201680
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No96530
98.4%
Steady1371
 
1.4%
Up107
 
0.1%
Down45
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no96530
98.4%
steady1371
 
1.4%
up107
 
0.1%
down45
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o96575
47.9%
N96530
47.9%
S1371
 
0.7%
t1371
 
0.7%
e1371
 
0.7%
a1371
 
0.7%
d1371
 
0.7%
y1371
 
0.7%
U107
 
0.1%
p107
 
0.1%
Other values (3)135
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter103627
51.4%
Uppercase Letter98053
48.6%

Most frequent character per category

ValueCountFrequency (%)
o96575
93.2%
t1371
 
1.3%
e1371
 
1.3%
a1371
 
1.3%
d1371
 
1.3%
y1371
 
1.3%
p107
 
0.1%
w45
 
< 0.1%
n45
 
< 0.1%
ValueCountFrequency (%)
N96530
98.4%
S1371
 
1.4%
U107
 
0.1%
D45
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin201680
100.0%

Most frequent character per script

ValueCountFrequency (%)
o96575
47.9%
N96530
47.9%
S1371
 
0.7%
t1371
 
0.7%
e1371
 
0.7%
a1371
 
0.7%
d1371
 
0.7%
y1371
 
0.7%
U107
 
0.1%
p107
 
0.1%
Other values (3)135
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII201680
100.0%

Most frequent character per block

ValueCountFrequency (%)
o96575
47.9%
N96530
47.9%
S1371
 
0.7%
t1371
 
0.7%
e1371
 
0.7%
a1371
 
0.7%
d1371
 
0.7%
y1371
 
0.7%
U107
 
0.1%
p107
 
0.1%
Other values (3)135
 
0.1%

nateglinide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 97362
Steady: 657
Up: 23
Down: 11

Length

Max length6
Median length2
Mean length2.0270262
Min length2

Characters and Unicode

Total characters: 198756
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No97362
99.3%
Steady657
 
0.7%
Up23
 
< 0.1%
Down11
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no97362
99.3%
steady657
 
0.7%
up23
 
< 0.1%
down11
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o97373
49.0%
N97362
49.0%
S657
 
0.3%
t657
 
0.3%
e657
 
0.3%
a657
 
0.3%
d657
 
0.3%
y657
 
0.3%
U23
 
< 0.1%
p23
 
< 0.1%
Other values (3)33
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter100703
50.7%
Uppercase Letter98053
49.3%

Most frequent character per category

ValueCountFrequency (%)
o97373
96.7%
t657
 
0.7%
e657
 
0.7%
a657
 
0.7%
d657
 
0.7%
y657
 
0.7%
p23
 
< 0.1%
w11
 
< 0.1%
n11
 
< 0.1%
ValueCountFrequency (%)
N97362
99.3%
S657
 
0.7%
U23
 
< 0.1%
D11
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin198756
100.0%

Most frequent character per script

ValueCountFrequency (%)
o97373
49.0%
N97362
49.0%
S657
 
0.3%
t657
 
0.3%
e657
 
0.3%
a657
 
0.3%
d657
 
0.3%
y657
 
0.3%
U23
 
< 0.1%
p23
 
< 0.1%
Other values (3)33
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII198756
100.0%

Most frequent character per block

ValueCountFrequency (%)
o97373
49.0%
N97362
49.0%
S657
 
0.3%
t657
 
0.3%
e657
 
0.3%
a657
 
0.3%
d657
 
0.3%
y657
 
0.3%
U23
 
< 0.1%
p23
 
< 0.1%
Other values (3)33
 
< 0.1%

chlorpropamide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 97970
Steady: 76
Up: 6
Down: 1

Length

Max length6
Median length2
Mean length2.003120761
Min length2

Characters and Unicode

Total characters: 196412
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No97970
99.9%
Steady76
 
0.1%
Up6
 
< 0.1%
Down1
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no97970
99.9%
steady76
 
0.1%
up6
 
< 0.1%
down1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o97971
49.9%
N97970
49.9%
S76
 
< 0.1%
t76
 
< 0.1%
e76
 
< 0.1%
a76
 
< 0.1%
d76
 
< 0.1%
y76
 
< 0.1%
U6
 
< 0.1%
p6
 
< 0.1%
Other values (3)3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98359
50.1%
Uppercase Letter98053
49.9%

Most frequent character per category

ValueCountFrequency (%)
o97971
99.6%
t76
 
0.1%
e76
 
0.1%
a76
 
0.1%
d76
 
0.1%
y76
 
0.1%
p6
 
< 0.1%
w1
 
< 0.1%
n1
 
< 0.1%
ValueCountFrequency (%)
N97970
99.9%
S76
 
0.1%
U6
 
< 0.1%
D1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin196412
100.0%

Most frequent character per script

ValueCountFrequency (%)
o97971
49.9%
N97970
49.9%
S76
 
< 0.1%
t76
 
< 0.1%
e76
 
< 0.1%
a76
 
< 0.1%
d76
 
< 0.1%
y76
 
< 0.1%
U6
 
< 0.1%
p6
 
< 0.1%
Other values (3)3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII196412
100.0%

Most frequent character per block

ValueCountFrequency (%)
o97971
49.9%
N97970
49.9%
S76
 
< 0.1%
t76
 
< 0.1%
e76
 
< 0.1%
a76
 
< 0.1%
d76
 
< 0.1%
y76
 
< 0.1%
U6
 
< 0.1%
p6
 
< 0.1%
Other values (3)3
 
< 0.1%

glimepiride
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 93066
Steady: 4488
Up: 315
Down: 184

Length

Max length6
Median length2
Mean length2.186837731
Min length2

Characters and Unicode

Total characters: 214426
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No93066
94.9%
Steady4488
 
4.6%
Up315
 
0.3%
Down184
 
0.2%
Histogram of lengths of the category
ValueCountFrequency (%)
no93066
94.9%
steady4488
 
4.6%
up315
 
0.3%
down184
 
0.2%

Most occurring characters

ValueCountFrequency (%)
o93250
43.5%
N93066
43.4%
S4488
 
2.1%
t4488
 
2.1%
e4488
 
2.1%
a4488
 
2.1%
d4488
 
2.1%
y4488
 
2.1%
U315
 
0.1%
p315
 
0.1%
Other values (3)552
 
0.3%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter116373
54.3%
Uppercase Letter98053
45.7%

Most frequent character per category

ValueCountFrequency (%)
o93250
80.1%
t4488
 
3.9%
e4488
 
3.9%
a4488
 
3.9%
d4488
 
3.9%
y4488
 
3.9%
p315
 
0.3%
w184
 
0.2%
n184
 
0.2%
ValueCountFrequency (%)
N93066
94.9%
S4488
 
4.6%
U315
 
0.3%
D184
 
0.2%

Most occurring scripts

ValueCountFrequency (%)
Latin214426
100.0%

Most frequent character per script

ValueCountFrequency (%)
o93250
43.5%
N93066
43.4%
S4488
 
2.1%
t4488
 
2.1%
e4488
 
2.1%
a4488
 
2.1%
d4488
 
2.1%
y4488
 
2.1%
U315
 
0.1%
p315
 
0.1%
Other values (3)552
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII214426
100.0%

Most frequent character per block

ValueCountFrequency (%)
o93250
43.5%
N93066
43.4%
S4488
 
2.1%
t4488
 
2.1%
e4488
 
2.1%
a4488
 
2.1%
d4488
 
2.1%
y4488
 
2.1%
U315
 
0.1%
p315
 
0.1%
Other values (3)552
 
0.3%

acetohexamide
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 98052
Steady: 1

Length

Max length6
Median length2
Mean length2.000040794
Min length2

Characters and Unicode

Total characters: 196110
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 1
Unique (%): < 0.1%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No98052
> 99.9%
Steady1
 
< 0.1%
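
The "< 0.1%" and "> 99.9%" labels in these tables appear to be used whenever a non-zero share would round to 0.0% or an incomplete share would round to 100.0%. That rule is inferred from the figures in this report, not taken from the profiler's source, and `format_share` below is a hypothetical helper written only to illustrate it.

```python
def format_share(count, total):
    """Format count/total as a percentage with one decimal place.

    Assumed rule (inferred from the report): never show 0.0% for a
    non-zero count, and never show 100.0% unless count equals total.
    """
    pct = 100.0 * count / total
    rounded = round(pct, 1)
    if count > 0 and rounded == 0.0:
        return "< 0.1%"
    if count < total and rounded == 100.0:
        return "> 99.9%"
    return f"{pct:.1f}%"


# acetohexamide: 98052 rows "No", 1 row "Steady", 98053 rows in total.
for label, count in [("No", 98052), ("Steady", 1)]:
    print(label, count, format_share(count, 98053))
```

Under this rule a single "Steady" row is still visibly non-zero in the table, instead of being rounded away.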
Histogram of lengths of the category
ValueCountFrequency (%)
no98052
> 99.9%
steady1
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N98052
50.0%
o98052
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98057
50.0%
Uppercase Letter98053
50.0%

Most frequent character per category

ValueCountFrequency (%)
o98052
> 99.9%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%
ValueCountFrequency (%)
N98052
> 99.9%
S1
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin196110
100.0%

Most frequent character per script

ValueCountFrequency (%)
N98052
50.0%
o98052
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII196110
100.0%

Most frequent character per block

ValueCountFrequency (%)
N98052
50.0%
o98052
50.0%
S1
 
< 0.1%
t1
 
< 0.1%
e1
 
< 0.1%
a1
 
< 0.1%
d1
 
< 0.1%
y1
 
< 0.1%

glipizide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 85769
Steady: 10991
Up: 752
Down: 541

Length

Max length6
Median length2
Mean length2.459404608
Min length2

Characters and Unicode

Total characters: 241152
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowSteady
3rd rowNo
4th rowSteady
5th rowNo
ValueCountFrequency (%)
No85769
87.5%
Steady10991
 
11.2%
Up752
 
0.8%
Down541
 
0.6%
Histogram of lengths of the category
ValueCountFrequency (%)
no85769
87.5%
steady10991
 
11.2%
up752
 
0.8%
down541
 
0.6%

Most occurring characters

ValueCountFrequency (%)
o86310
35.8%
N85769
35.6%
S10991
 
4.6%
t10991
 
4.6%
e10991
 
4.6%
a10991
 
4.6%
d10991
 
4.6%
y10991
 
4.6%
U752
 
0.3%
p752
 
0.3%
Other values (3)1623
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter143099
59.3%
Uppercase Letter98053
40.7%

Most frequent character per category

ValueCountFrequency (%)
o86310
60.3%
t10991
 
7.7%
e10991
 
7.7%
a10991
 
7.7%
d10991
 
7.7%
y10991
 
7.7%
p752
 
0.5%
w541
 
0.4%
n541
 
0.4%
ValueCountFrequency (%)
N85769
87.5%
S10991
 
11.2%
U752
 
0.8%
D541
 
0.6%

Most occurring scripts

ValueCountFrequency (%)
Latin241152
100.0%

Most frequent character per script

ValueCountFrequency (%)
o86310
35.8%
N85769
35.6%
S10991
 
4.6%
t10991
 
4.6%
e10991
 
4.6%
a10991
 
4.6%
d10991
 
4.6%
y10991
 
4.6%
U752
 
0.3%
p752
 
0.3%
Other values (3)1623
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII241152
100.0%

Most frequent character per block

ValueCountFrequency (%)
o86310
35.8%
N85769
35.6%
S10991
 
4.6%
t10991
 
4.6%
e10991
 
4.6%
a10991
 
4.6%
d10991
 
4.6%
y10991
 
4.6%
U752
 
0.3%
p752
 
0.3%
Other values (3)1623
 
0.7%

glyburide
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 87792
Steady: 8932
Up: 791
Down: 538

Length

Max length6
Median length2
Mean length2.375348026
Min length2

Characters and Unicode

Total characters: 232910
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No87792
89.5%
Steady8932
 
9.1%
Up791
 
0.8%
Down538
 
0.5%
Histogram of lengths of the category
ValueCountFrequency (%)
no87792
89.5%
steady8932
 
9.1%
up791
 
0.8%
down538
 
0.5%

Most occurring characters

ValueCountFrequency (%)
o88330
37.9%
N87792
37.7%
S8932
 
3.8%
t8932
 
3.8%
e8932
 
3.8%
a8932
 
3.8%
d8932
 
3.8%
y8932
 
3.8%
U791
 
0.3%
p791
 
0.3%
Other values (3)1614
 
0.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter134857
57.9%
Uppercase Letter98053
42.1%

Most frequent character per category

ValueCountFrequency (%)
o88330
65.5%
t8932
 
6.6%
e8932
 
6.6%
a8932
 
6.6%
d8932
 
6.6%
y8932
 
6.6%
p791
 
0.6%
w538
 
0.4%
n538
 
0.4%
ValueCountFrequency (%)
N87792
89.5%
S8932
 
9.1%
U791
 
0.8%
D538
 
0.5%

Most occurring scripts

ValueCountFrequency (%)
Latin232910
100.0%

Most frequent character per script

ValueCountFrequency (%)
o88330
37.9%
N87792
37.7%
S8932
 
3.8%
t8932
 
3.8%
e8932
 
3.8%
a8932
 
3.8%
d8932
 
3.8%
y8932
 
3.8%
U791
 
0.3%
p791
 
0.3%
Other values (3)1614
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII232910
100.0%

Most frequent character per block

ValueCountFrequency (%)
o88330
37.9%
N87792
37.7%
S8932
 
3.8%
t8932
 
3.8%
e8932
 
3.8%
a8932
 
3.8%
d8932
 
3.8%
y8932
 
3.8%
U791
 
0.3%
p791
 
0.3%
Other values (3)1614
 
0.7%

tolbutamide
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 98031
Steady: 22

Length

Max length6
Median length2
Mean length2.000897474
Min length2

Characters and Unicode

Total characters: 196194
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No98031
> 99.9%
Steady22
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no98031
> 99.9%
steady22
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N98031
50.0%
o98031
50.0%
S22
 
< 0.1%
t22
 
< 0.1%
e22
 
< 0.1%
a22
 
< 0.1%
d22
 
< 0.1%
y22
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98141
50.0%
Uppercase Letter98053
50.0%

Most frequent character per category

ValueCountFrequency (%)
o98031
99.9%
t22
 
< 0.1%
e22
 
< 0.1%
a22
 
< 0.1%
d22
 
< 0.1%
y22
 
< 0.1%
ValueCountFrequency (%)
N98031
> 99.9%
S22
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin196194
100.0%

Most frequent character per script

ValueCountFrequency (%)
N98031
50.0%
o98031
50.0%
S22
 
< 0.1%
t22
 
< 0.1%
e22
 
< 0.1%
a22
 
< 0.1%
d22
 
< 0.1%
y22
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII196194
100.0%

Most frequent character per block

ValueCountFrequency (%)
N98031
50.0%
o98031
50.0%
S22
 
< 0.1%
t22
 
< 0.1%
e22
 
< 0.1%
a22
 
< 0.1%
d22
 
< 0.1%
y22
 
< 0.1%

pioglitazone
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 90955
Steady: 6756
Up: 227
Down: 115

Length

Max length6
Median length2
Mean length2.27795172
Min length2

Characters and Unicode

Total characters: 223360
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No90955
92.8%
Steady6756
 
6.9%
Up227
 
0.2%
Down115
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no90955
92.8%
steady6756
 
6.9%
up227
 
0.2%
down115
 
0.1%

Most occurring characters

ValueCountFrequency (%)
o91070
40.8%
N90955
40.7%
S6756
 
3.0%
t6756
 
3.0%
e6756
 
3.0%
a6756
 
3.0%
d6756
 
3.0%
y6756
 
3.0%
U227
 
0.1%
p227
 
0.1%
Other values (3)345
 
0.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter125307
56.1%
Uppercase Letter98053
43.9%

Most frequent character per category

ValueCountFrequency (%)
o91070
72.7%
t6756
 
5.4%
e6756
 
5.4%
a6756
 
5.4%
d6756
 
5.4%
y6756
 
5.4%
p227
 
0.2%
w115
 
0.1%
n115
 
0.1%
ValueCountFrequency (%)
N90955
92.8%
S6756
 
6.9%
U227
 
0.2%
D115
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin223360
100.0%

Most frequent character per script

ValueCountFrequency (%)
o91070
40.8%
N90955
40.7%
S6756
 
3.0%
t6756
 
3.0%
e6756
 
3.0%
a6756
 
3.0%
d6756
 
3.0%
y6756
 
3.0%
U227
 
0.1%
p227
 
0.1%
Other values (3)345
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII223360
100.0%

Most frequent character per block

ValueCountFrequency (%)
o91070
40.8%
N90955
40.7%
S6756
 
3.0%
t6756
 
3.0%
e6756
 
3.0%
a6756
 
3.0%
d6756
 
3.0%
y6756
 
3.0%
U227
 
0.1%
p227
 
0.1%
Other values (3)345
 
0.2%

rosiglitazone
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 91887
Steady: 5908
Up: 174
Down: 84

Length

Max length6
Median length2
Mean length2.242725873
Min length2

Characters and Unicode

Total characters: 219906
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No91887
93.7%
Steady5908
 
6.0%
Up174
 
0.2%
Down84
 
0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no91887
93.7%
steady5908
 
6.0%
up174
 
0.2%
down84
 
0.1%

Most occurring characters

ValueCountFrequency (%)
o91971
41.8%
N91887
41.8%
S5908
 
2.7%
t5908
 
2.7%
e5908
 
2.7%
a5908
 
2.7%
d5908
 
2.7%
y5908
 
2.7%
U174
 
0.1%
p174
 
0.1%
Other values (3)252
 
0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter121853
55.4%
Uppercase Letter98053
44.6%

Most frequent character per category

ValueCountFrequency (%)
o91971
75.5%
t5908
 
4.8%
e5908
 
4.8%
a5908
 
4.8%
d5908
 
4.8%
y5908
 
4.8%
p174
 
0.1%
w84
 
0.1%
n84
 
0.1%
ValueCountFrequency (%)
N91887
93.7%
S5908
 
6.0%
U174
 
0.2%
D84
 
0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin219906
100.0%

Most frequent character per script

ValueCountFrequency (%)
o91971
41.8%
N91887
41.8%
S5908
 
2.7%
t5908
 
2.7%
e5908
 
2.7%
a5908
 
2.7%
d5908
 
2.7%
y5908
 
2.7%
U174
 
0.1%
p174
 
0.1%
Other values (3)252
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII219906
100.0%

Most frequent character per block

ValueCountFrequency (%)
o91971
41.8%
N91887
41.8%
S5908
 
2.7%
t5908
 
2.7%
e5908
 
2.7%
a5908
 
2.7%
d5908
 
2.7%
y5908
 
2.7%
U174
 
0.1%
p174
 
0.1%
Other values (3)252
 
0.1%

acarbose
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 97754
Steady: 286
Up: 10
Down: 3

Length

Max length6
Median length2
Mean length2.011728351
Min length2

Characters and Unicode

Total characters: 197256
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No97754
99.7%
Steady286
 
0.3%
Up10
 
< 0.1%
Down3
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no97754
99.7%
steady286
 
0.3%
up10
 
< 0.1%
down3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o97757
49.6%
N97754
49.6%
S286
 
0.1%
t286
 
0.1%
e286
 
0.1%
a286
 
0.1%
d286
 
0.1%
y286
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter99203
50.3%
Uppercase Letter98053
49.7%

Most frequent character per category

ValueCountFrequency (%)
o97757
98.5%
t286
 
0.3%
e286
 
0.3%
a286
 
0.3%
d286
 
0.3%
y286
 
0.3%
p10
 
< 0.1%
w3
 
< 0.1%
n3
 
< 0.1%
ValueCountFrequency (%)
N97754
99.7%
S286
 
0.3%
U10
 
< 0.1%
D3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin197256
100.0%

Most frequent character per script

ValueCountFrequency (%)
o97757
49.6%
N97754
49.6%
S286
 
0.1%
t286
 
0.1%
e286
 
0.1%
a286
 
0.1%
d286
 
0.1%
y286
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII197256
100.0%

Most frequent character per block

ValueCountFrequency (%)
o97757
49.6%
N97754
49.6%
S286
 
0.1%
t286
 
0.1%
e286
 
0.1%
a286
 
0.1%
d286
 
0.1%
y286
 
0.1%
U10
 
< 0.1%
p10
 
< 0.1%
Other values (3)9
 
< 0.1%

miglitol
Categorical

HIGH CORRELATION

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 98016
Steady: 31
Down: 4
Up: 2

Length

Max length6
Median length2
Mean length2.001346211
Min length2

Characters and Unicode

Total characters: 196238
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No98016
> 99.9%
Steady31
 
< 0.1%
Down4
 
< 0.1%
Up2
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no98016
> 99.9%
steady31
 
< 0.1%
down4
 
< 0.1%
up2
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
o98020
49.9%
N98016
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D4
 
< 0.1%
w4
 
< 0.1%
Other values (3)8
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98185
50.0%
Uppercase Letter98053
50.0%

Most frequent character per category

ValueCountFrequency (%)
o98020
99.8%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
w4
 
< 0.1%
n4
 
< 0.1%
p2
 
< 0.1%
ValueCountFrequency (%)
N98016
> 99.9%
S31
 
< 0.1%
D4
 
< 0.1%
U2
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin196238
100.0%

Most frequent character per script

ValueCountFrequency (%)
o98020
49.9%
N98016
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D4
 
< 0.1%
w4
 
< 0.1%
Other values (3)8
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII196238
100.0%

Most frequent character per block

ValueCountFrequency (%)
o98020
49.9%
N98016
49.9%
S31
 
< 0.1%
t31
 
< 0.1%
e31
 
< 0.1%
a31
 
< 0.1%
d31
 
< 0.1%
y31
 
< 0.1%
D4
 
< 0.1%
w4
 
< 0.1%
Other values (3)8
 
< 0.1%

troglitazone
Categorical

HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 766.2 KiB

No: 98050
Steady: 3

Length

Max length6
Median length2
Mean length2.000122383
Min length2

Characters and Unicode

Total characters: 196118
Distinct characters: 8
Distinct categories: 2
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st rowNo
2nd rowNo
3rd rowNo
4th rowNo
5th rowNo
ValueCountFrequency (%)
No98050
> 99.9%
Steady3
 
< 0.1%
Histogram of lengths of the category
ValueCountFrequency (%)
no98050
> 99.9%
steady3
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
N98050
50.0%
o98050
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter98065
50.0%
Uppercase Letter98053
50.0%

Most frequent character per category

ValueCountFrequency (%)
o98050
> 99.9%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%
ValueCountFrequency (%)
N98050
> 99.9%
S3
 
< 0.1%

Most occurring scripts

ValueCountFrequency (%)
Latin196118
100.0%

Most frequent character per script

ValueCountFrequency (%)
N98050
50.0%
o98050
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII196118
100.0%

Most frequent character per block

ValueCountFrequency (%)
N98050
50.0%
o98050
50.0%
S3
 
< 0.1%
t3
 
< 0.1%
e3
 
< 0.1%
a3
 
< 0.1%
d3
 
< 0.1%
y3
 
< 0.1%

tolazamide
Categorical

HIGH CORRELATION

Distinct      3
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    2.001468594
Min length     2

Characters and Unicode

Total characters     196250
Distinct characters  10
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  < 0.1%

Sample

1st row  No
2nd row  No
3rd row  No
4th row  No
5th row  No

Value   Count  Frequency (%)
No      98016  > 99.9%
Steady  36     < 0.1%
Up      1      < 0.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      98016  > 99.9%
steady  36     < 0.1%
up      1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
N      98016  49.9%
o      98016  49.9%
S      36     < 0.1%
t      36     < 0.1%
e      36     < 0.1%
a      36     < 0.1%
d      36     < 0.1%
y      36     < 0.1%
U      1      < 0.1%
p      1      < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  98197  50.0%
Uppercase Letter  98053  50.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      98016  99.8%
t      36     < 0.1%
e      36     < 0.1%
a      36     < 0.1%
d      36     < 0.1%
y      36     < 0.1%
p      1      < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
N      98016  > 99.9%
S      36     < 0.1%
U      1      < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  196250  100.0%

Most frequent character per script

Value  Count  Frequency (%)
N      98016  49.9%
o      98016  49.9%
S      36     < 0.1%
t      36     < 0.1%
e      36     < 0.1%
a      36     < 0.1%
d      36     < 0.1%
y      36     < 0.1%
U      1      < 0.1%
p      1      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  196250  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      98016  49.9%
o      98016  49.9%
S      36     < 0.1%
t      36     < 0.1%
e      36     < 0.1%
a      36     < 0.1%
d      36     < 0.1%
y      36     < 0.1%
U      1      < 0.1%
p      1      < 0.1%

examide
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct      1
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   95.9 KiB

Value  Count  Frequency (%)
False  98053  100.0%

citoglipton
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct      1
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   95.9 KiB

Value  Count  Frequency (%)
False  98053  100.0%

insulin
Categorical

HIGH CORRELATION

Distinct      4
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    3.439609191
Min length     2

Characters and Unicode

Total characters     337264
Distinct characters  13
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  Up
2nd row  No
3rd row  Up
4th row  Steady
5th row  Steady

Value   Count  Frequency (%)
No      45943  46.9%
Steady  29368  30.0%
Down    11843  12.1%
Up      10899  11.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      45943  46.9%
steady  29368  30.0%
down    11843  12.1%
up      10899  11.1%

Most occurring characters

Value             Count  Frequency (%)
o                 57786  17.1%
N                 45943  13.6%
S                 29368  8.7%
t                 29368  8.7%
e                 29368  8.7%
a                 29368  8.7%
d                 29368  8.7%
y                 29368  8.7%
D                 11843  3.5%
w                 11843  3.5%
Other values (3)  33641  10.0%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  239211  70.9%
Uppercase Letter  98053   29.1%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      57786  24.2%
t      29368  12.3%
e      29368  12.3%
a      29368  12.3%
d      29368  12.3%
y      29368  12.3%
w      11843  5.0%
n      11843  5.0%
p      10899  4.6%

Uppercase Letter
Value  Count  Frequency (%)
N      45943  46.9%
S      29368  30.0%
D      11843  12.1%
U      10899  11.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  337264  100.0%

Most frequent character per script

Value             Count  Frequency (%)
o                 57786  17.1%
N                 45943  13.6%
S                 29368  8.7%
t                 29368  8.7%
e                 29368  8.7%
a                 29368  8.7%
d                 29368  8.7%
y                 29368  8.7%
D                 11843  3.5%
w                 11843  3.5%
Other values (3)  33641  10.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  337264  100.0%

Most frequent character per block

Value             Count  Frequency (%)
o                 57786  17.1%
N                 45943  13.6%
S                 29368  8.7%
t                 29368  8.7%
e                 29368  8.7%
a                 29368  8.7%
d                 29368  8.7%
y                 29368  8.7%
D                 11843  3.5%
w                 11843  3.5%
Other values (3)  33641  10.0%

glyburide-metformin
Categorical

HIGH CORRELATION

Distinct      4
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    2.026985406
Min length     2

Characters and Unicode

Total characters     198752
Distinct characters  13
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  No
2nd row  No
3rd row  No
4th row  No
5th row  No

Value   Count  Frequency (%)
No      97384  99.3%
Steady  660    0.7%
Up      6      < 0.1%
Down    3      < 0.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      97384  99.3%
steady  660    0.7%
up      6      < 0.1%
down    3      < 0.1%

Most occurring characters

Value             Count  Frequency (%)
o                 97387  49.0%
N                 97384  49.0%
S                 660    0.3%
t                 660    0.3%
e                 660    0.3%
a                 660    0.3%
d                 660    0.3%
y                 660    0.3%
U                 6      < 0.1%
p                 6      < 0.1%
Other values (3)  9      < 0.1%

Most occurring categories

Value             Count   Frequency (%)
Lowercase Letter  100699  50.7%
Uppercase Letter  98053   49.3%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      97387  96.7%
t      660    0.7%
e      660    0.7%
a      660    0.7%
d      660    0.7%
y      660    0.7%
p      6      < 0.1%
w      3      < 0.1%
n      3      < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
N      97384  99.3%
S      660    0.7%
U      6      < 0.1%
D      3      < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  198752  100.0%

Most frequent character per script

Value             Count  Frequency (%)
o                 97387  49.0%
N                 97384  49.0%
S                 660    0.3%
t                 660    0.3%
e                 660    0.3%
a                 660    0.3%
d                 660    0.3%
y                 660    0.3%
U                 6      < 0.1%
p                 6      < 0.1%
Other values (3)  9      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  198752  100.0%

Most frequent character per block

Value             Count  Frequency (%)
o                 97387  49.0%
N                 97384  49.0%
S                 660    0.3%
t                 660    0.3%
e                 660    0.3%
a                 660    0.3%
d                 660    0.3%
y                 660    0.3%
U                 6      < 0.1%
p                 6      < 0.1%
Other values (3)  9      < 0.1%

glipizide-metformin
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    2.000530325
Min length     2

Characters and Unicode

Total characters     196158
Distinct characters  8
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  No
2nd row  No
3rd row  No
4th row  No
5th row  No

Value   Count  Frequency (%)
No      98040  > 99.9%
Steady  13     < 0.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      98040  > 99.9%
steady  13     < 0.1%

Most occurring characters

Value  Count  Frequency (%)
N      98040  50.0%
o      98040  50.0%
S      13     < 0.1%
t      13     < 0.1%
e      13     < 0.1%
a      13     < 0.1%
d      13     < 0.1%
y      13     < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  98105  50.0%
Uppercase Letter  98053  50.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      98040  99.9%
t      13     < 0.1%
e      13     < 0.1%
a      13     < 0.1%
d      13     < 0.1%
y      13     < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
N      98040  > 99.9%
S      13     < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  196158  100.0%

Most frequent character per script

Value  Count  Frequency (%)
N      98040  50.0%
o      98040  50.0%
S      13     < 0.1%
t      13     < 0.1%
e      13     < 0.1%
a      13     < 0.1%
d      13     < 0.1%
y      13     < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  196158  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      98040  50.0%
o      98040  50.0%
S      13     < 0.1%
t      13     < 0.1%
e      13     < 0.1%
a      13     < 0.1%
d      13     < 0.1%
y      13     < 0.1%

glimepiride-pioglitazone
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    2.000040794
Min length     2

Characters and Unicode

Total characters     196110
Distinct characters  8
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  < 0.1%

Sample

1st row  No
2nd row  No
3rd row  No
4th row  No
5th row  No

Value   Count  Frequency (%)
No      98052  > 99.9%
Steady  1      < 0.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      98052  > 99.9%
steady  1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  98057  50.0%
Uppercase Letter  98053  50.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      98052  > 99.9%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
N      98052  > 99.9%
S      1      < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  196110  100.0%

Most frequent character per script

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  196110  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

metformin-rosiglitazone
Boolean

CONSTANT
HIGH CORRELATION
REJECTED

Distinct      1
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   95.9 KiB

Value  Count  Frequency (%)
False  98053  100.0%

metformin-pioglitazone
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     6
Median length  2
Mean length    2.000040794
Min length     2

Characters and Unicode

Total characters     196110
Distinct characters  8
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      1
Unique (%)  < 0.1%

Sample

1st row  No
2nd row  No
3rd row  No
4th row  No
5th row  No

Value   Count  Frequency (%)
No      98052  > 99.9%
Steady  1      < 0.1%

[Figure: Histogram of lengths of the category]

Value   Count  Frequency (%)
no      98052  > 99.9%
steady  1      < 0.1%

Most occurring characters

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Most occurring categories

Value             Count  Frequency (%)
Lowercase Letter  98057  50.0%
Uppercase Letter  98053  50.0%

Most frequent character per category

Lowercase Letter
Value  Count  Frequency (%)
o      98052  > 99.9%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Uppercase Letter
Value  Count  Frequency (%)
N      98052  > 99.9%
S      1      < 0.1%

Most occurring scripts

Value  Count   Frequency (%)
Latin  196110  100.0%

Most frequent character per script

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  196110  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      98052  50.0%
o      98052  50.0%
S      1      < 0.1%
t      1      < 0.1%
e      1      < 0.1%
a      1      < 0.1%
d      1      < 0.1%
y      1      < 0.1%

change
Categorical

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     2
Median length  2
Mean length    2
Min length     2

Characters and Unicode

Total characters     196106
Distinct characters  4
Distinct categories  2
Distinct scripts     1
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  Ch
2nd row  No
3rd row  Ch
4th row  Ch
5th row  No

Value  Count  Frequency (%)
No     52774  53.8%
Ch     45279  46.2%

[Figure: Histogram of lengths of the category]

Value  Count  Frequency (%)
no     52774  53.8%
ch     45279  46.2%

Most occurring characters

Value  Count  Frequency (%)
N      52774  26.9%
o      52774  26.9%
C      45279  23.1%
h      45279  23.1%

Most occurring categories

Value             Count  Frequency (%)
Uppercase Letter  98053  50.0%
Lowercase Letter  98053  50.0%

Most frequent character per category

Uppercase Letter
Value  Count  Frequency (%)
N      52774  53.8%
C      45279  46.2%

Lowercase Letter
Value  Count  Frequency (%)
o      52774  53.8%
h      45279  46.2%

Most occurring scripts

Value  Count   Frequency (%)
Latin  196106  100.0%

Most frequent character per script

Value  Count  Frequency (%)
N      52774  26.9%
o      52774  26.9%
C      45279  23.1%
h      45279  23.1%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  196106  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      52774  26.9%
o      52774  26.9%
C      45279  23.1%
h      45279  23.1%

diabetesMed
Boolean

HIGH CORRELATION

Distinct      2
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   95.9 KiB

Value  Count  Frequency (%)
True   75351  76.8%
False  22702  23.2%

readmitted
Categorical

HIGH CORRELATION

Distinct      3
Distinct (%)  < 0.1%
Missing       0
Missing (%)   0.0%
Memory size   766.2 KiB

Length

Max length     3
Median length  2
Mean length    2.466227448
Min length     2

Characters and Unicode

Total characters     241821
Distinct characters  6
Distinct categories  3
Distinct scripts     2
Distinct blocks      1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
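
As a rough illustration of how such per-category character counts can arise, the sketch below tallies Unicode general categories with Python's standard `unicodedata` module, using the `readmitted` value counts reported in this section. The variable names are illustrative, and this is not necessarily how the profiling tool computes its figures.

```python
import unicodedata
from collections import Counter

# Value counts for `readmitted`, copied from the frequency table in this section.
value_counts = {"NO": 52338, ">30": 34649, "<30": 11066}

# Weight every character's general category by how often its value occurs.
category_counts = Counter()
for value, count in value_counts.items():
    for ch in value:
        category_counts[unicodedata.category(ch)] += count

# Lu = Uppercase Letter, Nd = Decimal Number, Sm = Math Symbol
for code, total in category_counts.most_common():
    print(code, total)
```

The totals (104676 uppercase letters, 91430 decimal digits, 45715 math symbols) agree with the "Most occurring categories" table for this variable. Script and block properties are not exposed by `unicodedata`, so they are left out of this sketch.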

Unique

Unique      0
Unique (%)  0.0%

Sample

1st row  >30
2nd row  NO
3rd row  NO
4th row  NO
5th row  >30

Value  Count  Frequency (%)
NO     52338  53.4%
>30    34649  35.3%
<30    11066  11.3%

[Figure: Histogram of lengths of the category]

Value  Count  Frequency (%)
no     52338  53.4%
30     45715  46.6%

Most occurring characters

Value  Count  Frequency (%)
N      52338  21.6%
O      52338  21.6%
3      45715  18.9%
0      45715  18.9%
>      34649  14.3%
<      11066  4.6%

Most occurring categories

Value             Count   Frequency (%)
Uppercase Letter  104676  43.3%
Decimal Number    91430   37.8%
Math Symbol       45715   18.9%

Most frequent character per category

Math Symbol
Value  Count  Frequency (%)
>      34649  75.8%
<      11066  24.2%

Decimal Number
Value  Count  Frequency (%)
3      45715  50.0%
0      45715  50.0%

Uppercase Letter
Value  Count  Frequency (%)
N      52338  50.0%
O      52338  50.0%

Most occurring scripts

Value   Count   Frequency (%)
Common  137145  56.7%
Latin   104676  43.3%

Most frequent character per script

Common
Value  Count  Frequency (%)
3      45715  33.3%
0      45715  33.3%
>      34649  25.3%
<      11066  8.1%

Latin
Value  Count  Frequency (%)
N      52338  50.0%
O      52338  50.0%

Most occurring blocks

Value  Count   Frequency (%)
ASCII  241821  100.0%

Most frequent character per block

Value  Count  Frequency (%)
N      52338  21.6%
O      52338  21.6%
3      45715  18.9%
0      45715  18.9%
>      34649  14.3%
<      11066  4.6%

Interactions

[Figures: 132 interaction plots]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which means that for a linear relationship the steepness of the line does not affect r; only its direction determines the sign.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
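
A minimal sketch of that calculation in plain Python, assuming population (divide-by-n) covariance and standard deviations; the function name `pearson_r` and the toy data are illustrative and not taken from the report.

```python
import math

def pearson_r(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y)) / n
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x) / n)
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y) / n)
    return cov / (std_x * std_y)

x = [1.0, 2.0, 3.0, 4.0, 5.0]
steep_up = [10 * v + 3 for v in x]       # steep increasing line
shallow_up = [0.1 * v - 7 for v in x]    # shallow increasing line
down = [-2 * v + 1 for v in x]           # decreasing line

print(pearson_r(x, steep_up), pearson_r(x, shallow_up), pearson_r(x, down))
```

Both increasing lines give r of 1 (up to floating-point rounding) regardless of slope, and the decreasing line gives -1, which illustrates the location and scale invariance described above.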

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
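
The rank-then-correlate recipe can be sketched in plain Python as follows. Ties are given their average rank, which is a common convention but an assumption here; the helper names and the toy data are illustrative.

```python
import math

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson_r(x, y):
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman_rho(x, y):
    # Pearson's r applied to the rank variables of x and y.
    return pearson_r(average_ranks(x), average_ranks(y))

x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear

print(spearman_rho(x, y), pearson_r(x, y))
```

For this cubic relationship ρ is 1 while r stays below 1, which is the sense in which ρ captures nonlinear monotonic association.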

Kendall's τ

Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the discordant pairs divided by the total number of pairs.
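
A small sketch of the pair-counting definition given above (often called tau-a). Library implementations frequently use a tie-adjusted variant instead, so results may differ when the data contain ties; the function name and the toy data are illustrative.

```python
from itertools import combinations

def kendall_tau_a(x, y):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(list(zip(x, y)), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1
        elif direction < 0:
            discordant += 1
        # direction == 0 means a tie in x or y; it counts in neither group
    n = len(x)
    total_pairs = n * (n - 1) // 2
    return (concordant - discordant) / total_pairs

x = [1, 2, 3, 4]
y = [1, 3, 2, 4]  # one swapped pair out of six

print(kendall_tau_a(x, y))
```

Here five of the six pairs are concordant and one is discordant, so τ = (5 - 1) / 6.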

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution. Extensive documentation is available from the phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
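
A hedged sketch of a bias-corrected Cramér's V computed from a contingency table, following the correction commonly attributed to Bergsma (2013). It assumes every row and column of the table has a non-zero total, so a constant column would need to be dropped first. The function name and the example tables are illustrative and not taken from the report.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table given as a list of rows."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

independent = [[10, 20], [30, 60]]  # rows are proportional
perfect = [[50, 0], [0, 50]]        # each row maps to exactly one column

print(cramers_v_corrected(independent), cramers_v_corrected(perfect))
```

The proportional table yields 0 and the diagonal table yields 1 (up to rounding), matching the two ends of the range described above.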

Missing values

A simple visualization of nullity by column.
Nullity matrix is a data-dense display which lets you quickly visually pick out patterns in data completion.

Sample

First rows

,df_index,race,gender,age,admission_type_id,discharge_disposition_id,admission_source_id,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,diag_1,diag_2,diag_3,number_diagnoses,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,examide,citoglipton,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed,readmitted
0,1,Caucasian,Female,[10-20),1,1,7,3,59,0,18,0,0,0,276,250.01,255,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Up,No,No,No,No,No,Ch,Yes,>30
1,2,AfricanAmerican,Female,[20-30),1,1,7,2,11,5,13,2,0,1,648,250,V27,6,None,None,No,No,No,No,No,No,Steady,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Yes,NO
2,3,Caucasian,Male,[30-40),1,1,7,2,44,1,16,0,0,0,8,250.43,403,7,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Up,No,No,No,No,No,Ch,Yes,NO
3,4,Caucasian,Male,[40-50),1,1,7,1,51,0,8,0,0,0,197,157,250,5,None,None,No,No,No,No,No,No,Steady,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,Ch,Yes,NO
4,5,Caucasian,Male,[50-60),2,1,2,3,31,6,16,0,0,0,414,411,250,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Yes,>30
5,6,Caucasian,Male,[60-70),3,1,2,4,70,1,21,0,0,0,414,411,V45,7,None,None,Steady,No,No,No,Steady,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,Ch,Yes,NO
6,7,Caucasian,Male,[70-80),1,1,7,5,73,0,12,0,0,0,428,492,250,8,None,None,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Yes,>30
7,8,Caucasian,Female,[80-90),2,1,4,13,68,2,28,0,0,0,398,427,38,8,None,None,No,No,No,No,No,No,Steady,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,Ch,Yes,NO
8,9,Caucasian,Female,[90-100),3,3,4,12,33,3,18,0,0,0,434,198,486,8,None,None,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Steady,No,No,No,No,No,Ch,Yes,NO
9,10,AfricanAmerican,Female,[40-50),1,1,7,9,47,2,17,0,0,0,250.7,403,996,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Yes,>30

Last rows

,df_index,race,gender,age,admission_type_id,discharge_disposition_id,admission_source_id,time_in_hospital,num_lab_procedures,num_procedures,num_medications,number_outpatient,number_emergency,number_inpatient,diag_1,diag_2,diag_3,number_diagnoses,max_glu_serum,A1Cresult,metformin,repaglinide,nateglinide,chlorpropamide,glimepiride,acetohexamide,glipizide,glyburide,tolbutamide,pioglitazone,rosiglitazone,acarbose,miglitol,troglitazone,tolazamide,examide,citoglipton,insulin,glyburide-metformin,glipizide-metformin,glimepiride-pioglitazone,metformin-rosiglitazone,metformin-pioglitazone,change,diabetesMed,readmitted
98043,101756,Other,Female,[60-70),1,1,7,2,46,6,17,1,1,1,996,585,403,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Yes,>30
98044,101757,Caucasian,Female,[70-80),1,1,7,5,21,1,16,0,0,1,491,518,511,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Yes,NO
98045,101758,Caucasian,Female,[80-90),1,1,7,5,76,1,22,0,1,0,292,8,304,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Up,No,No,No,No,No,Ch,Yes,NO
98046,101759,Caucasian,Male,[80-90),1,1,7,1,1,0,15,3,0,0,435,784,250,7,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Up,No,No,No,No,No,Ch,Yes,NO
98047,101760,AfricanAmerican,Female,[60-70),1,1,7,6,45,1,25,3,1,2,345,438,412,9,None,None,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Down,No,No,No,No,No,Ch,Yes,>30
98048,101761,AfricanAmerican,Male,[70-80),1,3,7,3,51,0,16,0,0,0,250.13,291,458,9,None,>8,Steady,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Down,No,No,No,No,No,Ch,Yes,>30
98049,101762,AfricanAmerican,Female,[80-90),1,4,5,5,33,3,18,0,0,1,560,276,787,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Steady,No,No,No,No,No,No,Yes,NO
98050,101763,Caucasian,Male,[70-80),1,1,7,1,53,0,9,1,0,0,38,590,296,13,None,None,Steady,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,Down,No,No,No,No,No,Ch,Yes,NO
98051,101764,Caucasian,Female,[80-90),2,3,7,10,45,2,21,0,0,1,996,285,998,9,None,None,No,No,No,No,No,No,Steady,No,No,Steady,No,No,No,No,No,No,No,Up,No,No,No,No,No,Ch,Yes,NO
98052,101765,Caucasian,Male,[70-80),1,1,7,6,13,3,3,0,0,0,530,530,787,9,None,None,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,No,NO