Monday, May 18, 2020

Verifying the Output of Unsupervised Machine Learning (Cluster Analysis) Using ANOVA

SAS for ANOVA

ANOVA - Analysis of Variance
Used to compare more than two categories
(To compare two categories we used the t-test in Excel)


We have the following categories:
AT: Athlete
SEM: Semi-athlete
SED: Sedentary
PR: Teacher

There are 4 categories. When the number of categories is greater than 2, the t-test no longer applies and we have to use SAS (the Excel t-test does not handle this case).
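One way to see why a single ANOVA is preferred over repeated t-tests is to count the pairwise comparisons that 4 categories would require. The sketch below is only an illustration: the 0.05 significance level and the treatment of the tests as independent are assumptions made here, not something stated in the post.

```python
from itertools import combinations

# Category labels as listed above
categories = ["AT", "SEM", "SED", "PR"]

# Every pair of categories would need its own t-test
pairs = list(combinations(categories, 2))

# Assumption: alpha = 0.05 per test, and the tests are treated as independent
# (only an approximation, since the pairs share data)
alpha = 0.05
familywise_error = 1 - (1 - alpha) ** len(pairs)

print(len(pairs))                  # 6
print(round(familywise_error, 3))  # 0.265
```

Under these assumptions, six separate t-tests would carry roughly a 26% chance of at least one false positive, which is the usual argument for testing all categories at once.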



SAS Program for ANOVA

data imc_dat;
input cat $ imc corr kcal;
cards;
AT 20.2 60.7 3200
AT 21.3 54.8 3100
AT 19.3 49.6 2800
AT 21.1 52.3 3300
SEM 22.4 14.9 2600
SEM 21.9 17.8 2700
SEM 23.8 18.6 3200
SEM 24.1 15.1 3300
SED  27.3 2.5 2700
SED 23.4 4.3 2300
SED  25.2 2.3 2600
SED 26.4 2.6 3200
PR 26.2 4.1 2600
PR 24.2 2.1 2700
PR 25.4 1.9 2650
;
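/* List the data set to confirm it was read correctly */
proc print data=imc_dat;
run;

/* One-way ANOVA of each response on cat, followed by Duncan's multiple range test */
proc glm data=imc_dat;
 class cat;
 model imc corr kcal = cat;
 means cat / duncan lines;
run;
quit;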
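The model statement fits one one-way ANOVA per dependent variable. As a cross-check outside SAS, the following Python sketch recomputes the F statistic by hand from the same 15 rows. The function name `one_way_anova` and the data layout are illustrative choices made here, not part of the SAS program.

```python
from collections import defaultdict

# Same rows as the cards block: (cat, imc, corr, kcal)
rows = [
    ("AT", 20.2, 60.7, 3200), ("AT", 21.3, 54.8, 3100),
    ("AT", 19.3, 49.6, 2800), ("AT", 21.1, 52.3, 3300),
    ("SEM", 22.4, 14.9, 2600), ("SEM", 21.9, 17.8, 2700),
    ("SEM", 23.8, 18.6, 3200), ("SEM", 24.1, 15.1, 3300),
    ("SED", 27.3, 2.5, 2700), ("SED", 23.4, 4.3, 2300),
    ("SED", 25.2, 2.3, 2600), ("SED", 26.4, 2.6, 3200),
    ("PR", 26.2, 4.1, 2600), ("PR", 24.2, 2.1, 2700),
    ("PR", 25.4, 1.9, 2650),
]


def one_way_anova(groups):
    """Return (F, df_between, df_within) for a dict of label -> list of values."""
    values = [v for g in groups.values() for v in g]
    grand_mean = sum(values) / len(values)
    # Between-group sum of squares: group size times squared distance to the grand mean
    ss_between = sum(
        len(g) * (sum(g) / len(g) - grand_mean) ** 2 for g in groups.values()
    )
    # Within-group sum of squares: squared distance of each value to its group mean
    ss_within = sum(
        (v - sum(g) / len(g)) ** 2 for g in groups.values() for v in g
    )
    df_between = len(groups) - 1
    df_within = len(values) - len(groups)
    f_value = (ss_between / df_between) / (ss_within / df_within)
    return f_value, df_between, df_within


f_values = {}
for idx, name in enumerate(["imc", "corr", "kcal"], start=1):
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[idx])
    f_value, df_b, df_w = one_way_anova(groups)
    f_values[name] = f_value
    print(f"{name}: F({df_b}, {df_w}) = {f_value:.2f}")
# imc: F(3, 11) = 14.23
# corr: F(3, 11) = 300.25
# kcal: F(3, 11) = 1.95
```

These values agree with the F Value column that PROC GLM reports for each dependent variable.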




Output of the SAS Program for ANOVA



Obs  cat   imc  corr  kcal
  1  AT   20.2  60.7  3200
  2  AT   21.3  54.8  3100
  3  AT   19.3  49.6  2800
  4  AT   21.1  52.3  3300
  5  SEM  22.4  14.9  2600
  6  SEM  21.9  17.8  2700
  7  SEM  23.8  18.6  3200
  8  SEM  24.1  15.1  3300
  9  SE   27.3   2.5  2700
 10  SE   23.4   4.3  2300
 11  SE   25.2   2.3  2600
 12  SE   26.4   2.6  3200
 13  PR   26.2   4.1  2600
 14  PR   24.2   2.1  2700
 15  PR   25.4   1.9  2650

The GLM Procedure

Class Level Information
Class  Levels  Values
cat         4  AT PR SE SEM

Number of Observations Read  15
Number of Observations Used  15

Dependent Variable: imc

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             3     63.99233333  21.33077778    14.23  0.0004
Error            11     16.49166667   1.49924242
Corrected Total  14     80.48400000

R-Square  Coeff Var  Root MSE  imc Mean
0.795094   5.214802  1.224436  23.48000

Source  DF    Type I SS  Mean Square  F Value  Pr > F
cat      3  63.99233333  21.33077778    14.23  0.0004

Source  DF  Type III SS  Mean Square  F Value  Pr > F
cat      3  63.99233333  21.33077778    14.23  0.0004

[Figure: Fit Plot for imc by cat]
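The summary statistics reported for imc follow directly from the sums of squares in the ANOVA table. The short sketch below recomputes them using the standard definitions (R-Square = model SS / total SS, Root MSE = square root of the error mean square, Coeff Var = 100 × Root MSE / mean); the variable names are chosen here for illustration.

```python
import math

# Values copied from the imc ANOVA table
ss_model = 63.99233333
ss_total = 80.48400000
ms_error = 1.49924242
df_model = 3
imc_mean = 23.48

r_square = ss_model / ss_total            # share of variation explained by cat
root_mse = math.sqrt(ms_error)            # residual standard deviation
coeff_var = 100 * root_mse / imc_mean     # Root MSE as a percentage of the mean
f_value = (ss_model / df_model) / ms_error

print(round(r_square, 6))   # 0.795094
print(round(root_mse, 6))   # 1.224436
print(round(coeff_var, 4))  # 5.2148
print(round(f_value, 2))    # 14.23
```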

Dependent Variable: corr

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             3     6829.158500  2276.386167   300.25  <.0001
Error            11       83.397500     7.581591
Corrected Total  14     6912.556000

R-Square  Coeff Var  Root MSE  corr Mean
0.987935   13.60410  2.753469   20.24000

Source  DF    Type I SS  Mean Square  F Value  Pr > F
cat      3  6829.158500  2276.386167   300.25  <.0001

Source  DF  Type III SS  Mean Square  F Value  Pr > F
cat      3  6829.158500  2276.386167   300.25  <.0001

[Figure: Fit Plot for corr by cat]

Dependent Variable: kcal

Source           DF  Sum of Squares  Mean Square  F Value  Pr > F
Model             3      497333.333   165777.778     1.95  0.1801
Error            11      935000.000    85000.000
Corrected Total  14     1432333.333

R-Square  Coeff Var  Root MSE  kcal Mean
0.347219   10.18210  291.5476   2863.333

Source  DF    Type I SS  Mean Square  F Value  Pr > F
cat      3  497333.3333  165777.7778     1.95  0.1801

Source  DF  Type III SS  Mean Square  F Value  Pr > F
cat      3  497333.3333  165777.7778     1.95  0.1801

[Figure: Fit Plot for kcal by cat]

[Figure: Distribution of imc by cat]

Duncan's Multiple Range Test for imc

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                        0.05
Error Degrees of Freedom     11
Error Mean Square            1.499242
Harmonic Mean of Cell Sizes  3.692308

Note: Cell sizes are not equal.

Number of Means      2      3      4
Critical Range   1.983  2.075  2.129
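The "Harmonic Mean of Cell Sizes" line reflects the unequal group sizes: three categories have 4 observations and PR has 3. A minimal sketch of that arithmetic, with the group sizes read off the data listing:

```python
# Observations per category, counted from the data listing
cell_sizes = [4, 4, 4, 3]

# Harmonic mean: number of groups divided by the sum of reciprocal sizes
harmonic_mean = len(cell_sizes) / sum(1 / n for n in cell_sizes)

print(round(harmonic_mean, 6))  # 3.692308
```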

[Figure: Distribution of corr by cat]

Duncan's Multiple Range Test for corr

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                        0.05
Error Degrees of Freedom     11
Error Mean Square            7.581591
Harmonic Mean of Cell Sizes  3.692308

Note: Cell sizes are not equal.

Number of Means      2      3      4
Critical Range   4.460  4.665  4.788

[Figure: Distribution of kcal by cat]

Duncan's Multiple Range Test for kcal

Note: This test controls the Type I comparisonwise error rate, not the experimentwise error rate.

Alpha                        0.05
Error Degrees of Freedom     11
Error Mean Square            85000
Harmonic Mean of Cell Sizes  3.692308

Note: Cell sizes are not equal.

Number of Means      2      3      4
Critical Range   472.3  494.0  507.0
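The critical ranges can be read against the group means: two categories are taken to differ when the gap between their means exceeds the critical range for the number of ordered means they span. The sketch below applies that comparison to each variable. It is a simplified pairwise reading, not SAS's full stepwise Duncan procedure, and the helper name `differing_pairs` is hypothetical. Category labels follow the program's cards block (the printed output shows the sedentary group as SE).

```python
from collections import defaultdict

# Same rows as the cards block: (cat, imc, corr, kcal)
rows = [
    ("AT", 20.2, 60.7, 3200), ("AT", 21.3, 54.8, 3100),
    ("AT", 19.3, 49.6, 2800), ("AT", 21.1, 52.3, 3300),
    ("SEM", 22.4, 14.9, 2600), ("SEM", 21.9, 17.8, 2700),
    ("SEM", 23.8, 18.6, 3200), ("SEM", 24.1, 15.1, 3300),
    ("SED", 27.3, 2.5, 2700), ("SED", 23.4, 4.3, 2300),
    ("SED", 25.2, 2.3, 2600), ("SED", 26.4, 2.6, 3200),
    ("PR", 26.2, 4.1, 2600), ("PR", 24.2, 2.1, 2700),
    ("PR", 25.4, 1.9, 2650),
]

# Critical ranges copied from the three Duncan tables, keyed by number of means spanned
critical_ranges = {
    "imc": {2: 1.983, 3: 2.075, 4: 2.129},
    "corr": {2: 4.460, 3: 4.665, 4: 4.788},
    "kcal": {2: 472.3, 3: 494.0, 4: 507.0},
}


def differing_pairs(group_means, critical_range):
    """List category pairs whose mean gap exceeds the critical range for their span."""
    ordered = sorted(group_means.items(), key=lambda kv: kv[1], reverse=True)
    pairs = []
    for i in range(len(ordered)):
        for j in range(i + 1, len(ordered)):
            span = j - i + 1  # number of ordered means from position i to j
            if ordered[i][1] - ordered[j][1] > critical_range[span]:
                pairs.append((ordered[i][0], ordered[j][0]))
    return pairs


results = {}
for idx, name in enumerate(["imc", "corr", "kcal"], start=1):
    values = defaultdict(list)
    for row in rows:
        values[row[0]].append(row[idx])
    group_means = {cat: sum(v) / len(v) for cat, v in values.items()}
    results[name] = differing_pairs(group_means, critical_ranges[name])
    print(name, results[name])
```

Read this way, imc and corr separate every pair of categories except SED and PR, while kcal separates none, which is in line with the Pr > F values of 0.0004, <.0001 and 0.1801 reported above.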
