Copy this and pay attention to the explanation!
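# select_columns and transform_column come from pyjanitor (import janitor);
# .missing is a custom pandas accessor, not part of pandas itself.
nhanes_model_df = (
    nhanes_df
    .select_columns('height', 'weight', 'gender', 'age')
    .sort_values(by='height')
    # Forward-fill weight (a simple placeholder imputation) so it can be used as a predictor.
    .transform_column(
        'weight',
        lambda x: x.ffill(),
        elementwise=False
    )
    # Append boolean `_imp` shadow columns marking which values are missing at this point.
    .missing.bind_shadow_matrix(
        True,
        False,
        suffix='_imp',
        only_missing=False
    )
)

nhanes_model_df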
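
# Fit an OLS model of height on the other columns (smf is statsmodels.formula.api).
# The formula interface drops rows with missing values, so only observed heights are used.
height_ols = (
    nhanes_model_df
    .pipe(lambda df: smf.ols('height ~ weight + gender + age', data=df))
    .fit()
)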

# Predict height only for the rows where it is missing, rounded to the nearest integer.
ols_imputed_values = (
    nhanes_model_df
    .pipe(
        lambda df: df[df.height.isna()]
    )
    .pipe(
        lambda df: height_ols.predict(df).round()
    )
)

ols_imputed_values

# Write the predictions back into the rows whose height is missing (aligned by index).
nhanes_model_df.loc[nhanes_model_df.height.isna(), ['height']] = ols_imputed_values

nhanes_model_df

(
    nhanes_model_df.missing.scatter_imputation_plot(
        x='weight',
        y='height'
    )
)
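
For anyone who wants to try the idea without nhanes_df or the .missing accessor, here is a minimal sketch of the same model-based imputation using only pandas and NumPy. Everything in it is an assumption made for illustration: the toy data is synthetic, the helper name impute_with_ols is hypothetical, and np.linalg.lstsq stands in for the statsmodels fit used above. It also assumes the predictor columns have no missing values, which is why the snippet above fills weight first.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data standing in for nhanes_df (all values are made up).
rng = np.random.default_rng(0)
n = 200
toy_df = pd.DataFrame({
    "weight": rng.normal(70, 12, n),
    "age": rng.integers(18, 80, n).astype(float),
})
toy_df["height"] = (
    100 + 0.9 * toy_df["weight"] + 0.05 * toy_df["age"] + rng.normal(0, 3, n)
)

# Remove 20 heights to simulate missing values.
missing_idx = rng.choice(n, size=20, replace=False)
toy_df.loc[missing_idx, "height"] = np.nan


def impute_with_ols(df, target, predictors):
    """Fill NaNs in `target` with rounded least-squares predictions.

    Hypothetical helper; assumes the predictor columns contain no NaNs.
    """
    out = df.copy()
    is_missing = out[target].isna().to_numpy()

    # Design matrix with an intercept column.
    X = np.column_stack([np.ones(len(out)), out[predictors].to_numpy()])
    y = out[target].to_numpy()

    # Fit on the complete rows only.
    coef, *_ = np.linalg.lstsq(X[~is_missing], y[~is_missing], rcond=None)

    # Shadow column: True where the value was imputed.
    out[f"{target}_imp"] = is_missing
    # Replace the missing values with rounded predictions.
    out.loc[is_missing, target] = np.round(X[is_missing] @ coef)
    return out


imputed_df = impute_with_ols(toy_df, "height", ["weight", "age"])
print(imputed_df[imputed_df["height_imp"]].head())
```

The height_imp column plays the same role as the `_imp` shadow columns above: it records which values came from the model, so they can be told apart from observed ones in a plot.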