Forecasting neural network with real data


How to tune a neural network with real data?

Neural networks can predict an outcome from data, but how do we start working with them on real data? This walkthrough fits one to a random sample of students in order to predict math scores from socioeconomic variables.

What libraries were used in the experiment?

To build our neural network, the saber and nnet libraries were used in R. These tools are essential for handling data and building neural network models effectively. Here's how to invoke them in your programming environment:

library(saber)
library(nnet)

What were the variables and sample size?

A sample size of 2000 observations was selected. The variables used are mainly socioeconomic, providing information about the student's household:

  • Number of people in the household
  • Number of rooms
  • Ownership of a washing machine, refrigerator, oven, DVD player, microwave, and car

Technological variables, such as having Internet access or a cell phone, were also considered but not used in this exercise. They could be added in future models.

How was the sample managed?

The sample selection process was implemented using a logical vector, which helps to identify the rows that will be part of this random sample. The approach used is shown here:

sample_indices <- seq_len(nrow(data)) %in% sample(seq_len(nrow(data)), size = sample_size)
sample <- data[sample_indices, variables]

Creating a sample using logical indexes allows keeping only the necessary observations and facilitates further analysis by reducing the data set to a manageable size.
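The logical-index approach can be tried on any data frame. The following is a minimal sketch, assuming the built-in mtcars data set and three of its columns as stand-ins for the course data; the names data, sample_size, variables, picked_rows, and sampled are illustrative only.

```r
# Minimal sketch: draw a random sample of rows with a logical vector.
# mtcars and the chosen columns are stand-ins, not the course data.
set.seed(42)                      # make the random draw reproducible

data        <- mtcars
sample_size <- 10
variables   <- c("mpg", "hp", "wt")

# Row numbers picked at random, without replacement
picked_rows <- sample(seq_len(nrow(data)), size = sample_size)

# TRUE for rows that were picked, FALSE for all the others
sample_indices <- seq_len(nrow(data)) %in% picked_rows

# Keep the sampled rows and only the columns of interest
sampled <- data[sample_indices, variables]

# Drop rows with missing values before modeling
sampled <- na.omit(sampled)

dim(sampled)
```

Storing the result under a name other than sample avoids reusing the name of the base function, which can make later code harder to read.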

How was the neural network constructed?

It was time to create the neural network. Using nnet, a model was established to predict the math score as a response, using the previously selected variables:

neural_net <- nnet(math_score ~ ., data = sample, size = 10, linout = TRUE, trace = TRUE)

Here, size specifies the number of neurons in the hidden layer, and linout = TRUE indicates that linear output is desired.
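To see these arguments in action without the course data, here is a small sketch on synthetic data. The data frame toy, its columns, and the choice of three hidden neurons are assumptions made for illustration; only the nnet arguments themselves come from the class.

```r
library(nnet)   # normally installed with R as a recommended package

set.seed(1)

# Synthetic stand-in data: a score driven by two household-style predictors
n <- 200
toy <- data.frame(
  rooms  = sample(1:6, n, replace = TRUE),
  people = sample(1:8, n, replace = TRUE)
)
toy$score <- 40 + 2 * toy$rooms - 1 * toy$people + rnorm(n, sd = 3)

# size   = number of neurons in the hidden layer
# linout = TRUE gives a linear output unit, suited to a numeric response
# trace  = FALSE silences the iteration log (the class used trace = TRUE)
# maxit  = upper limit on the number of fitting iterations
fit <- nnet(score ~ ., data = toy, size = 3, linout = TRUE,
            trace = FALSE, maxit = 200)

# Predictions come back as a one-column matrix, one row per observation
preds <- predict(fit, toy)
head(preds)
```

Without linout = TRUE the output unit is logistic, so predictions would be confined to the interval from 0 to 1, which does not match a score on a larger scale.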

What results were obtained and how were they displayed?

To check the fit, the predicted values are plotted against the actual math scores. If the predictions were perfect, every point would fall on the line y = x, which abline(0, 1) draws. The following code was used to visualize this:

plot(sample$math_score, predict(neural_net, sample), col = 2, lwd = 2)
abline(0, 1)

Although the results show degrees of scatter and some deviation, the plot reveals whether the model is doing a reasonably accurate job of predicting the expected results.
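A plot can be complemented with a number or two. The sketch below is not part of the class: it uses made-up actual and predicted scores to compute the root mean squared error and the correlation between the two, next to the same kind of scatter plot.

```r
# Hypothetical actual and predicted scores, for illustration only
actual    <- c(45, 52, 38, 60, 47, 55)
predicted <- c(47, 50, 41, 56, 46, 53)

# Root mean squared error: typical size of a prediction error, in score points
rmse <- sqrt(mean((actual - predicted)^2))

# Correlation between the predictions and the actual values
r <- cor(actual, predicted)

# Same visual check as in the class: points near the y = x line are good predictions
plot(actual, predicted, col = 2, pch = 20)
abline(0, 1)

c(rmse = rmse, correlation = r)
```

With a fitted model, predicted would be replaced by the output of predict() on the data being evaluated.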

How to improve the model's prediction?

The constructed neural network can benefit from an optimization process. The key is to thoroughly explore the database and detect variables that have a high correlation with the math score. These could include:

  • Scores in other subjects (such as physics)
  • School characteristics

Thorough exploration and analysis of these variables can significantly improve future predictions, emphasizing those that reflect a more significant impact on academic performance.
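One simple way to start that exploration is to rank candidate variables by their correlation with the response. This is a sketch under stated assumptions: the data frame scores and its columns are synthetic and hypothetical, not columns of the course data set.

```r
set.seed(7)

# Synthetic stand-in data; column names are illustrative only
n <- 300
physics <- rnorm(n, mean = 45, sd = 10)
rooms   <- sample(1:6, n, replace = TRUE)
noise   <- rnorm(n)
math    <- 0.8 * physics + 0.5 * rooms + rnorm(n, sd = 5)

scores <- data.frame(math, physics, rooms, noise)

# Correlation of every candidate predictor with the response
candidates   <- setdiff(names(scores), "math")
correlations <- sapply(scores[candidates], cor, y = scores$math,
                       use = "complete.obs")

# Rank by absolute correlation, strongest first
ranked <- sort(abs(correlations), decreasing = TRUE)
ranked
```

Correlation only measures linear association, so a low value does not rule out a variable that a neural network could still exploit; it is a first filter, not a final selection.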

Remember, the path of machine learning is a continuous process of experimentation and optimization. Keep exploring, analyzing and fine-tuning your model to get the best possible results - the adventure has just begun!

Contributions 6


I'm sharing my notes on this course; I hope you find them useful

https://github.com/rb-one/Curso_de_Estadistica_Inferencial_con_R/blob/master/Notes/note.md

# Confidence intervals for the mean ---------------------------------------

table(SB11_20111$ECON_SN_INTERNET)

# Is Internet access related to the physics score?

tamano_muestral <- 300
iteraciones <- 100

poblacion_A <- SB11_20111$FISICA_PUNT[SB11_20111$ECON_SN_INTERNET == 0]
media_pob_A <- mean(poblacion_A, na.rm = TRUE)

poblacion_B <- SB11_20111$FISICA_PUNT[SB11_20111$ECON_SN_INTERNET == 1]
media_pob_B <- mean(poblacion_B, na.rm = TRUE)

plot(media_pob_A, media_pob_B, col = 4, pch= 20)
abline(0, 1)

for (i in seq_len(iteraciones)){
  muestra <- sample(seq_len(nrow(SB11_20111)), tamano_muestral)
  
  cuales_A <- seq_len(nrow(SB11_20111)) %in% muestra & SB11_20111$ECON_SN_INTERNET == 0
  muestra_A <- SB11_20111$FISICA_PUNT[cuales_A]
  
  media_muestral_A <- mean(muestra_A, na.rm = TRUE)
  t_test_A <- t.test(muestra_A)
  intervalo_A <- t_test_A$conf.int
  LI_A <- min(intervalo_A)
  LS_A <- max(intervalo_A)
  
  cuales_B <- seq_len(nrow(SB11_20111)) %in% muestra & SB11_20111$ECON_SN_INTERNET == 1
  muestra_B <- SB11_20111$FISICA_PUNT[cuales_B]
  
  media_muestral_B <- mean(muestra_B, na.rm = TRUE)
  t_test_B <- t.test(muestra_B)
  intervalo_B <- t_test_B$conf.int
  LI_B <- min(intervalo_B)
  LS_B <- max(intervalo_B)
  
  rect(LI_A, LI_B, LS_A, LS_B)
  
  }

points(media_pob_A, media_pob_B, col = 4, pch = 20, cex = 4)


# Forecasting neural network with real data -------------------------------

# Packages
library('saber')  # data package named in the class; assumed to provide SB11_20111
library('nnet')

tamano_muestral <- 2000

variables <- c(
  'ECON_PERSONAS_HOGAR',
  'ECON_CUARTOS',
  'ECON_SN_LAVADORA',
  'ECON_SN_NEVERA',
  'ECON_SN_HORNO',
  'ECON_SN_DVD',
  'ECON_SN_MICROHONDAS',
  'ECON_SN_AUTOMOVIL',
  'MATEMATICAS_PUNT'
)

indices_muestra <- seq_len(nrow(SB11_20111)) %in% sample(seq_len(nrow(SB11_20111)), tamano_muestral)

muestra <- subset(SB11_20111, subset = indices_muestra, select = variables)
muestra <- na.omit(muestra)

red_neuronal <- nnet(MATEMATICAS_PUNT ~ ., data = muestra, size = 10, linout= TRUE)

plot(muestra$MATEMATICAS_PUNT ~ predict(red_neuronal))
abline(0, 1, lwd = 2, col =2)

I grouped the subjects into exact sciences and social sciences and got a better prediction of the math score.
Plot for exact sciences: Biology, Physics, Chemistry

Plot for social sciences: Language, Social Sciences, Philosophy, English

I wanted to share my notebook, but someone already left a better contribution (@rubenbermudezrivera)
https://colab.research.google.com/drive/13ETzG_S4yqa1li6iMXPET1HfYhfm_Yru?usp=sharing
I'm leaving it here anyway

Using the scores in the other subjects as variables.

✔ My result for the challenge, using the variables

"ECON_SN_COMPUTADOR", "ECON_SN_INTERNET", "FISICA_PUNT", "QUIMICA_PUNT", "MATEMATICAS_PUNT" (this one is our y)

Prediction improved significantly with these variables. I'll keep testing more combinations, but I liked this one quite a bit. 😁