Hello
I'm sharing a slightly simpler way to plot the clusters.
Hi, the code wasn't working for me on this line:
```python
fig, ax = plt.subplots(1,1, figsize=(15,10))
```
If anyone runs into the same error, my fix was to import subplots directly:
```python
from matplotlib.pyplot import subplots
```
and then remove "plt." from the line:
```python
fig, ax = subplots(1,1, figsize=(15,10))
```
I hope this helps someone.
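The post does not say which error appeared. One possible cause, and this is only a guess, is that `plt` was never imported in the session or was overwritten by another variable. The sketch below assumes the conventional `import matplotlib.pyplot as plt` and shows that `subplots` and `plt.subplots` are the same function, so either call style should behave identically once the import is in place. The `Agg` backend is an assumption added so the sketch runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # assumption: headless backend; omit this line in a notebook
import matplotlib.pyplot as plt
from matplotlib.pyplot import subplots

# Both names are bound to the same function object
same_function = subplots is plt.subplots
print(same_function)

# With plt imported, the original line works unchanged
fig, ax = plt.subplots(1, 1, figsize=(15, 10))
print(fig.get_size_inches())
```

If `plt.subplots` still fails after this import, checking whether `plt` was reassigned earlier in the notebook may be a reasonable next step.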
Machine Learning: here I come 🚀🔥
```python
import plotly.graph_objects as go
import plotly.io as pio

pio.templates['new_template'] = go.layout.Template()
pio.templates['new_template']['layout']['font'] = {'family': 'verdana', 'size': 16, 'color': 'white'}
pio.templates['new_template']['layout']['paper_bgcolor'] = 'black'
pio.templates['new_template']['layout']['plot_bgcolor'] = 'black'
pio.templates['new_template']['layout']['xaxis'] = {'title_standoff': 10, 'linecolor': 'black', 'mirror': True, 'gridcolor': '#EEEEEE'}
pio.templates['new_template']['layout']['yaxis'] = {'title_standoff': 10, 'linecolor': 'black', 'mirror': True, 'gridcolor': '#EEEEEE'}
pio.templates['new_template']['layout']['legend_bgcolor'] = 'rgb(117, 112, 179)'
pio.templates['new_template']['layout']['height'] = 700
pio.templates['new_template']['layout']['width'] = 1000
pio.templates['new_template']['layout']['autosize'] = False
pio.templates.default = 'new_template'
```
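```python
# Notebook magic: runs the notebook that holds the template code
%run "template_visualitation.ipynb"

import numpy as np
import pandas as pd
import plotly.graph_objects as go
import plotly.express as px

def graficar_clusters_plotly(x, y, color, show=True):
    global fig1  # kept global so the figure can be reused in later cells
    # Convert to arrays so NumPy inputs and DataFrames are indexed the same way
    x, y = np.asarray(x), np.asarray(y)
    fig1 = go.Figure()
    for label in pd.Series(y).unique():
        mask = y == label
        # One scatter trace per cluster, colored by its label
        fig1.add_traces(data=px.scatter(x=x[mask, 0], y=x[mask, 1], opacity=0.8, color_discrete_sequence=[color[label]]).data)
    fig1.update_layout(showlegend=True)
    if show:
        fig1.show()

graficar_clusters_plotly(x, y, ['red', 'blue', 'green', 'white'])

x, y = df_blobls[['x1', 'x2']], df_blobls['y']
# y_pred holds labels predicted by a clustering model (not defined in this post)
graficar_clusters_plotly(x, y_pred, ['red', 'blue', 'green', 'white', 'yellow'])
```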
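The last call uses `y_pred`, which the post never defines. As a minimal sketch of where it could come from, the example below assumes K-means with four clusters on the same `make_blobs` data used in this thread. The choice of model, `n_clusters=4`, and `n_init=10` are assumptions for illustration, not something the post states.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

# Same synthetic data as the make_blobs call shared in this thread
x, y = make_blobs(n_samples=100, centers=4, n_features=2,
                  cluster_std=[1, 1.5, 2, 2], random_state=7)

# Assumption: K-means with 4 clusters; the post does not say which model was used
kmeans = KMeans(n_clusters=4, n_init=10, random_state=7)
y_pred = kmeans.fit_predict(x)

# One predicted label per sample, usable as the y argument of the plotting function
print(y_pred.shape)
print(np.unique(y_pred))
```

Note that the predicted label numbers need not match the numbering of the true labels in `y`, so the colors of the two plots may be permuted.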
As a contribution to go beyond the basic models covered here, I recommend taking a look at
This is an explanation of the function plot_2d_clusters, which takes three arguments: x, y, and ax.
The function does the following:
It first creates a Pandas Series object from the y argument, and then uses the unique method of the Series to find the unique values in y. It stores the resulting array of unique values in the variable y_uniques.
It then enters a loop, in which it iterates over the unique values in y_uniques. On each iteration of the loop, the function plots the data points in x where the corresponding value in y is equal to the current unique value being iterated over.
The plot is created using the plot method of the DataFrame object x, which is passed the following arguments:
title: a string that is the title of the plot, which is constructed using string interpolation to insert the number of unique values in y into the string.
kind: the type of plot to create, which is set to 'scatter'.
x: the name of the column in x to use as the x-axis data.
y: the name of the column in x to use as the y-axis data.
marker: a string that specifies the marker to use for the data points in the plot, which is constructed using string interpolation to insert the current unique value being iterated over into the string.
ax: the Matplotlib Axes object to use for the plot.
It’s worth noting that this function does not return anything, but instead creates a plot using the ax argument.
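The two ideas in that explanation, selecting rows with a boolean mask per unique label and building the marker string from the label, can be tried without plotting. This is a small sketch on hypothetical toy data (the four points and two labels are made up for illustration).

```python
import pandas as pd

# Hypothetical toy data: four points with two cluster labels
x = pd.DataFrame({"x1": [0.0, 1.0, 5.0, 6.0], "x2": [0.0, 1.0, 5.0, 6.0]})
y = pd.Series([0, 0, 1, 1])

y_uniques = pd.Series(y).unique()

# Rows of x grouped per label with a boolean mask, as the loop in the function does
groups = {int(label): x[y == label] for label in y_uniques}

# Marker strings in mathtext form: "$0$" makes Matplotlib draw the digit 0 as the point symbol
markers = {label: f"${label}$" for label in groups}

for label, rows in groups.items():
    print(label, markers[label], len(rows))
```

Each entry of `groups` is the sub-DataFrame that the function would pass to `.plot(kind='scatter', ...)` on that iteration.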
Another way to plot the clusters:
```python
plt.figure(figsize=(6,6))

def plot_blobs(x, y, ax, cmap='viridis'):
    # x and y are expected to be NumPy arrays, e.g. as returned by make_blobs
    labels = np.unique(y)
    # Resample the colormap to one color per label
    cmap_ = plt.get_cmap(cmap, lut=len(labels))
    for i, label in enumerate(labels):
        sub_idx = np.argwhere(y == label).ravel()
        sub_x = x[sub_idx]
        # Pick the color by position so labels that do not start at 0 also work
        ax.scatter(sub_x[:,0], sub_x[:,1], color=cmap_(i), label=label)

plot_blobs(x, y, plt.gca(), cmap='Dark2')
plt.legend()
plt.show()
```
To speed things up:
```python
import numpy as np
import pandas as pd
from sklearn.datasets import make_blobs
import seaborn as sns
import matplotlib.pyplot as plt

# Synthetic data: 100 points around 4 centers with different spreads
x, y = make_blobs(n_samples=100, centers=4, n_features=2, cluster_std=[1,1.5,2,2], random_state=7)

df_blobls = pd.DataFrame({
    'x1': x[:,0],
    'x2': x[:,1],
    'y': y
})

def plot_2d_clusters(x, y, ax):
    y_uniques = pd.Series(y).unique()
    for _ in y_uniques:
        # Plot each cluster using its own label as the marker
        x[y==_].plot(
            title=f'{len(y_uniques)} Clusters',
            kind='scatter',
            x='x1',
            y='x2',
            marker=f'${_}$',
            ax=ax
        )

fig, ax = plt.subplots(1,1, figsize=(15,10))
x, y = df_blobls[['x1','x2']], df_blobls['y']
plot_2d_clusters(x, y, ax)
plt.show()
```
Sharing the code to make the plot look nicer!!
```python
def plot_2d_clusters(x, y, axe, title):
    # x is expected to be a NumPy array of shape (n_samples, 2)
    y_uniques = pd.Series(y).unique()
    for y_unique in y_uniques:
        # Draw each cluster with its own label as the marker
        axe.scatter(x[y == y_unique, 0], x[y == y_unique, 1], marker=f"${y_unique}$", label=y_unique)
    axe.set_title(title)
    axe.legend()
```