Introduction to Matplotlib: Line and Scatter Plots
Matplotlib is one of the most widely used libraries for data visualization in Python. Since its creation in 2003 by John D. Hunter, it has become a standard for high-quality graphics in disciplines ranging from science to finance. It lets data analysts and data scientists not only present information visually but also explore complex relationships within the data, which supports deeper analysis. It also works directly with NumPy arrays and Pandas objects, so data prepared in those libraries can be plotted without conversion.
Although we work in Google Colaboratory (Colab), where Matplotlib comes pre-installed, it is useful to remember how to install it with pip, especially if you are working in virtual environments or in Visual Studio Code. The installation is done with the command:
pip install matplotlib
This command makes the library available in the active Python environment, which is necessary if you are working outside Google Colaboratory.
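As a quick check after installing, a minimal sketch like the following confirms that the library can be imported and reports which version is present. The variable name `version` is only illustrative.

```python
# Confirm that Matplotlib can be imported and report the installed version.
import matplotlib

version = matplotlib.__version__
print("Matplotlib version:", version)
```

If the import fails with a ModuleNotFoundError, the package is probably installed in a different environment from the one running the code.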
Line charts are usually used to show trends over time or continuous changes in data. To get started, you need to import NumPy and Matplotlib, specifically the pyplot module.
import numpy as np
import matplotlib.pyplot as plt
Suppose we want to plot the monthly sales of a product. You could create the months and sales data as follows:
# One label per month and one sales figure per month (both arrays have 5 items)
months = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May'])
sales = np.array([20, 25, 30, 28, 35])
With the data ready, we set up and display the chart:
plt.figure(figsize=(8, 6))
plt.plot(months, sales, marker='o', color='blue')
plt.title('Monthly sales of a product')
plt.xlabel('Months')
plt.ylabel('Sales (in thousands of units)')
plt.show()
This visual representation helps us identify trends or seasonality that could be relevant for business decisions.
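To make the pattern in a line chart concrete, the month-to-month change can also be computed numerically. The sketch below is one possible variant, not the lesson's own code: it assumes five monthly values with English month labels, uses `np.diff` for the change between consecutive months, and draws the chart through Matplotlib's object-oriented interface (`plt.subplots`) instead of the `plt.*` functions.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed sample data: one sales figure (thousands of units) per month.
months = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May'])
sales = np.array([20, 25, 30, 28, 35])

# Difference between each month and the previous one.
change = np.diff(sales)
print("Month-to-month change:", change.tolist())

# Object-oriented interface: work with explicit Figure and Axes objects.
fig, ax = plt.subplots(figsize=(8, 6))
ax.plot(months, sales, marker='o', color='blue')
ax.set_title('Monthly sales of a product')
ax.set_xlabel('Months')
ax.set_ylabel('Sales (in thousands of units)')
```

A negative entry in `change` marks a month in which sales fell. In a script, call `plt.show()` afterwards to display the figure; in a notebook the figure is normally rendered at the end of the cell.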
A scatter plot is ideal for visualizing the relationship between two variables. For example, you might be interested in how the number of hours studied affects performance on an exam. You could store the data in two lists:
# Each position pairs one student's study hours with their score (8 values each)
study_hours = [1, 2, 3, 4, 5, 6, 7, 8]
exam_score = [55, 60, 65, 70, 75, 80, 85, 90]
To create and display the scatter plot:
plt.figure(figsize=(8, 6))
plt.scatter(study_hours, exam_score, color='green')
plt.title('Relationship between hours studied and score')
plt.xlabel('Study hours')
plt.ylabel('Exam score')
plt.show()
This plot helps reveal whether the variables are positively or negatively correlated, which is useful in experimental or correlational studies.
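The visual impression of correlation can be backed by a number. The following sketch is an illustration under assumptions, not part of the original lesson: it assumes eight paired observations, computes the Pearson correlation coefficient with `np.corrcoef`, and overlays a least-squares line obtained with `np.polyfit`.

```python
import numpy as np
import matplotlib.pyplot as plt

# Assumed sample data: eight (hours, score) pairs.
study_hours = np.array([1, 2, 3, 4, 5, 6, 7, 8])
exam_score = np.array([55, 60, 65, 70, 75, 80, 85, 90])

# Pearson correlation coefficient: off-diagonal entry of the 2x2 matrix.
r = np.corrcoef(study_hours, exam_score)[0, 1]

# Degree-1 least-squares fit: score is approximately slope * hours + intercept.
slope, intercept = np.polyfit(study_hours, exam_score, 1)

fig, ax = plt.subplots(figsize=(8, 6))
ax.scatter(study_hours, exam_score, color='green', label='Observations')
ax.plot(study_hours, slope * study_hours + intercept,
        color='gray', linestyle='--', label=f'Linear fit (r = {r:.2f})')
ax.set_title('Relationship between hours studied and score')
ax.set_xlabel('Study hours')
ax.set_ylabel('Exam score')
ax.legend()
```

A value of `r` close to 1 indicates a strong positive linear relationship, close to -1 a strong negative one, and close to 0 little linear relationship. Correlation on its own does not show that studying causes the higher scores.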
With the basics of line and scatter plots covered, the next step is to customize these plots. Customizations such as adjusting axes, using different marker styles, and including legends or textures can improve the clarity and effectiveness of the display. They make a chart easier to read and less likely to be misinterpreted, skills that we will develop further in later classes.
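As a preview of what such customization can look like, here is a minimal sketch. The second series (`sales_b`) and the product names are invented purely for illustration; the calls shown (`marker`, `linestyle`, `set_ylim`, `grid`, `legend`) are standard Matplotlib options.

```python
import numpy as np
import matplotlib.pyplot as plt

months = np.array(['Jan', 'Feb', 'Mar', 'Apr', 'May'])
sales_a = np.array([20, 25, 30, 28, 35])
sales_b = np.array([15, 18, 22, 27, 31])  # hypothetical second product

fig, ax = plt.subplots(figsize=(8, 6))

# Different markers and line styles keep the series distinguishable.
ax.plot(months, sales_a, marker='o', linestyle='-', color='blue', label='Product A')
ax.plot(months, sales_b, marker='s', linestyle='--', color='orange', label='Product B')

# Fix the y-axis range so the scale starts at zero.
ax.set_ylim(0, 40)

# A light dotted grid makes values easier to read off the chart.
ax.grid(True, linestyle=':', alpha=0.6)

ax.set_title('Monthly sales by product')
ax.set_xlabel('Months')
ax.set_ylabel('Sales (in thousands of units)')
ax.legend(loc='upper left')
```

The `label` arguments only appear on the chart once `ax.legend()` is called.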
In summary, Matplotlib is a powerful tool for bringing data from different sectors to life, and with each practice session your ability to present data visually will improve. Keep practicing and customizing!