When working with data, it is critical to understand how to derive new information that increases the value of the analysis. This matters most in the data transformation stage, where existing columns are converted into new structures that support deeper analysis. For example, from columns such as quantity and unit price, we can create an additional column called total amount by multiplying the two. The same approach applies to other types of data, such as dates, which can be converted to a datetime type so they are easier to use in time-based analysis.
Creating the total amount column: The total amount column is created by multiplying quantity by unit price. This is done as follows in Python code using Pandas:
data['total amount'] = data['quantity'] * data['unit price']
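As a minimal sketch of the multiplication above, the snippet below builds a tiny DataFrame and derives the new column. The rows and their values are hypothetical and exist only to make the example runnable; the column names follow the ones used in the text.

```python
import pandas as pd

# Hypothetical retail rows; the values are made up for illustration.
data = pd.DataFrame({
    "quantity": [2, 5, 1],
    "unit price": [3.50, 1.25, 10.00],
})

# Element-wise product: each row's quantity times its unit price.
data["total amount"] = data["quantity"] * data["unit price"]

print(data)
```

The multiplication is vectorized, so no explicit loop over rows is needed.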
Date transformation: When working with dates, it is useful to convert them to a datetime type so that later analysis is easier. With Pandas, you can do it like this:
data['invoice date'] = pd.to_datetime(data['invoice date'])
This then allows you to extract more specific information, such as year, month, or even time, by creating additional columns:
data['year'] = data['invoice date'].dt.year
data['month'] = data['invoice date'].dt.month
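The conversion and extraction steps can be tried end to end on a small example. This is only a sketch: the date strings are hypothetical, and the hour column is an extra illustration of extracting the time component mentioned above, not something the original dataset is known to need.

```python
import pandas as pd

# Hypothetical invoice timestamps stored as plain strings.
data = pd.DataFrame({
    "invoice date": [
        "2023-01-15 09:30:00",
        "2023-07-04 14:05:00",
        "2024-03-20 18:45:00",
    ],
})

# Convert the string column to a datetime column.
data["invoice date"] = pd.to_datetime(data["invoice date"])

# The .dt accessor exposes the individual date and time components.
data["year"] = data["invoice date"].dt.year
data["month"] = data["invoice date"].dt.month
data["hour"] = data["invoice date"].dt.hour  # illustrative extra column

print(data)
```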
Drilling down into temporal information provides a detailed view of the data. Converting dates to a datetime type allows us to segment data by time, which is valuable for identifying trends over specific periods. This segmentation is especially useful in financial analysis, for example when reviewing annual, semi-annual or quarterly sales.
To create a semi-annual breakdown, we first assign each month to a semester using a lambda function:
data['semester'] = data['month'].apply(lambda x: 1 if x <= 6 else 2)
Subsequently, you can group the data by year and semester to get insights:
sales_per_semester = data.groupby(['year', 'semester'])['total amount'].sum().reset_index()
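To see the semester assignment and grouping together, here is a small sketch. The rows are hypothetical; the year, month and total amount columns are assumed to already exist, as they would after the earlier steps.

```python
import pandas as pd

# Hypothetical rows that already carry year, month and total amount.
data = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024],
    "month": [2, 5, 9, 11],
    "total amount": [100.0, 50.0, 75.0, 20.0],
})

# Months 1-6 fall in semester 1, months 7-12 in semester 2.
data["semester"] = data["month"].apply(lambda x: 1 if x <= 6 else 2)

# One row per (year, semester) pair with the summed sales.
sales_per_semester = (
    data.groupby(["year", "semester"])["total amount"].sum().reset_index()
)

print(sales_per_semester)
```

Only combinations that actually occur in the data appear in the result, so the first semester of 2024 is absent here.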
This grouping allows us to observe the sales distributed in specific time segments, facilitating a comparative analysis between different periods.
By collecting and aggregating sales information by different time segments, we can make informed decisions about consumption patterns and sales performance across different periods.
When breaking down annual and semi-annual sales, we first group the data using the groupby functionality in Pandas. We then sum the sales for each group:
For annual sales:
sales_per_year = data.groupby('year')['total amount'].sum().reset_index()
For semi-annual sales:
sales_per_semester = data.groupby(['year', 'semester'])['total amount'].sum().reset_index()
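The two groupings can be compared side by side. In the sketch below the data is hypothetical, and the final roll-up is an added sanity check (not part of the original walkthrough) that the semester totals add back up to the annual totals.

```python
import pandas as pd

# Hypothetical rows with year, semester and total amount already derived.
data = pd.DataFrame({
    "year": [2023, 2023, 2023, 2024, 2024],
    "semester": [1, 1, 2, 1, 2],
    "total amount": [100.0, 50.0, 75.0, 40.0, 20.0],
})

# Annual totals: one row per year.
sales_per_year = data.groupby("year")["total amount"].sum().reset_index()

# Semi-annual totals: one row per (year, semester) pair.
sales_per_semester = (
    data.groupby(["year", "semester"])["total amount"].sum().reset_index()
)

# Sanity check: rolling the semester totals up by year should
# reproduce the annual totals.
rolled_up = sales_per_semester.groupby("year")["total amount"].sum().reset_index()

print(sales_per_year)
print(sales_per_semester)
```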
This provides you with a consolidated dataset that reflects key metrics, allowing you to make strategic assessments and adjustments based on time periods.
Extending data analysis by extracting additional insights fosters a deeper and more actionable understanding of the data. This not only enhances business decision making, but also enriches the strategies to be implemented.
By deriving temporal and categorical data, we can reveal trends that are not obvious at first glance. In addition, computing quarterly or monthly sales allows us to identify cyclical patterns, demand peaks, or periods of low demand, shedding light on how to optimize operations and marketing strategies.
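Quarterly and monthly breakdowns follow the same pattern as the semester one. The sketch below assumes an invoice date column of datetime type and uses hypothetical rows; the quarter column and the sales_per_quarter and sales_per_month names are introduced here for illustration only.

```python
import pandas as pd

# Hypothetical rows; invoice date is already a datetime column.
data = pd.DataFrame({
    "invoice date": pd.to_datetime(
        ["2023-01-15", "2023-02-10", "2023-05-03", "2023-11-28"]
    ),
    "total amount": [100.0, 50.0, 75.0, 20.0],
})

# Derive the grouping keys from the date.
data["year"] = data["invoice date"].dt.year
data["quarter"] = data["invoice date"].dt.quarter  # 1 to 4
data["month"] = data["invoice date"].dt.month

# Quarterly totals per year.
sales_per_quarter = (
    data.groupby(["year", "quarter"])["total amount"].sum().reset_index()
)

# Monthly totals per year.
sales_per_month = (
    data.groupby(["year", "month"])["total amount"].sum().reset_index()
)

print(sales_per_quarter)
print(sales_per_month)
```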
In this approach, the reader is invited to apply similar processes and explore further to create meaningful insights that bring value to their particular context.