Welcome and course introduction

1. Getting started with Big Data
2. Cloud computing in Big Data projects
3. Introduction to data management in the cloud
4. Data in the cloud
5. Which cloud should I use for my Big Data project?

Architectures

6. Lambda architecture
7. Kappa architecture
8. Batch architecture

Data extraction

9. Moving your data to the cloud
10. Demo - Creating our IDE in the cloud with Python - Boto3
11. How to use Boto3?
12. API Gateway
13. Storage Gateway
14. Kinesis Data Streams
15. Configuring Kinesis Data Streams
16. Demo - Deploying Kinesis with CloudFormation
17. Kinesis Firehose
18. Demo - Configuring Kinesis Firehose
19. Challenge - Configuring Kinesis Firehose
20. AWS - MSK
21. Demo - Deploying a cluster with MSK

Data transformation

22. AWS - Glue
23. Demo - Installing Apache Zeppelin
24. Creating the developer endpoint
25. Demo - Connecting our developer endpoint to our Zeppelin endpoint
26. Demo - Creating our first ETL - Crawling
27. Demo - Creating our first ETL - Execution
28. Demo - Creating our first ETL - Load
29. AWS - EMR
30. Demo - Deploying our first cluster with EMR
31. Demo - Connecting to Apache Zeppelin on EMR
32. Demo - Automated EMR deployment with CloudFormation
33. AWS - Lambda
34. AWS Lambda examples
35. Demo - Creating a Lambda for Big Data

Data loading

36. AWS - Athena
37. Demo - Querying data with Athena
38. AWS - Redshift
39. Demo - Creating our first Redshift cluster
40. AWS - Lake Formation

Data consumption

41. AWS - Elasticsearch
42. Demo - Creating our first Elasticsearch cluster
43. AWS - Kibana
44. AWS - QuickSight
45. Demo - Visualizing our data with QuickSight

Security, orchestration, and automation

46. Data security
47. AWS Macie
48. Demo - Configuring AWS Macie
49. Apache Airflow
50. Demo - Creating our first cluster in Cloud Composer
51. Reference architecture

Public class

52. What is Big Data?
Big Data on AWS Course
Carlos Andrés Zambrano Barrera

Demo - Creating a Lambda for Big Data (35/52)

How can we create a Lambda function in AWS for Big Data?

Lambda functions in AWS are a key tool in Big Data management and processing thanks to their ability to execute code in response to events and their integration with other AWS services. Here's how to create a Lambda function from scratch, step by step, and the critical aspects to consider.

How to create a Lambda function from scratch?

  • Log in to the AWS console: Sign in to your AWS account and open the Lambda service.
  • Create the function: Select "Create function" and choose the option to author one from scratch. Assign a name, e.g. "Platzi", and select the appropriate runtime (the demo uses Python 3.6).
  • Select a role: It is crucial to define the execution role, which determines the permissions the Lambda function has to interact with other services.
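The same console steps can be scripted with Boto3, the SDK introduced earlier in the course. Below is a minimal sketch, assuming a hypothetical deployment package `function.zip` that contains a `lambda_function.py` exposing `lambda_handler`, plus a placeholder execution role ARN:

```python
import boto3

lambda_client = boto3.client("lambda")

# Read the deployment package (hypothetical zip built beforehand)
with open("function.zip", "rb") as f:
    zipped_code = f.read()

response = lambda_client.create_function(
    FunctionName="Platzi",                     # name used in the demo
    Runtime="python3.6",                       # runtime used in the demo
    Role="arn:aws:iam::123456789012:role/lambda-bigdata-role",  # placeholder ARN
    Handler="lambda_function.lambda_handler",  # file.function entry point
    Code={"ZipFile": zipped_code},
)
print(response["FunctionArn"])
```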

What are triggers?

Triggers are events that initiate the execution of a Lambda function. For Big Data projects, it is common to use SNS or SQS. If you opt for SQS, you will connect to standard queues. These services allow you to orchestrate complex workflows by notifying your Lambda function when certain events occur.
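As a sketch of how an SQS trigger is wired up (the queue ARN is a placeholder), you can create the event source mapping with Boto3; each batch of messages then arrives in the handler as `event["Records"]`:

```python
import boto3

lambda_client = boto3.client("lambda")

# Connect a standard SQS queue to the function (placeholder ARN)
lambda_client.create_event_source_mapping(
    EventSourceArn="arn:aws:sqs:us-east-1:123456789012:bigdata-events",
    FunctionName="Platzi",
    BatchSize=10,  # messages handed to each invocation
)

# Inside the function, every SQS message appears as one record
def lambda_handler(event, context):
    for record in event["Records"]:
        print(record["body"])
```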

What is the "layers" functionality?

"Layers is a feature that simplifies the management of shared libraries between multiple Lambda functions. By using layers, you can centralize and replicate libraries efficiently, reducing administration time.

What is the importance of environment variables?

In Big Data, environment variables play a crucial role in the secure handling of connections, such as those to databases. It is always recommended to encrypt this information, using services such as KMS to ensure confidentiality.
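As a sketch, a handler can read a KMS-encrypted connection string from an environment variable and decrypt it at runtime; the variable name `DB_CONN_ENCRYPTED` is hypothetical:

```python
import base64
import os

import boto3

kms = boto3.client("kms")

def lambda_handler(event, context):
    # The environment variable holds a base64-encoded KMS ciphertext
    ciphertext = base64.b64decode(os.environ["DB_CONN_ENCRYPTED"])
    secret = kms.decrypt(CiphertextBlob=ciphertext)["Plaintext"].decode("utf-8")
    # Use the decrypted connection string here; never log it
    return {"decrypted": bool(secret)}
```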

How to manage roles and permissions?

The role of the Lambda function defines which services it can interact with. For example, a role can grant access to CloudWatch Logs and CloudFormation. It is essential to apply the principle of least privilege, granting only the necessary permissions.
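A minimal sketch of that principle with Boto3, creating a role that can only write its own logs (the account ID and names are placeholders):

```python
import json

import boto3

iam = boto3.client("iam")

# Trust policy: only the Lambda service may assume this role
trust_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": {"Service": "lambda.amazonaws.com"},
        "Action": "sts:AssumeRole",
    }],
}
iam.create_role(
    RoleName="lambda-bigdata-role",
    AssumeRolePolicyDocument=json.dumps(trust_policy),
)

# Least privilege: grant only the CloudWatch Logs actions the function needs
logs_policy = {
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Action": [
            "logs:CreateLogGroup",
            "logs:CreateLogStream",
            "logs:PutLogEvents",
        ],
        "Resource": "arn:aws:logs:us-east-1:123456789012:*",
    }],
}
iam.put_role_policy(
    RoleName="lambda-bigdata-role",
    PolicyName="logs-only",
    PolicyDocument=json.dumps(logs_policy),
)
```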

How to optimize and configure our Lambda functions?

Proper optimization and configuration of Lambda functions are essential to maximize efficiency in Big Data projects. Let's explore several important configurations.

How to manage memory and execution time?

  • Memory: Adjust the memory allocation to what your code and its execution actually need, increasing it as the workload demands.
  • Execution time: Also known as the "timeout", it can be configured up to a maximum of 15 minutes, adapting it to the requirements of your process.
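Both limits can be adjusted in a single call; a sketch with illustrative values (900 seconds is the 15-minute maximum):

```python
import boto3

lambda_client = boto3.client("lambda")

lambda_client.update_function_configuration(
    FunctionName="Platzi",
    MemorySize=1024,  # MB, sized to the workload
    Timeout=900,      # seconds; 15 minutes is the maximum
)
```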

What considerations should we take into account regarding the network?

It is possible to deploy Lambda functions within a VPC, defining the subnet and the security group. This allows you to precisely control the network environment in which your function runs.
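A sketch of attaching the function to a VPC; the subnet and security group IDs are placeholders:

```python
import boto3

lambda_client = boto3.client("lambda")

lambda_client.update_function_configuration(
    FunctionName="Platzi",
    VpcConfig={
        "SubnetIds": ["subnet-0abc1234"],     # placeholder subnet
        "SecurityGroupIds": ["sg-0def5678"],  # placeholder security group
    },
)
```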

What does "dead letter queue" mean?

Dead letter queues are crucial tools for ensuring that critical events are not lost in Big Data workloads. When function executions fail or error out repeatedly, the problematic messages can be redirected to a secondary queue for later review and reprocessing.
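A sketch of routing failed asynchronous invocations to an SQS dead letter queue (the queue ARN is a placeholder):

```python
import boto3

lambda_client = boto3.client("lambda")

# Events from failed async invocations land here for later inspection
lambda_client.update_function_configuration(
    FunctionName="Platzi",
    DeadLetterConfig={"TargetArn": "arn:aws:sqs:us-east-1:123456789012:bigdata-dlq"},
)
```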

Why is it important to enable the X-Ray service?

Enabling X-Ray allows detailed tracking of the execution of Lambda functions. This is vital for identifying bottlenecks and lag times, providing an in-depth analysis of the performance of your cloud applications.
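Active tracing is a single configuration flag; a sketch:

```python
import boto3

lambda_client = boto3.client("lambda")

# With active tracing enabled, each execution sends traces to X-Ray
lambda_client.update_function_configuration(
    FunctionName="Platzi",
    TracingConfig={"Mode": "Active"},
)
```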

What are other essential aspects of configuring Lambdas for Big Data?

Finally, some more advanced aspects enable greater efficiency and tracking in Lambda function execution.

How do we configure concurrency?

Lambda allows a default concurrency of 1,000 simultaneous executions, and this limit can be raised to as much as 20,000 by making a request to AWS. Reserve concurrency for your critical functions so they always have capacity available when they need it.
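Reserving capacity for a critical function is one call; a sketch that sets aside 100 concurrent executions (an illustrative value):

```python
import boto3

lambda_client = boto3.client("lambda")

# These 100 executions are carved out of the account pool: the function
# can never exceed them, and no other function can consume them
lambda_client.put_function_concurrency(
    FunctionName="Platzi",
    ReservedConcurrentExecutions=100,
)
```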

How do we manage monitoring and events?

It is essential to log all executions using CloudWatch Logs to ensure you capture metrics and events that can be critical to troubleshooting and understanding the behavior of your application.
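Anything the function writes through Python's standard logging module ends up in CloudWatch Logs automatically, provided the execution role has the log permissions shown earlier. A minimal handler sketch:

```python
import logging

logger = logging.getLogger()
logger.setLevel(logging.INFO)

def lambda_handler(event, context):
    records = event.get("Records", [])
    # These entries land in the function's CloudWatch Logs group
    logger.info("Processing %d records", len(records))
    try:
        processed = sum(1 for _ in records)  # stand-in for real work
        logger.info("Processed %d records", processed)
    except Exception:
        logger.exception("Execution failed")
        raise
```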

What roles do Glue, EMR and Lambdas have in Big Data transformation?

In Big Data projects, different services play complementary roles:

  • Glue: Fully managed and serverless ETL service.
  • EMR (Elastic MapReduce): Enables data transformation and analysis using managed clusters.
  • Lambdas: They offer flexibility for real-time and batch projects, allowing the transformation of information without the need to manage servers.

I encourage you to continue exploring these capabilities in your projects, taking full advantage of the tools and configurations available in AWS to transform Big Data efficiently and securely.

Contributions


It's really important to understand the topic of layers, and they're so useful: when you use several Lambdas the same libraries get repeated, and that's where you can attach them to the Lambda as layers to keep them organized.

Hi, I didn't understand the concurrency functionality in Lambda. Could someone explain it briefly? Thanks a lot.

Here's a **Dockerfile** to create an AWS Lambda environment geared toward Big Data. The Lambda function could process data from S3 and use PySpark, Pandas, or Boto3 to interact with AWS services.
Excellent course; the Lambda topic helped me clarify a lot of concepts.

Excellent!!