Welcome and course introduction
1. Getting started with Big Data
2. Cloud computing in Big Data projects
3. Introduction to data management in the cloud
4. Data in the cloud
5. Which cloud should I use for my Big Data project?

Architectures
6. Lambda architectures
7. Kappa architecture
8. Batch architecture

Information extraction
9. Moving your information to the cloud
10. Demo - Creating our IDE in the cloud with Python - Boto3
11. How to use Boto3?
12. API Gateway
13. Storage Gateway
14. Kinesis Data Streams
15. Kinesis Data Streams configuration
16. Demo - Deploying Kinesis with CloudFormation
17. Kinesis Firehose
18. Demo - Kinesis Firehose configuration
19. Challenge - Configuring Kinesis Firehose
20. AWS - MSK
21. Demo - Deploying a cluster with MSK

Information transformation
22. AWS - Glue
23. Demo - Installing Apache Zeppelin
24. Creating the Developer Endpoint
25. Demo - Connecting our Developer Endpoint to our Zeppelin endpoint
26. Demo - Creating our first ETL - Crawling
27. Demo - Creating our first ETL - Execution
28. Demo - Creating our first ETL - Loading
29. AWS - EMR
30. Demo - Deploying our first cluster with EMR
31. Demo - Connecting to Apache Zeppelin on EMR
32. Demo - Automated EMR deployment with CloudFormation
33. AWS - Lambda
34. AWS Lambda examples
35. Demo - Creating a Lambda for Big Data

Information loading
36. AWS - Athena
37. Demo - Querying data with Athena
38. AWS - Redshift
39. Demo - Creating our first Redshift cluster
40. AWS - Lake Formation

Information consumption
41. AWS - Elasticsearch
42. Demo - Creating our first Elasticsearch cluster
43. AWS - Kibana
44. AWS - QuickSight
45. Demo - Visualizing our data with QuickSight

Security, orchestration and automation
46. Data security
47. AWS Macie
48. Demo - Configuring AWS Macie
49. Apache Airflow
50. Demo - Creating our first cluster in Cloud Composer
51. Reference architecture

Public class
52. What is Big Data?

Curso de Big Data en AWS
Carlos Andrés Zambrano Barrera

Lesson 15/52: Kinesis Data Streams configuration

How do I create a Kinesis Data Stream on AWS?

To manage and process large amounts of data in real time, AWS Kinesis is one of the most effective solutions. You can create and customize a Kinesis Data Stream through the AWS console by following a few key steps. Here we take you through the creation and initial configuration process.

How to access Kinesis on AWS?

  1. Log in to the AWS console.
  2. Search for Kinesis in the services panel. You will find two options: Kinesis and Kinesis Video Streams. Select Kinesis to continue with Data Streams.
  3. Enter the Kinesis menu and click "Get Started" to begin creating your Data Stream.

What are the options available for Kinesis?

Once inside the Kinesis environment, you will see four main options to create:

  • Data Stream: Ideal for real-time processes.
  • Delivery Stream: For data delivery to services such as Amazon S3.
  • Analytics: Facilitates real-time analysis of streamed data.
  • Video Streams: For real-time video streaming.

In our case study, we will select "Create Data Stream".

How to name and configure your Data Stream?

  1. Assign a name to your stream. For example, "Platzi Kinesis".

  2. Determine the number of shards. Shards define the capacity to handle the amount of traffic and records you will process.

    • Each shard can:
      • Ingest up to 1 megabyte per second (or 1,000 records per second) of writes.
      • Serve up to 2 megabytes per second of reads, spread over at most 5 read transactions per second.

Properly assessing your data load is critical to defining how many shards you will need.
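
If you prefer to script this step instead of using the console, here is a minimal sketch with boto3, the library used earlier in the course. The stream name "platzi-kinesis", the region, and the shard count of 2 are illustrative choices, not values prescribed by the class:

  import boto3

  # Kinesis client; the region is an assumption, adjust it to your account
  kinesis = boto3.client("kinesis", region_name="us-east-1")

  # Create a provisioned stream with 2 shards:
  # roughly 2 MB/s of writes and 4 MB/s of reads in total
  kinesis.create_stream(StreamName="platzi-kinesis", ShardCount=2)

  # Stream creation is asynchronous; wait until the stream becomes ACTIVE
  kinesis.get_waiter("stream_exists").wait(StreamName="platzi-kinesis")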

How to send data to Kinesis Data Stream?

Once the Kinesis Data Stream is configured, there are several options for sending data:

  • PutRecord / PutRecords API: the standard interface for transmitting data to the stream (see the sketch after this list).
  • Kinesis Producer Library (KPL): a library that batches and optimizes record production on the producer side.
  • Integration with Kinesis Firehose and Kinesis Analytics: For more advanced processing and analysis.
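
As a rough illustration of the first option, a minimal boto3 producer might look like the following; the stream name, region, and sample payload are assumptions for the example:

  import json
  import boto3

  kinesis = boto3.client("kinesis", region_name="us-east-1")

  record = {"sensor_id": "s-001", "temperature": 21.7}  # sample payload

  # The partition key determines which shard receives the record
  response = kinesis.put_record(
      StreamName="platzi-kinesis",
      Data=json.dumps(record).encode("utf-8"),
      PartitionKey=record["sensor_id"],
  )
  print(response["ShardId"], response["SequenceNumber"])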

What is important to consider about security and monitoring?

  • Data encryption: server-side encryption is disabled by default, but you can enable it with the AWS Key Management Service (KMS) to protect your data at rest.
  • Data retention period: configurable between 24 and 168 hours; longer retention increases the cost of the service.
  • Integration with CloudWatch: publishes metrics and logs that let you monitor and debug the data flow.
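
Both encryption and retention can also be changed after the stream exists. A brief boto3 sketch, where the AWS-managed key alias/aws/kinesis and the 48-hour retention are example choices:

  import boto3

  kinesis = boto3.client("kinesis", region_name="us-east-1")

  # Enable server-side encryption with the AWS-managed KMS key for Kinesis
  kinesis.start_stream_encryption(
      StreamName="platzi-kinesis",
      EncryptionType="KMS",
      KeyId="alias/aws/kinesis",
  )

  # Extend retention from the default 24 hours to 48 hours
  kinesis.increase_stream_retention_period(
      StreamName="platzi-kinesis",
      RetentionPeriodHours=48,
  )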

How to customize dashboards and labels?

The monitoring section provides you with predefined dashboards to examine different metrics. Group and filter resources by tags such as "Environment" to facilitate resource management and reporting.
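
Tags can be attached from the console or programmatically; for example (the tag keys and values below are illustrative):

  import boto3

  kinesis = boto3.client("kinesis", region_name="us-east-1")

  # Tag the stream so it can be grouped and filtered in monitoring and billing
  kinesis.add_tags_to_stream(
      StreamName="platzi-kinesis",
      Tags={"Environment": "dev", "Project": "platzi-bigdata"},
  )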

In summary, when creating a Kinesis Data Stream, be sure to consider the number of shards, data retention, encryption, and log management. These aspects are essential to optimizing the cost and efficiency of your data streaming on AWS. While the process may seem complex at first, with practice and proper planning you will master creating and managing streams to maximize the potential of your applications. Keep exploring and learning about Kinesis and other AWS services!

Contributions and questions

Remember that Kinesis is not part of the AWS free tier. 👀

Here is an example of how to use Kinesis with boto3:
https://www.youtube.com/watch?v=KCuu_jcyZF8

I'm not doing the demos because I'm afraid of being charged a large amount of USD for services or instances left running; sometimes we forget to shut services down, or don't even realize they are still running.

Currently (September 2023), the service lets you choose between on-demand and provisioned capacity: with the first you are charged only for what you consume, without making any estimate; the second is what is explained in this class.

Real-time video processing, wow… what can be achieved with that?

What is the difference between API Gateway and Kinesis Data Streams? Is it possible to use Kinesis Data Streams without API Gateway? Does API Gateway act as a data transfer engine? Kinesis Data Streams also seems to be a data transfer engine and, at the same time, a first level of processing. Is that understanding correct? Is that level of processing an ETL within Kinesis Data Streams?

With real-time video streaming I could build my own videoconference room with a large guest capacity.

Interesting.

You can use Amazon Kinesis Data Streams to collect and process large streams of data records in real time. You can create data-processing applications, known as Kinesis Data Streams applications. A typical Kinesis Data Streams application reads data from a data stream as data records. These applications can use the Kinesis Client Library, and they can run on Amazon EC2 instances. You can send the processed records to dashboards, use them to generate alerts, dynamically change pricing and advertising strategies, or send data to a variety of other AWS services. For information about Kinesis Data Streams features and pricing, see the Amazon Kinesis Data Streams pricing page.
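
To complement that description, here is a minimal polling consumer sketched with boto3 rather than the Kinesis Client Library; reading only the first shard and starting from TRIM_HORIZON are simplifying assumptions for the example:

  import time
  import boto3

  kinesis = boto3.client("kinesis", region_name="us-east-1")

  # Pick the first shard of the stream (a real consumer would track all shards)
  shard_id = kinesis.list_shards(StreamName="platzi-kinesis")["Shards"][0]["ShardId"]

  # Start from the oldest record still retained in the stream
  iterator = kinesis.get_shard_iterator(
      StreamName="platzi-kinesis",
      ShardId=shard_id,
      ShardIteratorType="TRIM_HORIZON",
  )["ShardIterator"]

  while iterator:
      result = kinesis.get_records(ShardIterator=iterator, Limit=100)
      for record in result["Records"]:
          print(record["PartitionKey"], record["Data"])
      iterator = result.get("NextShardIterator")
      time.sleep(1)  # stay under the 5 read transactions/second/shard limit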