
Reliable streaming data ingestion on GCP


How does Google Cloud Platform manage reliable data ingestion?

Google Cloud Platform (GCP) offers managed services for ingesting data reliably. To use them well, it helps to understand how the data is generated: events are produced on a massive scale, from e-commerce browsing to social media sharing. Within an organization, this translates into three main use cases:

  1. User event ingestion: on platforms such as Mercado Libre, every user action generates an event in real time.
  2. Data ingestion from databases with CDC (Change Data Capture): this technique captures changes in a database so that other systems can act on them.
  3. Event enrichment with artificial intelligence: Google APIs analyze and enrich unstructured data, such as photos and videos.
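The CDC use case can be pictured with a small, self-contained sketch. It assumes a simplified change-event shape (`op`, `key`, `row`) invented for illustration; real CDC tools emit richer records, and nothing here is specific to a GCP service.

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class ChangeEvent:
    """Hypothetical, simplified change record captured from a source table."""
    op: str                              # "insert", "update", or "delete"
    key: str                             # primary key of the affected row
    row: Optional[Dict[str, Any]] = None  # changed columns (None for deletes)


def apply_change(replica: Dict[str, Dict[str, Any]], event: ChangeEvent) -> None:
    """Apply one change event to an in-memory replica of the table."""
    if event.op in ("insert", "update"):
        # Merge the changed columns into whatever the replica already holds.
        replica[event.key] = {**replica.get(event.key, {}), **(event.row or {})}
    elif event.op == "delete":
        replica.pop(event.key, None)
    else:
        raise ValueError(f"unknown operation: {event.op}")


replica: Dict[str, Dict[str, Any]] = {}
events = [
    ChangeEvent("insert", "order-1", {"status": "created", "total": 120}),
    ChangeEvent("update", "order-1", {"status": "paid"}),
    ChangeEvent("insert", "order-2", {"status": "created", "total": 45}),
    ChangeEvent("delete", "order-2"),
]
for event in events:
    apply_change(replica, event)

print(replica)
```

Replaying the events in order leaves the replica with only `order-1`, now marked as paid, which is the state of the source table after the same changes.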

What differentiates a data-driven organization from an event-driven organization?

A data-driven organization takes a strategic approach: before acting, it plans around strategies and hypotheses, which implies long-term development. An event-driven organization, in contrast, responds to data in real time. It lets events dictate actions, which allows a faster and more adaptive reaction to business needs.

What are the characteristics of each approach?

  • Data-driven:

    • Long-term strategy and hypotheses.
    • Low time sensitivity.
    • Planning happens before the strategy is implemented.
  • Event-driven:

    • Rapid and adaptive response.
    • Actions defined by real-time events.
    • Data drives decisions, enabling agile execution.
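The event-driven side of this comparison can be sketched as a handler registry, where each incoming event immediately triggers whatever actions are registered for its type. The event types, field names, and handlers below are hypothetical and chosen only to illustrate the pattern.

```python
from typing import Callable, Dict, List

Handler = Callable[[dict], None]

# Maps an event type to the handlers that react to it.
handlers: Dict[str, List[Handler]] = {}
# Records the actions taken, so the effect of each event is visible.
actions: List[str] = []


def on(event_type: str) -> Callable[[Handler], Handler]:
    """Decorator that registers a handler for one event type."""
    def register(fn: Handler) -> Handler:
        handlers.setdefault(event_type, []).append(fn)
        return fn
    return register


@on("purchase")
def check_stock(event: dict) -> None:
    actions.append(f"restock-check:{event['sku']}")


@on("cart_abandoned")
def send_reminder(event: dict) -> None:
    actions.append(f"remind:{event['user']}")


def dispatch(event: dict) -> None:
    """React to an event as soon as it arrives; unknown types are ignored."""
    for handler in handlers.get(event["type"], []):
        handler(event)


for event in [
    {"type": "purchase", "user": "ana", "sku": "A-1"},
    {"type": "page_view", "user": "luis"},
    {"type": "cart_abandoned", "user": "luis"},
]:
    dispatch(event)

print(actions)
```

A data-driven workflow would instead collect these events, analyze them later, and feed the result into a planning decision made by people.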

How does Google Cloud facilitate these data ingestion approaches?

Google provides a platform that encompasses five key points for reliable data ingestion:

  1. Robust ingestion services: capture events regardless of their size or velocity.
  2. Unified data ingestion: process data in batch or in streaming mode without rewriting the code.
  3. Serverless architecture: removes the need to manage servers, so effort goes into the data.
  4. Tools for making sense of data: extract meaningful information in real time.
  5. Flexibility for users: no programming experience is required to take advantage of the platform.
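The idea behind point 2 can be illustrated without any GCP dependency: if the transformation is written against a generic iterable, the same code serves a finite batch and an event-by-event stream. This is only a plain-Python analogy with made-up event fields, not how Dataflow or Apache Beam work internally.

```python
from typing import Dict, Iterable, Iterator, List


def to_revenue(events: Iterable[Dict]) -> Iterator[Dict]:
    """Keep purchase events and compute revenue; agnostic to batch vs. stream."""
    for event in events:
        if event["type"] == "purchase":
            yield {"user": event["user"],
                   "revenue": event["price"] * event["quantity"]}


# Batch input: a finite collection that is fully available up front.
BATCH: List[Dict] = [
    {"type": "purchase", "user": "ana", "price": 20, "quantity": 2},
    {"type": "page_view", "user": "luis"},
    {"type": "purchase", "user": "luis", "price": 5, "quantity": 3},
]


def stream() -> Iterator[Dict]:
    """Streaming input: events arrive one at a time (simulated with a generator)."""
    for event in BATCH:
        yield event


batch_result = list(to_revenue(BATCH))
stream_result = list(to_revenue(stream()))

print(batch_result)
print(batch_result == stream_result)
```

Both runs produce the same records because the transformation never depends on how its input is delivered.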

What products support this architecture?

Pub/Sub

  • Global service that captures data at the point closest to where it is produced.
  • Scalable, processing up to 100 GB per second.
  • Use case: Spotify, handling 8.5 million events per second.
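from google.cloud import pubsub_v1

# Create a publisher client and build the fully qualified topic path.
client = pubsub_v1.PublisherClient()
topic_path = client.topic_path('your-project', 'your-topic')

# Pub/Sub payloads are bytes, so encode the message before publishing.
data = 'your-message'.encode('utf-8')
# publish() returns a future; result() blocks until the message is sent.
client.publish(topic_path, data).result()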

Dataflow

  • Based on Apache Beam, so the same pipeline code can be reused for batch or real-time processing.
  • Beam pipelines are portable across other processing engines such as Apache Flink and Apache Spark.
  • Guarantees exactly-once message processing in conjunction with Pub/Sub.
import apache_beam as beam

# Build and run a small pipeline: create the input, transform it, write it out.
with beam.Pipeline() as p:
    (p
     | 'Input' >> beam.Create([1, 2, 3, 4, 5])
     | 'Multiply' >> beam.Map(lambda x: x * 10)
     | 'Output' >> beam.io.WriteToText('output.txt'))
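One common way to reason about exactly-once processing is an idempotent consumer: messages may be delivered more than once, but each message ID is applied only one time. The sketch below is a generic illustration under that assumption, with invented message IDs and amounts; it does not describe how Dataflow implements its guarantee.

```python
from typing import List, Set, Tuple


class IdempotentConsumer:
    """Applies each message at most once by remembering the IDs already seen."""

    def __init__(self) -> None:
        self._seen: Set[str] = set()
        self.total = 0

    def handle(self, message_id: str, amount: int) -> bool:
        """Return True if the message was applied, False if it was a duplicate."""
        if message_id in self._seen:
            return False
        self._seen.add(message_id)
        self.total += amount
        return True


# Simulated at-least-once delivery: "m1" and "m2" are redelivered.
deliveries: List[Tuple[str, int]] = [
    ("m1", 10), ("m2", 5), ("m1", 10), ("m3", 7), ("m2", 5),
]

consumer = IdempotentConsumer()
applied = [consumer.handle(message_id, amount) for message_id, amount in deliveries]

print(applied)
print(consumer.total)
```

Without the ID check the total would count the redelivered messages twice; with it, the result matches a single delivery of each message.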

Other components

  • BigQuery: stores event data in a serverless, scalable data warehouse.
  • AI Platform and TensorFlow: operationalize artificial intelligence models, enabling complex analytics and predictions.

Why choose Google for data ingestion?

  • A unified process for data ingestion and analysis, in batch and in real time.
  • Integrated solutions that democratize analytics.
  • Success stories such as Emarsys, which processes 250,000 events per second and reduced costs by 70%.

Google Cloud is a robust and flexible ally for any organization that wants to implement reliable data ingestion, adapting to changing demands and scaling with business growth.

Contributions 4


Serverless architecture: I can forget about configuring servers and about managing server infrastructure. I FOCUS on the DATA: being able to make sense of the information, process it, understand it, and read it to obtain strategic data.

How is this data generated?
- Digital devices
- E-commerce
- Communications
- Digital media consumption
Key use cases:
- User event ingestion
- Data storage and CDC (Change Data Capture)
- Event enrichment and ML.

Data driven vs. event driven:
Data driven: does not act on the data in real time; more planning.
- Humans involved
- Long-term ideas
- Determine product strategy
- Determine customer segmentation
- Inform marketing campaigns
Event driven: the data tells the company what to do.
- Automated processes
- Responsive in real time
- Drives instant interactions
- Determines the product
- Selects marketing campaigns

Products for reliable data ingestion:
Pub/Sub:
- Event-driven messaging for data ingestion
- It is a global product; no location is specified
- Processes up to 100 GB per second
Dataflow:
- Simplified streaming and batch data processing
- It is open source
- Guarantees message delivery
BigQuery:
- Cloud data warehouse
- Analytics can be applied to the data with AI

Data ingestion use cases:

User events: user interactions on the platform (clicks and others).
Storage and Change Data Capture (CDC): a record of how the data changes, used to trigger events.
Enrichment with ML/AI: adding information through voice analysis or computer vision with Google APIs.

ML, data ingestion from external sources