Artificial intelligence and natural language processing (NLP) have changed the way we analyze textual data. These technologies let us extract valuable information from reviews, comments and other text, providing insights that can be crucial for businesses and organizations. Here we will explore how to implement a review analysis system for Mercado Libre using pre-trained models and advanced NLP techniques.
Product review analysis is a practical and powerful application of natural language processing. Using a graphical interface, we can allow users to load individual reviews or entire datasets to obtain relevant metrics on sentiment and the entities mentioned.
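The interface we will develop has two main functionalities: analyzing a single review entered as text, and analyzing a CSV file that contains a whole dataset of reviews.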
For example, when entering a review such as "I liked the Levis pants I bought in Belgrano", the system identifies that it is a positive comment (Label 1) with 99% certainty, and recognizes entities such as the brand "Levis" and the location "Belgrano".
To implement this solution, we need to install and use several Python libraries:
# Install dependencies
!pip install transformers wordcloud pandas pillow
Then, we import the necessary libraries:
import pandas as pd
from transformers import pipeline
from wordcloud import WordCloud
from PIL import Image  # To export the word cloud as an image
The core of our application consists of two NLP pipelines, one for sentiment classification and one for entity recognition:
# Definition of NLP pipelines
sentiment_analysis = pipeline("text-classification", model="tu_modelo_entrenado")
ner = pipeline("ner", model="modelo_para_entidades_en_español")
Text cleaning is essential to obtain accurate results. Unlike in other cases, here we keep the original casing, because capitalization is an important cue for recognizing entities such as brands and places:
def clean_text(text):
    # Remove bracketed text
    # Remove URLs
    # Remove HTML tags
    # Remove extra spaces

    # Important: we do NOT convert to lowercase to preserve entities
    return cleaned_text
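The cleaning steps named in the comments could be filled in with regular expressions. The following is a minimal sketch rather than the course's implementation: the specific patterns are assumptions (square brackets for "bracketed text", and `http://`, `https://` or `www.` prefixes for URLs), and they may need adjusting for real review data.

```python
import re


def clean_text(text):
    """Clean a review while preserving its original casing."""
    # Remove bracketed text such as "[edited]" (assumes square brackets)
    cleaned_text = re.sub(r"\[[^\]]*\]", " ", text)
    # Remove URLs (assumes they start with http://, https:// or www.)
    cleaned_text = re.sub(r"(?:https?://|www\.)\S+", " ", cleaned_text)
    # Remove HTML tags such as <br> or <b>
    cleaned_text = re.sub(r"<[^>]+>", " ", cleaned_text)
    # Collapse runs of whitespace and trim both ends
    cleaned_text = re.sub(r"\s+", " ", cleaned_text).strip()
    # Casing is deliberately left untouched so entities stay recognizable
    return cleaned_text


sample = "I liked the <b>Levis</b> pants [edited] I bought in Belgrano https://example.com/item?id=1  "
print(clean_text(sample))  # I liked the Levis pants I bought in Belgrano
```

Each removed span is replaced with a space and the whitespace is collapsed at the end, so words on either side of a tag or URL do not get glued together.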
For entity recognition, we need a function that reconstructs the complete entities from the identified tokens:
def reconstruct_entity(ner_results):
    # Process NER results to obtain complete entities
    # Return entities in structured format
    return processed_entities
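As an illustration of what such a reconstruction might look like, here is a sketch under stated assumptions. It assumes the NER pipeline returns one dict per token with `entity`, `score` and `word` keys, that the model uses `B-`/`I-` tags, and that the tokenizer marks subword pieces with a `##` prefix (WordPiece style). The output keys `label` and `score` are hypothetical; `text` is chosen because the word cloud function in this lesson reads `entity['text']`.

```python
def reconstruct_entity(ner_results):
    """Merge token-level NER predictions into whole entities."""
    merged = []
    for token in ner_results:
        word = token["word"]
        # "B-ORG" -> prefix "B", label "ORG"; untagged labels are treated as "B"
        prefix, sep, label = token["entity"].partition("-")
        if not sep:
            prefix, label = "B", token["entity"]
        is_subword = word.startswith("##")
        same_entity = prefix == "I" and bool(merged) and merged[-1]["label"] == label
        if merged and (is_subword or same_entity):
            # Glue subword pieces directly; separate whole words with a space
            merged[-1]["text"] += word[2:] if is_subword else " " + word
            merged[-1]["scores"].append(float(token["score"]))
        else:
            merged.append({
                "text": word[2:] if is_subword else word,
                "label": label,
                "scores": [float(token["score"])],
            })
    # Report one averaged confidence score per entity
    return [
        {"text": m["text"], "label": m["label"],
         "score": sum(m["scores"]) / len(m["scores"])}
        for m in merged
    ]


# Hand-written sample that mimics token-level pipeline output
sample = [
    {"entity": "B-ORG", "score": 0.98, "word": "Lev"},
    {"entity": "I-ORG", "score": 0.96, "word": "##is"},
    {"entity": "B-LOC", "score": 0.99, "word": "Buenos"},
    {"entity": "I-LOC", "score": 0.97, "word": "Aires"},
]
for entity in reconstruct_entity(sample):
    print(entity["text"], entity["label"])
```

The Transformers token-classification pipeline also accepts an `aggregation_strategy` argument (for example `"simple"`) that groups tokens for you and returns `entity_group` and `word` keys. Whether that grouping is good enough depends on the model and tokenizer, so a manual function like this one remains useful when more control is needed.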
Our application has two main functions that correspond to the two interface functionalities:
def analyze_text(text):
    # Clean text
    cleaned_text = clean_text(text)

    # Get sentiment analysis
    sentiment = sentiment_analysis(cleaned_text)

    # Get recognized entities
    ner_result = ner(cleaned_text)
    entities = reconstruct_entity(ner_result)

    # Return structured results
    return {
        "sentiment": sentiment,
        "entities": entities
    }
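For a single input string, a Transformers text-classification pipeline returns a list containing one dict with `label` and `score` keys, so the `sentiment` value above is not yet display-ready. The helper below is a hypothetical sketch of how it could be turned into a readable string. The name `summarize_sentiment` and the `LABEL_NAMES` mapping are assumptions: the lesson only states that Label 1 is positive, and treating `LABEL_0` as negative is a guess that depends on how the model was trained.

```python
# Hypothetical mapping: only "LABEL_1 = positive" comes from the lesson
LABEL_NAMES = {"LABEL_0": "negative", "LABEL_1": "positive"}


def summarize_sentiment(sentiment):
    """Format the top prediction of a text-classification result."""
    top = sentiment[0]
    # Fall back to the raw label when it is not in the mapping
    name = LABEL_NAMES.get(top["label"], top["label"])
    return f"{name} ({top['score']:.0%})"


# Hand-written sample in the shape the pipeline returns for one string
sample_sentiment = [{"label": "LABEL_1", "score": 0.9912}]
print(summarize_sentiment(sample_sentiment))  # positive (99%)
```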
def analyze_csv(csv_file):
    # Load CSV with pandas
    df = pd.read_csv(csv_file)

    # Verify 'ReviewBody' column exists
    if 'ReviewBody' not in df.columns:
        return {"error": "CSV must contain a column named 'ReviewBody'"}

    results = []
    all_entities = []

    # Process each review
    for review in df['ReviewBody']:
        cleaned_review = clean_text(review)
        sentiment = sentiment_analysis(cleaned_review)
        ner_result = ner(cleaned_review)
        entities = reconstruct_entity(ner_result)

        results.append({
            "review": review,
            "sentiment": sentiment,
            "entities": entities
        })

        all_entities.extend(entities)

    # Generate wordcloud from entities
    wordcloud = generate_wordcloud(all_entities)

    return {
        "results": results,
        "wordcloud": wordcloud,
        "file_size": len(df)
    }
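One practical detail with real CSV files is that empty cells are read by pandas as `NaN` (a float), which would break a string-based cleaning step. The sketch below shows one way to load the reviews defensively before the processing loop. The helper name `load_reviews` is hypothetical, and raising an exception instead of returning an error dict is a design choice made only for this example.

```python
import io

import pandas as pd


def load_reviews(csv_file, column="ReviewBody"):
    """Return the non-empty reviews in `column` as a list of strings."""
    df = pd.read_csv(csv_file)
    if column not in df.columns:
        raise ValueError(f"CSV must contain a column named '{column}'")
    # Drop missing cells, force string type, and trim surrounding spaces
    reviews = df[column].dropna().astype(str).str.strip()
    # Discard reviews that are empty after trimming
    return reviews[reviews != ""].tolist()


# In-memory CSV standing in for an uploaded file; the second row has no review
sample_csv = io.StringIO("ReviewBody,Stars\nGreat Nike shoes,5\n,3\nFast delivery,4\n")
print(load_reviews(sample_csv))  # ['Great Nike shoes', 'Fast delivery']
```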
An interesting feature of our application is the generation of a wordcloud that visually displays the most frequent entities in the dataset:
def generate_wordcloud(entities):
    # Create a dictionary of entity frequencies
    entity_freq = {}
    for entity in entities:
        if entity['text'] in entity_freq:
            entity_freq[entity['text']] += 1
        else:
            entity_freq[entity['text']] = 1

    # Generate wordcloud
    wordcloud = WordCloud(width=800, height=400, background_color='white').generate_from_frequencies(entity_freq)

    # Convert to a PIL image
    wordcloud_image = wordcloud.to_image()
    return wordcloud_image
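The frequency-counting step can also be written with `collections.Counter` from the standard library. This is an equivalent sketch, not a change to the lesson's approach. The helper name `entity_frequencies` is hypothetical, and it assumes each entity is a dict with a `text` key, as in the function above.

```python
from collections import Counter


def entity_frequencies(entities):
    """Count how many times each entity text appears."""
    return Counter(entity["text"] for entity in entities)


# Hand-written sample entities
sample_entities = [{"text": "Nike"}, {"text": "Adidas"}, {"text": "Nike"}]
freq = entity_frequencies(sample_entities)
print(freq.most_common(1))  # [('Nike', 2)]
```

A `Counter` is a `dict` subclass, so it can be passed wherever a plain frequency dictionary is expected. A word cloud needs at least one word, so it is worth checking that the counter is not empty before generating the image, for example when a dataset yields no entities.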
This functionality is especially useful to quickly identify the brands most mentioned in the reviews, such as Nike, Adidas, Levi's or Zara, providing an overview of consumer preferences.
The graphical interface we have developed not only allows these analyses to be performed interactively, but also offers the possibility of generating a practical API to integrate these functionalities into other applications.
Product review analysis is just one of the many possible applications of natural language processing in e-commerce. What other applications can you think of for these technologies? Share your ideas and experiences in the comments.