In an increasingly digital world, information has become an extremely valuable resource, especially for companies and institutions that work with it every day. The power of this large volume of content, known as Big Data, lies precisely in the ability to extract valuable information from it in order to make timely decisions and gain competitive advantages.
Against this backdrop, Amazon Web Services (AWS) Chile and Morris & Opazo, an Advanced Partner specializing in Big Data, held a workshop on Thursday, April 4th, for companies and public institutions, titled “Building a Data Lake in AWS”.
Participants included representatives from Autofin, ABCdin, the Ministry of Education, the Ministry of Health, the Digital Government Division, Pontificia Universidad Católica de Chile (PUC), and the General Treasury of the Republic of Chile, among others.
The training, held at the Amazon Web Services facilities in Chile, was led by Gerardo Gómez Dobree, South America Public Sector Country Sales at Amazon Web Services. During the first part of the meeting he presented the important advances AWS has made for the international market in fundamental aspects of the Cloud.
His talk focused on the main services AWS offers, best practices for cloud architecture, different approaches to implementing architectures in AWS, the Shared Responsibility Model, and AWS Regions and Availability Zones.
Turning to the concept and current context of Big Data analytics, Marcelo Rybertt, Country Manager of Morris & Opazo, went deeper into the topic of the Data Lake, as well as the services and tools the AWS platform provides for building the different components of the data flow.
Pamela Orbenes, Regional Partner Manager – Latam, attended on behalf of the AWS team to provide technical support and answer participants' questions, giving them the opportunity to explore how Big Data could be applied within their own companies.
Finally, to show everything covered in the session running live, William Guzmán, Chief Operations Officer of Morris & Opazo, presented three of the many use cases in which Big Data can support a company's operations:
Real-Time Sentiment Analysis in Social Networks:
Twitter provides an excellent opportunity to test the Data Stream concept. During the workshop, attendees were asked to post tweets containing certain words that a Twitter application would recognize.
Beforehand, a Node.js application had been started on an EC2 instance to ‘listen’ to the tweets being generated and select those containing the configured keywords. Whenever it detected a matching tweet, the application fed it into an Amazon Kinesis data flow, which delivered the tweets to an S3 bucket without any analysis having been performed up to that point.
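The filter-and-forward step described above can be sketched as follows. This is only an illustration in Python, not the workshop's own code. It assumes the Kinesis component was a Kinesis Data Firehose delivery stream (the article says only "Amazon Kinesis"), and the function names, the stream name, and the shape of the tweet dictionary are all hypothetical.

```python
import json


def matches_keywords(text, keywords):
    """Return True if any configured keyword appears in the text (case-insensitive)."""
    lowered = text.lower()
    return any(keyword.lower() in lowered for keyword in keywords)


def forward_matching_tweets(tweets, keywords, firehose, stream_name):
    """Send each matching tweet, unmodified, to a delivery stream; return how many were sent.

    `firehose` is expected to behave like boto3.client("firehose"). It is passed
    in so the function can be exercised without AWS credentials.
    """
    sent = 0
    for tweet in tweets:
        # Assumption: each tweet is a dict with at least a "text" field.
        if not matches_keywords(tweet["text"], keywords):
            continue
        # One JSON document per line keeps the objects written to S3 easy to parse.
        payload = (json.dumps(tweet, ensure_ascii=False) + "\n").encode("utf-8")
        firehose.put_record(DeliveryStreamName=stream_name, Record={"Data": payload})
        sent += 1
    return sent
```

No analysis happens here, matching the description: the raw tweet is stored first and enriched later.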
Once an object was stored in the Data Lake, an AWS Lambda function was triggered which, using Amazon Comprehend, identified the entities in each tweet and performed a sentiment analysis. If the tweet's original language was not English, the Lambda function used Amazon Translate to translate it into English. The results of these three analyses (entities, sentiment and translation) were stored back in the Data Lake (in different buckets).
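As a rough sketch of the enrichment step just described, the function below chains language detection, optional translation, entity detection and sentiment analysis. It is written in Python for illustration and is not the workshop's Lambda code. The function name and the shape of the result are hypothetical, and running the analysis on the English text rather than the original is an assumption.

```python
def analyze_tweet(text, comprehend, translate):
    """Return language, optional English translation, entities and sentiment for one tweet.

    `comprehend` and `translate` are expected to behave like
    boto3.client("comprehend") and boto3.client("translate").
    """
    # Pick the highest-scoring language reported by Comprehend.
    languages = comprehend.detect_dominant_language(Text=text)["Languages"]
    language = max(languages, key=lambda item: item["Score"])["LanguageCode"]

    translation = None
    english_text = text
    if language != "en":
        translation = translate.translate_text(
            Text=text, SourceLanguageCode=language, TargetLanguageCode="en"
        )["TranslatedText"]
        english_text = translation

    # Assumption: entities and sentiment are computed on the English text.
    entities = comprehend.detect_entities(Text=english_text, LanguageCode="en")["Entities"]
    sentiment = comprehend.detect_sentiment(Text=english_text, LanguageCode="en")["Sentiment"]

    return {
        "language": language,
        "translation": translation,
        "entities": [{"type": e["Type"], "text": e["Text"]} for e in entities],
        "sentiment": sentiment,
    }
```

Inside a Lambda handler, the three parts of the result could then be written to separate buckets, as the workshop pipeline did. That write step is omitted here.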
Amazon Athena was used to query the unstructured data generated by the Lambda function and make that information available to any consumer. The consumer chosen for the workshop was Amazon QuickSight, where attendees could watch the tweets they had posted during the session appear in the different sections of the dashboard:
- Most frequently mentioned entity types
- Sentiments per minute
- Original text, translation and sentiment of each tweet
- Geographical location of the tweet
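A dashboard section such as "sentiments per minute" could be backed by an Athena query along these lines. This is a sketch under stated assumptions: the table name `tweet_sentiment`, the columns `created_at` (an ISO 8601 string) and `sentiment`, the database name, and the results location are all hypothetical, since the article does not describe the schema.

```python
# Hypothetical table and column names; the real schema is not given in the article.
SENTIMENT_PER_MINUTE_SQL = """
SELECT date_trunc('minute', from_iso8601_timestamp(created_at)) AS minute,
       sentiment,
       count(*) AS tweets
FROM tweet_sentiment
GROUP BY 1, 2
ORDER BY 1
"""


def start_sentiment_query(athena, database, output_location):
    """Submit the query and return its execution id.

    `athena` is expected to behave like boto3.client("athena").
    """
    response = athena.start_query_execution(
        QueryString=SENTIMENT_PER_MINUTE_SQL,
        QueryExecutionContext={"Database": database},
        ResultConfiguration={"OutputLocation": output_location},
    )
    return response["QueryExecutionId"]
```

Athena runs queries asynchronously, so a caller would poll the execution status before reading results. A tool like QuickSight handles that step itself when Athena is configured as its data source.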
Audio Analysis of Calls from a Call Center with Machine Learning:
Through a dashboard implemented in Kibana, the companies attending the workshop saw how various audio files were processed and analyzed to identify the keywords, entities, text, data and metadata contained in each of them. This demonstrated one of the best practices for cloud architectures: there is no rigid coupling between the components of the data flow. In this use case Kibana was the consumer of the Data Lake, a different system from the one in the previous case, while S3 still served as the Data Lake.
Image and Video Recognition Using Machine Learning:
Attendees could see how a system built on Amazon Rekognition was trained by uploading a photo of a person, and how it then identified whether or not that person's face appeared in a different video or photo.
Finding this kind of match was not the only result. Rekognition also returned, in real time, detailed information about the tags, features, traits and celebrities in the video or photo.
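One common way to get this behavior with Rekognition is to index a reference face into a face collection and then search that collection using a new image. The sketch below assumes that approach, which the article does not confirm. It is limited to still images stored in S3, and the function names, collection id, bucket, keys and threshold are hypothetical.

```python
def register_face(rekognition, collection_id, bucket, key, person_id):
    """Index the face in a reference photo and return the ids of the indexed faces.

    `rekognition` is expected to behave like boto3.client("rekognition").
    The collection is assumed to exist already.
    """
    response = rekognition.index_faces(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        ExternalImageId=person_id,  # label returned later when this face matches
        MaxFaces=1,
    )
    return [record["Face"]["FaceId"] for record in response["FaceRecords"]]


def find_person(rekognition, collection_id, bucket, key, threshold=90.0):
    """Return (person_id, similarity) pairs for collection faces matching the photo.

    Rekognition searches using the largest face it detects in the image.
    """
    response = rekognition.search_faces_by_image(
        CollectionId=collection_id,
        Image={"S3Object": {"Bucket": bucket, "Name": key}},
        FaceMatchThreshold=threshold,
    )
    return [
        (match["Face"].get("ExternalImageId"), match["Similarity"])
        for match in response["FaceMatches"]
    ]
```

An empty list from `find_person` means no indexed face met the similarity threshold. Video would go through Rekognition's asynchronous video operations, which are not shown here.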
Text: Morris & Opazo
As an AWS Partner with the Big Data Competency, we have met a demanding set of requirements, such as demonstrating our deep technical knowledge and/or consulting experience in helping companies evaluate and use Big Data tools, techniques and technologies productively on AWS.
This means we have the knowledge, capability and tools needed to help your organization get the most out of its Big Data workloads on AWS.