
Finding patterns between emotional state and played songs

Joseph Azanza
Asian Institute of Management

Executive Summary

The study of music and emotions is well explored in neuroscience, psychology, and physiology. However, experiments in these fields are usually time-consuming, costly, and difficult to scale, because the technologies researchers rely on (e.g., MRI, PET, EEG, and ERP) are designed for small-scale settings that focus on one participant at a time. This can limit the basic research and preliminary studies that are required before more complicated experiments can be attempted. This is where we saw an opportunity to apply Frequent Itemset Mining (FIM) and Association Pattern Mining (APM). More specifically, we asked: "Can we use FIM and APM techniques to identify patterns and rules between emotions and songs, such that researchers in neuroscience, psychology, and physiology can adapt the project for their preliminary studies?" If done right, instead of manually screening songs for their effects, researchers can use the mined patterns and rules as guidance for more targeted research.

To build a proof of concept, we used the #nowplaying-RS dataset published by Poddar, Zangerle, and Yang in 2018. The dataset contains 11.6 million music listening events (LEs) spanning the whole of 2014, collected via Twitter. It includes basic listening-event information, tweet information, track musicality features (via the Spotify API), and sentiment analysis results computed on the hashtags by the original researchers. In building the pipeline, we first cleaned and preprocessed the sentiment data, keeping only the scores from the SentiStrength lexicon. We then joined the sentiment data to the basic listening-event data, with the sentiment scores serving as the emotion state of each listening event. Further cleaning was done on the hashtags and tweet language to ensure a proper distribution and to keep English-only tweets. The transactional database was then generated from the result of the preceding steps.
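To make the pipeline concrete, below is a minimal sketch of the steps above in pandas. The file names, column names, sentiment binning, and the choice of one transaction per user per day are all assumptions for illustration; the project's actual code (linked at the bottom of the page) may differ.

```python
import pandas as pd

# Sentiment scores per hashtag; keep only the SentiStrength columns.
# "sentiment_values.csv" and the column names are assumed, not verified.
sentiments = pd.read_csv("sentiment_values.csv")
sentiments = sentiments[["hashtag", "ss_score"]].dropna()

# Listening events joined with tweet metadata; keep English-only tweets.
events = pd.read_csv("listening_events.csv")
events = events[events["tweet_lang"] == "en"]

# Attach an emotion state to each listening event via its hashtag.
events = events.merge(sentiments, on="hashtag", how="inner")

# Discretize the SentiStrength score into a coarse emotion label
# (tercile binning is purely illustrative).
events["emotion"] = pd.qcut(events["ss_score"], q=3,
                            labels=["negative", "neutral", "positive"])

# Build the transactional database: here, one transaction per user per day,
# containing the emotion labels and the tracks played (an assumed granularity).
events["date"] = pd.to_datetime(events["created_at"]).dt.date
transactions = (
    events.groupby(["user_id", "date"])
          .apply(lambda g: sorted(set(g["track_id"].astype(str))
                                  | set(g["emotion"].astype(str))))
          .tolist()
)
```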

From the transactional database, we performed FIM and APM, from which we were able to recommend actionable insights such as:

Overall, we were successful in finding patterns between emotions and songs. With the methodology and findings of this project, researchers can replicate, adapt, or tweak the study using their own databases of emotion states and songs. The flexibility of the methodology gives researchers guided insights before performing experiments, which can potentially cut down the time, cost, and effort required for preliminary studies. In terms of recommendations, future studies can use a larger dataset with more time periods and more songs for better generalizability. The musical characteristics of each song can also be factored in to surface more interesting insights and rules.
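As a rough illustration of the mining step described above, the sketch below runs Apriori-based FIM and rule generation with mlxtend on the `transactions` list from the earlier sketch; mlxtend is assumed here for illustration and the support and lift thresholds are illustrative, not the project's actual settings.

```python
from mlxtend.preprocessing import TransactionEncoder
from mlxtend.frequent_patterns import apriori, association_rules
import pandas as pd

# One-hot encode the transactions into a boolean item matrix.
te = TransactionEncoder()
onehot = pd.DataFrame(te.fit(transactions).transform(transactions),
                      columns=te.columns_)

# Frequent Itemset Mining: keep itemsets appearing in at least 0.1% of transactions.
frequent_itemsets = apriori(onehot, min_support=0.001, use_colnames=True)

# Association Pattern Mining: generate rules such as {positive} -> {track},
# ranked by lift so the strongest emotion-song associations surface first.
rules = association_rules(frequent_itemsets, metric="lift", min_threshold=1.2)
rules = rules.sort_values("lift", ascending=False)
print(rules[["antecedents", "consequents", "support", "confidence", "lift"]].head())
```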


Source code available at BDCC_MP1
