Data Visualization of Cloud Legazpi Feedback Portal Open Data Using Word Clouds

Published by Cloudcity Admin on


Jennifer L. Llovido

 

Data visualizations of the feedback data from Cloud Legazpi were enhanced by employing word clouds. A word cloud is an image of words in which the size of each word indicates its frequency or importance. It is a useful way to visualize text because it is easy to read and simple to understand: the key words stand out and are visually appealing to the audience. Here it helps to visually interpret the messages sent through the cloud city public analytics portal, giving quick insight into the most prominent items in the list of messages by presenting the word frequencies as a weighted list. The following steps were undertaken to generate the word clouds:

Step 1: Extract data (messages) from the cloudcity portal. Four types of messages were identified: Public Utilities, Local Government Unit (LGU), Outside Local Government Unit (LGU), and Free Messages.

Step 2: Pre-process the data. This includes data cleaning, the process of detecting and then correcting or removing inaccurate records from the available data. Capitalization was ignored, and unnecessary characters and symbols were discarded. Spell checking and stemming were then applied. Stemming reduces words to a root by removing inflection, usually by dropping a suffix. Table 1 shows the stemming applied to certain words in the text corpus.

 

Original Word                           Resulting Word After Stemming
taong                                   tao
isang                                   isa
ibang                                   iba
Jeepnies / jeeps                        jeep
Inaayos / maayos / ayusin / ayuson      ayos
serbisyong                              serbisyo
Pangpublic / publico                    public

Table 1: Stemming Applied to Text Corpus
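The cleaning and stemming in Step 2 can be sketched in Python as follows. This is a minimal illustration, not the procedure actually used for the portal data: the function name `preprocess` and the lookup-table approach are assumptions, the lookup holds only the pairs from Table 1, and spell checking is left out.

```python
import re

# Stem lookup copied from Table 1. A full pipeline would need a longer list;
# the dictionary approach itself is an assumption made for this sketch.
STEM_MAP = {
    "taong": "tao",
    "isang": "isa",
    "ibang": "iba",
    "jeepnies": "jeep",
    "jeeps": "jeep",
    "inaayos": "ayos",
    "maayos": "ayos",
    "ayusin": "ayos",
    "ayuson": "ayos",
    "serbisyong": "serbisyo",
    "pangpublic": "public",
    "publico": "public",
}


def preprocess(message):
    """Lowercase a message, drop non-letter characters, and stem each token."""
    lowered = message.lower()
    # Replace anything that is not a letter or whitespace with a space.
    letters_only = re.sub(r"[^a-zñ\s]", " ", lowered)
    tokens = letters_only.split()
    # Words without an entry in the lookup are kept unchanged.
    return [STEM_MAP.get(token, token) for token in tokens]
```

Calling `preprocess("Maayos ang SERBISYONG pangpublic!")` returns `["ayos", "ang", "serbisyo", "public"]`. Stop words such as "ang" are still present at this point, since they are filtered in the next step.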

Step 3: Identify stop words. These were filtered out before generating the word clouds. The identified stop words include: laging nandoon nagiging legazpi illegal maiproseso kayo kinalang po sapagkat mas hehe public hindi full give rizal sapagkat nag aking kanilang daming isa kahit ibang wala kapag mang kaya kung st paki ay na sa pag ng pag na ng ang to of with or are in pa is not and at ay hi sample our mag namin an for would yung from mga so din baga man anuman upang this the rin din yaaks we rd ung may lang dai mejo sila dahil kayang munang along kanya every muna inyong kasi naman para nila tanging naming ngunit maging atin ano tayo ngayon specialy tabi iba some near only anuman anumang same nito ito maging magiging gaano gaanong puro sana bawat going miss you dapat go be up as less gawin lalo hanggang kapag mangkaya there whom osm they asdasd kapag mang kaya
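Filtering stop words and counting what remains might look like the sketch below. The set holds only a handful of the stop words listed in Step 3, and both the function name `count_words` and the sample tokens are made up for illustration.

```python
from collections import Counter

# A small subset of the stop words listed in Step 3, for illustration only.
STOP_WORDS = {"ang", "ng", "sa", "na", "mga", "po", "the", "and", "of", "is"}


def count_words(tokens, stop_words=STOP_WORDS):
    """Count how often each token occurs, skipping stop words."""
    return Counter(token for token in tokens if token not in stop_words)


# Hypothetical tokens, not taken from the portal data.
tokens = ["traffic", "sa", "daan", "ang", "traffic", "po", "jeep"]
frequencies = count_words(tokens)
```

The resulting `frequencies` mapping is the weighted list that a word cloud draws: each remaining word paired with the number of times it occurs.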

Step 4: Generate word cloud representations for all the messages combined as well as for each type of message.

Step 4A: Generate the word cloud using WordItOut (https://worditout.com/word-cloud/create), a word cloud generator that offers many custom settings.

Step 4B: Utilize other word cloud generators such as https://amueller.github.io/word_cloud, which generates word clouds in Python.
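For Step 4B, a minimal sketch with the `wordcloud` Python package could look like this. The helper names, image size, background color, and output file name are assumptions. In the sample data only the count for "traffic" comes from the results reported in this article; the other counts are placeholders.

```python
def filter_by_min_frequency(frequencies, min_frequency):
    """Keep only the words that occur at least min_frequency times."""
    return {word: count for word, count in frequencies.items() if count >= min_frequency}


def render_word_cloud(frequencies, output_path, max_words=100):
    """Draw a word cloud from a word -> count mapping and save it as an image."""
    # Imported here so the helper above works without the third-party package.
    from wordcloud import WordCloud

    cloud = WordCloud(width=800, height=400, background_color="white",
                      max_words=max_words)
    cloud.generate_from_frequencies(frequencies)
    cloud.to_file(output_path)


# "traffic": 9 matches the reported result; the other counts are placeholders.
sample = {"traffic": 9, "jeep": 3, "terminal": 1}
frequent = filter_by_min_frequency(sample, 2)
# Uncomment after installing the package (pip install wordcloud):
# render_word_cloud(frequent, "public_utilities.png")
```

Raising the minimum frequency leaves fewer, more prominent words in the image, while a minimum of 1 shows every word up to the `max_words` limit.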

A. Public Utilities

Minimum Frequency of words: 2

Total Number of words: 21

Highest Frequency of word (traffic): 9

Minimum Frequency of words: 1

Total Number of words: 100

Highest Frequency of word (traffic): 9

B. Local Government Unit (LGU)

Minimum Frequency of words: 2

Total Number of words: 36

Frequency of words: Good: 8, Service: 7, Serbisyo: 7, thank: 6

Minimum Frequency of words: 1

Total Number of words: 97

Highest Frequency of word (Good): 8

C. Outside Local Government Unit (LGU)

Minimum Frequency of words: 1

Total Number of words: 33

Highest Frequency of word (barangay): 7

Minimum Frequency of words: 2

Total Number of words: 6

Highest Frequency of word (barangay): 7

D. Free Messages (Remarks)

Minimum Frequency of words: 1

Total Number of words: 100

Highest Frequency of word (fast): 7

Minimum Frequency of words: 2

Total Number of words: 47

Highest Frequency of word (fast): 7

E. All Messages: Public Utilities, Local Government Unit (LGU), Outside Local Government Unit (LGU) and Free Messages (Remarks)

Minimum Frequency of words: 3

Total Number of words: 50

Highest Frequency of word (good): 15, (serbisyo): 14

Minimum Frequency of words: 2

Total Number of words: 100

Highest Frequency of word (good): 15, (serbisyo): 14

Minimum Frequency of words: 1

Total Number of words: 100

Highest Frequency of word (good): 15, (serbisyo): 14
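The three figures reported for each word cloud above (minimum frequency, total number of words, and highest-frequency word) can be derived from a word-count mapping roughly as follows. The function name `summarize` is hypothetical, and the default cap of 100 words is an assumption suggested by the totals above rather than a documented setting. In the sample data only the counts for "good" and "serbisyo" come from the reported results.

```python
from collections import Counter


def summarize(frequencies, min_frequency, max_words=100):
    """Return the three figures reported for each word cloud."""
    # Drop words below the threshold, then keep at most max_words of the rest.
    counts = Counter({w: c for w, c in frequencies.items() if c >= min_frequency})
    shown = counts.most_common(max_words)
    top_word, top_count = shown[0] if shown else (None, 0)
    return {
        "minimum_frequency": min_frequency,
        "total_words": len(shown),
        "highest": (top_word, top_count),
    }


# "good": 15 and "serbisyo": 14 match the reported results; the rest are placeholders.
sample = {"good": 15, "serbisyo": 14, "fast": 7, "thank": 1}
summary = summarize(sample, min_frequency=2)
```

With a minimum frequency of 2, the word "thank" is dropped from this sample, so the summary counts three words and reports "good" as the most frequent.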

Ms. Jen Llovido is from the Bicol University College of Science and is currently finishing her doctorate degree in Information Technology at the University of the Cordilleras, Baguio City, Philippines.