This talk is about data: where to get it and how to create it if it doesn’t exist. I’ll take the audience through the process of creating the dataset for my most recent project and show how to view unavailable data as an opportunity rather than an obstacle to answering questions. I’ll cover how to get and read data, popular Python libraries for data analysis and processing such as NLTK (Natural Language Toolkit), pandas and Gensim, and techniques like regular expressions.
16. ['woman', 'female', 'girl', 'lady', 'women', 'mother', 'daughter', 'wife']
['man', 'male', 'boy', 'men', 'son', 'father', 'husband']
A gendered word is a word with one of these terms (above) in its definition.
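The rule on this slide can be sketched with regular expressions. This is a minimal illustration, not the code used in the talk: the names FEMALE_TERMS, MALE_TERMS, gender_terms and is_gendered are hypothetical, the "female"/"male" labels for the two lists are an assumption, and the sample definitions in the usage note are made up for the example.

```python
import re

# Term lists copied from the slide; the variable names are assumptions.
FEMALE_TERMS = ['woman', 'female', 'girl', 'lady',
                'women', 'mother', 'daughter', 'wife']
MALE_TERMS = ['man', 'male', 'boy', 'men', 'son',
              'father', 'husband']


def _pattern(terms):
    # \b word boundaries stop 'male' matching inside 'female'
    # and 'son' matching inside 'person'.
    joined = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(?:" + joined + r")\b", re.IGNORECASE)


FEMALE_RE = _pattern(FEMALE_TERMS)
MALE_RE = _pattern(MALE_TERMS)


def gender_terms(definition):
    """Return the gendered terms found in a definition, grouped by list."""
    return {
        "female": sorted({m.lower() for m in FEMALE_RE.findall(definition)}),
        "male": sorted({m.lower() for m in MALE_RE.findall(definition)}),
    }


def is_gendered(definition):
    """True if the definition contains at least one term from either list."""
    found = gender_terms(definition)
    return bool(found["female"] or found["male"])
```

With made-up definitions, is_gendered("a female ruler of a country") is true, while is_gendered("a seat for one person") is false even though "person" contains "son". Matching on whole words is a design choice made here; plain substring search would flag many more definitions.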
80. My meal wasn’t very tasty so I put some maggi on it.
My meal wasn’t very tasty so I put some salt on it.
My meal wasn’t very tasty so I put some seasoning on it.
I sat on the chair to eat my meal.
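One possible reading of these sentences is that words used in the same surrounding context ("maggi", "salt", "seasoning") are likely to be related, while a word in a different context ("chair") is not. The sketch below assumes that reading and measures it crudely by comparing the sets of neighbouring words. It is an illustration only, not the method from the talk and not how Gensim works internally; the function names, the window size of 2 and the use of Jaccard overlap are all assumptions.

```python
import re

# Sentences from the slide (with a plain ASCII apostrophe).
SENTENCES = [
    "My meal wasn't very tasty so I put some maggi on it.",
    "My meal wasn't very tasty so I put some salt on it.",
    "My meal wasn't very tasty so I put some seasoning on it.",
    "I sat on the chair to eat my meal.",
]


def tokenize(sentence):
    # Lowercase and keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", sentence.lower())


def context(word, window=2):
    """Collect the words within `window` positions of `word` in any sentence."""
    neighbours = set()
    for sentence in SENTENCES:
        tokens = tokenize(sentence)
        for i, token in enumerate(tokens):
            if token == word:
                neighbours.update(tokens[max(0, i - window):i])
                neighbours.update(tokens[i + 1:i + 1 + window])
    return neighbours


def jaccard(a, b):
    """Share of context words two sets have in common (0.0 to 1.0)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


same = jaccard(context("maggi"), context("salt"))
different = jaccard(context("maggi"), context("chair"))
print(same, different)
```

On these four sentences the overlap between "maggi" and "salt" is higher than between "maggi" and "chair", because the first pair shares the neighbours "put some ... on it". Trained word-embedding models use far more data and a learned representation rather than raw set overlap.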