Covariates available after instantiating a Cleaner object
The following list provides an overview of the covariates available in an instantiated Cleaner object. They can
be accessed through the raw_data attribute. It is advised to keep the metadata
parameter at its default for data sets that will be used with the remaining features of this package.
If metadata=False (default):
- created_at - timestamp of the creation of the corresponding tweet.
- text - shows the complete text of a tweet, regardless of whether it’s longer than 140 characters or not.
- text_tokens - contains the lemmatized tokens created from "text".
- hashtags - contains the hashtag(s) of a tweet (without “#”).
- center_coord_X - the X-coordinate of the center of the bounding box.
- center_coord_Y - the Y-coordinate of the center of the bounding box.
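To illustrate what the two center coordinates represent, here is a minimal sketch that computes the midpoint of a bounding box. It assumes the nested [longitude, latitude] layout used by Twitter place objects and that X corresponds to longitude and Y to latitude. The function name bounding_box_center and the sample box are hypothetical; the package may derive center_coord_X and center_coord_Y differently.

```python
def bounding_box_center(coordinates):
    """Return the (x, y) midpoint of a bounding box.

    `coordinates` is assumed to be a list holding one ring of
    [x, y] corner points, e.g. [[[x1, y1], [x2, y2], ...]].
    """
    ring = coordinates[0]
    xs = [point[0] for point in ring]
    ys = [point[1] for point in ring]
    # The midpoint of the extremes is the center of the enclosing rectangle.
    return (min(xs) + max(xs)) / 2, (min(ys) + max(ys)) / 2


# Hypothetical sample box, for illustration only.
sample_box = [[[-74.0, 40.0], [-74.0, 41.0], [-73.0, 41.0], [-73.0, 40.0]]]
center_x, center_y = bounding_box_center(sample_box)
```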
If metadata=True, these covariates are available in addition to the ones listed above:
- extended_tweet - shows the complete text of a tweet if it is longer than 140 characters. Otherwise None.
- id - the tweet’s ID as an integer.
- id_str - the tweet’s ID as a string.
- place - sub-dictionary: contains information about the tweet’s associated location.
- source - hyperlink to the Twitter website where the tweet object is stored.
- user - sub-dictionary: contains information about the tweet’s associated user.
- emojis - contains the emoji(s) of a tweet.
- bounding_box.coordinates_str - contains all bounding box coordinates as a string. Originates from place.
- retweet_count - number of retweets of the corresponding tweet.
- favorite_count - number of favorites of the corresponding tweet.
- user_created_at - timestamp of the creation of the user’s profile. Originates from user.
- user_description - textual description of the user’s profile. Originates from user.
- user_favourites_count - the total number of tweets the user has favorited (liked). Originates from user.
- user_followers_count - the total number of followers of the user. Originates from user.
- user_friends_count - the total number of users followed by the user. Originates from user.
- user_id - ID of the user’s profile as an integer. Originates from user.
- user_listed_count - the number of public lists that the user is a member of. Originates from user.
- user_location - self-defined location by the user for the profile. Originates from user.
- user_name - self-defined name for the user themselves. Originates from user.
- user_screen_name - alias of the self-defined name for the user themselves. Originates from user.
- user_statuses_count - number of tweets published by the user (incl. retweets). Originates from user.
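The relationship between the user sub-dictionary and the user_* covariates can be sketched as follows. This is an illustration only, not the package's implementation: the function flatten_user, the constant USER_FIELDS, and the sample tweet are hypothetical, and the field names are assumed to match the keys of the Twitter user object.

```python
# Keys of the "user" sub-dictionary that map to the user_* covariates above.
USER_FIELDS = (
    "created_at", "description", "favourites_count", "followers_count",
    "friends_count", "id", "listed_count", "location", "name",
    "screen_name", "statuses_count",
)


def flatten_user(tweet):
    """Return a flat dict with one user_<field> entry per user field.

    Fields missing from the sub-dictionary are filled with None.
    """
    user = tweet.get("user") or {}
    return {f"user_{field}": user.get(field) for field in USER_FIELDS}


# Hypothetical sample tweet, for illustration only.
sample_tweet = {
    "id": 1,
    "user": {"id": 42, "screen_name": "example", "followers_count": 10},
}
flat_user = flatten_user(sample_tweet)
```

Each resulting key corresponds to one covariate marked "Originates from user" in the list above.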