This document contains a description of all included data files.

=====================================================
FILE NAME: Comment-Sentiment.csv.zip

A collection of all comments with accompanying sentiment score, parent, lag, and cardinality data.

COLUMN NAMES:

video_nid
    A unique ID for the parent video.
comment_nid
    A unique ID for the comment.
sentiment_pos
    The positive sentiment value of the comment as determined by SentiStrength.
sentiment_neg
    The negative sentiment value of the comment as determined by SentiStrength.
parent_nid
    The unique ID of the parent comment if the row is a reply.
cardinality
    The order that the reply comes in the thread.
like_count
    The number of times the comment/reply was liked.
parent_sentiment_pos
    The positive sentiment value of the reply's parent comment as determined by SentiStrength.
parent_sentiment_neg
    The negative sentiment value of the reply's parent comment as determined by SentiStrength.
replies_siblings
    The number of sibling replies in the thread.
sentiment_pos_lag_X
    The positive sentiment of the previous reply in the thread with a lag of X (up to 4).
sentiment_neg_lag_X
    The negative sentiment of the previous reply in the thread with a lag of X (up to 4).
video_format
    Whether the parent video's format was a typical TED Talk or an Animation.
presenter_gender
    The gender of the parent video's presenter (not available for Animations).
row_type
    Comment (if the row is a response to the parent video) or reply (if the row is a response to a parent comment).
valid
    Binary indicating whether the row had sufficient data for analysis (e.g., sentiment scores).

=====================================================
FILE NAME: Comment-ID-Cross-Reference.csv.zip

Cross-referencing data for connecting Comment-Sentiment.csv IDs to each comment's unique YouTube ID.

COLUMN NAMES:

comment_nid
    A unique ID for the comment.
comment_id
    The YouTube unique ID for the comment.
cardinality
    The order that the reply comes in the thread.
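
One possible reading of the sentiment_pos_lag_X and sentiment_neg_lag_X columns in Comment-Sentiment.csv is sketched below. It assumes that "lag of X" means the reply X positions earlier in the same thread (same parent_nid) when ordered by cardinality, and that a missing earlier reply is left empty. The function name add_lag_columns and the sample rows are hypothetical and are not taken from the data files.

```python
from collections import defaultdict


def add_lag_columns(replies, max_lag=4):
    """Return copies of reply rows with sentiment_*_lag_X columns added.

    Assumption: lag X refers to the reply X positions earlier in the
    same thread (same parent_nid), ordered by cardinality.
    """
    # Group replies by the parent comment they belong to.
    threads = defaultdict(list)
    for row in replies:
        threads[row["parent_nid"]].append(row)

    out = []
    for rows in threads.values():
        ordered = sorted(rows, key=lambda r: r["cardinality"])
        for i, row in enumerate(ordered):
            new_row = dict(row)
            for lag in range(1, max_lag + 1):
                # None marks a lag that reaches before the first reply.
                prev = ordered[i - lag] if i - lag >= 0 else None
                new_row[f"sentiment_pos_lag_{lag}"] = prev["sentiment_pos"] if prev else None
                new_row[f"sentiment_neg_lag_{lag}"] = prev["sentiment_neg"] if prev else None
            out.append(new_row)
    return out


# Hypothetical sample thread with three replies to parent comment 10.
replies = [
    {"comment_nid": 11, "parent_nid": 10, "cardinality": 1, "sentiment_pos": 2, "sentiment_neg": -1},
    {"comment_nid": 12, "parent_nid": 10, "cardinality": 2, "sentiment_pos": 1, "sentiment_neg": -3},
    {"comment_nid": 13, "parent_nid": 10, "cardinality": 3, "sentiment_pos": 4, "sentiment_neg": -1},
]
lagged = add_lag_columns(replies)
```

Under this reading, the third reply carries the second reply's scores in its lag_1 columns and the first reply's scores in its lag_2 columns, while lag_3 and lag_4 stay empty.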
=====================================================
FILE NAME: Video-ID-Cross-Reference.csv

Cross-referencing data for connecting Comment-Sentiment.csv IDs to each video's unique YouTube ID.

COLUMN NAMES:

video_nid
    A unique ID for the video.
video_id
    The YouTube unique ID for the video.

=====================================================
FILE NAME: Video-Data.csv

Codes for videos.

COLUMN NAMES:

video_id
    The YouTube unique ID for the video.
format
    Whether the video's format was a typical TED Talk or an Animation.
gender
    The gender of the video's presenter (not available for Animations).
language
    An identification of non-English videos.
valid
    Binary indicating whether the row had sufficient data for analysis (e.g., open for commenting, English).

=====================================================
FILE NAME: Comments.csv.zip

Textual data for comments.

COLUMN NAMES:

videoId
    The YouTube unique ID for the video.
id
    The YouTube unique ID for the comment.
parentId
    The YouTube unique ID for the parent comment.
authorDisplayName
    The YouTube display name of the comment author.
likeCount
    The number of likes the comment received.
publishedAt
    The time the comment was published.
publishedTS
    A timestamp version of the publishedAt column.
replies
    The number of replies the comment received.
isPublic
    Whether the comment is publicly visible.
textDisplay
    The display version of the comment text.
textPlain
    The plain version of the comment text.
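
The cross-reference files are described as the link between the numeric IDs in Comment-Sentiment.csv and the YouTube IDs used in Comments.csv and Video-Data.csv. A minimal sketch of that join follows, using only the Python standard library. It assumes the .zip archives have already been extracted; the helper names read_rows and attach_youtube_ids and all sample values are hypothetical, and the sample rows show only a subset of the documented columns.

```python
import csv
import io


def read_rows(text):
    """Parse CSV text into a list of dicts keyed by column name."""
    return list(csv.DictReader(io.StringIO(text)))


def attach_youtube_ids(sentiment_rows, comment_xref, video_xref):
    """Add comment_id and video_id to each sentiment row via the cross-reference tables."""
    comment_ids = {r["comment_nid"]: r["comment_id"] for r in comment_xref}
    video_ids = {r["video_nid"]: r["video_id"] for r in video_xref}
    joined = []
    for row in sentiment_rows:
        merged = dict(row)
        # .get() leaves None where a cross-reference entry is missing.
        merged["comment_id"] = comment_ids.get(row["comment_nid"])
        merged["video_id"] = video_ids.get(row["video_nid"])
        joined.append(merged)
    return joined


# Hypothetical stand-ins for the extracted files (values are made up).
sentiment_csv = """video_nid,comment_nid,sentiment_pos,sentiment_neg
1,100,3,-1
1,101,1,-2
"""
comment_xref_csv = """comment_nid,comment_id
100,yt_comment_a
101,yt_comment_b
"""
video_xref_csv = """video_nid,video_id
1,yt_video_x
"""
comments_csv = """videoId,id,textPlain
yt_video_x,yt_comment_a,Great talk
yt_video_x,yt_comment_b,Not convinced
"""

joined = attach_youtube_ids(
    read_rows(sentiment_csv),
    read_rows(comment_xref_csv),
    read_rows(video_xref_csv),
)

# Attach comment text from Comments.csv using the YouTube comment ID.
text_by_id = {r["id"]: r["textPlain"] for r in read_rows(comments_csv)}
for row in joined:
    row["textPlain"] = text_by_id.get(row["comment_id"])
```

With real data, the in-memory strings would be replaced by the contents of the extracted CSV files. All IDs stay as strings here because csv.DictReader does not convert types.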