
FIT5145 Assignment 3
Semester 2, 2019
Due: Monday 30th September 2019, 11:55pm

Hand in Requirements:

1) Please hand in a PDF file containing your answers to all the questions, numbered correspondingly.
● You can use Word or other word processing software to format your submission. Just save the final copy to a PDF before submitting.
● Make sure to include screenshots/images of the graphs you generate in order to justify your answers to all the questions.
● Make sure to include copies of all the bash command lines and R scripts you use. If your answer is wrong, you may still get half marks if your command line or script is close to correct.

Data:

The dataset for this assignment is in the Google shared drive: https://drive.google.com/open?id=1frjdZrDBGLo_gkQLF5QlSAuZMniDER3h

The dataset contains Facebook posts from 15 of the top mainstream media sources (e.g., ABC, BBC, etc.) from 2012 to 2016.

Note: This is a large file, so your best bet is to download it while in the lab/studio and do the assignment there. You will need to use a Linux machine, a Mac terminal, or Cygwin on a Windows machine.

Assignment Tasks:

There are two tasks that you need to complete for this assignment. Students who complete only Tasks A1-A10 AND B1-B2 can get a maximum of Distinction. Students who attempt tasks A11-A12 and B3 will be showing critical analysis skills and a deeper understanding of the task at hand and can achieve the highest grade. You need to use the unix shell and R to complete the tasks.

Task A: Investigating Facebook Data using shell commands

Download the file FB_Dataset.csv.zip from the link above. Use a Unix shell to manipulate the file and answer the following questions.

1) Decompress the file. How big is it?
2) What delimiter is used to separate the columns in the file and how many columns are there?
3) The 2nd column is the unique identifier for a Facebook post. What are the other columns?
4) How many Facebook posts are there in the file?
5) What is the date range for Facebook posts in this file? (Assume that the data is in order)
6) How many unique pages are there?
7) How many unique posts are there? [Hint: one page can have multiple posts]
8) When was the first mention in the file regarding “Italian Dishes” and what was the post?
9) How many times is “Barack Obama” mentioned in the file? How did you find this? (Do not ignore the case)
10) What about “Donald Trump”? Who is more popular on Facebook, Obama or Trump? (Do not ignore the case)
11) Select the posts where “Trump” (ignore the case) is mentioned in the post content and the number of likes for those posts is greater than 100. Then generate a new file with post_id and sorted like_count and name it “trump.txt”. (In the output, you need to show the headers as well) [Hint: Find Trump in the message column, i.e., the 5th column]. Then copy and paste the first 5 lines of trump.txt in your answer.
12) Find the total number of love_count and angry_count for “Donald Trump” and “Barack Obama” separately. Who attracts more positive feeling among people? Justify your answer.
[Hint 1: you will need to search online to find how to sum a column of numbers using awk.
Hint 2: You will need to consider both love and angry count when justifying your answer.]

Task B: Graphing the Data in R

1) How many times does the term ‘Trump’ appear in the post content? (Use the shell to answer this question)
2) We want to consider how the amount of discussion regarding Donald Trump varies over the time period covered by the data file. To answer this question, you will need to extract the timestamps for all posts referring to Trump using the shell. You will then need to read them into R and generate a histogram. [Hint: To read the data into R, first generate a file containing only the timestamp column as text. Then read the file into R as a CSV.] R will not recognise the strings as timestamps automatically, so you’ll need to convert them from text values using the strptime() function. Instructions on how to use the function are available here:
https://www.rdocumentation.org/packages/base/versions/3.6.1/topics/strptime
You will need to write a format string, starting with “%a %b”, to tell the function how to parse the particular date/time format in your file. What format string do you need to use?
1. Once you have converted the timestamps, use the hist() function to plot the data in R.
2. The plot has a bit of an unusual shape. Describe the pattern you see.
3) In this question, we want to investigate the Facebook posts of a few top media sources. To answer this question, you will need to extract the Facebook posts made on the pages of abc-news, cnn and fox-news from your original Facebook dataset.
1. Use the unix shell to first generate a file containing all the records belonging to abc-news, cnn and fox-news only. Then read the resulting file in R.
2. Background: We now want to see if any relationship exists between the number of times a post is shared on Facebook and the number of likes it generates. Task: Use appropriate R code to generate a plot showing the relationship between the number of shares and the number of likes in your dataset. Do you see any relationship?
3. Fit a linear regression model using R to the above data (i.e., shares_count and likes_count) and plot the linear fit. Does it look like a good fit to you?
4. Use the linear fit to predict the number of likes a post will generate if it is shared 0 times, 1000 times, 10000 times and 100000 times on Facebook.

Good Luck!
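Hint 1 of Task A12 points to summing a column with awk, and Task A11 asks for a case-insensitive filter on one column combined with a numeric threshold on another. The sketch below illustrates both patterns on a tiny made-up CSV. The file contents, the column order, and the helper names `sum_love` and `top_posts` are assumptions for illustration only; they do not reflect the layout of FB_Dataset.csv, and the simple `-F','` split assumes no field contains an embedded comma, which may not hold for real post text.

```shell
# Build a tiny, made-up CSV to practise on. The column names and values are
# illustrative only and are not taken from FB_Dataset.csv.
sample=$(mktemp)
cat > "$sample" <<'EOF'
page_id,post_id,message,likes_count,love_count
p1,p1_001,Trump rally draws a crowd,150,12
p1,p1_002,Local weather update,40,3
p2,p2_001,trump responds to critics,220,30
p2,p2_002,Recipe of the day,500,45
EOF

# Sum column 5 over rows whose column 3 mentions "trump" in any case.
# NR > 1 skips the header; tolower() makes the match case-insensitive;
# "total + 0" prints 0 instead of an empty line when nothing matches.
sum_love() {
  awk -F',' 'NR > 1 && tolower($3) ~ /trump/ { total += $5 } END { print total + 0 }' "$1"
}

# Print the header for columns 2 and 4, then the matching rows with more
# than 100 likes, sorted numerically by likes in descending order.
top_posts() {
  head -n 1 "$1" | cut -d',' -f2,4
  awk -F',' 'NR > 1 && tolower($3) ~ /trump/ && $4 > 100 { print $2 "," $4 }' "$1" |
    sort -t',' -k2,2nr
}

sum_love "$sample"
top_posts "$sample"
```

On the real file the field numbers would need to be adjusted to match the actual header, and quoted fields containing commas would need a CSV-aware approach rather than a plain comma split.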
