FIT1043 Assignment 1: Description

Aim
The aim of this assignment is to investigate and visualise data using Python as a data science tool.
It will test your ability to:
1. read a data file in Python and extract related data from it;
2. use various graphical and non-graphical tools for performing exploratory data analysis and
visualisation;
3. use basic tools for managing and processing data; and
4. communicate your findings in your report.
Data
The data we will use contains suburb-based crime statistics for crimes against the person and
crimes against property in South Australia and comes from the South Australian Government.
The crime statistics dataset (Crime_Statistics_SA_2014_2019.csv file) contains all
offences against the person and property that were reported to police between 2014 and
2019 in South Australian suburbs.
The dataset contains the date the crime was reported, the suburb where the incident occurred,
the postcode, three levels of description of the offence, and the offence count.
The file is available on Moodle and is publicly available from data.sa.gov.au on a yearly
basis.
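As a minimal sketch of the first step (reading the file into Python), the snippet below loads a few made-up rows with pandas. The sample rows are invented for illustration, and the column names "Suburb", "Postcode" and "Offence Level 3 Description" are assumptions; only the other column names are quoted in this description.

```python
import io

import pandas as pd

# Made-up rows standing in for Crime_Statistics_SA_2014_2019.csv.
# "Suburb", "Postcode" and "Offence Level 3 Description" are assumed column names.
SAMPLE_CSV = """Reported Date,Suburb,Postcode,Offence Level 1 Description,Offence Level 2 Description,Offence Level 3 Description,Offence Count
1/07/2014,ADELAIDE,5000,OFFENCES AGAINST PROPERTY,THEFT AND RELATED OFFENCES,Other theft,3
1/07/2014,GLENELG,5045,OFFENCES AGAINST THE PERSON,ACTS INTENDED TO CAUSE INJURY,Common Assault,1
2/07/2014,ADELAIDE,5000,OFFENCES AGAINST PROPERTY,SERIOUS CRIMINAL TRESPASS,SCT - Residence,2
"""

# With the real file this would be pd.read_csv("Crime_Statistics_SA_2014_2019.csv").
crime = pd.read_csv(io.StringIO(SAMPLE_CSV))

print(crime.head())            # first rows, to check the file was parsed as expected
print(crime.columns.tolist())  # the actual column names to use in later steps
```

Printing the column names first is a cheap way to confirm the real headers before writing any analysis code against them.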
Hand-in Requirements
Please hand in a PDF file containing your answers and a Jupyter notebook file (.ipynb)
containing your Python code for all the questions:
● The PDF file should contain:
1. Answers to the questions. Make sure to include screenshots/images of the graphs you
generate and your Python code in order to justify your answers to all the questions.
(You will need to use screen-capture functionality to create appropriate images.)
2. You can use Word or other word processing software to format your submission. Just
save the final copy as a PDF before submitting.
● The .ipynb file should contain:
1. A copy of your working Python code to answer the questions.
● You will need to submit two separate files. Zip, rar or any other similar file compression
format is not acceptable and will incur a penalty of 10%.
Python Availability
You will need to use Python to complete the assignment. You can do this by either:
1. running a Jupyter Notebook on a computer in the labs; or
2. installing Python (we recommend Anaconda) on your own machine.
Assignment Tasks:
There are two tasks that you need to complete for this assignment. Students who complete only
tasks A1-A6, B1 and B2 can get at most a Distinction. Students who attempt task
B3 will be showing critical analysis skills and a deeper understanding of the task at hand and can
achieve the highest grade. You need to use Python to complete the tasks.
Task A: Data Exploration and Auditing
In this task, you are required to explore the dataset and do some data auditing on the crime
statistics dataset. Have a look at the CSV file (Crime_Statistics_SA_2014_2019.csv) and then
answer a series of questions about the data using Python.
A1. Dataset size
How many rows and columns exist in this dataset?
A2. Null values in the dataset
Are there any null values in this dataset?
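Questions A1 and A2 can be approached along the following lines. This is only a sketch on a small made-up DataFrame; the "Suburb" column name is an assumption.

```python
import numpy as np
import pandas as pd

# Toy data with one missing suburb and one missing count (values are invented).
df = pd.DataFrame({
    "Suburb": ["ADELAIDE", "GLENELG", None, "ADELAIDE"],
    "Offence Count": [3.0, 1.0, 2.0, np.nan],
})

# A1: shape is a (rows, columns) tuple.
n_rows, n_cols = df.shape

# A2: isnull() marks missing cells; summing counts them per column.
nulls_per_column = df.isnull().sum()
has_nulls = bool(df.isnull().values.any())

print(n_rows, n_cols)
print(nulls_per_column)
print(has_nulls)
```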
A3. Data Types
What are the min and max for the column "Reported Date"? Does this column have the correct data
type? If not, convert it to an appropriate data type.
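The sketch below shows why the data type matters for A3: while dates are stored as text, min and max compare characters rather than calendar order. The sample dates are invented, and the day/month/year format is an assumption that should be checked against the real file.

```python
import pandas as pd

# Invented dates stored as text, assumed to be in day/month/year order.
df = pd.DataFrame({"Reported Date": ["2/07/2014", "15/01/2019", "30/06/2016"]})

# On strings, min() is alphabetical, so "15/..." sorts before "2/...".
text_min = df["Reported Date"].min()

# Convert to datetime; the explicit format avoids day/month ambiguity.
df["Reported Date"] = pd.to_datetime(df["Reported Date"], format="%d/%m/%Y")

date_min = df["Reported Date"].min()
date_max = df["Reported Date"].max()

print(text_min)
print(date_min, date_max)
print(df["Reported Date"].dtype)
```

After conversion, the minimum and maximum follow calendar order and the .dt accessor becomes available on the column.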
A4. Descriptive statistics
Calculate the statistics for the "Offence Count" column (find the count, mean, standard deviation,
minimum and maximum).
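One way to get these statistics is pandas' describe method, shown here on invented counts. Note that describe reports the sample standard deviation (dividing by n - 1).

```python
import pandas as pd

# Invented offence counts for illustration.
counts = pd.Series([1, 2, 2, 3, 7], name="Offence Count")

# describe() returns count, mean, std, min, quartiles and max in one Series.
summary = counts.describe()

# Keep only the statistics the question asks for.
selected = summary[["count", "mean", "std", "min", "max"]]
print(selected)
```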
A5. Exploring Offence Level 1 Description
Now look at the Offence Level 1 Description column and answer the following questions:
1. How many unique values does the "Offence Level 1 Description" column take?
2. Display the unique values of level 1 offences.
3. How many records contain "offences against the person"?
4. What percentage of the records are "offences against the property"?
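The four A5 questions map onto a handful of pandas calls, sketched below on made-up rows. The exact category spellings (upper case, "OFFENCES AGAINST PROPERTY") are assumptions; display the unique values of the real column before comparing against them.

```python
import pandas as pd

# Made-up rows; the category spellings are assumptions about the real file.
df = pd.DataFrame({"Offence Level 1 Description": [
    "OFFENCES AGAINST PROPERTY",
    "OFFENCES AGAINST THE PERSON",
    "OFFENCES AGAINST PROPERTY",
    "OFFENCES AGAINST PROPERTY",
]})
level1 = df["Offence Level 1 Description"]

n_unique = level1.nunique()      # 1. number of distinct values
unique_values = level1.unique()  # 2. the distinct values themselves

# 3. A boolean mask sums to the number of matching records.
person_records = int((level1 == "OFFENCES AGAINST THE PERSON").sum())

# 4. The mean of a boolean mask is the matching fraction.
property_pct = (level1 == "OFFENCES AGAINST PROPERTY").mean() * 100

print(n_unique, unique_values)
print(person_records, property_pct)
```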
A6. Exploring Offence Level 2 Description
Now look at the Offence Level 2 Description column and answer the following questions:
1. How many unique values does the "Offence Level 2 Description" column take? Display the
unique values of level 2 offences together with their counts (i.e., how many times they have
been repeated).
2. How many serious criminal trespasses have occurred with more than 1 offence count?
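A possible approach to A6 is sketched below on invented rows. It reads question 2 as "how many records are serious criminal trespasses with an offence count above 1", which is one interpretation; the category spelling is also an assumption.

```python
import pandas as pd

# Invented rows; the category spellings are assumptions about the real file.
df = pd.DataFrame({
    "Offence Level 2 Description": [
        "SERIOUS CRIMINAL TRESPASS",
        "THEFT AND RELATED OFFENCES",
        "SERIOUS CRIMINAL TRESPASS",
        "SERIOUS CRIMINAL TRESPASS",
        "THEFT AND RELATED OFFENCES",
    ],
    "Offence Count": [1, 4, 2, 3, 1],
})

# 1. value_counts() lists each distinct value with its number of rows.
level2_counts = df["Offence Level 2 Description"].value_counts()

# 2. Combine two boolean masks with & and count the rows where both hold.
is_trespass = df["Offence Level 2 Description"] == "SERIOUS CRIMINAL TRESPASS"
more_than_one = df["Offence Count"] > 1
n_trespass_multi = int((is_trespass & more_than_one).sum())

print(level2_counts)
print(n_trespass_multi)
```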
Task B: Investigating Offence Count in different suburbs and
different years
In this task, you are required to visualise the relationship between the number of crimes in different
suburbs and different years and to explore that relationship. Note: higher marks will be given to
reports containing graphs with appropriately labelled axes, title and legend.
B1. Investigating the number of crimes per year
Find the number of crimes per year. Plot the graph and explain your understanding of the graph.
Hint: you can extract the year from the "Reported Date" column using the .dt accessor and create a new column
for the year in your dataframe as follows:
>>> your_dataframe['year'] = your_dataframe['Reported Date'].dt.year
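Building on the hint, the sketch below sums "Offence Count" per year and draws a labelled bar chart. The data are invented, and treating "number of crimes" as the sum of "Offence Count" (rather than the number of rows) is an assumption.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Invented records; "Reported Date" is already a datetime column here.
df = pd.DataFrame({
    "Reported Date": pd.to_datetime(
        ["2014-07-01", "2014-12-31", "2015-03-10", "2015-03-10"]
    ),
    "Offence Count": [3, 1, 2, 5],
})

# Extract the year, then total the offence counts within each year.
df["year"] = df["Reported Date"].dt.year
crimes_per_year = df.groupby("year")["Offence Count"].sum()

# Bar chart with axis labels, title and legend.
fig, ax = plt.subplots()
ax.bar(crimes_per_year.index.astype(str), crimes_per_year.values, label="Offence count")
ax.set_xlabel("Year")
ax.set_ylabel("Number of crimes")
ax.set_title("Reported crimes per year")
ax.legend()

print(crimes_per_year)
```

In a Jupyter notebook the backend line can be dropped and the figure renders inline.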
B2. Investigating the total number of crimes in different suburbs
1. Compute the total number of crimes in each suburb and plot a histogram of the total
number of crimes in different suburbs.
2. Consider the shape of the histogram. What can you tell? Compare the mean and median
values of the plotted histogram.
3. In which suburbs is the total number of crimes greater than 5000? Plot the total number
of crimes in the suburbs with the highest number of crimes (greater than 5000) using a
bar chart.
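The three B2 steps could look roughly like this on invented data. The "Suburb" column name and suburb labels are assumptions, and the threshold is scaled down from 5000 to 5 to suit the toy numbers.

```python
import matplotlib

matplotlib.use("Agg")  # non-interactive backend so the sketch also runs outside a notebook
import matplotlib.pyplot as plt
import pandas as pd

# Invented records; "Suburb" is an assumed column name.
df = pd.DataFrame({
    "Suburb": ["A", "A", "B", "C", "D", "D"],
    "Offence Count": [1, 1, 2, 3, 20, 5],
})

# 1. Total crimes per suburb.
totals = df.groupby("Suburb")["Offence Count"].sum()

# 2. A mean well above the median suggests a right-skewed distribution.
mean_total = totals.mean()
median_total = totals.median()

# 3. Suburbs above the threshold (5000 in the task, 5 for this toy data).
THRESHOLD = 5
high = totals[totals > THRESHOLD].sort_values(ascending=False)

fig, (ax_hist, ax_bar) = plt.subplots(1, 2, figsize=(10, 4))
ax_hist.hist(totals, bins=5)
ax_hist.set_xlabel("Total crimes per suburb")
ax_hist.set_ylabel("Number of suburbs")
ax_hist.set_title("Distribution of suburb totals")
ax_bar.bar(high.index, high.values)
ax_bar.set_xlabel("Suburb")
ax_bar.set_ylabel("Total crimes")
ax_bar.set_title("Suburbs above the threshold")

print(mean_total, median_total)
print(high)
```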
B3. Daily number of crimes
1. For each suburb, calculate the number of days on which at least 15 crimes occurred.
(Note: your answer should contain all suburbs in the dataset together with a value
showing the number of days on which at least 15 crimes happened.)
2. Which suburbs have at least one day where the daily number of crimes is more
than 15? Plot the number of days on which at least 15 crimes occurred for the suburbs you
found in this step (step 2) using a bar graph.
3. Use an appropriate graph to visualise and detect outliers (extreme values) in the data from
step 2 and remove them. Then, plot the data again using a bar graph.
4. Compare the bar graphs in steps 2 and 3. Which bar graph is easier to interpret? Why?
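The counting and outlier steps of B3 might be sketched as follows, with plotting omitted. All data are invented and "Suburb" is an assumed column name. The outlier rule used here is the 1.5 x IQR fence that a box plot draws by default, which is one reasonable choice rather than the required method; it is applied to a separate made-up series so that an outlier actually exists.

```python
import pandas as pd

# Invented records; "Suburb" is an assumed column name.
df = pd.DataFrame({
    "Suburb": ["A", "A", "A", "B", "B", "C"],
    "Reported Date": pd.to_datetime([
        "2014-07-01", "2014-07-01", "2014-07-02",
        "2014-07-01", "2014-07-02", "2014-07-01",
    ]),
    "Offence Count": [10, 6, 15, 20, 3, 1],
})

# Step 1: total crimes per suburb per day, then count days with at least 15.
daily = df.groupby(["Suburb", "Reported Date"])["Offence Count"].sum()
busy_days = (daily >= 15).groupby(level="Suburb").sum()  # keeps suburbs with 0 days

# Step 2: keep only suburbs with at least one such day.
with_busy_day = busy_days[busy_days > 0]

# Step 3: box-plot style fences on a hypothetical set of per-suburb day counts.
day_counts = pd.Series({"A": 2, "B": 1, "D": 3, "E": 2, "F": 40})
q1, q3 = day_counts.quantile(0.25), day_counts.quantile(0.75)
iqr = q3 - q1
lower, upper = q1 - 1.5 * iqr, q3 + 1.5 * iqr
outliers = day_counts[(day_counts < lower) | (day_counts > upper)]
kept = day_counts[(day_counts >= lower) & (day_counts <= upper)]

print(busy_days)
print(with_busy_day)
print(outliers)
print(kept)
```

Both with_busy_day and kept are Series indexed by suburb, so each can be passed to a bar chart for the comparison in step 4.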
Good Luck!
