Description                                 Marks out of   Wtg (%)   Word count   Due date
Assignment 4 Written and Practical Report   100            55%       4500         24/05/13
Assignment 4 relates to the specific course learning objectives 1, 2 and 4 and associated
MBA program learning goals and skills: Global Content, Problem solving, Change, Critical
thinking, and Written Communication at level 3.
1. demonstrate applied knowledge of people, markets, finances, technology and management
in a global context of business intelligence practice (data warehouse design, data mining
process, data visualisation and performance management) and resulting organisational change
and how these apply to implementation of business intelligence in organisation systems and
business processes
2. identify and solve complex organisational problems creatively and practically through the
use of business intelligence and critically reflect on how evidence based decision making and
sustainable business performance management can effectively address real-world
problems
4. demonstrate the ability to communicate effectively in a clear and concise manner in written
report style for senior management with correct and appropriate acknowledgment of main
ideas presented and discussed.
The key frameworks, concepts and activities covered in modules 2–12 and more specifically
modules 6 to 12 are particularly relevant for this assignment.
This assignment consists of three tasks (1, 2 and 3) and builds on the research and analysis you
conducted in Assignment 2. Task 1 is concerned with developing and evaluating a model of
key factors impacting on credit risk ratings for loan applications in determining whether to
approve or reject a loan. Task 2 is concerned with the key opportunities and
challenges associated with the implementation and utilisation of business intelligence
systems. Task 3 is concerned with performance management and provides you with the
opportunity to design and build a sales performance dashboard using pivot tables and Tableau
7.0 Desktop.
Task 1 (40 marks)
In Task 1 of Assignment 4 you are required to follow the six-step CRISP-DM process and
make use of the data mining tool RapidMiner to analyse and report on the
creditrisk_train.csv and creditrisk_score.csv data sets provided for Assignment 4. You should
refer to the data dictionary for creditrisk_train.csv (see Table 1 below). In Tasks 1 and 2 of
Assignment 4 you are required to consider all of the business understanding, data
understanding, data preparation, modelling, evaluation and deployment phases of the
CRISP-DM process.
Figure 1 CRISP-DM Process
Table 1 Data Dictionary for creditrisk_train.csv
Variable               Description
Row.No                 Unique identifier for each row (integer)
Application.ID         Unique identifier for loan application (integer)
Credit.Score           Credit score given to the loan application. This is a measure of the
                       creditworthiness of the applicant.
                       http://en.wikipedia.org/wiki/Credit_score_in_the_United_States
                       http://www.buzzle.com/articles/credit-score-rating-scale.html
Late.Payments          History of late payments with existing loans
Months.In.Job          Months in current job
Debt.To.Income.Ratio   The percentage of the consumer’s gross income that goes toward paying
                       debts
                       http://en.wikipedia.org/wiki/Debt_to_income_ratio
Loan.Amount            Loan amount requested
Liquid.Assets          Liquid assets
Num.Credit.Lines       Number of credit lines
Credit.Risk            Credit risk rating (Very Low, Low, Moderate, High, Do not lend)
                       http://www.dico.com/design/Publications/En/By-law5-CommercialLendingPractices-May2005-UpdatedMay2008/CreditRiskRatings.pdf
a) Research the concepts of credit risk and credit scoring in determining whether a financial
institution should lend at an appropriate level of risk or not lend to a loan application.
This will provide you with a business understanding of the dataset you will be analysing
in Assignment 4. Identify which (variables) attributes can be omitted from your credit risk
data mining model and why. Comment on your findings in relation to determining the
credit risk of loan applicants.
b) Conduct an exploratory analysis of the creditrisk_train.csv data set. Are there any missing
values or variables with unusual patterns? How consistent are the characteristics of the
creditrisk_train.csv and creditrisk_score.csv datasets? Are there any interesting
relationships between the potential predictor variables and your target variable credit
risk? (Hint: identify the variables that will allow you to split the data set into subgroups.)
Comment on what variables in the data set creditrisk_train.csv might influence
differences in credit scores and credit risk ratings and possible approval or rejection of
loan applications.
c) Run a decision tree analysis using RapidMiner. Consider what variables you will want to
include in this analysis and report on the results. (Hint: Identify what your target variable
and predictor variables are.) Comment on the results of your final model.
d) Run a neural network analysis using RapidMiner. Again, consider what variables you
will want to include in this analysis and report on the results. (Hint: Identify what your
target variable and predictor variables are.) Comment on the results of your final model.
e) Based on the results of the decision tree analysis and neural network analysis, what
are the key variables and rules for predicting either good credit risk or bad credit risk?
(Hint: with RapidMiner you will need to validate your models on the creditrisk_train.csv
data using a number of validation processes for the two models you have generated
previously using decision trees and neural network models.) Comment on your two
predictive models for credit risk scoring in relation to a false/positive matrix, lift chart
and ROC chart. (Note: for the evaluation operator reports, charts Lift and ROC, you will
need to convert the target variable Credit.Risk to a nominal variable with two values, Good
and Bad.) Comment on the results of your final model.
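The two-value recoding and the false/positive matrix mentioned in sub task e) can be illustrated outside RapidMiner with a minimal Python sketch. It is not a substitute for the RapidMiner operators the assignment requires. The split of the five ratings into Good and Bad (GOOD_RATINGS) is an assumption made only for this example, as are the function names and the sample ratings; none of them come from the assignment data.

```python
from collections import Counter

# ASSUMPTION: which of the five ratings count as "Good" is a modelling choice,
# not something the assignment specifies.
GOOD_RATINGS = {"Very Low", "Low", "Moderate"}


def to_binary(rating):
    """Collapse a five-level Credit.Risk rating into Good or Bad."""
    return "Good" if rating in GOOD_RATINGS else "Bad"


def confusion_matrix(actual, predicted, positive="Bad"):
    """Count TP, FP, TN and FN, treating `positive` as the class of interest."""
    counts = Counter()
    for a, p in zip(actual, predicted):
        if p == positive:
            counts["TP" if a == positive else "FP"] += 1
        else:
            counts["FN" if a == positive else "TN"] += 1
    return {k: counts[k] for k in ("TP", "FP", "TN", "FN")}


def roc_point(cm):
    """Return (false positive rate, true positive rate): one point on a ROC chart."""
    positives = cm["TP"] + cm["FN"]
    negatives = cm["FP"] + cm["TN"]
    tpr = cm["TP"] / positives if positives else 0.0
    fpr = cm["FP"] / negatives if negatives else 0.0
    return fpr, tpr


# Made-up ratings purely for illustration; not taken from creditrisk_train.csv.
actual_ratings = ["Very Low", "High", "Do not lend", "Low", "Moderate", "High"]
predicted_ratings = ["Low", "High", "Moderate", "Low", "High", "Do not lend"]

actual = [to_binary(r) for r in actual_ratings]
predicted = [to_binary(r) for r in predicted_ratings]
cm = confusion_matrix(actual, predicted)
fpr, tpr = roc_point(cm)
print(cm, fpr, tpr)
```

Here "Bad" is treated as the positive class, so a false positive is a good applicant predicted as bad and a false negative is a bad applicant predicted as good. Which of the two errors is more costly to a lender is part of the discussion the sub task asks for.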
Overall for Task 1 you need to report on the output of each analysis in sub tasks a) to e)
and briefly comment on the important aspects of each analysis and their relevance to credit risk
scoring in determining whether to approve a loan with an appropriate credit risk rating or to
not lend to a loan applicant. (2000 words approx.)
Note: the final outputs from your statistical analyses in RapidMiner (graphs, decision trees,
neural network, statistical analysis results tables) should be included as appendices in your
report to provide support for your conclusions regarding each analysis and are not included in
the word count.
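As a cross-check on the RapidMiner output for sub task b), missing values and the consistency of the two data sets can also be inspected with a few lines of Python. The sketch below is illustrative only: the column names follow Table 1, but the inline sample rows are invented and the helper names (load_rows, missing_counts, summarise) are hypothetical. With the real files, open creditrisk_train.csv and creditrisk_score.csv in place of the io.StringIO samples.

```python
import csv
import io
from statistics import mean


def load_rows(handle):
    """Read a CSV file handle into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def missing_counts(rows):
    """Count empty or absent values per column."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            if value is None or not str(value).strip():
                counts[column] = counts.get(column, 0) + 1
    return counts


def summarise(rows, column):
    """Return (min, mean, max) of a numeric column, skipping missing values."""
    values = [float(r[column]) for r in rows if r[column] and r[column].strip()]
    return min(values), mean(values), max(values)


# Invented sample rows; the real data sets are supplied with the assignment.
train_csv = """Application.ID,Credit.Score,Months.In.Job,Credit.Risk
1001,720,36,Low
1002,,12,High
1003,580,,Do not lend
1004,650,48,Moderate
"""
score_csv = """Application.ID,Credit.Score,Months.In.Job
2001,700,24
2002,610,60
"""

train = load_rows(io.StringIO(train_csv))
score = load_rows(io.StringIO(score_csv))

missing = missing_counts(train)
train_summary = summarise(train, "Credit.Score")
score_summary = summarise(score, "Credit.Score")
print(missing, train_summary, score_summary)
```

Comparing the (min, mean, max) triples for the same column in both files gives a quick first indication of whether the scoring data falls within the range the model was trained on.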
Task 2 (15 marks)
For the deployment phase of the CRISP-DM process, discuss the key opportunities and
challenges, including socio-technical change management, associated with the implementation
and utilisation of a business intelligence system which supports improved decision making in
financial institutions in relation to the assessment of loan applications and improved risk
management of loans. (1000 words approx.)
Task 3 (35 marks)
Scenario
Peeko is an international supermarket chain with supermarket stores in the USA, Canada and
Mexico. Peeko sells a wide range of grocery and general consumer goods across their city
stores in the various states of these three countries.
Rules have been kept simple:
Peeko’s headquarters are located in Los Angeles, California and the Canadian subsidiary
company (Peeko Canada) is based in Toronto, Canada. The Mexican subsidiary company (Peeko
Mexico) is based in New Mexico City, Mexico.
Dashboard
Peeko’s executive management team require a sales dashboard to be created to provide
greater insight into their sales data to understand the trends and sales performance across their
supermarket chain operations in three countries. They want the flexibility to visualize sales
data in a number of different ways. They want to be able to get a quick overview of the data
and then be able to zoom and filter on particular aspects and then get further details as
required. The specific information they are concerned with is the following four sales
performance reports.
1. Sales Revenue and Sales Gross Profit by Week, Month, and Year for a selected country
2. Sales Revenue and Sales Gross Profit by selected Product/Product Category
3. Sales Revenue and Sales Gross Profit by selected State and City Stores
4. Sales Revenue and Sales Gross Profit by country
The data is the financial performance of Peeko sales at a single point in time.
You will need to use the Excel Sales.cub as necessary to support the dashboard. The Sales.cub
data set is available on the course study desk.
Your task is to:
(a) create a dashboard to satisfy the Peeko executive management requirements for the four specified
sales performance reports:
1. Sales Revenue and Sales Gross Profit by Week, Month, and Year for a selected country
2. Sales Revenue and Sales Gross Profit by selected Product/Product Category
3. Sales Revenue and Sales Gross Profit by selected State and City Stores
4. Sales Revenue and Sales Gross Profit by country
(25 marks)
(b) provide a rationale for the graphic design and functionality of your Peeko sales
dashboard in terms of how it meets Peeko’s executive management requirements
(1000 words approx.). You will need to submit your Excel spreadsheet or Tableau file
which contains your dashboard as a separate document to your
main report for Assignment 4 (10 marks).
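Each of the four reports in (a) is the same two measures (sales revenue and sales gross profit) summed over a different set of dimensions, optionally after filtering on a selected value. That is what a pivot table does. The sketch below shows the idea in plain Python. The field names (country, state, city, category, year, revenue, gross_profit) and the sample records are assumptions made for illustration; the actual structure of Sales.cub may differ.

```python
from collections import defaultdict


def pivot(records, dimensions, measures=("revenue", "gross_profit")):
    """Sum each measure for every distinct combination of dimension values."""
    totals = defaultdict(lambda: [0.0] * len(measures))
    for rec in records:
        key = tuple(rec[d] for d in dimensions)
        for i, measure in enumerate(measures):
            totals[key][i] += rec[measure]
    return {key: tuple(values) for key, values in totals.items()}


def filtered(records, **criteria):
    """Keep only the records matching every field=value criterion."""
    return [r for r in records if all(r[k] == v for k, v in criteria.items())]


# Invented records; field names are hypothetical, not taken from Sales.cub.
sales = [
    {"country": "USA", "state": "CA", "city": "Los Angeles",
     "category": "Dairy", "year": 2012, "revenue": 100.0, "gross_profit": 40.0},
    {"country": "USA", "state": "WA", "city": "Seattle",
     "category": "Produce", "year": 2012, "revenue": 200.0, "gross_profit": 60.0},
    {"country": "Canada", "state": "ON", "city": "Toronto",
     "category": "Dairy", "year": 2013, "revenue": 150.0, "gross_profit": 50.0},
    {"country": "USA", "state": "CA", "city": "Los Angeles",
     "category": "Dairy", "year": 2013, "revenue": 50.0, "gross_profit": 20.0},
]

# Report 4: revenue and gross profit by country.
by_country = pivot(sales, ["country"])

# Report 1 (yearly level only): revenue and gross profit by year for a selected country.
usa_by_year = pivot(filtered(sales, country="USA"), ["year"])

print(by_country)
print(usa_by_year)
```

Reports 2 and 3 follow the same pattern with ["category"] and ["state", "city"] as the dimensions. In Excel or Tableau the filter step corresponds to a slicer or quick filter and the dimension list to the row fields.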
The Assignment 4 report must be structured as follows:
1. Cover page for assignment 4 report
2. Executive summary
3. Table of contents
4. Body of report – main sections and subsections for each Task and sub task such as
Task 1 sub task a) etc…
5. List of References
6. Appendices to accompany Task 1 data mining analyses and journal for each student
involved in Assignment 4
Harvard referencing resources
Install a reference tool (for example, EndNote) which integrates with your word processor. These
tools are a great help for referencing and citing sources in your assignments. For more
information on how to get Endnote you may visit the following webpage:
http://www.usq.edu.au/library/infoabout/endnote/default.htm.
Study the referencing techniques in the Communication skills handbook (Smith & Summers
2010).
USQ Faculty of Business Librarian Adrian Stagg has compiled the following resources on
how to reference correctly using the Harvard referencing system. Make use of these
excellent resources if you are unsure how to reference correctly using the Harvard referencing
system.
Library Guides http://www.usq.edu.au/library/help/ehelp/ref_guides/default.htm
Guide to Harvard (Breeze)
http://www.usq.edu.au/library/Breeze/Fac_Business/HarvardAGPS
PDF Brief Guide
http://www.usq.edu.au/library/Breeze/Fac_Business/Harvard_AGPS/Harvard_AGPS_PDF_Guide.pdf
Warnings
• This assignment must be the expression of your own work. It is acceptable to discuss
course content with others to improve your understanding and clarify requirements,
but solutions to assignment questions must be done on your own. This also means that
it is not sufficient to merely paraphrase the entire assignment content from a textbook
or other sources. Your assignment answers need to be a reflection and synthesis of your
research of the associated topics.
• You need to demonstrate your understanding of associated topics for each assignment.
You must not copy from anyone, including tutors and fellow students, nor provide
copies of your work to others. Assignments that do not adhere to this requirement will
be deemed as being the result of collusion or plagiarism. This may lead to severe
academic penalties as outlined in Academic Regulation 5.10 of the USQ Handbook. It
is your own responsibility to ensure the integrity of your work. Refer to the Faculty of
Business guidelines for further details.
• An indiscriminate overuse of incorrectly referenced or cited web pages in your
assignment will result in poor marks.
Assignment 4 submission details
All assignments must be submitted electronically via the Ease assignment submission
system and are subject to checking for plagiarism and collusion via Turnitin. You will
be provided with a link to Turnitin on the course study desk so you can submit your
assignment to Turnitin to obtain an originality report. You must also include an
originality report on your assignment from Turnitin which provides a check on the
integrity and originality of your assignment work.
Note carefully University and Faculty policy on plagiarism, collusion and cheating. If
any of these occur they will be found and dealt with.
Grading scheme
Your assignment will be assessed on content and style. Content refers to the way in which
your assignment reflects breadth and depth of understanding of the topics and knowledge of
business intelligence systems as covered in the course so far and relevant other researched
literature. Style relates to the adherence to requirements including word counts, use of
English, spell checking, proof reading as well as report presentation.
You are required to use and cite a suitable number of references of high quality. For example,
it is not sufficient for most references to be Internet websites as these can be of questionable
quality. Academic conventions require that you acknowledge any use of ideas from others. In
most cases this means stating which book or article is the source of the idea or quotation. You
are required to follow the Harvard referencing system for both in-text citations and list of
references and to clearly distinguish between your own ideas, others’ ideas adapted to your
own, and ideas taken directly from the literature. You must support your views with
appropriate and relevant literature citations.
The report should be an insightful discussion and not just a descriptive treatment of the topic.
A major emphasis for the assignment will be on a structured report that clearly outlines the
topic and/or issues to be discussed. It will include a cover page, executive summary, table of
contents, a report body that uses headings and paragraphs to clearly detail descriptions,
explanations or arguments, list of references and list of appendices. If you are not familiar
with the requirements of a formal report format, refer to the Communication skills handbook.
You may also find the OPACS Academic Learning Support web-site useful:
http://www.usq.edu.au/opacs/ALSonline