Credit Card Analysis of Czech Bank

ITCS6265 : Fall Semester of 2002
Instructor: Dr. Mirsad Hadzikadic

UNC-Charlotte  |  College of IT  |  Dr. Mirsad Hadzikadic  |  ITCS6265   


Preprocessing - Loan

LOAN Table
Changes Made With Excel
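Each column is described by: Description | Changes | Missing or Invalid Values | Notes

Loan_ID
  Description: Record identifier
  Changes: None
  Missing or Invalid Values: N/A
  Notes: Primary key

Account_ID
  Description: Account identifier
  Changes: None
  Missing or Invalid Values: N/A
  Notes: Foreign key

Date
  Description: Date the loan was granted
  Changes: Changed format from YYMMDD to MM/DD/YYYY
  Missing or Invalid Values: Ignore
  Notes: Correlation with Status attribute:
    R² = 98%
    r = 0.49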
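
The Date format change was made in Excel; the same conversion can be sketched in Python as follows. The function name and the sample value are illustrative assumptions, not part of the original workflow. Note that Python's `%y` directive maps two-digit years 69-99 to 1969-1999 and 00-68 to 2000-2068, which should be checked against the actual date range of the data.

```python
from datetime import datetime


def yymmdd_to_mmddyyyy(raw: str) -> str:
    """Convert a YYMMDD string (e.g. '930705') to MM/DD/YYYY."""
    # strptime validates the date, so an invalid value such as '931345'
    # raises ValueError instead of passing through silently.
    parsed = datetime.strptime(raw, "%y%m%d")
    return parsed.strftime("%m/%d/%Y")


if __name__ == "__main__":
    # '930705' is a made-up sample value used only for illustration.
    print(yymmdd_to_mmddyyyy("930705"))  # 07/05/1993
```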

Amount
  Description: Amount of loan
  Changes: Removed; discretized values stored in Amt attribute
  Missing or Invalid Values: N/A
  Notes: Correlation with Status attribute:
    R² = 68%
    r = 0.34
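
The page does not say how Status was turned into a number before the correlation was computed. As a rough sketch only, the Pearson coefficient r between a numeric attribute and Status could be computed as below, assuming an ordinal coding of A=1, B=2, C=3, D=4. That coding, the function name, and the sample records are all assumptions made for illustration, and the sketch is not expected to reproduce the figures reported above.

```python
from math import sqrt

# Assumed ordinal coding of the Status attribute (not stated in the source).
STATUS_CODE = {"A": 1, "B": 2, "C": 3, "D": 4}


def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


if __name__ == "__main__":
    # Hypothetical loan records, invented for this example.
    amounts = [20000, 50000, 80000, 250000, 300000]
    statuses = ["A", "A", "C", "D", "B"]
    r = pearson_r(amounts, [STATUS_CODE[s] for s in statuses])
    print(f"r = {r:.2f}")
```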

Duration
  Description: Duration of the loan
    Possible values:
      12 months
      24 months
      36 months
      48 months
      60 months
  Changes: Removed; discretized values stored in Dur attribute
  Missing or Invalid Values: N/A
  Notes: Correlation with Status attribute:
    R² = 100%
    r = 0.51
  Distribution of values:
    12 = 19%
    24 = 20%
    36 = 19%
    48 = 20%
    60 = 21%

Payments
  Description: Monthly loan payment
  Changes: Removed; discretized values stored in Pmt attribute
  Missing or Invalid Values: N/A
  Notes: (none)

Status
  Description: Status of loan pay-off
    'A' = Contract finished, no problems
    'B' = Contract finished, loan not paid
    'C' = Contract running, OK thus far
    'D' = Contract running, client in debt
  Changes: None
  Missing or Invalid Values: N/A
  Notes: Correlation with Status attribute:
    R² = 98%
    r = 0.49
  Distribution of values:
    'A' = 30%
    'B' = 4.5%
    'C' = 59%
    'D' = 6.5%
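
A "distribution of values" figure like the one above is the share of records taking each value. A minimal Python sketch follows; the 200-record sample is hypothetical and was constructed only so that it yields the percentages listed for Status, and it says nothing about the actual number of loans.

```python
from collections import Counter


def value_distribution(values):
    """Return the percentage of records taking each distinct value."""
    counts = Counter(values)
    total = len(values)
    return {value: 100.0 * count / total for value, count in counts.items()}


if __name__ == "__main__":
    # Hypothetical sample of 200 status codes, built to match the listed shares.
    statuses = ["A"] * 60 + ["B"] * 9 + ["C"] * 118 + ["D"] * 13
    dist = value_distribution(statuses)
    for status in sorted(dist):
        print(f"'{status}' = {dist[status]:g}%")
```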

Dur
  Description: Discretized Duration attribute
    (*,30)
    (31,42)
    (43,54)
    (55,*)
  Changes: Added; discretized in Rosetta using the entropy algorithm
  Missing or Invalid Values: N/A
  Notes: Distribution of values:
    (*,30) = 39%
    (31,42) = 19%
    (43,54) = 20%
    (55,*) = 21%

Pmt
  Description: Discretized Payment attribute
    (*,8041) = 1
    (8041,*) = 2
  Changes: Added; discretized in Rosetta using the entropy algorithm.
    Rosetta discretized into 50+ values; we merged values using a <1% threshold.
  Missing or Invalid Values: N/A
  Notes: Distribution of values:
    (*,8041) = 94%
    (8041,*) = 6%
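
The page names Rosetta's entropy algorithm without describing it. As a generic illustration of the idea, and not Rosetta's actual implementation, the sketch below picks the single cut point that minimizes the weighted class entropy of the two resulting intervals. Entropy-based discretizers typically apply such a step recursively with a stopping rule, which is omitted here. The function names and the sample data are assumptions.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a non-empty list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())


def best_cut(values, labels):
    """Return the cut point that minimizes the weighted entropy of both sides."""
    pairs = sorted(zip(values, labels))
    best_score, best_point = float("inf"), None
    for i in range(1, len(pairs)):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no boundary between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        score = (len(left) * entropy(left) + len(right) * entropy(right)) / len(pairs)
        if score < best_score:
            # Place the cut midway between the two neighbouring values.
            best_score = score
            best_point = (pairs[i - 1][0] + pairs[i][0]) / 2
    return best_point


if __name__ == "__main__":
    # Hypothetical payments and class labels, invented for this example.
    payments = [2000, 3500, 5000, 7000, 9000, 9500]
    outcome = ["ok", "ok", "ok", "ok", "bad", "bad"]
    print(best_cut(payments, outcome))  # 8000.0
```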

Amt
  Description: Discretized Amount attribute
    (*,30708) = 1
    (30709,49380) = 2
    (49381,76926) = 3
    (76927,230310) = 4
    (230311,*) = 5
  Changes: Added; discretized in Rosetta using the entropy algorithm.
    Rosetta discretized into 100+ values; we merged values using a <1% threshold.
  Missing or Invalid Values: N/A
  Notes: Distribution of values:
    (*,30708) = 9%
    (30709,49380) = 10%
    (49381,76926) = 13%
    (76927,230310) = 47%
    (230311,*) = 21%
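
Once the intervals are fixed, assigning a raw loan amount to its Amt code is a lookup against the interval upper bounds. A minimal Python sketch follows, using the bounds listed above. The function and constant names are illustrative assumptions; the original work used Rosetta and Excel, not Python.

```python
from bisect import bisect_left

# Inclusive upper bounds of Amt intervals 1 to 4; interval 5 is open-ended.
AMT_UPPER_BOUNDS = [30708, 49380, 76926, 230310]


def amt_code(amount):
    """Map a raw loan amount to its discretized Amt code (1 to 5)."""
    # bisect_left counts the bounds strictly below the amount, so an amount
    # equal to an upper bound stays in the lower interval.
    return bisect_left(AMT_UPPER_BOUNDS, amount) + 1


if __name__ == "__main__":
    for amount in (30708, 30709, 100000, 230311):
        print(amount, "->", amt_code(amount))
```

Keeping the bounds in one sorted list means a change to the intervals only touches that list.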