Credit Card Analysis of Czech Bank

ITCS6265 : Fall Semester of 2002
Instructor: Dr. Mirsad Hadzikadic

UNC-Charlotte  |  College of IT  |  Dr. Mirsad Hadzikadic  |  ITCS6265   



Data Pre-Processing Activities
The following activities were completed to prepare the dataset for use in the data mining exercise:

1. Converted the ASCII files to:
a. MS Excel and/or MS Word files for cleaning the data
b. An MS Access database for data mining, de-normalizing or ‘flattening’ files, and basic querying to learn more about the data
c. File types recognized by Weka, for all modified files. These files are comma-delimited, with a header of attribute definitions.
2. Verified all table relationships:
a. Every account has an owner, as confirmed through the Disp and Account tables.
b. Order and Loan records are duplicated in the transaction records. That is, the transactions include Order records and Loan payments.
i. Loan payments in Trans are identified by k_symbol="LP"
3. Changed attributes as necessary. The tables linked below describe the changes made:
a. Added, changed, removed, and discretized attributes
4. De-normalized, or ‘flattened’, files for mining. Our database is relational, but mining or clustering requires all attributes of interest to be in a single table, so we created a de-normalized table based on our goal.
a. Goal: Analyze credit card information to infer the type of customer who makes a good candidate for a credit card.
i. Combine tables: Account-Client-Disp-Card-District-Loan-Transaction
º Clustered customer information, using what we learned about accounts from previous clustering
º Used card type as the clustering attribute
º Added "N" (None) as a possible value of the Loan Status attribute
º To better understand customers, we looked at this table in two ways:
  º To identify which customers were credit card holders and which were not
  º To examine the differences among credit card holders
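The verification, flattening, and export steps above can be sketched in a few lines. The example below is only an illustration under stated assumptions, not the queries the team ran in MS Access: it uses Python's built-in sqlite3 module in place of Access, the column names (account_id, disp_id, client_id, district_id, type, status) and the 'OWNER' marker are assumed, every row is made up, and 'none' is a hypothetical placeholder for customers without a card. It also assumes the Weka files were in ARFF, which matches the description of comma-delimited data with an attribute header.

```python
import sqlite3

# Hypothetical miniature of the relational tables; all rows are made up.
SCHEMA_AND_ROWS = """
CREATE TABLE account (account_id INTEGER PRIMARY KEY, district_id INTEGER);
CREATE TABLE disp (disp_id INTEGER PRIMARY KEY, client_id INTEGER,
                   account_id INTEGER, type TEXT);
CREATE TABLE loan (loan_id INTEGER PRIMARY KEY, account_id INTEGER, status TEXT);
CREATE TABLE card (card_id INTEGER PRIMARY KEY, disp_id INTEGER, type TEXT);
INSERT INTO account VALUES (1, 10), (2, 20), (3, 10);
INSERT INTO disp VALUES (1, 101, 1, 'OWNER'), (2, 102, 1, 'DISPONENT'),
                        (3, 103, 2, 'OWNER'), (4, 104, 3, 'OWNER');
INSERT INTO loan VALUES (1, 1, 'A');
INSERT INTO card VALUES (1, 3, 'gold');
"""

# Assumed attribute definitions for the flattened table.
ATTRIBUTES = [
    ("account_id", "numeric"),
    ("client_id", "numeric"),
    ("district_id", "numeric"),
    ("loan_status", ["A", "B", "C", "D", "N"]),   # "N" = no loan
    ("card_type", ["junior", "classic", "gold", "none"]),
]


def build_demo_db():
    """Create an in-memory database holding the made-up rows."""
    con = sqlite3.connect(":memory:")
    con.executescript(SCHEMA_AND_ROWS)
    return con


def accounts_without_owner(con):
    """Return ids of accounts that have no disposition of type 'OWNER'."""
    rows = con.execute("""
        SELECT a.account_id
        FROM account a
        LEFT JOIN disp d ON d.account_id = a.account_id AND d.type = 'OWNER'
        WHERE d.disp_id IS NULL
    """).fetchall()
    return [r[0] for r in rows]


def flatten(con):
    """One row per account owner; missing loans become 'N', missing cards 'none'."""
    return con.execute("""
        SELECT a.account_id, d.client_id, a.district_id,
               COALESCE(l.status, 'N')  AS loan_status,
               COALESCE(c.type, 'none') AS card_type
        FROM account a
        JOIN disp d ON d.account_id = a.account_id AND d.type = 'OWNER'
        LEFT JOIN loan l ON l.account_id = a.account_id
        LEFT JOIN card c ON c.disp_id = d.disp_id
        ORDER BY a.account_id
    """).fetchall()


def to_arff(relation, attributes, rows):
    """Render rows as ARFF text: attribute header first, then comma-delimited data."""
    lines = ["@relation " + relation, ""]
    for name, kind in attributes:
        spec = kind if isinstance(kind, str) else "{" + ",".join(kind) + "}"
        lines.append("@attribute " + name + " " + spec)
    lines += ["", "@data"]
    lines += [",".join(str(v) for v in row) for row in rows]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    con = build_demo_db()
    print("accounts without owner:", accounts_without_owner(con))
    print(to_arff("customers", ATTRIBUTES, flatten(con)))
```

The LEFT JOINs keep customers who have no loan or no card in the flat table, which is why a value such as "N" is needed for Loan Status. If an account could hold more than one loan, the join would repeat that account's row, so the loan data would need to be aggregated first.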

Data Description

Table Descriptions
ACCOUNTS
Each record describes static characteristics of an account
Size: 4500 objects in the file
CLIENTS
Each record describes characteristics of a client
Size: 5369 objects in the file
DISPOSITION (DISP)
Each record relates a client to an account; that is, this relation describes the rights of clients to operate accounts
Size: 5369 objects in the file
PERMANENT ORDERS, Debits only (ORDER)
Each record describes characteristics of a payment order
Size: 6471 objects in the file
TRANSACTIONS (TRANS)
Each record describes one transaction on an account
Size: 1056320 objects in the file
LOANS
Each record describes a loan granted for a given account
Size: 682 objects in the file
CREDIT CARDS (CARD)
Each record describes a credit card issued to an account
Size: 892 objects in the file
DEMOGRAPHIC DATA (DISTRICT)
Each record describes demographic characteristics of a district
Size: 77 objects in the file
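The sizes listed above can double as a sanity check after each file conversion. The following is a minimal sketch, assuming the row counts of the converted files have already been collected into a dictionary; the function name and the dictionary keys are hypothetical, and only the numbers come from the table descriptions.

```python
# Record counts as listed in the table descriptions above.
EXPECTED_ROWS = {
    "ACCOUNTS": 4500,
    "CLIENTS": 5369,
    "DISP": 5369,
    "ORDER": 6471,
    "TRANS": 1056320,
    "LOANS": 682,
    "CARD": 892,
    "DISTRICT": 77,
}


def check_row_counts(actual, expected=EXPECTED_ROWS):
    """Return {table: (expected, actual)} for each table whose count differs.

    A table missing from `actual` is reported with an actual count of None.
    """
    mismatches = {}
    for table, want in expected.items():
        got = actual.get(table)
        if got != want:
            mismatches[table] = (want, got)
    return mismatches


if __name__ == "__main__":
    # Made-up example: pretend one loan row was lost during conversion.
    loaded = dict(EXPECTED_ROWS, LOANS=681)
    print(check_row_counts(loaded))
```

An empty result means every converted file still has the number of records described here.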