code for the project of the DM course in UCAS, initially copied from github.com/bound2020/Code-Destructor/tree/master

jinluyang/gbdt

### here
Sparse data is categorical data; it is not necessary.
The code has to be modified to do classification, including multiclass classification.
by jinluyang

Data Format
===========
The input of this GBDT solver consists of a label vector (y), a dense matrix
(XD), and a binary sparse matrix (XS). The input formats of these two matrices
are described in the following two sections.

Dense Matrix
------------
The input format is:

<label> <value_1> <value_2> ... 
.
.
.

Note that to represent a dense matrix, we do not have to give indices. For
example,

1 32 91 27 44
0 13 25 55 83
0 32 11 78 99

represents:

y        XD
1   32 91 27 44
0   13 25 55 83
0   32 11 78 99
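
As an illustration of the dense format, here is a minimal Python sketch that splits such text into y and XD. It is not part of the solver: the function name `parse_dense` is hypothetical, and reading labels and values as floats is an assumption.

```python
def parse_dense(text):
    """Split dense-format text into a label vector y and a row-major matrix XD."""
    y, XD = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # First field is the label; the rest are the feature values, in column order.
        y.append(float(fields[0]))
        XD.append([float(v) for v in fields[1:]])
    return y, XD


sample_dense = "1 32 91 27 44\n0 13 25 55 83\n0 32 11 78 99\n"
y, XD = parse_dense(sample_dense)
```

With the example above, `y` holds the three labels and each row of `XD` holds the four values that follow the label on that line.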

Binary Sparse Matrix
--------------------
The input format is:

<label> <index_1> <index_2> ... 
.
.
.

To represent a binary sparse matrix, we only need to know where the non-zero
elements are, so values are not specified. Indices are 1-based and may be
listed in any order, as the example below shows.

For example, 

1 2 9 5
0 1 3 7
0 4 8 2

represents:

y          XS
1   0 1 0 0 1 0 0 0 1
0   1 0 1 0 0 0 1 0 0
0   0 1 0 1 0 0 0 1 0

Note that the labels in the binary sparse matrix are just dummies. They have no
practical use; please specify the correct labels in the dense matrix.
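
The sparse format can be expanded into 0/1 rows as in the following Python sketch. It is an illustration only, not the solver's own reader: the name `parse_sparse` and its `num_cols` parameter are hypothetical. It assumes 1-based indices, which is what the example above implies, and it ignores the dummy label on each line.

```python
def parse_sparse(text, num_cols=None):
    """Expand binary-sparse-format text into a list of 0/1 rows (XS)."""
    index_rows = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        # fields[0] is the dummy label and is ignored; the rest are column indices.
        index_rows.append([int(i) for i in fields[1:]])

    if num_cols is None:
        # Assumption: when no width is given, use the largest index seen.
        num_cols = max((max(r) for r in index_rows if r), default=0)

    XS = []
    for indices in index_rows:
        row = [0] * num_cols
        for i in indices:
            row[i - 1] = 1  # indices are assumed to be 1-based
        XS.append(row)
    return XS


sample_sparse = "1 2 9 5\n0 1 3 7\n0 4 8 2\n"
XS = parse_sparse(sample_sparse)
```

Passing `num_cols` explicitly is the safer choice when the last columns may be empty in every row, since the width cannot be inferred from the indices in that case.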
