Logistic regression can be used for probability prediction, classification, and related tasks.
LogisticRegression(penalty='l2', dual=False, tol=0.0001, C=1.0, fit_intercept=True, intercept_scaling=1, class_weight=None, random_state=None, solver='liblinear', max_iter=100, multi_class='ovr', verbose=0, warm_start=False, n_jobs=1)
(1) In the multi-class case, if the 'multi_class' option is set to 'ovr', the training algorithm uses the one-vs-rest (OvR) scheme; if it is set to 'multinomial', the cross-entropy (multinomial) loss is used.
(2) This class implements regularized logistic regression using the 'liblinear' library and the 'newton-cg', 'sag', and 'lbfgs' solvers. It accepts both dense and sparse input matrices; for best performance, use C-ordered arrays or CSR matrices containing 64-bit floats.
(3) The 'newton-cg', 'sag', and 'lbfgs' solvers support only L2 regularization with a primal formulation; the 'liblinear' solver supports both L1 and L2 regularization, and uses a dual formulation only for the L2 penalty.
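The solver/penalty pairing above is the main constraint when constructing the estimator. A minimal sketch, assuming scikit-learn is installed (note that the default solver and some defaults differ between scikit-learn versions):

```python
from sklearn.linear_model import LogisticRegression

# liblinear supports both the L1 and L2 penalties
clf_l1 = LogisticRegression(penalty='l1', solver='liblinear', C=1.0)

# newton-cg / sag / lbfgs only support the L2 penalty
clf_l2 = LogisticRegression(penalty='l2', solver='lbfgs', C=1.0, max_iter=100)
```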
Methods:
- decision_function(X): Predict confidence scores for samples.
- densify(): Convert the coefficient matrix to dense array format.
- fit(X, y[, sample_weight]): Fit the model according to the given training data; used to train the LR classifier, where X is the training samples and y the corresponding labels.
- get_params([deep]): Get parameters for this estimator.
- predict(X): Predict the class labels of test samples, i.e. classification.
- predict_log_proba(X): Log of probability estimates.
- predict_proba(X): Probability estimates.
- score(X, y[, sample_weight]): Return the mean accuracy on the given test data and labels.
- set_params(**params): Set the parameters of this estimator.

(1) fit(X, y, sample_weight=None)
Fit the model according to the given training data.
Parameters:
    X : {array-like, sparse matrix}, shape (n_samples, n_features). Training samples.
    y : array-like, shape (n_samples,). Target values.
    sample_weight : array-like, shape (n_samples,), optional.
Returns:
    self : object.
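A minimal sketch of training the classifier with fit, assuming scikit-learn and its bundled iris dataset are available; the variable names are purely illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)            # X: (150, 4) features, y: (150,) labels
clf = LogisticRegression(solver='liblinear')
clf.fit(X, y)                                 # returns the fitted estimator (self)

print(clf.coef_.shape)    # per-class coefficients learned during fitting
print(clf.intercept_)     # per-class intercepts
```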
(2) get_params(deep=True)
Get parameters for this estimator.
Parameters:
    deep : boolean, optional.
Returns:
    params : mapping of string to any.
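A short illustration of get_params together with its counterpart set_params; the parameter values here are arbitrary:

```python
from sklearn.linear_model import LogisticRegression

clf = LogisticRegression(C=1.0)
params = clf.get_params()          # dict of constructor parameters
print(params['C'])                 # 1.0

clf.set_params(C=0.5)              # update a parameter in place
print(clf.get_params()['C'])       # 0.5
```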
(3) predict(X)
Predict class labels for samples in X.
Parameters:
    X : {array-like, sparse matrix}, shape = [n_samples, n_features].
Returns:
    C : array, shape = [n_samples].
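A sketch of predicting labels for new samples; the fit on iris is only an example setup, and decision_function is shown alongside to expose the confidence scores behind the predicted labels:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='liblinear').fit(X, y)

labels = clf.predict(X[:5])               # predicted class labels, shape (5,)
scores = clf.decision_function(X[:5])     # confidence scores per class
print(labels)
print(scores.shape)
```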
(4) predict_log_proba(X)
Log of probability estimates.
The returned estimates for all classes are ordered by the label of classes.
Parameters:
    X : array-like, shape = [n_samples, n_features].
Returns:
    T : array-like, shape = [n_samples, n_classes].
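As a sketch of how this relates to predict_proba (using the same illustrative iris fit as above): predict_log_proba returns the natural log of the probability estimates, with columns ordered by clf.classes_:

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
clf = LogisticRegression(solver='liblinear').fit(X, y)

log_p = clf.predict_log_proba(X[:3])      # shape (3, n_classes)
p = clf.predict_proba(X[:3])
print(np.allclose(log_p, np.log(p)))      # True: log_p is the log of the probabilities
```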
(5) predict_proba(X)
Probability estimates.
The returned estimates for all classes are ordered by the label of classes.
For a multi-class problem, if multi_class is set to "multinomial", the softmax function is used to find the predicted probability of each class. Otherwise a one-vs-rest approach is used, i.e., the probability of each class is computed with the logistic function assuming that class is the positive one, and these values are then normalized across all classes.
Parameters:
    X : array-like, shape = [n_samples, n_features].
Returns:
    T : array-like, shape = [n_samples, n_classes].
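A hedged sketch of predict_proba with the multinomial (softmax) formulation; with the lbfgs solver, recent scikit-learn versions already use the multinomial loss for multi-class data by default, while older versions may need multi_class='multinomial' passed explicitly:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

X, y = load_iris(return_X_y=True)
# lbfgs: multinomial (softmax) probabilities for this 3-class problem
clf = LogisticRegression(solver='lbfgs', max_iter=1000).fit(X, y)

proba = clf.predict_proba(X[:3])   # shape (3, n_classes)
print(proba.sum(axis=1))           # each row sums to 1
print(clf.classes_)                # column order of the estimates
```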
(6) score(X, y, sample_weight=None)
Return the mean accuracy on the given test data and labels. In multi-label classification, this is the subset accuracy, which is a harsh metric since it requires that each label set be predicted correctly for every sample.
Parameters:
    X : array-like, shape = (n_samples, n_features).
    y : array-like, shape = (n_samples,) or (n_samples, n_outputs).
    sample_weight : array-like, shape = [n_samples], optional.
Returns:
    score : float.
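A final sketch showing that score is the mean accuracy, i.e. it matches accuracy_score applied to the model's own predictions; the train/test split here is purely illustrative:

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

clf = LogisticRegression(solver='liblinear').fit(X_train, y_train)

acc = clf.score(X_test, y_test)                          # mean accuracy on the test set
print(acc)
print(accuracy_score(y_test, clf.predict(X_test)))       # same value
```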