Logistic regression

Fitting

from sklearn.linear_model import LogisticRegression

Assume the data is in two separate arrays, where X holds the independent variables (features) and y holds the dependent variable (class labels).

Define the model:

model = LogisticRegression(solver='liblinear', random_state=0)

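solver selects the optimization algorithm used to fit the model. Options include 'lbfgs' (the default), 'liblinear', 'newton-cg', 'sag' and 'saga'; 'liblinear' is a reasonable choice for small datasets. random_state seeds the data shuffling that some solvers (including 'liblinear') perform, which makes the results reproducible.

Once the model is defined, fit it to the data: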

model.fit(X, y)

Defining and fitting the model can be condensed into one statement:

model = LogisticRegression(solver='liblinear', random_state=0).fit(X, y)

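X must be a 2D array of shape (n_samples, n_features). If there is only one feature and it is stored in a 1D array, reshape it before fitting; y should stay 1D:

X = X.reshape(-1, 1)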
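As a minimal end-to-end sketch of the fitting steps, assuming a small made-up dataset with a single feature (the values below are purely illustrative and not from any real data), the feature array is reshaped to (n_samples, n_features) while the labels stay 1D:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data: one feature, ten observations, binary labels.
X = np.arange(10)                                  # 1D, shape (10,)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])       # 1D, shape (10,)

# scikit-learn expects X with shape (n_samples, n_features).
X = X.reshape(-1, 1)                               # now shape (10, 1)

# Define the model, then fit it to the data.
model = LogisticRegression(solver='liblinear', random_state=0)
model.fit(X, y)

print(X.shape, y.shape)    # shapes passed to fit
print(model.classes_)      # class labels the model learned from y
```

After fitting, model.classes_ lists the distinct labels found in y, which is a quick way to confirm the model saw the classes you expected.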

Extracting values

The fitted model exposes the intercept and the coefficients as attributes:

model.intercept_

model.coef_
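To make the meaning of these two attributes concrete, here is a small sketch that refits the model on hypothetical toy data (an assumption for illustration), inspects the attribute shapes, and rebuilds the class-1 probability by hand from the intercept and coefficients using the logistic function:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, used only for illustration.
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
model = LogisticRegression(solver='liblinear', random_state=0).fit(X, y)

b0 = model.intercept_    # shape (1,) in the binary case
b1 = model.coef_         # shape (1, n_features) in the binary case
print(b0.shape, b1.shape)

# Log-odds for each observation, then the logistic (sigmoid) function.
z = b0 + X @ b1.T                    # shape (n_samples, 1)
p_manual = 1.0 / (1.0 + np.exp(-z))

# Compare with the model's own probability for class 1.
p_model = model.predict_proba(X)[:, 1]
print(np.allclose(p_manual.ravel(), p_model))
```

The trailing underscore on intercept_ and coef_ is scikit-learn's convention for attributes that only exist after fit has been called.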

Extracting probabilities

For a binary classifier, get the probability of each class (0 and 1) for every observation in x, where x is a 2D array of observations:

model.predict_proba(x)

This returns an array of the form array([[<prob 0>, <prob 1>],...]). Each row corresponds to one observation. The columns follow the order of model.classes_, so for labels 0 and 1 they hold 1-p(x) and p(x), and each row sums to 1.
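A short sketch of the probability output, assuming hypothetical toy training data and three made-up new observations in x, shows the output shape, the column order and that each row sums to 1:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, used only for illustration.
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
model = LogisticRegression(solver='liblinear', random_state=0).fit(X, y)

# New observations must also be 2D: (n_observations, n_features).
x = np.array([[1.0], [4.5], [8.0]])

proba = model.predict_proba(x)
print(proba.shape)           # one row per observation, one column per class
print(model.classes_)        # column order of proba
print(proba.sum(axis=1))     # each row sums to 1

# Probability of class 1 only, as a 1D array.
p1 = proba[:, 1]
```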

Extracting predictions

model.predict(x)
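predict returns one class label per observation. As an illustrative sketch on hypothetical toy data, the labels can be compared with picking the most probable class from predict_proba, which in the binary case amounts to a 0.5 threshold on the class-1 probability:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Hypothetical toy data, used only for illustration.
X = np.arange(10).reshape(-1, 1)
y = np.array([0, 0, 0, 0, 1, 1, 1, 1, 1, 1])
model = LogisticRegression(solver='liblinear', random_state=0).fit(X, y)

x = np.array([[1.0], [4.5], [8.0]])

labels = model.predict(x)            # 1D array of class labels
print(labels.shape)

# Same labels, derived from the most probable class per row.
proba = model.predict_proba(x)
manual = model.classes_[np.argmax(proba, axis=1)]
print(np.array_equal(labels, manual))
```

If a decision threshold other than 0.5 is needed, one option is to threshold proba[:, 1] directly instead of calling predict.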

Model score

model.score(X_test, y_test)
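For a classifier, score returns the mean accuracy on the data it is given. The sketch below assumes a synthetic dataset from make_classification and a 75/25 train/test split (both are illustrative choices, not part of the original notes) and checks that score matches accuracy_score applied to the predictions:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Synthetic binary classification data, used only for illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)

# Hold out 25% of the observations for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

model = LogisticRegression(solver='liblinear', random_state=0)
model.fit(X_train, y_train)

# Mean accuracy on the held-out data.
acc = model.score(X_test, y_test)

# Equivalent computation from the predicted labels.
acc_manual = accuracy_score(y_test, model.predict(X_test))
print(acc, acc_manual)
```

Scoring on held-out data rather than on the training data gives a less optimistic estimate of how the model performs on unseen observations.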