FEATURE SELECTION FOR INVESTMENT DECISION¶
This analysis is based upon the "nba_dataset.csv". Link below.
https://github.com/MEMAUDATA/memaudata.github.io/blob/main/datasets/nba_dataset.csv
Two questions here :
- Identify the best features for the investment decision and develop an ML model
- Create a web app with the previous model. This app will run locally.
Objective : enhance the recall score!
Dataframe shape :
- 1340 rows and 21 columns
- Target : "TARGET_5Yrs"
- Name of players : column "Name"
- Features : 19 columns
The question on investment decision is based upon the column "TARGET_5Yrs", which is our dependent variable here (supervised ML). TARGET_5Yrs = 0 => no investment; TARGET_5Yrs = 1 => investment.
- Name : 1 -> type object
- GP : 1 -> type int
- Others : 19 -> type float64
- NaN : only in 3P% -> 0.82% of rows (proportion 0.008209)
- Duplicates : 46 players
- Outliers : up to ~6% in some columns (per-column IQR rule), not all. Keep all points for now.
Dataframe in depth:
- Target column ("TARGET_5Yrs") : 62% positives / 38% negatives
- Quantitative columns : not normalized and not normally distributed
- GP column : from 10 to 80, not normally distributed
Hypotheses :
- Target / GP : test for a difference!
- Target / quantitative variables : test for differences (Mann-Whitney) on MIN / PTS / FGM / FGA / FG% / FTM / 3P Made / FT% / AST / STL and TOV. Only 3P Made is not significant. All the others may have an impact on the target and might therefore be considered for modelling.
NV, Toulouse, October 2024
Install all required libraries from requirements.txt¶
#!pip install -r requirements.txt
Import libraries¶
import pandas as pd
import numpy as np
import seaborn as sns
from matplotlib import pyplot as plt
# From Jupyter to pdf
import nbconvert
# In the Terminal
# jupyter nbconvert --to html nba.ipynb
Exploratory Data Analysis¶
# Load dataset
df1 = pd.read_csv("./datasets/nba_dataset.csv")
Copy the dataset to avoid reloading it
# backup
df = df1.copy()
df.shape
(1340, 21)
# Display all columns
pd.set_option("display.max.columns", None)
pd.set_option("display.max.rows", None)
df.head()
| | Name | GP | MIN | PTS | FGM | FGA | FG% | 3P Made | 3PA | 3P% | FTM | FTA | FT% | OREB | DREB | REB | AST | STL | BLK | TOV | TARGET_5Yrs |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Brandon Ingram | 36 | 27.4 | 7.4 | 2.6 | 7.6 | 34.7 | 0.5 | 2.1 | 25.0 | 1.6 | 2.3 | 69.9 | 0.7 | 3.4 | 4.1 | 1.9 | 0.4 | 0.4 | 1.3 | 0.0 |
| 1 | Andrew Harrison | 35 | 26.9 | 7.2 | 2.0 | 6.7 | 29.6 | 0.7 | 2.8 | 23.5 | 2.6 | 3.4 | 76.5 | 0.5 | 2.0 | 2.4 | 3.7 | 1.1 | 0.5 | 1.6 | 0.0 |
| 2 | JaKarr Sampson | 74 | 15.3 | 5.2 | 2.0 | 4.7 | 42.2 | 0.4 | 1.7 | 24.4 | 0.9 | 1.3 | 67.0 | 0.5 | 1.7 | 2.2 | 1.0 | 0.5 | 0.3 | 1.0 | 0.0 |
| 3 | Malik Sealy | 58 | 11.6 | 5.7 | 2.3 | 5.5 | 42.6 | 0.1 | 0.5 | 22.6 | 0.9 | 1.3 | 68.9 | 1.0 | 0.9 | 1.9 | 0.8 | 0.6 | 0.1 | 1.0 | 1.0 |
| 4 | Matt Geiger | 48 | 11.5 | 4.5 | 1.6 | 3.0 | 52.4 | 0.0 | 0.1 | 0.0 | 1.3 | 1.9 | 67.4 | 1.0 | 1.5 | 2.5 | 0.3 | 0.3 | 0.4 | 0.8 | 1.0 |
df.dtypes.value_counts()
float64    19
object      1
int64       1
Name: count, dtype: int64
Plot the entire dataset with a seaborn heatmap to detect NaN¶
plt.figure(figsize=(10,10))
sns.heatmap(df.isna(),cbar=False) # 1 = empty
plt.show()
NaN¶
(df.isna().sum()/df.shape[0]).sort_values(ascending=True)
Name           0.000000
BLK            0.000000
STL            0.000000
AST            0.000000
REB            0.000000
DREB           0.000000
OREB           0.000000
FT%            0.000000
FTA            0.000000
TOV            0.000000
FTM            0.000000
3PA            0.000000
3P Made        0.000000
FG%            0.000000
FGA            0.000000
FGM            0.000000
PTS            0.000000
MIN            0.000000
GP             0.000000
TARGET_5Yrs    0.000000
3P%            0.008209
dtype: float64
NaN values are confined to a single column (3P%), at about 0.82% of rows (proportion 0.008209).
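Since 3P% is the only column with missing values and it is dropped from the final feature set later on, the NaNs never reach the model. If 3P% were kept, median imputation would be one common option — a minimal sketch on hypothetical values, not part of the original pipeline:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real dataset (hypothetical values)
toy = pd.DataFrame({"3P%": [25.0, np.nan, 30.0, np.nan, 20.0]})

# Fill missing 3P% with the column median (robust to the skewed distributions seen above)
toy["3P%"] = toy["3P%"].fillna(toy["3P%"].median())
print(toy["3P%"].tolist())  # [25.0, 25.0, 30.0, 25.0, 20.0]
```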
Duplicated players?¶
df_without_duplicates = df.drop_duplicates(subset='Name')
print(f"Nb of duplicated players : {df.shape[0] - df_without_duplicates.shape[0]}")
Nb of duplicated players : 46
Target column¶
df['TARGET_5Yrs'].value_counts(normalize=True)
TARGET_5Yrs
1.0    0.620149
0.0    0.379851
Name: proportion, dtype: float64
Quantitative columns¶
float_cols = [c for c in df.select_dtypes('float') if c != 'TARGET_5Yrs']
n = len(float_cols)
rows = (n // 3) + 1
fig, axes = plt.subplots(rows, 3, figsize=(12, 4 * rows))
axes = axes.flatten()
for ax, col in zip(axes, float_cols):
sns.histplot(df[col], kde=True, ax=ax, color="steelblue")
ax.set_title(col)
# Remove empty plots
for ax in axes[len(float_cols):]:
ax.set_visible(False)
plt.tight_layout()
plt.show()
GP column : int type¶
sns.displot(df['GP'],kde=True,color='green',alpha=0.3)
Outliers?¶
# Remove columns for counting outliers
df_outliers = df.drop(['Name', 'TARGET_5Yrs'], axis='columns')
print(df_outliers.head())
   GP   MIN  PTS  FGM  FGA   FG%  3P Made  3PA   3P%  FTM  FTA   FT%  OREB  \
0  36  27.4  7.4  2.6  7.6  34.7      0.5  2.1  25.0  1.6  2.3  69.9   0.7
1  35  26.9  7.2  2.0  6.7  29.6      0.7  2.8  23.5  2.6  3.4  76.5   0.5
2  74  15.3  5.2  2.0  4.7  42.2      0.4  1.7  24.4  0.9  1.3  67.0   0.5
3  58  11.6  5.7  2.3  5.5  42.6      0.1  0.5  22.6  0.9  1.3  68.9   1.0
4  48  11.5  4.5  1.6  3.0  52.4      0.0  0.1   0.0  1.3  1.9  67.4   1.0

   DREB  REB  AST  STL  BLK  TOV
0   3.4  4.1  1.9  0.4  0.4  1.3
1   2.0  2.4  3.7  1.1  0.5  1.6
2   1.7  2.2  1.0  0.5  0.3  1.0
3   0.9  1.9  0.8  0.6  0.1  1.0
4   1.5  2.5  0.3  0.3  0.4  0.8
# Group visualization (catplot creates its own figure, so no extra plt.figure() is needed)
g = sns.catplot(data=df_outliers, kind="box", height=10)
g.fig.set_size_inches(8, 4)
plt.xticks(rotation=45, ha='right')
plt.show()
outlier_counts = {}
for col in df_outliers.columns:
q1 = df_outliers[col].quantile(0.25)
q3 = df_outliers[col].quantile(0.75)
iqr = q3 - q1
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
mask = (df_outliers[col] < lower) | (df_outliers[col] > upper)
outlier_counts[col] = mask.mean() * 100
for col, pct in outlier_counts.items():
print(f"{col:_<10} {pct:6.2f}% outliers")
GP________   0.00% outliers
MIN_______   0.00% outliers
PTS_______   4.33% outliers
FGM_______   3.88% outliers
FGA_______   4.70% outliers
FG%_______   1.42% outliers
3P Made___   5.30% outliers
3PA_______   5.07% outliers
3P%_______   0.30% outliers
FTM_______   5.52% outliers
FTA_______   5.67% outliers
FT%_______   2.69% outliers
OREB______   2.91% outliers
DREB______   3.96% outliers
REB_______   3.58% outliers
AST_______   6.19% outliers
STL_______   4.25% outliers
BLK_______   5.60% outliers
TOV_______   5.00% outliers
For now, keep all data points in the dataset.
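An alternative to keeping (or dropping) outliers is IQR capping (winsorization): values beyond the usual 1.5 × IQR fences are clipped to the fence. This is only a sketch on hypothetical values, not something applied in this analysis:

```python
import pandas as pd

def cap_iqr(s: pd.Series, k: float = 1.5) -> pd.Series:
    """Clip values outside [Q1 - k*IQR, Q3 + k*IQR] (winsorization sketch)."""
    q1, q3 = s.quantile(0.25), s.quantile(0.75)
    iqr = q3 - q1
    return s.clip(lower=q1 - k * iqr, upper=q3 + k * iqr)

# Hypothetical points column with one extreme value
pts = pd.Series([5.0, 6.0, 7.0, 8.0, 40.0])
capped = cap_iqr(pts)
print(capped.tolist())  # [5.0, 6.0, 7.0, 8.0, 11.0]
```

This keeps every row while limiting the leverage of extreme values, which can matter for distance-based models such as SVM and KNN used below.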
Relationship between GP and the target¶
plt.figure(figsize=(20, 8))
sns.countplot(data=df, x='GP', hue='TARGET_5Yrs', palette="crest")
plt.title("Countplot of GP by TARGET_5Yrs")
plt.show()
plt.figure(figsize=(8, 4))
sns.boxplot(data=df, x='TARGET_5Yrs', y='GP', hue='TARGET_5Yrs', palette="crest")
plt.title("Boxplot of GP by TARGET_5Yrs")
plt.legend([], [], frameon=False)
plt.show()
plt.figure(figsize=(8, 4))
sns.histplot(data=df, x='GP', hue='TARGET_5Yrs', kde=True, palette="crest", alpha=0.7, common_norm=False)
plt.title("Distribution of GP by TARGET_5Yrs")
plt.show()
Relationship between the target and the quantitative variables¶
for col in df.select_dtypes('float'):
if col != 'TARGET_5Yrs':
fig, axes = plt.subplots(1, 2, figsize=(12, 5))
sns.histplot(data=df, x=col, hue='TARGET_5Yrs', kde=True, palette="crest", alpha=0.7, common_norm=False,ax=axes[0])
sns.boxplot(x='TARGET_5Yrs', y=col, data=df,hue='TARGET_5Yrs',palette="crest",ax=axes[1])
# Set titles
        axes[0].set_title(f"Distribution of {col}")
        axes[1].set_title(f"{col} by class")
plt.legend([],[], frameon=False)
plt.show()
Test hypotheses with non-parametric tests¶
# Mann-Whitney U test (a.k.a. Wilcoxon rank-sum)
from scipy.stats import mannwhitneyu
# GP
no_invest = df[df['TARGET_5Yrs'] == 0]['GP']
invest = df[df['TARGET_5Yrs'] == 1]['GP']
st,p = mannwhitneyu(no_invest,invest)
print(f"GP (difference by TARGET_5Yrs), p = {p}")
GP (difference by TARGET_5Yrs), p = 8.551378141471931e-48
col_list = ("MIN","PTS","FGM","FGA","FG%","FTM","3P Made","FT%" ,"AST","STL","TOV")
for col in col_list:
no_invest = df[df['TARGET_5Yrs'] == 0][col]
invest = df[df['TARGET_5Yrs'] == 1][col]
stat, p = mannwhitneyu(no_invest, invest, alternative='two-sided')
significance = "YES" if p < 0.05 else "NO"
print(f"{col:<10} | p = {p:.4f} | Significant difference: {significance}")
MIN        | p = 0.0000 | Significant difference: YES
PTS        | p = 0.0000 | Significant difference: YES
FGM        | p = 0.0000 | Significant difference: YES
FGA        | p = 0.0000 | Significant difference: YES
FG%        | p = 0.0000 | Significant difference: YES
FTM        | p = 0.0000 | Significant difference: YES
3P Made    | p = 0.2433 | Significant difference: NO
FT%        | p = 0.0002 | Significant difference: YES
AST        | p = 0.0000 | Significant difference: YES
STL        | p = 0.0000 | Significant difference: YES
TOV        | p = 0.0000 | Significant difference: YES
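Note that eleven tests are run at α = 0.05 without any multiplicity correction. A Holm-Bonferroni step-down would be a more conservative check — a sketch with illustrative p-values, not applied in the original analysis:

```python
import numpy as np

def holm_significant(p_values, alpha=0.05):
    """Holm-Bonferroni step-down: returns a boolean mask of rejected nulls."""
    p = np.asarray(p_values)
    order = np.argsort(p)
    m = len(p)
    reject = np.zeros(m, dtype=bool)
    for rank, idx in enumerate(order):
        if p[idx] <= alpha / (m - rank):
            reject[idx] = True
        else:
            break  # step-down stops at the first non-rejection
    return reject

# Hypothetical p-values in the spirit of the table above
print(holm_significant([1e-10, 1e-8, 0.24, 0.0002]))  # [ True  True False  True]
```

Given how small most of the p-values above are, the conclusions would very likely be unchanged, but the check costs little.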
Pre-processing¶
# Reload df
df = df1.copy()
# Remove duplicates within players column
print(df1.shape)
df = df.drop_duplicates(subset='Name')
print(df.shape)
(1340, 21)
(1294, 21)
# Filter df down to the significant features
clean_df = df[["GP","MIN","PTS","FGM","FGA","FG%","FTM","FT%" ,"AST","STL","TOV","TARGET_5Yrs"]].reset_index(drop=True)
print(clean_df.head())
   GP   MIN  PTS  FGM  FGA   FG%  FTM   FT%  AST  STL  TOV  TARGET_5Yrs
0  36  27.4  7.4  2.6  7.6  34.7  1.6  69.9  1.9  0.4  1.3          0.0
1  35  26.9  7.2  2.0  6.7  29.6  2.6  76.5  3.7  1.1  1.6          0.0
2  74  15.3  5.2  2.0  4.7  42.2  0.9  67.0  1.0  0.5  1.0          0.0
3  58  11.6  5.7  2.3  5.5  42.6  0.9  68.9  0.8  0.6  1.0          1.0
4  48  11.5  4.5  1.6  3.0  52.4  1.3  67.4  0.3  0.3  0.8          1.0
Modelling¶
from sklearn.model_selection import train_test_split,StratifiedKFold
from sklearn.preprocessing import StandardScaler,MinMaxScaler
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import f1_score,confusion_matrix,classification_report,recall_score
from sklearn.model_selection import learning_curve
from sklearn.svm import SVC
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline,Pipeline
labels = clean_df.drop('TARGET_5Yrs', axis=1).columns
X = clean_df.drop('TARGET_5Yrs', axis=1).values
y = clean_df['TARGET_5Yrs'].values
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)
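Since the target is imbalanced (~62/38), passing `stratify=y` to `train_test_split` would keep the class ratio essentially identical in both splits; the split above is unstratified. A sketch on synthetic labels (not the NBA data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic imbalanced labels mimicking the ~62/38 target split
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1000, 5))
y_demo = (rng.random(1000) < 0.62).astype(int)

# stratify=y_demo preserves the class ratio in train and test
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0, stratify=y_demo
)
print(round(y_tr.mean(), 2), round(y_te.mean(), 2))
```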
Try with a first basic model¶
model = DecisionTreeClassifier(random_state=42)
def evaluation(model, X_train, y_train, X_test, y_test):
model.fit(X_train, y_train)
y_pred = model.predict(X_test)
print("\nConfusion matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification report:")
print(classification_report(y_test, y_pred, zero_division=1))
print(f"Recall global : {recall_score(y_test, y_pred):.3f}")
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
N, train_score, val_score = learning_curve(
model,
X_train, y_train,
cv=cv,
scoring="recall",
train_sizes=np.linspace(0.1, 1, 10)
)
plt.figure(figsize=(6, 4))
plt.plot(N, train_score.mean(axis=1), marker='o', label='Train recall')
plt.plot(N, val_score.mean(axis=1), marker='s', label='Validation recall')
plt.ylim(0, 1.1)
plt.title("Learning curve (Recall)")
plt.xlabel("Training size")
plt.ylabel("Recall")
plt.legend()
plt.grid(False)
plt.show()
return model
evaluation(model,X_train, y_train, X_test, y_test)
Confusion matrix:
[[ 52 53]
[ 52 102]]
Classification report:
precision recall f1-score support
0.0 0.50 0.50 0.50 105
1.0 0.66 0.66 0.66 154
accuracy 0.59 259
macro avg 0.58 0.58 0.58 259
weighted avg 0.59 0.59 0.59 259
Recall global : 0.662
DecisionTreeClassifier(random_state=42)
Improve the recall score by choosing another model¶
dict_of_models = {"SVM" : make_pipeline(StandardScaler(),SVC(random_state=42)),
"KNN" : make_pipeline(StandardScaler(),KNeighborsClassifier(n_neighbors=3))}
for model_name, model in dict_of_models.items():
print(f"Model : {model_name}")
evaluation(model,X_train, y_train, X_test, y_test)
Model : SVM
Confusion matrix:
[[ 54 51]
[ 25 129]]
Classification report:
precision recall f1-score support
0.0 0.68 0.51 0.59 105
1.0 0.72 0.84 0.77 154
accuracy 0.71 259
macro avg 0.70 0.68 0.68 259
weighted avg 0.70 0.71 0.70 259
Recall global : 0.838
Model : KNN
Confusion matrix:
[[ 57 48]
[ 40 114]]
Classification report:
precision recall f1-score support
0.0 0.59 0.54 0.56 105
1.0 0.70 0.74 0.72 154
accuracy 0.66 259
macro avg 0.65 0.64 0.64 259
weighted avg 0.66 0.66 0.66 259
Recall global : 0.740
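Another lever for recall, not explored here, is the SVC `class_weight` parameter: weighting the positive class makes false negatives costlier during training, which typically raises recall on class 1 at some precision cost. A sketch on synthetic data (not the NBA dataset):

```python
from sklearn.datasets import make_classification
from sklearn.metrics import recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in with a ~62/38 class balance
X_s, y_s = make_classification(n_samples=600, n_features=11,
                               weights=[0.38, 0.62], random_state=42)
Xtr, Xte, ytr, yte = train_test_split(X_s, y_s, test_size=0.3,
                                      random_state=42, stratify=y_s)

recalls = {}
for w in [None, {0: 1, 1: 5}]:  # heavier weight on the positive class
    clf = make_pipeline(StandardScaler(), SVC(class_weight=w, random_state=42))
    clf.fit(Xtr, ytr)
    recalls[str(w)] = recall_score(yte, clf.predict(Xte))
print(recalls)
```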
Model optimization¶
from sklearn.model_selection import GridSearchCV
model_to_optimize = make_pipeline(StandardScaler(),SVC(random_state=42))
def grid_model(model, X_train, y_train, X_test, y_test):
hyper_parameters = {
'svc__gamma': [1e-3, 1e-4],
'svc__C': [0.1, 1, 10, 100, 1000],
'svc__kernel': ['linear']
}
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=42)
# GridSearch
grid = GridSearchCV(
estimator=model,
param_grid=hyper_parameters,
cv=cv,
scoring='recall',
n_jobs=-1
)
grid.fit(X_train, y_train)
y_pred = grid.predict(X_test)
print("Best Hyperparameters:", grid.best_params_)
print("\nClassification report:")
print(classification_report(y_test, y_pred, zero_division=1))
print(f"Recall global : {recall_score(y_test, y_pred):.3f}")
    # Learning curve
N, train_score, val_score = learning_curve(
estimator=grid.best_estimator_,
X=X_train,
y=y_train,
cv=StratifiedKFold(n_splits=5, shuffle=True, random_state=42),
scoring='recall',
train_sizes=np.linspace(0.1, 1, 10)
)
plt.figure(figsize=(6, 4))
plt.plot(N, train_score.mean(axis=1), marker='o', label='Train recall')
plt.plot(N, val_score.mean(axis=1), marker='s', label='Validation recall')
plt.ylim(0.5, 1.05)
plt.title("Learning curve (Recall)")
plt.xlabel("Training size")
plt.ylabel("Recall")
plt.legend()
plt.grid(False)
plt.show()
return grid.best_estimator_
grid_model(model_to_optimize,X_train, y_train, X_test, y_test)
Best Hyperparameters: {'svc__C': 10, 'svc__gamma': 0.001, 'svc__kernel': 'linear'}
Classification report:
precision recall f1-score support
0.0 0.66 0.52 0.59 105
1.0 0.72 0.82 0.76 154
accuracy 0.70 259
macro avg 0.69 0.67 0.67 259
weighted avg 0.69 0.70 0.69 259
Recall global : 0.818
Pipeline(steps=[('standardscaler', StandardScaler()),
('svc',
                 SVC(C=10, gamma=0.001, kernel='linear', random_state=42))])
Do we need all features?¶
from sklearn.feature_selection import SelectKBest, f_classif
final_model = make_pipeline(StandardScaler(),SelectKBest(score_func=f_classif),SVC(random_state=42))
def grid_model_features_selection(model,X_train, y_train, X_test, y_test):
hyper_parameters = {
'selectkbest__k': range(1, X_train.shape[1] + 1),
'svc__gamma': [1e-4, 1e-3],
'svc__C' : [0.1,1,10,100,1000] ,
'svc__kernel': ['linear']}
grid = GridSearchCV(
final_model,
hyper_parameters,
cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=42),
scoring='recall',n_jobs = -1)
grid.fit(X_train, y_train)
best_model = grid.best_estimator_
# Get the number of features and their names
select_k_best = grid.best_estimator_.named_steps['selectkbest']
selected_mask = select_k_best.get_support()
selected_feature_names = labels[selected_mask].to_list()
y_pred = best_model.predict(X_test)
print(f"Features selected : {selected_feature_names}")
print("Best Hyperparameters:", grid.best_params_)
print(confusion_matrix(y_test,y_pred))
print(classification_report(y_test,y_pred, zero_division=1))
N, train_score, val_score = learning_curve(
best_model,
X_train, y_train,
cv=StratifiedKFold(n_splits=3, shuffle=True, random_state=42),
scoring='recall',
train_sizes=np.linspace(0.1, 1, 10))
plt.figure(figsize=(6,4))
plt.plot(N,train_score.mean(axis=1),marker = 'o',label = 'train score')
plt.plot(N,val_score.mean(axis=1),label = 'val score')
plt.ylim(0, 1.1)
plt.legend(ncols=2, loc= 'lower center')
plt.grid(False)
plt.show()
return best_model
best_model = grid_model_features_selection(final_model, X_train, y_train, X_test, y_test)
Features selected : ['GP', 'MIN', 'PTS', 'FGM', 'FGA', 'FG%', 'FTM', 'STL', 'TOV']
Best Hyperparameters: {'selectkbest__k': 9, 'svc__C': 100, 'svc__gamma': 0.0001, 'svc__kernel': 'linear'}
[[ 51 54]
[ 26 128]]
precision recall f1-score support
0.0 0.66 0.49 0.56 105
1.0 0.70 0.83 0.76 154
accuracy 0.69 259
macro avg 0.68 0.66 0.66 259
weighted avg 0.69 0.69 0.68 259
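`SelectKBest` with `f_classif` simply ranks features by their ANOVA F-score, which explains why `FT%` and `AST` (the weakest of the eleven) are dropped at k = 9. The scores can be inspected directly — sketched on synthetic data, since they depend on the actual dataset:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.feature_selection import SelectKBest, f_classif

# Synthetic data: 4 informative features out of 6 (not the NBA data);
# with shuffle=False the informative columns come first
X_s, y_s = make_classification(n_samples=500, n_features=6, n_informative=4,
                               n_redundant=0, shuffle=False, random_state=0)

selector = SelectKBest(score_func=f_classif, k=4).fit(X_s, y_s)
ranking = np.argsort(selector.scores_)[::-1]  # feature indices, best first
print("features by F-score, best first:", ranking)
```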
Save final model¶
import joblib
joblib.dump(best_model, 'best_model.pkl')
['best_model.pkl']
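A quick sanity check after dumping is to reload the file and confirm identical predictions — sketched here with a tiny stand-in model and a temporary file; the same pattern applies to `best_model.pkl`:

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Tiny stand-in model (hypothetical data)
X_s = np.array([[0.0], [1.0], [2.0], [3.0]])
y_s = np.array([0, 0, 1, 1])
clf = DecisionTreeClassifier(random_state=0).fit(X_s, y_s)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "model.pkl")
    joblib.dump(clf, path)
    reloaded = joblib.load(path)
    # Round-trip check: reloaded model must reproduce the predictions
    same = np.array_equal(clf.predict(X_s), reloaded.predict(X_s))
print(same)  # True
```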
Test the model on new, arbitrary data¶
col = ('GP','MIN','PTS','FGM','FGA','FG%','FTM','FT%','AST','STL','TOV')
dt = np.array([52 ,105.4 , 7.4 , 2.6, 7.6 , 34.7 , 1.6 ,0,0, 0.4 , 1.3])
dt = dt.reshape(1,dt.shape[0])
y_new_pred = best_model.predict(dt)
y_new_pred = y_new_pred > 0.5
print(f"Invest? {y_new_pred}")
Invest? [False]
Conclusion :
I should try another model or another feature-selection method to further reduce the number of inputs!
Web app on Flask (run locally)¶
For display purposes, the Flask Python code is shown here.
In practice, it is better to keep the Flask script in a separate app.py file.
from flask import Flask, render_template, request
import pandas as pd
import joblib
app = Flask(__name__)
# Load best_model
loaded_model = joblib.load('best_model.pkl')
@app.route("/", methods=["GET", "POST"])
def index():
prediction = None
error_message = None
if request.method == "POST":
try:
GP = float(request.form.get('GP'))
MIN = float(request.form.get('MIN'))
PTS = float(request.form.get('PTS'))
FGM = float(request.form.get('FGM'))
FGA = float(request.form.get('FGA'))
            FG = float(request.form.get('FG_pct'))  # field name used in index.html
FTM = float(request.form.get('FTM'))
STL = float(request.form.get('STL'))
TOV = float(request.form.get('TOV'))
            # Default values for the two model features not exposed in the form
            FT_pct = 0.0
            AST = 0.0
            # Column order must match the training feature order:
            # GP, MIN, PTS, FGM, FGA, FG%, FTM, FT%, AST, STL, TOV
            input_data = pd.DataFrame({
                'GP': [GP],
                'MIN': [MIN],
                'PTS': [PTS],
                'FGM': [FGM],
                'FGA': [FGA],
                'FG%': [FG],
                'FTM': [FTM],
                'FT%': [FT_pct],
                'AST': [AST],
                'STL': [STL],
                'TOV': [TOV]
            })
prediction = loaded_model.predict(input_data)[0]
except ValueError:
error_message = "Invalid input. Please enter numbers only."
return render_template("index.html", prediction=prediction, error=error_message)
if __name__ == '__main__':
app.run(debug=True,host="127.0.0.1", port=5000)
HTML code used alongside the web app code¶
It should also live in a separate index.html file
from IPython.display import HTML
html_code = """
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>NBA Investment</title>
<link rel="icon"
href="https://upload.wikimedia.org/wikipedia/commons/thumb/b/be/Isolated_basketball.png/600px-Isolated_basketball.png?20121214161940"
type="image/x-icon">
<style>
body {
font-family: Arial, sans-serif;
}
.tbox {
margin: 20px;
padding: 10px;
}
.box {
background-color: #2da1ff;
color: white;
margin: 20px;
padding: 20px;
width: 280px;
border-radius: 8px;
}
table td {
padding: 6px;
}
input[type="text"] {
width: 100px;
padding: 4px;
}
.submit-btn {
margin-top: 15px;
padding: 8px 15px;
background: white;
color: #2da1ff;
border: none;
border-radius: 5px;
cursor: pointer;
font-weight: bold;
}
</style>
</head>
<body>
<div class="tbox">
<h2>Investment Decision</h2>
<p>Enter NBA player statistics</p>
</div>
<div class="box">
<form action="http://127.0.0.1:5000" method="POST">
<table>
<tr><td>GP</td><td><input type="text" name="GP"></td></tr>
<tr><td>MIN</td><td><input type="text" name="MIN"></td></tr>
<tr><td>PTS</td><td><input type="text" name="PTS"></td></tr>
<tr><td>FGM</td><td><input type="text" name="FGM"></td></tr>
<tr><td>FGA</td><td><input type="text" name="FGA"></td></tr>
<tr><td>FG%</td><td><input type="text" name="FG_pct"></td></tr>
<tr><td>FTM</td><td><input type="text" name="FTM"></td></tr>
<tr><td>STL</td><td><input type="text" name="STL"></td></tr>
<tr><td>TOV</td><td><input type="text" name="TOV"></td></tr>
</table>
<button class="submit-btn" type="submit">Predict</button>
</form>
</div>
<div class="tbox">
<h3>Invest ?</h3>
<p>{{ prediction }}</p>
</div>
</body>
</html>
"""
HTML(html_code)