## Warning: package 'doParallel' was built under R version 4.4.3
## Loading required package: foreach
## Loading required package: iterators
## Loading required package: parallel
result <- pipeline(
  # --- Define the vectorization method ---
  # Options: "bow" (raw counts), "tf" (term frequency), "tfidf"
  vect_method = "tf",

  # --- Define the model to train ---
  # Options: "logit", "rf", "xgb"
  model_name = "rf",

  # --- Specify the data and column names ---
  df = tweets,
  text_column_name = "cleaned_text",      # The column with our preprocessed text
  sentiment_column_name = "sentiment",    # The column with the target variable

  # --- Set vectorization options ---
  # Use n_gram = 2 for unigrams + bigrams, or 1 for just unigrams
  n_gram = 1
)
## --- Running Pipeline: TF + RF ---
## Data split: 946 training rows, 235 test rows.
## Vectorizing with TF (ngram=1)...
## - Fitting BoW model (tf) on training data...
## - Applying BoW transformation (tf) to new data...
##
## --- Training Random Forest Model (with ranger) ---
## Ranger training complete.
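
To make the difference between the "bow" and "tf" options more concrete, here is a minimal base R sketch on a made-up three-document corpus. It is not the package's implementation: the toy documents, the whitespace tokenizer, and the definition of term frequency as raw count divided by document length are all assumptions chosen for illustration, and the package may tokenize or normalize differently.

```r
# Toy corpus (hypothetical documents, for illustration only).
docs <- c("good good movie", "bad movie", "good plot bad acting")

# Split each document on whitespace to get its tokens.
tokens <- strsplit(docs, "\\s+")

# Vocabulary: the sorted unique terms across all documents.
vocab <- sort(unique(unlist(tokens)))

# "bow"-style matrix: raw counts, one row per document, one column per term.
bow <- t(vapply(
  tokens,
  function(tk) as.numeric(table(factor(tk, levels = vocab))),
  numeric(length(vocab))
))
colnames(bow) <- vocab

# "tf"-style matrix: counts divided by document length (assumed definition),
# so each row sums to 1.
tf <- bow / rowSums(bow)

print(bow)
print(round(tf, 2))
```

Under this assumed definition, a short and a long document that use a word in the same proportion get the same weight for it, whereas raw counts would favor the longer document.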
tweets$sentimentPredict <- prediction(
  pipeline_object = result,
  df = tweets,
  text_column = "cleaned_text"
)
## --- Preparing new data for prediction ---
## - Applying BoW transformation (tf) to new data...
## --- Making Predictions ---
## - Detected a ranger model. Using predict.train().
## Step 4: Correcting column names for ranger model...
## --- Predictions Complete ---