Data Science

Running Google Cloud Vertex AI through Python

September 19, 2023
Data Science
GCP, Vertex AI

Google Cloud’s Vertex AI is pretty awesome for machine learning. But it can get tedious running everything through the site’s user interface. It’s especially sigh-worthy when a workflow involves multiple iterations of different datasets and models. Fortunately, it’s straightforward to generate a dataset in BigQuery, instantiate it as a dataset for use in Vertex AI, and finally run the Vertex AI model, all from a Python script or a Jupyter notebook. ...

How to: Hide all the warnings in Jupyter Notebooks

February 25, 2023
Data Science
Python, Jupyter

If you’re a data scientist, there’s a good chance that you spend a healthy chunk of time working in Jupyter Notebooks. Every now and then, you might do something that triggers a warning (such as ::ahem:: using a deprecated method from the Pandas package). That warning can end up consuming a whole lot of screen real estate, especially if it’s part of a looping function. One way to deal with those warnings is to simply make them disappear by adding the following chunk in a cell near the top of the notebook: ...
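The excerpt cuts off before the snippet itself, so here is only a minimal sketch of one common approach, assuming the standard library's `warnings` module is the mechanism. `noisy_step` is a hypothetical stand-in for any code that triggers a warning; it is not from the post.

```python
import warnings

# Assumption: silence every warning category for the rest of the session.
# In a notebook, this line would sit in a cell near the top.
warnings.filterwarnings("ignore")


def noisy_step():
    """Hypothetical stand-in for code that emits a deprecation warning."""
    warnings.warn("this method is deprecated", DeprecationWarning)
    return "done"


# With the filter in place, the call runs without printing the warning.
result = noisy_step()
```

Ignoring everything is blunt. Passing `category=DeprecationWarning` to `warnings.filterwarnings` narrows the filter to one kind of warning, which keeps other, possibly important, warnings visible.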

Data Science Model Evaluation Simply Explained

September 8, 2022
Data Science
evaluation

There are a whole bunch of different tools and metrics to evaluate data science model performance. Some of the most common tools and metrics include: Confusion Matrix, Accuracy, Precision, Recall, and F1 Score. Honestly, I often forget the definitions for these metrics. I’d like to think it’s partly because the definitions of these tend to be super technical, verbose, and obscure. Not easy for casual reading. Here I’ll explain these metrics in my own words, which tend to be simple. ...
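For reference, the metrics named above can be sketched in a few lines of Python using the standard textbook definitions for a binary classifier. The function name `classification_metrics` and the example counts are made up for illustration; they are not from the post.

```python
def classification_metrics(tp, fp, fn, tn):
    """Compute common metrics from the four cells of a binary confusion matrix.

    tp: true positives, fp: false positives,
    fn: false negatives, tn: true negatives.
    """
    total = tp + fp + fn + tn
    # Accuracy: share of all predictions that were right.
    accuracy = (tp + tn) / total if total else 0.0
    # Precision: of everything predicted positive, how much really was.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that really was positive, how much was found.
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (hypothetical data).
metrics = classification_metrics(tp=40, fp=10, fn=20, tn=30)
```

The zero-denominator guards return 0.0 as a simplifying choice; some libraries raise a warning or let the caller choose the fallback value instead.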