What Is Jupyter?
Jupyter Notebook is an interactive coding environment that runs in your browser. You write code in cells and execute each cell independently, with the output appearing immediately below it. Variables persist between cells for as long as the kernel is running. For ML, this is invaluable: you can explore data, visualise results, and iterate on model code without rerunning the entire script each time.
Starting a Notebook
Before any imports will work, the packages must be installed in the same Python environment Jupyter is running in. Install everything you need first, then launch Jupyter. Each notebook is saved as a .ipynb file.
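A common symptom of an environment mismatch is a `ModuleNotFoundError` inside the notebook even though `pip install` succeeded. The sketch below, meant to be run in a notebook cell once Jupyter is up, prints the interpreter the kernel is using and reports which packages it cannot find. The helper name `missing_packages` is an illustrative assumption, not part of Jupyter; note that scikit-learn is imported under the name `sklearn`.

```python
import importlib.util
import sys


def missing_packages(names):
    """Return the top-level module names that this interpreter cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# The path of the Python interpreter backing the current kernel
print(sys.executable)

# Import names for the packages used in this section
# (scikit-learn is imported as "sklearn")
print(missing_packages(["numpy", "matplotlib", "sklearn"]))
```

An empty list means every package is visible to the kernel. If a name is listed, installing it with the interpreter printed above (for example by running pip as `python -m pip install ...` with that interpreter) is one way to target the right environment.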
# Install Jupyter and the core ML packages
pip install jupyter numpy matplotlib scikit-learn
# Start Jupyter
jupyter notebook
# Or use JupyterLab (newer interface)
pip install jupyterlab
jupyter lab
Cell Types
Notebooks have two main cell types. Code cells execute Python. Markdown cells render formatted text, equations, and headings. You use markdown cells to document your work and code cells to run it.
| Cell type | Purpose | Shortcut to run |
|---|---|---|
| Code | Execute Python | Shift + Enter |
| Markdown | Documentation, equations | Shift + Enter |
| Raw | Plain text, not rendered | — |
Press Esc to enter command mode, then B to insert a cell below and A to insert above. Press M to convert to Markdown, Y to convert to Code. These shortcuts save significant time.
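The cell type is also what gets stored on disk. A .ipynb file is a JSON document in which every cell carries a `cell_type` field. The sketch below builds a minimal notebook structure by hand using only the standard library; the helper name `build_notebook` and the cell contents are illustrative assumptions, and in practice Jupyter writes this file for you.

```python
import json


def build_notebook():
    """Build a minimal notebook dict with one Markdown cell and one code cell."""
    markdown_cell = {
        "cell_type": "markdown",
        "metadata": {},
        "source": ["# Data exploration\n", "Notes and equations go here."],
    }
    code_cell = {
        "cell_type": "code",
        "execution_count": None,  # None until the cell has been run
        "metadata": {},
        "outputs": [],            # filled in by Jupyter after execution
        "source": ["print('hello')"],
    }
    return {
        "cells": [markdown_cell, code_cell],
        "metadata": {},
        "nbformat": 4,
        "nbformat_minor": 4,
    }


notebook = build_notebook()
text = json.dumps(notebook, indent=1)  # the JSON text a .ipynb file would hold
print([cell["cell_type"] for cell in json.loads(text)["cells"]])
# prints ['markdown', 'code']
```

Converting a cell with M or Y in command mode changes this `cell_type` value; the cell's source text stays the same.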
Key Jupyter Shortcuts
Shift + Enter: Run cell and move to the next
Ctrl + Enter: Run cell and stay on it
Esc, then D, D: Delete current cell
Esc, then Z: Undo cell deletion
Tab: Autocomplete a variable or function name (in edit mode)
Shift + Tab: Show function signature and docstring
Working with Data in Notebooks
import numpy as np
import matplotlib.pyplot as plt
# makes plots appear inside the notebook output instead of opening a separate window
%matplotlib inline
# Create a sample dataset — 200 examples, 5 features
X = np.random.randn(200, 5)
print(X.shape) # always check shape first
print(X[:5]) # preview first 5 rows
# Plot a feature distribution
plt.hist(X[:, 0], bins=30)
plt.xlabel('Feature 0')
plt.ylabel('Count')
plt.title('Feature 0 Distribution')
plt.show()
Best Practices for ML Notebooks
Keep all imports in the first cell so the notebook can be re-run from scratch. Label each cell with a short comment and add Markdown headers between sections, since notebooks double as documentation. Avoid putting too much logic in a single cell; break complex operations into small cells that can be run and checked individually.
# Cell 1: Imports
import numpy as np
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
# Cell 2: Create data (seeded so a re-run produces the same arrays)
np.random.seed(42)
X = np.random.randn(1000, 5)
y = np.random.randint(0, 2, 1000)
print(f"X shape: {X.shape}, y shape: {y.shape}")
# Cell 3: Split (a fixed random_state keeps the split reproducible)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(f"Train: {X_train.shape}, Test: {X_test.shape}")
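The `StandardScaler` import above suggests a scaling step after the split. A minimal sketch of what such a cell might look like, wrapped in a small function so it can be checked on its own; the function name `scale_features`, the seed, and the synthetic data are illustrative assumptions. The scaler is fitted on the training set only, so no information from the test set leaks into preprocessing.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler


def scale_features(X_train, X_test):
    """Standardise features using statistics learned from the training set only."""
    scaler = StandardScaler()
    X_train_scaled = scaler.fit_transform(X_train)  # learn mean/std, then apply
    X_test_scaled = scaler.transform(X_test)        # reuse the training mean/std
    return X_train_scaled, X_test_scaled


# Synthetic data with the same shape as the example above (seeded for repeatability)
rng = np.random.default_rng(0)
X = rng.standard_normal((1000, 5))
y = rng.integers(0, 2, size=1000)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)
X_train_scaled, X_test_scaled = scale_features(X_train, X_test)

# Quick sanity check: each training feature should now have mean ~0 and std ~1
print(X_train_scaled.mean(axis=0).round(3))
print(X_train_scaled.std(axis=0).round(3))
```

Keeping the step in a function means a single cell can re-run it and inspect the result without touching the cells that created or split the data.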