Beginner’s Guide to Setting Up a Data Science Environment on Ubuntu 22.04

Shaon Majumder
4 min read · Mar 27, 2024


Are you eager to dive into data science but not sure how to set up your environment? This beginner’s guide walks you through setting up a data science environment on Ubuntu 22.04, starting from scratch. By the end, you’ll have Python, pip, Pandas, Matplotlib, NumPy, and Jupyter Notebook installed and ready to use.

Step 1: Update and Upgrade

Before we begin, let’s ensure that your Ubuntu system is up to date:

sudo apt update
sudo apt upgrade

The first command refreshes the package lists so that apt knows which newer versions are available. The second installs those newer versions for every package already on your system.

Step 2: Install Python and Pip

Ubuntu 22.04 comes with Python 3 pre-installed (Python 3.10 by default), which is essential for data science. However, we also need to install pip, the Python package installer:

sudo apt install python3-pip
pip3 install --upgrade pip

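The first command installs pip for Python 3 from the Ubuntu repositories, and the second upgrades it to the latest release. When run without sudo, the upgraded copy is typically placed in your per-user directory (~/.local) rather than replacing the system package.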

Step 3: Install Essential Python Packages

Next, let’s install some essential Python packages that are commonly used in data science projects:

  • Pandas: a powerful library for data manipulation and analysis.
pip3 install pandas
  • Matplotlib: a versatile library for creating static, animated, and interactive visualizations in Python.
pip3 install matplotlib
  • NumPy: the fundamental package for scientific computing with Python, providing support for arrays, matrices, and mathematical functions.
pip3 install numpy
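
To confirm that the three installs above succeeded, a short check along these lines may help. It is a minimal sketch and not part of the original steps: the helper name installed_versions is hypothetical, and it relies only on importlib.metadata from the standard library (available in Python 3.8 and later, so the default Python 3.10 on Ubuntu 22.04 should be fine).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Return a dict mapping each package name to its installed version.

    Packages that pip has not installed are mapped to None.
    (Hypothetical helper, written for illustration only.)
    """
    result = {}
    for name in packages:
        try:
            result[name] = version(name)
        except PackageNotFoundError:
            result[name] = None
    return result


if __name__ == "__main__":
    # The three libraries installed in this step.
    for name, ver in installed_versions(["pandas", "matplotlib", "numpy"]).items():
        print(f"{name}: {ver or 'NOT INSTALLED'}")
```

Save it as a file (for example check_packages.py, an arbitrary name) and run it with python3. Any package reported as NOT INSTALLED can be installed again with the matching pip3 command.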

Step 4: Install Jupyter Notebook

Now, let’s install Jupyter Notebook, an interactive computing environment that allows you to create and share documents containing live code, equations, visualizations, and narrative text:

pip3 install jupyter

Step 5: Launch Jupyter Notebook

With all the necessary packages installed, it’s time to launch Jupyter Notebook:

jupyter notebook

This command will start the Jupyter Notebook server and open a new tab in your default web browser, where you can start creating Jupyter notebooks and writing your Python code.

Troubleshooting: Jupyter Command Not Found

If you encounter the error message Jupyter command 'jupyter-notebook' not found when trying to run Jupyter Notebook, don't worry. Here's how you can troubleshoot and resolve this issue:

1. Reinstall Jupyter Notebook:

First, try reinstalling the Jupyter Notebook package using pip:

pip3 install --upgrade notebook

This command will ensure that the Jupyter Notebook package is properly installed and up to date.

After reinstalling, try running Jupyter Notebook again to see if the issue has been resolved.

2. Check Jupyter Installation Location:

If the problem persists even after reinstalling, you may need to locate where the Jupyter executable is installed on your system.

You can find out the installation location of the Jupyter Notebook package by using the following command:

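pip3 show notebook

Look for the “Location” field in the output. This is the site-packages directory that holds the library code, not the executable itself. The jupyter launcher scripts are placed in a bin directory under the same installation prefix; for a Location of ~/.local/lib/python3.10/site-packages, that is ~/.local/bin.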
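
If you would rather ask Python directly where per-user launcher scripts go, the sketch below is one way to do it. It assumes a per-user pip install on Linux, where scripts live in the bin folder under the user base directory; the function name user_scripts_dir is hypothetical.

```python
import os
import site


def user_scripts_dir():
    """Return the directory where per-user pip installs place scripts.

    Assumption: a per-user install on Linux, where launcher scripts sit in
    the "bin" folder under the user base (typically ~/.local/bin).
    """
    return os.path.join(site.getuserbase(), "bin")


if __name__ == "__main__":
    scripts = user_scripts_dir()
    print("User scripts directory:", scripts)
    # Check whether a jupyter launcher is actually present in that directory.
    print("jupyter present there:", os.path.exists(os.path.join(scripts, "jupyter")))
```

If the second line prints True, that directory is the one to add to your PATH.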

3. Add Jupyter Installation Directory to PATH:

Once you have identified the installation directory of Jupyter, you can add it to your PATH environment variable. This will allow your system to locate the Jupyter executable when you try to run it.

To add the Jupyter installation directory to your PATH, you can modify your shell configuration file. For example, if you’re using bash, you can edit the ~/.bashrc file, or if you're using zsh, you can edit the ~/.zshrc file.

Add the following line to your shell configuration file, replacing /path/to/jupyter/bin with the actual directory that contains the jupyter executable (for a per-user pip install on Ubuntu, this is typically ~/.local/bin):

export PATH="/path/to/jupyter/bin:$PATH"

Save the changes to your shell configuration file, and then reload it to apply the changes:

source ~/.bashrc   # For bash

or

source ~/.zshrc   # For zsh

4. Retry Running Jupyter Notebook:

After adding the Jupyter installation directory to your PATH, try running Jupyter Notebook again. It should now work without any errors.
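
To see whether the PATH change has taken effect, you can open a new terminal and run a check like the following. This is a rough sketch: is_on_path is a hypothetical helper, and ~/.local/bin is only an assumed example directory, so substitute the one you added.

```python
import os
import shutil


def is_on_path(directory, path_value=None):
    """Return True if `directory` is one of the entries in a PATH-style string.

    When path_value is None, the current PATH environment variable is used.
    """
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(os.path.expanduser(directory))
    entries = [
        os.path.normpath(os.path.expanduser(entry))
        for entry in path_value.split(os.pathsep)
        if entry
    ]
    return target in entries


if __name__ == "__main__":
    candidate = "~/.local/bin"  # assumption: typical per-user scripts directory
    print(f"{candidate} on PATH:", is_on_path(candidate))
    # shutil.which returns the full path of the executable, or None if not found.
    print("jupyter resolves to:", shutil.which("jupyter"))
```

If the directory is reported as being on PATH but jupyter still resolves to None, the executable is probably installed somewhere else, and step 2 is worth repeating.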

By following these troubleshooting steps, you should be able to resolve the issue and run Jupyter Notebook on your system.

Conclusion

Congratulations! You’ve successfully set up a data science environment on your Ubuntu 22.04 system. With Python, Jupyter Notebook, and essential data science libraries installed, you’re now ready to explore datasets, perform data analysis, visualize data, and build machine learning models. Happy data science-ing!
