Python for Data Science #1: Setting Up Your Environment
Welcome to the first module of the Python for Data Science course! Your first and most critical lesson is setting up a clean workspace.
The Problem: The 'Broken' Environment
If you have ever started a new project only to find that installing a new package broke an old one, you've experienced the **"Broken" Environment** problem. It happens when every project shares one global Python installation: two projects that need different versions of the same package cannot both be satisfied, so upgrading a package for one project can break the other.
The Solution: The Virtual Environment
The fix is simple: **The Virtual Environment**. A virtual environment creates an **isolated space** for each project, so the packages you install for one project are kept separate from every other project's and cannot conflict with them.
Step 1: Create the Environment
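We use the built-in Python module, `venv`, for this. Open your terminal or command prompt, move into your project folder, and run this command, replacing `env_name` with your desired environment name (e.g., `data_project`). On some Mac and Linux systems the interpreter is called `python3` rather than `python`: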
python -m venv env_name
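This command creates a new folder named `env_name` (your virtual environment). It contains its own Python interpreter (a copy of, or a link to, the one you ran the command with) and its own package directory, kept separate from your global installation.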
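To see what the command actually produces, the same step can be sketched from Python with the standard library's `venv` module. This is only an illustration, not a replacement for the terminal command: the helper name `create_env` and the `data_project` folder are assumptions made for the example, and pip is skipped in the demo purely to keep it fast.

```python
import tempfile
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its pyvenv.cfg as a dict.

    create_env is a hypothetical helper for this example.
    """
    # venv.create wraps the same EnvBuilder class that `python -m venv` uses.
    venv.create(env_dir, with_pip=with_pip)

    # Each environment has a pyvenv.cfg file recording the Python it was built from.
    config = {}
    for line in (Path(env_dir) / "pyvenv.cfg").read_text().splitlines():
        key, sep, value = line.partition("=")
        if sep:
            config[key.strip()] = value.strip()
    return config


if __name__ == "__main__":
    # A temporary directory keeps this demo from leaving files behind.
    with tempfile.TemporaryDirectory() as tmp:
        env_dir = Path(tmp) / "data_project"  # hypothetical environment name
        config = create_env(env_dir, with_pip=False)  # skip pip to keep the demo fast
        print("base Python location:", config.get("home"))
        print("environment contents:", sorted(p.name for p in env_dir.iterdir()))
```

The `home` entry points back at the Python installation the environment was created from, which is why deleting the folder is enough to remove the environment.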
Step 2: Activate the Environment
Before installing packages, you must activate the environment. This tells your terminal to use the isolated Python interpreter instead of the global one.
For Mac/Linux:
source env_name/bin/activate
For Windows:
.\env_name\Scripts\activate
You will know it is active when the environment name appears in parentheses at the start of your terminal prompt (e.g., `(env_name) $`). To leave the environment later, run `deactivate`.
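The prompt is a visual cue, but you can also confirm it from inside Python. The sketch below compares `sys.prefix` with `sys.base_prefix`, which differ when the running interpreter belongs to a virtual environment; the function name `in_virtual_environment` is made up for this example.

```python
import sys


def in_virtual_environment():
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment folder, while
    # sys.base_prefix still points at the Python it was created from.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("virtual environment:", in_virtual_environment())
```

Running this with the environment active should print an interpreter path inside your `env_name` folder.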
Step 3: Install Your Data Science Packages
Now that your environment is clean and isolated, you can safely install the core data science libraries without worrying about conflicts:
pip install pandas scikit-learn
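One way to check that the packages landed in the active environment is to ask Python for their installed versions. This is a minimal sketch using the standard library's `importlib.metadata`; the helper name `installed_versions` is an assumption for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages):
    """Map each package name to its installed version, or None if it is missing."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed for the interpreter running this script.
            found[name] = None
    return found


if __name__ == "__main__":
    for name, ver in installed_versions(["pandas", "scikit-learn"]).items():
        print(f"{name}: {ver or 'not installed'}")
```

If a package shows as not installed, the usual cause is that the script ran with the global interpreter rather than the environment's.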
You now have a **Clean Workspace** ready for coding.
Your Next Steps
Your action item for this module is to **practice using the terminal to create and activate two separate virtual environments**. This is the foundation of professional Python development.
