📦 Asabeneh / data-analysis-with-python-spring-2025

📄 introduction_ml.md · 66 lines
# Introduction to Machine Learning

### **1. What is Machine Learning?**

**Machine Learning** is a branch of Artificial Intelligence in which computers learn patterns from data instead of following explicitly programmed rules.

Example:

- You give a machine many emails, each labeled as spam or not spam.
- It learns the patterns that separate the two classes and uses them to predict whether new emails are spam.
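
To make the spam example concrete, here is a minimal sketch in plain Python. The six training emails are invented for illustration, and the word-counting rule is a deliberately simple stand-in for a real learning algorithm, not a recommended spam filter.

```python
from collections import Counter

# Hypothetical toy training set: (email text, label) pairs invented for illustration.
TRAINING_EMAILS = [
    ("win a free prize now", "spam"),
    ("free money claim your prize", "spam"),
    ("limited offer win cash now", "spam"),
    ("meeting agenda for monday", "not spam"),
    ("please review the project report", "not spam"),
    ("lunch on monday with the team", "not spam"),
]


def train(emails):
    """Count how often each word appears in each class."""
    word_counts = {"spam": Counter(), "not spam": Counter()}
    for text, label in emails:
        word_counts[label].update(text.lower().split())
    return word_counts


def predict(word_counts, text):
    """Label an email by the class that has seen its words more often."""
    scores = {
        label: sum(counts[word] for word in text.lower().split())
        for label, counts in word_counts.items()
    }
    # Ties fall back to "not spam" so unfamiliar emails are not flagged.
    return "spam" if scores["spam"] > scores["not spam"] else "not spam"


model = train(TRAINING_EMAILS)
print(predict(model, "claim your free cash prize"))  # spam
print(predict(model, "project meeting on monday"))   # not spam
```

Nothing in `predict` mentions specific spam words: the behavior comes entirely from the labeled examples, which is the point of "learning from data".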

---

### **2. Types of Machine Learning**

1. **Supervised Learning**  
   - Learn from **labeled data** (input → output).  
   - Example: Predict house prices from size, location, etc.
   - Algorithms: Linear Regression, Decision Trees, SVM, Neural Networks

2. **Unsupervised Learning**  
   - Learn patterns from **unlabeled data**.  
   - Example: Customer segmentation.
   - Algorithms: K-Means, PCA, Hierarchical Clustering

3. **Reinforcement Learning**  
   - Learn by **interacting with an environment** and getting rewards or penalties.
   - Example: Training a robot to walk or an AI to play chess.
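
The first two types can be contrasted in a few lines of plain Python. This is a rough sketch with made-up numbers: the supervised part fits a straight line to labeled (size, price) pairs using ordinary least squares, and the unsupervised part groups unlabeled customer spending values with a one-dimensional K-Means. Reinforcement learning needs an environment to interact with, so it is left out here.

```python
from statistics import mean

# --- Supervised: labeled data (input -> output), hypothetical numbers ---
sizes = [50, 70, 90, 110]      # feature: house size in square metres
prices = [150, 210, 270, 330]  # label: price in thousands


def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    x_bar, y_bar = mean(xs), mean(ys)
    slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys)) / sum(
        (x - x_bar) ** 2 for x in xs
    )
    return slope, y_bar - slope * x_bar


slope, intercept = fit_line(sizes, prices)
print(slope * 100 + intercept)  # predicted price for a 100 m^2 house


# --- Unsupervised: no labels, only structure in the data ---
spend = [12, 15, 14, 80, 85, 90]  # yearly spend per customer (hypothetical)


def k_means_1d(values, centers, iterations=10):
    """Plain K-Means on one-dimensional data with the given starting centers."""
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # Move each center to the mean of its cluster (keep it if the cluster is empty).
        centers = [mean(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


centers, clusters = k_means_1d(spend, centers=[min(spend), max(spend)])
print(clusters)  # low spenders and high spenders end up in separate groups
```

The supervised model is told the right answer for every training example, while K-Means only ever sees the spending values and has to discover the two customer segments by itself.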

---

### **3. Basic Steps in an ML Project**

1. **Collect Data** – Get a dataset relevant to your problem.
2. **Preprocess Data** – Clean, normalize, and format your data.
3. **Choose a Model** – Pick an algorithm to try.
4. **Train the Model** – Fit the model to the training data.
5. **Evaluate the Model** – Measure performance on held-out data the model did not see during training.
6. **Tune & Improve** – Adjust parameters, try better features or algorithms.
7. **Deploy** – Use it in a real-world app.
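
The seven steps above can be sketched end to end with the standard library alone. Everything here is an assumption made for illustration: the exam dataset is invented, k-nearest neighbours is just one possible model, and the fixed interleaved split replaces the shuffled split a real project would normally use. Libraries such as Scikit-learn provide ready-made versions of each step.

```python
import math
from collections import Counter

# 1. Collect: hypothetical (hours_studied, practice_tests) -> exam outcome.
data = [
    ((1.0, 0), "fail"), ((2.0, 1), "fail"), ((1.5, 0), "fail"),
    ((2.5, 1), "fail"), ((3.0, 1), "fail"), ((2.0, 0), "fail"),
    ((7.0, 4), "pass"), ((8.0, 5), "pass"), ((9.0, 4), "pass"),
    ((7.5, 3), "pass"), ((8.5, 5), "pass"), ((9.0, 3), "pass"),
]

# Hold out data before fitting anything. A fixed interleaved split keeps the
# sketch deterministic; real projects usually shuffle first.
test = data[0::3]
validation = data[1::3]
train = data[2::3]


# 2. Preprocess: min-max scaling, using statistics from the training set only.
def fit_scaler(rows):
    columns = list(zip(*(features for features, _ in rows)))
    return [(min(c), max(c)) for c in columns]


def scale(features, scaler):
    return tuple(
        (x - lo) / (hi - lo) if hi > lo else 0.0
        for x, (lo, hi) in zip(features, scaler)
    )


# 3-4. Choose and train: k-nearest neighbours simply stores the scaled rows.
def train_knn(rows, scaler):
    return [(scale(features, scaler), label) for features, label in rows]


def predict(model, scaler, features, k):
    point = scale(features, scaler)
    neighbours = sorted(model, key=lambda row: math.dist(point, row[0]))[:k]
    return Counter(label for _, label in neighbours).most_common(1)[0][0]


# 5. Evaluate: fraction of correct predictions on rows the model has not seen.
def accuracy(model, scaler, rows, k):
    hits = sum(predict(model, scaler, f, k) == label for f, label in rows)
    return hits / len(rows)


scaler = fit_scaler(train)
model = train_knn(train, scaler)

# 6. Tune: pick k on the validation set, then report once on the test set.
best_k = max((1, 3), key=lambda k: accuracy(model, scaler, validation, k))
test_accuracy = accuracy(model, scaler, test, best_k)
print(best_k, test_accuracy)


# 7. Deploy: expose a single function that an application could call.
def will_pass(hours_studied, practice_tests):
    return predict(model, scaler, (hours_studied, practice_tests), best_k) == "pass"


print(will_pass(8, 4))
```

Tuning on a separate validation set and touching the test set only once keeps the final score an honest estimate of how the model behaves on new data.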

---

### **4. Popular Tools & Libraries**

- **Python** (the most widely used language for ML)
- Libraries:
  - **Scikit-learn** – Classical ML algorithms behind a simple, consistent API; a good starting point for beginners
  - **Pandas & NumPy** – Data manipulation
  - **TensorFlow / PyTorch** – Deep learning
  - **Matplotlib / Seaborn** – Visualization

---

### **5. Key Concepts**

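- **Features** – The input variables the model learns from (e.g., age, salary)
- **Labels** – The target values the model predicts (e.g., job title)
- **Overfitting** – When the model memorizes the training data, including its noise, and performs poorly on new data
- **Underfitting** – When the model is too simple to capture the underlying patterns, so it performs poorly even on the training data
- **Accuracy, Precision, Recall, F1 Score** – Evaluation metrics for classification models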
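
The four evaluation metrics follow directly from counting correct and incorrect predictions. The sketch below computes them for a binary spam classifier; the label lists are invented, and the function name `classification_metrics` is a hypothetical helper, not a library API.

```python
def classification_metrics(y_true, y_pred, positive="spam"):
    """Compute accuracy, precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)  # true positives
    fp = sum(t != positive and p == positive for t, p in pairs)  # false positives
    fn = sum(t == positive and p != positive for t, p in pairs)  # false negatives

    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    # Precision: of everything flagged as positive, how much really was positive?
    precision = tp / (tp + fp) if tp + fp else 0.0
    # Recall: of everything really positive, how much did the model find?
    recall = tp / (tp + fn) if tp + fn else 0.0
    # F1: harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}


# Hypothetical ground truth and predictions for ten emails.
y_true = ["spam"] * 4 + ["not spam"] * 6
y_pred = ["spam", "spam", "spam", "not spam", "spam"] + ["not spam"] * 5

metrics = classification_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.8; precision, recall and F1 all 0.75
```

Here the model catches 3 of the 4 spam emails (recall) and 3 of its 4 spam flags are correct (precision), even though overall accuracy looks higher at 8 out of 10.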
