Outline

Iteration and Loops
Implementing a Singleton with a Metaclass
Duck Type

Leopold

github.com/SayNop

gitee.com/WhenTimeGoesBy

fur999immer@gmail.com

Python Syntax

2025-10-10

Python

Iteration and Loops

Python equivalents of common JS Array methods for iterables

some

# JS: arr.some(x => x > 0)
any(x > 0 for x in arr)

every

# JS: arr.every(x => x > 0)
all(x > 0 for x in arr)

find

# JS: arr.find(x => x > 2)  # returns undefined in JS when nothing matches
# Python (returns the first matching element, or None):
next((x for x in arr if x > 2), None)
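The `None` passed as the second argument to `next()` plays the role of the "nothing found" result; without it, an exhausted generator raises `StopIteration`. A minimal sketch, where `find_first` is a hypothetical helper name used only for illustration:

```python
from typing import Callable, Iterable, Optional, TypeVar

T = TypeVar("T")

def find_first(items: Iterable[T], pred: Callable[[T], bool]) -> Optional[T]:
    """Return the first item satisfying pred, or None (hypothetical helper)."""
    return next((x for x in items if pred(x)), None)

arr = [1, 2, 3, 4]
print(find_first(arr, lambda x: x > 2))   # 3
print(find_first(arr, lambda x: x > 10))  # None

# Without a default, next() raises StopIteration when nothing matches.
try:
    next(x for x in arr if x > 10)
except StopIteration:
    print("no match")
```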

filter

# JS: arr.filter(x => x % 2 === 0)
list(filter(lambda x: x % 2 == 0, arr))
# or a list comprehension
[x for x in arr if x % 2 == 0]

map

# JS: arr.map(x => x * 2)
list(map(lambda x: x * 2, arr))
# or a list comprehension
[x * 2 for x in arr]

reduce

# JS: arr.reduce((acc, x) => acc + x, 0)
# Python needs functools.reduce to be imported:
from functools import reduce
reduce(lambda acc, x: acc + x, arr, 0)
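For the specific case of addition, the same fold can be written without a lambda. The sketch below is only an illustration of that idea: it swaps the lambda for `operator.add` and checks the result against the built-in `sum()`.

```python
from functools import reduce
from operator import add, mul

arr = [1, 2, 3, 4]

total = reduce(add, arr, 0)    # same fold as reduce(lambda acc, x: acc + x, arr, 0)
assert total == sum(arr)       # for plain addition, sum() is the simpler spelling
product = reduce(mul, arr, 1)  # a fold other than addition (math.prod also covers this on Python 3.8+)
print(total, product)          # 10 24
```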

itertools

for repeat

from itertools import repeat

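# Usually slightly faster than range(): repeat(None, n) hands back the same None object on every iteration instead of producing a new int
for _ in repeat(None, 3):
    print(1)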
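The speed difference is small and depends on the interpreter and machine, so it is worth measuring rather than assuming. A rough sketch using `timeit`; the loop size `N` and the function names are arbitrary choices for this illustration:

```python
from itertools import repeat
from timeit import timeit

N = 100_000  # arbitrary loop size for the comparison

def loop_range() -> int:
    count = 0
    for _ in range(N):
        count += 1
    return count

def loop_repeat() -> int:
    count = 0
    for _ in repeat(None, N):
        count += 1
    return count

# Both loops do the same work; only the iteration source differs.
print("range :", timeit(loop_range, number=20))
print("repeat:", timeit(loop_repeat, number=20))
```

If the loop body does any real work, the difference is likely to disappear in the noise, so `range()` remains the more readable default.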

groupby

from itertools import groupby
from operator import itemgetter
"""
itemgetter is implemented in C (faster than an equivalent lambda):
itemgetter('type')      returns the value for key 'type'      ===  lambda x: x['type']
itemgetter(0)           returns element 0 of a sequence       ===  lambda x: x[0]
itemgetter('a', 'b')    returns the tuple (x['a'], x['b'])    ===  lambda x: (x['a'], x['b'])
"""

data = [
    {'type': 'fruit', 'name': 'apple'},
    {'type': 'vegetable', 'name': 'carrot'},
    {'type': 'fruit', 'name': 'banana'},
    {'type': 'vegetable', 'name': 'potato'},
    {'type': 'fruit', 'name': 'pear'},
]
groups = groupby(data, key=itemgetter('type'))
for k, g in groups:
    print(k, list(g))
"""
fruit [{'type': 'fruit', 'name': 'apple'}]
vegetable [{'type': 'vegetable', 'name': 'carrot'}]
fruit [{'type': 'fruit', 'name': 'banana'}]
vegetable [{'type': 'vegetable', 'name': 'potato'}]
fruit [{'type': 'fruit', 'name': 'pear'}]
"""
data.sort(key=itemgetter('type'))  # sorting first is essential: groupby only merges adjacent items with equal keys
groups = groupby(data, key=itemgetter('type'))
for k, g in groups:
    print(k, list(g))
"""
fruit [{'type': 'fruit', 'name': 'apple'},
    {'type': 'fruit', 'name': 'banana'},
    {'type': 'fruit', 'name': 'pear'}]
vegetable [{'type': 'vegetable', 'name': 'carrot'},
        {'type': 'vegetable', 'name': 'potato'}]
"""
grouped = {  # collect the result into a dict
    k: list(g)
    for k, g in groupby(data, key=itemgetter('type'))
}
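When the input cannot be sorted, or its original order must be preserved, a different technique gives the same grouping. The sketch below deliberately swaps `groupby` for `collections.defaultdict`, which needs no sorting step:

```python
from collections import defaultdict

data = [
    {'type': 'fruit', 'name': 'apple'},
    {'type': 'vegetable', 'name': 'carrot'},
    {'type': 'fruit', 'name': 'banana'},
    {'type': 'vegetable', 'name': 'potato'},
    {'type': 'fruit', 'name': 'pear'},
]

# Append each item to the list for its key; missing keys start as empty lists.
grouped = defaultdict(list)
for item in data:
    grouped[item['type']].append(item)

for key, items in grouped.items():
    print(key, [i['name'] for i in items])
```

The trade-off is that `groupby` streams its groups lazily, while this version holds every group in memory at once.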

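Using yield in a function instead of returning a list

When code creates a new list and appends to it, consider wrapping that logic in a function and using yield instead.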

from typing import Iterator

def func() -> Iterator[int]:
    # return [i for i in range(3)]
    for i in range(3):
        yield i

# print(func())
print(list(func()))
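To make the difference concrete, here is a small side-by-side sketch. The function names `evens_list` and `evens_iter` are made up for this illustration:

```python
from itertools import count, islice
from typing import Iterable, Iterator

def evens_list(nums: Iterable[int]) -> list[int]:
    """Eager version: builds the whole list before returning."""
    result = []
    for n in nums:
        if n % 2 == 0:
            result.append(n)
    return result

def evens_iter(nums: Iterable[int]) -> Iterator[int]:
    """Lazy version: yields each match as soon as it is found."""
    for n in nums:
        if n % 2 == 0:
            yield n

print(evens_list(range(10)))                 # [0, 2, 4, 6, 8]
print(list(evens_iter(range(10))))           # [0, 2, 4, 6, 8]
# The generator also works on an endless input, where the list version would never return.
print(list(islice(evens_iter(count()), 3)))  # [0, 2, 4]
```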

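Implementing a Singleton with a Metaclass

In OOP languages such as Java, a singleton is usually implemented with a private constructor and a static instance.

In Python, a custom metaclass can override the __call__ method to control instantiation, which is enough to implement a singleton.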

from itertools import repeat
from threading import Thread, Lock

class Singleton(type):
    # Not thread-safe
    # _instances = {}
    # def __call__(cls, *args, **kwargs):  # not thread-safe
    #     if cls not in cls._instances:
    #         cls._instances[cls] = super().__call__(*args, **kwargs)
    #     return cls._instances[cls]

    # Thread-safe via a lock
    # _instances = {}
    # _instances_lock = Lock()
    # def __call__(cls, *args, **kwargs):
    #     if cls not in cls._instances:
    #         with cls._instances_lock:  # take the lock only when no instance exists yet; the cost is small, but one lock shared by every class is not ideal
    #             if cls not in cls._instances:  # re-check under the lock so threads that all saw "no instance" do not each create one
    #                 cls._instances[cls] = super().__call__(*args, **kwargs)
    #     return cls._instances[cls]

    # One lock per class - WRONG
    # _instances = {}
    # _instances_lock = {}
    # def __call__(cls, *args, **kwargs):
    #     if cls not in cls._instances:
    #         cls._instances_lock[cls] = Lock()  # each thread creates its own lock here, so nothing is excluded; updating the lock map itself needs a lock
    #         with cls._instances_lock[cls]:
    #             if cls not in cls._instances:
    #                 cls._instances[cls] = super().__call__(*args, **kwargs)
    #     return cls._instances[cls]


    _instances = {}  # dict[type, Any]
    _instances_lock = {}  # dict[type, Lock]
    _global_lock = Lock()  # protects creation of entries in the lock map
    def __call__(cls, *args, **kwargs):
        if cls in cls._instances:
            return cls._instances[cls]

        # Ensure a per-class lock exists
        with cls._global_lock:
            lock = cls._instances_lock.setdefault(cls, Lock())  # guard the lock map so two threads cannot create different locks for the same class

        with lock:
            if cls not in cls._instances:
                cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]


class CheckSafe(metaclass=Singleton):
    def __init__(self):
        print('init')


def run():
    # a = CheckSafe()
    # b = CheckSafe()
    # print(a is b)

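    # Count how many times 'init' is printed. With the thread-unsafe version it may appear once or twice; with the locked version it appears once.
    threads = [Thread(target=CheckSafe) for _ in repeat(None, 20)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()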
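Counting printed lines by eye is fragile, so the check can be made explicit. The sketch below is a self-contained simplification, not the exact code above: it uses a single shared lock with double-checked locking, and the `Config` class and `init_count` counter are hypothetical names added for the test.

```python
from threading import Lock, Thread

class Singleton(type):
    """Simplified metaclass: one shared lock, double-checked locking."""
    _instances: dict = {}
    _lock = Lock()

    def __call__(cls, *args, **kwargs):
        if cls not in cls._instances:
            with cls._lock:
                # Re-check under the lock: another thread may have won the race.
                if cls not in cls._instances:
                    cls._instances[cls] = super().__call__(*args, **kwargs)
        return cls._instances[cls]

class Config(metaclass=Singleton):
    init_count = 0  # hypothetical counter: how many times __init__ ran

    def __init__(self) -> None:
        Config.init_count += 1

results = []

def worker() -> None:
    results.append(Config())

threads = [Thread(target=worker) for _ in range(20)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(Config.init_count)                          # 1
print(all(obj is results[0] for obj in results))  # True
```

One caveat of the single shared lock: `threading.Lock` is not reentrant, so if one singleton's `__init__` constructs another singleton class, the nested call blocks on the lock it already holds. Per-class locks, as in the final version above, avoid that case.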

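Duck Type

Structural subtyping (duck typing): "If it walks like a duck and quacks like a duck, then it is a duck." In other words, an object qualifies as long as it looks the part, meaning its method signatures match.

Type annotations for parameters and return values are recommended when defining functions. Annotations for ordinary functions are not covered here. In practice one special case comes up often: a single function (interface) may have several different implementations. There are two ways to annotate it.

Protocol: specifies that an object has a method with a given name and given parameter and return types (a matching signature). It defines a "behavioral interface" rather than an inheritance hierarchy, and the method is called through the protocol. This is duck typing. (When a group of methods share the same parameter and return types, consider using Protocol.)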

from typing import Protocol

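# Define a method signature - a Protocol must be written as a class
class Writeable(Protocol):
    def write(self, data: dict) -> None:
        """should write dictionary data"""

# A method that satisfies the signature - it must live in a class to match the Protocol, but the class does not need to declare any relationship to Writeable
class Author:
    def __init__(self, name: str):
        self.name = name
    
    # This method is called through the protocol (see do_write)
    def write(self, data: dict) -> None:
        print(f"{self.name} is writing {data}")

# A function that uses the Protocol
# It accepts any object that satisfies the protocol; its name is arbitrary (e.g. render). It calls the protocol method.
def do_write(writer: Writeable, data: dict) -> None:
    writer.write(data)

def main():
    data = {'context': 'AAA'}
    author = Author('Leopold')  # instantiate the class whose method satisfies the protocol
    do_write(author, data)  # call write through the protocol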
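Protocol matching is normally enforced by a static type checker, not at run time. If a run-time check is wanted, `typing.runtime_checkable` allows `isinstance()` against a protocol. A minimal sketch, where the `Reader` class is a hypothetical counter-example:

```python
from typing import Protocol, runtime_checkable

@runtime_checkable
class Writeable(Protocol):
    def write(self, data: dict) -> None: ...

class Author:
    def write(self, data: dict) -> None:
        print(f"writing {data}")

class Reader:  # hypothetical class with no write() method
    def read(self) -> dict:
        return {}

print(isinstance(Author(), Writeable))  # True
print(isinstance(Reader(), Writeable))  # False
```

Note that the run-time check only looks for the presence of a `write` attribute; it does not verify parameter or return types.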

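Callable: a Callable annotation achieves an effect similar to Protocol. The duck's defining feature here is having a __call__ magic method. Callable's criterion is: I do not care who your ancestors are (which class you inherit from), only whether I can put parentheses () after you and run you. As long as an object can be called, it is a Callable, whether it is a function, a lambda expression, or an instance of a class that implements __call__.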
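```
from typing import Callable

# The `type` alias statement requires Python 3.12+.
# Callable takes a list of parameter types only, without parameter names.
type Writeable2 = Callable[[dict], None]

def do_write2(writer: Writeable2, data: dict) -> None:
    writer(data)  # call the object directly
```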
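To illustrate that any callable object fits the annotation, the sketch below passes a plain function, an instance with `__call__`, and a lambda through the same parameter. The names `Writer`, `write_func`, `WriterObj`, and `log` are assumptions made for this example, and a plain alias is used so it also runs on Python versions before 3.12.

```python
from typing import Callable

Writer = Callable[[dict], None]  # plain alias (hypothetical name)

def do_write(writer: Writer, data: dict) -> None:
    writer(data)

log: list[str] = []  # collects what each callable received

def write_func(data: dict) -> None:
    log.append(f"func:{data}")

class WriterObj:
    def __call__(self, data: dict) -> None:
        log.append(f"obj:{data}")

do_write(write_func, {'k': 1})                           # plain function
do_write(WriterObj(), {'k': 1})                          # instance implementing __call__
do_write(lambda d: log.append(f"lambda:{d}"), {'k': 1})  # lambda
print(log)
```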

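Comparison: Callable and Protocol are both forms of structural subtyping (duck typing).

Different focus

Callable focuses on the "action" (function).
Protocol focuses on the "role" (object).
Use Callable when all you need is "a function that can be executed", regardless of where it comes from or whether it carries state.
You only need to pass in a "verb". This is duck typing applied to callables.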

def process_data(data: dict, callback: Callable[[dict], None]):
    # ... processing logic ...
    callback(data)  # call it directly

# At the call site you can pass a function or an instance's method
process_data(data, my_func)          # pass a function
process_data(data, author.write)     # pass a bound method
process_data(data, lambda d: ...)    # pass a lambda

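Use Protocol when what the code really means is "an object with a specific capability": the object may bundle several related methods, or you need to make its semantic role explicit.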

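# You pass in a "noun" (a writer), and that writer must have a write method
def do_task(writer: Writeable, data: dict):
    # You may also need to check other attributes of writer, or call its other methods
    # e.g. writer.setup() (if the protocol defines it)
    writer.write(data)

# At the call site you must pass an object instance
do_task(author, data)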

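Different arguments
- Callable receives author.write (the method itself).
- Protocol receives author (the object instance that owns the method).
Extensibility
- Callable can only constrain a function signature.
- Protocol can constrain multiple methods and attributes (properties), and can even combine other Protocols.
Conclusion
- If the interface only needs a callback, Callable is the most concise choice; if it needs a component or service with a particular behavior pattern, use Protocol.
- Some methods need parameters that are fixed (or not especially dynamic). For example:
  - Writer methods need a file name and a data dict. The file name can be an attribute, so use Protocol.
  - Time-filter methods need a start date, an end date, and a data dict. Either use Callable with all three as parameters, or use Protocol with the dates as attributes.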
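The time-filter case can be sketched both ways. Everything named here (`in_range`, `RecordFilter`, `DateRangeFilter`, the `'day'` key, and the two `select_*` helpers) is hypothetical and chosen only to illustrate the trade-off:

```python
from dataclasses import dataclass
from datetime import date
from functools import partial
from typing import Callable, Protocol

# Callable style: start date, end date, and the record are all parameters.
def in_range(start: date, end: date, record: dict) -> bool:
    return start <= record['day'] <= end

def select_with_callable(records: list[dict], pred: Callable[[dict], bool]) -> list[dict]:
    return [r for r in records if pred(r)]

# Protocol style: the dates live on the object as attributes.
class RecordFilter(Protocol):
    def accept(self, record: dict) -> bool: ...

@dataclass
class DateRangeFilter:
    start: date
    end: date

    def accept(self, record: dict) -> bool:
        return self.start <= record['day'] <= self.end

def select_with_protocol(records: list[dict], flt: RecordFilter) -> list[dict]:
    return [r for r in records if flt.accept(r)]

records = [{'day': date(2025, 1, 5)}, {'day': date(2025, 2, 5)}, {'day': date(2025, 3, 5)}]
start, end = date(2025, 2, 1), date(2025, 2, 28)

# partial() fixes the two dates so the result matches Callable[[dict], bool].
a = select_with_callable(records, partial(in_range, start, end))
b = select_with_protocol(records, DateRangeFilter(start, end))
print(a == b)  # True
```

With Callable, the fixed dates have to be bound at the call site (here with `functools.partial`); with Protocol, they travel with the object, which tends to read better once more than one method needs them.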
